
14 Apr 2010, 6:09 p.m.
Al Haines (shaw) wrote:
Actually, it's fairly common practice that if a paragraph/verse starts with some kind of graphical/illuminated character, the actual character it stands for is not included in the HTML version.
And that makes the HTML pretty useless for further processing like conversion to mobile formats. It should be made a requirement that the stream of non-markup characters be identical in all versions of an ebook: lynx --dump should produce a text that wdiff reports as identical to the text version.

-- 
Marcello Perathoner
webmaster@gutenberg.org
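
The check proposed above can be approximated without lynx or wdiff. The following is only a rough sketch under stated assumptions: it uses Python's standard html.parser in place of lynx --dump, compares whitespace-separated words in the way wdiff compares them, and ignores things lynx would render (alt text, link lists). The names TextExtractor, html_words and same_word_stream are made up for this illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects character data, skipping <head>, <script> and <style> content."""

    SKIPPED = {"head", "script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def html_words(html):
    """Return the non-markup text of an HTML string as a list of words."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join without a separator so "<span>T</span>he" stays one word, "The".
    return "".join(parser.chunks).split()


def same_word_stream(html, text):
    """True if the HTML and plain-text versions contain the same words in order."""
    return html_words(html) == text.split()


# An illuminated initial kept as markup around the real letter passes:
print(same_word_stream('<p><span class="dropcap">T</span>he quick fox</p>',
                       "The quick fox"))
# An initial replaced by an image, with the letter dropped, fails:
print(same_word_stream('<p><img src="t.png" alt="">he quick fox</p>',
                       "The quick fox"))
```

The first call prints True and the second prints False, which is the situation described in the quoted message: once the letter behind the graphic is left out, the HTML no longer carries the same text as the plain-text version.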