
Hmm... at a quick look, it appears to be a problem in the original file. At the top of the file, we have this:

Character set encoding: ISO-8859-1

All transformations are done according to character-encoding standards. However, this file uses a representation of the apostrophe character which is not included in the ISO-8859-1 standard.

If you want some history, basically you can blame Microsoft. They developed their own character sets for use with Windows, which were _close_ to already-established standards, but not quite identical. Most often you see the effects of this cropping up when people use "curly quote" characters.

For the Latin-1 texts in PG, we use just a plain ' character for an apostrophe. Automatic checking that is done when texts are submitted will now flag this before a text is posted. However, there are some older texts (such as this one) where this problem can still crop up.

--Andrew

On Sat, 8 May 2010, Joaquin Cuenca Abela wrote:
Hi,
the ebook http://www.gutenberg.org/etext/14155 is missing all the apostrophes in the generated versions (at least HTML and ePub).

The two hand-crafted files (plain text and rtf) contain the apostrophes. For instance, one of the very first lines in the plain text file is:

Permettez-moi d’inscrire

This has been converted in the HTML version to:

<p id="id00015">Permettez-moi dinscrire

Is this due to a bug in the epub-maker used to convert the file, or is there something buggy in the original text?

Cheers,

--
Joaquin Cuenca Abela
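
For anyone who wants to reproduce the mismatch described in the reply, here is a minimal Python sketch. It rests on assumptions that the thread does not confirm: that the source file stores the apostrophe as the Windows-1252 byte 0x92, and that the converter discards control characters. The helpers `strip_controls` and `repair` are hypothetical names for illustration, not part of any PG tool.

```python
import unicodedata

# Assumption: the file was saved by a Windows-1252 editor, where the curly
# apostrophe U+2019 is stored as the single byte 0x92.
raw = "Permettez-moi d\u2019inscrire".encode("cp1252")

# Decoding according to the declared ISO-8859-1 header maps 0x92 to the
# C1 control character U+0092, not to an apostrophe.
as_latin1 = raw.decode("iso-8859-1")


def strip_controls(text):
    """Drop control characters (category Cc), as a hypothetical converter might."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cc")


def repair(data):
    """Decode as Windows-1252 and substitute the plain ' used in Latin-1 texts."""
    return data.decode("cp1252").replace("\u2019", "'")


print(repr(as_latin1))            # shows the stray \x92 control character
print(strip_controls(as_latin1))  # apostrophe is gone, as in the HTML version
print(repair(raw))                # plain ASCII apostrophe restored
```

If the bytes in the real file turn out to differ from 0x92, only the `raw` line needs changing; the point is simply that a byte outside the printable ISO-8859-1 range can vanish silently during conversion.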