Re: [gutvol-d] Preferred diacritical mark

"Phil Hitchcock" writes:
I am currently preparing e-text versions of W. H. Sleeman's "Rambles and Recollections of an Indian Official". The book describes life and customs in India in the 1830s.
Many of the place names, personal names, and various other words have a dash - placed over an a, e, i, or u, to indicate a long vowel.
When I produce the 7-bit ASCII plain text, these marks will be missing; in the 8-bit ASCII version I am planning to use a circumflex accent ^ to replace the diacritical mark.
However, in an HTML version I could use the circumflex accent, or I could use the Unicode series starting with Ā to give a vowel with a dash over it, thus reproducing the original text form. That said, I have seen some present-day publications using the circumflex accent on Indian place names.
Thus, I am wondering what the Project Gutenberg preferred form would be for the diacritical mark in the HTML version.

Always replicate what's in the book. There is sometimes a call for modernized versions, but we should have the original, unmodernized version first. I think the circumflex is a suitable replacement in the Latin-1* version, especially as that's what modern users are using, but put a transcriber's note at the top of the document saying that's what you've done.

* There is no such thing as 8-bit ASCII. There's only 7-bit ASCII. There are 8-bit extensions to ASCII, literally hundreds of them. Latin-1 (also known as ISO standard 8859, part 1, or ISO-8859-1) is the one that PG usually uses; it's very similar to CP-1252 (for most purposes, a subset of CP-1252), the character set that Windows has used for Western Europe since Windows 95, and most likely you're most familiar with one of those two.
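
For anyone who wants to script the three versions discussed above, here is a minimal Python sketch. It is only an illustration, not a PG tool: the function names and the sample word are made up for the example, and the mapping table covers only the a, e, i, u vowels mentioned in the question. The HTML variant keeps the macron by writing numeric character references.

```python
import unicodedata

# Macron vowels mapped to their circumflex counterparts, all of which
# exist in Latin-1. Only a, e, i, u are covered, as in the question.
MACRON_TO_CIRCUMFLEX = str.maketrans("ĀāĒēĪīŪū", "ÂâÊêÎîÛû")


def to_ascii(text: str) -> str:
    """7-bit ASCII version: decompose, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Raises UnicodeEncodeError if anything non-ASCII is still left over.
    return stripped.encode("ascii").decode("ascii")


def to_latin1(text: str) -> bytes:
    """Latin-1 version: substitute a circumflex for each macron."""
    return text.translate(MACRON_TO_CIRCUMFLEX).encode("latin-1")


def to_html(text: str) -> str:
    """HTML version: keep the macron as a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


sample = "Rājā"  # hypothetical sample word, not taken from the book
print(to_ascii(sample))   # Raja
print(to_latin1(sample))  # b'R\xe2j\xe2'
print(to_html(sample))    # R&#257;j&#257;
```

The circumflex bytes come out the same in Latin-1 and CP-1252; the two sets differ only in the 0x80 to 0x9F range, which is why text limited to these letters is safe in either.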
participants (1): D. Starner