
Hi Lee,
On 25.01.2012 at 20:22, Lee Passey wrote:
On Tue, January 24, 2012 6:33 pm, James Adcock wrote:
Don>Starting at the most basic level, is there any good reason not to use utf-8 as the basic encoding standard for everything including plain-text?
No.
BOM or no BOM?
Depends on the file. XML files (XHTML, TEI, etc.) are guaranteed to be ASCII in their first line, and that first line declares the encoding, so no BOM is necessary (and would probably confuse some tools).
Just to be picky: you err here. The files mentioned above are not guaranteed to be ASCII; only txt is. Yet, as you state, the first lines can contain encoding information.
Subtle markup languages like reStructuredText, which have no prolog, need some mechanism to indicate that they contain UTF-8 encodings (to distinguish between that, latin-1 or MacRoman), so they may need to have a BOM.
BOMs should generally not affect processing unless one is accessing the file at the byte level. Of course, this depends on the system and on how the programs are compiled to interact with the file system.
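To illustrate the byte-level versus text-level distinction, here is a minimal Python sketch. The helper name split_bom and the sample string are made up for illustration; codecs.BOM_UTF8 and the "utf-8-sig" codec are part of the Python standard library. It is only one way to look at the problem, not a statement about how any particular tool handles BOMs.

```python
import codecs
from typing import Tuple


def split_bom(data: bytes) -> Tuple[bool, bytes]:
    """Hypothetical helper: return (had_bom, payload) for a byte string.

    Only the UTF-8 BOM (bytes EF BB BF) is checked here.
    """
    if data.startswith(codecs.BOM_UTF8):
        return True, data[len(codecs.BOM_UTF8):]
    return False, data


# A small sample "file": a UTF-8 BOM followed by UTF-8 encoded text.
raw = codecs.BOM_UTF8 + "Grüße".encode("utf-8")

# Byte level: the BOM is visible and has to be handled explicitly.
had_bom, payload = split_bom(raw)

# Text level: the "utf-8-sig" codec drops a leading BOM while decoding,
# whereas plain "utf-8" keeps it as the character U+FEFF.
text_sig = raw.decode("utf-8-sig")
text_plain = raw.decode("utf-8")
```

A program that decodes with "utf-8-sig" never sees the BOM, while one that reads raw bytes or decodes with plain "utf-8" does, which is where tools tend to get confused.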
Regards,
Keith.