
Scott Lawton wrote:
What about the content of the tag? i.e. which is correct?
<language id="en-gb"></language>         # lmiss.tei
<language id="en-gb">British</language>  # alice.tei
Both work. The content of the tag does not matter. The lang attribute is an IDREF. If you say <foreign lang="fr">, then you must have an element somewhere in your TEI with an id of "fr"; otherwise it will not validate. The <langUsage> section is just a bin to hold those elements.
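To make the IDREF rule concrete, here is a minimal Python sketch that lists every lang value with no matching id in the same document. It is only a stand-in for real validation against the TEI DTD, which a validating parser would do; the function name undeclared_langs and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET


def undeclared_langs(tei_text):
    """Return the lang values that have no element with a matching id."""
    root = ET.fromstring(tei_text)
    # Collect every id and every lang value anywhere in the document.
    ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    langs = {el.get("lang") for el in root.iter() if el.get("lang") is not None}
    return sorted(langs - ids)


# Hypothetical sample: "en-gb" is declared in <langUsage>, "fr" is not.
SAMPLE = """<TEI.2>
  <teiHeader>
    <profileDesc>
      <langUsage>
        <language id="en-gb">British</language>
      </langUsage>
    </profileDesc>
  </teiHeader>
  <text lang="en-gb">
    <body>
      <p>Some <foreign lang="fr">mots</foreign> here.</p>
    </body>
  </text>
</TEI.2>"""

print(undeclared_langs(SAMPLE))  # ['fr']
```

Adding <language id="fr"></language> to the <langUsage> section makes the list come back empty, whether or not the element has any text content.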
Some formats have limitations, e.g. PalmDoc bookmarks have a maximum of 16 characters, and PDF bookmarks have to use iso-8859-1 characters. Moreover, you don't always want the full <head> to appear in the contents.
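The two limits above can be checked mechanically. The following Python sketch assumes the limits exactly as stated in this message (16 characters for PalmDoc, iso-8859-1 for PDF); the function name bookmark_problems is hypothetical and not part of any PG tool.

```python
PALMDOC_MAX_CHARS = 16  # PalmDoc bookmark limit as stated in this message


def bookmark_problems(title):
    """Return a list of reasons why a title cannot be used as a bookmark."""
    problems = []
    if len(title) > PALMDOC_MAX_CHARS:
        problems.append("too long for PalmDoc")
    try:
        # PDF bookmarks are limited to iso-8859-1 characters.
        title.encode("iso-8859-1")
    except UnicodeEncodeError:
        problems.append("not iso-8859-1 for PDF")
    return problems


for title in ("Chapter I", "CONSULTATION OF DEVILS, AND BIRTH OF MERLIN."):
    print(repr(title), bookmark_problems(title))
```

A check like this can only flag a <head> that does not fit. It says nothing about what the shorter bookmark text should be.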
So, the PalmDoc and PDF headers can be generated to conform to those limitations. I don't see the benefit of including these extra tags for every chapter of every document in the PG collection!
How do you go about condensing a longer title into 16 characters? There is no algorithm that can do that nearly as well as a human. A human will always choose to include the most important part.

CONSULTATION OF DEVILS, AND BIRTH OF MERLIN. => Birth of Merlin

--
Marcello Perathoner
webmaster@gutenberg.org