
Brad Collins wrote:
For A Christmas Carol I would rather use Scott's approach:
<div id="ch1" type="chapter" n="1">
  <head type="DivLabel">STAVE ONE.</head>
  <head>MARLEY’S GHOST.</head>
Rather than:
<div id="ch1" type="stave" n="1">
You'll also have to consider XPath queries. In a couple of years we'll likely put all of the PG TEI files into a giant XML database. No more files. You'll retrieve a book with an XPath query like (simplified):

  /org/gutenberg/etext/12345

You'll get the book title(s) with

  /org/gutenberg/etext/12345//titleStmt/title

and the title of the first chapter with

  /org/gutenberg/etext/12345//div[@type="chapter"][@n=1]/head

Of course this will only work if the first chapter always has attribute type="chapter" and attribute n=1 and not n="I" or n="Chapter 1" or n="Chapter I" ...

-- 
Marcello Perathoner
webmaster@gutenberg.org
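
To illustrate why consistent type and n values matter, here is a minimal sketch using Python's standard xml.etree.ElementTree module, which supports only a limited XPath subset. The sample document, the second div (with its type="stave" and n="II" markup) and the helper name chapter_heads are made up for illustration and are not part of any real PG file or planned database. The sketch also matches n as a string, not as a number.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document: the first div follows the
# type="chapter" n="1" convention, the second one does not.
SAMPLE = """<text><body>
<div id="ch1" type="chapter" n="1">
  <head type="DivLabel">STAVE ONE.</head>
  <head>MARLEY’S GHOST.</head>
</div>
<div id="ch2" type="stave" n="II">
  <head>THE FIRST OF THE THREE SPIRITS.</head>
</div>
</body></text>"""


def chapter_heads(root, n):
    """Return the text of every <head> in the chapter div numbered n."""
    # ElementTree compares attribute values as strings, so n is quoted.
    path = f".//div[@type='chapter'][@n='{n}']/head"
    return [head.text for head in root.findall(path)]


root = ET.fromstring(SAMPLE)
first = chapter_heads(root, 1)   # both heads of the first div
second = chapter_heads(root, 2)  # empty: wrong type and Roman numeral
print(first)
print(second)
```

The same query finds the first chapter but silently returns nothing for the second, because its markup uses a different type and a Roman numeral.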