
David Starner wrote:
But we know HTML, for one. We also have tools that help us with HTML, for two. For three and the strike-out, I have a host of tools that will help me edit, verify and view HTML, but there are no Debian packages for PGTEI.
Where's the Debian package for guiguts? I had to actually edit the code to make it run on my Debian unstable. nxml-mode in Emacs is all you'll ever need to edit and validate XML. Or use the TEI stylesheets in OpenOffice, if you must needs have WYSIWYG. Sheesh!
PG has an implementation of TEI. I know you don't like it because you haven't figured out how to produce pretty title pages.
Note: by "pretty title pages" Marcello means a title page that looks like any title page in an actual book. Once again, I grabbed the nearest books; I have ten books, by ten different publishers, including two in Esperanto and one in a mixture of Esperanto and Chinese, and with the exception of one of the English books which right-justifies its title page, they all follow the basic format of centered pages, title (new line) author (bottom of page) publisher. None of them look a darn thing like the title pages PGTEI prints out.
Ohh. Pleeeeease! Go here: http://www.gnutenberg.de/pgtei/0.5/examples/candide/4650-pdf.pdf and tell me what you don't like about the title page. And then go here: http://www.gnutenberg.de/pgtei/0.5/examples/candide/4650-h.html to verify that it looks the same in HTML. And then go here: http://www.gnutenberg.de/pgtei/0.5/examples/candide/4650-0.txt to see how it looks in TXT. All from ONE and the same TEI master.
If I saw a single document produced from PGTEI that was suitable for end-user consumption, I might support it.
http://www.gnutenberg.de/pgtei/0.5/examples/
The librarian is never the end-user. The librarian is the person who makes it available to the end-user. Nobody around here cares about the linguistic researcher as the end user, and we will never produce files that are marked up with the type of information--like distinguishing sentence ending punctuation from the same punctuation used other ways--that they need. The end user we're targeting is the reader.
(Distinguishing punctuation is very important for typesetters.) YOU are targeting the reader that reads on a desktop browser. I am targeting everybody on every platform of every size and every software that might want to use or convert our books in any way imaginable or not yet imaginable.
Yes, in fact, some PPers do want to produce an etext that replicates the original, includes the important illustrated dropcaps (that are frequently as much a part of the illustration of the book as any other illustration) and page numbers (that are crucial for much of the non-fiction that we reproduce, especially if you want to follow the web of references from one PG era book to another.)
And while they are busy "replicating the original" they miss all the opportunities of electronic text. E.g. the index entries are still linked to the *page* they reference, although it has been technically possible for decades now to go directly to the word. So if the reader clicks on an indexed term, she must read the whole page until she finds the reference instead of going directly to the reference (and maybe having it highlighted like on Wikipedia). This opportunity of making the books more accessible has been missed because DP is still producing electronic facsimiles instead of electronic books. E.g. speaker tagging. In a few years, when everybody has speech synthesis on their cell phones and ebook readers, people may want to listen to their books while driving. If you have quotes marked up you can assign different voices to different speakers. E.g. geographic tagging. While visiting someplace you may want to find all book references that refer to the place you are in. DP misses out again and again. But they make pretty facsimiles ...
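To make the speaker-tagging point concrete, here is a minimal sketch in Python. It assumes a simplified, TEI-like markup in which each quote is a `<q>` element with a `who` attribute. The sample text, the voice names and the `segments` helper are all made up for illustration; this is not PGTEI's actual schema or tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a paragraph whose quotes are tagged with their speaker.
SAMPLE = """<p>
<q who="candide">What is optimism?</q> asked Candide.
<q who="pangloss">All is for the best,</q> replied Pangloss.
</p>"""

# Hypothetical mapping from speaker id to a speech-synthesis voice name.
VOICES = {"candide": "voice-a", "pangloss": "voice-b"}
NARRATOR = "narrator"


def segments(xml_text):
    """Split a paragraph into (voice, text) pairs, in reading order."""
    root = ET.fromstring(xml_text)
    out = []

    def add(voice, text):
        # Normalize whitespace and skip empty fragments.
        text = " ".join((text or "").split())
        if text:
            out.append((voice, text))

    add(NARRATOR, root.text)
    for child in root:
        spoken = "".join(child.itertext())
        if child.tag == "q":
            # Tagged quote: use the speaker's voice, or the narrator if unknown.
            add(VOICES.get(child.get("who"), NARRATOR), spoken)
        else:
            add(NARRATOR, spoken)
        # Text following the element belongs to the narrator.
        add(NARRATOR, child.tail)
    return out


for voice, text in segments(SAMPLE):
    print(f"{voice}: {text}")
```

Each pair could then be handed to whatever speech engine is available. Without the `who` markup, a converter would have to guess the speaker from the surrounding prose.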
And if you had produced TEI output that could do what people wanted to do, it's possible that we would have better output on the ePub devices.
If people had started using TEI instead of griping endlessly about minor shortcomings, we might now have a complete TEI workflow in place.
Right now, I would be surprised to find that PGTEI can output at all to ePub, and I wouldn't be surprised if the people who produced the DP output were happier with the results of their HTML translated to ePub than your HTML translated to ePub.
PGTEI outputs just fine to ePub. Just take the HTML output and convert it in Calibre or whatever you are using. Look here (this is PDF, not ePub): http://www.gnutenberg.de/pgtei/0.5/examples/pgtei-pdf-sony-reader.jpg

-- 
Marcello Perathoner
webmaster@gutenberg.org