
Joshua wrote: [keeping his whole reply intact]
My main involvement with PG texts comes from a DP background. I'm one of the folks who help put the PG texts in place. So my perspective comes not so much from reading the texts as from producing them. This isn't to say I don't consider the reader, but everyone tries to scratch their own itches first, and my itches are from a producer's point of view.
When you create a PG text nowadays, most people create multiple "versions." At the most basic, people usually create a text version and an HTML version. Text because that is the minimum required at PG, and HTML because there is a lot of information that cannot be well represented by a plain text file opened in Notepad. Images are the first example that comes to mind.
Then there are some texts which require (or practically beg for) additional "versions". We have scientific texts that really need a LaTeX master document that is rendered to PDF. Languages Other Than English (LOTE) texts require a larger character set than ASCII, so you might do a UTF-8 encoded text.
The problem is, once you've created the first version (let's say it is the UTF-8 encoded plaintext format), you now have to do the manual work for the other formats. Sometimes this is trivial, sometimes it is not. But to make matters worse, it is not uncommon to notice a typo in the HTML that you didn't fix earlier. Now you have to go back to the other versions and make the same "fix". This very quickly becomes an organizational nightmare, as I'm sure you can imagine.
XML solves this to a large extent. I create one "master" document and then literally click a button and I get a UTF-8 encoded .txt file, a Latin-1 encoded .txt file, an ASCII encoded .txt file, an HTML file, and a PDF file. I post all of them to the ww'ers in a fraction of the time. Plus, if someone down the road finds a problem in the text, the fix can be applied to the master XML and the other files can be regenerated.
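To make the one-master idea concrete, here is a minimal sketch in Python. It is not the actual PG/DP tool chain: the element names (book, title, p), the output file names, and the function names are all assumptions made up for illustration. PDF is left out because it would need an external renderer.

```python
import unicodedata
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical master document. The element names are invented for this
# sketch and do not reflect any real PG or DP schema.
MASTER = """<book>
  <title>Café Stories</title>
  <p>The naïve traveller arrived &amp; sat down.</p>
  <p>He ordered a coffee.</p>
</book>"""


def to_text(root):
    """Plain text: the title and each paragraph, separated by blank lines."""
    parts = [root.findtext("title")]
    parts.extend(p.text for p in root.iter("p"))
    return "\n\n".join(parts) + "\n"


def to_html(root):
    """A bare-bones HTML page built from the same master."""
    title = escape(root.findtext("title"))
    body = "\n".join(f"<p>{escape(p.text)}</p>" for p in root.iter("p"))
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<title>{title}</title></head>\n"
        f"<body>\n<h1>{title}</h1>\n{body}\n</body></html>\n"
    )


def to_ascii(text):
    """Crude ASCII fallback: split accented letters into base letter plus
    combining mark, then drop everything that is not ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", errors="ignore")


def build_all(master_xml):
    """Generate every output format from the single master document."""
    root = ET.fromstring(master_xml)
    text = to_text(root)
    return {
        "book-utf8.txt": text.encode("utf-8"),
        "book-latin1.txt": text.encode("latin-1", errors="replace"),
        "book-ascii.txt": to_ascii(text),
        "book.html": to_html(root).encode("utf-8"),
    }


if __name__ == "__main__":
    for name, data in build_all(MASTER).items():
        print(name, len(data), "bytes")
```

Fixing a typo then means editing the master once and calling build_all() again, so every format picks up the same correction. A real converter would also have to handle nested markup, images, footnotes and so on, which this sketch ignores.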
We are not doing away with the .txt files you want. We are coming up with a more efficient way to create them (along with the many other document formats people want).
Oh, and yes, it is possible to create conversion routines for other formats as well. Marcello had a Palm format working at one point, if I remember correctly. An MS Reader .LIT converter is possible (the specs are freely available and under a free license; we just need someone to take the time to create the converter). Rocket ebook reader and others should all be possible as long as the spec for the format is freely available.
Please feel free to ask any questions you want on the subject. I'll be happy to run at the mouth all you want! ;)
Kudos! This is by far the best reply I've yet seen on the practical benefits of XML for producing structured digital texts. Cogent, simple, and to the point, backed up by real-world experience. Joshua, you might consider submitting what you wrote to David Rothman's TeleRead blog as a guest blog article (his blog is one of the more popular blogs on the Internet, and by far the most read blog regarding ebooks and digital libraries). Let me know -- I will be glad to assist. Jon