
At 07:10 PM 10/19/2004 -0800, somebody wrote:
> Steve Thomas writes:
>> Most users of PG don't go around grumbling about the lack of XML or the ability to output as PDF. They're just stoked to be able to find the text online.
>
> That's why they're users of PG. If they needed XML or PDF, they'd go elsewhere.

That's not the point. People don't go to PG thinking, "hmmm, I wonder if they have any XML files". They go looking for a book. If you want the text of a particular book, you'll use it in whatever format it comes in, so long as you have the software to handle that format. Nobody "needs" XML or PDF. They "need" the words of the book. Formats are secondary.

One of the original ideals of PG was that there had to be a plain text version, on the basis that everyone had at least the tools to handle plain text. Nowadays, almost everyone has a web browser, so HTML comes second on the accessibility list. Very few people, I imagine, have the necessary tools to work with a TEI or SGML file.

Now, there's nothing wrong with the notion of converting all PG texts to some XML master format, and then exporting that to umpteen other formats on demand. Practically though, that's a lot of work -- a *lot* of work -- and I don't yet see any signs of that progressing. Commercially (if one were to do this commercially; this is a hypothetical), I'd estimate such a conversion task, for 10,000 books, to cost around $1,000,000 in salaries alone.

Of course, there's always volunteer effort. But if volunteers are busy converting plain texts to XML so that they can be output as plain text (or HTML/PDF/...), does that reduce the effort put into scanning/OCR/proof-reading? Could it be better to put the PG effort into getting plain text editions out, and leave it to others to do the extra conversion to XML etc.? This is a model that has worked very well for quite a few years, without complaint from any but a few tech-enthusiasts.

--
Stephen Thomas, Senior Systems Analyst, Adelaide University Library
ADELAIDE UNIVERSITY SA 5005 AUSTRALIA
Tel: +61 8 8303 5190  Fax: +61 8 8303 4369
Email: stephen.thomas@adelaide.edu.au
URL: http://staff.library.adelaide.edu.au/~sthomas/