
Lee>On the other hand, it may be possible to take a page from PG's book and route around the damage. I've looked at the HTML output you've provided from PG and I haven't seen anything that can't be repaired. It should be possible to build a web interface that sits in front of PG, forwards requests, rewrites the HTML to meet industry standards, then either delivers /that/ HTML or compiles it into ePub or .mobi.

In practice, if and when people do this kind of thing, they cache the results so as not to keep hitting the PG website unnecessarily. If one does that, then consider simply getting the books via http://www.gutenberg.org/wiki/Gutenberg:The_CD_and_DVD_Project so as not to have to hit the servers at all.

One needs more knowledge and love of DB than I have to make a strong website, but http://freekindlebooks.org still runs as an example of this approach. It predates PG's willingness to host EPUB and MOBI, and it still draws a couple hundred thousand downloads a month -- even though I've been trying to steer customers back to PG now that PG -- more or less -- hosts the EPUB and MOBI file formats. It is not clear to me, though, that the PG legalese would allow one to retain the PG trademark on the files once one has internally tweaked the formatting problems.

The bottom line, though, is that the books most often downloaded from PG *ought* to be reworked in any case -- whether one wants to read those books in txt70, html, epub or mobi format. It is crazy that in practice PG doesn't have a way to rework what gets read most often -- and what most needs a rework.
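
For what it's worth, the "sit in front of PG, repair, and cache" idea discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not how freekindlebooks.org or PG actually work: the names CachingFrontEnd, fetch_url and repair_html are hypothetical, the repair step is a trivial placeholder (it only adds a missing doctype), and the cache is just a directory of files keyed by a hash of the URL.

```python
import hashlib
import urllib.request
from pathlib import Path


def fetch_url(url: str) -> str:
    """Download a page over HTTP. Only called on a cache miss."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def repair_html(html: str) -> str:
    """Hypothetical placeholder for the real cleanup work.

    A real repair step would fix far more; this one only adds a
    doctype when the page lacks one.
    """
    if "<!DOCTYPE" not in html.upper():
        html = "<!DOCTYPE html>\n" + html
    return html


class CachingFrontEnd:
    """Fetch a page once, repair it, and serve later requests from disk."""

    def __init__(self, cache_dir, fetch=fetch_url, repair=repair_html):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch      # injectable, so it can be swapped or tested
        self.repair = repair

    def _cache_path(self, url: str) -> Path:
        # Hash the URL so that any URL maps to a safe file name.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.html"

    def get(self, url: str) -> str:
        path = self._cache_path(url)
        if path.exists():
            # Cache hit: the upstream server is not contacted at all.
            return path.read_text(encoding="utf-8")
        repaired = self.repair(self.fetch(url))
        path.write_text(repaired, encoding="utf-8")
        return repaired
```

Calling CachingFrontEnd("cache").get(url) twice for the same URL should hit the upstream server only the first time. Passing a different fetch function (one that reads from a local copy of the CD/DVD images, say) would avoid the servers entirely, and an ePub or .mobi build step could be hung off the repaired HTML in the same way.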