
Carlo Traverso wrote:
"Greg" == Greg Newby <gbnewby@pglaf.org> writes:
Greg> On Sun, Apr 18, 2010 at 05:05:09PM +0200, Carlo Traverso wrote:
>> Is PG ready to accept Epub as submission format (i.e. one submits a valid epub from which the other formats are derived)? If so, one can target Epub; otherwise at best one is forced to submit HTML or txt that converts not-too-badly with current PG tools, and this might be extremely challenging.
>>
>> Carlo
Greg> I don't think we're ready for this except in rare cases where ePub is the best format for display for a particular item (we just released a book where PDF was the best format, believe it or not).
Greg> The challenge is that when books are fixed, someone (typically the whitewasher, seldom the original submitter) needs to regenerate all the files from that book.
Greg> Since there is not yet any standard processing stream to generate static ePub files, this makes it hard for fixes (to HTML & text) to be applied to ePubs.
Greg> I would, of course, love to see something become our "standard" conversion tool, usable by anyone. Right now, the closest for PG is Marcello's software to build the cached ePub files. It's wonderful and functional, but is it ready for all envisioned purposes? I think not, due at least in part to shortcomings of the input HTML.
That's the whole point of my proposal. If we start with hand-crafted HTML, we are likely to end up with poor ePub, since the inferred metadata might be wrong, and many features of HTML need to be tuned for ePub and might not turn out correct.
And what about users who download the HTML to view on a mobile device? You must produce better HTML not for the sake of ePub but for the sake of universal usability. The metadata come directly from the PG database and are updated whenever the PG database changes. That makes our metadata far more consistent than it would be under your proposal.
Obtaining reasonable HTML from ePub, on the other hand, is just a matter of unzipping and discarding the metadata.
ePub HTML is often split into chapters, so unzipping may leave you with 50+ files that you have to merge manually.
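To illustrate why a plain unzip is not enough: the order in which the chapter files must be merged is recorded in the package file's spine, not in the file names. The following is only a minimal sketch, assuming a well-formed ePub whose META-INF/container.xml points at a single OPF package file. The helper name `spine_documents` is hypothetical and not part of any PG tool. It lists the content documents in reading order and leaves the actual merging to the caller.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile
from urllib.parse import unquote

# XML namespaces used by the OCF container file and the OPF package file.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def spine_documents(epub):
    """Return archive paths of the content documents in reading order.

    `epub` is a path or a binary file-like object holding an ePub archive.
    Hypothetical helper for illustration; it does no error handling.
    """
    with zipfile.ZipFile(epub) as zf:
        # META-INF/container.xml names the package (OPF) file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf_path = rootfile.get("full-path")
        opf_dir = posixpath.dirname(opf_path)

        package = ET.fromstring(zf.read(opf_path))

        # The manifest maps item ids to hrefs relative to the OPF file.
        hrefs = {
            item.get("id"): item.get("href")
            for item in package.findall("opf:manifest/opf:item", OPF_NS)
        }

        # The spine lists manifest ids in reading order.
        return [
            posixpath.normpath(
                posixpath.join(opf_dir, unquote(hrefs[ref.get("idref")]))
            )
            for ref in package.findall("opf:spine/opf:itemref", OPF_NS)
        ]
```

Calling `spine_documents("book.epub")` (the file name is a placeholder) yields the list of chapter files to concatenate. Merging them into one valid HTML document still requires stripping each file's own head and body wrappers.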
This is, on my side, an offer to work towards producing a toolchain along these lines, if it is not discarded a priori.
Before that can happen, a major "paradigm shift" has to happen at DP. At DP the PPers enjoy pushing their pet preferences down the readers' throats: "What *I* See Is What You Get." And most PP time is spent weaving those personal preferences deep into the markup, making the markup pretty useless for anything but desktop devices with lots of screen, lots of cycles and lots of RAM.

What the PPers should do is produce light semantic markup that lets the user choose the presentation and device: "Get It The Way You Want." The PPers will have to relinquish their power of God (or have it wrested from their hands), and very strict guidelines will have to be put in place as to what markup is accepted.

--
Marcello Perathoner
webmaster@gutenberg.org