
greg said:
We're planning to include the scanned page images along with eBooks. In fact, this is part of the intent with the new directory structure for the PG servers (the /1/0/8/0/... structure).
We haven't done any (or many, anyway) because we're still trying to figure out how to best name the page files, and how to link them on a page-by-page basis into the (marked up?) eBooks. Jim Tinsley drafted some general guidelines for the image files themselves, but linking them to the eBooks is something we need to figure out still.
(BTW, the Million Books project at archive.org uses DjVu for this purpose. It's not bad, but I like our intended solution of XML markup much better. Plus, of course, the MBP is mostly working with relatively poor-quality proofreading. For PG, the text has taken the main emphasis, not the appearance.)
My notion is that the PGTEI and TEI lite solutions I've been reading about in this list will be easily adaptable to including links to specific page image files, so I've not mentioned it until now.
sometimes i feel like i'm talking to a wall...
greg, i can give you this capability _right_now_, with your plain-text files (i.e., the whole library), if you would only make it your policy to:
(1) include page-break information in the files, and
(2) use a sensible and consistent naming standard;
neither of these is difficult to realize in the slightest. (if you need some input on them, i'll be happy to give it.)
if you'd like to see a demo program that does this -- using the page-scans and text-files over at d.p. -- say so publicly (before thursday) and i'll put one up. or continue delaying, it makes no difference to me...
-bowerbird
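To make the two requirements concrete, here is a minimal Python sketch of how page-break information plus a consistent naming rule could tie each stretch of plain text to its scan. Everything specific in it is an assumption made for illustration, not something proposed in this thread: the `[page N]` marker format, the `<etext-number>-p<page>.png` naming rule, the e-text number 12345, and the function names `scan_name` and `link_pages`.

```python
import re

# Assumed marker format (hypothetical): a line such as "[page 12]" opens each page.
PAGE_MARK = re.compile(r"^\[page (\d+)\][ \t]*$", re.MULTILINE)


def scan_name(etext_id, page, ext="png"):
    """Assumed naming rule (hypothetical): e-text number, then zero-padded page number."""
    return f"{etext_id}-p{page:04d}.{ext}"


def link_pages(etext_id, text):
    """Return one (page number, scan filename, page text) tuple per marked page."""
    marks = list(PAGE_MARK.finditer(text))
    pages = []
    for i, mark in enumerate(marks):
        # A page runs from the end of its marker to the start of the next marker.
        end = marks[i + 1].start() if i + 1 < len(marks) else len(text)
        page = int(mark.group(1))
        pages.append((page, scan_name(etext_id, page), text[mark.end():end].strip()))
    return pages


sample = "[page 1]\nfirst page text\n[page 2]\nsecond page text\n"
for page, image, body in link_pages(12345, sample):
    print(page, image, body)
```

Because the scan filename is derived from the e-text number and the page number alone, no separate lookup table is needed; any other marker syntax or naming rule would work the same way as long as it is applied consistently.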