
I can volunteer one of my development servers, or call Brewster myself, if it would help someone ready to do the work.

As Juliet said, there are easy parts but also some non-trivial parts. An automated process that pulls all images from DP at the time an eBook is posted is very much non-trivial. But just doing one or two titles as a sample would help.

The post-10K file structure (described in GUTINDEX.ALL and elsewhere) specifically allows for page scans to be included. These don't need to exist separately from their eBook, though for first efforts it might make sense to store them elsewhere.

As Juliet & others mentioned, the *archiving* is already being done. The next step is distribution.

-- Greg

On Thu, Jul 14, 2005 at 01:53:57PM +0200, collin@xs4all.nl wrote:
Jon:
There seems to be a limited, dichotomous view of the uses and users of structured digital texts: either casual reading by the average Joe, or hard-core academic use.
Undoubtedly that view exists, perhaps even at PG. Er, so what?
In the meanwhile, here's one system that could be tried until the more permanent system is developed:
1) Dial 415-561-6767 during working hours.
2) When Beatrice or Astrid answers, ask for Brewster Kahle.
3) Identify yourself as a DP person, and ask Brewster if he will archive and make available DP's page scans via a stable URL.
4) Await his answer, which may include an alternate suggestion. But I suspect it will be a positive reply. Brewster *loves* any and all high-quality public domain content.
This may take a half hour.
That doesn't seem too difficult to me.
And yet you seem completely unable to do this. I wonder why? In the time you took to write this lengthy e-mail, you could have set up a page scan archive at TIA. If this is so important to you, why haven't you done this already?
Jon, at the moment you come across like the nth of the Vapourware Kings that are regularly trolling this board. "Why don't you do X? Any idiot could do X in two working days!" Now I know you are not a Vapourware King, so what's with the act? The most likely reason why we have no page scan archive is because no-one has taken the time to set it up.
DP's page scans are accessible to anyone with an account. (Probably even to those without an account.) The only hard bit is knowing which PG posted text goes with which DP text ID, so that you can recombine them when necessary. I believe we even save bibliographical data with our texts, so that you could extract all kinds of metadata to go with the page scans.
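A rough sketch of that pairing, in Python, might look like the following. Everything here is an assumption for illustration: the record fields, the function names, and the sample project ID are made up, and neither PG nor DP is known to store the link in this form.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ScanRecord:
    """One posted text paired with its scan set (hypothetical layout)."""
    pg_ebook_number: int  # number of the posted PG text
    dp_project_id: str    # DP text ID; the format used below is a placeholder
    title: str            # bibliographic data saved with the text


def build_index(records: Iterable[ScanRecord]) -> Dict[int, ScanRecord]:
    """Map each PG eBook number to its record, refusing duplicates."""
    index: Dict[int, ScanRecord] = {}
    for rec in records:
        if rec.pg_ebook_number in index:
            raise ValueError(f"duplicate PG number {rec.pg_ebook_number}")
        index[rec.pg_ebook_number] = rec
    return index


def scans_for(index: Dict[int, ScanRecord], pg_ebook_number: int) -> Optional[str]:
    """Return the DP text ID for a PG eBook number, or None if unknown."""
    rec = index.get(pg_ebook_number)
    return rec.dp_project_id if rec is not None else None


# Placeholder data, not real PG or DP identifiers.
index = build_index([ScanRecord(12345, "projectID-example-0001", "Sample Title")])
```

With a table like this in hand, recombining a posted text with its scans is a single lookup; the real work is filling in the table.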
I'd do it, but I have other things to do.
_______________________________________________
gutvol-d mailing list
gutvol-d@lists.pglaf.org
http://lists.pglaf.org/listinfo.cgi/gutvol-d