
Greg wrote:
It occurred to me that some people might think that page scans are forbidden or not welcome. While it's true that we don't have many (any?) eBooks with full page scans, we *are* willing & able & ready to take them.
This is excellent news! Yes, I think people were uncertain about how welcome page scans were at PG. (Whether PG should require that page scans be submitted along with texts, with certain exceptions, is a different issue.)
Obviously, if page scans existed for all 10,000+ PG texts, the collection would occupy a lot of space, but surprisingly not as much as one might think, at least by today's hardware standards. Assuming we have 15,000 texts, each with an average of 300 source pages (which may be a high estimate -- anyone?), and each page scan occupies about 60k (using an efficient lossless compression scheme -- this may also be a high estimate -- anyone?), this works out to a little under 300 gigabytes. (My son recently bought two 200G hard drives for $100 each. There are 300G drives available, and year after year hard disk capacities continue to increase while $/gig continues to drop.)
I know Brewster Kahle at the Internet Archive will also be happy to receive copies of these page scans and tuck them away in his archive (which is redundantly mirrored) for preservation and open online access. Of course, with one million scanned books we are talking about significant space: approximately 20 terabytes (using the assumptions above). But this is 1/5 of one of Brewster's "racks" (where 10 racks make a petabyte), and again I know he'll be thrilled to store these away for safekeeping and open access. (PG should also store these scans itself and find others throughout the world willing to store them on hard disk, tape, etc., to assure redundant storage and preservation.)
It would not surprise me to see, in a few years, high-quality, durable, random-access, compact, and very cheap storage in the ten to twenty terabyte range per unit -- enough to hold the original page scans for one million books. We can then start thinking about one billion books.
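For anyone who wants to check or adjust the back-of-the-envelope figures above, here is a minimal Python sketch. It assumes decimal units (1k = 1,000 bytes), which the estimate does not specify, and the function name scan_storage_bytes is made up for illustration. The 300 pages per book and 60k per page are the same guesses as above, so change them if you have better numbers.

```python
# Rough storage estimate for page scans, using the assumptions in this post.
# Assumption: decimal units (1 KB = 1,000 bytes); the post does not say which.
KB = 1_000
GB = 1_000_000_000
TB = 1_000_000_000_000

# One "rack" is taken as 100 TB, since 10 racks make a petabyte.
RACK_BYTES = 100 * TB


def scan_storage_bytes(books, pages_per_book=300, kb_per_page=60):
    """Return the estimated bytes needed to hold page scans for `books` books."""
    return books * pages_per_book * kb_per_page * KB


if __name__ == "__main__":
    pg_collection = scan_storage_bytes(15_000)
    million_books = scan_storage_bytes(1_000_000)

    print(f"15,000 texts:    {pg_collection / GB:.0f} GB")
    print(f"1,000,000 books: {million_books / TB:.0f} TB")
    print(f"Share of one rack: {million_books / RACK_BYTES:.2f}")
```

With these inputs the script gives 270 GB for 15,000 texts and 18 TB for one million books, which is where the "a little under 300 gigabytes" and "approximately 20 terabytes" figures come from.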
So storage and access should NOT be an issue with regard to acquiring the original page scans for the PG Library.
Jon Noring