Re: [PG-EU] INTERNET FOUNDER SAYS eBOOKS NEXT BIG THING

Michael Hart <hart@pglaf.org> writes:
Vint Cerf, known as the creator of the Internet said, [...] The result is that Project Gutenberg and its fellow supporters of The World eBook Fair have added more books for The Second World eBook Fair in the just about two months since the first of the World eBook Fairs than Google has been able to add in the just about two years since it announced The Google Print Library in the Fall of 2004.
Do you believe your own or his statements? With every new day I find new books in PDF format with Google Book Search, books not available somewhere else.
The Second World eBook Fair will open October 1, 2006, in honor of World Book Fair Month at
How is this site meant to work? I do not find a single book there. It directs me to a search portal with "sponsored links" and "top sites". If I search for, say, "Goethe", not a single Goethe item seems to be available.
100,000+ Free eBooks Available from Project Gutenberg
Approx. 20,000 books are available from PG, that's very good--it will take several years until we can offer 100,000 books.

Karl Eichwalder writes:
Michael Hart <hart@pglaf.org> writes:
Vint Cerf, known as the creator of the Internet said, [...] The result is that Project Gutenberg and its fellow supporters of The World eBook Fair have added more books for The Second World eBook Fair in the just about two months since the first of the World eBook Fairs than Google has been able to add in the just about two years since it announced The Google Print Library in the Fall of 2004.
Do you believe your own or his statements? With every new day I find new books in PDF format with Google Book Search, books not available somewhere else.
I personally find Michael Hart's counts of his World eBook Fair to be quite inflated, counting many Project Gutenberg books two or three times. I have personally located over 110,000 full view books at Google Books (I'm sure there are many more, but they locked down my domain a month ago to require CAPTCHA responses every 30 minutes or so, and I've been working on other things). Some of these books are duplicates also, where Google scanned two different copies of the same book. About 6-8 months ago, I ran a search that seemed to max out at about 50,000 books, so it's quite possible that Google is now scanning about 10,000 public domain books a month, which is far more impressive (and more valuable to the public domain) than finding a few more public domain archives and converting their contents to PDF (sorry Michael).
On the plus side, Google now makes PDFs of most of the full view books with high resolution images and often has links to Open WorldCat. On the down side, they still seem to be skipping illustrations that don't have page numbers, and often lose a page or two around the illustrations. The PDFs are images only, so they're not really usable in low resolution devices like PDAs and cell phones.
I'm still hoping to hear big things from the Open Content Alliance, but I haven't heard a word from them since their opening press releases. At that time, they were planning big things for October 2006, but I don't know if that's still their timeline.
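The scanning-rate estimate above can be checked with a few lines of arithmetic. This is only a rough sketch: the two counts and the 6-8 month interval come from the message itself, while the function name `monthly_rate` is made up for illustration. The counts include some duplicates and the current figure is a lower bound, so the result is an order-of-magnitude check, not a measurement.

```python
# Rough check of the scanning-rate estimate; both counts are quoted
# from the message above and are approximate.
EARLIER_COUNT = 50_000   # full view books found by the earlier search
CURRENT_COUNT = 110_000  # full view books located so far (lower bound)


def monthly_rate(earlier: int, current: int, months: float) -> float:
    """Average number of books added per month over the interval."""
    return (current - earlier) / months


# The earlier search was "about 6-8 months ago", so bracket the estimate.
fastest = monthly_rate(EARLIER_COUNT, CURRENT_COUNT, 6)
slowest = monthly_rate(EARLIER_COUNT, CURRENT_COUNT, 8)
print(f"{slowest:,.0f} to {fastest:,.0f} books per month")
```

With these inputs the range is 7,500 to 10,000 books per month, so "about 10,000 a month" corresponds to the short end of the interval.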

Bruce writes:
I personally find Michael Hart's counts of his World eBook Fair to be quite inflated, counting many Project Gutenberg books two or three times.
I've seen this also. There seems to be the main PG site, PGCC, and the former blackmask.com which all seem to have very similar content. I would expect at least three copies of each PG book which adds up to around 50,000 to 60,000 ebooks. That's an impressive number but very much inflated. That doesn't count other overlapping collections that I'm not aware of such as the Widger Library, etc.
On the plus side, Google now makes PDFs of most of the full view books with high resolution images and often has links to Open WorldCat. On the down side, they still seem to be skipping illustrations that don't have page numbers, and often lose a page or two around the illustrations. The PDFs are images only, so they're not really usable in low resolution devices like PDAs and cell phones.
Yes, and there you point out the major difference between Google and PG. While PG has about 20,000 books and PGCC has more than that, the Google books are PDF images and the PG books are plain text and HTML. I am blind and really have no way to use PDF images. I can print them to a virtual printer, which in turn converts them to text with an OCR engine, but this is a pain at the least and often locks up my computer. I recently had a major drive crash and had almost nothing left. Thanks to PG, I at least had reading material. Downloading, unzipping and reading would be impossible with Google. I can extract text from PDF files, but only if they are saved as text. Also, PG books have a very low error rate, while Google apparently skips and duplicates pages for no reason. For those reasons, I still think that PG is more impressive.
It also comes down to how you define an ebook. If an ebook is just a book scanned and turned into page images, your figure is correct. Google has far more than PG. If an ebook is supposed to be useful to the masses and have the same or better accuracy than a printed book, it sounds like PG has more than Google. Anyone can scan a book and put it online, but it's much harder to proofread and fix errors. I've read lots of bad scans in my time because that's all I could get.
There is another site, Bookshare.org, primarily for the blind. They also offer full texts of books. They always have scanning errors. For legal reasons, they can't go through rounds of proofreading like DP. I'll take PG any day, except that almost all of the PG books are from before 1923 because of the copyright laws. I download books from Bookshare because I have no way to read them otherwise. On the other hand, I've seen some Google book excerpts with reasonably good text quality, but it looked like I would have to read the text online, which I didn't want to do.
Participants (3): Bruce Albrecht, Karl Eichwalder, Tony Baechler