
On Fri, 24 Dec 2004, Marcello Perathoner wrote:
Michael Hart wrote:
Project Gutenberg has already produced and distributed nearly 15,000 eBooks, with a budget that has yet to reach a significant total for all 33+ years, and is projected to reach a million eBooks without undue expense or effort.
PG produces books at a lower cost only if you neglect the cost of volunteer work. I'm sure a big organized corporation like Google can create eBooks way cheaper than a loosely organized group of volunteers like PG.
We'll find out, won't we? I'm still betting we will be first to 100,000. Then it'll be fun to see how it goes to 1,000,000. Of course, after 10,000,000, things will really slow down, in the sense that it will become hard to find more books.
We'll just have to wait and see if either Google Print, or any of the various "Million eBook Projects" will ever come up with even 1% of a million eBooks that you can carry with you on a one inch stack of plain homemade DVDs.
Whereas PG already has reached 1.5% of a million books with 98.5% still to go.
Hopefully more news on this front shortly.
If it hasn't been proofread, and if you can't take it with you, it is only of limited value. . .sort of like reading over someone's shoulder.
Depends on what you want to do with the book. If you only want to cite some work, a page scan (that you cannot take with you but is error-free) is much better than a proofread eBook (which may contain OCR errors).
I have yet to read any paper book that is error free. . . . Eventually the eBook will be more accurate than the source, perhaps in your lifetime for many eBooks.
With Project Gutenberg eBooks, you OWN them. . .forever. . .and can save them in your own favorite formats, fonts, margination, pagination, or whatever, and you can search, quote, print, and do all the normal eBook functions.
Yours forever ... until new copyright laws separate you.
Luckily US and AU copyright changes are not retroactive, as are those of more olde worlde countries. . . .
I would say that an eBook has to be at least 99.9% accurate, and that it should then be a process as people read the eBooks, to send in corrections.
That is ~ 2 errors per page if you assume a line length of 55 and page length of 40 (~ 2000) chars.
The Library of Congress standard is 99.95%. . .one error per page. Of course, some people count a stray character in the margins as an error, or a typo in the header/footer/page#. . .I only count the author's words.
Most of the Project Gutenberg and Distributed Proofreaders volunteers would say it has to be over 99.99% and perhaps even over 99.999%.
That is approx. one error every 5 pages or every 50 pages. Still not very good.
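For anyone who wants to check these figures, here is a minimal Python sketch of the arithmetic. It assumes the 55 x 40 page mentioned above (about 2,200 characters) and counts every wrong character as one error; the function names are made up for illustration.

```python
# Assumption: a page holds 55 characters per line and 40 lines, as in the
# estimate above. The function names are illustrative only.
CHARS_PER_LINE = 55
LINES_PER_PAGE = 40
CHARS_PER_PAGE = CHARS_PER_LINE * LINES_PER_PAGE  # 2200, roughly the "~2000" quoted


def errors_per_page(accuracy_percent, chars_per_page=CHARS_PER_PAGE):
    """Expected number of wrong characters on one page."""
    error_rate = (100.0 - accuracy_percent) / 100.0
    return error_rate * chars_per_page


def pages_per_error(accuracy_percent, chars_per_page=CHARS_PER_PAGE):
    """Expected number of pages read before meeting one wrong character."""
    return 1.0 / errors_per_page(accuracy_percent, chars_per_page)


if __name__ == "__main__":
    # The four accuracy levels discussed in this thread.
    for accuracy in (99.9, 99.95, 99.99, 99.999):
        print(f"{accuracy}% -> {errors_per_page(accuracy):.2f} errors/page, "
              f"one error every {pages_per_error(accuracy):.1f} pages")
```

Under these assumptions 99.9% works out to about 2.2 errors per page, 99.95% to about 1.1, and 99.99% and 99.999% to roughly one error every 4.5 and 45 pages, which is in line with the rounded figures quoted above.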
Reading one of Brewster's books with Greg the other day, it was obvious only the author's words had been proofed, the headers/footers/page# were often messy, but the book itself was quite readable. It had perhaps less than 1,000 characters per page, but only one real error. . .another was a capitalization error that may bother some and not others. . .in about 10 pages. That's at least one "hard" error, and one "soft" error per 10K, 99.99% or 99.98%. . .if you don't count header/footer/page# errors. . . . This is well beyond the Library of Congress standards of 99.95% if someone were to decide to "sew all the pages together" into a single file eBook, and eliminate the headers/footers/page#'s, etc. I was quite impressed. . .and I will have to look at more of them.
Not only that, but, viewing the entire eBook effort as a 50 year process, of which I have walked 33+ years, I must state for the record that I think OCR, spellcheckers, grammarcheckers, etc. will be so much better a decade from now that doing the proofreading on the more obscure works will require so much less effort than it does today, that it will be a great trade-off.
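The accuracy estimate above can be checked with a little arithmetic. A rough Python sketch, assuming about 1,000 characters per page over 10 pages as described, with header/footer/page# errors left out of the count; the function name is hypothetical.

```python
def accuracy_percent(errors, pages, chars_per_page):
    """Character accuracy, in percent, for a sample of proofed pages."""
    total_chars = pages * chars_per_page
    return 100.0 * (1.0 - errors / total_chars)


if __name__ == "__main__":
    # Assumed sample: 10 pages of about 1,000 characters each (10K total).
    hard_only = accuracy_percent(errors=1, pages=10, chars_per_page=1000)
    hard_and_soft = accuracy_percent(errors=2, pages=10, chars_per_page=1000)
    print(f"counting only the hard error:  {hard_only:.2f}%")
    print(f"counting hard and soft errors: {hard_and_soft:.2f}%")
```

Counting only the "hard" error gives 99.99%, and counting the capitalization error as well gives 99.98%, both above the 99.95% standard.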
Which poses the question: isn't Google's approach to just scan the books today and wait, better suited to achieve the 1 million target? Every progress in OCR technology automatically "proof-reads" all books Google has scanned.
This has been the approach of all the "quick and dirty" eBook projects, certainly all those that project a million eBooks in the next 10 years. Except, of course, Project Gutenberg.

Thanks!!!

So Nice To Hear From You!

Happy Holidays!!!

Michael

Give FreeBooks!!!
In 39 Languages!!!

As of December 25, 2004
~14,815 FreeBooks at:
~185 to go to 15,000
http://www.gutenberg.org
http://www.gutenberg.net

We are ~96% of the way from 10,000 to 15,000.

Now even more PG eBooks
In 104 Languages!!!
http://gutenberg.cc
http://gutenberg.us

Michael S. Hart
<hart@pobox.com>
Project Gutenberg
Executive Coordinator
"*Internet User ~#100*"

If you do not receive a prompt reply, please resend, keep resending.