
re: Jon Noring reply to "Re: !@!@!RE: [gutvol-d] Perfection"

Jon brings up several points that lie somewhere between the past and the future, and obviously he has a different view of when each of these events might be placed on the calendar.

The obvious point right now is whether Project Gutenberg should be doing several possible editions of each eBook, or should be comparing several different editions and creating our own edition that we hope will eventually be better than any of the previous paper editions.

Jon says we should be doing separate editions, due to advances in disk space, download speed, and the time when Distributed Proofing will be doing 1,000 eBooks per day.

*

If we presume this is going at a rate of about 10 per day [we are at just about 11 per day in reality] and that this rate should be doubling at Moore's Law rates, then we would have this scenario:

Bks/Day   Years   Date

    10      0     2004
    20
    40      3     2007
    80
   160      6     2010
   320
   640      9     2013
   1K+     10     2014+

I agree that once all of these advances have been integrated into the world of 75% to 90% of even our own portion of the Internet for several years [enough time to do our first eBooks of most of these books], it will certainly be time to start including variant editions, as we have already done with some of the great works such as those of Shakespeare, Dante, the Bible, etc.

In fact, my own estimate of when we will have 1,000,000 eBooks certainly falls within the period of Jon's suggested 1,000 per day. By that time, we will probably be finding it harder and harder to track down all the editions we have yet to do, and it will be a matter of very good timing to start in on creating all variants of the editions Jon wants us to have.

Hopefully by this time, OCR will be so accurate that the dream of simply using it as one would use a Xerox machine will be closer to reality.

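For anyone who wants to check the arithmetic in the scenario above, here is a minimal sketch in Python. It assumes "Moore's Law rates" means one doubling every 18 months, which is what the table implies (10 per day at year 0, 40 per day at year 3). The function names and the 1,000-per-day target parameter are only illustrative.

```python
import math

START_YEAR = 2004
START_RATE = 10        # books per day, the round figure used in the scenario
DOUBLING_YEARS = 1.5   # assumption: one doubling every 18 months


def projected_rate(years_elapsed):
    """Books per day after years_elapsed years of steady doubling."""
    return START_RATE * 2 ** (years_elapsed / DOUBLING_YEARS)


def years_to_reach(target_rate):
    """Years of steady doubling needed to go from START_RATE to target_rate."""
    return DOUBLING_YEARS * math.log2(target_rate / START_RATE)


def schedule(target_rate=1000):
    """Return (rate, years, date) rows, one per doubling, until target_rate is passed."""
    rows = []
    step = 0
    while True:
        years = step * DOUBLING_YEARS
        rate = projected_rate(years)
        rows.append((round(rate), years, START_YEAR + years))
        if rate >= target_rate:
            return rows
        step += 1


if __name__ == "__main__":
    for rate, years, date in schedule():
        print(f"{rate:>6} {years:>5.1f} {date:>7.1f}")
    print(f"1,000 per day after about {years_to_reach(1000):.2f} years")
```

Under that assumption the 1,000 per day mark is crossed a little before the ten-year point, which agrees with the "1K+" row of the table; a different doubling period would shift every date accordingly.
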
In the interim, perhaps we can simply make available various eBook editions that do and don't include corrections of typos, missing words, lines, paragraphs, etc. This, along with preservation of the original scans, should allow for timely revision of any and all eBooks we produce.

With the aid of various "diff" and "compare" programs, editors can even proofread the same eBook into the various composite or non-composite editions Jon suggests we should have.

Anyone who wishes to volunteer to assist Jon in his efforts should let us know, and we will work up a listserver and other support for this effort.

Michael S. Hart

P.S. The day should eventually come when such efforts are no longer required at the human level, and Jon can simply scan and OCR each separate edition with a sufficient level of accuracy that it could either stand immediately on its own, or do so with only a small amount of human intervention . . . less effort than it may take to work from a previous scan of a different paper variant.
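As a rough illustration of the "diff" and "compare" approach mentioned above, here is a minimal sketch using Python's standard difflib module. It is not a description of any tool Project Gutenberg actually uses; the function names and the two sample texts are hypothetical, made up only to show how a corrected and an uncorrected edition could be compared line by line.

```python
import difflib


def compare_editions(text_a, text_b, name_a="edition_a", name_b="edition_b"):
    """Return a unified diff of two editions as a list of lines."""
    return list(
        difflib.unified_diff(
            text_a.splitlines(),
            text_b.splitlines(),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )


def changed_lines(text_a, text_b):
    """Return only the lines removed from A ('-') or added in B ('+')."""
    diff = compare_editions(text_a, text_b)
    # The first two lines of a unified diff are the '---' and '+++' file headers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


if __name__ == "__main__":
    # Hypothetical sample texts: an uncorrected edition and a corrected one.
    uncorrected = "It was the best of times,\nit was the wurst of times,\nit was the age of wisdom."
    corrected = "It was the best of times,\nit was the worst of times,\nit was the age of wisdom."
    for line in changed_lines(uncorrected, corrected):
        print(line)
```

An editor could run a comparison like this between any two editions and review only the lines that differ, deciding case by case which reading belongs in a composite edition and which should stay in a separate one.
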