
Google will eventually have error-free text throughout, which is what I've said all along. It is fairly easy to see. And the 50,000-item P.G. library will merely be "quaint," a testament to an outdated workflow that wouldn't scale.
As someone who has actually worked professionally on a number of different recognition systems, I respectfully disagree with your prediction of the future. No OCR system will ever be "error free."

You ask, "why would you 'imagine' such a system, when you could find it described perfectly, with proof-of-concept demos, in the posts on this very listserv going back many years?" But proof-of-concept demos are themselves a form of "imagining" rather than a reality, as anyone who has actually implemented such demos knows well. "Demos" often remain "demos" forever, because turning demos into real-world products accepted by real-world users is so danged hard.

I would be curious to see which actually gets more reads: PG's "quaint" collection of 50,000 items, as distributed by PG and hundreds of other sites, or Google's millions of OCR texts. I read both, but personally I read PG more. And I am an agnostic omnivore, not "PG-centric," when it comes to reading; I even read some of the crap Murdoch publishes.