
Michael Hart wrote:
Google's monster speciality is SEARCH ENGINES!!!
They are MUCH more interested in writing a search engine that will read fuzzy OCR text than in increasing the accuracy of the text.

You mean a search engine that finds "I)arwin" when I search for "Darwin"? That search engine would have to automagically decide that "I)" looks much the same as "D". But that is exactly what OCR software already does: it matches characters against ink stains. If they come up with a better algorithm for that, they would be foolish not to apply it directly to the scanned texts. They have to keep the OCRed text of their books somewhere. Cleaning up that text once would take far fewer cycles than having the search engine do a fuzzy match every time a user runs a search.

-- 
Marcello Perathoner
webmaster@gutenberg.org
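
P.S. A rough sketch of the "clean once, then search exactly" idea, in Python. It is only an illustration under stated assumptions: the confusion table, the tiny vocabulary and all function names are made up for this example, and nothing here describes what Google or any OCR package actually does. It undoes at most one glyph confusion per word and ignores punctuation.

```python
# Hypothetical table of OCR confusions: misread glyph sequence -> intended character.
CONFUSIONS = {"I)": "D", "|)": "D", "rn": "m", "vv": "w"}


def candidates(token):
    """Yield the token itself, then every variant with one confusion undone."""
    yield token
    for wrong, right in CONFUSIONS.items():
        start = token.find(wrong)
        while start != -1:
            yield token[:start] + right + token[start + len(wrong):]
            start = token.find(wrong, start + 1)


def clean_token(token, vocabulary):
    """Return the first candidate found in the vocabulary, else the token unchanged."""
    for cand in candidates(token):
        if cand in vocabulary:
            return cand
    return token


def clean_text(text, vocabulary):
    """Clean every whitespace-separated token; this runs once per stored text."""
    return " ".join(clean_token(t, vocabulary) for t in text.split())


def build_index(text):
    """Map each token to the list of positions where it occurs."""
    index = {}
    for pos, tok in enumerate(text.split()):
        index.setdefault(tok, []).append(pos)
    return index


# Assumed vocabulary and sample text, for illustration only.
VOCAB = {"Darwin", "wrote", "about", "modern", "finches"}
raw = "I)arwin wrote about finches"

# The cleanup cost is paid once, when the index is built.
index = build_index(clean_text(raw, VOCAB))

# Every later search is a plain exact lookup, with no fuzzy matching.
print(index.get("Darwin"))
```

Because the original word is always tried first, a correct word such as "modern" is left alone even though it contains "rn". A real cleanup pass would need a far larger confusion table and dictionary, but the cost argument stays the same: the expensive guessing happens once per text instead of once per query.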