
"Marcello" == Marcello Perathoner <marcello@perathoner.de> writes:

    Marcello> I'm working on a similar project but I've opted for a
    Marcello> desktop application. I think that every proofing
    Marcello> application should do 99% of the work automatically and
    Marcello> pass only the 1% of dubious cases to the human
    Marcello> proofer.

How are you supposed to recover when the application is absolutely
sure of being right, but is in fact wrong? I see this often enough,
especially with accents and punctuation (for example, comma/period
errors triggered by a capitalized word in the middle of a sentence).

    Marcello> .........................

    Marcello> I wonder if putting a white-on-transparent OCR overlay
    Marcello> on top of the scanned layer lets you find errors just by
    Marcello> looking at the black spots.

Quite often the original print contains, for example, a broken l that
looks like an i, or a fuzzy i that looks like an l. The OCR will be
wrong, and the overlay will look OK. There are also many cases in
which the overlaid character completely covers a different printed
one (for example, a printed . covered by any of , ! ? : ; or a printed
unaccented letter covered by the same letter with an accent), so no
black spot remains to reveal the error.

Moreover, this would require that the overlay font match the font of
the original exactly, in both shape and size. Most of the unevenness
in the display at edwardbetts.com comes from an imperfect match
between fonts and image.

I see the value of the application as a replacement for the pgdp
proofreading interface, with solid synchronization of image and text;
this was one of the features of fadedpage that worked only
occasionally, whereas it works very well at edwardbetts.com.

Of course, it will work with any correction workflow, and this
reusability is an excellent feature.

Carlo