
On 2011-12-30 22:00, Carlo Traverso wrote:
"Edward" == Edward Betts <edward@archive.org> writes:
>> The use of sans-serif proportional fonts gravely degrades the visibility of some kinds of recognition errors (I and l, i.e. uppercase i vs. lowercase L; ri vs. n, etc.), especially when the font is too large and the letters fall one above the other.
Edward> Good point, I can switch to serif.
>> I would suggest displaying and editing line by line, with a fixed-width font. Moreover, one should show the difference between a soft and a hard hyphen (a distinction on which the OCR is often hopeless, as is a corrector working on one line or one page: is it to-day or today once the lines are rejoined?)
Edward> I'm not sure about your argument for a fixed-width font. You're right about hyphens.
The point on proportional vs. fixed-width fonts is the following: some typical misrecognitions happen because some letter combinations look similar to others in a typical proportional font (e.g. n might be recognized as ri, m as rn or even rri). With a font that reproduces the typographical aspect of the original, it is easy to read "arid" as "and" if the context asks for "and", but it is much more difficult with a fixed-width font.
It might be useful to have two displays: one with a font matching the original, easier to read and to find omissions, and another with a fixed-width font, making the misrecognitions more visible. Of course, switchable with JavaScript.
Good point. I should add javascript to switch to a monospace font. -- Edward.
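
For what it's worth, the switch discussed above could be sketched roughly as follows. This is only an illustration under assumptions: the font stacks, the function names (`nextFontMode`, `applyFontMode`) and the element ids (`ocr-text`, `font-toggle`) are all hypothetical and not taken from the actual editing interface.

```javascript
// Hypothetical font stacks: "original" approximates the printed page,
// "monospace" makes confusions such as "ri" vs. "n" easier to spot.
const FONT_MODES = {
  original: 'Georgia, "Times New Roman", serif',
  monospace: '"Courier New", Courier, monospace',
};

// Return the mode to switch to, given the current one.
function nextFontMode(current) {
  return current === "monospace" ? "original" : "monospace";
}

// Apply a mode to any object exposing a DOM-like `style` property.
function applyFontMode(element, mode) {
  if (!Object.prototype.hasOwnProperty.call(FONT_MODES, mode)) {
    throw new Error("unknown font mode: " + mode);
  }
  element.style.fontFamily = FONT_MODES[mode];
  return mode;
}

// Browser wiring, assuming elements with the (hypothetical) ids
// "ocr-text" and "font-toggle" exist on the page.
if (typeof document !== "undefined") {
  const text = document.getElementById("ocr-text");
  const button = document.getElementById("font-toggle");
  if (text && button) {
    let mode = applyFontMode(text, "original");
    button.addEventListener("click", () => {
      mode = applyFontMode(text, nextFontMode(mode));
    });
  }
}
```

Keeping the mode logic separate from the DOM wiring means the toggle can be exercised without a browser; a per-line editing view could call `applyFontMode` on each line element instead of one container.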