
On 7/19/05, Juhana Sadeharju <kouhia@nic.funet.fi> wrote:
* The original print is more pleasant to read than the ascii or html text.
In some cases, but that generally indicates you're handling it wrong. In other cases, the Gutenberg edition may be the only easily legible version of the work, or the only one not set in black-letter type; both are very common when you're working with books available only as facsimile reprints of 16th-18th century copies.
* Text with figures is better in its original layout.
* Math text is better in its original print than in the TeX or math-html equivalent.
Typewritten text with equations added in pen is better than TeX? I think there are good reasons why Knuth made TeX.
However you archive the scans, please avoid 1-bit scans. 16 grey levels with 200 dpi would be better than 1-bit with 600 dpi in my experience.
That's what the OCR program likes. Distributed Proofreaders are very likely to continue producing B&W 300 dpi scans in most cases for the near future.
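
For a rough sense of what the three scan modes mentioned here cost in raw data, a minimal sketch follows. It assumes a US Letter page (8.5 x 11 in) and counts uncompressed bits only; the page size and the helper name raw_bits are assumptions for illustration, not anything from this thread, and real files will differ once compression is applied.

```python
def raw_bits(dpi, bits_per_pixel, width_in=8.5, height_in=11.0):
    """Uncompressed size in bits of one scanned page (page size is an assumption)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel


# 16 grey levels need 4 bits per pixel (2**4 == 16); B&W needs 1.
modes = {
    "1-bit, 600 dpi": (600, 1),
    "16 greys, 200 dpi": (200, 4),
    "1-bit, 300 dpi": (300, 1),
}

for name, (dpi, bpp) in modes.items():
    kib = raw_bits(dpi, bpp) / 8 / 1024
    print(f"{name}: {kib:.0f} KiB uncompressed")
```

Under these assumptions the 200 dpi greyscale page is smaller than the 600 dpi 1-bit page before compression, and the 300 dpi B&W page is the smallest of the three. This says nothing about legibility or OCR accuracy, only about raw size.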