
Jon Ingram wrote:
Dave Fawthrop wrote:
Someone else wrote (attribution lost, was it Juhana?):
However you archive the scans, please avoid 1-bit scans. 16 grey levels with 200 dpi would be better than 1-bit with 600 dpi in my experience.
As this is to be, in part, a retroactive exercise, we will have to use whatever exists.
Indeed. The vast majority of scanning done for DP has been bitonal 300 dpi. This provides perfectly adequate images for OCR. While low-dpi grayscale images may look prettier, higher resolution black-and-white images generally OCR better, as well as taking up much less disk space.
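
To put rough numbers on the disk-space point, here is a back-of-envelope sketch of the raw (uncompressed) size of one page scan at the settings mentioned in this thread. The 6 x 9 inch page and the function name are my own assumptions for illustration, and real archives are stored compressed, so actual file sizes will differ from these figures.

```python
import math

def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of a scan, ignoring headers and row padding."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return math.ceil(pixels * bits_per_pixel / 8)

# Hypothetical page size in inches (an assumption, not a figure from DP).
PAGE = (6, 9)

bitonal_300 = raw_scan_bytes(*PAGE, dpi=300, bits_per_pixel=1)  # 1-bit, 300 dpi
grey16_200 = raw_scan_bytes(*PAGE, dpi=200, bits_per_pixel=4)   # 16 grey levels = 4 bits
bitonal_600 = raw_scan_bytes(*PAGE, dpi=600, bits_per_pixel=1)  # 1-bit, 600 dpi

print(f"300 dpi bitonal:      {bitonal_300:>9,} bytes")
print(f"200 dpi, 16 greys:    {grey16_200:>9,} bytes")
print(f"600 dpi bitonal:      {bitonal_600:>9,} bytes")
```

Before any compression, then, the 200 dpi 16-level scan of this hypothetical page is a bit under twice the size of the 300 dpi bitonal one, and the 600 dpi bitonal scan is four times the 300 dpi one.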
Does the 300 dpi bitonal rule apply for OCRing all text found in all books? What's the smallest point size text which 300 dpi bitonal will still allow reasonably accurate OCR (at least sufficient for the DP process)?

Jon
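
P.S. For a sense of scale behind the point-size question, the sketch below converts a type size in points to the number of pixel rows its em height covers at a given resolution, using 72 points to the inch. The function name and the sample sizes are my own choices for illustration. It says nothing about where OCR accuracy actually falls off, which is the part I am asking about.

```python
POINTS_PER_INCH = 72  # the usual DTP point; an assumption about the unit used

def em_height_pixels(point_size, dpi):
    """Height in pixels of the em square for a given point size and scan resolution."""
    return point_size * dpi / POINTS_PER_INCH

# Sample sizes chosen only for illustration.
for pt in (6, 8, 10, 12):
    print(f"{pt:>2} pt at 300 dpi: {em_height_pixels(pt, 300):.1f} px")
```

The letterforms themselves are smaller than the em square, so the strokes of small footnote type get noticeably fewer pixels than these figures suggest.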