modern methods of book composition

andre said:
The Internet Archive has 3 copies from 3 different sources.
i see 4. but who's counting?

and of the ~15,000 lines in this book, approximately 9,000 have identical o.c.r. across all 4 digitizations, with another 3,000 that have identical o.c.r. across 3 of the 4 digitizations, meaning that the "problem lines" can be boiled down to a fairly small subset...

i'm busy with another project right now (thanks, alpha-testers!), so if someone else wants to do that research, go ahead please...

-bowerbird

and of the ~15000 lines in this book, approximately 9,000 have identical o.c.r. across all 4 digitizations, with another 3,000 that have identical o.c.r. across 3 of the 4 digitizations, meaning that the "problem lines" can be boiled down to a fairly small subset...
Or one can actually go to archive.org, take a look at the OCR text files, and evaluate for oneself how much work this is, or isn't, going to be. [I'm not exactly sure how BB comes up with his measures!]
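
For anyone who wants to check figures like these, here is a minimal sketch of one way such agreement counts could be computed. It is not necessarily how the numbers quoted above were produced. It assumes the OCR texts have already been downloaded and split into lines, and that the lines are aligned by position across the digitizations, which real OCR output usually is not without a separate alignment step. The function name `agreement_histogram` and the sample lines are made up for illustration.

```python
from collections import Counter


def agreement_histogram(versions):
    """Count, per line position, how many digitizations share the most common reading.

    versions: list of OCR outputs, each a list of lines.
    ASSUMPTION: line N of every version corresponds to the same printed line.
    Returns a Counter mapping "size of the largest agreeing group" -> "number of lines".
    """
    histogram = Counter()
    for readings in zip(*versions):
        # Ignore leading/trailing whitespace, which varies between OCR runs.
        tally = Counter(reading.strip() for reading in readings)
        largest_group = tally.most_common(1)[0][1]
        histogram[largest_group] += 1
    return histogram


# Made-up sample: 4 digitizations of the same 3 printed lines.
sample = [
    ["the first line", "the second line", "the third line"],
    ["the first line", "the second line", "the third line"],
    ["the first line", "the sec0nd line", "the th1rd line"],
    ["the first line", "the second line", "tbe third line"],
]

result = agreement_histogram(sample)
for group_size in sorted(result, reverse=True):
    print(f"{result[group_size]} line(s) agree across {group_size} of {len(sample)} versions")
```

Lines where the largest agreeing group is smaller than 3 would be the "problem lines" that need a human to look at the page scan.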
participants (2)
- Bowerbird@aol.com
- Jim Adcock