
Bowerbird wrote:
jon said:
all I'm doing is suggesting that DP's scan contributors consider higher-rez/full color -- a few may choose to take this route as they assess it for themselves.
ok, that's cool. :+)
the people who do a book every now and then might consider it.
the vast majority of the books are scanned by a small group of people, who don't think of creating archival scans as something they want to do, so i don't think your suggestion will carry much weight with them.
but it's fine for you to suggest it. even better to start scanning yourself. you've got two books under your belt now. and more on the way? :+)
The two books I've scanned, plus this discussion, are helping me clarify where to go next. And, yes, there could be a lot more books being scanned as a result of this discussion, possibly as part of a multi-person effort to scan authoritative copies of many of the top 500 to 1000 classics of the Public Domain. But before jumping in and just scanning a zillion books, I'd rather plan things more carefully, to understand all the important issues, so we don't waste the effort once we do get going. And once we get going, we'll probably scan a few books, then stop and analyze what we did, and get feedback from others to make sure we are on the right track, before proceeding further. This may seem slow, but the idea is not to compete with massive scanning projects such as IA's (and private efforts like David Reed's), but to complement them for a specific purpose.
just as a quick note on your workflow -- most scanning programs will automatically name the scans, incrementing the filename as needed, so there's no need to do that manually. so if you _begin_ with page 1 (or simply reset the auto-naming basename when you get to page 1) and scan every page from there until the end of the book, a quick test for missed pages is to check whether the final filename is the right one. if it's not, you goofed. if it is, you still need to check all the scans -- you might have missed one page and scanned another one twice.
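here's a rough sketch of that end-of-book check in python, assuming the scanner auto-names pages page_0001.png, page_0002.png, and so on (a made-up scheme -- adjust to whatever your software actually produces):

```python
# a minimal sketch of the final-filename test. assumes the made-up
# page_NNNN.png naming scheme and that the book's last page is known.

import re
from pathlib import Path

EXPECTED_LAST_PAGE = 312   # assumed last page number for this book
SCAN_DIR = Path("scans")   # hypothetical folder holding the raw scans

# collect the page numbers actually present on disk
numbers = sorted(
    int(m.group(1))
    for p in SCAN_DIR.glob("page_*.png")
    if (m := re.fullmatch(r"page_(\d+)\.png", p.name))
)

if not numbers:
    print("no scans found")
elif numbers[-1] != EXPECTED_LAST_PAGE:
    print(f"final file is page_{numbers[-1]:04d}.png, "
          f"expected page_{EXPECTED_LAST_PAGE:04d}.png -- you goofed")
else:
    # the count can come out right even if you missed one page and
    # scanned another twice, so a visual pass is still needed
    print("final filename looks right -- now eyeball every scan")
```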
Good point.
you might also find it goes much faster -- if you want it to go faster -- to scan all the pages in the first pass _without_ checking the quality of each scan, and instead do that en masse after the fact. then you can go back and rescan the occasional page that needs it; in this second pass, you can also rescan any images and/or color pages.
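one cheap way to do that en-masse check, sketched in python below -- again assuming the made-up page_NNNN.png naming. it flags files whose size is far from the median, which tends to surface blank, truncated, or badly exposed pages worth rescanning (it's a heuristic, not a substitute for looking at the images):

```python
# flag scans whose file size is an outlier vs. the median --
# a rough stand-in for a page-by-page visual quality check.

import statistics
from pathlib import Path

sizes = {p.name: p.stat().st_size
         for p in sorted(Path("scans").glob("page_*.png"))}

if sizes:
    median = statistics.median(sizes.values())
    for name, size in sizes.items():
        # thresholds are arbitrary; tune them to your scanner's output
        if size < 0.5 * median or size > 2.0 * median:
            print(f"re-check {name}: {size} bytes (median {median:.0f})")
```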
Well, as part of a multi-person effort to do high-quality scans of various books, I see setting up a sort of "Distributed Scanners" where volunteers will be able to deal with scan QC, filenaming, cleanup (deskewing/cropping), cataloging per library standards (e.g. MARC records), etc. There are definitely books out there that deserve a higher level of scanning care and preservation. These scans would then be made available in various derivative forms, as well as submitted to DP for conversion to structured digital texts. Hopefully by this process the most often-used and classic Works in the PG collection (which are mostly found in the older, pre-DP portion) will be redone in a rigorous way, using reasonably authoritative public domain sources. (Some Works may even have multiple authoritative editions that could all be scanned, such as various translations of classic foreign works.)
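To make the deskewing part of that cleanup concrete, here is a rough sketch (just my own illustration, not an agreed workflow). It assumes dark text on a light page and uses the projection-profile idea: straight text lines give the strongest dark/light alternation across rows, so we rotate through small candidate angles and keep the one with the sharpest profile.

```python
# a minimal deskew sketch using Pillow and NumPy. assumes grayscale-able
# scans with dark text on a light background; the angle range and step
# are arbitrary starting points.

import numpy as np
from PIL import Image

def profile_sharpness(gray: np.ndarray) -> float:
    # variance of per-row darkness; peaks when text lines run horizontally
    return float(np.var(255 - gray.mean(axis=1)))

def deskew(path: str, max_angle: float = 3.0, step: float = 0.25) -> Image.Image:
    img = Image.open(path).convert("L")
    best_angle, best_score = 0.0, -1.0
    for angle in np.arange(-max_angle, max_angle + step, step):
        rotated = img.rotate(float(angle), fillcolor=255)
        score = profile_sharpness(np.asarray(rotated))
        if score > best_score:
            best_angle, best_score = float(angle), score
    # rotate the original by the winning angle, padding corners with white
    return img.rotate(best_angle, expand=True, fillcolor=255)
```

Cropping could then be handled afterwards by trimming the white margins of the deskewed page.

Jon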