
Bowerbird wrote:
jon said:
all I'm doing is suggesting that DP's scan contributors consider higher-rez/full color -- a few may choose to take this route as they assess it for themselves.
ok, that's cool. :+)
the people who do a book every now and then might consider it.
the vast majority of the books are scanned by a small group of people, who don't think of creating archival scans as something they want to do, so i don't think your suggestion will carry much weight with them.
but it's fine for you to suggest it. even better to start scanning yourself. you've got two books under your belt now. and more on the way? :+)
The two books I've scanned, plus this discussion, are helping me clarify where to go next. And, yes, there could be a lot more books being scanned as a result of this discussion, possibly as part of a multi-person effort to scan authoritative copies of many of the top 500 to 1000 classics of the Public Domain. But before jumping in and just scanning a zillion books, I'd rather plan things more carefully, to understand all the important issues, so we don't waste the effort once we do get going. And once we get going, we'll probably scan a few books, then stop and analyze what we did, and get feedback from others to make sure we are on the right track, before proceeding further. This may seem slow, but the idea is not to compete with massive scanning projects such as IA's (and private efforts like David Reed's), but to complement them for a specific purpose.
just as a quick note on your workflow -- most scanning programs will automatically name the scans, incrementing the filename as needed, so there's no need to do that manually. so if you _begin_ with page 1 (or simply reset the auto-naming basename when you get to page 1) and scan every page from there until the end of the book, a quick test for missed pages is to check whether the final filename is the right one. if it's not, you goofed. if it is, you still need to check all the scans -- you might have missed one page and scanned another one twice.
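here's a rough sketch of that end-of-book check in python, assuming the scanner auto-names pages page_0001.png, page_0002.png, and so on (a made-up scheme -- adjust to whatever your software actually produces):

```python
# a minimal sketch of the final-filename test. assumes the made-up
# page_NNNN.png naming scheme and that the book's last page is known.

import re
from pathlib import Path

EXPECTED_LAST_PAGE = 312   # assumed last page number for this book
SCAN_DIR = Path("scans")   # hypothetical folder holding the raw scans

# collect the page numbers actually present on disk
numbers = sorted(
    int(m.group(1))
    for p in SCAN_DIR.glob("page_*.png")
    if (m := re.fullmatch(r"page_(\d+)\.png", p.name))
)

if not numbers:
    print("no scans found")
elif numbers[-1] != EXPECTED_LAST_PAGE:
    print(f"final file is page_{numbers[-1]:04d}.png, "
          f"expected page_{EXPECTED_LAST_PAGE:04d}.png -- you goofed")
else:
    # the count can come out right even if you missed one page and
    # scanned another twice, so a visual pass is still needed
    print("final filename looks right -- now eyeball every scan")
```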
Good point.
you might also find it goes much faster -- if you want it to go faster -- to scan all the pages in the first pass _without_ checking the quality of each scan, and instead do that en masse after the fact. then you can go back and rescan the occasional page that needs it; in this second pass, you can also rescan any images and/or color pages.
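one cheap way to do that en-masse check, sketched in python below -- again assuming the made-up page_NNNN.png naming. it flags files whose size is far from the median, which tends to surface blank, truncated, or badly exposed pages worth rescanning (it's a heuristic, not a substitute for looking at the images):

```python
# flag scans whose file size is an outlier vs. the median --
# a rough stand-in for a page-by-page visual quality check.

import statistics
from pathlib import Path

sizes = {p.name: p.stat().st_size
         for p in sorted(Path("scans").glob("page_*.png"))}

if sizes:
    median = statistics.median(sizes.values())
    for name, size in sizes.items():
        # thresholds are arbitrary; tune them to your scanner's output
        if size < 0.5 * median or size > 2.0 * median:
            print(f"re-check {name}: {size} bytes (median {median:.0f})")
```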
Well, as part of a multi-person effort to do high-quality scans of various books, I see setting up a sort of "Distributed Scanners" where volunteers will be able to deal with scan QC, filenaming, cleanup (deskewing/cropping), cataloging per library standards (e.g. MARC records), etc. There are definitely books out there that deserve a higher level of scanning care and preservation. These scans would then be made available in various derivative forms, as well as submitted to DP for conversion to structured digital texts. Hopefully by this process the most often-used and classic Works in the PG collection (which are mostly found in the older, pre-DP portion) will be redone in a rigorous way, using reasonably authoritative public domain sources. (Some Works may even have multiple authoritative editions that could all be scanned, such as various translations of classic foreign works.)
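To make the deskewing part of that cleanup concrete, here is a rough sketch (just my own illustration, not an agreed workflow). It assumes dark text on a light page and uses the projection-profile idea: straight text lines give the strongest dark/light alternation across rows, so we rotate through small candidate angles and keep the one with the sharpest profile.

```python
# a minimal deskew sketch using Pillow and NumPy. assumes grayscale-able
# scans with dark text on a light background; the angle range and step
# are arbitrary starting points.

import numpy as np
from PIL import Image

def profile_sharpness(gray: np.ndarray) -> float:
    # variance of per-row darkness; peaks when text lines run horizontally
    return float(np.var(255 - gray.mean(axis=1)))

def deskew(path: str, max_angle: float = 3.0, step: float = 0.25) -> Image.Image:
    img = Image.open(path).convert("L")
    best_angle, best_score = 0.0, -1.0
    for angle in np.arange(-max_angle, max_angle + step, step):
        rotated = img.rotate(float(angle), fillcolor=255)
        score = profile_sharpness(np.asarray(rotated))
        if score > best_score:
            best_angle, best_score = float(angle), score
    # rotate the original by the winning angle, padding corners with white
    return img.rotate(best_angle, expand=True, fillcolor=255)
```

Cropping could then be handled afterwards by trimming the white margins of the deskewed page.

Jon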