Re: [gutvol-d] Scan file naming -- another comment

23 Jul 2005

On 7/22/05, Jon Noring <jon@noring.name> wrote:
...
One question of Robert and the other DPers: how were blank pages handled?
If it has a page number (or fits into the page sequence) I scan it. If
it is the verso or obverse of an unnumbered illustration, I skip it. The
goal is to capture the content; grabbing an unnumbered blank page
doesn't accomplish much.
...
As an aside, I've always thought the best system would be to separate
the DP proofing system from the scanning portion. In essence, to set up
a separate (autonomous) "Distributed Scanners" which will encourage
the scanning of older books, set minimum quality requirements, QC,
standardized cataloging (possibly MARC-XML), clean up the scans to
form working sets (deskewing, cropping, color depth reduction, etc.),
and do so in a semi-distributed environment akin to DP. Then the work
product would be archived at IA (with public access to some of the
derivative scan sets if not the masters). And of course DS would
generate a derivative scanset optimized for DP's process. If the
system works well, DP could encourage submitters to go through the
DS system for submitting scans.
Frankly, I think I lean more towards the other DPers.. the scan images
are a means to an end; the end being a correct, well-formatted eBook.
HTML, especially, is very flexible in output formats.. I can view it
anywhere from a graphing calculator to a Sun Workstation, or run it
through a TTS engine and listen to it. Page images are a useful
reference to compare the etext against, but are not very flexible.

Now, in general, I _would_ like to see an improvement in the average
image quality of included illustrations. I personally preprocessed all
of the Potter illustrations (main reason I haven't finished is I don't
have the time right now to do it right) before uploading them.. but it
is a lot of work, and a skill that takes practice to get right. I
still consider myself only an intermediate Photoshop user.

R C

Robert Cicconetti