jose menendez has created a "digital reprint"
of the "my antonia" scan-set as a .pdf:
> http://www.ibiblio.org/ebooks/Cather/
aside from the facts that he created it using
ms-word and delivered it as a pdf, this demo
is remarkable in that it _replicates_ the scans
but does so with _digital_text_ (which can be
copied and searched), and only uses 2 megs
instead of the 30+ required by the scanset.
in addition, the text is completely clean, with
none of the faded or incomplete characters
that are all-too-common with the scans...
you'll note that -- in order to replicate scans --
it's necessary to retain the original linebreaks,
so i once again request that p.g./d.p. do that.
it's just a shame to throw meaningful info away.
also note that the .pdf file uses _soft_hyphens_
in those words that were end-line hyphenated;
these soft-hyphens can be eliminated to rejoin
the words so that searches are done correctly
and so that the text can be reflowed if desired.
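
here is a minimal sketch of that rejoining step, in python.
it is not how jose built his file. it assumes the text has
already been pulled out of the .pdf into a plain string, that
each soft-hyphen arrives as the unicode character u+00ad, and
that paragraphs are separated by blank lines. the function
names are my own, made up for illustration.

```python
SOFT_HYPHEN = "\u00ad"  # unicode soft hyphen (assumed to survive extraction)

def rejoin_words(text: str) -> str:
    """remove soft-hyphens so end-line hyphenated words become whole.

    a soft-hyphen followed by a linebreak is dropped together with
    that linebreak; any other stray soft-hyphen is simply dropped.
    all other linebreaks are left where they were.
    """
    return text.replace(SOFT_HYPHEN + "\n", "").replace(SOFT_HYPHEN, "")

def reflow(text: str) -> str:
    """rejoin words, then collapse each paragraph onto a single line.

    blank lines are treated as paragraph breaks and are kept.
    """
    paragraphs = rejoin_words(text).split("\n\n")
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)

sample = "the coun\u00ad\ntry was wide\nand open"
print(rejoin_words(sample))  # searchable, other linebreaks kept
print(reflow(sample))        # one line per paragraph
```

the point is that the line-preserving text is the richer form:
the reflowed version can always be derived from it, as above,
but the original linebreaks cannot be recovered from reflowed text.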
this demo can be improved by removing the
proprietary ms-word from the process, but
nonetheless this .pdf demonstrates a crucial step
in the process of "round-tripping", namely
being able to replicate the look of a scan-set.
"round-tripping" wasn't jose's objective, but
his example is instructive anyway...
this ability of text to morph effortlessly between
the frozen state of 20th-century paper-book and
the fluidity of the 21st-century electronic-book is
exactly what we should strive for in our efforts...
i'll be posting more on the topic of "round-tripping"
as i do further work expanding on jose's demo, but
i wanted to give you a heads-up on it right away...
-bowerbird