so, while you guys have yet another discussion,
i'm doing some work on "pride and prejudice"...

***

here's the version i'm using, from archive.org:
>   http://archive.org/details/harvardclassicss03elio

***

you'll find my work in this folder on my site:
>   http://zenmagiclove.com/prhpr

***

the original o.c.r., from archive.org, is here:
>   http://zenmagiclove.com/prhpr/prhpr-000.zml

subsequent edits of that file bump up the filename:
>   http://zenmagiclove.com/prhpr/prhpr-001.zml
>   http://zenmagiclove.com/prhpr/prhpr-002.zml
>   http://zenmagiclove.com/prhpr/prhpr-003.zml

the first two of those ("001" and "002") were just to
get the text into my standard page-separated skeleton,
which marks the beginning of each page of o.c.r. text with
a line that has _double-braces_ with the scan-name inside.

the ending of each page is signified by a line with
_brackets_ that hold the page-number of the page.
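
for anyone who wants to poke at that skeleton with code,
here is a minimal sketch in python of how it could be read.
the exact marker shapes are my assumption for illustration
(a line like {{scan-name}} to open a page, and a line like
[page-number] to close it), and "split_pages" is just a
hypothetical helper name, not something from the real files.

```python
import re

# assumed marker shapes (check them against the actual .zml files):
#   page start:  {{scan-name}}    on a line by itself
#   page end:    [page-number]    on a line by itself
PAGE_START = re.compile(r"^\{\{(.+?)\}\}\s*$")
PAGE_END = re.compile(r"^\[(.*?)\]\s*$")


def split_pages(text):
    """return a list of (scan_name, page_number, lines) tuples."""
    pages = []
    scan_name = None   # None means "not currently inside a page"
    body = []
    for line in text.splitlines():
        start = PAGE_START.match(line)
        end = PAGE_END.match(line)
        if start:
            scan_name = start.group(1)
            body = []
        elif end and scan_name is not None:
            # page numbers stay strings, since front matter may use roman numerals
            pages.append((scan_name, end.group(1), body))
            scan_name = None
        elif scan_name is not None:
            body.append(line)
    return pages


# made-up sample text, only here to exercise the function
sample = """{{scan0001}}
first line of page one
second line of page one
[1]
{{scan0002}}
only line of page two
[2]
"""

for scan, number, lines in split_pages(sample):
    print(scan, number, len(lines))
```

note that a body line which happens to sit alone inside
brackets would be mistaken for a page-ending in this sketch,
so a real tool would need a stricter test for that marker.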

***

"prhpr-003.zml" was the first file where i did edits.
these consisted primarily of global-type changes...
>   http://zenmagiclove.com/prhpr/prhpr-003.zml

here is a program which documents the changes
that were made from the "002" to the "003" files.
>   http://zenmagiclove.com/prhpr/docomp23.py

as you can see, if you run that script, there were
7,607 lines edited by these global-type changes.
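
if you'd rather not run someone else's script, here is a
rough sketch of how two versions could be compared, using
python's standard "difflib" module. this is not the code in
"docomp23.py", just one assumed way to list changed lines,
and "changed_lines" is a hypothetical name. it might well
count changes differently than the figure quoted above.

```python
import difflib


def changed_lines(old_lines, new_lines):
    """yield (old, new) pairs for every line that differs between two versions."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        old_chunk = old_lines[i1:i2]
        new_chunk = new_lines[j1:j2]
        # pad the shorter side with None so inserted or deleted lines still show up
        width = max(len(old_chunk), len(new_chunk))
        old_chunk += [None] * (width - len(old_chunk))
        new_chunk += [None] * (width - len(new_chunk))
        yield from zip(old_chunk, new_chunk)


# tiny made-up example; for real use, read the two .zml files
# and pass in the result of .splitlines() for each one
old = ["mr. bennet", "tbe house", "a fine day"]
new = ["mr. bennet", "the house", "a fine day", "extra"]

pairs = list(changed_lines(old, new))
for before, after in pairs:
    print(repr(before), "->", repr(after))
print(len(pairs), "lines changed")
```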

***

i've now done more edits, creating another version:
>   http://zenmagiclove.com/prhpr/prhpr-004.zml

most of these changes were of a "global" style, but
i also fixed idiosyncratic errors when i spotted 'em.

as before, you can view all lines that were changed:
>   http://zenmagiclove.com/prhpr/docomp34.py

as shown, 515 lines changed from "003" to "004".

***

also, like yesterday, we can convert the .zml to .html:
>   http://zenmagiclove.com/prhpr/prhpr-004h.html

this "entire-book-on-one-page" is in the style of
the way most of the books from pg/dp are made,
manifesting the "scroll" methodology of the web...

the fact that you can get a rather-nicely formatted
e-book using barely-reworked o.c.r. output would
-- i believe -- be somewhat surprising to many of
the post-processors from d.p., who seem to believe
that this is work which requires a modicum of effort.

(but note that the text remains unrefined o.c.r., and
all the inline text-styling has yet to be applied.)

***

the new twist for today is that we have also spit out
files that show each page of text alongside its scan:
>   http://zenmagiclove.com/prhpr/prhprp222.html

you will have to adjust the zoom-level and the size
of your browser-window for the best presentation.
(advanced versions of this will give more control,
using javascript to customize the elements.)

and of course, these pages connect with each other,
via the "prev" and "next" links on each one. (you can
also advance to the next page by clicking the scan.)
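
to make the mechanics concrete, here is a small sketch in
python of how such a page could be generated. it is not the
code behind my pages. the filename rule (taken from the look
of "prhprp222.html"), the scan url, and the names
"page_filename" and "render_page" are all assumptions
made up for the sake of illustration.

```python
from html import escape


def page_filename(number):
    # mirrors the look of "prhprp222.html"; the zero-padding is an assumption
    return f"prhprp{number:03d}.html"


def render_page(number, text, scan_url, first, last):
    """build one html page showing the o.c.r. text beside its scan."""
    prev_href = page_filename(number - 1) if number > first else None
    next_href = page_filename(number + 1) if number < last else None

    prev_link = f'<a href="{prev_href}">prev</a>' if prev_href else "prev"
    next_link = f'<a href="{next_href}">next</a>' if next_href else "next"

    scan = f'<img src="{escape(scan_url, quote=True)}" alt="scan of page {number}">'
    if next_href:
        # clicking the scan also advances to the next page
        scan = f'<a href="{next_href}">{scan}</a>'

    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>page {number}</title></head>\n'
        "<body>\n"
        f"<p>{prev_link} | {next_link}</p>\n"
        '<div style="display: flex; gap: 1em;">\n'
        f"<pre>{escape(text)}</pre>\n"
        f"{scan}\n"
        "</div>\n"
        "</body></html>\n"
    )


# hypothetical inputs: the text, scan path, and page range are invented
page = render_page(222, "some o.c.r. text & such", "scans/0222.png", 1, 400)
print(page)
```

the text goes in a "pre" element so the o.c.r. line-breaks
line up with the lines in the scan, which is the whole point
of putting the two side by side.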

this is the "design" you're used to seeing from me...
a paginated display like this one lends itself more to
smoothreading or, as i call it, "continuous proofing".

but we are not ready for smoothreading yet, because
we still have bad o.c.r. text that needs to be corrected.

more later...

-bowerbird

p.s. did anyone notice the thing that i hinted about?