james said:
>   I started doing the page-at-a-time thing and gave up.
>   Your pages are already better than mine because
>   I used Tesseract and archive.org uses ABBYY FineReader.

except the o.c.r. from archive.org, in this case, is screwed up.
it's missing its em-dashes.  i've dealt with this problem before,
and it's less work to re-do the o.c.r. than to fix the em-dashes.

will someone with a good version of abbyy please re-do this o.c.r.?


>   2).  This book really requires a way to enter UTF-8 characters.

if someone does the o.c.r. for you, they can specify utf8 output...


>   If I could just stick a circumflex above a's, u's, and i's
>   (both lower and upper case) that would be 99% of what I need.

if you can pull out a list of the words that require circumflexes,
we can create a script that does a global change in one swoop.
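
here is a rough sketch of what such a script might look like, in python.
the word list is purely hypothetical (i have not seen the list for this
book), and the names CIRCUMFLEX_WORDS and add_circumflexes are made up
for illustration.  it only touches whole words, and it tries to keep
the capitalization of the original.

```python
import re

# hypothetical word list: plain o.c.r. spelling -> spelling with circumflex.
# the real list would be pulled from the book itself.
CIRCUMFLEX_WORDS = {
    "chateau": "château",
    "diner": "dîner",
    "flute": "flûte",
}


def add_circumflexes(text, words=CIRCUMFLEX_WORDS):
    """replace every whole-word occurrence of a listed word, keeping its case."""
    if not words:
        return text

    # one pattern for the whole list, so the text is scanned a single time
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )

    def fix(match):
        found = match.group(0)
        fixed = words[found.lower()]
        if found.isupper():            # ALL CAPS
            return fixed.upper()
        if found[0].isupper():         # Capitalized
            return fixed[0].upper() + fixed[1:]
        return fixed                   # lower case

    return pattern.sub(fix, text)
```

you would read the file as utf8, run the text through add_circumflexes(),
and write it back out as utf8.  words that need a circumflex only some of
the time would have to stay off the list and be fixed by hand.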


>   (after de-hyphenating

do not dehyphenate!  the program will do that for you.
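
to give an idea of why this is easy to automate, here is a minimal
sketch in python.  it is not the actual program, just an illustration,
and the name dehyphenate is made up.  it assumes an end-of-line hyphen
always means a split word, which is wrong for real hyphens like
"well-known", so a real tool needs a dictionary check on top of this.

```python
import re


def dehyphenate(text):
    """join words split by an end-of-line hyphen, moving the tail up a line."""
    # group 1: first half of the word, before the hyphen and line break
    # group 2: second half, plus any punctuation stuck to it
    # the joined word stays on the first line; the rest of the second
    # line stays where it was.
    return re.sub(r"(\w+)-\n(\w+\S*)[ \t]*\n?", r"\1\2\n", text)
```

because the hyphens are still in the saved text, anyone can re-run
or improve this step later, which is the reason to leave them alone.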


>   re-wrapping

do not rewrap!  if you need to rewrap, the program can do it.

rewrapping is evil.  it just makes it harder for the next guy...

-bowerbird