this thread explains how to digitize a book quickly and easily.
search archive.org for "booksculture00mabiuoft" for the book.
***
i forgot to tell you i fixed that bug in the search routine.
and i'm running it with the latest edited version, which --
because we're out of sequence a bit -- is grapes006.txt...
> http://zenmarkuplanguage.com/grapes111.py
> http://zenmarkuplanguage.com/grapes111.txt
everything that's returned in the "notfound" list looks ok,
which means we've finally got clean text.
what we need to do now is to add the "notfound" words
to the "specials" dictionary, and the combined list
will be _the_custom_dictionary_ for this book...
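
here's a rough sketch of that merge step, in python. it is not
lifted from the grapes scripts; the function name and the sample
words are made up for illustration, and it assumes that both
"specials" and "notfound" are plain collections of words, with
case left exactly as it was found in the text.

```python
def build_custom_dictionary(specials, notfound):
    """combine two word collections into one sorted, duplicate-free list.

    assumption: both arguments are plain iterables of strings; the
    real "specials" and "notfound" structures may well differ.
    """
    merged = set(specials)
    for word in notfound:
        word = word.strip()   # drop stray whitespace around the word
        if word:              # skip any empty entries
            merged.add(word)
    return sorted(merged)


# hypothetical sample data, not taken from the actual book
specials = ["mabie", "goethe"]
notfound = ["goethe", "wordsworthian ", "", "to-day"]

custom = build_custom_dictionary(specials, notfound)
print("\n".join(custom))
```

using a set means a word that is already in "specials" won't be
added twice, and sorting makes the final list easy to eyeball.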
***
after that, the next step is a complete visual run-through;
that is, we'll go through every page of our text and check
it against its pagescan. there are three objectives here:
1) to check that all our _formatting_ is correct throughout,
such as paragraphs, blockquotes, headers, and so on,
2) to find and insert any _styling_ that appears in the book,
such as italics, bold, and so on, and
3) to do a final clean-up q/c.
for this task, it's best to use the page-viewer:
> http://zenmarkuplanguage.com/grapes203.py
-bowerbird