these grapes are sweet -- lesson #24

this thread explains how to digitize a book quickly and easily. search archive.org for "booksculture00mabiuoft" for the book.

*** first, let's clear up that loose end about the final spellcheck... to recall, we approached our unique dictionary with grapes116:
http://zenmarkuplanguage.com/grapes116.py http://zenmarkuplanguage.com/grapes116.txt
you'll find it as the long list of words at the end of the output. we saved that file here:
http://zenmarkuplanguage.com/grapes.all.unique.dictionary.txt
now we essentially repeat grapes116.py with grapes120.py:
http://zenmarkuplanguage.com/grapes120.py http://zenmarkuplanguage.com/grapes120.txt
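
for readers who would rather not open the file, here is a rough sketch of the kind of check this last pass performs: every word in the book is looked up in a single word list, part of it matched case-sensitively and part case-insensitively, and anything that matches neither is reported. this is not grapes120.py itself. the function name, the way words are split out of the text, and the sample words are all assumptions made for illustration.

```python
import re

def find_notfound(text, specials, regulars):
    """return the sorted unique words in text that neither list covers.

    specials are matched case-sensitively, regulars case-insensitively.
    (hypothetical helper, not taken from grapes120.py.)
    """
    specials = set(specials)
    regulars_lower = {w.lower() for w in regulars}
    notfound = set()
    # assumption: a "word" is a run of letters, with optional inner apostrophes
    for word in re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text):
        if word in specials:                  # exact-case match
            continue
        if word.lower() in regulars_lower:    # any-case match
            continue
        notfound.add(word)
    return sorted(notfound)

# made-up sample data, just to show the two kinds of matching
specials = ["I", "V", "XI", "Mabie's"]
regulars = ["books", "and", "culture", "chapter", "the"]
text = "Chapter XI: Books and Culture. chapter xi, Mabie's books and teh culture."
print(find_notfound(text, specials, regulars))
```

in this sample, "XI" passes because it is a special, while the lowercase "xi" and the typo "teh" are reported. an empty list is the signal that the dictionary covers every word in the text.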
the difference is, we're no longer using the regular dictionary, the "specials" dictionary, or the dictionary with british words... (if you look, you will see that they have been commented out.) so we _only_ use the all.unique dictionary -- words in this book. (we split it up into a "specials" part and a "regular" part, because _specials_ are case-sensitive, while regulars are case-insensitive.)

now, the first time (or two, or three) that you run grapes120.py, there still might turn up a "notfound" word (or two, or three), but you just add 'em to all.unique and run the script again, until you finally get an output that tells you that no words were "notfound". that's the signal that your all.unique.dictionary is now complete...

(if you're curious about what didn't turn out right in the first run, it was the roman numerals i, v, x, and xi, which should have been case-sensitive, along with a couple of initial-capped possessives.)

*** and yet another loose end, brought to the surface by roger's questions about coding a page-by-page viewing apparatus... here's code that creates such a capability:
http://zenmarkuplanguage.com/grapes124.py http://zenmarkuplanguage.com/grapes124.txt
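
as a minimal sketch of what a page-by-page viewer can look like (this is not grapes124.py; the function names, the file naming scheme, and the bare-bones html are all assumptions for illustration), the idea is one small webpage per book page, each linking to its neighbors:

```python
import html
import os

def build_page_viewer(pages, stem="page"):
    """return {filename: html}, one small webpage per book page.

    pages is a list of plain-text page strings, in reading order.
    (hypothetical helper, not taken from grapes124.py.)
    """
    names = ["%s%03d.html" % (stem, n) for n in range(1, len(pages) + 1)]
    out = {}
    for i, text in enumerate(pages):
        links = []
        if i > 0:                      # every page but the first links back
            links.append('<a href="%s">previous</a>' % names[i - 1])
        if i < len(pages) - 1:         # every page but the last links forward
            links.append('<a href="%s">next</a>' % names[i + 1])
        nav = " | ".join(links)
        out[names[i]] = (
            "<html><head><title>page %d</title></head><body>\n"
            "<p>%s</p>\n<pre>%s</pre>\n<p>%s</p>\n</body></html>\n"
            % (i + 1, nav, html.escape(text), nav)
        )
    return out

def write_page_viewer(pages, folder):
    """write the viewer files into folder, creating it if needed."""
    os.makedirs(folder, exist_ok=True)
    for name, page_html in build_page_viewer(pages).items():
        with open(os.path.join(folder, name), "w", encoding="utf-8") as f:
            f.write(page_html)
```

calling write_page_viewer(pages, "viewer") with a list of page texts would leave page001.html, page002.html, and so on in a "viewer" folder. the page text is escaped and shown in a pre block, so line breaks from the book survive.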
season to taste... you'll find the set of page-by-page webpages in this folder:
*** ok, so on to .pdf, and our finish-line... i explained why i'm not sharing my code for .pdf creation. but you can see a .pdf which i created:
this .pdf was output using one specific set of preferences. i might generate others, to demonstrate customization... or i might not.

but either way... this ends our task of generating our trio of output-formats, which also completes the purpose of the thread in general... we've gone from raw o.c.r. to finished e-books, using only your wordprocessor and code written by this python virgin.

we'll recap the whole thread in our next, and final, lesson... so if you have questions, now would be the time to ask 'em.

-bowerbird