a review of some digitization tools -- 014

ok, to review, let's consider the list of structures that we found in the set of 4 books digitized by jim adcock. they were:
a. headers -- [h1]-[h6]
b. paragraphs and breaks -- [p] and [br]
c. styling -- [i] and [b], plus cover-page prettiness
d. blocks -- [pre] and [blockquote]
e. horizontal rules -- [hr]
f. images -- [img]
g. links -- [a name] and [a href], t.o.c., and index

so, we've covered headers, paragraphs and breaks, styling, blockquotes, horizontal rules, and links... which means that we just have to take care of images, and we've handled our first set of "simple" structures.

so i inserted some images, just for fun... for good measure, i threw in a poem as well, and a little index too, so i could do the coding for those...

the index is a "mere" search routine, but it's useful because it reveals important keywords in the book, with an explicit list of the number of hits returned... (i'll do more work on indexing, but this is a first pass.)

the index functionality required that each paragraph be addressable, so now each "chunk" has an .html "id".

you'll find all of this new stuff at the end of the book.

z.m.l. input file:
python script to create .html output:
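
to give the gist of the index idea, here's a minimal sketch. it is not the actual script, just an illustration under some assumptions: chunks are separated by blank lines, the keyword list is supplied by hand, and a "hit" means one chunk that contains the keyword. the function names and the "c0001" id scheme are made up for this example.

```python
import html
import re


def split_chunks(text):
    """split plain text into chunks, assuming blank lines separate them."""
    return [c.strip() for c in re.split(r"\n\s*\n", text) if c.strip()]


def chunk_id(n):
    """hypothetical id scheme: c0001, c0002, ..."""
    return f"c{n:04d}"


def chunks_to_html(chunks):
    """wrap each chunk in a [p] carrying an id, so it is addressable."""
    return "\n".join(
        f'<p id="{chunk_id(n)}">{html.escape(chunk)}</p>'
        for n, chunk in enumerate(chunks, start=1)
    )


def build_index(chunks, keywords):
    """map each keyword to the ids of the chunks that contain it."""
    index = {}
    for word in keywords:
        # whole-word, case-insensitive match
        pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
        index[word] = [
            chunk_id(n)
            for n, chunk in enumerate(chunks, start=1)
            if pattern.search(chunk)
        ]
    return index


def index_to_html(index):
    """one line per keyword: the word, its hit count, and a link per hit."""
    lines = []
    for word in sorted(index):
        hits = index[word]
        links = " ".join(
            f'<a href="#{h}">{i}</a>' for i, h in enumerate(hits, start=1)
        )
        lines.append(f"<p>{html.escape(word)} ({len(hits)}) {links}</p>")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "the whale swam.\n\nthe ship sailed.\n\na whale and a ship."
    chunks = split_chunks(sample)
    print(chunks_to_html(chunks))
    print(index_to_html(build_index(chunks, ["whale", "ship"])))
```

the point of the design is that the index is nothing more than a search over the chunks, so the only thing the .html has to provide is an "id" on each chunk for the index links to land on.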
-bowerbird