a review of some digitization tools -- 014

8 Dec 2011

ok, to review, let's consider the list of structures that
we found in the set of 4 books digitized by jim adcock.

they were:

  a.    headers -- [h1]-[h6]
  b.    paragraphs and breaks -- [p] and [br]
  c.    styling -- [i] and [b], plus cover-page prettiness
  d.    blocks -- [pre] and [blockquote]
  e.    horizontal rules -- [hr]
  f.    images -- [img]
  g.    links -- [a name] and [a href], t.o.c., and index

so, we've covered headers, paragraphs and breaks,
styling, blockquotes, horizontal rules, and links...

which means that we just have to take care of images,
and we've handled our first set of "simple" structures.

so i inserted some images, just for fun...

for good measure, i threw in a poem as well, and a
little index too, so i could do the coding for those...

the index is a "mere" search routine, but it's useful
because it reveals important keywords in the book,
with an explicit list of the number of hits returned...
(i'll do more work on indexing, but this is a first pass.)
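
to make the "search routine" idea concrete, here is a minimal
sketch of an index built that way. it is not the code from the
script linked in this post; the function names (build_index,
format_index) and the whole-word, case-insensitive matching
are assumptions made for illustration only.

```python
import re

def build_index(chunks, keywords):
    """map each keyword to the numbers of the chunks that mention it.

    chunks are numbered from 1; matching is whole-word and
    case-insensitive (an assumption, not the original behavior).
    """
    index = {}
    for word in keywords:
        pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
        index[word] = [n for n, text in enumerate(chunks, start=1)
                       if pattern.search(text)]
    return index

def format_index(index):
    """one line per keyword: the word, its hit count, then the chunk numbers."""
    lines = []
    for word in sorted(index):
        hits = index[word]
        numbers = ", ".join(str(n) for n in hits)
        lines.append("%s [%d]: %s" % (word, len(hits), numbers))
    return lines

chunks = ["The grapes were ripe.", "No rain fell.", "Grapes again, and wrath."]
for line in format_index(build_index(chunks, ["grapes", "rain"])):
    print(line)
```

the hit count in brackets is the "explicit list of the number
of hits" part; the chunk numbers after it are what an index
entry would link to.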

the index functionality required that each paragraph
be addressable, so now each "chunk" has an .html "id".
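
as a rough sketch of what "addressable" can mean here: split
the text into chunks on blank lines and give each one its own
id. the id scheme ("c0001", "c0002", ...) and the function
name chunks_to_html are hypothetical, chosen just for this
example; the actual script may number its chunks differently.

```python
import html
import re

def chunks_to_html(text):
    """wrap each blank-line-separated chunk in a [p] with a unique id."""
    # split on one or more blank lines, dropping empty pieces
    chunks = [c.strip() for c in re.split(r"\n\s*\n", text) if c.strip()]
    out = []
    for n, chunk in enumerate(chunks, start=1):
        # "c0001"-style ids are an assumption made for this sketch
        out.append('<p id="c%04d">%s</p>' % (n, html.escape(chunk)))
    return "\n".join(out)

print(chunks_to_html("first chunk\n\nsecond chunk & more"))
```

with ids like these in place, an index entry only needs to
link to "#c0002" to land on the second chunk.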

you'll find all of this new stuff at the end of the book.

z.m.l. input file:
http://zenmagiclove.com/grapes009.txt

python script to create .html output:
http://zenmagiclove.com/tday2011.py

-bowerbird

Bowerbird＠aol.com
