ok, today we'll look at another book that was digitized by jim adcock.
thankfully, this one has more meat on its bones than the previous two.
this one is pg#31103, a.k.a., "new juveniles for 1864", by james miller.
as before, i've appended the tags that jim used. we have new ones!
first, there's the standard [p] tags for paragraphs, and [br] for breaks.
we also have the usual [i] tags jim uses for italics. strangely, though,
there are a few [em] emphasis tags too, which also render as italics...
in addition, we have a [strong] tag, to indicate text rendered in bold.
both the [em] and the [strong] tags have a _class_ attribute as well,
and later on we see there are some [div] tags used too, which makes
me very suspicious that this file was generated by using _calibre_...
calibre is a tool that does conversions among many e-book formats,
shuttling them all into its own intermediate format which uses these
"class" attributes extensively -- usually prefixing them with "calibre".
sure enough, jim has (atypically) used a .css stylesheet on this book,
and the stylesheet looks just like the one that is created by calibre...
so i suspect that jim just did a global change of "calibre" into "cal"...
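for what it's worth, a hunch like that can be checked mechanically. here's a minimal python sketch that tallies the alphabetic prefixes of the class-names in a chunk of markup. the function name and the sample string are made up for illustration, and the regex is only a rough assumption about how the attributes are written, not a real parser.

```python
import re
from collections import Counter

def class_prefixes(markup):
    """tally the alphabetic prefix of every class-name found in the markup."""
    counts = Counter()
    # grab the value of each class="..." attribute; this only looks at the
    # attribute itself, so it works for [tag] or <tag> notation alike
    for value in re.findall(r'class="([^"]*)"', markup):
        # one attribute can hold several space-separated class-names
        for name in value.split():
            # keep the leading letters, dropping any digits or hyphens
            prefix = re.match(r"[A-Za-z_]*", name).group(0)
            counts[prefix] += 1
    return counts

# hypothetical sample, modeled on the tags listed in the p.s.
sample = '[em class="cal3"][/em] [p class="sgc-2 sgc-4"][/p] [span class="sgc-8"][/span]'
print(class_prefixes(sample))
```

if one prefix dominates the tally, that's a decent hint about which tool generated the class-names, though it's certainly no proof.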
i was going to review calibre later, but i suppose we might as well
get it out of the way right now, so i'll give a quick-and-dirty on it.
calibre works fine for people who don't want to "look at the source",
and who'll likely never need to _change_ the tagging it produces.
but if you have any need, or even a desire, to _understand_ the file,
calibre is not a good solution, because its tagging is _very_ opaque.
(a good class-name will explain its purpose; "calibre-14" does not.)
i don't know if calibre's tagging is any good, as it's so opaque that
i cannot even force myself to look at it long enough to evaluate it.
i think it's probably _not_ very good, because opaque stuff rarely is,
but in most cases, on most machines, its output is "good enough".
it does the job, and it does it fairly well, and that's "good enough".
because of that, i think calibre might be _good_ for readers who
want to convert from one format to another, for their private use.
but it's _not_ the kind of product we could build a system around.
***
meanwhile, let's look at the additional tags required by this book.
first, we have [blockquote] tags. blockquotes are a structure that
is used often in e-books, so we'll need to add it to our repertoire.
the blockquote tag has some similarity to the [pre] tag we've used,
as [pre] preformatted material is often indented on both sides, just
like we'll want our blockquotes to be indented, but blockquotes --
unlike poetry, for instance -- can often be rewrapped if necessary.
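to make that distinction concrete, here's a minimal python sketch using the standard "textwrap" module. the function names, the widths, and the sample text are all hypothetical; this is just one way the two structures might be rendered in plain text, not a description of any existing tool.

```python
import textwrap

def render_blockquote(text, width=60, margin=4):
    """rewrap the text freely, then indent it on both sides."""
    # collapse the original line-breaks, since a blockquote may be rewrapped
    flowed = " ".join(text.split())
    # narrow the measure by both margins so the right side is indented too
    wrapped = textwrap.fill(flowed, width=width - 2 * margin)
    return textwrap.indent(wrapped, " " * margin)

def render_pre(text, margin=4):
    """indent the text, but keep every original line-break intact."""
    return textwrap.indent(text, " " * margin)

# made-up filler text, not taken from the book
quote = "this is a fairly long quoted passage\nthat can be reflowed to fit whatever measure the reader happens to have."
poem = "roses are red,\n  violets are blue."

print(render_blockquote(quote, width=40))
print(render_pre(poem))
```

the blockquote comes out with new line-breaks and a margin on each side, while the preformatted lines keep their original breaks and their internal indentation.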
jim also used some [hr] horizontal rules in this book. fair enough.
we also see various structures -- [div] and [p] and [span] -- that're
used with the "class" attributes, to perform various types of styling.
this gets perilously close to "presentational" markup, which is often
contrasted with "semantic" markup, but i won't wade into that cesspool.
we also see that a half-dozen images were scattered in this book,
so we will need [img] image markup to indicate their presentation.
finally, as with the earlier books, we have the expected _headers_.
jim uses header-levels 1 through 5, mostly for the front-matter,
probably only to set the various elements to an "appropriate" size.
there's nothing "semantic" about the many levels, except [h1] title,
so again, this tagging is "presentational" and not "structural", but...
while i'm at it, for the record, i'll note that some so-called "purists"
might make an objection about the fact the table-of-contents lines
are tagged as "paragraphs", saying they aren't _really_ "paragraphs".
they'll call this "tag abuse", and probably get all indignant about it...
i don't think we need to even think twice about such minor matters,
let alone pay attention if people get all huffed-up about them, but
i thought i'd mention it, so you know the crap that we're up against.
and that about sums up today's book...
***
so, once we ignore the unnecessary complexity introduced by calibre,
we see that this book has introduced three new structures to handle:
a. blockquotes
b. horizontal rules
c. images
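a tag inventory like the one in the p.s. doesn't have to be gathered by hand. here's a minimal python sketch that counts the opening tags in a piece of .html, using the standard "html.parser" module. the class name, the function name, and the sample string are my own assumptions, and it assumes the real file uses ordinary angle-brackets rather than the square brackets shown in this post.

```python
from collections import Counter
from html.parser import HTMLParser

class TagTally(HTMLParser):
    """count each opening tag seen in a piece of .html."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # the parser hands us tag-names already lowercased
        self.counts[tag] += 1

def tally_tags(html):
    """return a counter mapping each tag-name to its number of openings."""
    parser = TagTally()
    parser.feed(html)
    parser.close()
    return parser.counts

# hypothetical sample, not the actual text of the book
sample = '<p>one<br>two</p><blockquote><p>three</p></blockquote><hr width="25%"><img src="x.png" alt="x">'
for tag, n in sorted(tally_tags(sample).items()):
    print(tag, n)
```

comparing a tally like this against the tags we already handle would show at a glance which structures a book adds.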
thanks, once again, to jim adcock for digitizing this book, and indeed,
thanks to _all_ the people who work hard to digitize all the p.g. books.
-bowerbird
p.s. here are the .html tags that jim used to digitize pg#31103...
> [p][/p]
> [br]
> [i][/i]
>
> [em class="cal3"][/em]
> [strong class="cal8"][/strong]
>
> [blockquote class="sgc-12"][/blockquote]
>
> [hr width="25%"]
> [hr width="50%"]
>
> [div][/div]
> [div class="sgc-15"][/div]
> [div][span class="sgc-11"][/span][/div]
>
> [p class="sgc-2 sgc-4"][/p]
> [p class="sgc-4"][/p]
> [p class="sgc-18"][/p]
>
> [span class="sgc-8"][/span]
> [span class="sgc-11"][/span]
> [span class="sgc-13"][/span]
>
> [img src="31103-h_files/img0002.png" alt="The Dream of Little Tuk."]
> [img src="31103-h_files/img0003.png" alt="Children Dancing."]
> [img src="31103-h_files/img0004.png" alt="Man Carrying Firewood."]
> [img src="31103-h_files/img0005.png" alt="Mother Praying with Angel Overhead."]
> [img src="31103-h_files/img0006.png" alt="The Little Match Girl."]
> [img src="31103-h_files/img0007.png" alt="Mother Holding Mistletoe Above Infant."]
>
> [h1 align="center"]New Juveniles for 1864[/h1]
> [h2 align="center"]JAMES MILLER,[/h2]
> [h3 align="center"]MAGNET STORIES,[/h3]
> [h4 align="center"]For Summer Days and Winter Nights.[/h4]
> [h5 align="center"]SECOND SERIES.[/h5]
>
> [p align="center"][a name="ch_TheOldHouse" href="#a_TheOldHouse"]I. The Old House[/a][/p]
> [p align="center"][a name="ch_TheDropOfWater" href="#a_TheDropOfWater"]II. The Drop of Water[/a][/p]
> [p align="center"][a name="ch_TheHappyFamily" href="#a_TheHappyFamily"]III. The Happy Family[/a][/p]
> [p align="center"][a name="ch_TheStoryOfAMother" href="#a_TheStoryOfAMother"]IV. The Story of a Mother[/a][/p]
> [p align="center"][a name="ch_TheFalseCollar" href="#a_TheFalseCollar"]V. The False Collar[/a][/p]
> [p align="center"][a name="ch_TheShadow" href="#a_TheShadow"]VI. The Shadow[/a][/p]
> [p align="center"][a name="ch_TheOldStreetLamp" href="#a_TheOldStreetLamp"]VII. The Old Street-Lamp[/a][/p]
> [p align="center"][a name="ch_TheDreamOfLittleTuk" href="#a_TheDreamOfLittleTuk"]VIII. The Dream of Little Tuk[/a][/p]
> [p align="center"][a name="ch_TheNaughtyBoy" href="#a_TheNaughtyBoy"]IX. The Naughty Boy[/a][/p]
> [p align="center"][a name="ch_TheTwoNeighboringFamilies" href="#a_TheTwoNeighboringFamilies"]X. The Two Neighboring Families[/a][/p]
> [p align="center"][a name="ch_TheDarningNeedle" href="#a_TheDarningNeedle"]XI. The Darning Needle[/a][/p]
> [p align="center"][a name="ch_TheLittleMatchGirl" href="#a_TheLittleMatchGirl"]XII. The Little Match-Girl[/a][/p]
> [p align="center"][a name="ch_TheRedShoes" href="#a_TheRedShoes"]XIII. The Red Shoes[/a][/p]
> [p align="center"][a name="ch_ToTheYoungReaders" href="#a_ToTheYoungReaders"]XIV. To The Young Readers[/a][/p]
>
> [a name="a_TheOldHouse"][/a][h2][span class="sgc-11"]THE OLD HOUSE.[/span][/h2]
> [a name="a_TheDropOfWater"][/a][h2][span class="sgc-11"]THE DROP OF WATER.[/span][/h2]
> [a name="a_TheHappyFamily"][/a][h2][span class="sgc-11"]THE HAPPY FAMILY.[/span][/h2]
> [a name="a_TheStoryOfAMother"][/a][h2][span class="sgc-11"]THE STORY OF A MOTHER[/span][/h2]
> [a name="a_TheFalseCollar"][/a][h2][span class="sgc-11"]THE FALSE COLLAR.[/span][/h2]
> [a name="a_TheShadow"][/a][h2][span class="sgc-11"]THE SHADOW.[/span][/h2]
> [a name="a_TheOldStreetLamp"][/a][h2][span class="sgc-11"]THE OLD STREET-LAMP.[/span][/h2]
> [a name="a_TheDreamOfLittleTuk"][/a][h2][span class="sgc-11"]THE DREAM OF LITTLE TUK.[/span][/h2]
> [a name="a_TheNaughtyBoy"][/a][h2][span class="sgc-11"]THE NAUGHTY BOY.[/span][/h2]
> [a name="a_TheTwoNeighboringFamilies"][/a][h2][span class="sgc-11"]THE TWO NEIGHBORING FAMILIES.[/span][/h2]
> [a name="a_TheDarningNeedle"][/a][h2][span class="sgc-11"]THE DARNING-NEEDLE.[/span][/h2]
> [a name="a_TheLittleMatchGirl"][/a][h2][span class="sgc-11"]THE LITTLE MATCH GIRL.[/span][/h2]
> [a name="a_TheRedShoes"][/a][h2][span class="sgc-11"]THE RED SHOES.[/span][/h2]
> [a name="a_ToTheYoungReaders"][/a][h2][span class="sgc-11"]TO THE YOUNG READERS[/span][/h2]