Lee,
I don't remember seeing your numbered rules. Where are they? They all look
right to me so far.
Do you want a blog id on readingroo.ms? This would be easier to follow if
it accumulated somewhere.
Otherwise I'll start a new category and copy them over.
Don
On 10/10/2012 9:09 AM, Lee Passey wrote:
[snip]
This file obviously needs a lot of work, and probably is more deserving
of attention than many of the recent obscure and esoteric works for
which raw page scans are adequate to serve the academic community. I'll
try to fix it as best I can, but this will probably take me several
weeks. I'll post iterations on my web server as I go along, and provide
comments here as iterations are completed.
Iteration one is available at http://www.passkeysoft.com/~lee/pg14668.html.
In this iteration I did the following:
Removed all "style=''" attributes. Added a link to "gutenberg.css" for all styling.
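As a rough illustration of the attribute removal (not necessarily how the edit was actually made), the sketch below strips style attributes with a regular expression. The function name and the sample markup are hypothetical, and a regular expression can misfire on attribute-like text in the body of a document, so a real HTML parser would be safer for anything but a quick cleanup.

```python
import re

# Matches a style attribute (single- or double-quoted value),
# together with the whitespace that precedes it.
STYLE_ATTR = re.compile(r"""\s+style\s*=\s*(?:"[^"]*"|'[^']*')""", re.IGNORECASE)

def strip_style_attributes(markup: str) -> str:
    """Return markup with every style="..." attribute removed."""
    return STYLE_ATTR.sub("", markup)

# Hypothetical sample; the real file's attribute values are not shown here.
sample = '<p id="example" style="margin-top: 2em">Text</p>'
print(strip_style_attributes(sample))
```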
Encapsulated the Gutenberg boilerplate in <div class="gutenblurb">...</div>. My style sheet sets that class to "display:none" so I don't ever have to look at it again.
Encapsulated the transcriber's notes into <div class="fm notes">.
Encapsulated the title page presentation into <div class="tp">. Changed book title headers to <h1>, pursuant to rule 4. Removed <p> from non-paragraphs on the title page. Some header normalization throughout the remainder of the file has been performed, but is not yet complete.
Encapsulated the copyright information into <div class="copyright">.
Made some headway in applying rule no. 1: <p> is reserved for paragraphs. For example, "Preface" was changed from <p> to <h3>; the dateline at the end of the preface was changed from <p> to <div class="closer">.
The blob of text starting at <h2 id="id00032"> through the end of <p id="id00035">, which is obviously a table of contents, was converted to a table of contents using <ul>, <ol> and <li> elements, pursuant to rule no. 2.
Ersatz paragraphs id00041, id00043, id00045, id00047, id00049, and id00051 were obviously tabular data, so I converted them to tables, pursuant to rule no. 3.
As best I could, I restored diacritical markings to the words contained in these 6 tables. Letters containing diacritical marks can be represented in a number of ways in Unicode. I chose to represent them as ASCII characters followed by combining diacritical marks, because some of the characters had no other representation in Unicode and I wanted to use a consistent code page. If the combining diacritical marks are ignored, the collation order of the words will be preserved (for whatever that's worth). The drawback to this approach is that I don't know how widespread support for combining diacritical marks is among HTML user agents.
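The base-letter-plus-combining-mark representation corresponds to Unicode's decomposed normalization form (NFD). The following is a minimal sketch, using Python's standard unicodedata module, of how a precomposed letter decomposes and how ignoring the combining marks yields a plain collation key. The function names and sample words are hypothetical and are not taken from the file.

```python
import unicodedata

def decompose(word: str) -> str:
    """Rewrite precomposed letters as base letter + combining marks (NFD)."""
    return unicodedata.normalize("NFD", word)

def strip_combining_marks(word: str) -> str:
    """Drop combining marks, leaving only the base letters."""
    return "".join(ch for ch in decompose(word) if not unicodedata.combining(ch))

# U+0101 (a with macron) becomes "a" followed by U+0304 (combining macron).
print([hex(ord(ch)) for ch in decompose("\u0101")])

# Hypothetical words; sorting on the stripped form ignores the marks.
words = ["\u0113be", "dat", "\u0101ce"]
print(sorted(words, key=strip_combining_marks))
```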
<p id="id00062">[Illustration:...]</p> was changed to <div id="id00062" class="illustration" title="..."><img src="" alt="..."/></div>. When an image cannot be located (which is always the case for this file), the alternate text will be displayed; when an image becomes available, the alternate text will not be displayed. This ought to be a rule, but probably not in the top 20. Some further illustrations have been converted, but not yet all.
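One way this conversion could be automated for the remaining illustrations is sketched below. It assumes every placeholder has exactly the form <p id="idNNNNN">[Illustration: caption]</p>, which may not hold throughout the file; the function name and sample caption are hypothetical.

```python
import html
import re

# Assumed placeholder shape: <p id="idNNNNN">[Illustration: caption]</p>
ILLUSTRATION = re.compile(
    r'<p id="(id\d+)">\[Illustration:\s*(.*?)\s*\]</p>', re.DOTALL
)

def _to_div(match: re.Match) -> str:
    ident = match.group(1)
    # Unescape first so existing entities are not escaped twice,
    # then escape for safe use inside attribute values.
    caption = html.escape(html.unescape(match.group(2)), quote=True)
    return (
        f'<div id="{ident}" class="illustration" title="{caption}">'
        f'<img src="" alt="{caption}"/></div>'
    )

def convert_illustrations(source: str) -> str:
    """Replace illustration placeholders with div/img markup."""
    return ILLUSTRATION.sub(_to_div, source)

# Hypothetical caption; the real captions are elided in the text above.
print(convert_illustrations('<p id="id00062">[Illustration: A cat]</p>'))
```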
The vocabulary word list at the beginning of each chapter (e.g. ersatz paragraphs id00065, id00066, id00067, id00068) is problematic. Technically this is a list, but HTML has really bad support for floating columns. For the time being I have implemented them as tables, but I don't feel good about it. Suggestions are solicited.
Words in the word list are obviously divided on syllables; the apostrophe in those words is a replacement for an acute accent indicating stress. I replaced those apostrophes with the Unicode acute accent symbol (´).
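A minimal sketch of that replacement follows, assuming the stress apostrophe always comes directly after a letter. It is meant to be applied only to word-list entries, since running text would have its possessives and contractions changed as well. The function name and sample word are hypothetical.

```python
import re

ACUTE = "\u00b4"  # the spacing acute accent

# An apostrophe that immediately follows a letter.
STRESS_APOSTROPHE = re.compile(r"(?<=[^\W\d_])'")

def mark_stress(entry: str) -> str:
    """Replace stress apostrophes in a word-list entry with an acute accent."""
    return STRESS_APOSTROPHE.sub(ACUTE, entry)

# Hypothetical word-list entry, not taken from the file.
print(mark_stress("rob'in"))
```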
In the reading exercises, the paragraphs are numbered. I was tempted to place each paragraph as an item in an ordered list for automatic numbering, but decided not to. Opinions are welcomed.
Other changes were made, relating to handwriting and correspondence, but I'm running out of time. More to follow in the next installment.