
Good afternoon, everyone!

I have a few things I want to try to get a consensus on as to HOW we want to handle some aspects of the PG TEI master document. Some of the questions pertain to specific people (such as the automatic inclusion of the PG header/footer) and some pertain to everyone interested in PG TEI.

The TEI master I am using as the basis of this discussion is available at http://home.alltel.net/hutch2000/sunny/start.xml.

So, without further ado!

1 - Currently, Marcello's online converter (TEI -> HTML) automatically adds a PG standard header and footer (http://home.alltel.net/hutch2000/sunny/sunny.html). It looks nicer (to my eyes anyway) than the monospaced header and footer that the whitewashers currently use. However, is this a "bad thing" in the eyes of the whitewashers? As near as I can tell, the only piece of information that will need to be manually added by the whitewashers is the EBook number assigned to this text. If this is placed in the TEI master, it is automatically put into the HTML version when it is run through the TEI -> HTML converter.

If this is an "ok thing" but needs some work ... what needs to be changed? Jim, you're a vocal whitewasher! Rip this apart! (This question also includes suggestions for style improvements to the header/footer.)

***

2 - The version I have posted above has two rather significant CSS changes from the style used in Marcello's converter.

A) The margins have been set to 10% whitespace on the right and left. This is a fairly arbitrary number, chosen because it is the de facto standard at DP. Suggestions/comments?

B) The paragraph markup has been changed back to HTML standard. Marcello's original style more closely resembles TeX formatting, where there is no white space between paragraphs and each paragraph is indented. This was jarring to me, hence the change. Again, suggestions/comments?

The rest of the style is as Marcello's converter made it. It is a bit verbose by some people's standards (almost everything has a class attribute), but this can be a very good thing because it allows CSS to affect the layout/look of nearly every aspect of the document.

***

3 - The TEI master uses rend="indent" markup in the poetry. This validates fine, but currently the TEI -> HTML converter basically ignores the indent markup. What I want to address here is how we want to have those indents converted.

TEI master markup:

<lg>
  <l>"I thank the goodness and the grace</l>
  <l rend="indent">That on my birth have smiled,</l>
  <l>And made me in these Christian days</l>
  <l rend="indent">A happy English child."</l>
</lg>

Option #1 - Convert the rend="indent" markup to & emsp ; & emsp ; (remove spaces for use).
Pro: Degrades gracefully on non-CSS enabled browsers like Lynx.
Con: Treats the indent as content.

Option #2 - Convert the rend="indent" markup to the equivalent CSS markup (my mind is going blank right now or I'd give an example).

Option #3 - Any other ideas how to handle this?

***

4 - I used <quote rend="display"> markup for blockquotes. This looks fine to me. However, in previous discussions, some people did not like rend="display" for this purpose. As far as I am concerned, it works and doesn't seem to be a problem, but I'm willing to hear opposing arguments.

***

5 - I used <lb /> to indicate a blank line of text (commonly called a thoughtbreak over at DP). Marcello's documentation indicates this isn't what it is truly meant for, though. Does anyone see a problem with this implementation? Or see an improvement we should use instead?

***

6 - This work has a small example of drama markup. It is very simple markup (verse with no partial lines), but it seems to work well. I don't have any problems with it, but I also know that my experience with drama markup is extremely limited. Any suggestions/concerns?

***

7 - The only other thing I can remember that was at all out of the ordinary with this text was the retention of small caps. I used the rend="sc" markup and it worked just as I expected it to in the TEI -> HTML converter. Any suggestions/comments/improvements?

***

I'm sure I'll remember something on the way home tonight that I forgot to mention, but that's what I can think of right now for discussion. I'm looking forward to everyone's input.

Josh

Joshua Hutchinson wrote:
The rest of the style is as Marcello's converter made it. It is a bit verbose by some people's standards (almost everything has a class attribute), but this can be a very good thing because it now allows CSS to affect the layout/look of nearly every aspect of the document.
Everything has a class attribute because this way you can use the generated html in a web site -- eg. for an online reader -- and the book and site style will not clash. TODO: all generated styles should have the same prefix: pgtei.
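As an illustration of the TODO, here is a minimal Python sketch of giving every generated class name a common prefix so that book and site styles cannot collide. The function name `prefix_classes` and the exact prefix form `pgtei-` (with a hyphen) are assumptions made for this example; they are not taken from the converter.

```python
def prefix_classes(class_attr: str, prefix: str = "pgtei-") -> str:
    """Return an HTML class attribute value with every class name prefixed.

    Names that already start with the prefix are left unchanged, so the
    function can safely be applied more than once.
    """
    names = class_attr.split()
    return " ".join(
        name if name.startswith(prefix) else prefix + name
        for name in names
    )
```

With this, a value such as "hi sc" would become "pgtei-hi pgtei-sc", and a site stylesheet that also defines .hi would no longer affect the book text.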
3 - The TEI master uses rend="indent" markup in the poetry. This validates fine, but currently the TEI -> HTML converter basically ignores the indent markup. What I want to address here is how we want to have those indents converted.
I'm working on implementing indent and a few other rend attribute gimmicks. It will understand rend="indent" and rend="indent(n)" where n can be any positive or negative number.
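To make the two forms concrete, here is a rough Python sketch of how a rend value might be parsed and turned into CSS. It is not the converter's code: the names `indent_steps`, `indent_style` and `EM_PER_STEP`, the choice that a bare "indent" means one step, and the 2em step width are all assumptions for illustration.

```python
import re

# Hypothetical width of one indent step; the converter's real value is not stated.
EM_PER_STEP = 2.0

# Matches "indent" or "indent(n)" where n is a signed integer or decimal.
_INDENT_RE = re.compile(r"^indent(?:\(([+-]?\d+(?:\.\d+)?)\))?$")


def indent_steps(rend):
    """Return the indent step count encoded in a rend value, or None.

    A bare "indent" is assumed to mean one step; "indent(n)" means n steps.
    Any other rend value (for example "sc") yields None.
    """
    match = _INDENT_RE.match(rend.strip())
    if match is None:
        return None
    if match.group(1) is None:
        return 1.0
    return float(match.group(1))


def indent_style(rend):
    """Return an inline CSS declaration for the rend value, or "" if none applies."""
    steps = indent_steps(rend)
    if steps is None:
        return ""
    return f"margin-left: {steps * EM_PER_STEP:g}em;"
```

Under these assumptions, rend="indent(3)" would produce a 6em left margin and rend="indent(-2)" a negative margin of 4em, while rend="sc" is passed over for other handling.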
5 - I used <lb /> to indicate a blank line of text (commonly called a thoughtbreak over at DP). Marcello's documentation indicates this isn't what it is truly meant for, though. Anyone see a problem with this implementation? Or see an improvement we should use instead?
<lb ed="first folio"> is meant to record line breaks in a certain edition like <pb>, not to output ones. To get a thought break enclose both "thoughts" in <divs>.

<div type="chapter">
  <head>1.</head>
  <div>
    <p></p>
    ...
    <p></p>
  </div>
  <!-- thought break will be inserted here -->
  <div>
    <p></p>
    ...
    <p></p>
  </div>
</div>

-- 
Marcello Perathoner
webmaster@gutenberg.org

1 - Currently, Marcello's online converter (TEI -> HTML) automatically adds a PG standard header and footer (http://home.alltel.net/hutch2000/sunny/sunny.html). It looks nicer (to my eyes anyway) than the monospaced header and footer that the whitewashers currently use.
One advantage of monospaced: it clearly distinguishes the long PG footer from the book's content. One could instead use a smaller size and sans serif font. (Personally, I would prefer omitting the license and just including a link.) Also, as noted in http://classicosm.com/xml/feedbackonpgtei.html: In the PG license, section numbers such as "1.A." should appear on the same line as the text that follows -- per the original and to avoid wasting space.
A) The margins have been set to 10% whitespace on the right and left. This is a fairly arbitrary number, chosen because it is the de facto standard at DP. Suggestions/comments?
Looks good to me.
B) The paragraph markup has been changed back to HTML standard.
As you say, it's the HTML standard and thus appropriate for the default CSS.
The rest of the style is as Marcello's converter made it. It is a bit verbose by some people's standards (almost everything has a class attribute), but this can be a very good thing because it now allows CSS to affect the layout/look of nearly every aspect of the document.
A few notes based on a quick look:

- class=dgp does seem to be overused.
- span class="hi" style="font-variant: small-caps;" is a bit much; how about span class="smallCaps"?

I also hate that the HTML is wrapped at 78 (or whatever) chars. I suppose few people will edit the output, but it seems like a wasteful throwback. Don't people have editors that wrap text???
3 - The TEI master uses rend="indent" markup in the poetry. This validates fine, but currently the TEI -> HTML converter basically ignores the indent markup. What I want to address here is how we want to have those indents converted.
TEI master markup:
<lg>
  <l>"I thank the goodness and the grace</l>
  <l rend="indent">That on my birth have smiled,</l>
  <l>And made me in these Christian days</l>
  <l rend="indent">A happy English child."</l>
</lg>
Option #1 - Convert the rend="indent" markup to & emsp ; & emsp ; (remove spaces for use). Pro: Degrades gracefully on non-CSS enabled browsers like Lynx. Con: Treats the indent as content.
I think the XHTML version should be completely modern, e.g. here's one way to indent using CSS:

.indent {margin-left:40px; margin-right:40px}

There are benefits to an "old fashioned HTML" version, but let's make that a different file, probably HTML 4.01 Transitional.
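For comparison with Option #1, here is a small Python sketch of the class-based approach: each <l rend="indent"> becomes an element carrying class "indent", so a CSS rule like the one above does the indenting and no spaces are added to the content. The function name `lg_to_html` and the class names "lg" and "line" are hypothetical, and the sketch assumes a fragment without XML namespaces.

```python
import xml.etree.ElementTree as ET
from html import escape


def lg_to_html(tei_fragment):
    """Convert one TEI <lg> fragment to HTML, marking indents with a class.

    Each <l> becomes a <div class="line">; lines with rend="indent" get the
    extra class "indent" so the stylesheet controls how far they move.
    """
    lg = ET.fromstring(tei_fragment)
    out = ['<div class="lg">']
    for line in lg.findall("l"):
        cls = "line indent" if line.get("rend") == "indent" else "line"
        # Flatten any inline markup to text and escape &, < and >.
        text = escape("".join(line.itertext()), quote=False)
        out.append(f'  <div class="{cls}">{text}</div>')
    out.append("</div>")
    return "\n".join(out)
```

A browser without CSS support would show such lines flush left, which is the trade-off against the & emsp ; approach.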
4 - I used <quote rend="display"> markup for blockquotes. This looks fine to me. However, in previous discussions, some people did not like the rend="display" for this purpose. As far as I am concerned, it works and doesn't seem to be a problem, but I'm willing to hear opposing arguments.
The issue as I understand it: q is for words spoken, quote is for text attributed to an outside source. Either may occur inline or set off in an indented block. So, a long "speech" by a character should (I think) be <q rend="display">. I think the TEI tags and explanation are confusing, but that's perhaps a different issue.
5 - I used <lb /> to indicate a blank line of text (commonly called a thoughtbreak over at DP). Marcello's documentation indicates this isn't what it is truly meant for, though. Anyone see a problem with this implementation? Or see an improvement we should use instead?
Marcello suggested that a closing and opening div creates a blank line; I'm not convinced that's a good idea in general.

=====

Misc. questions:

* The following looks like a (minor) error:
  <head>Letter XVII</head>
  <p>LETTER XVII.</p>
  The latter looks redundant.

* Were the italics in the original here?
  <p><hi rend="sc">Andover</hi>, <emph>May</emph> 30, 1854.</p>

* Does the original really have several pages with no paragraph breaks?

-- 
Cheers,
Scott S. Lawton
http://Classicosm.com/ - classic books
http://ProductArchitect.com/ - consulting
participants (3)
- Joshua Hutchinson
- Marcello Perathoner
- Scott Lawton