
Jim Tinsley wrote:
On Tue, Dec 20, 2005 at 11:07:09PM -0700, Wally Thompson wrote:
[snip]
I would like to make it clear to the reader whether or not a new paragraph begins after a poem. But I would also like to be consistent with other Gutenberg Ebooks.
Thanks for the question, Wally. It's a good one.
You can't do both.
I think it's obvious, Wally, that what you need is a markup language. The problem is that historically Project Gutenberg has been considered an NMA (No Markup Allowed) zone. In recent years, with the addition of HTML-formatted works, this "standard" has been relaxed, but it is still required that works submitted to PG be reduced, in at least one instantiation, to a non-marked-up format. The way to get around this requirement is to create a markup language (perhaps only powerful enough to deal with the one problem you have encountered) that doesn't _look_ like a markup language, and thus might slip through unnoticed. These types of markup languages have been variously referred to as "unobtrusive markup languages" or "smart ASCII." Examples include reStructuredText (http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html) and Bowerbird's Zen Markup Language (unpublished).
In the same circumstances, I would change the normal conventions either
a) to indent the first line of each actual paragraph
or
b) to introduce two blank lines, rather than one, after a poem where a new para begins after it.
and clarify what I was doing, and why I was doing it, by means of a Transcriber's Note at the top of the file.
This was my first inclination also. Essentially, the Gutenberg Markup Language defines the end of a text block (which may be a paragraph or may be something else) as text which ends with two consecutive newline sequences. Thus, to indicate that the poem is contained by the paragraph, it should be a simple matter of including only a single newline sequence before the beginning (and after the end) of the poem.
Unfortunately, there is some uncertainty about whether single newline sequences have significance in GML. One of the biggest complaints that users of handheld devices have about GutenTexts is that on displays narrower than 80 characters (and usually not a convenient fraction thereof), if newlines are significant you will get texts that have three full lines of text followed by a line of just one or two words, followed by three full lines of text, followed by one or two words, and so on. Thus, many user agents which are designed to display GutenTexts consider a single newline sequence as insignificant, and treat it simply as a space, as do many utilities which have been written to convert GutenTexts to more consumer-friendly formats. In these cases, in addition to losing the "lininess" of the poem, you will lose its "blockiness" as well.
So what you need is a way to indicate "this is a block which does not end the previous block, but is encapsulated within it" as well as a way to indicate "this is a significant line ending which must not be removed." Solving the second problem may also solve the first. Let's assume that for mandatory line endings we use the ASCII sequence "<br/>."
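To make the problem concrete, here is a minimal Python sketch of the kind of reflow I am describing. It is not taken from any actual user agent or converter; the function name `naive_reflow` is my own invention, and the only rules it implements are the two stated above: a blank line ends a block, and a single newline is treated as a space.

```python
import re


def naive_reflow(text: str) -> list[str]:
    """Split text into blocks at blank lines, then treat every
    remaining single newline as an ordinary space.

    Hypothetical helper, written only to illustrate the behaviour
    described in this message.
    """
    blocks = re.split(r"\n\s*\n", text.strip())
    # str.split() with no arguments splits on any run of whitespace,
    # so newlines inside a block collapse to single spaces.
    return [" ".join(block.split()) for block in blocks]


# The poem embedded in its paragraph, using single newlines only.
sample = (
    "satisfy the best soodra society,--\n"
    '"With the yellow torches gleaming,\n'
    "And the scarlet mantles streaming,\n"
    "And the canopy above\n"
    'Swaying as they slowly move."\n'
    "Karlee has assured me that neither his\n"
)

for block in naive_reflow(sample):
    print(block)
```

Run on the sample, everything comes back as one unbroken block: the poem stays inside the paragraph, but its line structure is gone, which is exactly the loss of "lininess" complained of above.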
You could encode your second example as:
<example>
satisfy the best soodra society,--<br/>
"With the yellow torches gleaming,<br/>
And the scarlet mantles streaming,<br/>
And the canopy above<br/>
Swaying as they slowly move."<br/>
Karlee has assured me that neither his
</example>
In this case, the mandatory newline sequences convey the meaning you desire, both as to the stanzas of the verse and as to the fact that it is part and parcel of the enveloping paragraph. Of course, I wouldn't use the "<br/>" sequence as the mandatory newline indicator, as it smacks too much of XHTML, and may draw the attention of the markup police. Instead you could use something less intrusive, such as the Unix newline code ("\n"), some unusual sequence that has no semantic overloading (such as "^!"), or something more descriptive (and thus less "codey") such as "{end line here}."
Now these proposals all sacrifice the Gutenberg consistency for textual expressiveness. It is also possible to sacrifice the textual expressiveness for the sake of simplicity of text--and this may be the better choice. Almost all texts coming from Distributed Proofreaders these days are submitted (and available) in XHTML format, which clearly has the expressive power to satisfy this type of construct. Perhaps the right thing to do is to consider the XHTML version as the canonical version, and simply place a Transcriber's Note at the beginning of the simplified text version to the effect that, "Some structures in this book are not expressible in a simple text format, and have therefore been omitted here. Readers interested in a more accurate representation of this publication should refer to the file '17217.html'."
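For anyone who wants to experiment with the mandatory-line-ending idea from earlier in this message, here is a rough Python sketch of a reflow routine that honours such a marker. Everything in it is an assumption on my part: the function name `reflow`, the default width, and the choice of "^!" as the marker (one of the candidates mentioned above). No existing PG tool is known to me to work this way.

```python
import re
import textwrap

MARK = "^!"  # assumed marker for a mandatory line ending


def reflow(text: str, width: int = 40, mark: str = MARK) -> str:
    """Re-wrap text to `width` columns.

    Blank lines end a block, single newlines are treated as spaces,
    and every occurrence of `mark` forces a line ending that is kept.
    """
    out_blocks = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = []
        # Each piece between markers must start on a fresh line.
        for segment in block.split(mark):
            segment = " ".join(segment.split())  # newlines become spaces
            if segment:
                lines.extend(textwrap.wrap(segment, width=width))
        out_blocks.append("\n".join(lines))
    return "\n\n".join(out_blocks)


sample = (
    "satisfy the best soodra society,--^!\n"
    '"With the yellow torches gleaming,^!\n'
    "And the scarlet mantles streaming,^!\n"
    "And the canopy above^!\n"
    'Swaying as they slowly move."^!\n'
    "Karlee has assured me that neither his\n"
)

print(reflow(sample, width=40))
```

Because no blank line separates the verse from the prose around it, the poem stays inside the enclosing block, and because each marker forces a break, every line of verse still begins on its own line however narrow the display.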
There are probably other reasonable ways of indicating the necessary distinction, but the general formula of "choose one, and leave a Transcriber's Note that applies to the whole text" will work for any of them.
This is good advice. In the end, what you do probably matters less than simply informing people what you have done, and why you did it.