
James Adcock wrote: [snip]
What I don't understand is why PG continues to be wedded to plain-text as an *input* encoding format demanded of people submitting texts to PG. Plain-text is too constrained to do the job well.
I find that you are generally correct in everything you have said to date. But the reality is that PG <em>does</em> continue to be wedded to plain (impoverished) text. This topic has come up regularly over the years, and in every case has ended without any improvement to PG. While I hesitate to say that your advocacy is futile, your advocacy is futile.
HTML is too ambiguous, and too ill-matched to books, to do the job well. We need something else, something that CAN be correctly and automagically converted to one format or another, including plain-text, Unicode, HTML, mobi, etc.
HTML, <i class="foreign">per se</i>, is indeed too ambiguous, although I have successfully developed a fairly complete set of standard usages and class definitions (encapsulated in a CSS file) that allow me to do lossless translation back and forth between HTML and TEI. For PG to adopt such a scheme, however, would require that PG adopt a set of standards, and Mr. Hart has been adamant that PG will <em>never</em> adopt <em>any</em> standard, fearing that it may alienate or intimidate some speculative volunteer who would otherwise contribute an as-yet-unarchived impoverished text file. (Obviously, the implicit standards developed and enforced by the Council Of Whitewashers cannot be considered <i class="socalled">true</i> standards.) I have concluded that Project Gutenberg is impervious to improvement. While Bowerbird rejects the notion, I am not afraid to say that, for what you are attempting to do, Project Gutenberg may not be the correct archive. I would suggest, rather, perfecting your HTML file, uploading it to the Internet Archive (http://www.archive.org/create/), and then posting a message here indicating where it can be found, in case any other volunteer wants to create a degraded version of your master copy.
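For anyone wondering what a class-based HTML/TEI convention of the kind mentioned above might look like in practice, here is a minimal sketch in Python. It is not the actual scheme or CSS file described in this message; the mapping table and the function names (html_to_tei, tei_to_html) are assumptions made up for illustration, and it only handles the three simple inline cases that happen to appear in this post. The TEI element names used (foreign, soCalled, emph) are real TEI elements.

```python
import re

# Hypothetical convention: each (HTML tag, class) pair maps to exactly one
# TEI element, so the translation can be reversed without loss.
HTML_TO_TEI = {
    ("i", "foreign"): "foreign",
    ("i", "socalled"): "soCalled",
    ("em", None): "emph",
}
TEI_TO_HTML = {tei: html for html, tei in HTML_TO_TEI.items()}

# Matches an opening or closing tag with an optional class attribute.
TAG = re.compile(r'<(/?)(\w+)(?:\s+class="([^"]*)")?\s*>')


def html_to_tei(text):
    """Rewrite the known HTML inline tags as TEI elements."""
    stack = []  # TEI names of the currently open elements

    def repl(match):
        closing, name, cls = match.groups()
        if closing:
            # </i> alone is ambiguous, so close whatever was opened last.
            return "</%s>" % stack.pop()
        tei = HTML_TO_TEI[(name, cls)]  # KeyError on anything unmapped
        stack.append(tei)
        return "<%s>" % tei

    return TAG.sub(repl, text)


def tei_to_html(text):
    """Rewrite the known TEI elements back into HTML tags with classes."""

    def repl(match):
        closing, name, _ = match.groups()
        tag, cls = TEI_TO_HTML[name]  # KeyError on anything unmapped
        if closing:
            return "</%s>" % tag
        return '<%s class="%s">' % (tag, cls) if cls else "<%s>" % tag

    return TAG.sub(repl, text)


sample = ('HTML, <i class="foreign">per se</i>, is <em>not</em> '
          'a <i class="socalled">true</i> standard.')
tei = html_to_tei(sample)
print(tei)
print(tei_to_html(tei) == sample)
```

The point of the sketch is only that the round trip works because the mapping is one-to-one: a bare <i> with no class would have no TEI counterpart, which is exactly the ambiguity that a set of agreed standard usages is meant to remove.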