
It seems we are essentially in agreement.

I would be OK with using any variant of HTML, but it has some shortcomings in my mind.

1. Everything is built on <div>s and <span>s (and some others, like <p> and <a>), but they are inherently unsemantic, so getting people to restrict themselves to marking up a master text with only semantic tags will be difficult.
2. Similarly, I find it helpful to have an easy visual distinction between structural markup and presentation markup.
3. Also, since it's only divs and spans, the markup tends to become verbose and obscure pretty quickly, with a lot of noise to the signal.
4. And consequently marked-up XHTML is less transparent when you want to see only the text.
5. I would prefer to be able to tell at a glance whether I'm looking at a master format text or an output format text.
6. (This is a nit.) I think begin and end tags should visually match; XHTML provides only ending </div> and </span>.

Yes, it's nice (I would say necessary) to be able to have your master text previewable. I can do that with my markup because the mapping to HTML is built into WordPress, accessible by hitting the Preview button. Probably what I'm using is closer to TEI, but with less overhead.

If PG were to standardize on XHTML I would not need to abandon mine, because the mapping to XHTML is there from the start.

I'm not advocating that PG accept any format; I'm only saying that any qualifying format is OK with me as a master format; just pick one.

Your point about not caring how a paragraph is marked: I agree, except that HTML's <p> has some implications that shouldn't apply to every usage of a paragraph; and if people were to see the <p> markup and believe it can only be an HTML-type <p>, then that's a problem.

On Tue, Oct 9, 2012 at 9:05 AM, Lee Passey <lee@passkeysoft.com> wrote:
On 10/8/2012 5:03 PM, don kretz wrote:
On Mon, Oct 8, 2012 at 12:56 PM, Lee Passey <lee@passkeysoft.com> wrote:
On 10/6/2012 8:36 AM, Greg Newby wrote:
This is, more or less, exactly what I said we needed. There is no resistance to any of this. I even asked for input on figuring out what the requirements & enforcement would look like.
The first, and I think non-negotiable, requirement is that whatever standard is selected it must have a reasonably complete set of markup to capture all the features of a book. Figuring out just what this "reasonably complete" set of features is is nonetheless problematic.
This is the first and only requirement. And not reasonably complete, absolutely complete. You can't format what you can't identify. But, if you identify everything, you can format it any way you want with software.
I do not believe a markup language exists that can capture the complete essence of a book, thus my use of the "weasel word," 'reasonably.' Rest assured, my bar of "reasonableness" is quite high, but it is not unrealistic.
And the list of things to identify can be done easily if your markup is extensible, because you keep adding markup identifiers until you don't need any more.
A very salient point, with which I completely agree. Because XHTML is the basis of all modern e-book formats, whatever markup is chosen must at some point be reducible to XHTML. XHTML has two generic elements, <div> and <span> to which semantic inflection can be added by use of the "class" attribute, and the "class" attribute can be added to any other element allowing refinement of their semantics. For this reason, XHTML meets your requirement of an extensible markup language, but also satisfies the goal of being a base language which can be used directly without transformation.
Note that TEI also has generic block-level and inline elements, and semantic inflection can be added using the "type" attribute. Thus, TEI is also an extensible language even though the core elements are predefined and presumably immutable.
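To make the parallel between the "type" attribute and the "class" attribute concrete, here is a minimal sketch in Python. It assumes a TEI-like input in which generic elements carry a "type" attribute; the mapping table and the element and type names in it are hypothetical, chosen only for illustration, and are not a proposed PG vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from TEI-like (element, type) pairs to XHTML
# (element, class) pairs. These names are illustrative assumptions only.
TEI_TO_XHTML = {
    ("div", "chapter"): ("div", "chapter"),
    ("seg", "salutation"): ("span", "salutation"),
}

def tei_to_xhtml(elem):
    """Recursively convert a TEI-like element into a generic XHTML element."""
    key = (elem.tag, elem.get("type"))
    # Unknown pairs fall back to a <span> whose class is the type (or tag) name.
    tag, cls = TEI_TO_XHTML.get(key, ("span", elem.get("type") or elem.tag))
    out = ET.Element(tag, {"class": cls})
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(tei_to_xhtml(child))
    return out

source = ET.fromstring(
    '<div type="chapter"><seg type="salutation">Dear Sir,</seg></div>'
)
result = ET.tostring(tei_to_xhtml(source), encoding="unicode")
print(result)
```

Extending the vocabulary in this sketch means adding a row to the table; neither the source language nor the target language has to change.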
The rest of your requirements are just details and there are any number of equivalent schemes; they are interchangeable as long as the things requiring identification are unambiguously tagged or otherwise clearly identifiable.
True, but I obviously have not made myself clear. The primary purpose for a standard is to be predictable. To allow documents to be submitted in /any/ markup language makes it virtually impossible to develop tool sets to generate common output, or to maintain those documents. Further, standards provide a yard-stick that can not only measure compliance, but which can become a learning and training tool, so that when someone like Mr. Salzer comes along and asks, "how [does one] properly prepare HTML files for PG?" we can say, "here you go, follow these rules and you will be compliant, and if something doesn't make sense or isn't covered, we will clarify or modify the rules so it /is/ covered."
Development of a standard is primarily a political endeavor, not a technical one. While there /are/ a number of equivalent schemes, a standard means that you pick one and stick with it. Frankly, if I ruled the world, that standard would be TEI, as it is the most complete of markup languages for text encoding and, being XML, is easy to work with. But as a general rule, people's irrational fear of TEI is even greater than their irrational fear of XHTML, so as a practical matter HTML is a better /political/ choice.
I don't care if paragraphs are marked with <p> or <para> or {\pard} or two [CR/LF] pairs following a non-whitespace character and terminated by a [CR/LF] pair, just so long as I know that when I encounter that markup I am guaranteed that the text is /always/ a paragraph and /nothing but/ a paragraph. Whatever the consensus is, I will happily adopt it and develop tools for it. But I must have a single rule.
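As a rough illustration of what "a single rule" buys a tool writer, here is a minimal sketch in Python. It assumes the agreed rule is the plain-text one (paragraphs separated by blank lines); the function names are hypothetical, and any of the other notations would serve equally well so long as only one is in force.

```python
import html
import re

def paragraphs(text):
    """Split text into paragraphs under one rule: blank lines separate them."""
    text = text.replace("\r\n", "\n")          # normalize CR/LF pairs
    blocks = re.split(r"\n\s*\n", text.strip())
    # Collapse line wraps inside a paragraph so each one is a single string.
    return [" ".join(block.split()) for block in blocks if block.strip()]

def render_paragraphs(text):
    """Emit one <p> element per paragraph, escaping markup characters."""
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs(text))

sample = "First paragraph,\r\nwrapped.\r\n\r\nSecond & last."
print(render_paragraphs(sample))
```

Because the input rule is unambiguous, the splitter needs no heuristics; a second, competing paragraph notation would force every such tool to guess.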
This kind of a process will require compromise, and I'm afraid that those who are unwilling to compromise will simply have to be left out of the process.
If you don't like my rules, fine, suggest alternatives. We'll go with whatever gets the most support. I don't mind losing so long as at the end of the day everyone gets behind the winner.
(For a very interesting exploration of the value of crowd-sourcing, listen to the Radio Lab episode "Emergence" at http://www.radiolab.org/2007/aug/14/.)
_______________________________________________
gutvol-d mailing list
gutvol-d@lists.pglaf.org
http://lists.pglaf.org/mailman/listinfo/gutvol-d