
I looked again at Bowerbird's ZML "11 Rules" (I had forgotten that they are given in his latest beta demo-app, "give", which has been sitting on my hard drive for a few weeks). Overall they are written pretty clearly, albeit in Bowerbird's quirky yet entertaining style (there were a couple of minor ambiguities, but nothing major enough to mention here).

What I found odd is that the current ZML spec appears to offer no way for machine processing to specifically identify paragraph-level blocks of text other than basic paragraphs, headers, or drama. Essentially these structures are all lumped together and are "visually" styled using tabs/spaces. They can include diverse document structures such as verse (poetry), block quotes, a letter, an epigraph (which in turn can include verse), a newspaper article, and various other things -- even "visual art" (such as the "river of text" in "Alice In Wonderland").

The most important of these non-paragraph blocks of text is verse. We'd *at least* like verse blocks to be unambiguously identifiable by machine processing, with some rule to tell authors "if you want to include some poetry, this is what you do...". This will be useful for improving visual presentation options, will enhance accessibility (that is, text-to-speech engines can present the verse to the listener as verse and not as generic prose), and may even aid more advanced searching mechanisms (which can be focused to search only in particular structures such as verse, for example).

In the ZML paradigm, one might think that verse can be identified by making a line within a block of text indented and non-reflowable. But this, in and of itself, is not sufficient, since a host of other things found in texts are not verse but look something like it (e.g., the "river of text" mentioned above).
However, what might work in ZML is to "flag" a non-reflowable line as being part of verse by prefacing it with a specific sequence of white space characters, such as "space-tab-space", or whatever makes sense. Then a machine will know that a particular line of text is verse, yet the digital text will still visually follow the ZML plain text paradigm. In addition, this technique could be used to identify a few other non-paragraph structures by using a different sequence of white space for each.

TEI devotes a whole chapter to the basic markup for verse:

http://www.tei-c.org/P4X/VE.html

This shows the power of XML to mark up verse in pretty fine detail.

Another area in ZML that creates various problems (including for text-to-speech engines), and in general is quite limiting, is the use of italics, bold, etc. for emphasized text. We really want a system where we semantically explain *why* text is emphasized, since by convention text emphasis is used for a host of things and people have to "figure it out" from textual context. That is something which can't yet be done by machine (it requires human-level intelligence, which does not appear to be on the horizon for quite a few more years).

There are a few reasons why, in the master text, it is good to say why something is emphasized, and not just that it is emphasized. One reason is text-to-speech engines. With understood semantic information (the "why"), the text-to-speech engine (TTS) can present the emphasized text in a manner appropriate to the semantics. For example, if the emphasized text is simply linguistic or literary emphasis, the TTS can inflect the voice to reflect such emphasis. However, if the emphasized text is the title of a book or manuscript, a foreign phrase, the name of a ship, or one of several other things that visual text emphasis is used for, the TTS can present the text appropriately. Who wants their TTS to generically say "this is italicized ... text ... end of italics"?
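
To make the TTS point a bit more concrete, here is a rough Python sketch of how a speech front end might act on the "why" once it is recorded in the master text. Everything in it is my own assumption for illustration: the role labels ("emphasis", "title", "foreign", "ship"), the `speech_plan` function, and the action names are hypothetical and come from neither ZML, TEI, nor any actual TTS engine.

```python
# Hypothetical mapping from the reason a span is emphasized to a short
# spoken cue. These labels are illustrative assumptions, not part of any spec.
SPOKEN_CUES = {
    "title": "title",
    "foreign": "foreign phrase",
    "ship": "ship name",
}


def speech_plan(spans):
    """Turn (text, role) pairs into (action, text, cue) instructions.

    role is None for ordinary text, or a label saying why the span is
    emphasized in the master text.
    """
    plan = []
    for text, role in spans:
        if role == "emphasis":
            # Linguistic or literary emphasis: inflect the voice, announce nothing.
            plan.append(("say_stressed", text, None))
        elif role in SPOKEN_CUES:
            # A known semantic reason: read the text with a brief cue.
            plan.append(("say_with_cue", text, SPOKEN_CUES[role]))
        else:
            # Ordinary text, or a reason this sketch does not know: read it plainly.
            plan.append(("say", text, None))
    return plan


if __name__ == "__main__":
    sentence = [
        ("I ", None),
        ("loved", "emphasis"),
        (" reading ", None),
        ("Moby-Dick", "title"),
        (".", None),
    ]
    for step in speech_plan(sentence):
        print(step)
```

The point of the sketch is only that the dispatch is trivial once the reason is known; with bare italics, the `role` value is simply not available to the machine.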
There are also other advantages to specifying "the why" of text emphasis, such as enabling more powerful search routines, more automated linking, etc. -- for example, we'll know where to find the titles of books and manuscripts since they will be identified as such rather than just "italicized" or made "bold".

It will definitely be nigh impossible within the ZML paradigm to explain the "why" of emphasized text, but it is trivial to do with XML, since XML has *no inherent limitations* on the extent and depth of structuring content. (XML also solves the issue of identifying the exact structure of paragraph-level blocks of text, as discussed earlier in this message.)

*****

In summary, ZML is a good attempt at bringing order to plain text -- and as such is a viable format for plain text documents used only as plain text. But as a multi-purpose *master* text format it is too limiting. ZML is also not very accessible, at least not to the level the accessibility community would prefer.

Jon Noring