[gutvol-d] O.k., looked more closely at the current ZML spec -- my specific thoughts

25 Sep 2005

I looked again at Bowerbird's ZML "11 Rules" (I had forgotten they were
included in his latest beta demo-app, "give", which has been sitting on
my hard drive for a few weeks).

Overall they are written pretty clearly, albeit in Bowerbird's quirky
yet entertaining style (there were a couple of minor ambiguities, but
nothing major enough to mention here).

What I found odd is that the current ZML spec appears to offer no way
for machine processing to specifically identify paragraph-level blocks
of text other than basic paragraphs, headers, and drama. Essentially
these structures are all lumped together and are "visually" styled
using tabs/spaces. They can include diverse document structures such
as verse (poetry), block quotes, a letter, an epigraph (which in turn
can include verse), a newspaper article, and various other things --
even "visual art" (such as the "river of text" in "Alice In
Wonderland").

The most important of these non-paragraph blocks of text is verse.
We'd *at least* like verse blocks to be unambiguously identifiable by
machine processing, with some rule to tell authors "if you want to
include some poetry, this is what you do...". This will be useful for
improving visual presentation options, will enhance accessibility
(that is, text-to-speech engines can present the verse to the listener
as verse and not as generic prose), and may even aid more advanced
searching mechanisms (which can be focused to search only in
particular structures, such as verse).

In the ZML paradigm, one might think that verse can be identified by
making each line within a block of text indented and non-reflowable.
But this, in and of itself, is not sufficient, since texts contain a
host of other things which are not verse but which look something
like it (e.g., the "river of text" mentioned above).

However, what might work in ZML is to "flag" a non-reflowable line as
being part of verse by prefacing it with a specific sequence of white
space characters, such as "space-tab-space", or whatever makes sense.
Then a machine will know that a particular line of text is verse, yet
the digital text will visually follow the ZML plain text paradigm. In
addition, this technique could also be used to identify a few other
non-paragraph structures by using a different sequence of white space.
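To make the idea concrete, here is a minimal sketch of how a processor
might read such flags. The ZML spec defines nothing of the kind; the
prefix table, the "blockquote" sequence, and the function name are all
made up for illustration. Only "space-tab-space" comes from the
suggestion above.

```python
# Hypothetical table of whitespace flags. Only " \t " (space-tab-space)
# is suggested in this message; the second entry is invented here.
FLAGS = {
    " \t ": "verse",
    " \t\t ": "blockquote",
}

def classify_line(line):
    """Return (structure, text) for one non-reflowable line."""
    # Try longer flags first so a short flag can never shadow a long one.
    for prefix in sorted(FLAGS, key=len, reverse=True):
        if line.startswith(prefix):
            return FLAGS[prefix], line[len(prefix):]
    return "unflagged", line

if __name__ == "__main__":
    for raw in [" \t Twas brillig, and the slithy toves", "Plain prose."]:
        print(classify_line(raw))
```

The obvious weakness is that such flags are invisible, so an editor
that converts tabs to spaces would silently destroy them.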

TEI devotes a whole chapter to the basic markup for Verse:

   http://www.tei-c.org/P4X/VE.html

This shows the power of XML to mark up verse in pretty fine detail.
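As a rough illustration only, the sketch below builds a stanza using
TEI's <lg> (line group) and <l> (line) elements. The helper name and
the choice of type="stanza" are my own assumptions for the example; see
the chapter above for what TEI actually allows.

```python
import xml.etree.ElementTree as ET

def stanza_to_tei(lines):
    """Wrap verse lines in a TEI-style <lg> line group (illustrative)."""
    lg = ET.Element("lg", {"type": "stanza"})
    for text in lines:
        ET.SubElement(lg, "l").text = text  # one <l> per verse line
    return ET.tostring(lg, encoding="unicode")

if __name__ == "__main__":
    print(stanza_to_tei(["Twas brillig, and the slithy toves",
                         "Did gyre and gimble in the wabe"]))
```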

Another area in ZML that creates various problems (including for
text-to-speech engines), and in general is quite limiting, is the use
of italics, bold, etc. for emphasized text. We really want a system
where we semantically explain *why* text is emphasized. By convention,
text emphasis is used for a host of things, and people have to figure
out which one is meant from textual context. That can't yet be done by
machine: it requires human-level intelligence, which does not appear
to be on the horizon for quite a few more years.

There are a few reasons why in the master text it is good to say why
something is emphasized, and not just say that it is emphasized. One
reason is text-to-speech engines. With understood semantic information
(the "why"), the text-to-speech engine (TTS) can present the
emphasized text appropriate to the semantics. For example, if the
emphasized text is simply linguistic or literary emphasis, the TTS can
inflect the voice to reflect such emphasis. However, if the emphasized
text is a title to a book or manuscript, or is a foreign phrase, a
name of a ship, or one of several other things that visual text
emphasis is used for, the TTS can present the text appropriately. Who
wants their TTS to generically say "this is italicized ... text ...
end of italics"?
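Here is a small sketch of what "the why" buys a TTS front end. The
element names follow TEI usage (<emph>, <title>, <foreign>), but the
speech style labels and the function are hypothetical, and it only
handles one level of inline markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from the reason for emphasis to a speech style.
SPEECH_STYLE = {
    "emph": "stress",
    "title": "announce-title",
    "foreign": "switch-language",
}

def speech_plan(xml_fragment):
    """Return a list of (style, text) pairs for a one-level fragment."""
    root = ET.fromstring(xml_fragment)
    plan = []
    if root.text:
        plan.append(("plain", root.text))
    for child in root:
        # Unknown elements fall back to plain speech.
        plan.append((SPEECH_STYLE.get(child.tag, "plain"), child.text or ""))
        if child.tail:
            plan.append(("plain", child.tail))
    return plan

if __name__ == "__main__":
    sample = "<p>She read <title>Moby-Dick</title> and <emph>loved</emph> it.</p>"
    for style, text in speech_plan(sample):
        print(style, repr(text))
```

With only "italic" available, every entry in that plan would carry the
same style, and the engine would have to guess.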

There are also other advantages to specifying "the why" of text
emphasis, such as enabling more powerful search routines, more
automated linking, etc. -- for example, we'll know where to find the
titles of books and manuscripts since they will be identified rather
than just "italicized" or made "bold".
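For instance, once titles are marked as titles, pulling them out is a
few lines of code. This sketch assumes a <title> element as in TEI; the
sample document and function name are made up for the example.

```python
import xml.etree.ElementTree as ET

def find_titles(xml_text):
    """Collect the text of every <title> element, in document order."""
    root = ET.fromstring(xml_text)
    return ["".join(el.itertext()) for el in root.iter("title")]

if __name__ == "__main__":
    doc = (
        "<text>"
        "<p>He lent me <title>Walden</title>.</p>"
        "<p>It was <emph>very</emph> long, unlike <title>Candide</title>.</p>"
        "</text>"
    )
    print(find_titles(doc))
```

Note that the <emph> text is skipped, which a search over generic
italics could not do.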

It will definitely be nigh impossible within the ZML paradigm to
explain the "why" of emphasized text, but it is trivial to do with XML,
since XML has *no inherent limitations* on the extent and depth of
structuring content. (XML also solves the issue of identifying the
exact structure of paragraph-level blocks of text, as discussed
earlier in this message.)

*****

In summary, ZML is a good attempt at trying to bring order to plain
text -- and as such is a viable format for plain text documents used
only as plain text. But as a multi-purpose *master* text format it is
too limiting. ZML is also not very accessible, at least to the level
the accessibility community would prefer.

Jon Noring