re: [gutvol-d] Re: unluckily for us (gutvol-d Digest, Vol 13, Issue 30)

lee said:
Sure, but you don't get to call a lemon purple, just because that's the way you see it.
and if i _do_ call a lemon purple? or you call it chartreuse? who cares? if it ain't important in rendering the e-book, it makes absolutely no difference what color anyone calls it.
I can't see why this particular wheel needs re-inventing.
and your obsession with paragraphs strikes me as being as unimportant and trivial as the number of angels that can dance on the head of a pin.

so, lee, are you up for our usability challenge on alice? or is this little exercise your attempt to deflect attention?

-bowerbird

Bowerbird wrote:
lee said:
I can't see why this particular wheel needs re-inventing.
and your obsession with paragraphs strikes me as being as unimportant and trivial as the number of angels that can dance on the head of a pin.
It is NOT trivial as Lee explained. During high-quality ebook presentation, an end-user may want to style the output so true paragraphs are indented, but other blocks of standalone stuff (non-paragraphs) are not indented (and maybe presented with certain typographic differences.) So identifying the structure we call a true paragraph is important.

Furthermore, the example given earlier of a block of verse within a paragraph is very interesting and illustrates the importance of identifying what is and is not a paragraph, and how far it extends. I'll repeat the example here in "plain text":

[plain text example]
**********************************************************************
The cows swung placidly down the lane, and Anne followed them dreamily, repeating aloud the battle canto from "Marmion" -- which had also been part of their English course the preceding winter and which Miss Stacy had made them learn off by heart -- and exulting in its rushing lines and the clash of spears in its imagery. When she came to the lines

    stubborn spearsmen still made good
    Their dark impenetrable wood,

she stopped in ecstasy to shut her eyes that she might the better fancy herself one of that heroic ring. When she opened them again it was to behold Diana coming through the gate that led into the Barry field and looking so important that Anne instantly divined there was news to be told. But betray too eager curiosity she would not.
**********************************************************************
(end of example text)

Now, it is clear that we have only one paragraph here, so if we rely upon regularized text to perfectly structure that text, how does one identify, without machine textual analysis (such as analysis of upper/lower case, which in Bowerbird's writings would fail!) that the portion starting with "she stopped in ecstasy..." is part of the prior paragraph and not the start of a new paragraph?
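To make the question concrete, here is a minimal sketch of a blank-line-based block splitter run on an abridged version of the example. It is not ZML or anyone's actual tool. The layout of the sample (verse set off by blank lines and indentation) and the helper names `split_blocks` and `classify` are assumptions made for illustration only.

```python
import re

# Abridged from the example above. ASSUMPTION: the verse is set off by
# blank lines and indentation in the regularized plain text.
SAMPLE = """\
When she came to the lines

    stubborn spearsmen still made good
    Their dark impenetrable wood,

she stopped in ecstasy to shut her eyes.

When she opened them again it was to behold Diana."""


def split_blocks(text):
    """Split on runs of blank lines, the only structural cue available."""
    return [b for b in re.split(r"\n(?:[ \t]*\n)+", text) if b.strip()]


def classify(block):
    """Naive rule (hypothetical): an indented block is verse, all else is a paragraph."""
    return "verse" if block.startswith((" ", "\t")) else "paragraph"


blocks = [(classify(b), b.strip()) for b in split_blocks(SAMPLE)]
for kind, body in blocks:
    print(kind, "|", body.splitlines()[0])
```

Under these assumptions the block beginning "she stopped in ecstasy" comes out labeled as a paragraph of its own, indistinguishable from the genuinely new block that follows it. Nothing in the blank-line structure says that it continues the paragraph that was interrupted by the verse.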
In the scenario where the end-user wants paragraphs indented, if the last portion is misidentified as being the start of a new paragraph, then "she stopped in ecstasy..." will be indented -- not exactly the presentation result desired. The reader will be distracted upon seeing such an obvious "typo".

It is clear that stand-alone regularized plain text (such as following the ZML system -- there are other systems) is limited in the number and type of document structures it can communicate to the processor. Now obviously one can add textual analysis (case and punctuation analysis, which has to be language/country/era specific) to try to handle "exceptions" (and no doubt, with some thought, the number of exceptions will become quite large), but now one requires fairly complex AI-like analysis of the content itself (it will have to be extensible to work for Han script, too!) to do what would otherwise be simple to communicate unambiguously with XML markup.

I suppose some would say that the fine structural detail possible in XML is still not important. Fine. But as I said before, one gives up the possibility of fine (and essentially unlimited) structural (and semantic) detail when relying on regularized plain text to communicate document structure to a processor. (Human beings, if they understand the language, can take regularized plain text and discern finer structural detail by content and contextual analysis, namely human intelligence.) There's only so much that can be done with plain text when machine processed using today's level of AI.

Since the major players in PG and DP (this includes Greg Newby) appear to be gung-ho on formatting the master etexts in XML, the burden is on those advocating plain text solutions to demonstrate that regularized plain text *is sufficient* for the purposes of high-quality presentation.
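For comparison, the XML side of the argument can be sketched the same way. The element names `chapter`, `p`, `verse`, and `line` below are hypothetical, chosen only for illustration, and are not taken from any particular PG or DP vocabulary. The sample text is abridged from the example.

```python
import xml.etree.ElementTree as ET

# HYPOTHETICAL markup: the verse is nested inside the paragraph, so the
# paragraph's extent is stated by the tags rather than inferred.
DOC = """<chapter>
<p>When she came to the lines
<verse><line>stubborn spearsmen still made good</line>
<line>Their dark impenetrable wood,</line></verse>
she stopped in ecstasy to shut her eyes.</p>
<p>When she opened them again it was to behold Diana.</p>
</chapter>"""

root = ET.fromstring(DOC)
paragraphs = root.findall("p")

# Text before the first child element is the paragraph's opening words.
para_starts = [(p.text or "").strip() for p in paragraphs]

# Text after the verse, but still inside the same <p>, is the verse's "tail".
verse = paragraphs[0].find("verse")
continuation = (verse.tail or "").strip()

print(len(paragraphs), "paragraphs")
print("continues first paragraph:", continuation)
```

Here the processor needs no case or punctuation analysis: "she stopped in ecstasy" sits inside the first `p` element, so a stylesheet that indents paragraphs would indent only the two real paragraph starts.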
If a "tool" is built to demonstrate the supremacy of regularized plain text, it better be able to handle the many exceptions such as the one above -- when the end-user wants paragraphs to be indented during presentation, "she stopped in ecstasy..." better not be indented.

Jon Noring