Re: DP output is technically obsolete

carlo said:
Another issue is to automate the creation of txt from HTML.
why do it backwards? when it's done correctly, the .txt file can create the .html... and an xhtml file, if that's what you want. and your .epub. plus it can generate whatever kind of .pdf you might want... don't you realize how stupid you sound when you say this? -bowerbird

Hypothesis: A good paradigm for proofing and marking up a book is an outline.

Several assumptions help this work.

1. Without any exceptions I can think of, any comprehensible printed text can be completely and unambiguously outlined. We know from experience that it works. Any XML document, including an XHTML document, complies by definition. It's not just a good idea, it's the law.

2. An outline is easy to define and easy to understand. Conceptually, it's simply a regular hierarchical structure: a simple sequential list of top-level elements, with every other syntactic element completely embedded within another.

3. Any syntactic element can be structurally identified as one of three types:
   a.) A section.
   b.) A sequence of characters.
   c.) A position offset from the start (of the text, and/or of an element).
   We know from experience that this works. Any HTML element can be bound by one of only two types: a <div> or a <span>. What we need to do is associate logical divs and spans with syntax.

==================================================
Benefits:

A book that has been outlined is probably simultaneously easier to build, to read, to comprehend, to verify visually, to verify grammatically with software, and to transform into ebook markups than a book in any other format. And structurally, it's self-validating.

Low barrier to entry. Anyone can proof with confidence from the start, given a brief introduction and a list of syntax elements.

==================================================
Proofing interface.

Notice that the proofing representation can be entirely separate from the serialized representation, i.e. how it's stored in a file, for instance.

What might it look like? We have lots of history for this. There are not many ways to represent language that are more universal than an outline. Almost all of us come pre-trained.

Say the convention is to start an element with a newline, a plus sign, and a syntax tag, on a line by themselves. Paragraphs are so common that they can just start with, say, two blank lines. An element's content continues as indented content. An element ends with the start of another element at the same indentation level, with two blank lines (another paragraph), or with outdented content.

+chapter
  +chapter-heading
    Chapter The First


  It was a dark and windy ...

I think I'll play with this a bit and see how far it goes. Is anyone familiar with other attempts in this line?
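To make the convention above concrete, here is a rough Python sketch of a reader for it. Nothing here is a finished design: the names `Node`, `parse_outline`, and `to_divs`, the `book` root element, and the `p` tag for blank-line paragraphs are all my own assumptions, and the rule that continuation lines sit at the same indentation as the paragraph's first line is a guess at what the convention intends.

```python
import html
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    tag: str
    indent: int                 # column of the line that opened the element
    implicit: bool = False      # True for a paragraph opened by two blank lines
    children: List[Union["Node", str]] = field(default_factory=list)


def parse_outline(text: str) -> Node:
    """Build a tree from '+tag' lines, indentation, and blank-line paragraphs."""
    root = Node("book", -1)     # hypothetical container for top-level elements
    stack = [root]
    blanks = 0
    for raw in text.splitlines():
        if not raw.strip():
            blanks += 1
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        body = raw.strip()
        is_tag = body.startswith("+")
        starts_paragraph = blanks >= 2 and not is_tag
        plain_text = not is_tag and not starts_paragraph
        blanks = 0
        # Close every open element at the same or deeper indentation, except
        # that plain text at a paragraph's own indentation continues it.
        while stack[-1].indent >= indent:
            top = stack[-1]
            if plain_text and top.implicit and top.indent == indent:
                break
            stack.pop()
        if is_tag:
            node = Node(body[1:], indent)
        elif starts_paragraph:
            node = Node("p", indent, implicit=True, children=[body])
        else:
            stack[-1].children.append(body)
            continue
        stack[-1].children.append(node)
        stack.append(node)
    return root


def to_divs(node: Node, depth: int = 0) -> List[str]:
    """Render the tree as nested <div> elements, one output line per item."""
    pad = "  " * depth
    lines = [f'{pad}<div class="{html.escape(node.tag, quote=True)}">']
    for child in node.children:
        if isinstance(child, Node):
            lines.extend(to_divs(child, depth + 1))
        else:
            lines.append(f"{pad}  {html.escape(child)}")
    lines.append(f"{pad}</div>")
    return lines


sample = (
    "+chapter\n"
    "  +chapter-heading\n"
    "    Chapter The First\n"
    "\n"
    "\n"
    "  It was a dark and windy ...\n"
)
print("\n".join(to_divs(parse_outline(sample))))
```

The tree is the point. Once the outline is parsed, the same structure could be walked to emit divs and spans, or any other markup, without touching the proofed text.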

Oh, and ... Yes, indenting every line of text is a bitch. So don't do it. If the tagging is done properly (probably an adaptation of the example), software can indent it automatically.
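A sketch of what the automatic indenting might look like, in Python. One big assumption: without indentation, the software needs some other way to know where an element ends, so this sketch invents a closing line of the form `-tag` that matches the open `+tag`. That closing marker is hypothetical, not part of the convention described earlier, and `auto_indent` is just an illustrative name.

```python
from typing import List


def auto_indent(text: str, step: int = 2) -> str:
    """Indent flat '+tag' ... '-tag' text so that nesting shows as indentation."""
    open_tags: List[str] = []   # tags of the elements currently open
    out: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            out.append("")      # keep blank lines; they mark paragraph starts
            continue
        # A line closes an element only if it names the innermost open tag,
        # so ordinary text that begins with a dash is left alone.
        if open_tags and line == "-" + open_tags[-1]:
            open_tags.pop()
            continue            # dropped: indentation now carries the structure
        out.append(" " * (step * len(open_tags)) + line)
        if line.startswith("+"):
            open_tags.append(line[1:])
    return "\n".join(out)


flat = "\n".join([
    "+chapter",
    "+chapter-heading",
    "Chapter The First",
    "-chapter-heading",
    "",
    "",
    "It was a dark and windy ...",
    "-chapter",
])
print(auto_indent(flat))
```

The proofer types flat text with tags, and the indented outline is generated for display only.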
participants (2)
-
Bowerbird@aol.com
-
don kretz