
Marcello Perathoner <marcello@perathoner.de> wrote:
Lee Passey wrote:
On the other hand, I don't see how Mr. Hutchinson's second example could validate if the first does not, particularly given the fact that DTDs are not structured in such a way as to permit a validator to make that kind of judgment ("if a <div> contains a <div> it must be the last element of the first <div>" or "if a <div> contains a <div> it may be preceded by a <p>, but not followed by one").
This simple declaration does exactly that:
<!ELEMENT div (p*, div*)>
"A div may contain zero or more p followed by zero or more div."
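To see why a sequence content model enforces ordering, here is a minimal sketch in Python. It is not a DTD validator and does not use the real TEI DTD; it only mimics the one-line declaration above by turning (p*, div*) into a regular expression over child element names. The names DIV_MODEL and div_is_valid are made up for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# The content model (p*, div*) written as a regex over
# space-terminated child element names (an illustrative stand-in).
DIV_MODEL = re.compile(r"(?:p )*(?:div )*")


def div_is_valid(elem):
    """Return True if every <div> in the subtree obeys (p*, div*)."""
    if elem.tag == "div":
        # Build the sequence of direct child names, e.g. "p p div ".
        names = "".join(child.tag + " " for child in elem)
        if not DIV_MODEL.fullmatch(names):
            return False
    # Recurse so nested <div> elements are checked as well.
    return all(div_is_valid(child) for child in elem)


# A <p> followed by a nested <div>: allowed by the model.
ok = ET.fromstring("<div><p/><div><p/></div></div>")
# A nested <div> followed by a <p>: rejected by the model.
bad = ET.fromstring("<div><div><p/></div><p/></div>")

print(div_is_valid(ok))
print(div_is_valid(bad))
```

The first document passes and the second fails, purely because the comma in the content model fixes the order: once a div has appeared, no p may follow it.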
Well, I carefully decomposed the TEI DTD and discovered that you're absolutely right (but you knew that already, didn't you :-)). As I understand it, a <div> can contain just about any other element, but once you include another <div> you can't include anything else (almost). What the hell were they thinking? I don't see anything in the English spec that would have led me to this conclusion, and I can't think of any rationale for why it should be this way. Is it possible that the DTD has incorrectly implemented the TEI spec? Or did the authors really intend this inane result? I have to admit, this requirement (and the fact that <div> is not allowed inside <p>) really makes me have second thoughts about the usefulness of TEI as an encoding (because it hinders you from making a level-one, incomplete encoding). I would really like to know what the rationale for this rule is.
In this particular case, I suspect a bug in the validator program. I mean, writing validators is hard, and I am aware of at least one bug in the W3C's online HTML validator.
No bug. The TEI DTD is broken as designed.
Well, at least I was able to figure out that _something_ was broken.