More than you ever wanted to know about XHTML and CSS

26 Apr 2005

      gutvol-d-request@lists.pglaf.org wrote:
...
On Sun, Apr 24, 2005 at 04:20:15PM -0400, Bowerbird@aol.com wrote:
...
i'm trying to look at the .html version of #15701, which i
downloaded as a zip file to my own machine, and it seems to require
an open internet connection. it wants to call the w3 or something.
why?
I don't know. It doesn't for me. It's conceivable--just-- that your
 browser is trying to pre-fetch the W3 DTD as defined in the DOCTYPE
 declaration, but it's the first time I ever heard of something like
 that happening. And the same declaration is in lots of texts; nothing
 new or strange about this one.
Internet Explorer 5.5 is the only browser I have for which my firewall
is configured to prevent outgoing access without permission. Opening
this file in IE 5.5 does not create any outgoing connections.
Examination of the file reveals that there are no resources referenced
in the file external to the file itself except: 1. the DTD declaration,
and 2. an image of the Burke coat of arms. Given the fact that your
browser is attempting to contact the W3C (the "owners" of the XHTML DTD)
I would agree with Mr. Tinsley that your browser seems to be attempting
to fetch the declared DTD. In fact, given that Opera seems to have
fairly good support for most XML vocabularies other than XHTML, I would
bet that you're seeing this behavior when using the Opera browser.

When you refuse the outgoing connection, is the document displayed
anyway? (not that it will make any difference, but I _am_ curious).

[snip]
and the .html version of #15698 won't work in either one...
That one is more interesting; it doesn't have a terminating HTML
 comment mark after the <style>. However, the W3C validators have no
 problem with it, and the parse tree is recognized, and I've given up
 trying to track all the ways that foreign command-sets or languages
 can be embedded in HTML. Maybe a newer browser will help. What is
 Opera on now? 8?
The problem is, indeed, the unterminated comment. The XHTML DTD defines
the <style> element as containing #PCDATA, which is to say textual
'stuff' which may or may not be HTML. An HTML User Agent should _not_
attempt to parse any of the data between <style> and </style>, but
should pass that text on to the stylesheet parser.

It has become common to embed an internal style declaration inside HTML
comments (<!-- ... -->) for compatibility with older browsers which did not
support style sheets.  If a browser did not support style sheets it
would encounter the <style> tag and ignore it, as all good browsers are
designed to do. It would then encounter the HTML comment tag and ignore
everything until the closing tag was encountered. That way the browser
wouldn't display the style definitions as just more text. On the other
hand, stylesheet parsers are designed to ignore the comment tags
themselves, so all the stylesheet goodness is visible to a stylesheet
parser.
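
To make the two views concrete, here is a rough Python sketch (not how any
real browser is implemented). The sample markup and the function names
legacy_browser_text and stylesheet_text are my own inventions for
illustration. The first function approximates a browser with no style sheet
support; the second approximates what a style-aware browser hands to its
stylesheet parser.

```python
import re

# Hypothetical sample document: a style block hidden inside an HTML comment.
SAMPLE = """<style type="text/css">
<!--
p { margin: 0; }
-->
</style>
<p>Hello</p>"""


def legacy_browser_text(html: str) -> str:
    """Approximate a pre-CSS browser: skip comments, ignore unknown tags."""
    # A comment runs to its closing marker, or to end of input if none exists.
    without_comments = re.sub(r"<!--.*?(?:-->|\Z)", "", html, flags=re.DOTALL)
    # Whatever tags remain are dropped; only their text content is "displayed".
    without_tags = re.sub(r"<[^>]+>", "", without_comments)
    return without_tags.strip()


def stylesheet_text(html: str) -> str:
    """Approximate a style-aware browser: take the <style> body, skip the markers."""
    match = re.search(r"<style\b[^>]*>(.*?)</style\s*>", html,
                      flags=re.DOTALL | re.IGNORECASE)
    if match is None:
        return ""
    body = match.group(1)
    # The comment markers themselves are ignored; the rules between them survive.
    return body.replace("<!--", "").replace("-->", "").strip()


if __name__ == "__main__":
    print(repr(legacy_browser_text(SAMPLE)))
    print(repr(stylesheet_text(SAMPLE)))
```

If the closing comment marker is deleted from the sample, the first function
swallows everything after the opening marker and returns an empty string,
which is roughly the failure being reported here.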

While the lack of a closing comment tag in the <style> element is a bug
in the document, the failure of your browser(s) to ignore comment tags
in a <style> element is also a bug in those programs. I don't have a
working installation of any IE version prior to 5.5 (and 5.5 itself does
_not_ have this problem), but the problem does present itself in Opera
7.11, and has been fixed in Opera 7.51. My experience has been that in
the past Opera has been somewhat slavishly devoted to mimicking the
behavior of IE, even when that behavior is contrary to internet standards
(JavaScript implementations come to mind). It would therefore not be
surprising if early versions of Opera had the same behavior as early
versions of IE.

Despite the bugginess of your browsers, the HTML text at Project
Gutenberg really should be fixed, as the unterminated comment will hide
the entire text in any browser which does not support the <style> element.

Because the content of a <style> element is #PCDATA, HTML validators
will generally not be able to catch this type of error. I have examined
the source code for HTML Tidy, and when it encounters a <style> tag it
simply creates a text node for the entire text up to the </style> tag.
No validation of the actual style sheets is performed. I suspect that
the W3C validator operates the same way. Validators are good tools, but
satisfying a validator does not mean that the HTML is, in fact, valid --
only that there are no errors of the type that the validators are
designed to catch.
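
Since the validators will not flag this, a small separate check is easy
enough to write. The following Python sketch is only an assumption about
what such a check might look like; it is not part of HTML Tidy or the W3C
validator, and the name unbalanced_style_comments is hypothetical. It
counts comment markers inside each <style> element and reports the
elements where the counts differ.

```python
import re
from typing import List

# Matches one <style> element and captures its raw text content.
STYLE_RE = re.compile(r"<style\b[^>]*>(.*?)</style\s*>",
                      re.DOTALL | re.IGNORECASE)


def unbalanced_style_comments(html: str) -> List[int]:
    """Return 0-based indexes of <style> elements whose comment markers don't pair up."""
    problems = []
    for index, match in enumerate(STYLE_RE.finditer(html)):
        body = match.group(1)
        # A hidden style block should open exactly as many comments as it closes.
        if body.count("<!--") != body.count("-->"):
            problems.append(index)
    return problems
```

This is a regular-expression heuristic, not a parser: it only looks at
markers inside <style> elements and would be fooled by marker text that
appears inside a CSS string.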

On a related note, let me say that I view internal style declarations as
just plain rude. Style sheets are indeed A Good Thing, but someone
imposing their quirky notions of style on me is not. Placing style
definitions in an external style sheet, and simply linking that style
sheet into the main document with a <link> element, makes it easy for
me to strip away the suggested styles and return to browser defaults
by simply deleting or renaming the style sheet. And if the suggested
styles are mostly good, and need only a slight tweaking, it is safer and
easier to edit an external style sheet than the main document. I would
strongly encourage all PG volunteers who are creating HTML documents to
consider putting suggested style definitions in an external style sheet
rather than embedding those styles in the main document.
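
For volunteers with existing files, the conversion can be roughly
automated. The Python sketch below is one possible approach under my own
assumptions: the function name externalize_styles and the default file
name style.css are hypothetical, and it uses a regular expression where a
real tool would use an HTML parser. It pulls the text out of every
internal <style> element and replaces those elements with a single <link>.

```python
import re

# Matches one <style> element and captures its raw text content.
STYLE_RE = re.compile(r"<style\b[^>]*>(.*?)</style\s*>",
                      re.DOTALL | re.IGNORECASE)


def externalize_styles(html: str, css_href: str = "style.css"):
    """Return (new_html, css_text) with internal <style> blocks replaced by one <link>."""
    css_parts = []

    def collect(match):
        # Drop the comment markers that were only there to hide the styles.
        body = match.group(1).replace("<!--", "").replace("-->", "").strip()
        css_parts.append(body)
        return ""

    stripped = STYLE_RE.sub(collect, html)
    if not css_parts:
        return html, ""  # nothing to move

    link = f'<link rel="stylesheet" type="text/css" href="{css_href}" />'
    # Place the link just before </head>; fall back to the start of the document.
    new_html, count = re.subn(r"</head\s*>",
                              lambda m: link + "\n" + m.group(0),
                              stripped, count=1, flags=re.IGNORECASE)
    if count == 0:
        new_html = link + "\n" + stripped
    return new_html, "\n".join(css_parts) + "\n"
```

The caller would write css_text to the file named by css_href next to the
HTML file; deleting or renaming that file then returns the document to
browser defaults.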

Lee Passey

Marcello Perathoner
