re: [gutvol-d] Re: here ya go, joey

just so y'all understand, i sent joey the .html version of my "alice" that passes the validator -- backchannel -- and that's what his post was about... and now i realize that "his" user agent is one that he's programming himself... i'm happy to help another programmer track down some idiosyncrasies, joey, and i'll send you a report backchannel.
Now, just to check my understanding, this is a document that went from PG-TEI -> zml -> HTML?
i started with plain-ascii text-files -- the origin of which is irrelevant -- and compared them all to tweak a consensus copy that was good .zml. i then ran that through my text-to-html converter. for alice, i did this long ago, and might have hand-tweaked things after that conversion, as my converter is still a work-in-progress, i can't remember. last friday, you might or might not recall, i turned my focus to "passing the validator" -- only because i'm preparing a submission to project gutenberg for a friend -- which informed me of some small tweaks needed to jump through that hoop. i applied those tweaks to the alice .html when marcello tried to make "validation" an issue here... to give you an idea of the necessary changes,
<center><h3><a href="#toc"> chapter one </a></h3></center> passes the validator, but
<h3><center><a href="#toc"> chapter one </a></center></h3> does not. it wants the header tag nested inside the center tag, not vice-versa. since "the validator knows html better than i do", i'm sure there is some "good" reason for this discrepancy, but i don't know -- or care -- what it might be, because both ways work fine in 99% of the browsers out there in the installed base.
call me a heretic, but i don't care about the technoids and their "seal of approval" that my .html is written in their preferred style. as long as it works, i'm done with it, and any more time invested is a waste of my time. as i said, i understand why project gutenberg requires validation -- because they cannot scrutinize every .html file that is submitted to them to ensure that "it works" -- so when i am submitting a file i'll make sure that it validates. but otherwise, i will make sure my converter outputs files that do _not_ validate, since i don't want any of my "antagonists" here using _my_program_ to further their technoid aims which are directly contrary to my own... -bowerbird

Bowerbird@aol.com wrote:
Now, just to check my understanding, this is a document that went from PG-TEI -> zml -> HTML?
i started with plain-ascii text-files -- the origin of which is irrelevant --
No, it is not. Because you claim to be able to convert the existing texts of the pg archive with almost no markup work at all. Yet, to demonstrate your `converter' you take a file that was produced by a machine. Of course this file is much more regular, and thus easier to cope with, than most of the human-edited files in the pg archive. If you were by any means serious about your `converter', you would take a random selection of at least 100 files out of the pg archive and tweak them so they work in your `converter'. You would then post the source code of your `converter' and the diffs documenting how many changes were necessary to the files to make them work. We could then run those files thru your `converter' and see for ourselves if it works as claimed or doesn't work.
i then ran that through my text-to-html converter. for alice, i did this long ago, and might have hand-tweaked things after that conversion,
So the 159 errors went away by `hand tweaking' the output and not by fixing the bugs in the `converter' ? Are the end-users also expected to hand-tweak the output before reading the book?
last friday, you might or might not recall, i turned my focus to "passing the validator"
Was this before or after you shouted to the world:
i don't give a whit about validation. ... for the time being, my conversion routines will _not_ give .html that passes the validator. and that's a conscious decision.
You should decide on one lie and stick to it. Or people will start to think about foxes and sour grapes.
i don't know -- or care -- what it might be, because both ways work fine in 99% of the browsers out there in the installed base.
How do you prove your claim that your broken html works in "99% of the browsers out there" ? Did you check your work with browsercam? Do you even *know* how many different user-agents there are "out there" ? And even if it should work in 99% (which it does not!) still you would leave 1% out in the rain. This is not an option if you want to post your texts on PG.
i will make sure my converter outputs files that do _not_ validate, since i don't want any of my "antagonists" here using _my_program_ to further their technoid aims which are directly contrary to my own...
Who should want to use a program that makes 159 blunders in a short text and requires `hand tweaking' the output to pass the validator ? Don't lose any of your sleep over that. I think you are *absolutely* safe from that direction. -- Marcello Perathoner webmaster@gutenberg.org

On Fri, Sep 23, 2005 at 08:55:38PM +0200, Marcello Perathoner wrote:
Bowerbird@aol.com wrote:
Now, just to check my understanding, this is a document that went from PG-TEI -> zml -> HTML?
i started with plain-ascii text-files -- the origin of which is irrelevant --
No, it is not. Because you claim to be able to convert the existing texts of the pg archive with almost no markup work at all. Yet, to demonstrate your `converter' you take a file that was produced by a machine. Of course this file is much more regular, and thus easier to cope with, than most of the human-edited files in the pg archive.
Bowerbird: I disagree with your assessment that the origin of the source data doesn't matter. As Marcello so eloquently points out above, I find that there are several orders of magnitude of difference between "converting" plain text to HTML when a program generated the plain text from extremely rigorous markup language and converting a plain text file created by the collaborative efforts of several different humans (each with their own text editing software and habits). Perhaps you can explain why you think it is irrelevant?

Bowerbird wrote:
to give you an idea of the necessary changes, <center><h3><a href="#toc"> chapter one </a></h3></center> passes the validator, but <h3><center><a href="#toc"> chapter one </a></center></h3> does not. it wants the header tag nested inside the center tag, not vice-versa. since "the validator knows html better than i do", i'm sure there is some "good" reason for this discrepancy, but i don't know -- or care -- what it might be, because both ways work fine in 99% of the browsers out there in the installed base.
<h3> (and the other header elements) may not contain any block level content, only textual content (PCDATA) and other inline tags (e.g. <i>, <span>, <a>, etc.). <center> is itself a block level tag, identical to <div align="center">. Thus, <center> should not be used within <h3>.
Here's the link to the HTML 4 spec, which is very useful to understand the many HTML tags, their semantics, their attributes, and content models (what they may contain):
http://www.w3.org/TR/REC-html40/present/graphics.html#edef-CENTER
Since <center> is deprecated (and no longer appears in XHTML 1.1), it is best not to use it. Use CSS instead. For example, CSS can be directly applied to the <h3> (and other header) elements.
Now, if you gotta have certain stuff centered in the absence of CSS, then you can do the following. Instead of:
<center><h3> ... </h3></center>
do:
<h3 align="center"> ... </h3>
The 'align' attribute can be used for a lot of block-level elements:
http://www.w3.org/TR/REC-html40/present/graphics.html#adef-align
It, too, is deprecated (and removed from the newest XHTML), but if you were going to use <center> come hell or high water, then using the 'align' attribute instead leads to simpler markup. Plus now you can specify values of 'left', 'right', 'center' and 'justify'. It should work in all HTML 3.2/4.0 browsers.
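The nesting rule described above can be checked mechanically. What follows is only a minimal sketch, not the W3C validator and not anyone's converter: it uses Python's standard html.parser module, and the names BLOCK_TAGS, HeadingNestingChecker and find_block_in_heading are hypothetical, introduced here for illustration. BLOCK_TAGS is an assumed, incomplete subset of the HTML 4 block-level elements.

```python
from html.parser import HTMLParser

# Assumption: an illustrative, incomplete subset of HTML 4 block-level elements.
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
BLOCK_TAGS = HEADING_TAGS | {
    "center", "div", "p", "ul", "ol", "table", "blockquote", "pre",
}


class HeadingNestingChecker(HTMLParser):
    """Records (heading, block_tag) pairs for block tags opened inside a heading."""

    def __init__(self):
        super().__init__()
        self.open_headings = []  # stack of heading tags that are currently open
        self.problems = []       # list of (heading, offending block tag)

    def handle_starttag(self, tag, attrs):
        # A block-level tag opened while a heading is still open breaks the
        # "inline content only" rule for headings.
        if self.open_headings and tag in BLOCK_TAGS:
            self.problems.append((self.open_headings[-1], tag))
        if tag in HEADING_TAGS:
            self.open_headings.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching heading, if one is open.
        if tag in HEADING_TAGS and tag in self.open_headings:
            while self.open_headings.pop() != tag:
                pass


def find_block_in_heading(html):
    """Return a list of (heading, block_tag) nesting problems found in html."""
    checker = HeadingNestingChecker()
    checker.feed(html)
    checker.close()
    return checker.problems
```

Under these assumptions, the two chapter-heading variants quoted in this thread come out differently: the <center><h3> ... </h3></center> form yields an empty list, while the <h3><center> ... </center></h3> form reports a ("h3", "center") pair. The <h3 align="center"> form also yields an empty list, since it puts no block-level tag inside the heading.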
as long as it works, i'm done with it, and any more time invested is a waste of my time. as i said, i understand why project gutenberg requires validation -- because they cannot scrutinize every .html file that is submitted to them to ensure that "it works" -- so when i am submitting a file i'll make sure that it validates. but otherwise, i will make sure my converter outputs files that do _not_ validate, since i don't want any of my "antagonists" here using _my_program_ to further their technoid aims which are directly contrary to my own...
<laugh/> Jon
participants (4)
- Bowerbird@aol.com
- joey
- Jon Noring
- Marcello Perathoner