
On Sun, January 22, 2012 8:53 pm, hmonroe.pglaf@huntermonroe.com wrote: [snip]
Why aren't WWers sending back projects that include destructive layout tagging, or don't include important structural tagging? I can think of any number of reasons for rejection that are less disruptive to the reader's satisfaction.
Because we have automated checks for validity and good spelling. We don't have automated checks for (mis-) use of HTML for layout. If we had some sort of automated and relatively unambiguous checks for such things, I'm sure that many submitters would strive to comply.
-- Greg
There is a golden opportunity here for someone to create an automated tool to assess the use/misuse of HTML which PG could use to screen submissions--see my earlier post with some simple tests written in Perl.
The problem here is two-fold:

1. PG has no standards for submission of HTML (other than the obvious one that it must be valid HTML), and no one can code to a non-existent standard. If PG had unambiguous published standards, I'm sure that most submitters would strive to comply even /without/ automated checks.

2. The evaluation of a file for the existence of destructive layout tagging and the non-existence of structural tagging cannot be automated. (The first part of this statement is probably an exaggeration. I could probably easily write a tool that would check for "style" attributes, but Perl wouldn't be the best language for the job.) Tidy can check for well-formedness, and there are tools like Jing (http://www.thaiopensource.com/relaxng/jing.html) that can test for compliance with a schema, but I can't think of a single tool that can tell you that "<p align="center">Chapter One</p>" is wrong on so many levels.

The solution is also two-fold:

1. Develop a consensus HTML coding style for PG. Heck, it doesn't even need to be a consensus; a mandate from TPTB would serve just as well, but a consensus is more likely to be adopted.

2. Build a small set of individuals who are familiar with PG's HTML coding style and could review HTML submissions. For example, I'm familiar enough with the use of HTML for encoding e-books that I'll bet I could judge whether a file is acceptable in less than 10 minutes. You could call this group of examiners "white washers" for lack of a better term.

But without solution #1, nothing else is possible.
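
To illustrate the kind of "style" attribute check mentioned under problem #2, here is a minimal sketch in Python using only the standard library's html.parser module. The rule sets (PRESENTATIONAL_ATTRS, PRESENTATIONAL_TAGS) and the names LayoutTagFinder and find_layout_tagging are my own assumptions for the example, not any PG standard; a real list would have to come out of solution #1. Note that it only flags the easy half of the problem: it can report that a paragraph carries an "align" attribute, but it cannot tell that the text ought to have been marked up as a heading.

```python
from html.parser import HTMLParser

# Hypothetical rule set for illustration only; PG publishes no such list.
PRESENTATIONAL_ATTRS = {"style", "align", "bgcolor", "color", "face"}
PRESENTATIONAL_TAGS = {"font", "center"}


class LayoutTagFinder(HTMLParser):
    """Collect (line, tag, attribute) records for layout-oriented markup."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        line, _column = self.getpos()
        if tag in PRESENTATIONAL_TAGS:
            # The element itself is presentational; no attribute involved.
            self.findings.append((line, tag, None))
        for name, _value in attrs:
            if name in PRESENTATIONAL_ATTRS:
                self.findings.append((line, tag, name))


def find_layout_tagging(html_text):
    """Return a list of (line, tag, attribute_or_None) findings."""
    finder = LayoutTagFinder()
    finder.feed(html_text)
    finder.close()
    return finder.findings


if __name__ == "__main__":
    sample = '<p align="center">Chapter One</p>\n<h2>Chapter Two</h2>\n'
    for line, tag, attr in find_layout_tagging(sample):
        what = f'attribute "{attr}"' if attr else "presentational element"
        print(f"line {line}: <{tag}> uses {what}")
```

The second sample line passes silently, which is the point: a check like this can only complain about what is present, so a file with no headings at all would also pass. Judging that still needs the human reviewers proposed in solution #2.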