Re: [gutvol-d] Producing epub ready HTML

23 Jan 2012

      On Sun, January 22, 2012 8:53 pm, hmonroe.pglaf@huntermonroe.com wrote:

[snip]
...
...
Why aren't WWers sending back projects that include destructive layout
tagging, or don't include important structural tagging? I can think of any
number of reasons for rejection that are less disruptive to the reader's
satisfaction.
Because we have automated checks for validity and good spelling. We don't have
automated checks for (mis-) use of HTML for layout. If we had some sort of
automated and relatively unambiguous checks for such things, I'm sure that
many submitters would strive to comply.
-- Greg
...
There is a golden opportunity here for someone to create an automated tool to
assess the use/misuse of HTML which PG could use to screen submissions--see my
earlier post with some simple tests written in Perl.
The problem here is two-fold:

1. PG has no standards for submission of HTML (other than the obvious one that
it must be valid HTML), and no one can code to a non-existent standard. If PG
had unambiguous published standards I'm sure that most submitters would strive
to comply even /without/ automated checks.

2. The evaluation of a file for the existence of destructive layout tagging
and the non-existence of structural tagging cannot be automated. (The first
part of this statement is probably an exaggeration: I could easily write a
tool that checks for "style" attributes, though Perl wouldn't be the best
language for the job.) Tidy can check for well-formedness, and there are
tools like Jing (http://www.thaiopensource.com/relaxng/jing.html) that can
test for compliance with a schema, but I can't think of a single tool that can
tell you that '<p align="center">Chapter One</p>' is wrong on so many levels.
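
For illustration only, here is a minimal sketch of the "easy half" of point 2:
a check that flags "style" and similar presentational attributes. It is written
in Python with the standard library's html.parser. The attribute list
LAYOUT_ATTRS and the names LayoutAttrFinder and find_layout_attrs are
assumptions made up for this example, not any PG standard or existing tool.

```python
from html.parser import HTMLParser

# Hypothetical list of presentational attributes to flag.
# This is an assumption for the sketch, not a published PG rule.
LAYOUT_ATTRS = {"style", "align", "valign", "bgcolor"}


class LayoutAttrFinder(HTMLParser):
    """Collect (line, tag, attribute) for every layout attribute seen."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # getpos() reports where the current tag starts (1-based line).
        line, _column = self.getpos()
        # html.parser lowercases tag and attribute names for us.
        for name, _value in attrs:
            if name in LAYOUT_ATTRS:
                self.findings.append((line, tag, name))


def find_layout_attrs(html_text):
    """Return a list of (line, tag, attribute) tuples for html_text."""
    finder = LayoutAttrFinder()
    finder.feed(html_text)
    finder.close()
    return finder.findings


if __name__ == "__main__":
    sample = '<p align="center">Chapter One</p>'
    for line, tag, attr in find_layout_attrs(sample):
        print(f"line {line}: <{tag}> uses layout attribute '{attr}'")
```

A check like this can only report that align="center" is present. It cannot
say that the paragraph ought to have been marked up as a heading, which is the
part that still needs a published standard and a human reviewer.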

The solution is also two-fold:

1. Develop a consensus HTML coding style for PG. Heck, it doesn't even need to
be a consensus; a mandate from TPTB would serve just as well, but a consensus
is more likely to be adopted.

2. Build a small set of individuals who are familiar with PG's HTML coding
style and could review HTML submissions. For example, I'm familiar enough with
the use of HTML for encoding e-books that I'll bet I could judge whether a
file is acceptable in less than 10 minutes. You could call this group of
examiners "white washers" for lack of a better term.

But without solution #1, nothing else is possible.

Lee Passey