I took a look at the source for the recent handsome re-release of
PG's edition of A Christmas Carol (46-h).
The code is a bit old: the <p> tags are not terminated, and the
markup could be laid out better to make it more readable.
For example, the first paragraph looked like this:
<p>
<span class="caps">Marley</span> was dead: to begin with. There is no
doubt whatever about that. The register of his burial was signed by
the clergyman, the clerk, the undertaker, and the chief
mourner. Scrooge signed it: and Scrooge’s name was good upon
’Change, for anything he chose to put his hand to. Old Marley
was as dead as a door-nail.
I ran the file through HTML Tidy, which turned it into this:
<p><span class="caps">Marley</span> was dead: to begin with.
There is no doubt whatever about that. The register of his burial
was signed by the clergyman, the clerk, the undertaker, and the
chief mourner. Scrooge signed it: and Scrooge's name was good
upon 'Change, for anything he chose to put his hand to. Old
Marley was as dead as a door-nail.</p>
It took about ten seconds to open the file, run it through Tidy,
and save it. The result is a file that is consistent,
standards-compliant, and far easier to read and process.
Unclosed tags in HTML are an artifact of SGML. They can confuse some
browsers and processing software, and they limit what you can do with CSS.
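As a rough way to spot files with this problem before tidying them,
something like the following Python sketch could count <p> tags that
are opened but never closed. It uses only the standard library's
html.parser; the names ParagraphBalance and unclosed_paragraphs are
made up for illustration, and it only compares counts, so treat it as
a quick check rather than a validator.

```python
from html.parser import HTMLParser


class ParagraphBalance(HTMLParser):
    """Count <p> start tags and </p> end tags exactly as written in the source."""

    def __init__(self):
        super().__init__()
        self.opened = 0
        self.closed = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <P> and <p> are both counted.
        if tag == "p":
            self.opened += 1

    def handle_endtag(self, tag):
        if tag == "p":
            self.closed += 1


def unclosed_paragraphs(html_text):
    """Return the number of <p> start tags minus the number of </p> end tags."""
    parser = ParagraphBalance()
    parser.feed(html_text)
    parser.close()
    return parser.opened - parser.closed
```

A result greater than zero means at least one paragraph relies on the
parser to close it implicitly.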
I suggest that all PG HTML files be run through Tidy before being
released.
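If this were done in bulk, a small wrapper along these lines might be
enough. This is only a sketch: it assumes the tidy executable is on
the PATH, the function names are hypothetical, the 72-column wrap is
an arbitrary choice, and the flags used are Tidy's -q (quiet), -m
(modify the file in place) and -w (wrap column). Because -m overwrites
the input, it should be run on copies first.

```python
import subprocess
from pathlib import Path


def build_tidy_command(path, wrap=72):
    """Build the argument list for one in-place Tidy run (hypothetical helper)."""
    # -q: suppress nonessential output, -m: modify the input file,
    # -w N: wrap text at column N (72 is an assumption, not a PG rule).
    return ["tidy", "-q", "-m", "-w", str(wrap), str(path)]


def tidy_file(path):
    """Run Tidy on one file and return its exit status."""
    result = subprocess.run(build_tidy_command(path),
                            capture_output=True, text=True)
    # Tidy exits with 0 for a clean file, 1 for warnings, 2 for errors.
    return result.returncode


def tidy_tree(root):
    """Run Tidy on every .htm/.html file under root; map each path to its status."""
    return {p: tidy_file(p) for p in sorted(Path(root).rglob("*.htm*"))}
```

Calling tidy_tree("some/directory") returns a dictionary that can be
scanned for files whose status is 2, which would need fixing by hand.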
If anyone wants the tidy'd version, let me know.
b/
--
Brad Collins <brad(a)chenla.org>, Bangkok, Thailand