
So, what you are telling me here is that while a human can muddle through ok, it takes a computer to really mess things up.

On Sat, 28 Jan 2006 Bowerbird@aol.com wrote:
michael said:
I read the PG eBooks with all sorts of plain text viewers and have no problems with inconsistencies, much less in the various browsers that have a wider range of options.
the inconsistencies are ones that a person "wouldn't notice", but which trip up any automated processing by a program...
an obvious example would be that most section-headings (e.g., chapter headings) are preceded by four blank lines, but the occasional one might have three, or five, instead...
nobody would claim that, in terms of a human reader, this inconsistency is meaningful -- it's not -- but when it comes to a program analyzing the file, it might make a difference...
if there's only one level of header, then 3 or 4 or 5 blank lines might be equally good at signaling that there is a new section.
but if a book has three different levels of headers, as some do, then you could use 5 blank lines to indicate the major sections, 4 to indicate regular sections, and 3 to indicate the subsections.
if the number of blank lines isn't consistent, the program has to become much more sophisticated (and thus prone to failure) to try and determine the _actual_ level of each header.
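to make the point concrete, here is a minimal sketch in python of the "easy" case, where the blank-line counts can be trusted. the 5/4/3 mapping follows the scheme described above; the names LEVEL_BY_BLANKS and find_headers are made up for illustration, and this is not taken from any existing p.g. tool.

```python
# Hypothetical mapping, following the 5/4/3 scheme described above:
# 5 blank lines = major section, 4 = regular section, 3 = subsection.
LEVEL_BY_BLANKS = {5: 1, 4: 2, 3: 3}


def find_headers(text):
    """Return (level, heading) pairs for lines preceded by exactly 3, 4, or 5 blank lines."""
    headers = []
    blanks = 0  # number of consecutive blank lines seen so far
    for line in text.splitlines():
        if not line.strip():
            blanks += 1
            continue
        level = LEVEL_BY_BLANKS.get(blanks)
        if level is not None:
            headers.append((level, line.strip()))
        blanks = 0  # any non-blank line resets the count
    return headers
```

note that this only works because it takes the counts at face value. a major heading that happens to sit after 4 blank lines is silently reported as a regular section, and one that sits after 6 is missed altogether, which is exactly where the extra sophistication (and the extra failure modes) would have to come in.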
another example involves lines which should not be rewrapped, such as the lines in a table, or the lines in a letter's address-block. if these are consistently prefaced with one or more leading spaces, then a rewrap routine is easy to _write_ and easy to _comprehend_, and a programmer can spend time on more productive pursuits that add value and functionality, not ones that just resolve inconsistencies.
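as a rough sketch of how short such a rewrap routine can be when the leading-space rule is followed consistently (python again; the function name rewrap and the default width of 70 are assumptions for illustration):

```python
import textwrap


def rewrap(text, width=70):
    """Rewrap ordinary paragraphs; pass through any line that starts with a space."""
    out = []
    para = []  # lines of the paragraph currently being collected

    def flush():
        # Emit the collected paragraph, rewrapped to the requested width.
        if para:
            out.extend(textwrap.wrap(" ".join(para), width=width))
            para.clear()

    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current paragraph.
            flush()
            out.append("")
        elif line.startswith(" "):
            # Leading space: a table row, address block, etc. Keep it as-is.
            flush()
            out.append(line.rstrip())
        else:
            para.append(line.strip())
    flush()
    return "\n".join(out)
```

the whole routine rests on one test, line.startswith(" "). if some tables or address-blocks in a file lack the leading spaces, this version will happily merge them into a paragraph, and the program has to start guessing instead.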
lots of programmers have _started_ programs for the p.g. library. the vast majority of them have given up before long, in frustration. the inconsistencies in the formatting are the main source of difficulty.
someday someone will set up a shadow version of the p.g. library where all the inconsistencies are resolved, and you will see then how much value is added by the ingenuity of programmers who are able to take consistent formatting of the e-texts for granted...
-bowerbird