>but even those are bearable, _if_ they can be
>easily removed.
The linebreaks are removable if PG enforces standards on the txt
files submitted. When people make mistakes in those submissions, and they
will, the linebreaks can no longer be removed correctly. Books of poetry,
or books containing poetry, are one common counterexample. Make a copy of
your linebreak removal routine public in the common computer formats, BB, and
let us test it and see just how well it works on the existing PG txt files.
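
To make the poetry problem concrete, here is a minimal sketch of the most obvious unwrapping rule: blank lines separate paragraphs, and every other linebreak is treated as removable. This is my own illustration, not BB's routine and not whatever PG actually runs; the function name `unwrap` and the sample strings are made up.

```python
import re

def unwrap(text: str) -> str:
    """Join hard-wrapped lines; blank lines are taken as paragraph breaks."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return "\n\n".join(
        " ".join(line.strip() for line in p.splitlines()) for p in paragraphs
    )

# Prose survives the round trip.
prose = "The linebreaks are removable\nif the file follows the rules.\n\nSecond paragraph."
print(unwrap(prose))

# Verse does not: the line structure is part of the text, and it gets flattened.
verse = "Tyger Tyger, burning bright,\nIn the forests of the night;"
print(unwrap(verse))
```

Any smarter rule (indentation, short lines, a marker for verse) only works if submitters apply it consistently, which is the point about standards above.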
The Unicode txt efforts are not too bad, because at least then
people can represent the glyphs the typesetter chose if they wish
to do so, rather than guessing at and reinterpreting intent. Italics and
small caps are still clearly a loss, as are graphics. Most books use at least
italics, so I’d hate to see a PG file format that doesn’t even
support that. If you wanted to implement even a Unicode txt+ file format,
then you’ve got to provide renderers for the different machines. Or
you auto-translate Unicode txt+ files to HTML for submitters and use the
ubiquitous HTML renderers to allow people to view the Unicode txt+ version.
Then submitters do not have to submit HTML unless they want to. Recently
about 95% of the submissions DO have HTML, but it’s not clear whether that
is because people want to provide HTML or because the WW require it.
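
As a rough sketch of what auto-translating txt+ to HTML could look like: nobody has defined "txt+" in this thread, so I am assuming, purely for illustration, that it means Unicode text with blank-line paragraph breaks and `_underscores_` around italics. The function name `txtplus_to_html` is hypothetical.

```python
import html
import re

def txtplus_to_html(text: str) -> str:
    """Convert the assumed txt+ conventions to bare HTML paragraphs."""
    blocks = re.split(r"\n\s*\n", text.strip())
    out = []
    for block in blocks:
        # Collapse the hard wrapping, then escape &, < and > for HTML.
        escaped = html.escape(" ".join(block.split()), quote=False)
        # Assumed convention: _word_ marks italics.
        escaped = re.sub(r"_(.+?)_", r"<i>\1</i>", escaped)
        out.append(f"<p>{escaped}</p>")
    return "\n".join(out)

print(txtplus_to_html("bearable, _if_ they can be\neasily removed."))
```

Small caps, graphics and verse would each need their own agreed marker before a translator like this could carry them across.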
PG *is* already doing this more-or-less on the rare txt-only
submissions nowadays – automagically unwrapping and translating to HTML
in a way which most of the time is a win and obviously occasionally a
loss. The PG legalese unfortunately is particularly unattractive in this
approach, and when the unwrapping fails it is visually distracting –
“how come this paragraph isn’t unwrapped – is it supposed to
be poetry?”
How about it? Unicode txt+ file submissions if that is what a
submitter wants to do, and PG automatically renders that in HTML, and ePUB, and
MOBI?
But if you are willing to take txt-only submissions and autorender
them into HTML, accepting the resulting mistakes, then why is it that you aren’t
willing to take HTML and autorender it into the mandatory txt70 files?
Surely going from HTML to txt70 must introduce fewer mistakes?
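
For comparison, here is a minimal sketch of the opposite direction, using only the Python standard library. It assumes "txt70" simply means plain text wrapped at 70 columns with italics shown as `_underscores_`; it handles only `<p>`, `<i>` and `<em>`, and the names `Txt70` and `html_to_txt70` are my own. It is not PG's actual tooling.

```python
import textwrap
from html.parser import HTMLParser

class Txt70(HTMLParser):
    """Collect paragraph text, marking italics with underscores."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self.buf = []

    def handle_starttag(self, tag, attrs):
        if tag in ("i", "em"):
            self.buf.append("_")

    def handle_endtag(self, tag):
        if tag in ("i", "em"):
            self.buf.append("_")
        elif tag == "p":
            self.flush()

    def handle_data(self, data):
        self.buf.append(data)

    def flush(self):
        # Normalise whitespace and store the paragraph if it is non-empty.
        text = " ".join("".join(self.buf).split())
        if text:
            self.paragraphs.append(text)
        self.buf = []

def html_to_txt70(source: str) -> str:
    parser = Txt70()
    parser.feed(source)
    parser.close()
    parser.flush()  # pick up any text after the last </p>
    return "\n\n".join(textwrap.fill(p, width=70) for p in parser.paragraphs)

sample = "<p>Most books use <i>at least</i> italics.</p><p>" + "word " * 30 + "</p>"
print(html_to_txt70(sample))
```

The paragraph boundaries are explicit in the HTML, so nothing has to be guessed; what is lost is whatever plain text cannot express in the first place.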