Re: [gutvol-d] Talking about TXT

29 Oct 2011

greg johnson said:
...
> ONE USE of HTML (in emails) makes it
> harder for multiple people.
> ANOTHER USE of HTML (in PG books) greatly
> improves the readability for multiple people.

nice try, greg.

but reading software can be configured
so that either format is equally readable.

a great recent innovation in readability
-- it's actually _named_ "readability" --
is one that turns the .html of web-pages
_back_ to a display more like a text-file.

it also eliminates a lot of the _crud_ that
many web-pages have today -- e.g., ads,
recent-tweet slideshows, and all of that,
which is a tremendous relief as well --
but the heart of its charm is that it gives
a straightforward rendering of the words.

listserv software could easily use .html,
and e-books could easily use plain-text.
conversely, both of them could use both.
it's just a small matter of programming...

likewise, converters between the two are
exceedingly common.   if you need a link
to some that work well, just let me know.
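
as a rough illustration of the .html-to-text direction,
here is a minimal python sketch using only the standard
library. it is _not_ how "readability" works (that tool
also has to find the main article on a page), and the
names TextExtractor and html_to_text are made up for
this example. it drops script/style content, starts a
new line at block-level tags, and keeps italics as
underscores so that information is not thrown away.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect readable text from HTML (hypothetical helper, a sketch only)."""

    SKIP = {"script", "style"}                      # content to discard entirely
    BLOCK = {"p", "div", "br", "li", "blockquote",
             "h1", "h2", "h3", "h4", "h5", "h6"}    # tags that start a new line
    ITALIC = {"i", "em"}                            # rendered as _underscores_

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")
        elif tag in self.ITALIC:
            self.parts.append("_")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")
        elif tag in self.ITALIC:
            self.parts.append("_")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(markup):
    """Return one line of text per block-level element, whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    lines = (" ".join(line.split()) for line in "".join(parser.parts).split("\n"))
    return "\n".join(line for line in lines if line)
```

a real converter would also need to handle tables, lists,
links, and nested blockquotes; this one only shows that
the basic extraction is a small amount of code.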

programmers know the truth of formats:
they ultimately end up meaning very little.

what is important is _not_ how the data is
wrapped, but what happens after extraction.

only non-programmers are foolish enough
to think a format can accomplish anything.

if we took all of the energy which has been
_wasted_ on formats in the past decade and
put it instead into improving our programs,
we would be light-years ahead of ourselves.

instead, now we have listserv software that
often fails for many, in one way or another.
on the e-book front, we're now re-living the
"browser wars" from world-wide-web history.
somehow, the format wonks _never_ learn...

the good news is _users_ are now aware of
the importance of apps to user-experience.
programmers can go directly to end-users,
without wading through the format-wonks.

***

greg said:
...
> Here's an attempt to get at an
> objective truth behind the format wars

as per above, only non-programmers will
even bother to think formats really matter,
and get all strung out trying to make them.
...
> a format that is still all ASCII
> but removes all the end-of-column CR's, and
> only has double returns at the end of paragraphs.

first, if you eliminate the end-of-line characters,
there's no need for end-of-paragraph characters
to be doubled -- singles work just fine, thanks...

but the mid-paragraph linebreaks in p.g. e-texts
aren't really a problem that's all _that_ serious...
they can be rather-easily removed systematically.
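
a minimal python sketch of that systematic removal,
under two assumptions that are mine, not a p.g. rule:
blank lines separate paragraphs, and any block holding
an indented line (verse, a table, a blockquote) is left
alone. the function name unwrap is made up for this
example. by default each paragraph comes out as one
long line ended by a _single_ newline.

```python
def unwrap(text, paragraph_sep="\n"):
    """Join hard-wrapped lines into one line per paragraph (sketch only)."""
    # Group consecutive non-blank lines into blocks; blank lines end a block.
    blocks, current = [], []
    for line in text.splitlines():          # splitlines handles \n, \r\n, and \r
        if line.strip():
            current.append(line)
        elif current:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)

    out = []
    for lines in blocks:
        if any(line[0] in " \t" for line in lines):
            # Assumption: indented material is preformatted, so keep its breaks.
            out.append("\n".join(lines))
        else:
            out.append(" ".join(line.strip() for line in lines))
    return paragraph_sep.join(out)
```

pass paragraph_sep="\n\n" to keep the doubled returns
instead. the indentation test is only a heuristic, and
a careful tool would check each book for its exceptions.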

if they seem serious to you, it's only because you
are using some inferior and very stupid software.
...
> Yeah, some peeps might be able to
> write a program that takes out these CR's,
> but I know one time I tried it in something
> like MS Word, I messed it up worse.

ms-word isn't the best programming environment.

and if your first attempt ended in failure, then i can
_invite_ you to "join the club", but to do that, you'll
have to request the paperwork to be a member, and
you'll have to fill it out, and send it in, and hear back,
and if you're accepted, you'll have to pay dues to join.

one failed attempt simply ain't enough to get inside...

so if you wanna join the club, you'll need to do more.

but there's no need, really, because if you really want
to know how to do the job, start the dialog with me...
by the time it's over, you'll know everything you need
to know about how to do the job, easily and accurately.

the important thing to know is that if your viewer-apps
cannot do this for you, _automatically_, the problem is
in your viewer-apps, not necessarily in the text-format.

your viewer-apps should be able to handle any format
that has included all of the reasonably important data.
...
> I'm not advocating for HTML, but for
> a TXT version without the 80 char line breaks.

again, the 80-character line-breaks are easy to fix.

so you're barking up the wrong tree.   concentrate on
the things that are _not_ easy to fix, instead, please...

some of the p.g. e-texts do not include information
about the _structural_ aspects underlying the text...

for instance, they're missing information on italics.
or they don't have a clear indication of blockquotes.

_those_ are the shortcomings in the p.g. text-format.

and -- in reality -- they are the fault of _the_people_
who prepared the e-texts, not endemic in the format.
those structural aspects _can_ be coded in plain-text,
it was just that the person _failed_ to do that coding.

or the workflow failed to tell them that they should!
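
to make the point concrete, here is a small python
sketch that reads structure out of plain text and emits
.html. the conventions are assumptions chosen for this
example, not an official p.g. or .zml specification:
_underscores_ mark italics (with inner underscores
standing for spaces), blank lines separate paragraphs,
and a paragraph whose every line is indented is a
blockquote. the names inline and text_to_html are
hypothetical.

```python
import html
import re

# _word_ or _several_words_, but not the underscores inside snake_case.
ITALIC = re.compile(r"(?<!\w)_(?=\S)(.+?)(?<=\S)_(?!\w)")


def inline(text):
    """Escape HTML special characters, then turn _marked_ spans into <i>."""
    escaped = html.escape(text, quote=False)
    return ITALIC.sub(
        lambda m: "<i>" + m.group(1).replace("_", " ") + "</i>", escaped
    )


def text_to_html(text):
    """Convert paragraphs and indented blockquotes to simple HTML (sketch)."""
    out = []
    for para in re.split(r"\n\s*\n", text.strip("\n")):
        if not para.strip():
            continue
        lines = para.split("\n")
        body = inline(" ".join(line.strip() for line in lines))
        if all(line.startswith((" ", "\t")) for line in lines):
            out.append(f"<blockquote><p>{body}</p></blockquote>")
        else:
            out.append(f"<p>{body}</p>")
    return "\n".join(out)
```

if the italics were never marked in the text file, no
converter can recover them, which is why the marking
has to happen when the e-text is prepared.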

believe it or not, i had to _fight_ to get p.g. to require
that italics should get marked in the plain-text files...
_greg_newby_ himself said that it was "unnecessary".
in 2003! when the library already had 10,000 books!

greg is a smart man.   hey, he _must_ be, since he is
a professor at a university in alaska, a professor with
a library-science degree...   but what he said that day
was stupid...   very very stupid... exceedingly stupid...

luckily, he was big enough to admit he was wrong...

and jim adcock, the jim who is continually railing on
about all the deficiencies of the p.g. e-text format?
we saw here that -- at least on one book he posted,
but probably _every_ book he had done up to then --
jim _deleted_ italics markup from the book's e-text.
he included it in the .html version, but then actually
_deleted_ it from the text-version.   so unbelievable!

the text versions will, of course, be utterly useless if
you've decided that you'll intentionally neuter them...

but a well-prepared text-version, like a .zml file,
can be converted into .html, .pdf, .epub, .mobi, or
any number of new formats that might come along.
and, with a good app, be viewed on an old machine,
and not just viewed, but -- again, with a good app --
turned into a dynamite e-book with rad functionality.

forget formats.   concentrate on improving apps.

-bowerbird

Bowerbird＠aol.com