here's that post to lee which i started over a week ago,
and part of what started making me feel that "despair".
this post started out as a straightforward review of
the difference between lee's tool and my methods.
but i felt i needed to preface that with commentary
on why a discussion with lee always is so frustrating,
and why i eventually had to put him in my kill-folder,
and how i wish that i had never reviewed his app,
and the "preface" soon overwhelmed the "review"...
so...
if you want to skip the back-and-forth scratching,
jump down to the two long lines of asterisks below,
surrounding a section saying "take a deep breath"...
***
lee's post (fri, oct 21st, at 17:47) can be found here:
> http://lists.pglaf.org/mailman/private/gutvol-d/2011-October/008200.html
dear lee...
ok, first, lee, let me be perfectly clear to you...
i understand all of your points -- every one --
about your program in your latest reply to me.
indeed, i understood all of those points when
you made them in your _previous_ reply to me.
so you didn't need to make them _again_, and
you won't need to make them again in a reply
to _this_. because i understand 'em. honest!
every one! totally and completely, lee! really!
i had simply forgotten how tedious it can be
to have a "conversation" with you, even when
you're _not_ trying to spin it or sabotage me.
but now i remember.
so i will give you a few reminders, and other
people here can see what i'm talking about.
i said:
> > or, ya know, you can always
> > give 'em _your_ source-code.
lee said:
> But that's exactly what I did!
yes, lee. i knew your code was open-source.
i downloaded your code from sourceforge.
and sourceforge is a host for open-source.
anyone with minimal experience knows that.
anyone who's been around a while knows that.
anyone who can read the blurb that describes
sourceforge, on the download page, knows that.
so yes, i knew your code was open-source...
and if you had thought about it for a second,
or given me 1/10 of the credit i "deserve" for
paying attention, or being a programmer, or
putting in time, or being a web-surfer who
isn't totally asleep, you woulda _known_ that
i knew that your code was open-source, and
you wouldn't have made the reply you made.
so i wondered why you made that reply?
but i stop wondering, after two seconds or so,
because i've learned it simply doesn't matter...
what it means, though, is that you missed that
my suggestion was _ironic_, a bit _sarcastic_,
and thus you missed the point i was making,
which is that you -- and others just like you --
make noise about the "open-source" aspect,
when -- in actuality -- the overwhelming mass
of open-source projects _don't_ get treatment
of the sort that you so-called "advocates" are
so fond of talking about, namely that the code
is worked over by a large number of people,
who not only ensure that it is solid but also
continually extend it to all kinds of new uses.
oh sure, that happens with _some_ programs.
but the vast majority of them are maintained
by one person, who does all the work on it,
until they tire of it, and then they respond to
further requests with a "you can do it yourself".
but nobody ever does.
you know who's gonna work on your app, lee?
you. and only you, lee. you. and nobody else.
when i told d.p. i would code a _spellchecker_
for them, they told me they weren't interested,
because "it won't be open-source"... so they
went without a spellchecker instead, for years.
and when they decided to take up the task of
adapting an open-source spellchecker, it took
a ton of time for them to get it to work, and
it _still_, to this very day, doesn't always do
what they would like it to do. and guess what?
have they _ever_ gone in to rewrite that code,
so it would behave like they want it to behave?
no, they haven't. as it would be too difficult.
is open-source a good idea? yes sir, it sure is!
is free software an even better idea? you betcha!
but let's not confuse the _real_ with the _ideal_.
just because somebody else _can_ work on it
does _not_ mean that they _will_ do that. ever.
that's the long explanation of the point that
i was making with my simple "suggestion"...
i didn't want to have to type all of _that_, though,
because then _i_ would've been the tedious one.
but as you don't "get the joke", then even having
any discussion with you becomes very tedious...
***
here's another example, for you, and the others...
i said:
> > isn't it the "trivially easy" tasks that we want
> > our computers to be performing _for_ us?
lee said:
> No, I don't think so. First you have to understand
> that there are tasks that are
> trivially easy /for a human bean/ that are
> extraordinarily complex for a computer.
> And there are tasks that are
> enormously complex for a "human bean"
> (primarily because they are so detailed)
> that are trivially easy for a computer.
well, gee, lee, thanks for the grand exposition there.
i bet your friends think that you're really really smart.
i grant that you're thorough, even as you manage to
miss the point completely, and by 3.85 country miles.
because we were talking about a task that is:
1. trivial for a human being.
2. trivial for a computer.
3. trivial for a human being to code a computer to do.
and i think that we can all agree that your exposition
is completely overblown in regard to that type of task.
yet that's what we were talking about. (go look it up,
if you need to, but the task was exactly like one that
you'd just talked about by saying your program did it,
so as "to relieve the tedium and avoid simple errors".)
***
plowing through these diversions becomes very tiring.
it's as if you're intentionally _trying_ to miss the point.
(i'm not saying that you _are_ doing it "intentionally",
mind you, because that _might_ be the "fundamental
attributional error" raising its ugly head... but i have
had to slash through the underbrush of these dodges
so often that they sure do _seem_ to be "intentional".
if they are not, then you would appear to have some
serious problems when it comes to staying on-point.)
i said:
> one of the "themes" of the event is "beautiful books".
lee said:
> Hopefully, you have your tongue
> planted firmly in your cheek.
not only do you _not_ get my humor when i put it out,
you _think_ i'm joking when i'm relating a simple fact,
combined with a link which you must not have checked.
these little misunderstandings accumulate into great frustration.
i will say that yes, i did find that theme to be _ironic_...
so maybe you can catch irony when i direct it at others,
but not when i direct it at you. however, i wasn't poking
_fun_ at that theme; i was appalled they would choose it!
that being the case, though, no need for your "hopefully".
their text is ugly, and thus they have no right to even use
the term "beautiful" in conjunction with their text-versions.
***
lee said:
> I have no Mac, no access to a Mac,
> and little interest in the Mac.
> The promise of Java was "write once run everywhere."
> Well, I wrote it once, now we'll just have to hope
> that some Mac developer out there can
> troubleshoot the problem (and tell me
> what the solution is once s/he figures it out).
this is exactly the type of attitude i was making fun of,
in my post to which you wrote this response...
for the record, let us note that "the promise of java"
has once again gone unfulfilled in a real-life instance.
***
i'll wrap this up, focusing on lee's post on wed oct 26 07:31.
> http://lists.pglaf.org/mailman/private/gutvol-d/2011-October/008216.html
> /Your/ file is
> http://ia700600.us.archive.org/16/items/artofbook00holm/artofbook00holm_djvu.txt.
> There it is, the text, the whole text, and nothing but the text.
i discussed why that text-file -- and all of the .djvu.txt files over
at archive.org -- have problems, but my post might have _followed_
this one, in which case we couldn't blame lee for not knowing that.
except that lee _should_ know all that. he has heard it before.
nonetheless, he keeps trying to distort what i mean by "a text file".
he keeps trying to talk about text-files as if _all_ of them had the
deficiencies of the archive.org text-files, as if _all_ of them were
lacking any structural information, and as if this was _required_...
you can make a text-file "smart" if you want to, and it does _not_
require any angle-brackets at all. and anything that someone can
do with angle-brackets can _also_ be achieved _some_other_way_,
in a plain-text file, and it's just ridiculous to say that it can not...
there's nothing magical about angle-brackets... nothing at all...
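to make that concrete, here's a minimal sketch in python.
it is not code from lee's tool or from mine; the function name
and the three layout conventions it reads (blank lines between
blocks, a lone all-caps line as a heading, a four-space indent
for a quoted block) are assumptions picked just for illustration.
it recovers the kind of structure that tags would carry, using
nothing but the layout of the plain text:

```python
import re

def parse_plain_text(text):
    """Split plain text into (kind, content) blocks using layout alone."""
    blocks = []
    # blocks are separated by one or more blank lines
    for chunk in re.split(r"\n\s*\n", text.strip("\n")):
        lines = chunk.split("\n")
        if len(lines) == 1 and lines[0].isupper():
            # assumed convention: a lone all-caps line is a heading
            blocks.append(("heading", lines[0].strip()))
        elif all(line.startswith("    ") for line in lines):
            # assumed convention: a four-space indent marks a quoted block
            blocks.append(("quote", " ".join(l.strip() for l in lines)))
        else:
            # everything else is an ordinary paragraph
            blocks.append(("para", " ".join(l.strip() for l in lines)))
    return blocks

sample = (
    "CHAPTER ONE\n"
    "\n"
    "it was a dark\n"
    "and stormy night.\n"
    "\n"
    "    quoted line one\n"
    "    quoted line two\n"
)
for kind, content in parse_plain_text(sample):
    print(kind, "|", content)
```

the list of (kind, content) pairs could be handed to any output
routine, .html included; the conventions themselves are arbitrary,
so long as they are applied consistently.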
> I'm just being a little more demanding.
no, lee. you just _misinterpret_ what i am "demanding" as being
much less than it really is, and then you think you have "more"...
a direct and one-to-one correspondence can be made between
what _you_ are asking for, and what _i_ am asking for. the job
has some inherent demands, and if those demands are satisfied,
then both of us can do the job... but if not, neither of us can...
> What /I/ want is the output from FineReader
> as though the "Save as HTML" option was selected,
> with all the markup that FineReader was able to intuit
if i get "all the markup that finereader was able to intuit",
then i can do the job just as well as you can. maybe better.
the point is that archive.org isn't giving us that information;
they tell us we need to trawl their pile of x.m.l. crap to get it.
> Does anyone want to furnish me
> a *nix server with a fat pipe?
i had pointed out that, although it would be _possible_
to run a script against all 3 million books at archive.org,
the machinery and bandwidth required make it impractical.
lee's solution? ask someone to "furnish" all that to him...
i guess it never hurts to ask, 'eh? good luck with that, lee.
***
anyway, there are your examples, folks...
like i said, _tedious_.
and then he repeats everything.
this is why i put lee in my kill-file.
now i remember. so let's bring this to a close.
************************************************************
take a deep breath to clear your system...
take another deep breath to clear your system...
take a third deep breath to clear your system...
************************************************************
i now direct the remainder of this post _back_
to the audience at large, not lee specifically...
***
the main point of departure between lee and me
is that he _starts_ with "the text is in .html form".
_then_ his tool takes over. which is fine, i guess...
except for the fact that it doesn't match the reality
of how us regular humans actually make e-books.
it doesn't describe the task that is being done by
post-processors over at distributed proofreaders.
it doesn't even reflect how the e-book designers
who do the job _professionally_ go about the task.
because _we_ all start with text. maybe the text is
in a word-processing file, maybe it's raw ascii, but
it's most assuredly not already marked up in .html.
that's what _we_ have to do, to make it an _e-book_.
if it was already in .html format, we'd call it "done",
or mighty close to it.
you might have noticed, up above, when i said the
text _might_ be "in a word-processing file", yeah?
so maybe you're just thinking that we could ask the
word-processing app to convert the text into .html?
well, yes, we could. and some of us novices do...
but the professional book-designers don't do that.
and they strongly advise even us amateurs not to...
because what they have found is that the .html which
is applied by word-processing apps is _very_crappy_.
it gives poor results in most all the e-book viewers,
and it is extremely difficult to work with, when you
need to make changes. (and you almost always do.)
so the admonition is fairly universal: don't do that!
what do the professionals advise us amateurs to do?
they advise us to save the file as plain-ascii text, and
then to apply the .html to that plain text, including
the reapplication of styling (e.g., italics) which gets
_lost_ when the file is saved in plain-ascii format...
indeed, that is precisely what those professionals do.
preach what they practice, practice what they preach.
(if you don't trust me, ask me to provide some links.
or research it yourself. it's easy to find such advice.
joshua tallent, liz castro, or thebookdesigner.com...)
now, i think it's utterly ridiculous to strip away styling
and then have to _reapply_ it. but that's what they do.
the application of good solid .html, though, is wise,
so _that_ part of the advice i can thoroughly second...
even if you do it by hand, it's more economical than
letting a word-processor apply crap, which you then
waste more time -- long run -- trying to "improve".
now, the truth is that those pros have "scripts" that
apply the markup automatically. plus they _know_
.html already, well, so this comes naturally to 'em,
even if they have to do some of the work manually.
but their advice is still good advice for us amateurs,
because we get _totally_ confused by crappy .html...
without the slightest notions of how to "improve" it,
or even to make those inevitably-required changes...
but whether you are a professional or an amateur,
the reality of making an e-book these days is that
you _start_ with text, which you mark up in .html...
(actually, for .epub, it's .xhtml, but we don't need to
even bother making such fine-grained distinctions.)
sometimes -- as with d.p. -- the text is from o.c.r.
other times, it was "born digital". whatever the case,
however, the reality is that we all start with _text_...
and the nature of the _job_ is doing .html markup...
it _is_ true that -- once you have done the markup
into .html -- there's _still_ a bit more work after that.
and it's also true that this "bit more work" is often
_very_ confusing and time-consuming, _especially_
to us amateurs, because the i.d.p.f. -- which is the
organization that maintains the .epub standard --
has _never_ provided solid information concerning
just exactly what this "bit more work" really entails.
even the pros get confused, sometimes hopelessly.
(i'd give a link, but i don't want to embarrass them.)
however, if you do enough grunt work, and are ready
(if not willing) to power through frustrating failures
that can number in the dozens, or even _hundreds_,
you too can eventually discover the things that work,
and you can develop templates that ease future pain.
after you've done that, the "bit more work" that is
required _after_ you've done your .html markup is
fairly easy -- it's basically just filling in information
that's included in some "auxiliary" files in an .epub.
two of those files are the .opf file and the .ncx file.
you might recognize those extensions, since they're
the files about which i've lately been speaking to lee.
his epubeditor produces the auxiliary files for you,
and helps you put the required information in them.
so if you're one of the amateurs who are _struggling_
with the proper creation of these files, lee's program
would be a _godsend_ to you, saving time and hassle.
if you're a professional, you're not spending any time
or energy producing these files anyway, because you
already have scripts which make them automatically.
so you might use lee's tool to do occasional reviews
of your .epub files, or make minor corrections, but
it probably won't be an app you consider as "crucial".
more to the point, though, is that there are lots of
programs out there that already create .epub files
-- from text -- which generate the auxiliary files
(like .opf and .ncx) required _inside_ the .epub file.
they apply the .html _and_ create the auxiliary files.
so, to sum up, there are two steps to making an .epub:
1. transform the text of your book into an .html file.
2. create the auxiliary files required inside an .epub.
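for a sense of why the second step is mostly "paperwork"
once you have a template, here's a rough sketch in python.
it is not how lee's epubeditor (or anyone's pro script) does it;
the function names, the file names, and the sample identifier
are all made up for illustration. it builds a bare-bones .opf
and .ncx from a list of chapter files. a real .epub also needs
a "mimetype" file and a META-INF/container.xml, which this
sketch leaves out, and the output here has not been run
through any validator:

```python
from xml.sax.saxutils import escape

QUOT = {'"': "&quot;"}  # extra escaping for text placed inside attributes

def build_opf(title, book_id, chapters, language="en"):
    """chapters is a list of (filename, label) pairs, in reading order."""
    manifest = [
        '<item id="ncx" href="toc.ncx" '
        'media-type="application/x-dtbncx+xml"/>'
    ]
    spine = []
    for n, (filename, _label) in enumerate(chapters, start=1):
        manifest.append(
            f'<item id="ch{n}" href="{escape(filename, QUOT)}" '
            'media-type="application/xhtml+xml"/>'
        )
        spine.append(f'<itemref idref="ch{n}"/>')
    return "\n".join([
        '<?xml version="1.0" encoding="utf-8"?>',
        '<package xmlns="http://www.idpf.org/2007/opf" version="2.0" '
        'unique-identifier="bookid">',
        '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">',
        f'<dc:title>{escape(title)}</dc:title>',
        f'<dc:language>{escape(language)}</dc:language>',
        f'<dc:identifier id="bookid">{escape(book_id)}</dc:identifier>',
        '</metadata>',
        '<manifest>', *manifest, '</manifest>',
        '<spine toc="ncx">', *spine, '</spine>',
        '</package>',
    ])

def build_ncx(title, book_id, chapters):
    """Build the table-of-contents file from the same chapter list."""
    points = []
    for n, (filename, label) in enumerate(chapters, start=1):
        points.append(
            f'<navPoint id="nav{n}" playOrder="{n}">'
            f'<navLabel><text>{escape(label)}</text></navLabel>'
            f'<content src="{escape(filename, QUOT)}"/></navPoint>'
        )
    return "\n".join([
        '<?xml version="1.0" encoding="utf-8"?>',
        '<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">',
        f'<head><meta name="dtb:uid" content="{escape(book_id, QUOT)}"/></head>',
        f'<docTitle><text>{escape(title)}</text></docTitle>',
        '<navMap>', *points, '</navMap>',
        '</ncx>',
    ])

# hypothetical book: two chapter files and a placeholder identifier
chapters = [("ch1.html", "chapter one"), ("ch2.html", "chapter two")]
print(build_opf("my book", "urn:uuid:example-only", chapters))
print(build_ncx("my book", "urn:uuid:example-only", chapters))
```

both files are driven by the same short list of chapters,
which is why this step shrinks to "fill in the blanks"
once the template exists.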
total novices, with no tools or experience, will spend
_much_ time on the first, and _much_ on the second...
professionals, operating with their pro tools, will spend
a good amount of time on the first, little on the second.
and amateurs, with the decent tools out now, will spend
a good amount of time on the first, little on the second.
in other words, lee's tool helps with the second step,
but no one except inexperienced novices spends time
on the second step to begin with. the second step is
the "paperwork" that must be done to "finish the job",
as the old expression puts it.
lee's tool totally ignores the first step, .html markup,
which is where everybody spends most of their time...
this makes me suspect that lee simply doesn't know
how real people in the real world make real e-books.
namely, we start with text, and we mark it up in .html.
then we do whatever little dance needed to turn it into
an e-book file that's viewable on our e-book machines.
now, if we only had some kind of a program that would
take plain old text, and automagically turn it into .html,
plus then create the auxiliary files required in an .epub,
_then_ we'd have an app we could call "an epub editor".
wait, isn't there such an app coming out real soon now?
well, yes, son, there is. called "jaguar". real soon now.
in the meantime, while you are eagerly awaiting that,
if you're on a mac, you might want to buy a program
called "multimarkdown composer", by fletcher penney,
which is an editor that incorporates "multimarkdown".
multimarkdown, also penney's, is a variant of markdown.
markdown is a light-markup system which converts text
into .html output that validates as standards-compliant.
thus "composer" is a great tool to help with the first step
listed above -- the hard step that takes most of the time.
"composer" is new in the app store, and it's just $7.99...
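to show the general idea of light markup (and only the idea),
here's a toy converter in python. it is _not_ markdown or
multimarkdown, and it handles just three things that i picked
as assumptions for the sketch: blank-line-separated paragraphs,
a "# " prefix for a top-level heading, and *asterisks* around
words for emphasis. the function name is made up too:

```python
import html
import re

def light_markup_to_html(text):
    """Convert a tiny light-markup subset into .html fragments."""
    out = []
    # blocks are separated by one or more blank lines
    for chunk in re.split(r"\n\s*\n", text.strip()):
        # rejoin hard-wrapped lines, then escape &, < and >
        body = html.escape(" ".join(line.strip() for line in chunk.split("\n")))
        # *word* becomes emphasis (this is how italics get reapplied)
        body = re.sub(r"\*(.+?)\*", r"<em>\1</em>", body)
        if body.startswith("# "):
            out.append(f"<h1>{body[2:]}</h1>")
        else:
            out.append(f"<p>{body}</p>")
    return "\n".join(out)

print(light_markup_to_html("# title\n\nsome *styled*\ntext & more"))
```

the real tools cover far more (lists, links, footnotes, tables),
but the principle is the same: the styling lives in the text,
and the .html gets generated rather than typed by hand.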
or, you know, you can use sigil. free/free. and it works.
as far as i know, it works fine. couldn't be much better.
-bowerbird