Re: [gutvol-d] improving blah blah blah

24 Sep 2012

jon said:
...
my hope is that all the PG powers-that-be 
   are part of PG for the right reasons, that 
   they recognise that a problem exists, and 
   that they are keen that it gets fixed.
you can grant them all of that.

but none of that means that they are going to
agree with you that you have the right solution.

and they have the keys to the castle.
...
From what you say, it cannot get fixed 
   without buy-in from the WWs and Marcello 
   to a specific course of action, so the problem is
   really one of generating consensus around 
   a course of action.
right.   and that hasn't ever happened here.   ever.

and it is clear that your plan will not do it either.

i mean, we could all agree to re-do some e-texts
-- without committing to your ms/rtt plan at all --
but the new e-texts would still be ignored by users.

so you're not paying attention to the _real_ problem.
...
with an MS, an RTT and Knuth's genius 
   I can do a solo project that shames Amazon 
   in a day or two.
well, there's another problem.   you wanna use tex.
go over to d.p. and see the tex contingent there.
it's tiny.   the learning curve is obviously too hard.
...
What I can't do is create the MS and RTT by myself.
you're wrong.   a scan-set is easy to obtain.

and getting the text correct is not difficult.

so, you know, go ahead and do your demo.
do it for one book, e.g., "pride and prejudice".
get the text from that pointer i gave to my site.
...
There seems to be a consensus around 
   a 10 book pilot forming. Success would mean that 
   1 in 20 downloads would be of improved quality. 
   Sounds good to me.
you're counting chickens when you don't even
have any eggs yet.
...
DP co-operation may not be as crucial as I once thought.
it's not "crucial" in the slightest.   it would be a hindrance.
...
I think it is worth at least _trying_ to bring them on board.
you're biting off more than you can chew as it is.   and now 
you want to stuff a big decaying head in your mouth as well?

d.p. stopped growing a very long time ago...

soon they will start falling from their plateau.

you already have an albatross around your neck.
don't try to tie an anchor to your ankle as well...
...
I fully admit to having precisely zero knowledge 
   or experience regarding book editions and 
   availability of scans. I naively thought that 
   "someone" would know which edition 
   a given extant text was derived from 
   and that it would be straightforward to 
   find a decent scan given that knowledge. 
   This is clearly very wrong.
sometimes it is easy.   sometimes it is not.

i've been unable to find any scan-set for the text
used for the p.g. version of "pride and prejudice".
...
That we don't necessarily know the edition 
   of the extant text raises the issue of 
   which edition we lock in to. Who makes that decision?
that's a non-issue.

get a scanset from the internet archive, which will
give you the o.c.r. as well. you correct that o.c.r. by
comparing it against a clean text, resolving the diffs.
...
I conclude that Don is absolutely correct. 
   The initial action of a 10 book pilot should 
   involve nothing more or less than determining, 
   in each case, which edition and hence which scan 
   we are going to work with.
ok, start with _one_ book, not 10.   seriously.   one.

because you still have _no_ word from a whitewasher
or marcello that _any_ change will be made to the site
that will allow a corrected e-text to float to the surface.

until p.g. gives some indication that it _truly_wants_
corrected editions, you are simply wasting your time.

if i were you, i'd make 'em prove their intentions by
bringing the books jim re-did to higher prominence.
...
We need someone knowledgeable to 
   guide this selection process and if necessary 
   make a final executive decision in each case.
again, you're looking for some type of agreement
that has _never_ever_ happened on this list before.

besides...

there's no reason you can't do _multiple_versions_
of a book.   again, take my "pride and prejudice".

i took one e-text and diffed it against another one.

that pointed out the discrepancies between the two,
which were either (1) an o.c.r. error in one or both,
or (2) edition differences.   you decide which it is by
comparing each of the e-texts against its scan-set.
at the end of the process, both e-texts are correct,
with the caveat that both could conceivably have
the identical o.c.r. error located in the same place.
(although my research shows that to be very rare,
far less likely than one would think it might be.)
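
for anyone who wants to see the diff step spelled out, here is
a minimal sketch in python, using the standard difflib module.
it is not the tool i actually used. the function name and the
sample strings (with a made-up o.c.r. error) are just assumptions
for illustration. it lists the spots where two e-texts disagree;
deciding whether each one is an o.c.r. error or an edition
difference still means checking each e-text against its scan-set.

```python
import difflib


def word_diffs(text_a, text_b):
    """Return (words_from_a, words_from_b) pairs wherever the two texts disagree."""
    a = text_a.split()
    b = text_b.split()
    # autojunk=False so frequent words in a long book are not treated as junk
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return diffs


# illustrative sample strings; "singie" is an invented o.c.r. error
etext_a = "It is a truth universally acknowledged, that a single man"
etext_b = "It is a truth universally acknowledged, that a singie man"

for left, right in word_diffs(etext_a, etext_b):
    print(repr(left), "<->", repr(right))
```

each pair it prints is a place to go look at the page-scans.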

being able to present multiple editions, and generate
pointed evidence of the changes made across them, is
an empowering thing, one that might be considered
to be of considerable value by some people out there.

scan-sets are plentiful these days.   there's no reason
to think that we need to limit ourselves to one edition.

(having said that, however, i see _no_ useful purpose
served by an e-text that _cannot_ show its provenance
in a solid demonstrable way by pointing to a scan-set.
but that's a point that i have made many times before.)

***

greg said:
...
This is exactly opposite the PG policy.
   We specificaly do NOT adhere to any print edition.
see what kind of lunacy you are up against, jon?

a policy that once made sense was retained until
it no longer produced any solid benefit, and then
_retained_even_longer_ as it became a liability,
and now _is_still_retained_ even when it is stupid.

heck, these people can't even _spell_ "specifically".
...
(That is part of why you will find it really hard to 
   find a matching print source for many PG eBooks.)
which makes it very hard to submit an error-report
that the whitewashers can't reject if they want to...

now we are in the sad situation where the world is
awash in p.g. e-texts which have zero provenance.

if this doesn't change, there will come a time, and
it's not far down the line, where project gutenberg
will come to be considered a _liability_ to e-books,
an example of _how_not_to_do_it_.   how depressing.

-bowerbird

Bowerbird＠aol.com