PG could keep itself busy forever digitizing an order of magnitude
more ebooks than it does today and still not catch up. The list is
meant to describe the scope of everything that would need to scale up
if PG were to take the true magnitude of the job seriously.

Take a ridiculously low number - say a rate of 10 times more books
than are being added now.

I can't even imagine trying to handle that with most of the processes
in place now. But I think it's reasonable to imagine better processes
that could scale that far. It's probably going to involve supporting
multiple ways of doing the same thing to suit different work style
preferences. It absolutely needs to be simpler. I think it's less a
matter of better tools at this point than of deciding what the core
requirements are for each of the items on that list and agreeing on what
needs to be done in order to support the rest of the process. Almost
none of them are specified with enough clarity to enable a first-timer
to know what to do, how to do it, and most importantly when
they've done it correctly. We spend way too much time on that point
alone - person A thinks they've done a magnificent job and person
B accuses them of producing junk in the pursuit of vanity. There's
no objective standard so the animosity just grows and PG doesn't.

It's pointless to demand someone else do it better, or more correctly,
or less dumbed-down or more portably, when the nature of the job
to be done, much less the proper way to do it, is subjective to each
of us.



On Fri, Oct 5, 2012 at 2:37 PM, <Bowerbird@aol.com> wrote:
don said:
>   So what are you suggesting?
...
>   What is a constructive suggestion?

as per usual, for this _particular_ topic, we have the
otherwise uncommon phenomenon where jim adcock
is one of the only people making sense in the dialog.

so let me translate for you.

jim recommends that p.g. should _require_and_enforce_
submitted .html to be convertible to quality .mobi/.epub.

as _part_ of this, jim recommends that the converter-tool
should be improved, since its current performance sucks.

jim suggests that an improved converter coupled with just
a _little_ bit of bending on the part of the producer's whims
will mean that _everyone_ will end up with a better outcome.

he would probably also suggest that the converter-tool be
made more widely available, so that producers could use it
to check the viability of their files _while_ working on them.
(the current p.g. web-app tool is extremely clumsy to use.)
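
[A minimal sketch of the kind of local check a producer could run while
working on a file. It is not the p.g. converter and does not reproduce
anything it does. It only counts real heading elements and flags short
paragraphs that look like headings, on the assumption that converters
rely on heading markup for navigation. The names HeadingLint,
looks_like_heading, lint_html and the HEADING_WORDS list are
hypothetical, and the heuristic will produce false positives.]

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
# Assumed heuristic: short paragraphs starting with these words look like headings.
HEADING_WORDS = ("chapter", "book", "part", "preface", "contents", "introduction")


def looks_like_heading(text):
    """Guess whether paragraph text is really a heading (heuristic only)."""
    if not text or len(text) > 60:
        return False
    return text.isupper() or text.lower().startswith(HEADING_WORDS)


class HeadingLint(HTMLParser):
    """Count real headings and collect paragraphs that look like headings."""

    def __init__(self):
        super().__init__()
        self.headings = 0
        self.suspects = []
        self._in_p = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.headings += 1
        elif tag == "p":
            self._flush()  # a new <p> implicitly closes an unclosed one
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._in_p:
            self._buf.append(data)

    def _flush(self):
        """Finish the current paragraph, if any, and test its text."""
        if not self._in_p:
            return
        text = " ".join("".join(self._buf).split())
        self._in_p = False
        self._buf = []
        if looks_like_heading(text):
            self.suspects.append(text)


def lint_html(source):
    """Return (number of heading elements, list of suspicious paragraphs)."""
    parser = HeadingLint()
    parser.feed(source)
    parser.close()
    parser._flush()  # catch a final paragraph that was never closed
    return parser.headings, parser.suspects
```

[Usage would be something like `headings, suspects = lint_html(open(path, encoding="utf-8").read())`,
with a file treated as needing attention when `suspects` is non-empty.]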

it is clearly _possible_ to implement all these suggestions,
because the systems i've built have routinely included them.

so i would certainly agree with all of jim's suggestions.

indeed, i can't see how there could be much opposition.

oh, right, now i remember.  you don't like the messenger.

***

as for the simple/complex dimension...

i'll ignore for a moment that the label has its problems...

i believe it is quite easy to make a concrete suggestion:

use the simple-basic workflow for the 99%+ of books
in the library which can be finished with that approach.

portion out the remaining less-than-1% of the books to
people who are in love with their complex methodology,
and see who comes back with the best-working product.
for each e-book, give the winning contender a gold-star
sticker to put on their forehead, for the bragging rights.

***

>   1. Choose a book to digitize.
>   2. Collect and record metadata, and confirm Public Domain status.
>   2. Obtain images of all the pages.
>   3. Produce text with OCR.
>   4. Refine the text to match the OCR.
>   5. Locate and mark significant artifacts in the text.
>   6. Submit package of text(s) and images to PG.
>   7. PG review and post to catalog.
>   9. Select layout and format for desired device or application.
>   10. Select or create layout and format for text and artifacts.
>   11. Issue ebook per spec (possibly including production.)
>   12. Identify errors or other revisions to text.
>   13. Locate and describe revisions.
>   14. Submit revision to PG.
>   15. Back to step 7 to validate and apply revision.

well, first off, you might wanna check your numbering.

more importantly, can you please explain how this list
sheds any light or understanding on contended issues?

it's reminiscent of your earlier call for "a feasibility study".
we're about a decade past the time for such suggestions...

there are simple, _obvious_ problems right in front of us
-- one example is a whitewasher who posted an "update"
less than 6 weeks ago with headers tagged as paragraphs,
and not with some "header" class, but bracket-p-bracket --
yet you want to fuzz this all up with a kindergarten outline?

is p.g. paying you to be a "consultant", or what?  i don't get it.

-bowerbird

_______________________________________________
gutvol-d mailing list
gutvol-d@lists.pglaf.org
http://lists.pglaf.org/mailman/listinfo/gutvol-d