
greg said:
typos or fixes or additional master formats can be contributed
you want people to fix typos without reference to a scan? please tell me that you didn't just suggest that seriously... and how can a master format be made with any certainty without full reference to a canonical version of the book?
The main features here mostly exist, but not as flexibly as I'd like to see.
where do any of these "main features" of customization exist -- not just "mostly" -- but in any form whatsoever?
These, and others, have also been discussed deeply.
don't be ridiculous, greg. these issues have "been discussed" repeatedly, but they have never been discussed "deeply" at all. they have never come to anything actionable, usually because people here drag these threads all over the back-40 acres, and then drag in dead field-mice from the neighboring homestead and plop them smack-dab into the middle of every discussion. surely most of this crap would earn a "c" grade in your courses, and even _that_ assessment is probably being far too generous. the discussions weren't even "shallow"... perhaps "superficial". (but even that implies they were on-topic. but no, usually not.)
These can be better than automatically-generated versions
oh please. haven't you learned anything from the "snowflakes"? hand-crafted versions are simply impossible to keep up-to-date. impossible. totally and completely. and now you want to _invite_ users to submit as many snowflakes as we possibly can? insanity. are you really in charge of this project? if so, i fear for its future...
-bowerbird

BB>hand-crafted versions are simply impossible to keep up-to-date.
I'm not sure I understand this passionately held opinion. OCRs of books have a *finite* number of scannos which need to be "hand-crafted" to remove those scannos. The important features of books need to be tagged, and the visual identification and categorization of that tagging is a "hand-crafted" activity. Finally, the book needs to be rendered in an attractive manner on a variety of families of target devices. Specifying those formats is "hand-crafting," and visually checking whether that effort, indeed any final effort, is a "success" or a "failure" is also a "hand-crafting" activity.
I think what you-all are talking about are attempts to *partially* automate some of these activities, which is fine, but the ultimate measure of success is not the ability to automate, but whether the final book being read on a final customer device ends up being a worthy result or not. The end customer, frankly, doesn't know and doesn't care what we-all had to do to get there.
You can complain about "snowflakes" all you like, but if the final product is not worthy, all that happens is what happens right now: People on other forums complain about how the books on PG are crap and how in the course of a weekend they were able to create a much better version which you should now download from their site instead of PG's. End result, their site gets the credit, not PG, and the readers of that book then don't understand that the book came from PG, was the work of a lot of volunteers, and that if they like these books then maybe they can step up and help make books too.

On Tue, January 31, 2012 10:29 am, Jim Adcock wrote:
BB>hand-crafted versions are simply impossible to keep up-to-date.
I'm not sure I understand this passionately held opinion. OCRs of books have a *finite* number of scannos which need to be "hand-crafted" to remove those scannos.
You and the Bower Bird are talking past each other, not with each other.
Imagine a scenario where someone OCRs a book and goes through by hand and carefully fixes any errors caused by the OCR process. Now imagine a scenario where someone takes that file which was carefully made by hand in scenario one and crafts it by hand into a /new/ file adding specific markup, and removing other markup, with the express purpose of making it most presentable on Acme corporation's MyPad reader.
When BB talks of "hand-crafted" versions he is speaking of the second of these activities, not the first. The first activity creates the master, and the second activity creates a "snowflake" which attempts to preserve everything good about the master but with some unique aspects. If you see a bad Kindle "snowflake," and fix it (creating Yet Another Snowflake), you have done nothing to fix what might be the same error in any other "snowflake." OTOH, if all the "snowflakes" are derived programmatically from a single master, a fix to the master will automatically propagate to them all.
If you choose, you may continue to intentionally misunderstand what BB is trying to say, insisting that he use your vocabulary instead of his own, but in doing so you won't be contributing to anyone else's understanding of the problems.
It seems that your basic contention is that the automatic creation of derivative formats from a single master format is simply not possible. Fair enough, this kind of defeatism is common, and sometimes even accurate. The evidence you offer for this belief is that Mr. Perathoner's processes don't do their intended job well. My belief is that even if your proffered "evidence" is correct, the failure of one attempt does not prove that it is impossible. I still believe, perhaps naïvely, that a single master format can be created from which all other formats can be successfully derived without any hand-tweaking at all.
So I would say that you should continue to lobby PG for your "snowdrift" repository. Just don't berate others simply because they are not interested in solving /your/ problem.
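
The propagation argument above (fix the master once, regenerate every derived format) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the block-list master, the renderer names (`to_text`, `to_html`), and the helper `fix_scanno` are all hypothetical and are not part of any actual PG or DP tooling.

```python
from html import escape

# Hypothetical master: a list of (kind, text) blocks. The format is an
# assumption made for this sketch, not a real PG master format.
master = [
    ("heading", "Chapter I"),
    ("para", "It was the best of tirnes."),  # "tirnes" is a scanno for "times"
]

def to_text(blocks):
    """Render the master as plain text (headings in upper case)."""
    lines = [text.upper() if kind == "heading" else text for kind, text in blocks]
    return "\n\n".join(lines)

def to_html(blocks):
    """Render the master as a minimal HTML fragment."""
    tags = {"heading": "h2", "para": "p"}
    return "\n".join(
        f"<{tags[kind]}>{escape(text)}</{tags[kind]}>" for kind, text in blocks
    )

# Every derived format is produced by a function of the master alone.
RENDERERS = {"txt": to_text, "html": to_html}

def build_all(blocks):
    """Regenerate every derived format from the master."""
    return {name: render(blocks) for name, render in RENDERERS.items()}

def fix_scanno(blocks, wrong, right):
    """Return a corrected copy of the master (hypothetical helper)."""
    return [(kind, text.replace(wrong, right)) for kind, text in blocks]

before = build_all(master)
after = build_all(fix_scanno(master, "tirnes", "times"))
```

Here the scanno appears in both `before["txt"]` and `before["html"]`, and in neither output after the single correction to the master. A hand-edited "snowflake" has no such link back to the master, so the same fix would have to be repeated in each copy separately.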

If you choose, you may continue to intentionally misunderstand what BB is trying to say, insisting that he use your vocabulary instead of his own, but in doing so you won't be contributing to anyone else's understanding of the problems.
You read evil intent on my part simply because we disagree.
It seems that your basic contention is that the automatic creation of derivative formats from a single master format is simply not possible. Fair enough, this kind of defeatism is common, and sometimes even accurate. The evidence you offer for this belief is that Mr. Perathoner's processes don't do their intended job well.
You again assign evil intent to what I am saying, labeling it "defeatism," when I am simply trying to point out the weaknesses of what you propose, and that makes you uncomfortable.
The automatic creation of derivative formats from an "HTML" "master" format is very problematic, as I'm sure Marcello and others can tell you. If that is what you are trying to do, you might instead consider writing an XML spec that actually specifies what you think needs to be specified. That still won't get you anywhere unless you have an army of people willing to buy into that XML specification, and realistically the only place to get that army is from DP. Which, god forbid, would mean that PG and DP would have to learn to get together and work together in a constructive manner, and which might lead to improvements in some of the DP semiautomated tools that might be a practical source of many of the problems we see today. Go talk to DP and see if you can get an army, or if, in practice, they just blow you off.
Failing your ability to get that army, I would personally be happy just to have the ability to take a PG book which for a variety of silly reasons is in practice not readable on EPUB and MOBI devices today and make it readable. I personally would be happy to have that "snowflake" exist simply as long as it takes you guys *in practice* to get your act together and accomplish that which you claim you will accomplish. If you extinguish my "snowflake" by offering something someday which in fact actually proves to be better for the end reader, then you go boy!
Again, what I see in practice today continues to be simply "the dogs guarding the straw." Spin the straw into gold and I am happy to see that gold. I am just tired of waiting to see that spinning when all I hear instead is bold talking. I have been waiting years now to see PG *in practice* offer useful readable pleasant MOBI and EPUB books.

On Tue, Jan 31, 2012 at 12:33:06PM -0800, Jim Adcock wrote:
... I would personally be happy just to have the ability to take a PG book which for a variety of silly reasons is in practice not readable on EPUB and MOBI devices today and make it readable. I personally would be happy to have that "snowflake" exist simply as long as it takes you guys *in practice* to get your act together and accomplish that which you claim you will accomplish. If you extinguish my "snowflake" by offering something someday which in fact actually proves to be better for the end reader, then you go boy!
That's what I'm talking about, too. I'm entirely in favor of automatic conversion to & from different formats. Of master formats. I was personally deeply involved in a whole lot of our current setup for those things, and am hugely optimistic about improvements in the future.
Meanwhile, we have an immediate and useful way to improve many readers' experiences, by providing access to hand-crafted alternatives.
It should be obvious that I'm constantly seeking the long-term view of Project Gutenberg. That is why I'm thinking of solutions that involve keeping all derived files or changes we create/publish/distribute, forever. But that doesn't mean I'm against providing some files that will only have a few days, months or years of useful life before they are superseded.
-- Greg
participants (4)
- Bowerbird@aol.com
- Greg Newby
- Jim Adcock
- Lee Passey