
On 21 Apr 2005, at 11:47, Michael Hart wrote:
> On Thu, 21 Apr 2005, Carlo Traverso wrote:
>> It is much better if the incomplete projects remain at DP: when pages are found, DP updates both the text and the images, so that, when these will be made available, these will be complete too.
>
> Is there any reason these projects cannot be kept at DP as suggested and also still shared with the world?
No such reason, but I will come to that later.

There are philosophical differences between PG and DP that hardly ever come to light except in instances such as now.

One is that DP doesn't care how long it takes before a public domain book is presented to the public. This is part of its very make-up; we distribute the work in bits that are as small as possible, and there are very few stakeholders who have a large interest in what finally will happen to the book. If neither the scanner nor the post-processor cares very much _when_ the book will be released, there is a chance that a text will be sat upon until it's ready, not until it's time.

The other difference is that nitpickers are drawn to DP the way moths are drawn to a flame. A lot of volunteers at DP care more about the quality of the works we put out than the quantity. We don't want to produce as many books as possible for as long a time as possible (part of PG's main philosophy); we want to produce good books.

Obviously I am exaggerating the differences a great deal; I make it sound like PG does not care about quality, and obviously that is not true. I also make it sound as if books sit forever at DP, while proofreading monks chip away at the tiniest of imperfections, which is also not true. But the differences that do exist may account for why books are apparently sitting longer at DP than PG would like.

I can see several solutions for this:

- Spring cleaning; the powers that be at DP regularly organize proofreading / post-processing / mentoring / whatever marathons, whenever they feel something needs extra attention. If there are truly books that have been sitting at DP for too long, we can try and organize something like that to flush out the forgotten projects.

- Assign quality levels; currently, a PG text is a PG text is a PG text no matter how much effort and attention has gone into it. This means there is a variety in quality that is currently not accounted for.
(As a consequence, our bad texts are dragging down our reputation, causing PG to miss its goal of reaching as many people as possible. Some people won't read our books because of their reputation--see my recent discussion with David Rothman at the Teleread blog.)

I can see several disadvantages and several advantages to this proposal.

The disadvantages:

1. PG has never liked putting out "editions". I am not sure why. Quality levels are like "editions".

2. Someone has to build it before we can use it. Things can go wrong while we use it. Readers might not understand what each level means.

3. On the PG side, someone has to check (whitewash) a book at every level, not just once. Corrections may have to be made to multiple versions, if we choose to retain versions at older levels.

The advantages:

1. We can publish a book at several stages of its restoration. Currently the following stages would make sense to me:

   a. After scanning (and perhaps OCR-ing)
   b. After proofreading/post-processing
   c. After extended mark-up/proofreading phases (what would be smoothreading at DP)

2. We can keep the process more transparent.

3. Users can choose between quality levels: have an unchecked, incomplete book now, or wait for the improved version.

-- 
branko collin
collin@xs4all.nl