
Does DP have a post-processing crisis? With thousands of volunteers texts flow regularly through the OCR and first phase quickly. However there are several thousand books that have been in post processing over a year. Many of these are hard, but many are plain text. Is it appropriate to re-scan a book to start the process over again hoping for better luck? One could clear another edition, etc. N Wolcott nwolcott2@post.harvard.edu

On Tue, 18 Jan 2005, N Wolcott wrote:
Does DP have a post-processing crisis? With thousands of volunteers texts flow regularly through the OCR and first phase quickly. However there are several thousand books that have been in post processing over a year. Many of these are hard, but many are plain text.
Is it appropriate to re-scan a book to start the process over again hoping for better luck? One could clear another edition, etc.
Perhaps "crisis" is too strong a word to use. I suspect this situation is somewhat inevitable: taking part in the proofing process one page at a time is relatively easy and gives a sense of accomplishment sooner. Post-proofing is a larger commitment, and can be more tedious. On another note, one of the many texts waiting in the queue is "Alcyone", a collection of poetry by Archibald Lampman, a highly regarded Canadian poet. I have the full text of this volume, gathered from another online source, which I could use for comparison. I have offered multiple times since June 2004 to do post-proofing on it, but the only response has been "contact the person the text has been assigned to", which I have tried multiple times without a reply. Andrew

That is a bit of an exaggeration, but there are many, many texts in the post-processing stage at DP. Rescanning the book would only make it worse. Mostly we need people who are willing to work on post-processing texts. Long-term, we are actively working on new ways to handle much of the post-processing work. Currently, it is all done by one person. If things work the way we hope, much of the post-processing work will become distributed, too. Josh
------------------------------------------------------------------------
_______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/listinfo.cgi/gutvol-d

German Library Allowed To Crack Copy Protection
Posted by timothy on Wednesday January 19, @04:03AM from the clashing-aims dept.
AlexanderT writes "The EU Directive 2001/29/EC (also known as the European Copyright Directive) has made it "a criminal offence to break or attempt to break the copy protection or access control systems on digital content such as music, videos, eBooks, and software". As of today, there is at least one notable exception in Germany: The Deutsche Bibliothek, Germany's national library and bibliographic information center, has received a "license to copy", i.e. the official authorization to crack and duplicate DRM-protected e-books and other digital media such as audio CDs and CD-ROMs. The Deutsche Bibliothek reached an agreement with the German Federation of the Phonographic Industry and the German Booksellers and Publishers Association after it became obvious that copy protection would not only annoy teenage schoolboys, but also prevent the library from fulfilling its legal mandate to collect, process and bibliographically index important German and German-language works."

Is it appropriate to re-scan a book to start the process over again hoping for better luck?
Absolutely not. This does not help the situation in any way, and in fact contributes further to the perceived logjam. Every book currently in the PPing phase at DP *will* one day be posted to PG. Ebooks don't have a shelf-life, they will not go stale, there is no race. Given a book written, say, 90 (or 190) years ago, and especially given that PG aims to keep the ebook version available, when finished, for many hundreds of years, a book taking a year or two (or, yes, even five (though I don't know of any needing so long, yet)) to be digitised is chickenfeed. Cheers Bill

N Wolcott writes:
Does DP have a post-processing crisis? With thousands of volunteers texts flow regularly through the OCR and first phase quickly. However there are several thousand books that have been in post processing over a year. Many of these are hard, but many are plain text.
I know that other people have commented on this, but I'd just like to state that, from looking at the PGDP statistics, I don't believe that this is even close to true. There are about 2250 projects that have completed the first two rounds of proofing but have not been posted to PG: 400 are waiting for a PPer, 1600 are in post-processing, and 250 are waiting for verification. None of the 400 projects waiting for a post-processor have been in the queue for more than a year, although it is possible that some have been checked out and returned to the queue one or more times and could be older than a year. Of the 1600 that are checked out for post-processing, only about 40 have been in the queue more than a year, and about half have been checked out to the current PPer for 60 days or less. Again, it is possible that several PPers have checked out particular projects, so that the statistics make them appear newer than they actually are. Finally, I'd like to point out that most PGDP projects are now generating more than one version of the text, HTML and plain text, and some of the delays can be due to PPers waiting to get better copies of images. Two of my four post-processing projects had images that were adequate for a text-only project but inadequate for HTML, so I had to go back to the content provider for better images, and I still need to do some image processing before I am done with the HTML edition.
Is it appropriate to re-scan a book to start the process over again hoping for better luck? One could clear another edition, etc.
It seems as though you have some specific projects in mind which have not made it through the DP post-processing process. If they are waiting for a PPer, volunteer to do it yourself, or if the project(s) have been languishing in the PP queues, try to contact the PPer directly, and if that fails, try to contact one of the PGDP powers that be, to see if you can get the project reassigned to you. The powers that be at PGDP do try to get PPers to complete the project within 90 days, or to give it up if they're not actively working on it. In any case, unless the other edition differs from the one that's in the queue, you're better off trying to work within the system before trying to restart the process from scratch.
participants (6)
- Andrew Sly
- bkeir@pgdp.net
- Bruce Albrecht
- Joshua Hutchinson
- maitri venkat-ramani
- N Wolcott