RE: [gutvol-d] Update on Harvesting of the Internet Archive's Ca

--- Gardner Buchanan <gbuchana@rogers.com> wrote:
I have done some work developing scripts to re-process the page image sets from the Toronto archive. If you're interested, maybe we should compare notes. I've found the images to be quite high quality.
Almost all of the people working on the Toronto archive, and the other archive.org page image archives, are using the generated DjVu files, as we don't have the bandwidth to download half a gig or more of images per book. These are usually of good enough quality to OCR.
You didn't mention reconciling your list with the cleared/books in progress list
The reconciliation is done by people informing me when material on this list is already in progress. It might be possible to do some of this by automatically comparing David's In Progress List with this list, but nothing along those lines has yet been done. It's up to the individuals who claim books from this list to check their status.
... and looking at your web page, I see that you intend to process at least one of those books in progress - by me - namely: <snip> Also, your list also has a duplicate entry for this title.
Thanks for reporting these in the thread on the DP forum. Both entries have been marked as already in progress.

-- Jon Ingram