RE: [gutvol-d] Update on Harvesting of the Internet Archive's Ca

24 Apr 2005

      --- Gardner Buchanan <gbuchana@rogers.com> wrote:
...
> I have done some work developing scripts to re-process the page
> image sets from the Toronto archive.  If you're interested, maybe
> we should compare notes.  I've found the images to be quite high
> quality.
Almost all of the people working on the Toronto archive, and the other
archive.org page image archives, are using the generated DjVu files, as we
don't have the bandwidth to download half a gig or more of images per book.
These are usually of good enough quality to OCR.
...
> You didn't mention reconciling your list with the cleared/books in
> progress list
The reconciliation is done by people informing me when material on this list is
already in progress. It might be possible to do some of this by automatically
comparing David's In Progress List with this list, but nothing along those
lines has yet been done. It's up to the individuals who claim books from this
list to check their status.
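
For what it's worth, the automatic comparison could start out as something like the rough sketch below. Nothing like this exists yet, and everything in it is an assumption: it supposes both lists can be reduced to plain title strings, the function names are made up for illustration, and the matching is a crude normalized-title comparison that would still need a human to confirm each hit.

```python
import re
import unicodedata

# Leading/trailing articles are dropped so "The Title" matches "Title, The".
ARTICLES = {"the", "a", "an"}


def normalize(title):
    """Reduce a title to a crude comparison key (hypothetical helper)."""
    # Strip accents, lowercase, and turn punctuation into spaces.
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(w for w in text.split() if w not in ARTICLES)


def find_overlaps(harvest_titles, in_progress_titles):
    """Return harvest titles whose key also appears in the in-progress list."""
    in_progress = {normalize(t) for t in in_progress_titles}
    return [t for t in harvest_titles if normalize(t) in in_progress]


def find_duplicates(titles):
    """Return (first, repeat) pairs of titles that share the same key."""
    seen = {}
    duplicates = []
    for title in titles:
        key = normalize(title)
        if key in seen:
            duplicates.append((seen[key], title))
        else:
            seen[key] = title
    return duplicates
```

Calling find_overlaps() with the harvest list and the in-progress list would give candidates to mark as already in progress, and find_duplicates() on the harvest list alone would flag repeated entries. Matching on title alone will miss variant titles and may confuse different editions, so the output is only a list of things to check by hand.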

> , and looking at your web page, I see that you intend to
> ...
> process at least one of those books in progress - by me - namely:
<snip>
> Also, your list has a duplicate entry for this title.
Thanks for reporting these in the thread on the DP forum. Both entries have
been marked as already in progress.

-- 
Jon Ingram


Jonathan Ingram
