Re: the problem with the e-books from the internet archive -- 02 of 32

if some of those stimulus funds are used to _clean_ the o.c.r., that would be an extremely excellent use of them, to be sure... if they're used to scan more books and create more crap o.c.r., however, that would be sad.

to put this in perspective, it takes about _one_hour_ per book to clean up the o.c.r. and turn that digital text into an e-book. just one hour. (people from distributed proofreaders will try and tell you it takes longer than that; that's because they're doing it wrong.) but "just one hour" for a million books takes a million hours. so you need to _budget_ for that.

brewster worked himself into a hole when he promised that he could digitize a book for $30 (roughly 10 cents per page):
http://www.opencontentalliance.org/2009/03/22/economics-of-book-digitization...

you can _scan_ it and handle the other associated overhead, but you need an hour of crafty human labor _after_ scanning in order to make the digital text worthy of human exposure...

-bowerbird
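
a rough sketch of the budgeting arithmetic above, in python. the one hour per book, the million books, and the $30 scanning figure come from the post; the hourly rate and the hours in a work-year are hypothetical placeholders, not numbers from the post, so swap in your own.

```python
# back-of-the-envelope budget for post-scan o.c.r. cleanup.
# from the post: 1 hour of cleanup per book, a million books, $30 per book to scan.
# assumptions (NOT from the post): the hourly rate and the hours in a work-year.

BOOKS = 1_000_000
CLEANUP_HOURS_PER_BOOK = 1
SCAN_COST_PER_BOOK = 30.00

HYPOTHETICAL_HOURLY_RATE = 15.00          # assumed labor cost, dollars per hour
HYPOTHETICAL_HOURS_PER_WORK_YEAR = 2_000  # assumed full-time work-year


def cleanup_budget(books, hours_per_book, hourly_rate, hours_per_work_year):
    """Return total cleanup hours, person-years, and labor cost."""
    total_hours = books * hours_per_book
    return {
        "total_hours": total_hours,
        "person_years": total_hours / hours_per_work_year,
        "cleanup_cost": total_hours * hourly_rate,
    }


budget = cleanup_budget(
    BOOKS,
    CLEANUP_HOURS_PER_BOOK,
    HYPOTHETICAL_HOURLY_RATE,
    HYPOTHETICAL_HOURS_PER_WORK_YEAR,
)
scan_cost = BOOKS * SCAN_COST_PER_BOOK

print(f"cleanup hours: {budget['total_hours']:,}")
print(f"person-years:  {budget['person_years']:,.0f}")
print(f"cleanup cost:  ${budget['cleanup_cost']:,.0f} (at the assumed rate)")
print(f"scanning cost: ${scan_cost:,.0f}")
```

whatever rate you plug in, the cleanup hours scale linearly with the number of books, which is why that line has to sit in the budget next to the scanning cost.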