
----- Original Message -----
From: "Michael Hart" <hart@pglaf.org>
As one who works from a lot of older works to not only scan and OCR but correct them, I know how much human labor is involved. There are volunteer efforts like Distributed Proofreaders http://www.pgdp.net/c/default.php , but I have concluded that it takes me more time to set up a project for them than it would take for me to do the proofreading myself, and my work would likely be more accurate, since I would understand the underlying content and know how to render obscure text.
While it does take a little time to set up one's first project with the Distributed Proofreaders, it is usually quite a bit easier the second time, not to mention that we have volunteers who will walk you through processes the first few times around, which seems to do the trick for nearly everyone.
I just want to make a quick comment on this part (since I somehow missed the initial e-mail). Setting up projects at DP is not time-consuming (well, the upload of the image files can be, depending on your internet connection), especially once you've done it a few times. As one of the larger DP project managers (currently at 687 projects created for DP), I can tell you that there is NO WAY to proof even an easy text in the amount of time it takes to create and upload the project to DP. Even if I take into account OCR time (which I batch up and run overnight), it is still less time than I would take to proof the work.

I can also reiterate Michael's comment that there are plenty of folks ready to help out new content providers on their first few projects. It can be a little daunting the first time, but it gets easier once you've done it a couple of times. Also, for folks who don't want to get heavily involved, we can usually work something out with someone who just wants to provide the image scans. We can usually take it from there (assuming they are public domain scans, of course).

Josh

PS I also haven't created any new projects in many months because of the backlog we've got in the system. I wanted to help clear out some more work before sending more into the queue. So those 687 were done in a much shorter frame of time than my login statistics at DP might otherwise imply.