
Geoff, I find your approach fascinating because I usually think of error-catching in posted texts as a smoothreading sort of task (I can't read anything from PG without ending up sending back a list of errors... I keep Jim busy), or at least as a text-by-text verification. But rather than searching for all errors in one book, you're searching for particular errors across the whole set. The obvious extrapolation is some sort of tool that would blast through the whole archive and create a list of candidate errors, which could then be checked over and eventually turned into a list of corrections for each text. It'd be a lot of work (to develop the tool, sift through the output, and eventually apply the corrections to the texts), but it could result in a significant one-time jump in the quality of the texts that are already posted. And a serious bandwidth hit to the PG server the first time they're all pulled for checking...
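To make the idea a bit more concrete, here is a rough sketch of what such a scanner might look like. It is only an illustration, not anything that exists today. The pattern list, the function names (`find_candidates`, `scan_archive`), and the assumption that the texts sit in a local directory as `.txt` files are all mine. Working from a local copy would also mean the server only gets hit once.

```python
import re
from pathlib import Path

# Hypothetical patterns, for illustration only. A real list would be built
# from errors actually observed in posted texts.
PATTERNS = {
    "doubled word": re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE),
    "space before punctuation": re.compile(r"\w [,;:!?]"),
    "'tbe' for 'the'": re.compile(r"\btbe\b"),
}


def find_candidates(text):
    """Yield (line_number, label, line) for every pattern hit in text."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                yield lineno, label, line.strip()


def scan_archive(root):
    """Scan every .txt file under root (assumed to be a local copy of the texts)."""
    for path in sorted(Path(root).rglob("*.txt")):
        # errors="replace" keeps the scan going on files with odd encodings.
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, label, line in find_candidates(text):
            yield str(path), lineno, label, line
```

Looping over `scan_archive("some/local/dir")` and printing each tuple would give the candidate list. Every hit is only a candidate: a legitimate "had had" would be flagged as a doubled word, which is why a human still has to check the output before any correction goes into a text.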