
On Dec 23, 2011, at 2:32 PM, Bowerbird@aol.com wrote:
roger said:
The DHYP button dehyphenates the page so spellchecking has whole words to check.
minor question, for roger...
why not have the spellchecker join the hyphenates, solely to do its checking, and leave the text intact?
There are several reasons of varying quality:

1. Habit. It's the way I've done it for years, and it just looks right.
2. It doesn't require me to use the -~ markup, and without that I can take the output of the editing tool and pass it directly to gutcheck (or an equivalent).
3. I can't resolve hyphens across a page break in this page-at-a-time implementation, at least not without code that I don't want to write.
4. I see no pure solution in leaving hyphens intact anyway, since I already move other things around (such as images).
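For what it's worth, a rough sketch of the "join only for checking" idea might look like this (Python purely for illustration; this is not the code behind the DHYP button): build a throwaway copy with the end-of-line hyphenates joined, feed that to the spellchecker, and leave the stored page text alone.

    # Illustrative sketch only, not the DHYP implementation: rejoin words
    # split by an end-of-line hyphen in a throwaway copy used just for
    # spell checking, leaving the original page text untouched.
    import re

    def words_for_spellcheck(page_text):
        # "exam-\nple" becomes "example" in the copy; page_text itself is not modified.
        joined = re.sub(r"(\w+)-\n(\w+)", r"\1\2", page_text)
        return re.findall(r"[A-Za-z']+", joined)

    page = "The quick exam-\nple shows a hyphen-\nated word or two."
    print(words_for_spellcheck(page))
    # ['The', 'quick', 'example', 'shows', 'a', 'hyphenated', 'word', 'or', 'two']

Hyphenates that run across a page break would still be unresolved, which is reason 3 above.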
major questions, for everyone...
can someone give me a 2012 justification for the reductionistic attitude toward the proofing task? why are we still doling out pages one-at-a-time?
There's a lot to be said for working at the book level instead of the page level. If this were a program run locally on a user's own machine, I would do many things differently. One reason I do it this way in the online version is that a single page of text fits on the computer's screen. I also feel it suits users who just want to get one page right: there's a sense of accomplishment in resolving the warnings on one page and knowing it is "done." I do it page-at-a-time because it allows someone who isn't prepared to take on a whole book to still contribute. From my experience working with everyday people who want to be a part of this, encouraging successful little steps is a good thing.
but today, you can clean an entire book in an hour -- in one pass, or in six 10-minute sessions, say -- and send it to the smoothreaders for a final check...
so why don't you guys want people to do it that way? smoothreaders are lots easier to recruit from scratch.
I don't know about doing this book in an hour. Though this was a low-density text, it had quite a few errors. I've just finished going through it using essentially the same tool announced earlier at http://etext.ws/ppe.php, and it took me more time, perhaps two hours. I suspect the difference is that some corrections can be made globally, and those are the ones that take more time when done page by page.
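To make that concrete: a recurring scanno like "tbe" for "the" is one rule in a single pass over the whole book, but in page-at-a-time proofing each occurrence has to be spotted and fixed by hand on its own page. A hypothetical sketch (the pages/*.txt layout and the scanno rule are made up; this is not code from ppe.php):

    # Hypothetical sketch of a whole-book correction pass; the file layout
    # and the scanno rule are invented for illustration.
    import glob
    import re

    for path in sorted(glob.glob("pages/*.txt")):
        with open(path, encoding="utf-8") as f:
            text = f.read()
        # One rule covers every page; proofing page by page, each hit is a
        # separate manual edit.
        fixed = re.sub(r"\btbe\b", "the", text)
        if fixed != text:
            with open(path, "w", encoding="utf-8") as f:
                f.write(fixed)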
for my side, i scraped roger's text and images, and have already cleaned the book and placed it online, as if for smoothreaders.
Great! Thanks, BB. I'm excited to compare what you've done to what I did with the same source text. I trust you scraped and used the OCR version of each page's text. I'll report back what I find. --Roger