pontifications from mount high horse -- #1499

i don't think anyone will build another digitization community like distributed proofreaders. and d.p. will wither away soon...

the digitizers of tomorrow will be people like paul flo williams, and james simmons, who take on the task as "a labor of love"... they'll do books that they love -- _because_ they love them -- and not just take part in some abstract "digitization" project...

they'll be far more likely to do a half-dozen books end-to-end, than to take on some piece of dozens (or hundreds) of books...

they might get a few people to help 'em out, or they might not. it won't matter, not much, because either way they'll trudge on. and they'll use tools that will fit their small, personal workflow.

***

this brings up something roger said a while back:

> I also feel it fits comfortably with users that want to just get one page right. There's a sense of accomplishment resolving the warnings on one page and knowing it is "done."

first of all, let me give roger full credit for what is one of his most significant contributions over the past few years, and that's saying a lot because he's made plenty of them... but one of the most significant is that his actual tests showed that if a person _says_ they consider a page to be "finished", then the odds are that it really _is_ finished, i.e., error-free...

i had always argued, based on my gut, that it took _2_ people to confirm a page as error-free before we could "trust" that it was, _and_ that anyone who made a change to a page could _not_ be counted as one of them. in other words, 2 _confirmations._

roger's research showed that that's not necessarily the case... at least it _hinted_ at that...

but there was one troubling confound in his test, which is that he didn't factor in the _initial_state_ of the page. and we _know_ that -- typically in o.c.r. -- _many_ pages start out error-free. indeed, in some o.c.r. files, after a decent preprocessing run, the majority of the pages will be error-free, so a person could certify _all_ of them as "finished", without even looking at 'em, and be "correct" far more often than they were "wrong".

but there's another aspect to this, a psychological aspect, and one that involves the _motivation_ to do digitization, especially in a large project where you're a cog in a wheel.

i fully recognize that people get satisfaction from the feeling of "finishing" a page, and that we can use that for motivation. but at what point does that person begin to feel manipulated, because the vast majority of the pages were "finished" before the person even looked at them, and we neglected to tell 'em, or -- worse yet -- we implied that the page _did_ have errors?

i mean, it's one thing to say, "we think this book is error-free, so go ahead and read it, and let us know if you catch anything", which is the way that i would pitch beta-reading to my people.
but it's another thing to say, "proofread this word-by-word", when you darn well know _most_ of the pages are error-free. there is a dishonesty about that which causes me discomfort.

or, in other words, what kind of satisfaction does a person get from certifying a page as "finished" if the page was "finished" before they even got it, but you led them to believe it wasn't?

***

there's a flip side to this, too. what kind of satisfaction did roger get from calling his pages "done" when he later found that he had missed many of the italics? this cuts both ways...

anyway, just giving you something to think about in 2012...

***

have a nice year.

-bowerbird