
carlo, perhaps the answer was hidden in your last post, but let me present the situation, and ask the question directly...

the project gutenberg e-text for "pride and prejudice" is relatively accurate, but completely lacking provenance...

the internet archive scan-set for "pride and prejudice" is self-documenting, but the o.c.r. text from it is abysmal...

even though, as you point out, they are different editions, it seems we could use these problems to offset each other.

so... how would you suggest that we go about doing that?

we can, of course, just send the internet archive stuff through d.p., but that wouldn't take advantage of our already-proofed-and-relatively-accurate p.g. e-text.

so, carlo, what would you recommend, specifically?

-bowerbird