
jon said:
1. Develop a methodology for nominating master scans where no record of provenance of extant text exists.
2. Source master scans and upload to PG.
3. Run master scans through DP until P2 using LOTE non-clothing exception. Capture P2 output and diff it against extant text to produce the RTT. Note that if P2 output is perfect the RTT _is_ P2 output.
4. Diff RTT against extant text to produce comprehensive errata list, and deliver to WWs so they can update the e-texts.
And that, for the moment, is the limit of my ambition. As Bowerbird points out, I have more than likely just described an impossible scenario, even though from a technical point of view it is completely trivial.
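For reference, the purely mechanical part of step 4 (diffing the RTT against the extant text to get an errata list) might look something like the sketch below. It is only an illustration, not anything PG or DP actually runs: the function name `errata` and the tuple layout are made up here, and it assumes both texts are plain strings with roughly the same line breaks. If the two texts are wrapped differently, a line-level diff like this will report mostly noise.

```python
import difflib


def errata(extant_text, rtt_text):
    """Return (line_no, extant_lines, rtt_lines) for every span that differs.

    Hypothetical helper: line_no is the 1-based line in the extant text
    where the differing span starts.
    """
    extant = extant_text.splitlines()
    rtt = rtt_text.splitlines()
    # autojunk=False keeps difflib from discarding frequently repeated lines
    matcher = difflib.SequenceMatcher(a=extant, b=rtt, autojunk=False)
    report = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        report.append((i1 + 1, "\n".join(extant[i1:i2]), "\n".join(rtt[j1:j2])))
    return report
```

Each entry pairs the reading in the extant text with the reading in the RTT, which is roughly the shape an errata report would need. Deciding which reading is correct still takes a person and a scan.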
well, yes, it's "impossible" because the people who would be necessary to make it work have no interest in doing it.

but even if they did, this isn't really "completely trivial"...

the twist is your assumption that there _must_ be a solid and usable provenance for the existing e-texts.

for some of the classic e-texts, though, the "source" was an amalgamation of several different editions... so you're not gonna find a scan-set for "provenance".

and for others, the "source" document was a _reprint_ done by the "classics" imprint of a corporate publisher. so you're not going to find a scan-set for those, either. (and even if you did, you wouldn't want to use it, since those imprints often "took liberties" with the real text.)

therefore, steps 3 and 4 in your list simply won't work, not in the way that you've described them.

furthermore, the whitewashers aren't gonna accept any "error-reports" that might be based on your work, because your scan-set wasn't the actual source of the currently-existing e-text. (the fact that the currently-existing file _has_no_source_ is of little concern to them, or to anybody else, it'd seem, except to some of us obsessive-compulsive fuss-budgets, although i've predicted, for some time, that will change.)

the simple solution, of course, would be to replace the unsourced file with the one that has a solid provenance. but, as we see, p.g. doesn't seem to want to do that...

-bowerbird

p.s. it's also worth stating that even a replacement will _not_ be a cure-all. many end-users get their files from "republishers" -- manybooks.com, feedbooks.com, etc. -- and i'm not sure if those republishers go back and re-do the low-numbered "classics", even when they're replaced.
participants (1)
-
Bowerbird@aol.com