re: Re: [gutvol-d] Final PGTEI... page numbers

jon noring said:
for most of DP's volunteers, the markup will be "under-the-hood" and largely invisible -- most of the volunteer work anyway is for copyediting the text (correcting OCR errors), not markup insertion, so no need to require these volunteers to learn the gory details of TEI.
actually, that represents a poor understanding of the work-flow at distributed proofreaders, under the current system anyway, where the proofers are using a clumsy system of pseudo-markup.
Only the most experienced and interested of the DP volunteers, who do the final cleanup/finishing stages, will actually play with the markup itself.
well, now you're talking about the system that will be created, and what that will ultimately look like has not yet been decided. the way you've put it here is, to some degree, what is desired, but there's some question about whether proofers can do their job prior to the introduction of any markup at all. of course, there is _also_ some question about how easy it will be to do the proofing if any obtrusive markup is "inflicted" on the text prior to proofing. further, at present, "proofing" -- the act of catching and correcting errors, either in the text or in the formatting -- happens right up until the end of the text's processing, and i think the finding will be that obtrusive markup, whenever it occurs, will short-circuit that. whether the early rounds can be improved to the point that this "short-circuiting" causes no problems is yet another open question.
Aside: the DP-produced XML Master texts will certainly be used for many purposes, all of which instill requirements on the markup specification, and which must be considered -- this is the biggest missing area not being discussed on gutvol-*.
well, the discussion _here_ carries absolutely no weight at all. if you want to know what d.p. is going to do, you'll have to go over to their forums, where they're batting about these issues right now. (look under the "everything but distributed proofreaders" section, which is an odd place to put such a discussion, wouldn't you think?)
The most exciting of these is where the DPXML texts will be archived into a special library-like repository which allows a very high level of end-user interface and customizability to the collection (e.g., bookmarking, annotation, interlinking within the repository and to other content repositories, blogging, etc.) -- all things several associates and I are now working on.
sounds like you're off and running. perhaps you could teach people here how to crawl first.
-bowerbird