
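For concreteness, here is a rough sketch of the kind of automated tweak this message talks about: pulling an image and its caption out of a fixed-width, single-cell table and into a percentage-sized div. Everything in it is an assumption made for illustration, not taken from EB or any actual project: the function name table_to_div, the class names figcenter and caption, the 600px reference page width, and the shape of the input markup. A real pass would have to cope with far more variation than this one pattern does.

```python
import re

# Hypothetical pattern: a one-cell table with a fixed pixel width that wraps
# an <img> and an optional caption. The caption is not allowed to cross
# another td/tr/table tag, so multi-cell tables are left alone.
FIXED_TABLE = re.compile(
    r'<table[^>]*\bwidth="(?P<px>\d+)"[^>]*>\s*'
    r'<tr>\s*<td[^>]*>\s*'
    r'(?P<img><img\b[^>]*>)\s*'
    r'(?:<br\s*/?>\s*)?'
    r'(?P<caption>(?:(?!</?(?:td|tr|table)\b).)*?)\s*'
    r'</td>\s*</tr>\s*</table>',
    re.IGNORECASE | re.DOTALL,
)


def table_to_div(html, page_width_px=600):
    """Rewrite matching tables as %-sized divs; leave everything else as-is.

    page_width_px is an assumed reference width used to turn the table's
    pixel width into a percentage.
    """
    def repl(match):
        pct = min(100, round(100 * int(match.group("px")) / page_width_px))
        # Drop the fixed pixel dimensions from the image itself.
        img = re.sub(r'\s(?:width|height)="\d+"', "", match.group("img"),
                     flags=re.IGNORECASE)
        # Let the image fill the div (assumes the tag has no style attribute).
        img = re.sub(r'^<img\b', '<img style="width: 100%;"', img,
                     count=1, flags=re.IGNORECASE)
        parts = [f'<div class="figcenter" style="width: {pct}%;">', img]
        caption = match.group("caption").strip()
        if caption:
            parts.append(f'<p class="caption">{caption}</p>')
        parts.append("</div>")
        return "\n".join(parts)

    return FIXED_TABLE.sub(repl, html)


sample = ('<table width="300"><tr><td>'
          '<img src="images/fig1.jpg" width="300" height="200" alt="Fig. 1">'
          '<br>Fig. 1. A caption.</td></tr></table>')
converted = table_to_div(sample)
print(converted)
```

Because the function returns non-matching input unchanged, it can be run over a sample of files and the before/after diffs checked by hand, which is the "automated tweaks only" trial described in this message.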
I expect some portions of the changes to be 100% automatable, and 100% beneficial in 90% of projects: stuff like taking images/captions out of fixed-size tables and putting them into %-sized divs. With EB I write regexes that can get those right almost all the time. Other likely candidates are footnotes, chapter headings, and page numbers, though I only have EB and three or four others to work from.

But we could automate that pretty quickly, run it against a sample of the corpus, and check over the results thoroughly. The key is to make no tweaks other than the automated ones, so we find out how close we can come. Then we may know enough to plan the next step.

I'd like to hear what Lee and the others think first, though. They're better judges than I am.

On Thu, Feb 2, 2012 at 11:15 PM, Greg Newby <gbnewby@pglaf.org> wrote:
On Thu, Feb 02, 2012 at 06:46:56PM -0800, don kretz wrote:
... It will take some work and cooperation. The critical question still remains: will PG allow existing projects to be altered this way? Under what conditions? With what verification requirements?
I already answered that in this thread, and the answer is that we do have a procedure to get fixed files back (i.e., the errata process, with a WWer in the loop).
A theme that is not well-handled by the errata process is, what if only the HTML is tweaked, to make the file more epub (etc.) friendly? That is, when the "fix" is not typos/scannos/missing pages, etc., etc., but simply formatting or markup?
The short answer is a rephrasing of the starting point from a few days ago: I'd like to go ahead and make a way to get these back into the collection, replacing the originals, *en masse*. (Actually, we keep the originals, in an 'old' subdirectory.) I don't anticipate opposition to this idea, assuming we're tweaking, not redoing the look and feel crafted by the submitter. How to tell which is which?
One thing we've done with a few people who were very active in posting/reposting/augmenting is give them direct access to upload. This is something we do AFTER the procedure is very clear. It's easy to screw things up, trust me....
My emphasis in this discussion has been to look at ways to make this type of process more efficient and scalable. We don't want to have a lot of back and forth discussion for every file, if we want to eventually re-do thousands. This interest is at least partially selfish, since I'd rather not be part of a decision process for every such fixed eBook that comes along, and I'm pretty sure the current WWers have similar feelings.
-- Greg
_______________________________________________
gutvol-d mailing list
gutvol-d@lists.pglaf.org
http://lists.pglaf.org/mailman/listinfo/gutvol-d