
On Tue, January 24, 2012 3:08 pm, Joshua Hutchinson wrote:
So, if someone were to start "refactoring" old PG texts into TEI or RST and working with a WWer to repost them ... is this a workable idea?

I'd love to see the PG corpus redone as a "master format" system (and the current filesystem supports "old" format files in a subdirectory, so if someone wanted to get the old original hand-made files, they could). I'm not particularly wedded to any master format. Hell, if someone came up with a sufficiently constrained HTML vocabulary that could be easily used to "generate" the additional formats necessary, I'm good with that.

But before anyone will start doing this work, there needs to be a consensus from PG (I'm looking at you, Greg!) that the work will be acceptable. A half-assed "master format" system is no master format system at all.

I'm even OK with working up the system as you go (i.e., start with "simple" fiction works and make sure the system handles them before throwing more and more complex works at it, tweaking and fixing in the time-honored method of "incremental development").

Maybe we start this process on a semi-private mirror of the PG corpus, and it gets moved over only when it reaches a critical mass of some sort. But an official notice that this project has some backing is necessary, or we'll just keep seeing everything running around in ten different directions and nothing ever getting done.

Josh
I'm in. I think. Just to be sure, let me reiterate what I think I'm agreeing to.

1. A semi-official mirror of PG will be created.
2. Texts in the mirror will be refactored into a single file format which can be used to automatically create every delivery format offered by Project Gutenberg.
3. The focus of the project will be to re-work the most popular PG texts. At the outset simple works will be preferred to more complex works.
4. The project will evolve as new knowledge is gained.
5. The controlling principals at PG agree that if this effort is successful the refactored works will eventually replace the original works in the PG repository.

Am I wrong in any of these broad points? If so, please clarify.