
On 01/31/2012 09:22 AM, Greg Newby wrote:
I. making changes to the master file(s) [let's imagine that we retain the practice of every PG eBook having a small number of master files, in a small number of master formats]. The short list of master formats includes RST, HTML, TeX/TEI, and plain text (perhaps with light markup). Maybe this list will grow in the future; maybe it will shrink.
Having more than one master format per book does not make sense. Decide which format is best for that book and stick to it. Every typo should have exactly one location that needs fixing.
II. from those master files, various other file formats can be [and are, currently] derived automatically. These include EPUB, Kindle variants, variations on HTML or text (especially if they were not previously provided), RTF, and a few others. Again, maybe this list will grow, maybe it will shrink. I do hope to offer conversion on-demand, which will let people select conversion options, and maybe even different conversion programs, for their purposes.
Conversion on demand will not be possible with the cycles available at ibiblio. If you want that, you'll have to organize some very beefy servers that do nothing but crunch books.
III. from those master files, there are various other file formats that are created/contributed by individuals. I get offered these (via help@) practically every day. Usually EPUB, but also RTF/DOC, PDF. Often with typo fixes applied. These are what I called "lovingly prepared," though of course some are better than others.
These can be better than automatically-generated versions in various ways. They might have advantages over master files (for example, improved HTML). The main feature is that these would, in many cases, provide an improved reading experience (at least for some people, on some devices).
If we accept that anyone could contribute such a new file (or set of files) for an existing PG eBook, then the main challenges I see are (a) how to help readers select among them, and (b) dealing with the fact that, over time, master formats will be fixed, but not these hand-crafted derivatives.
The whole idea seems to me very ill-conceived. We just don't have the resources to handle that kind of workload. It will just divert our resources away from posting more master files to posting lots of nearly identical vanity editions.

Every user contribution will have to be checked for external site links or other SEO tricks, malicious text edits, etc., or PG will very quickly turn into a link farm for spammers or an exchange point for "corrected" editions of the Origin of Species. Checking some proprietary formats could be expensive. Some formats could even be impossible to check except via eyeball grep. Every "just one small typo fixed" version will have to be checked completely anew.

Typos will no longer be reported to errata first; instead, a new edition will be sent in. Every fixed edition will have a slightly different set of typos fixed. People reporting typos will not state which edition they have. Vanity editions will invariably fall out of sync with the master format.

Discontent will ensue about the ranking of multiple vanity editions. The same goes for the extent of allowed customizations ("It's a book about cats and I have added just a dozen pics of my cat ..."). Users will be confused about which edition to download.

I would redirect vanity editions to MobileRead or any other web site that already posts them. We could even link to them if they gave us landing pages.

--
Marcello Perathoner
webmaster@gutenberg.org