
On Tue, October 11, 2011 3:39 am, Marcello Perathoner wrote: [snip]
I'm interested in and willing to offer all technical support I'm capable of to a new DP built around these guidelines:
OK Marcello, I'm with you. How do we start?
1. Use one master format for every book. (There will be a small set of master formats to choose from.)
Makes sense to me. Of course, we need to pick master formats that allow for the preservation of as much semantic "goodness" as possible. And we will all need to be willing to compromise. There are some things I'm adamant about (only true paragraphs can be called paragraphs) and other things I can compromise on (names of classes, use of spans, etc.). And as we compromise, I think we should favor solutions that preserve data over those that omit it, even if preservation is somewhat harder.
2. Minimize formatting. Make books that are usable across a wide variety of devices, not books that look exactly like the paper edition.
I agree 100%. Using HTML as an example, all styling should be restricted to external style sheets, and the document should be marked up so that it still looks acceptable in the majority of HTML user agents (aka browsers) even if the style sheet is lost.
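To make the "external style sheets only" rule checkable, something along these lines might work. It is only a rough sketch using Python's standard html.parser module; the class name InlineStyleFinder and the function find_embedded_styling are names I made up for illustration, and a real check would probably need to cover more cases (presentational attributes such as align or bgcolor, for instance).

```python
from html.parser import HTMLParser


class InlineStyleFinder(HTMLParser):
    """Collects places where styling is embedded in the document itself.

    Hypothetical helper for illustration only; not part of any existing tool.
    """

    def __init__(self):
        super().__init__()
        self.problems = []  # list of (line number, description) tuples

    def handle_starttag(self, tag, attrs):
        line, _offset = self.getpos()
        # A <style> element means CSS is embedded in the document.
        if tag == "style":
            self.problems.append((line, "<style> element"))
        # A style="..." attribute means styling is attached to one element.
        for name, _value in attrs:
            if name == "style":
                self.problems.append((line, f"style attribute on <{tag}>"))


def find_embedded_styling(html_text):
    """Return a list of (line, description) for styling found in the markup."""
    finder = InlineStyleFinder()
    finder.feed(html_text)
    finder.close()
    return finder.problems
```

A document that only links to its style sheet with a link element should come back with an empty list; anything else gets reported with the line number where it was found.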
3. Use a resource control system (like git) for posting and maintenance. PG will host the master repository and the public can pull from it. A group of "committers" can push. Every committer can have his own group of aides and pull from them.
I have never used git, so I cannot comment on it directly. I tend to prefer CVS because it stores the files on the server in the file system rather than in a database, which makes it simpler for me to do back-end compilation, evaluation, composition, etc. Git (or Subversion, which I do not like at all) has the advantage of being accessible via HTTP or FTP, as well as a proprietary port and protocol.
4. Use already scanned material: IA, Google, Gallica etc.
I don't think this is absolutely necessary. What I /do/ think is absolutely necessary is that the scanned material must be publicly available. So, if someone wants to work on a book from scans that are /not/ publicly available, it should be required that the person get those scans into a public archive first.
5. Important works first. Don't bother with those embarrassing amateurish works DP turns out by the hundreds.
I agree in principle, but think this could be a very difficult position to enforce. I also think that a lot of these dime novels are being churned out because the more important works have already been "done," however amateurishly. I think if we abandoned the notion of a work being "done," and let people work on whatever excites them, even if it's something that has existed in the corpus for decades, attention will return to the important works naturally.
6. Accept unicode only.
Yes, using UTF-8 encoding.
About a year ago I was able to snag the domain name "ebookcoop.net." I would be willing to donate the name to the cause if we can get an IP address to attach it to. I have been very impressed with the work that Mr. Frank has done at fadedpage.net. Is this a project we could leverage? Roger, would you like to join us?
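Going back to point 6 for a moment: enforcing "UTF-8 only" at submission time could be as simple as the sketch below. The function name is_valid_utf8 is my own invention, and this only checks that the bytes decode cleanly; it says nothing about whether the characters are the right ones.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8.

    Hypothetical helper for illustration; a real submission check might
    also want to reject a byte order mark or unexpected control characters.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

A file saved as Latin-1 with accented characters would fail this check, while the same text saved as UTF-8 would pass.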