
Michael Hart wrote:
The Project Gutenberg Philosophy Concerning Orwellian History Rewriting
However, there are at least a dozen or two very outspoken volunteers at Project Gutenberg among a dozen or two thousand of such volunteers, who would prefer to delete many of the original Project Gutenberg eBooks in favor of replacing them with something else, as opposed to just working on them to bring them up to the standards of the modern era of eBooks.
Who says the original PG texts (many of which *need* to be redone from scratch [note]) will disappear? Don't you keep the prior versions of the same Work in the archive?

[Note: many of the pre-DP texts need to be redone from scratch for various reasons. For another project, I'm now working on My Ántonia by Willa Cather, one of the early PG releases (#242), and the latest PG edition of it, #11!, is horribly mangled from various edits made during its lifetime without recourse to the original. (In addition, the PG version apparently used the very buggy English release as one source -- gak!) This process of emendation over many editions without recourse to the original is like the party trick of passing a bit of information along a chain from person to person; by the tenth person the meaning has changed so much that it no longer conforms to the original!

I recently bought the original 1918 edition (fourth printing, I believe) of My Ántonia and am now scanning it. But in the meantime I'm mostly done producing an entirely *faithful* (content-wise) draft XHTML version of this book, faithful to the content of the 1st Edition in every detail (no doubt a few small errors persist, but I know they are few and far between). I will gladly donate the finished XHTML 1.1 version to PG if the associated page scans, which are linked from the XHTML, are included in the archive, and the full source citation is kept *intact* in its entirety in the marked-up text and in the boilerplate metadata.

For those interested, a temporary, awfully CSS-styled version of the draft can be seen at:

http://www.openreader.org/myantonia/myantonia.html (includes page scan links)
http://www.openreader.org/myantonia/myantonia-np.html (sans page scan links)

(Only the first several page scans are available online at this time, as low-rez JPGs -- the originals are full-color 600 dpi (optical).) Critical feedback on the underlying markup is more than welcome.

If the XHTML+scans won't fit into the work flow of DP, which I don't believe they will, I'll soon ask for volunteers to finalize the XHTML version by comparing it to the page scans, which will all be placed online and linked from the text, and to email me any errors found for final fixing -- a sort of DP-like process since it can be done page-by-page. Any volunteers?]
Now. . .the question:
Would someone be willing to do all the work to donate a Britannica 11th to Project Gutenberg this year if they thought it would be removed from Project Gutenberg a decade after it was first included?
(Again, why *remove* what has already been submitted?)

Michael, over a decade ago I considered actively volunteering for PG but decided against it because PG was not focusing on doing things *right*, IMHO. For starters, PG was amiss in:

1) Not including full source information in the texts.

2) Not making faithful reproductions of the sources -- too much leeway was given to emendations and to merging different editions, at least without a vetting process to assure there were no bad emendations or surreptitious changes (now that's rewriting history!) As it stands now, I have no faith that the early texts are faithful reproductions of the original print versions, or that none have been surreptitiously changed -- and no record of the emendations was ever kept -- that's why I rarely use the PG texts, other than DP releases, which I have a lot more faith in. (There is also the Frankenstein "monster" debacle, which I've shared here in the past.)

3) Converting non-ASCII characters to ASCII equivalents (e.g., removing accents from characters). Proper reproduction of the original characters used is *critical*. Any PG text which "ASCII-ized" all characters is automatically broken and must be replaced with a remake from, or by reference to, an original source copy.

(Today I'd add a fourth requirement: retain page scans for all new works, and no longer accept works which don't have page scans to go along with the texts. The scans are needed a) to verify authenticity, b) to provide guidance for those who plan to use the texts, such as for presentational purposes, and c) to help properly fix any claimed errors. Internet Archive will gladly archive the page scans if PG's servers don't have the space and bandwidth to handle them.)

I'm not alone in this sentiment, Michael. I talk to others who did *not* volunteer for PG because of the clearly wrong policies which PG established early on (and "not establishing policies" is a de facto policy).
One must not only count the volunteers, one must also count the non-volunteers who considered volunteering.

To answer your question, no book would need to be replaced *if* it were processed *right* in the first place. We know enough today as to what is necessary to properly make digital text versions of books, and by and large DP is following best practice. Consider the early years to be experimental. (Most engineers will tell you that the first and even second versions of anything are "learning" -- you learn from them, and then throw them away. Stable design is not usually reached until at least the third version of anything.)

(You also ask how people would feel if their work would be "thrown away" after a decade. Well, how do people feel when their work is mangled in subsequent PG editions by the new emendations of others, as appears to have happened to My Ántonia, which, as I noted above, is so terribly mangled that it must be replaced?)

Many people have enjoyed the texts which PG has produced, buggy as many of the early ones are, so it's not as if the early work PG produced was wasted. It was not. Just like anything in the world, the texts have a life-span. One doesn't look back; one looks ahead to the future.

I see redoing the early corpus of PG texts as a great opportunity, and not something to be avoided. Properly done, this redo project will produce texts which should have a very long shelf life, if not an indefinite one. DP should take the lead in this effort to redo the early PG classics, since these are the most popular books in the PG corpus.

Jon