
Michael Hart wrote:
Michele wrote:
And Michael, I think you are brilliant in many ways, but you don't even want to provide the amount of information required of a junior high school student writing a social studies paper, let alone a scholar, and I think that's a shame. I shudder to think what you believe scholars do, and why, if you love books so much, you have so high an antipathy for them.
It's not that I don't believe in this kind of information, it's that I didn't want to provide a different Project Gutenberg eBook for each and every single paper edition out there, and then have to keep canonical errors [sic] in them for all time.
I wanted to create a "critical edition" that combined corrections and items from various editions, and we have always supplied the necessary information for citing our eBooks on request, which has apparently never caused any problem for either student or teacher.
Now I think this is getting us to the core of the various issues being discussed of late.
In the early days of PG, when disk space was ultra-expensive (and removable storage was of limited capacity), when volunteers were few, and when the Internet did not yet exist (it came into being for the ordinary Joe only in the late 1980s, with very slow modem access), the idea of PG focusing on producing a "critical edition" of important public domain works for casual reading made a whole lot of sense. However, I believe things have changed so much that this focus needs to be reevaluated.
Let's look at the situation today, and tomorrow:
(o) Disk space is getting so cheap and of such high capacity that we can now consider it economical for text repositories to hold the high-density original page scan images for *one million books*. When the texts are in high-quality XML, we can hold *billions* of textual works, with no problem. In a decade, we can begin talking about *trillions* of textual works (big and small). There's no longer an issue of which published edition to pick to "represent" a particular Work -- we can have them all online.
(o) More and more people have high-speed access to the Internet, allowing fast downloading of books, as well as enabling the technologies to mobilize large numbers of avid volunteers to produce high-quality texts (eventually in XML markup) using Internet-enabled systems such as Distributed Proofreaders.
And tomorrow? Here's what I see:
(o) We will see Distributed Proofreaders greatly improve in both quality of production (high-quality XML output) and capacity. It will also be "clonable" by other groups dealing with specific types of publications. I believe we'll see over 1000 major books PER DAY being completed by DP and its various "clones" throughout the world, not to mention innumerable texts of other types. That's a thousand book-length works PER DAY worldwide.
Thus, the technical limitations that once made "critical" editions necessary are no longer an issue. Many works were only issued once anyway, so the etext version *is* the critical edition, but some works were issued in various editions over time -- all of them can now be scanned and placed side-by-side online. Let the end-user decide which one to access, based on their own investigation or on the recommendations of others (advanced systems can be set up to aid in selection -- PG itself can recommend which version the reader should consider first).
It is thus important to preserve the full source information, since end-users will need that information to know what they are getting. If an earlier, more faithful version of the Work is not in the PG system (how would they know unless the versions of the Work already in the system have complete source information?), they can suggest which edition to convert through DP. Ultimately, I hope that PG will cover almost all first and early editions of important works.
Another aspect of this issue is the submission of works to PG which are based on original Public Domain works, but which have been substantially modified by the submitter acting as editor, in essence creating a new edition of the Work. For example, my publishing company's version of Sir Richard F. Burton's "Kama Sutra of Vatsyayana", first published in the 1880s, has been significantly edited and modified, but not expunged in any way: no content has been removed, though some has been moved around to aid logical organization, and I've added several annotations to clarify things which Burton inexplicably did not. The publisher's introduction to this book makes clear what changes were made to the text.
PG should certainly accept altered and composite works such as this, but it is important that the metadata state clearly that this is an "altered" work from the source, or something to that effect, as well as stating what public domain source(s) were used to create the work. (Ideally, PG would have these source works in the PG Library, with the original page scans and the faithful etext versions alongside, so the user of the altered/composite etext will be able to determine, if they want, the alterations which were made to create it.)
In summary, I believe PG is making a big mistake going down the road of being a "gatekeeper" or "original publisher" of some sort. It should concentrate on what it does best: locate/acquire, copyright-clear, and place online Public Domain (and Creative Commons) texts in high-quality form. Let others do the vetting and recommendations for what should be read. Let PG make it ALL available for free to everyone, everywhere and at all times.
Jon Noring