
THE HISTORY OF PROJECT GUTENBERG

It seems I have been remiss in keeping everyone up to date on the history of Project Gutenberg over the past few years, so I am taking this opportunity to write A Brief History of Project Gutenberg in several segments that will, hopefully, make amends for this lack on my part.

PART ONE: THE FIRST 10 YEARS 1971-1980

In terms of actual page production, many people dismiss the opening decade of Project Gutenberg nearly completely. Now in terms of space, the published eTexts, as we called those at the time, will all fit on a modern floppy disk. Because of stringent storage allocations our eText files on the million dollar mainframe were just barely allowed.

The struggle to put even these small files online was enormous, as it was a totally revolutionary idea to put up a file for a non-predetermined time period. This idea of something an entire future could download had never been brought up, and thus it was VERY hard to get permission to post even a file as small as the Declaration of Independence, because it was going to take up permanent space on the computer.

Files of the following list were perhaps the first inkling of a kind of permanence the early Internet pioneers did not consider:

Dec 1979  Abraham Lincoln's First Inaugural Address
Dec 1978  Abraham Lincoln's Second Inaugural Address
Dec 1977  The Mayflower Compact
Dec 1976  Give Me Liberty Or Give Me Death, Patrick Henry
Dec 1975  The United States Constitution
Nov 1974  Gettysburg Address, Abraham Lincoln
Nov 1973  John F. Kennedy's Inaugural Address
Dec 1972  The United States Bill of Rights
Dec 1971  The United States Declaration of Independence

These first 9 files were collected into all7011.txt and all7011.zip for easy redistribution in upper and lower case in later years. The original files were all upper case, as there was no lower case on the early machines we were using at the time.

You may see that we skipped two years between the US Bill of Rights and the US Constitution; we were originally going to try to include the complete Constitution in just a year after the Declaration, but we were told that would take too much space, and we were given just enough space for the Bill of Rights. The next year we asked again, but room was still very scarce, and so we asked again the next year and the year after. By then I was able to make a convincing argument that waiting any longer might delay it so long that people wouldn't have access to it long enough before the Bicentennial year of 1976, so we finally got room at the end of 1975.

This may not sound very exciting to you from 30 years later, but it was VERY exciting to us, being able to put these files online for a whole country to use during the United States Bicentennial.

We finished out the decade with more of those "Freedom Celebration" documents, as they were called, which were placed on the walls of a variety of schools, malls, etc., during this period.

During the period the greatest struggle was just to talk operators, even those that were very good friends, into giving us enough space to store anything but the smallest files. It was one thing to have $100,000,000 in "computer money" that could be used to run programs and send emails, but it was quite another thing to be granted space to store files that people from around the country would download.

Here's just one early example:

When I completed the Declaration of Independence, I wanted to email it to everyone on the Net [DARPANet, as we called it], but I found, to my great surprise, that if I had done this, even with such small files as the Declaration of Independence [5K], it would have created a complete network crash, since most of our wires were 113 baud, or 11 characters per second.
Luckily, I asked for help in sending it, and avoided becoming quite well known as the first person to bring the Net to its knees; and a "Morris Worm" would have only been an asterisk, and so would I!!!

In the end we simply posted a message to what later became comp.gen so people could get the file on request. My recollection is that 6 people downloaded it, other than the other four on our site, so the greatest penetration would have been about 10%. . .which sounds big by today's standards, but I had been hoping for more.

A word about the computer operators of the day: we used to joke in many ways that the computer operators were the current priesthood-- you handed in your offering through the stainless steel window, and prayed that they would be worthy enough for the computer to run it.

If you understand this, then perhaps you can also understand how it was the computer operators had so much power. Not only should they be considered as the entire force of computer security of that day, but they could also save you hours, if not days, of time by telling you just where and/or why your program wasn't running.

I was quite seriously lucky that my brother's best friend was the operator from midnight to 8AM, when most of the free computer time was available, and that he gave me the account I used to start Project Gutenberg-- and as lucky that MY best friend became the 8AM-5PM operator.

I should add that even at such an early date, I had help from those anonymous contributors who so often help. In this case I never was able to find out who typed in the first U.S. Constitution versions. I asked and asked, and even though there weren't that many persons, I never could find or thank the one who did it. That version was a print version in what served as a sort of markup of the day, so all I had to do was take out all the markup, backspace/underscores etc. to create a version that looked good onscreen.
If anyone knows, it would still be nice to find out today, and send our thanks; for now I would just like to include a general thanks to all the volunteers who have helped Project Gutenberg over more than 1/3 century.

Anyway, that's the story of the first decade of Project Gutenberg-- and I hope to work up something for the 1980's for next week.

I should add here that even though the Apple II was out, I had none of the kind of money it would have taken to buy one, so my computer ownership starts not in this segment, but in the next one.

Michael S. Hart
Founder
Project Gutenberg

Postscript:

For those interested in counting ye olde Project Gutenberg eBooks-- please note that there is no growth curve for this period; a growth graph would simply be a straight line, 1 title = 1 year, so it is a trivial point to say that at this growth rate it would take ~15,000 years to do ~15,000 titles, and that I would have been dead so long before we ever got to eBook #100 that no one would have remembered.

Hence we do not talk about doubling rates for this period since the years required doubled at the same rate as the index entries did.

Nevertheless, you will, from time to time, see people manipulate an army of statistics in such a way as to include these in patterns of growth, even though it is common knowledge that the earliest growth figures of any such pattern are quite linear. Just look at a curve of the population of the earth for a perfect example. Such curves, if studied in detail, yield a wealth of such growth information.

Sample Moore's Law Projections Based on 1971

Here is an example of what would happen if Project Gutenberg growth projections were started using the 1 item we had in 1971:

        Start   Finish  Total   Total
#####   Year    Year    Years   Doubles   x2^y     =   Grand Total in Year

#1      1971    2001    30      20        1*2^20   =   1,048,576 in 2001
#1      1971    2004    33      22        1*2^22   =   4,194,304 in 2004

Obviously no one ever seriously considered that Project Gutenberg might actually release a million eBooks in 2001, but there were a few examples recently of suggestions that we should have used the 1971 date, and thus the resultant figures listed above when doing our Moore's Law predictions.

I trust at least this specific example has now been put to rest.
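
For anyone who wants to check the figures in the table, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the 18-month doubling period is inferred from the table (30 years giving 20 doubles), and the function names are made up for this sketch rather than taken from any Project Gutenberg tool.

```python
# Illustrative sketch only: function names and the 18-month default are
# assumptions inferred from the table (30 years -> 20 doubles).

def doublings(start_year, end_year, months_per_doubling=18):
    """Whole doubling periods between two years."""
    return (end_year - start_year) * 12 // months_per_doubling


def projected_total(start_count, start_year, end_year, months_per_doubling=18):
    """Start count doubled once per completed doubling period."""
    return start_count * 2 ** doublings(start_year, end_year, months_per_doubling)


if __name__ == "__main__":
    for end_year in (2001, 2004):
        n = doublings(1971, end_year)
        total = projected_total(1, 1971, end_year)
        print(f"1971-{end_year}: {n} doubles, 1*2^{n} = {total:,}")
```

Starting from the 1 item of 1971, this gives the same 20 and 22 doubles and the same grand totals as the table.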

Michael Hart wrote:
Obviously no one ever seriously considered that Project Gutenberg might actually release a million eBooks in 2001, but there were a few examples recently of suggestions that we should have used the 1971 date, and thus the resultant figures listed above when doing our Moore's Law predictions.
[epigraph]
"Contrary to popular claims, it appears that the common versions of Moore's Law have not been valid during the last decades. As semiconductors are becoming important in economy and society, Moore's Law is now becoming an increasingly misleading predictor of future developments."
...
"Indeed, sociologically Moore's Law is a fascinating case of how myths are manufactured in the modern society and how such myths rapidly propagate into scientific articles, speeches of leading industrialists, and government policy reports around the world."
http://firstmonday.org/issues/issue7_11/tuomi/index.html
[/epigraph]
Read that page for the sad truth about Moore's "Law".
My suggestion was to stop using arbitrary data to keep up the illusion of Moore's Law (which, if you had read that page, you would have known never worked even for computers) but to use real data to show that Moore's "Law" does not fit PG production.
The suggestion was to use the real date the project started (1971) instead of your fictitious and arbitrary one (1990).
Of course, using real dates, the idea that PG production followed Moore's Law dies a horrible death.
Even looking at the relatively short period of Nov 2003 to Nov 2004 we can prove in a very simple manner that Moore's Law doesn't hold.
1. In Nov 2003 we had 10000 books.
2. Applying Moore's Law, in Nov 2004 we should have had 10000 * 2 ^ (12/18) = 15874 books.
3. Moore's Law does not hold for PG archive size.
QED
The sad fact is: some people with a marketing person's mind prefer to stick to a phony and proven wrong "Law" because it is such a slick formulation.
Why do we need flashy formulas at all? If we say: "we got 15000 books today" isn't that enough? Don't you have faith in the facts?
-- Marcello Perathoner webmaster@gutenberg.org
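
The projection in step 2 of the message above can be reproduced with a short Python sketch. It assumes the same 18-month doubling period used in that formula and a fractional number of doublings; the function name is hypothetical and chosen only for this illustration.

```python
# Illustrative sketch only: the function name is hypothetical, and the
# 18-month doubling period is the one used in the formula quoted above.

def moore_projection(start_count, months_elapsed, months_per_doubling=18):
    """Projected count after months_elapsed, allowing fractional doublings."""
    return start_count * 2 ** (months_elapsed / months_per_doubling)


if __name__ == "__main__":
    # Nov 2003 -> Nov 2004: 12 months, starting from 10000 books.
    print(round(moore_projection(10000, 12)))
```

This only evaluates the formula 10000 * 2 ^ (12/18); whether the archive matched it depends on the actual Nov 2004 count, which the sketch does not supply.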

Just a simple question. . .how many people believe any of this?

Should I really go through the motions of refuting it again, and again, and again?

As I said privately, offline, I don't think even the speaker believes what he is saying. . . .

Michael

On Thu, 20 Jan 2005, Marcello Perathoner wrote:
Michael Hart wrote:
Obviously no one ever seriously considered that Project Gutenberg might actually release a million eBooks in 2001, but there were a few examples recently of suggestions that we should have used the 1971 date, and thus the resultant figures listed above when doing our Moore's Law predictions.
[epigraph]
"Contrary to popular claims, it appears that the common versions of Moore's Law have not been valid during the last decades. As semiconductors are becoming important in economy and society, Moore's Law is now becoming an increasingly misleading predictor of future developments."
...
"Indeed, sociologically Moore's Law is a fascinating case of how myths are manufactured in the modern society and how such myths rapidly propagate into scientific articles, speeches of leading industrialists, and government policy reports around the world."
http://firstmonday.org/issues/issue7_11/tuomi/index.html
[/epigraph]
Read that page for the sad truth about Moore's "Law".
My suggestion was to stop using arbitrary data to keep up the illusion of Moore's Law (which, if you had read that page, you would have known never worked even for computers) but to use real data to show that Moore's "Law" does not fit PG production.
The suggestion was to use the real date the project started (1971) instead of your fictitious and arbitrary one (1990).
Of course, using real dates, the idea that PG production followed Moore's Law dies a horrible death.
Even looking at the relatively short period of Nov 2003 to Nov 2004 we can prove in a very simple manner that Moore's Law doesn't hold.
1. In Nov 2003 we had 10000 books.
2. Applying Moore's Law, in Nov 2004 we should have had 10000 * 2 ^ (12/18) = 15874 books.
3. Moore's Law does not hold for PG archive size.
QED
The sad fact is: some people with a marketing person's mind prefer to stick to a phony and proven wrong "Law" because it is such a slick formulation.
Why do we need flashy formulas at all? If we say: "we got 15000 books today" isn't that enough? Don't you have faith in the facts?
-- Marcello Perathoner webmaster@gutenberg.org
_______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/listinfo.cgi/gutvol-d