RE: [gutvol-d] my impression

See the introduction to PG etext #26, Paradise Lost, identified there as "the oldest etext known to Project Gutenberg (ca. 1964-1965)"
Dick Adicks

Dick wrote:
See the introduction to PG etext #26, Paradise Lost, identified there as "the oldest etext known to Project Gutenberg (ca. 1964-1965)"
Wow!
Here's the Intro of that text detailing the source of the original etext. Undoubtedly Michael Hart wrote this introduction since it is nicely right-justified. <smile/> A comment and question below.
*******************************************************************
Introduction (one page)
This etext was originally created in 1964-1965 according to Dr. Joseph Raben of Queens College, NY, to whom it is attributed by Project Gutenberg. We had heard of this etext for years but it was not until 1991 that we actually managed to track it down to a specific location, and then it took months to convince people to let us have a copy, then more months for them actually to do the copying and get it to us. Then another month to convert to something we could massage with our favorite 486 in DOS. After that is was only a matter of days to get it into this shape you will see below. The original was, of course, in CAPS only, and so were all the other etexts of the 60's and early 70's. Don't let anyone fool you into thinking any etext with both upper and lower case is an original; all those original Project Gutenberg etexts were also in upper case and were translated or rewritten many times to get them into their current condition. They have been worked on by many people throughout the world.
In the course of our searches for Professor Raben and his etext we were never able to determine where copies were or which of a variety of editions he may have used as a source. We did get a little information here and there, but even after we received a copy of the etext we were unwilling to release it without first determining that it was in fact Public Domain and finding Raben to verify this and get his permission. Interested enough, in a totally unrelated action to our searches for him, the professor subscribed to the Project Gutenberg listserver and we happened, by accident, to notice his name. (We don't really look at every subscription request as the computers usually handle them.) The etext was then properly identified, copyright analyzed, and the current edition prepared.
To give you an estimation of the difference in the original and what we have today: the original was probably entered on cards commonly known at the time as "IBM cards" (Do Not Fold, Spindle or Mutilate) and probably took in excess of 100,000 of them. A single card could hold 80 characters (hence 80 characters is an accepted standard for so many computer margins), and the entire original edition we received in all caps was over 800,000 chars in length, including line enumeration, symbols for caps and the punctuation marks, etc., since they were not available keyboard characters at the time (probably the keyboards operated at baud rates of around 113, meaning the typists had to type slowly for the keyboard to keep up).
*******************************************************************
Am I right to assume that this etext was originally punched in for lexical (text) analysis? That time frame corresponds to when the Brown Corpus was started.
What other complete texts of books were rumored (or known to be) "digitized" (such as it is on punch cards) in the 1960's and early 70's?
Thanks.
Jon

Hi Everybody,
I hate to disappoint everybody, but there are even older "etexts" than this! Though I have to admit that back then they were not called etexts! They were called corpora. They were not stored on disks or other mass storage systems, but on punch cards and such.
"ebooks" have been around since the mid 80s. They were programs that were dedicated to one book and its display. I have "The Hitchhiker's Guide to the Galaxy" somewhere in a box. Anybody remember the Apple Newton (also mid 80s)? It also had what would be termed ebooks today. As a matter of fact, I used to read the first PG etexts on my Newton.
Just my 2 Euro cents worth
Keith.
On 10.01.2006 at 17:10, Jon Noring wrote:
Dick wrote:
See the introduction to PG etext #26, Paradise Lost, identified there as "the oldest etext known to Project Gutenberg (ca. 1964-1965)"
Wow!
Here's the Intro of that text detailing the source of the original etext. Undoubtedly Michael Hart wrote this introduction since it is nicely right-justified. <smile/> A comment and question below.
*******************************************************************
Introduction (one page)
This etext was originally created in 1964-1965 according to Dr. Joseph Raben of Queens College, NY, to whom it is attributed by Project Gutenberg. We had heard of this etext for years but it was not until 1991 that we actually managed to track it down to a specific location, and then it took months to convince people to let us have a copy, then more months for them actually to do the copying and get it to us. Then another month to convert to something we could massage with our favorite 486 in DOS. After that is was only a matter of days to get it into this shape you will see below. The original was, of course, in CAPS only, and so were all the other etexts of the 60's and early 70's. Don't let anyone fool you into thinking any etext with both upper and lower case is an original; all those original Project Gutenberg etexts were also in upper case and were translated or rewritten many times to get them into their current condition. They have been worked on by many people throughout the world.
In the course of our searches for Professor Raben and his etext we were never able to determine where copies were or which of a variety of editions he may have used as a source. We did get a little information here and there, but even after we received a copy of the etext we were unwilling to release it without first determining that it was in fact Public Domain and finding Raben to verify this and get his permission. Interested enough, in a totally unrelated action to our searches for him, the professor subscribed to the Project Gutenberg listserver and we happened, by accident, to notice his name. (We don't really look at every subscription request as the computers usually handle them.) The etext was then properly identified, copyright analyzed, and the current edition prepared.
To give you an estimation of the difference in the original and what we have today: the original was probably entered on cards commonly known at the time as "IBM cards" (Do Not Fold, Spindle or Mutilate) and probably took in excess of 100,000 of them. A single card could hold 80 characters (hence 80 characters is an accepted standard for so many computer margins), and the entire original edition we received in all caps was over 800,000 chars in length, including line enumeration, symbols for caps and the punctuation marks, etc., since they were not available keyboard characters at the time (probably the keyboards operated at baud rates of around 113, meaning the typists had to type slowly for the keyboard to keep up).
*******************************************************************
Am I right to assume that this etext was originally punched in for lexical (text) analysis? That time frame corresponds to when the Brown Corpus was started.
What other complete texts of books were rumored (or known to be) "digitized" (such as it is on punch cards) in the 1960's and early 70's?
Thanks.
Jon

On 1/11/06, Keith J. Schultz <schultzk@uni-trier.de> wrote:
Hi Everybody,
I hate to disappoint everybody, but there are even older "etexts" than this! Though I have to admit that back then they were not called etexts! They were called corpora. They were not stored on disks or other mass storage systems, but on punch cards and such.
According to the intro, etext #26 was probably entered on cards in 1964-65. The Brown Corpus (I'm guessing 1962, since the Wikipedia article doesn't really say) didn't really include etexts, since it consisted of 2000-word samples, not entire texts. Given the memory size and cost of early computers, and the fact that Wikipedia says the "Brown Corpus pioneered the field of corpus linguistics", I'd like some evidence that there were older etexts.
participants (4)
- David Starner
- Dick Adicks
- Jon Noring
- Keith J. Schultz