
Instead of worrying about perfection, we would be better advised to fix the many texts which are or have become unreadable. It is also uncomfortable, when there are several translations of a work with the same title and an anonymous translator, to have the publisher routinely or randomly removed. Also there are many DOS texts with accents that are hence unreadable. Any code page should be acceptable? Maybe, but . . . Also, although there are explicit directions for submitting a text, correcting or updating one, even one I contributed, apparently has no explicit provision. Also, apparently at random, a little preamble I have added to help the reader identify the text or its possible shortcomings is removed. Many texts have no unique provenance, as MH has advised, but that is no reason for removing any hint of provenance when one is supplied by a contributor.

nwolcott2@post.harvard.edu
Friar Wolcott, Gutenberg Abbey, Sherwood Forrest keeping the inkpots full.

And herein lies some of the problem. I'm a college professor, and I recently earned my PhD. I would have had a hard time getting a text past my professors without being able to document who published it. I would have a hard time making a citation to a document with no pages. I would be very annoyed with a student who just pointed to something on the net that had no provenance whatsoever- even many pieces of ephemera have provenance.

I don't think this is a matter of fuddy-duddy professors who just don't understand how wonderful e-books are; I think the very concept of e-books as it now stands, while excellent for casual readers or people who simply want to educate themselves, is deeply flawed. When I am citing a text, I cannot refer to a vague document. I need to know EXACTLY when the original was published, who published it, and where, since there are variant texts out there. Even a single word change that might have occurred in the copying process could change the meaning of a vital sentence.

PG is wonderful- but as a student and a teacher, I don't think that most cybertexts provide the citability that is so important for academics. If PG were the only source in the world for vital texts, that would be one thing- but it isn't. I love PG, and I send students to it all the time- but only for the purpose of reading. I would not send a student to a PG text in order to make a citation. I have no way of knowing where many of the texts came from, whether the edition copied was a variant on the original, what page the information appeared on in the original copy, or anything else. In the social sciences and liberal arts, these things are very important. It is the soul of how we check for plagiarism, understand the history of a work, and make specific references.

PG is great for when I want to read a Tom Swift book or understand the human genome - but it doesn't help me if I need to explain the migration in the ideas of Franz Boas over time and through editions of his works or examine the changes between editions of Dust Tracks on a Road.

-----Original Message-----
From: gutvol-d-bounces@lists.pglaf.org [mailto:gutvol-d-bounces@lists.pglaf.org] On Behalf Of Norm Wolcott
Sent: Friday, November 12, 2004 10:22 AM
To: Project Gutenberg Volunteer Discussion
Cc: Norm Wolcott
Subject: [gutvol-d] Perfection

On Fri, Nov 12, 2004 at 10:48:18AM -0500, Her Serene Highness wrote:
> And herein lies some of the problem. I'm a college professor, and I recently earned my PhD. I would have had a hard time getting a text past my professors without being able to document who published it. I would have a hard time making a citation to a document with no pages. I would be very annoyed with a student who just pointed to something on the net that had no provenance whatsoever- even many pieces of ephemera have provenance. I don't think this is a matter of fuddy-duddy professors who just don't understand how wonderful e-books are; I think the very concept of e-books as it now stands, while excellent for casual readers or people who simply want to educate themselves, is deeply flawed. When I am citing a text, I cannot refer to a vague document. I need to know EXACTLY when the original was published, who published it, and where, since there are variant texts out there. Even a single word change that might have occurred in the copying process could change the meaning of a vital sentence. PG is wonderful- but as a student and a teacher, I don't think that most cybertexts provide the citability that is so important for academics. If PG were the only source in the world for vital texts, that would be one thing- but it isn't. ...

My Ph.D. in Information Transfer is from 1993. I've taught Internet stuff and a whole lot of other things since 1988. I went to college in 1983, and never left, holding faculty positions since 1991 - in short, I'm very much a professional academic. Here are some of my experiences related to electronic texts:

- I *have* entirely electronic articles cited in my academic vita (http://petascale.org/vita.html). Nobody (none of my deans, etc.) has even raised an eyebrow. Today, like always, peer review and the reputation of the publication are what matter, not whether it was printed.

- I have refused paper submissions of any assignments from my students for years (http://petascale.org/paperless.html), including master's theses and doctoral dissertations. Again, this is just not a problem. At the end of the degree process, we (the committee) sign a piece of paper and the student submits copies of the printed document to the library. Then, a PDF or similar goes to various archives and Web pages, and is available for widespread free access.

- I was recently appointed Editor of the standards document series in the Global Grid Forum (http://www.ggf.org), which publishes an all-electronic document series modeled after the RFC series published by the IETF (which is much older, and is essentially the set of standards that define the Internet).

- Every citation format (APA, MLA, Chicago, etc.) specifies how to cite documents which are not printed. For the most part, they distinguish between ephemeral stuff like email messages and more permanent stuff like online journal articles. This is still difficult, and many people cite inappropriate items as though they were published documents rather than things like personal communication, changeable Web pages, etc. But it's certainly done, and it's done in journal articles (print & electronic), standards documents, books, newspaper articles, etc.

Here's one of many good pages describing electronic citation: http://owl.english.purdue.edu/handouts/research/r_docelectric.html

In short, I'm happy to say that my experience is completely different from yours. Moreover, unlike you, I seem to have specific documents, citations and processes to back up my impressions, while you haven't provided any.

Certainly it's the case that some academic fields rely more on the exact words of a particular printed item. Hermeneutics is an example, as are some other historical, classical & humanities disciplines. But to dismiss "academics" as being unable to deal with online content (as the subject/object of research, as support for research, or as the published outcome of research) is certainly an overstatement, and inconsistent with my experiences and those of my academic peers.

-- Greg

-----Original Message-----
From: gutvol-d-bounces@lists.pglaf.org [mailto:gutvol-d-bounces@lists.pglaf.org] On Behalf Of Greg Newby
Sent: Friday, November 12, 2004 1:47 PM
To: Project Gutenberg Volunteer Discussion
Subject: Re: [gutvol-d] Perfection

> On Fri, Nov 12, 2004 at 10:48:18AM -0500, Her Serene Highness wrote:
> > And herein lies some of the problem. I'm a college professor, and I recently earned my PhD. I would have had a hard time getting a text past my professors without being able to document who published it. I would have a hard time making a citation to a document with no pages. I would be very annoyed with a student who just pointed to something on the net that had no provenance whatsoever- even many pieces of ephemera have provenance. I don't think this is a matter of fuddy-duddy professors who just don't understand how wonderful e-books are; I think the very concept of e-books as it now stands, while excellent for casual readers or people who simply want to educate themselves, is deeply flawed. When I am citing a text, I cannot refer to a vague document. I need to know EXACTLY when the original was published, who published it, and where, since there are variant texts out there. Even a single word change that might have occurred in the copying process could change the meaning of a vital sentence. PG is wonderful- but as a student and a teacher, I don't think that most cybertexts provide the citability that is so important for academics. If PG were the only source in the world for vital texts, that would be one thing- but it isn't. ...

> My Ph.D. in Information Transfer is from 1993. I've taught Internet stuff and a whole lot of other things since 1988. I went to college in 1983, and never left, holding faculty positions since 1991 - in short, I'm very much a professional academic. Here are some of my experiences related to electronic texts:
>
> - I *have* entirely electronic articles cited in my academic vita (http://petascale.org/vita.html). Nobody (none of my deans, etc.) has even raised an eyebrow. Today, like always, peer review and the reputation of the publication are what matter, not whether it was printed.

Agreed. I have no reason to doubt you. However- you did say that your work is in Information Transfer, right? Do you think there might be a teensy bit of difference between a reference by an Information Transfer academic that is from an electronic journal and was published for other academics in that and related fields, and a citation of, say, Emily Dickinson's poetry without information as to when the book it was taken from was published- considering that it is now known that many earlier copies of Dickinson used incorrect punctuation because previous editors messed around with them?

I'd have no problem accepting or using a citation of the US Census online- I've done it. I've used citations of NYS divorce and sexual offense law from online sources- no problem. All of those are frequently updated. But a citation of an out of print book in anthropology, English literature, the hard sciences, et al, which might very well not be correct in its information- that will be problematic.

I would be very happy to see Boas online. Eventually I hope to track down an out of copyright version of his writings and scan it for PG. I'd like to do the same with Zora Neale Hurston, Ruth Benedict, and quite a few other people.

However- and this is the big 'however'- while these texts would be useful for casual and serious non-academic readers, and even for many academic readers as a point of reference, their usefulness would be seriously impaired without info as to who originally published the books and when. Boas' works vary according to edition- therefore, knowing which edition you are reading can matter if you are doing research on his theories. If I were doing online research in a general fashion on the history of anthropology, it wouldn't matter. If I were writing a scholarly work, it would. It would also matter if there was no pagination.

Again- I'm not talking about materials produced in the past twenty years. I'm talking about historical materials. They are not entirely electronic.

Another example- I'm tutoring a 15 year old about the incidents that led up to WW2. We go online and find the Treaty of Versailles. He can cite it- not only is it a well-known document (making it easy to check for errors and lacunae), but each section of the treaty is numbered. It's easy for him to refer to Article 15 in a paper, and easy for a teacher to find the section in an online document. I would encourage him to use it in class, and to do an internet citation- no problem. But if he were to try to cite Winston Churchill's autobiography from an online site (not that it's online) or Mein Kampf (which probably is), he'd run up against a problem. In chapter 5 there might be a very quotable sentence- but what my student doesn't know is that this sentence was changed in later editions. And there's no page number- does he tell his teacher to read the entire chapter to find a sentence that won't be there in a later edition?

The last time I looked at PG (a few weeks ago) I found it very easy to read books if I wanted to read the whole text. If I wanted to find chapters or pages I had hard luck- I had to scan through whole documents. You don't have to believe me. Just find this quote. It's from The Koran.

"And thou takest vengeance on us only because we have believed on the signs of our Lord when they came to us. Lord! pour out constancy upon us, and cause us to die Muslims."

It's in Sura VII. I have no doubt that you'll find it- but it will take you quite a while to do so with no page numbers and no way to go to each section separately. As a teacher, I don't have time to read half The Koran (that's a hint, by the way) to find this one quote on PG. I can however find websites that will make the search much easier for me, and will provide some info on the translation. After all, I have no idea who JM Rodwell was, or whether his translation of The Koran is the definitive English version, or why his translation was chosen- other than that his book was out of copyright. From my point of view, that's a red flag itself. If this translation is so superb, why isn't it still being used- or is it?

> to the library. Then, a PDF or similar goes to various archives and Web pages, and is available for widespread free access.
>
> - I was recently appointed Editor of the standards document series in the Global Grid Forum (http://www.ggf.org), which publishes an all-electronic document series modeled after the RFC series published by the IETF (which is much older, and is essentially the set of standards that define the Internet).
>
> - Every citation format (APA, MLA, Chicago, etc.) specifies how to cite documents which are not printed. For the most part, they distinguish between ephemeral stuff like email messages and more permanent stuff like online journal articles. This is still difficult, and many people cite inappropriate items as though they were published documents rather than things like personal communication, changeable Web pages, etc. But it's certainly done, and it's done in journal articles (print & electronic), standards documents, books, newspaper articles, etc.
>
> Here's one of many good pages describing electronic citation: http://owl.english.purdue.edu/handouts/research/r_docelectric.html

I'm aware of that. As I stated above, I've used electronic citations, even when professors raised eyebrows. But you are not dealing with my particular statements, which have nothing to do with the citation of contemporary documents and ephemera, or with copies of documents that make searches for particular passages much easier for readers and writers. I was very specific in my criticism- and since you have a degree in Information Transfer and have taught Library Science, it ought to be of concern to you, too.

But being an expert in Information Transfer is not the same thing as doing research using out of print documents. Your business is making them readable and accessible, which is important. From where I stand that is important too, but less important than being able to consistently find passages, and checking to see the differences according to editions. Nietzsche's work, for instance, was butchered by his sister. There are conflicting copies of his work floating around. When his works were copied for Project Gutenberg, did someone go for an out of copyright copy that is definitive, or one that his sister chopped up? Did that matter, or was it just more important to get a copy up?

Cattle ranchers, butchers, and chefs all deal with meat. That doesn't make a chef an expert on cattle feed or a butcher an expert on how to best prepare beef in orange sauce. We may both be involved in academia, but our concerns regarding information technology might be very different- that doesn't mean that one or both of us are idiots, or that I'm a Luddite, or that you're a geek with no appreciation for what's inside the books you put up.

> support for research, or as the published outcome of research) is certainly an overstatement, and inconsistent with the experiences of me and my academic peers.
> -- Greg
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> http://lists.pglaf.org/listinfo.cgi/gutvol-d

A simple implementation of the id="" HTML attribute would solve the issues regarding quoting a particular sentence or paragraph... for example: http://kodekrash.com/project/btw_ufs.html#p191 -- will put you right at a paragraph talking about learning multiplication before cube roots (in Booker T Washington's autobiography).

If we had decent master versions of the texts, such features would be child's play... I _will_not_ go into the "master versions" rant again tho.

-- James
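
To make the id="" suggestion concrete, here is a minimal Python sketch of how paragraph anchors might be generated. The function name add_paragraph_ids and the "p" + number scheme are assumptions modeled on the #p191 fragment in the URL above; this is not how that page was actually produced, and a real converter would likely use a proper HTML parser rather than a regular expression.

```python
import re


def add_paragraph_ids(html, prefix="p"):
    """Give each attribute-free <p> tag a sequential id: p1, p2, ... (hypothetical helper)."""
    counter = 0

    def number(match):
        # Called once per matched <p>, in document order
        nonlocal counter
        counter += 1
        return f'<p id="{prefix}{counter}">'

    # Only plain "<p>" tags are matched; tags that already carry attributes are left alone
    return re.sub(r"<p>", number, html)


sample = "<p>First paragraph.</p>\n<p>Second paragraph.</p>"
print(add_paragraph_ids(sample))
```

With ids in place, a citation could point at something like book.html#p2 and land on the second paragraph, provided the numbering stays stable between repostings of the same text.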

On Fri, 12 Nov 2004, Norm Wolcott wrote:
> randomly removed. Also there are many DOS texts with accents that are hence unreadable. Any code page should be acceptable? maybe but. . .

We have a couple of people who are fixing up and reposting older files (which is often more involved than simply changing character encoding). A little while ago, I heard over 400 etexts had been reposted, so it's more than that by now. Are you volunteering to help?

Andrew
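
For the simple case mentioned above, where only the character encoding needs changing, a rough Python sketch might look like the following. The function name dos_to_utf8 is hypothetical, and the choice of code page is an assumption: cp437 and cp850 are common DOS code pages, but which one a given etext actually used has to be determined per file, and picking the wrong one will silently produce wrong characters.

```python
def dos_to_utf8(raw, codepage="cp437"):
    """Re-encode bytes from a DOS code page to UTF-8 (hypothetical helper).

    The default code page is an assumption; it must match the one the file was written in.
    """
    text = raw.decode(codepage)   # interpret the bytes using the DOS code page
    return text.encode("utf-8")   # write the same characters back out as UTF-8


# Byte 0x82 is "e with acute accent" in both cp437 and cp850
sample = b"caf\x82"
print(dos_to_utf8(sample).decode("utf-8"))
```

This covers only the re-encoding step; as noted above, fixing up an older file is often more involved than that.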
participants (5)
- Andrew Sly
- Greg Newby
- Her Serene Highness
- James Linden
- Norm Wolcott