
-----Original Message-----
From: gutvol-d-bounces@lists.pglaf.org [mailto:gutvol-d-bounces@lists.pglaf.org] On Behalf Of David Newman
Sent: Saturday, November 13, 2004 4:31 AM
To: gutvol-d@lists.pglaf.org
Subject: [gutvol-d] Scholarly use of PG

As a credentialed conflict avoider, I've been loath to stick my head into this fray. Indeed, this battle about meeting the needs of academia appears to be waged at times with an ideological fervor to rival that of the recent US election.

It seems to me that the fervency with which people approach this issue has made it difficult in some cases for the arguments to follow a path towards resolution. It is perhaps also complicated by the wide assortment of changes being proposed to remedy the perceived problems.

Some arguments for change suggest that PG should direct its energies towards making its library suitable for scholars by including more information in the files, particularly pagination and provenance, presumably packaged with XML. I have no problem with including such information. However, I don't think it should be required of all texts, nor do I believe that it really solves the scholarship issue. Including page scans _would_, to the degree that a solution is possible, and requires approximately 0% extra work for most of our valiant volunteers. And PG has made it clear that this is acceptable, and has already done so for some projects.

I feel that Marcello gave the most persuasive and concise summary of the situation, and I didn't notice any overt disagreement.

Marcello Perathoner wrote:
The best value for Academia (and the least work for us) would be just to include the page scans. Any transcription you make will fall short of the requirements of some scholar. I think we should use our time for producing more books for a general audience instead of producing Academia-certified editions of them.
HSH's comments justify such an approach. Her Serene Highness wrote:
I need to know EXACTLY when the original was published, who published it, and where, since there are variant texts out there. Even a single word change that might have occurred in the copying process could change the meaning of a vital sentence.
Of course, there is a simple, if unsatisfactory, answer to all these questions for PG texts: they were published by PG, on the PG website, and each file states when it was published. Each work we publish is the "PG variant" of that text.

As an academic, I find it dishonest and unhelpful for a scholar to cite a physical volume when the volume they consulted is an electronic edition. It is virtually impossible to guarantee that "even a single word change" was not introduced in the transcription process. Even with DP's careful processes, I would not wager that most of our books enter PG completely error free (or correction free, for that matter).

** I would find it dishonest also. I think it is very important for people to give correct citations. However- and this is a big however- PG is not 'publishing' books. It's copying them. There is no PG publishing house that is making decisions on whether something is worth publishing or not. PG acts as a repository- a library.

Paper publishers cannot guarantee that each word on the written page is exactly as written by the author. However, with books that are well known or historically important, scholars can often compare published texts with authors' notes in order to see the variants. Many of the books on PG are obscure. We are given the name of a book and an author, but there is no book to be looked at. If these texts are important- and I would argue that many obscure texts are, if only for historical reasons- it is important to have copies of the scans. In some cases, PG may be the only place where someone can find particular texts.

Textual clues do not live only in words. A book comes alive in typeface, and in word placement on a page. James Joyce didn't just write words to be read- he placed them on pages in ways that told the reader how to interpret them.
Taking a book out of context- the context of the page- when that book was written prior to the computer revolution is like ignoring how many paintings were paired with their frames by the painters themselves. Saving a book while divorcing it from its index, illustrations, typeface, and so on is not 'saving' it. It's a decontextualization.

A perfect example would be movie remakes. There are many different versions of 'A Christmas Carol', several of them in modern dress. Many of them use pretty much the same exact script. Does that make them the same? Why do people prefer even an old, scratched-up and faded copy with Alastair Sim to a nice shiny new version, even if the new film is a shot for shot remake? A film is more than actors spouting lines. Film is every aspect that goes into it, even beyond what Dickens thought up. There are times when we want nothing more than the words of Dickens, and there are times when we want the thrill of seeing characters come to life before us in front of our physical eyes.

A book may be perfectly good reading material- but an ebook printed in Courier (which is very hard to read), perhaps missing its original illustrations, without an index that shows the manner in which the author's or editor's mind worked- is no longer the original book.

As a scholar I like working from original materials. An original material may be on a computer screen- that's fine by me. An original material might be enhanced by being online- many versions of The Bible are, for instance, and I received great joy recently while reading what was essentially a book that gave a key to Silverlock- it worked better online than it ever could have on paper. But PG is not publishing or storing original texts. It's working with old ones.

I recall the cry that vinyl was going the way of the dinosaur- yet it has not. In fact, the MP3 player is the new vinyl- for the first time in years, there are cost effective '45s', courtesy of Napster and other companies.
I can hear snippets of a song before buying, just as my mother once did in record shops. However, Napster technology is not better in the long run than a record- CDs and computer memory degrade at an alarming rate.

Books aren't dead either, and people who think books are about finding passages in less than 25 seconds are missing the point of why people read- in the same way that people who drink coffee to get revved up often don't understand why tea drinkers make elaborate ceremonies around a caffeinated beverage. People read because they want a total experience- computers don't feel like paper. They don't smell. The text is usually flat and more difficult to read. Some of this will change over time- but not all of it, thank the Lord.

I want books to be available to the public in ways that they have never been before, and so I support PG. But it doesn't have the credibility of a real library or publishing house, because it doesn't publish (copying things and leaving out some of the vitals doesn't constitute publishing in most people's minds, or at least not in a good way, no matter what info techies might want to think) and it doesn't store (libraries don't cut the covers and publishing info off their books to make more room on the shelves, they include books of criticism, and they have technologies for cross referencing- they also have people called librarians who can help people refine their interests and find books that might be of use to them. So do bookstores. Even Barnes and Noble, to some extent).

I think some people here want to store books. That's nice, as far as it goes. Whether they understand how people use books or why- well, I seriously doubt that some people here have thought about that. It's like MS Word, which ignores that people write more complex things than business letters. Its vocabulary and understanding of grammar are seriously stunted, and it's hellish for anyone who wants to edit anything longer than two pages.
Does it process words efficiently? Yes. But it's a fucking bad word processor and has none of the grace of WordPerfect. That most people are fine with it shows how few people actually write or edit for the joy of doing so, which is fine- but its incompatibility with WP and vice versa makes life tough on those who do.***

Page scans allow for an additional layer of safety for any scholar concerned about the adherence to a given print edition, though a certain level of trust in the provider is still required.

Thus, while I hope that PG's holdings are as accurate as possible, it would also be my hope that scholars using PG would cite PG. Evidently this is not always the case.

Michael Hart wrote:
I've also heard that many of those who complain, actually use our eBooks in secret, and ONLY want the provenance so they can steal them without giving credit where credit is due.
This suggests to me two things.

1) We can include page scans and information about provenance, _when available_, with the files so that academics can feel confident in the reliability of those PG holdings. Not so that the original sources can be dishonestly cited, but to provide the necessary data for certain scholars to confidently cite PG's edition. We can point to this in our documentation to enhance our scholarly credibility.

2) We can prominently suggest an appropriate style of citation of works in PG's holdings. (I've seen this done with other digital collections.) Perhaps if the citation style also takes into account the original source, some otherwise reluctant scholars would be appeased.

Is this something we can all agree on?

--
David Newman
www.davidnewman.info

_______________________________________________
gutvol-d mailing list
gutvol-d@lists.pglaf.org
http://lists.pglaf.org/listinfo.cgi/gutvol-d