
Sorry if this is a FAQ, or a variation on a FAQ.

Are there any Project Gutenberg databases that show the *original* publication dates (e.g. 1875, 1916) for all or most of the texts? I've created a database (current as of a few months ago) that has info on each book -- author, title, LC classification, etc --- but nowhere in the metadata for the texts could I find the original publication date.

Unless I can find such a database, I'm going to get a research assistant to find this info for all ([original] English language) texts in the collection, or else write a script to automate the process.

FWIW, I'm planning on using the Gutenberg texts as part of a 100 million word corpus of texts from English (British and US) from the 1800s-1900s, similar to what I've done for the 100 million word British National Corpus (http://view.byu.edu) and the 100 million word Corpus del Espanol (www.corpusdelespanol.org).

Thanks in advance for any info you might have.

Mark Davies

=================================================
Mark Davies
Assoc. Prof., Linguistics
Brigham Young University
(phone) 801-422-9168 / (fax) 801-422-0906
http://davies-linguistics.byu.edu

** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
=================================================

Hi Mark.

Your project sounds fascinating. I hope it works out well.

The issue of "original publication dates" is not always straightforward. Many of the older texts in PG are not intended to be representative of any particular paper edition. And what about a 1950s edition of material from the 1700s with new editorial commentary? (Yes, we do have material like that in PG.)

Also, there is the issue of when a work was written vs. when it was published. In many cases, these two dates will be close, but in some cases, they can be decades apart.

For your purposes, we could perhaps try for a concept of "original publication date of basic material", but that would rely heavily on individual judgement, and we would undoubtedly get email from people expressing discontent with how "we've got it wrong".

Andrew

On Wed, 31 Aug 2005, Mark Davies wrote:
Sorry if this is a FAQ, or a variation on a FAQ.
Are there any Project Gutenberg databases that show the *original* publication dates (e.g. 1875, 1916) for all or most of the texts? I've created a database (current as of a few months ago) that has info on each book -- author, title, LC classification, etc --- but nowhere in the metadata for the texts could I find the original publication date.
Unless I can find such a database, I'm going to get a research assistant to find this info for all ([original] English language) texts in the collection, or else write a script to automate the process.
FWIW, I'm planning on using the Gutenberg texts as part of a 100 million word corpus of texts from English (British and US) from the 1800s-1900s, similar to what I've done for the 100 million word British National Corpus (http://view.byu.edu) and the 100 million word Corpus del Espanol (www.corpusdelespanol.org).
Thanks in advance for any info you might have.
Mark Davies
=================================================
Mark Davies
Assoc. Prof., Linguistics
Brigham Young University
(phone) 801-422-9168 / (fax) 801-422-0906
http://davies-linguistics.byu.edu

** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
=================================================

_______________________________________________
gutvol-d mailing list
gutvol-d@lists.pglaf.org
http://lists.pglaf.org/listinfo.cgi/gutvol-d
Participants (2): Andrew Sly, Mark Davies