
On Mon, Feb 13, 2012 at 12:03:33PM +0100, Marcello Perathoner wrote:
On 02/13/2012 08:13 AM, Greg Newby wrote:
The policy, unchanged since at least the first version of the "small print" in the early 1990s or late 1980s, is found in every single eBook and elsewhere:
"Project Gutenberg-tm eBooks are often created from several printed editions, all of which are confirmed as Public Domain in the U.S. unless a copyright notice is included. Thus, we do not necessarily keep eBooks in compliance with any particular paper edition."
This policy is a de-facto non-policy as the majority of our books are now produced from one specific paper edition. And that is what most producers want.
I think we should drop this "policy" entirely and make the catalog entry reflect the choice of the producer by either:
- stating "Project Gutenberg edition transcribed from different sources", or
- including the metadata of edition X.
The current policy allows this.
For producers to include such information in a more structured format seems fine to me. I don't recall anyone ever presenting an eBook in such a format (say, with a snippet of Dublin Core XML at the end).
The main point here was to have a standard way to include metadata in every file and have it go through the WW process unscathed.
The current policy allows this.
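For illustration, a Dublin Core snippet of the kind mentioned above might be generated and read back as in the following sketch. The `<metadata>` wrapper element, the choice of four fields, the function names, and all sample values are assumptions made for this example; nothing here is an agreed PG format.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_source_snippet(title, creator, source, date):
    """Return a small Dublin Core XML fragment describing a print source.

    The <metadata> wrapper and the selection of fields are hypothetical.
    """
    root = ET.Element("metadata")
    for name, value in (
        ("title", title),
        ("creator", creator),
        ("source", source),
        ("date", date),
    ):
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


def read_source_snippet(xml_text):
    """Parse the fragment back into a dict keyed by element name."""
    root = ET.fromstring(xml_text)
    return {child.tag.split("}", 1)[1]: child.text for child in root}


# Placeholder values, not taken from any real eBook.
snippet = build_source_snippet(
    "Example Title", "Example Author", "Example Publisher edition", "1906"
)
print(snippet)
print(read_source_snippet(snippet))
```

A fixed, parseable block like this is one way the source information could survive later processing, since a tool can check that it is still present and well-formed.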
All that said: the idea that PG could catalog our items and derive their *primary* metadata from one or more print editions used as sources is just not consistent with the policy and practice cited above. Our #140 was *not* published in 1906, it was published in 1994. (Hmmm...interesting example, since the catalog doesn't have this right, either.)
We'd get beat up about it. Librarians would complain. Publishers would have a basis to complain about us mis-using their trademarks. And, it would be false. The PG editions are *not* their print sources.
This argument is wrong on 3 counts:
1.
A MARC catalog entry for a reproduction can either describe the source or the reproduction. This is intentionally left as a choice for the library. If the main catalog entries describe the source, then MARC field 533 describes the reproduction. See:
http://www.loc.gov/marc/bibliographic/bd533.html
If PG decides so, it is perfectly legal MARC to encode the original source description in the main body and use 533 to describe our electronic edition data.
I don't think this is a correct interpretation of what MARC allows, but would like to become better informed. Reproductions are about facsimiles, photographs and microfilms. We're certainly not making pictures of books, or facsimiles.
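To make the disputed arrangement concrete, here is a rough sketch of a record whose main fields describe the print source while field 533 describes the electronic edition. It assumes the first reading of 533 given above, which is contested in this thread. The record is a plain Python list, not real MARC serialization; indicators are omitted, the title is a placeholder, and the function names are invented for the example. The subfield codes used are 245 $a (title), 260 $c (date of publication), and 533 $a, $c, $d (type, agency, and date of reproduction).

```python
def build_record(source_title, source_year, repro_agency, repro_year):
    """Return a list of (tag, subfields) pairs.

    Each subfield is a (code, value) tuple. Main fields describe the
    print source; 533 describes the reproduction. Illustrative only.
    """
    return [
        ("245", [("a", source_title)]),
        ("260", [("c", source_year)]),
        ("533", [
            ("a", "Electronic reproduction."),
            ("c", repro_agency + ","),
            ("d", repro_year + "."),
        ]),
    ]


def render(record):
    """Format the record one field per line, for human reading."""
    lines = []
    for tag, subfields in record:
        body = " ".join(f"${code} {value}" for code, value in subfields)
        lines.append(f"{tag} {body}")
    return "\n".join(lines)


# The years echo the 1906/1994 example in this thread; the title is made up.
record = build_record("Example Title", "1906", "Project Gutenberg", "1994")
print(render(record))
```

Swapping the roles, so that the main fields carry the 1994 electronic edition and the print source moves to a note, would correspond to the other choice described above.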
2.
It is not clear why the portion of our ebooks that are faithful reproductions of one paper edition should not get the metadata for their editions included.
(Note that we'd need some measure of faithfulness. My notion of faithfulness is not yours. For example, one of us might not think page numbers are important. Another might believe that typesetting errors should always be preserved. Etc.) (Theoretically, if the RST master includes page numbers and graphics, and the EPUB derivative does not, is it still as faithful a reproduction?)
OTOH those produced from many editions should clearly state: PG edition.
Both ways should be acceptable.
We already do both. The producers often do include such source metadata, and it is allowed (even if it's a less-than-faithful reproduction). But it's not in an easily machine-parseable format. Such a format would be a fine idea. -- Greg
3.
The argument "somebody could sue us for no reason" is always invalid.
The idea of structured metadata about sources makes sense, but only if it's clearly a search for the source material(s) used, not for the PG titles.
I think it is clear enough for everybody from the context of the search that they are searching for an ebook and not a physical book.