
DP does generate a DC file for each project. I'm not entirely sure what's in it, although I presume that it captures the information that we collect, which is:

  Title
  Author
  Language
  Genre (by our definition, not an official cataloging one)

We do not collect information about publication dates, multiple authors or creative content roles, publisher, etc. In these regards the new PG clearance system collects much more information, and is probably much more accurate as well.

Many project managers (including myself) tend to shorten or adjust the titles so that they fit better on the project listing page. The same goes for author/illustrator/editor/etc. information. This is appropriate and useful for our internal purposes, but doesn't work well when mapped to anything external.

All in all, I'd recommend using the information collected as part of the copyright clearance as a basis for cataloging.

JulietS

----- Original Message -----
From: "Marcello Perathoner" <marcello@perathoner.de>
To: "Project Gutenberg Volunteer Discussion" <gutvol-d@lists.pglaf.org>
Sent: Sunday, September 19, 2004 12:10 PM
Subject: Re: [gutvol-d] Indexing Editors, etc.

D. Starner wrote:

> Greg Newby <gbnewby@pglaf.org> writes:
>
>> Sorry about that, David. The reason is that the automatic cataloging program only picks up the metadata in the book header like Author:, Title:, etc.
>
> So is this something that I should take up within DP? What needs to be done that isn't?

We (WWs and me) have been discussing ways to fix the meta-data transfer between DP and the PG catalog.
What we came up with is:
- Put a unique identifier in the last line(s) of the text. This would allow the catalog database to query the database at DP for all missing info.
or
- Put a DC or XML/RDF metadata block at the end of the file.
Example of DC metadata block:
END OF THE PROJECT ...
dc.author: Twain, Mark
dc.title: 1601
dc.language: en
dc.encoding: us-ascii
dc.publisher: Project Gutenberg
dc.rights: http://www.gutenberg.org/license
pg.etext: 12345
pg.id: af04.bd32.1234.5678
EOF
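
To give an idea of how the catalog side could read such a block, here is a minimal Python sketch. It assumes one `key: value` field per line after the "END OF THE PROJECT" line, and that only `dc.` and `pg.` prefixes occur. The function name `parse_trailing_metadata` and the sample text are made up for illustration; this is not the existing cataloging program.

```python
import re

# Hypothetical pattern: a "dc." or "pg." prefix, a lowercase field name, then the value.
FIELD_RE = re.compile(r"^(dc|pg)\.([a-z]+):\s*(.*)$")


def parse_trailing_metadata(text):
    """Return a dict of dc.*/pg.* fields found after the last end-of-text marker."""
    lines = text.splitlines()
    start = 0
    for i, line in enumerate(lines):
        # Remember the line after the last marker; anything before it is book text.
        if line.startswith("END OF THE PROJECT"):
            start = i + 1
    fields = {}
    for line in lines[start:]:
        match = FIELD_RE.match(line.strip())
        if match:
            key = match.group(1) + "." + match.group(2)
            fields[key] = match.group(3).strip()
    return fields


SAMPLE = """\
Some book text ...
END OF THE PROJECT ...
dc.author: Twain, Mark
dc.title: 1601
dc.language: en
dc.encoding: us-ascii
dc.publisher: Project Gutenberg
dc.rights: http://www.gutenberg.org/license
pg.etext: 12345
pg.id: af04.bd32.1234.5678
EOF
"""

meta = parse_trailing_metadata(SAMPLE)
print(meta["dc.author"], "/", meta["pg.id"])
```

Lines that do not match the pattern (such as the closing EOF) are simply skipped, so the block can be added without disturbing anything that only reads the header.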
Example of RDF/XML metadata block:
END OF THE PROJECT ...
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:pg="http://www.gutenberg.org/pgrdf"
         xml:base="http://www.gutenberg.org/rdf/catalog.rdf">
  <rdf:Description rdf:ID="etext13485">
    <dc:publisher>Project Gutenberg</dc:publisher>
    <dc:title rdf:parseType="Literal">An Enquiry Concerning the Principles of Taste, and of the Origin of our Ideas of Beauty, etc.</dc:title>
    <dc:creator>Reynolds, Frances</dc:creator>
    <dc:contributor>Clifford, James L. [Contributor]</dc:contributor>
    <dc:language>en</dc:language>
    <dc:created>2004-09-17</dc:created>
    <dc:rights rdf:resource="http://www.gutenberg.org/license" />
    <pg:identifier>0123.4567.89ab.cdef</pg:identifier>
  </rdf:Description>
</rdf:RDF>
EOF
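
The RDF/XML variant could be read with nothing more than the Python standard library. The sketch below is only an illustration under some assumptions: the block is the only `<rdf:RDF>` element in the file, it holds a single `rdf:Description`, and the helper name `parse_rdf_block` and the returned field names are invented here.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared in the example block.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "pg": "http://www.gutenberg.org/pgrdf",
}
END_TAG = "</rdf:RDF>"


def parse_rdf_block(text):
    """Cut the <rdf:RDF> element out of an etext and return selected fields."""
    start = text.find("<rdf:RDF")
    if start == -1:
        return None
    end = text.find(END_TAG, start)
    if end == -1:
        return None
    root = ET.fromstring(text[start:end + len(END_TAG)])
    desc = root.find("rdf:Description", NS)
    if desc is None:
        return None

    def text_of(tag):
        # Text content of the first matching child, or None if absent.
        element = desc.find(tag, NS)
        return element.text if element is not None else None

    rights = desc.find("dc:rights", NS)
    return {
        "id": desc.get("{%s}ID" % NS["rdf"]),
        "title": text_of("dc:title"),
        "creator": text_of("dc:creator"),
        "language": text_of("dc:language"),
        "rights": rights.get("{%s}resource" % NS["rdf"]) if rights is not None else None,
        "identifier": text_of("pg:identifier"),
    }


SAMPLE = """\
Some book text ...
END OF THE PROJECT ...
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:pg="http://www.gutenberg.org/pgrdf"
         xml:base="http://www.gutenberg.org/rdf/catalog.rdf">
  <rdf:Description rdf:ID="etext13485">
    <dc:title rdf:parseType="Literal">An Enquiry Concerning the Principles of Taste</dc:title>
    <dc:creator>Reynolds, Frances</dc:creator>
    <dc:language>en</dc:language>
    <dc:rights rdf:resource="http://www.gutenberg.org/license" />
    <pg:identifier>0123.4567.89ab.cdef</pg:identifier>
  </rdf:Description>
</rdf:RDF>
EOF
"""

record = parse_rdf_block(SAMPLE)
print(record["id"], record["creator"], record["identifier"])
```

A plain XML parser is enough for this fixed layout; a real RDF library would only be needed if the block were allowed to use other RDF serializations of the same data.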
-- 
Marcello Perathoner
webmaster@gutenberg.net