Conrad Parker wrote:
Unfortunately it seems the catalog.rdf file is missing some lines, and as a result it cannot be parsed by strict parsers such as libxml2 (which is widely used on many platforms).
I just parsed it successfully using perl 5.8.0 and libxml 2.5.10.
After some brief googling I came across Grahame Bowland's site, which includes a simple Unix shell script that he developed recently:
http://angrygoats.net/svn/gutenberg/fix-catalog.sh
This inserts the missing entities into the DOCTYPE declaration at the top of catalog.rdf. Of course it would be better if these entities could be included in the original catalog.rdf published by Project Gutenberg :)
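For readers who have not seen this kind of fix, here is a rough Python sketch of the general idea of adding entity declarations to a DOCTYPE internal subset. It is not a port of fix-catalog.sh, whose contents are not reproduced here. The function names, the sample document, and the entity names and values are all made up for illustration, and the sketch assumes the file already has a DOCTYPE with an internal subset (a `[` ... `]` part).

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical entities, for illustration only; the real set of entities
# missing from catalog.rdf is not stated in this thread.
EXAMPLE_ENTITIES = {"eacute": "&#233;", "uuml": "&#252;"}


def add_entities(xml_text, entities):
    """Insert <!ENTITY> declarations at the start of the DOCTYPE internal subset.

    Assumes the document already has a DOCTYPE of the form
    <!DOCTYPE name [ ... ]>.
    """
    match = re.search(r"<!DOCTYPE\s+[^\[>]*\[", xml_text)
    if match is None:
        raise ValueError("no DOCTYPE with an internal subset found")
    decls = "".join(
        f'\n  <!ENTITY {name} "{value}">' for name, value in entities.items()
    )
    # Splice the new declarations in right after the opening "[".
    return xml_text[: match.end()] + decls + xml_text[match.end():]


def is_well_formed(xml_text):
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

A possible use would be to read the catalog into a string, call `add_entities(text, EXAMPLE_ENTITIES)` with the entities that are actually missing, and check the result with `is_well_formed` before writing it back out.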
We do not use HTML entities in the database any more, so the generated RDF/XML and RSS should not contain any.
--
Marcello Perathoner
webmaster@gutenberg.org