Re: Quick question about file formats
William Waites wrote:
On Sat, Oct 30, 2010 at 08:56:50PM +0200, Marcello Perathoner wrote:
Paulo Levi wrote:
Another quick question :) Are the rules for creating a download url from the "file" tag in the rdf catalog consistent?
The "algorithm" is the expansion of XML entities, which any common run-of-the-mill xml parser will do for you.
RDF != XML
I think we had this discussion already. This is an XML file and should be processed thru an XML parser. If you don't, every little cosmetic change to the file structure will break your program. You have been warned.
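For illustration, a minimal sketch of what "let the XML parser expand the entities" could look like in Python with the standard library. The sample document, the entity name `f`, the pgterms namespace URI, the element name, and the file path are all assumptions made up for this example, not copied from the real catalog.rdf.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; entity name, namespace URI, element name and path
# are assumptions for illustration only.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!ENTITY f "http://www.gutenberg.org/dirs/">
]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:pgterms="http://www.gutenberg.org/rdfterms/">
  <pgterms:file rdf:about="&f;1/2/3/4/12345/12345.txt"/>
</rdf:RDF>
"""

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PGTERMS_NS = "http://www.gutenberg.org/rdfterms/"  # assumed namespace


def file_urls(xml_text):
    """Return the rdf:about value of every pgterms:file element.

    The parser expands the internal entity &f; before we ever see the
    attribute value, so no string substitution is done by hand.
    """
    root = ET.fromstring(xml_text)
    return [
        el.get(f"{{{RDF_NS}}}about")
        for el in root.iter(f"{{{PGTERMS_NS}}}file")
    ]


if __name__ == "__main__":
    for url in file_urls(SAMPLE):
        print(url)
```

The point of the sketch is only that the entity never has to be expanded manually; whether the expanded value is the final download URL depends on the actual catalog file.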
If you're trying to interpret RDF data, it's better to use a library, they exist for just about all programming languages. If you try to interpret it as XML you are asking for trouble.
It is too bad that the RDF you get from here, http://www.gutenberg.org/ebooks/12345.rdf is different from the catalogue.
This is intended and documented:
http://www.gutenberg.org/wiki/Gutenberg:Feeds
The old catalog.rdf is a legacy format we keep for compatibility.
This is because you have
xml:base="http://www.gutenberg.org/feeds/catalog.rdf"
and then, e.g.
rdf:ID="etext12345"
This amounts to giving the URI
http://www.gutenberg.org/feeds/catalog.rdfetext12345
to that book which is not what you intend.
Wrong. This gives
http://www.gutenberg.org/feeds/catalog.rdf#etext12345
"The rdf:ID attribute on a node element (not property element, that has another meaning) can be used instead of rdf:about and gives a relative RDF URI reference equivalent to # concatenated with the rdf:ID attribute value."
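The rule quoted above can be checked with ordinary relative-URI resolution. A small sketch in Python, assuming the xml:base value given earlier in the thread; the helper name `resolve_rdf_id` is made up for this example:

```python
from urllib.parse import urljoin

# Base URI as given by xml:base in the thread.
BASE = "http://www.gutenberg.org/feeds/catalog.rdf"


def resolve_rdf_id(base, rdf_id):
    """Resolve an rdf:ID value against a base URI.

    Per the quoted rule, rdf:ID="x" is equivalent to the relative
    reference "#x", which is then resolved against the base.
    """
    return urljoin(base, "#" + rdf_id)


if __name__ == "__main__":
    # rdf:ID="etext12345"
    print(resolve_rdf_id(BASE, "etext12345"))
    # rdf:resource="#etext12345" resolves to the same URI
    print(urljoin(BASE, "#etext12345"))
```

Both calls give the same URI with the "#" in place, so the rdf:ID and the rdf:resource reference agree within the catalogue.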
If on the other hand you had used
rdf:about="http://www.gutenberg.org/ebooks/12345"
the data would be the same (which I guess is what you intend).
where lower down you talk about formats, you use
rdf:resource="#etext12345"
which refers to
http://www.gutenberg.org/feeds/catalog.rdf#etext12345
which if it weren't for the error with rdf:ID would at least be consistent within the catalogue.
But supposing this is fixed, I still have two URIs for one text:
http://www.gutenberg.org/feeds/catalog.rdf#etext12345
http://www.gutenberg.org/ebooks/12345
and you've given no way of knowing that they are in fact the same.
Because you are not supposed to mix the old catalog.rdf with the new catalog.rdf which will be put online when I get to finish it.
--
Marcello Perathoner
webmaster@gutenberg.org
On Sun, Oct 31, 2010 at 12:39:28AM +0200, Marcello Perathoner wrote:
Wrong. This gives
http://www.gutenberg.org/feeds/catalog.rdf#etext12345
"The rdf:ID attribute on a node element (not property element, that has another meaning) can be used instead of rdf:about and gives a relative RDF URI reference equivalent to # concatenated with the rdf:ID attribute value."
Quite right. I was confusing nodeID with ID. (Now, will a user trying to use an XML parser also get this right? Kind of proves my point: RDF != XML, don't look at the XML unless you have a good reason to be writing a parser.) In any event I stand corrected on this.
-w
participants (3)
- Jim Adcock
- Marcello Perathoner
- William Waites