New subject: Quick question about file formats

30 Oct 2010

William Waites wrote:
...
On Sat, Oct 30, 2010 at 08:56:50PM +0200, Marcello Perathoner wrote:
...
Paulo Levi wrote:
...
Another quick question :)
Are the rules for creating a download url from the "file" tag in the rdf 
catalog consistent?
The "algorithm" is the expansion of XML entities, which any common 
run-of-the-mill xml parser will do for you.
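
A minimal sketch of what "let the XML parser expand the entities" can look like, using Python's standard library. The sample document below is a made-up miniature: the entity name "f", the pgterms namespace URI, and the file path are assumptions for illustration, not copied from the real catalog.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical namespace URI, used only for this illustration.
PGTERMS_NS = "http://example.org/pgterms#"

# Hypothetical miniature of a catalog entry. The entity "f" stands in for
# whatever URL prefix the real file declares in its DOCTYPE.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [
  <!ENTITY f "http://www.gutenberg.org/dirs/">
]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:pgterms="http://example.org/pgterms#">
  <pgterms:file rdf:about="&f;1/2/3/4/12345/12345.txt"/>
</rdf:RDF>
"""

def file_urls(xml_text):
    """Return the rdf:about value of every pgterms:file element.

    The parser expands entities declared in the internal DTD subset,
    so the values come back as complete URLs.
    """
    root = ET.fromstring(xml_text)
    about = "{%s}about" % RDF_NS
    return [el.get(about) for el in root.iter("{%s}file" % PGTERMS_NS)]

if __name__ == "__main__":
    for url in file_urls(SAMPLE):
        print(url)
```

No string manipulation of "&f;" is needed in the program itself, so a change to the entity declaration does not break it.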
RDF != XML
...
I think we had this discussion already. This is an XML file and should 
be processed through an XML parser. If you don't, every little cosmetic 
change to the file structure will break your program. You have been warned.
If you're trying to interpret RDF data, it's better
to use a library; they exist for just about all
programming languages. If you try to interpret it
as XML you are asking for trouble.
It is too bad that the RDF you get from here,
http://www.gutenberg.org/ebooks/12345.rdf
is different from the catalogue.
This is intended and documented.

   http://www.gutenberg.org/wiki/Gutenberg:Feeds

The old catalog.rdf is a legacy format we keep for compatibility.
...
This is because you have
xml:base="http://www.gutenberg.org/feeds/catalog.rdf"
and then, e.g.
rdf:ID="etext12345"
This amounts to giving the URI
http://www.gutenberg.org/feeds/catalog.rdfetext12345
to that book which is not what you intend.
Wrong. This gives

   http://www.gutenberg.org/feeds/catalog.rdf#etext12345

"The rdf:ID attribute on a node element (not property element, that has 
another meaning) can be used instead of rdf:about and gives a relative 
RDF URI reference equivalent to # concatenated with the rdf:ID attribute 
value."
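
The rule quoted above can be checked with ordinary relative-URI resolution. This is only a sketch using Python's urllib, not an RDF parser; the helper name resolve_rdf_id is made up for illustration.

```python
from urllib.parse import urljoin

BASE = "http://www.gutenberg.org/feeds/catalog.rdf"

def resolve_rdf_id(base, rdf_id):
    """Treat rdf:ID="x" as the relative reference "#x" and resolve it
    against the base URI (hypothetical helper, for illustration only)."""
    return urljoin(base, "#" + rdf_id)

if __name__ == "__main__":
    # URI produced by rdf:ID="etext12345" under the xml:base above
    print(resolve_rdf_id(BASE, "etext12345"))
    # rdf:resource="#etext12345" resolves against the same base
    print(urljoin(BASE, "#etext12345"))
```

Both calls produce a URI with the "#" separator, so rdf:ID and rdf:resource="#..." name the same resource within the file.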
...
If on the other hand you had used
rdf:about="http://www.gutenberg.org/ebooks/12345"
the data would be the same (which I guess is 
what you intend).
Lower down, where you talk about formats, 
you use
rdf:resource="#etext12345"
which refers to
http://www.gutenberg.org/feeds/catalog.rdf#etext12345
which if it weren't for the error with rdf:ID would
at least be consistent within the catalogue.
But supposing this is fixed, I still have two
URIs for one text:
http://www.gutenberg.org/feeds/catalog.rdf#etext12345
http://www.gutenberg.org/ebooks/12345
and you've given no way of knowing that they are
in fact the same.
Because you are not supposed to mix the old catalog.rdf with the new 
catalog.rdf which will be put online when I get to finish it.

-- 
Marcello Perathoner
webmaster@gutenberg.org
