[gutvol-p] Re: Quick question about file formats
Marcello Perathoner
marcello at perathoner.de
Sat Oct 30 15:39:28 PDT 2010
William Waites wrote:
> On Sat, Oct 30, 2010 at 08:56:50PM +0200, Marcello Perathoner wrote:
>> Paulo Levi wrote:
>>> Another quick question :)
>>> Are the rules for creating a download url from the "file" tag in the rdf
>>> catalog consistent?
>> The "algorithm" is the expansion of XML entities, which any common
>> run-of-the-mill xml parser will do for you.
>
> RDF != XML
>
>> I think we had this discussion already. This is an XML file and should
>> be processed thru an XML parser. If you don't, every little cosmetic
>> change to the file structure will break your program. You have been warned.
>
> If you're trying to interpret RDF data, it's better
> to use a library, they exist for just about all
> programming languages. If you try to interpret it
> as XML you are asking for trouble.
>
> It is too bad that the RDF you get from here,
> http://www.gutenberg.org/ebooks/12345.rdf
> is different from the catalogue.
This is intended and documented.
http://www.gutenberg.org/wiki/Gutenberg:Feeds
The old catalog.rdf is a legacy format we keep for compatibility.
>
> This is because you have
>
> xml:base="http://www.gutenberg.org/feeds/catalog.rdf"
>
> and then, e.g.
>
> rdf:ID="etext12345"
>
> This amounts to giving the URI
>
> http://www.gutenberg.org/feeds/catalog.rdfetext12345
>
> to that book which is not what you intend.
Wrong. This gives
http://www.gutenberg.org/feeds/catalog.rdf#etext12345
"The rdf:ID attribute on a node element (not property element, that has
another meaning) can be used instead of rdf:about and gives a relative
RDF URI reference equivalent to # concatenated with the rdf:ID attribute
value."
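A minimal sketch of that resolution rule in Python, assuming a cut-down sample document: the rdf:Description element is a stand-in for whatever node element the catalog really uses, and node_uri is a hypothetical helper name. It only illustrates that "#" plus the rdf:ID value is resolved against xml:base like any other relative URI reference.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Assumed sample, not a real excerpt of catalog.rdf.
SAMPLE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xml:base="http://www.gutenberg.org/feeds/catalog.rdf">
  <rdf:Description rdf:ID="etext12345"/>
</rdf:RDF>
"""

def node_uri(base, rdf_id):
    """Resolve rdf:ID the way the quoted rule says: '#' + ID, relative to the base."""
    return urljoin(base, "#" + rdf_id)

root = ET.fromstring(SAMPLE)
base = root.get("{%s}base" % XML_NS)            # value of xml:base
node = root.find("{%s}Description" % RDF_NS)    # the node element carrying rdf:ID
uri = node_uri(base, node.get("{%s}ID" % RDF_NS))
print(uri)
```

The same urljoin call also covers rdf:resource="#etext12345", since that value is already a fragment-only reference resolved against the same base.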
>
> If on the other hand you had used
>
> rdf:about="http://www.gutenberg.org/ebooks/12345"
>
> the data would be the same (which I guess is
> what you intend).
>
> where lower down you talk about formats,
> you use
>
> rdf:resource="#etext12345"
>
> which refers to
>
> http://www.gutenberg.org/feeds/catalog.rdf#etext12345
>
> which if it weren't for the error with rdf:ID would
> at least be consistent within the catalogue.
>
> But supposing this is fixed, I still have two
> URIs for one text:
>
> http://www.gutenberg.org/feeds/catalog.rdf#etext12345
> http://www.gutenberg.org/ebooks/12345
>
> and you've given no way of knowing that they are
> in fact the same.
Because you are not supposed to mix the old catalog.rdf with the new
catalog.rdf, which will be put online once I finish it.
--
Marcello Perathoner
webmaster at gutenberg.org