Re: [gutvol-d] barriers to XML posting

----- Original Message ----- From: Marcello Perathoner <marcello@perathoner.de>
Jim Tinsley wrote:
That process must work for _all_ teixlite files, not just ones that are specially cooked, using constraints not specified within the chosen DTD. Here's where we hit the rocks today.
Impossible. There are rules that cannot be specified in a DTD but must still be followed to get a semantically correct file. (This holds for every XML application, not just for PGTEI.) You always have to obey some extra rules besides validity. These are put down in the PGTEI guide.
Hmm... Maybe I misunderstand here. If a file comes in, marked up in TEI-Lite, and we cannot transform it with our standard process, it seems to me either the DTD we've chosen is incomplete or the TEI markup has a bug. Now, if a new text needs a feature not in our current DTD (am I using the terminology right here?), I'm not against modifying the DTD standard to include it, but there would need to be some procedure to do it so that it gets "reviewed" by others first. Or, maybe there is a way to define new elements that are outside the standard DTD within the XML submission file itself? Again, I'm trying to learn this as I go, so if my question is stupid, I apologize in advance.
The only things we must have -- both for our own internal practical purposes and for the use of future readers -- are that it should work reliably on _all_ texts that conform to the XML DTD chosen, be open source, and be cross-platform. A reader needs to be able to tweak the transform and re-run it on her own desktop.
Same as above. The DTD is not strict enough (RelaxNG will be better, but it's still early). There will always be valid TEI files that do not transform to `correct' output files.
I don't see why it is necessary for the conversion tools to run on everybody's desktop before we can start posting files. If the tools run on pglaf.org and gutenberg.org, that is more than enough for a start. The tools can be fixed later. That won't make posted valid TEI files invalid.
If we have the tools on the server and available for use, that is sufficient for me. But I also think that all the files (DTD, XSLT, and whatever else) should always be available for download for the industrious person that DOES want to run it on their own machine. Josh

Joshua Hutchinson wrote:
Hmm... Maybe I misunderstand here. If a file comes in, marked up in TEI-Lite, and we cannot transform it with our standard process, it seems to me either the DTD we've chosen is incomplete or the TEI markup has a bug.
Consider the following examples. A DTD-based validator can catch this:

    <address>
      <date>01 Jan 2004</date>
    </address>

because a date has no business inside an address. But not this:

    <address>
      <name>Chicago</name>
      <street>2830 North Clark</street>
      <place>Curl Up and Dye Beauty Salon</place>
    </address>

The validator cannot know that the markup is all wrong: every element is allowed where it stands, but the city is tagged as the name, and the salon's name as the place. Of course this will _transform_ all right.
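To make the distinction concrete, here is a minimal Python sketch of a purely structural check. The set `ALLOWED_IN_ADDRESS` is a toy content model invented for this illustration (it is not the real TEI DTD), and `structurally_valid` is a hypothetical helper name. It rejects the first example and accepts the second, even though the second is semantically wrong.

```python
import xml.etree.ElementTree as ET

# Toy content model, an assumption for illustration only (NOT the real TEI DTD):
# the element names that may appear directly inside <address>.
ALLOWED_IN_ADDRESS = {"name", "street", "place"}


def structurally_valid(xml_text):
    """Return True if every child of the root <address> is an allowed element."""
    address = ET.fromstring(xml_text)
    return all(child.tag in ALLOWED_IN_ADDRESS for child in address)


WITH_DATE = "<address><date>01 Jan 2004</date></address>"

MISLABELED = (
    "<address>"
    "<name>Chicago</name>"
    "<street>2830 North Clark</street>"
    "<place>Curl Up and Dye Beauty Salon</place>"
    "</address>"
)

print(structurally_valid(WITH_DATE))   # False: <date> is not allowed here
print(structurally_valid(MISLABELED))  # True: structure is fine, meaning is not
```

A check like this only sees element names and nesting. Catching the second case would need extra rules of the kind written down in a markup guide, or a human reviewer.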
Now, if a new text needs a feature not in our current DTD (am I using the terminology right here?), I'm not against modifying the DTD standard to include it, but there would need to be some procedure to do it so that it gets "reviewed" by others first.
TEI has a well-documented interface for exactly this purpose. Experience has shown that not even the full TEI can accommodate all cases. So, if you need to mark up something completely new, e.g. the message you just got from an alien civilization, you can extend the TEI DTD and still conform to the TEI standard.
Or, maybe there is a way to define new elements that are outside the standard DTD within the XML submission file itself? Again, I'm trying to learn this as I go, so if my question is stupid, I apologize in advance.
No. All you can define inside an XML file is the DTD (or other schema) you want to use and entities like &myentity;. Of course you can use a DTD that defines some stuff and then includes the standard TEI DTD. But, as said above, there is a better way to do that in TEI.
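For the entity part of this, a small sketch using only the Python standard library: the document below declares `myentity` in its own DOCTYPE, and the parser expands the reference. The element name `note` and the replacement text are made up for the example.

```python
import xml.etree.ElementTree as ET

# A document that declares an internal entity in its own DOCTYPE.
# The element name and the entity's replacement text are hypothetical.
DOC = """<!DOCTYPE note [
  <!ENTITY myentity "Project Gutenberg">
]>
<note>Posted by &myentity;.</note>"""

note = ET.fromstring(DOC)

# The parser substitutes the declared replacement text for &myentity;
print(note.text)  # Posted by Project Gutenberg.
```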
If we have the tools on the server and available for use, that is sufficient for me. But I also think that all the files (DTD, XSLT, and whatever else) should always be available for download for the industrious person that DOES want to run it on their own machine.
Already done. Start here: http://www.gutenberg.org/tei/ -- Marcello Perathoner webmaster@gutenberg.org

Marcello Perathoner wrote:
No. All you can define inside an XML file is the DTD (or other schema) you want to use and entities like &myentity;
That's not true. You can define a full DTD (or a subset of one) within the XML document itself if you want. The W3C gives the following example in the XML 1.0 (3rd ed.) spec:

    <?xml version="1.0" encoding="UTF-8" ?>
    <!DOCTYPE greeting [
      <!ELEMENT greeting (#PCDATA)>
    ]>
    <greeting>Hello, world!</greeting>

This is a fully valid and well-formed XML file with the DTD defined in the DOCTYPE header instead of in a separate DTD file. Of course, while you _can_ do that, it's probably not the best way. Curtis.
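One way to see that the element declaration really is part of the document is to ask a parser to report it. The sketch below uses Python's standard `xml.parsers.expat` module; `declared_elements` is a hypothetical helper name. Note that expat is a non-validating parser: it reads the declarations in the internal subset but does not enforce them.

```python
from xml.parsers import expat

# The W3C example, with the DTD given in the internal subset of the DOCTYPE.
DOC = b"""<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE greeting [
  <!ELEMENT greeting (#PCDATA)>
]>
<greeting>Hello, world!</greeting>"""


def declared_elements(xml_bytes):
    """Return the element names declared in the document's DTD subset."""
    names = []
    parser = expat.ParserCreate()
    # Called once for each <!ELEMENT ...> declaration the parser reads.
    parser.ElementDeclHandler = lambda name, model: names.append(name)
    parser.Parse(xml_bytes, True)
    return names


print(declared_elements(DOC))  # ['greeting']
```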

Curtis A. Weyant wrote:
That's not true. You can define a full DTD (or a subset of one) within the XML document itself if you want. The W3C gives the following example in the XML 1.0 (3rd ed.) spec:
    <?xml version="1.0" encoding="UTF-8" ?>
    <!DOCTYPE greeting [
      <!ELEMENT greeting (#PCDATA)>
    ]>
    <greeting>Hello, world!</greeting>
You are right. This way you could introduce some personal tags into a document and slip them past the validator.
Of course, while you _can_ do that, it's probably not the best way.
It could become difficult to track which translator goes with which files. It is easier if you just reference one out of a known set of DTDs. -- Marcello Perathoner webmaster@gutenberg.org
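A rough sketch of what "reference one out of a known set of DTDs" could look like on the tool side, using Python's standard `xml.parsers.expat`. Everything named here is an assumption for illustration: the public identifier, the stylesheet file name, and the helpers `doctype_of` and `pick_stylesheet` are invented, and this is not how the PG tools actually select a transform.

```python
from xml.parsers import expat

# Hypothetical table: DTD public identifier -> stylesheet to use.
KNOWN_DTDS = {
    "-//EXAMPLE//DTD Toy TEI Lite//EN": "toy-teilite.xsl",
}


def doctype_of(xml_bytes):
    """Return (root name, system id, public id, has internal subset), or None."""
    found = []
    parser = expat.ParserCreate()
    # Called when the parser reaches the <!DOCTYPE ...> declaration.
    parser.StartDoctypeDeclHandler = (
        lambda name, sysid, pubid, has_subset:
            found.append((name, sysid, pubid, bool(has_subset)))
    )
    parser.Parse(xml_bytes, True)
    return found[0] if found else None


def pick_stylesheet(xml_bytes):
    """Return the stylesheet for a known DTD, or None if a human should look."""
    info = doctype_of(xml_bytes)
    if info is None:
        return None              # no DOCTYPE at all
    _name, _sysid, pubid, has_subset = info
    if has_subset:
        return None              # locally declared markup: needs review
    return KNOWN_DTDS.get(pubid)


KNOWN = b'<!DOCTYPE doc PUBLIC "-//EXAMPLE//DTD Toy TEI Lite//EN" "toy.dtd"><doc/>'
INLINE = b"<!DOCTYPE doc [ <!ELEMENT doc (#PCDATA)> ]><doc>hi</doc>"

print(pick_stylesheet(KNOWN))   # toy-teilite.xsl
print(pick_stylesheet(INLINE))  # None
```

Files that carry their own declarations fall through to manual review in this sketch, which is one way to keep personal tags from slipping past unnoticed.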
participants (3): Curtis A. Weyant, Joshua Hutchinson, Marcello Perathoner