
Jim Tinsley wrote:
I really do not mean to be disrespectful when I -- speaking for myself -- say that I'm not interested in spending my time making developers' jobs easier. That's not what I'm here for.
On the one hand you wonder why we developers are not able to come up with a solution; on the other hand you are not disposed to give an inch on your developer-unfriendly position. Your policy of not posting TEI files is at present the main roadblock. It's like requesting the final release of the product before allowing a beta test. I have been doing other (hopefully useful) work and have not looked at the TEI code for about a year now because I don't see a way to get it to work with this `moratorium' in place.
We have text, and HTML, both proven and well-supported formats that we know how to work with and for which we know there is a demand. I'll stick to those until we can see a way clear through to making successful XML.
You sure know how to work with PLAIN ALL CAPS ASCII TEXT FILES but that's not a reason to shun all progress since.
Correct spelling is necessary but not sufficient. I don't know about other people, but I most commonly find errors by skimming the text. I can't do that with XML.
After a few weeks you'll skim thru TEI like you skim thru plain text. (Use an editor that highlights the tags and use a low contrast color for the tags.)
And it may not be the way software development works, but then we're not a software development project.
But you depend on software. DP is 250,000 lines of code. If it were not for software you wouldn't have much to do.
That's not the problem. If the process we agree for teixlite is, say, run it through Saxon, then I expect to be able to run all teixlite files through Saxon, and not have a submitter say "oh, no, you must use Xalan for this file, and not just any Xalan, but one with my patch in it."
You have to use PGTEI stylesheets to convert PGTEI text. You can use them with any XSLT 1.0 compliant processor.
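To make the "any XSLT 1.0 compliant processor" point concrete, here is a minimal sketch that builds the command line for two common processors from the same stylesheet and source file. The file names (pgtei-html.xsl, book.tei, book.html) and the helper build_command are hypothetical, chosen only for illustration; the actual PGTEI stylesheet names are not given in this thread.

```python
def build_command(processor, stylesheet, source, output):
    """Return the argv list that applies `stylesheet` to `source`.

    Only two processors are covered here; any other XSLT 1.0
    processor would get its own branch with its own option syntax.
    """
    if processor == "xsltproc":
        # libxslt's command-line tool: xsltproc -o OUT STYLESHEET SOURCE
        return ["xsltproc", "-o", output, stylesheet, source]
    if processor == "xalan":
        # Xalan-Java's command-line entry point
        return ["java", "org.apache.xalan.xslt.Process",
                "-IN", source, "-XSL", stylesheet, "-OUT", output]
    raise ValueError("unknown processor: " + processor)


# Hypothetical file names; the stylesheet and the input stay the same,
# only the processor invocation differs.
for name in ("xsltproc", "xalan"):
    print(" ".join(build_command(name, "pgtei-html.xsl", "book.tei", "book.html")))
```

The stylesheet and the source are identical in both invocations; only the wrapper around them changes, which is all that switching processors should require.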
You see, we appear to differ very fundamentally on one point. It's my lock and key analogy again. I do not want to start down the road of producing posted files from an XML if the transform will be, for any reason, not repeatable in a year's time, or five, or ten.
This amounts to the same as: never start at all. Remember: the first files were uppercase ASCII. We *had* to do them over again. We *are* doing all pre-10K texts over again. We *will* have to do the TEI files over again, maybe more than once. That's only being realistic.
I do not want to start down the road of producing posted files from XML if an end-user who wants to -- on whatever platform -- cannot replicate the process.
Then you should also post all the scanned pages so a user can redo the OCR on her platform if she wants to. I think we can postpone this, because the user can grab the converted files. And if converting at home is an issue with him, hey!, the tools are Free Software. He can change them until they work on his platform and submit the patches to us.
For a start I will act as interim Post-Processor for people wanting to post PGTEI and pass on to you only the perfectly good ones. You'll just have to stick in the etext number where I put 5 asterisks.
No; I, at least, don't want to work with an experimental process in which each text is an exception.
Is there some qualifying exam to become a whitewasher? I ask, because by now I'm so desperate that I'm quite willing to become a whitewasher myself just to see some TEI texts posted.
Why can't we just name them .xml? I see no reason to invent extensions. _Is_ there one? Not that it matters much, just curious why you would think this a good idea.
Because there ain't such a thing as an XML file. XML is just a framework for building applications. XHTML is an XML application, SVG is an XML application, TEI is an XML application, OpenOffice file format is an XML application ... Labelling a file .xml is like labelling a Word file .bytes.
-- 
Marcello Perathoner
webmaster@gutenberg.org
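P.S. To illustrate why ".xml" says so little: a minimal sketch that has to open the file and look at the root element before it can tell which XML application it is holding. The namespace table, the check for an un-namespaced TEI.2 root, and the function name identify_application are assumptions made for this example, not anything PG has specified.

```python
import xml.etree.ElementTree as ET

# Root-element namespaces mapped to the XML application they indicate.
# This table is an illustrative assumption, not an exhaustive registry.
KNOWN_NAMESPACES = {
    "http://www.w3.org/1999/xhtml": "XHTML",
    "http://www.w3.org/2000/svg": "SVG",
    "http://www.tei-c.org/ns/1.0": "TEI",
}


def identify_application(xml_text):
    """Guess the XML application from the document's root element."""
    root = ET.fromstring(xml_text)
    tag = root.tag
    # ElementTree writes namespaced tags as "{namespace}localname".
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
    else:
        namespace, local = "", tag
    if namespace in KNOWN_NAMESPACES:
        return KNOWN_NAMESPACES[namespace]
    # Assumption: older TEI documents use an un-namespaced TEI.2 root.
    if namespace == "" and local == "TEI.2":
        return "TEI"
    return "unknown"


print(identify_application('<svg xmlns="http://www.w3.org/2000/svg"/>'))
print(identify_application("<TEI.2><text/></TEI.2>"))
```

A more specific extension would give a reader or a tool the same answer without parsing anything.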