Re: [gutvol-d] Posting TEI

21 Oct 2004

      Jim Tinsley wrote:
...
I really do not mean to be disrespectful when I -- speaking for myself --
say that I'm not interested in spending my time making developers' jobs
easier. That's not what I'm here for.
On the one hand you wonder why we developers are not able to come up with a 
solution; on the other hand you are not disposed to give an inch on 
your developer-unfriendly position.

Your policy of not posting TEI files is at present the main roadblock.

It's like requesting the final release of the product before allowing a 
beta test.

I have been doing other (hopefully useful) work and have not looked at 
the TEI code for about a year now because I don't see a way to get it to 
work with this `moratorium' in place.
...
We have text, and HTML, both
proven and well-supported formats that we know how to work with and for
which we know there is a demand. I'll stick to those until we can see
a way clear through to making successful XML.
You sure know how to work with PLAIN ALL CAPS ASCII TEXT FILES, but 
that's no reason to shun all progress made since.
...
Correct spelling is necessary but not sufficient. I don't know about
other people, but I most commonly find errors by skimming the text.
I can't do that with XML.
After a few weeks you'll skim thru TEI like you skim thru plain text. 
(Use an editor that highlights the tags and use a low contrast color for 
the tags.)
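
For anyone without such an editor, a rough sketch of a fallback in Python: drop the markup and skim the paragraphs that remain. It assumes the paragraphs are plain `p` elements with no namespace; the function name and the sample markup are made up for illustration and are not part of any PG tool.

```python
import xml.etree.ElementTree as ET


def skim_paragraphs(xml_source: str) -> list[str]:
    """Return the text of each <p> element with tags removed.

    Assumes un-namespaced <p> elements; adjust the tag name otherwise.
    """
    root = ET.fromstring(xml_source)
    paragraphs = []
    for p in root.iter("p"):
        # itertext() yields the text inside <p>, including nested elements.
        raw = "".join(p.itertext())
        # Collapse line breaks and runs of spaces so each paragraph is one line.
        paragraphs.append(" ".join(raw.split()))
    return paragraphs


# Hypothetical sample, only loosely TEI-like.
SAMPLE = (
    "<text><body>"
    "<p>Call me <hi>Ishmael</hi>.</p>"
    "<p>Some years\n ago.</p>"
    "</body></text>"
)

if __name__ == "__main__":
    for line in skim_paragraphs(SAMPLE):
        print(line)
```

This only helps with reading; corrections still have to be made in the marked-up source.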
...
And it may not be the way software development works, but then we're not
a software development project.
But you depend on software. DP is 250,000 lines of code. If it were not 
for software you wouldn't have much to do.
...
that's not the problem. If the process we agree for teixlite is, say, run
it through Saxon, then I expect to be able to run all teixlite files 
through Saxon, and not have a submitter say "oh, no, you must use Xalan for
this file, and not just any Xalan, but one with my patch in it."
You have to use PGTEI stylesheets to convert PGTEI text. You can use 
them with any XSLT 1.0 compliant processor.
...
You see, we appear to differ very fundamentally on one point. It's
my lock and key analogy again. I do not want to start down the road
of producing posted files from an XML if the transform, will be, for
any reason, not repeatable in a year's time, or five, or ten.
This amounts to the same as: never start at all. Remember: the first 
files were uppercase ascii. We *had* to do them over again. We *are* 
doing all pre-10K texts over again. We *will* have to do the TEI files 
over again, maybe more than once. That's only being realistic.
...
I do
not want to start down the road of producing posted files from XML
if an end-user who wants to -- on whatever platform -- cannot 
replicate the process.
Then you should also post all the scanned pages so a user can redo the 
OCR on her platform if she wants to.

I think we can postpone this, because the user can grab the converted 
files. And if converting at home is an issue with him, hey!, the tools 
are Free Software. He can change them until they work on his platform 
and submit the patches to us.
...
...
For the start I will act as interim Post-Processor for people wanting to 
post PGTEI and pass on to you only the perfectly good ones. You'll just 
have to stick in the etext number where I put 5 asterisks.
No; I, at least, don't want to work with an experimental process in which
each text is an exception.
Is there some qualifying exam to become a whitewasher?

I ask, because by now I'm so desperate that I'm quite willing to become 
a whitewasher myself just to see some TEI texts posted.
...
Why can't we just name them .xml? I see no reason to invent extensions.
_Is_ there one? Not that it matters much, just curious why you would 
think this a good idea.
Because there ain't such a thing as an XML file. XML is just a framework 
for building applications. XHTML is an XML application, SVG is an XML 
application, TEI is an XML application, OpenOffice file format is an XML 
application ...

Labelling a file .xml is like labelling a Word file .bytes
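
To illustrate: a minimal sketch in Python of how a program could tell which XML application it has been handed, given that `.xml` says nothing. It looks at the root element and its namespace. The lookup table and function name are assumptions made for this example; the table is deliberately tiny and lists only the XHTML and SVG namespaces and the un-namespaced `TEI.2` root element used by TEI P4.

```python
import xml.etree.ElementTree as ET

# Root element (in ElementTree's "{namespace}name" form) -> application.
# Illustrative only; a real table would need many more entries.
KNOWN_ROOTS = {
    "{http://www.w3.org/1999/xhtml}html": "XHTML",
    "{http://www.w3.org/2000/svg}svg": "SVG",
    "TEI.2": "TEI",
}


def xml_application(xml_source: str) -> str:
    """Guess the XML application from the document's root element."""
    root = ET.fromstring(xml_source)
    return KNOWN_ROOTS.get(root.tag, "unknown")


if __name__ == "__main__":
    print(xml_application('<html xmlns="http://www.w3.org/1999/xhtml"/>'))
    print(xml_application("<TEI.2/>"))
```

A descriptive extension saves every tool from having to open and parse the file just to find this out.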

-- 
Marcello Perathoner
webmaster@gutenberg.org