
BB>you know, the kind that jim labels "txt70", and lee calls "an impoverished text-file"...

And what is perhaps most surprising is that ZML is not in a form that the WWers would even be willing to accept.

BB>what these guys are never clear about is exactly _how_ volunteers are supposed to transform the _text_ files (of the type we mentioned up above, the type you end up with after your digitization) _into_ .html.

What I am not clear about is why BB insists that what one starts from must be "an impoverished text-file". I never work with text files per se until I am forced to derive one at the end of my html development, as a needless extra step, in order to get the PG WWers to accept my html work.

I do not start with "an impoverished text-file" for the simple reason that my OCR gives me better file format choices, which preserve more of the information available in the original page images. That way I do not have to rediscover and manually re-enter that information later, after needlessly throwing it away in the first place just to reduce the OCR result to txt70.

PS: I call it "txt70" for the simple reason that I wish to make clear that what PG insists one submit is not a text file in any normal sense, any more than ZML is a text file in any normal sense. At least ZML has the arguable advantage that it retains the original line breaks -- but I have shown how these can be easily rederived. And txt70 has a PG-specific requirement to put in manual line breaks at about every 70 chars, not to mention reimagining some of the standard ASCII code points as prosodic markers. PG'ers tend to spend so much time smelling their own roses that they forget that what they call a text file really isn't a text file, any more than the contents of an html file, or of a ZML file, is a text file.
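
To make the "manual line breaks at about every 70 chars" point concrete, here is a minimal sketch of hard-wrapping paragraphs at 70 columns and then undoing it. It is only an illustration using Python's standard `textwrap` module. The function names `to_txt70` and `unwrap` are hypothetical, the exact width and wrapping rules PG applies are assumptions, and this is not the method referred to above for rederiving original line breaks.

```python
import textwrap

def to_txt70(paragraphs, width=70):
    """Hard-wrap each paragraph at `width` columns (assumed value),
    separating paragraphs with one blank line."""
    wrapped = []
    for p in paragraphs:
        # Collapse existing whitespace so the wrap starts from one long line.
        one_line = " ".join(p.split())
        wrapped.append(
            textwrap.fill(
                one_line,
                width=width,
                break_long_words=False,   # never split a word in the middle
                break_on_hyphens=False,   # keep "text-file" on one line
            )
        )
    return "\n\n".join(wrapped)

def unwrap(text):
    """Reverse the hard wrap: rejoin the lines inside each
    blank-line-separated block into a single paragraph string."""
    return [" ".join(block.split()) for block in text.split("\n\n") if block.strip()]
```

The round trip shows that the inserted breaks carry no information of their own: `unwrap(to_txt70(paragraphs))` returns the same paragraphs (with whitespace normalized), so the 70-column layout is purely a presentation requirement.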