
brad said:
As has already been mentioned, ASCII is an encoding and plaintext is a format.
i fail to see how this distinction has any importance to the original point. the user wants the words free of markup.
And ASCII is being replaced with Unicode. Some decades from now ASCII will gradually go the way of the Dodo.
well, if you want to get into this kind of doubletalk -- which i don't because, as i just said, it has no importance -- then it is inaccurate to say that ascii is "being replaced" by unicode, since the bottom 128 characters of unicode are the same 128 ascii characters we've come to know. if we give the original poster a unicode-aware text-editor, and a file that contains no heavy markup, he will be happy. he wants the words, all the words, and nothing but the words.
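a minimal python sketch of that overlap, assuming python 3 (the variable names are just for illustration): it takes every 7-bit ascii byte value, 0 through 127, decodes the bytes once as ascii and once as utf-8, and checks that the text and the code points come out the same either way.

```python
# every 7-bit ascii byte value, 0 through 127
ascii_bytes = bytes(range(128))

# decode the same bytes under both encodings
as_ascii = ascii_bytes.decode("ascii")
as_utf8 = ascii_bytes.decode("utf-8")

# the decoded text is identical, and each character's unicode
# code point equals its ascii value
same_text = as_ascii == as_utf8
same_code_points = [ord(ch) for ch in as_utf8] == list(range(128))

# encoding back to utf-8 reproduces the original ascii bytes
round_trip = as_utf8.encode("utf-8") == ascii_bytes

print(same_text, same_code_points, round_trip)
```

so a pure-ascii file is already a valid utf-8 file, byte for byte, which is why a unicode-aware editor opens it without any conversion.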
As for plaintext, one of the core design goals for XML is that you'll be able to open it in any text editor and read it.
ok, and now here you seem to be trying to say that an x.m.l. file is a plain-text file. it's not. it might consist of nothing more than those 128 ascii characters, but it is decidedly not a plain-text file. the original poster knows it's not plain-text. so does michael hart. most people do. including, i suspect, you. why confuse the issue?
If a file is human readable when it's opened in a text editor then it's a type of plain text.
again, this subterfuge is dishonest. first, it's inaccurate to say that an x.m.l. file is "human readable". and second, it's misleading to say it is "a type of plain text". it might be an ascii file, but it's decidedly _not_ "plain-text".
All XML does is place tags around text in order to give the text a structure that machines can understand.
you give machines far too little credit. they can be made to be far smarter than a dirt-dumb x.m.l. processor, which can _only_ be made to "understand" the structure of text _if_ it is tagged.
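as a small illustration only, here is a python sketch of a program picking up structure from untagged text. it assumes one simple convention, that paragraphs are separated by blank lines; the function name `paragraphs` and the sample text are hypothetical, made up for this example.

```python
import re

def paragraphs(text):
    """split untagged text into paragraphs, using blank lines as the only cue."""
    # one or more blank lines end a paragraph
    blocks = re.split(r"\n\s*\n", text.strip())
    # rejoin hard-wrapped lines inside each paragraph with single spaces
    return [" ".join(block.split()) for block in blocks]

# made-up sample: a heading line, a wrapped paragraph, a short paragraph
sample = "chapter one\n\nit was a dark\nand stormy night.\n\nthe rain fell."
found = paragraphs(sample)
print(found)
```

this is nowhere near a full structure detector, but it shows the idea: the cue is already sitting in the plain text, with no tags needed.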
As long as you have a text editor, you'll be able to read XML.
let's give the original poster an x.m.l. file, and have _him_ say whether he is able to "read it". just because you can load a file into a text-editor doesn't mean you'll actually be able to figure out _how_ to edit the darn thing in the way you want. and _that_ is the real topic at hand here... these semantic games do nothing but cloud the discussion.
A good text editor can clean out all of the tags with a simple regular expression like "<.*[^>]*>".
ok, well at least now you're starting to talk about _issues_. but of course, you're glossing over the reality even here. the inference you are trying to get us to make is that "cleaning out all the tags" will convert an x.m.l. file into a plain-text file, magically. it won't. not in all cases anyway. not unless the x.m.l. file was created -- carefully -- with that specific conversion in mind. i've been writing a separate post that will give details on how this careful consideration and crafting must be done. (some hints: whitespace, quotemarks, and tables.)
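a quick python sketch of the whitespace problem, using the standard `re` module. the sample markup is made up for illustration, and the second pattern, `<[^>]+>`, is a common per-tag alternative swapped in here for comparison, not something from the quoted post.

```python
import re

# made-up sample: two paragraphs on one line, one inline tag
sample = "<p>call me <i>ishmael</i>.</p><p>some years ago...</p>"

# the quoted pattern is greedy: ".*" runs from the first "<" to the
# last ">" on the line, so the words between the tags vanish too
greedy = re.sub(r"<.*[^>]*>", "", sample)

# a per-tag pattern keeps the words, but the paragraph break lived
# only in the tags, so the two paragraphs run together
per_tag = re.sub(r"<[^>]+>", "", sample)

print(repr(greedy))   # ''
print(repr(per_tag))  # 'call me ishmael.some years ago...'
```

in neither case does stripping the tags give you the plain-text file the reader wanted; the break between paragraphs has to be put back by something that knows what the tags meant.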
Script languages like perl, python, ruby or any other language likely to come down the pike will be able to process XML and convert it into whatever comes along in the future.
it's telling how all of the hype about x.m.l. is in the present-tense, but when you focus down to particulars, it moves to future-tense. pay attention to this, lurkers! it's a sure sign of vapor-ware!
Very few applications render XML directly (except perhaps word processors), everyone else converts it into html, pdf or other formats for display.
ask yourself why this is the case. the answer is interesting.
SGML (XML's older sister) has been around for, what, twenty years or more? And all SGML documents are easily converted into XML. XML is simpler and designed to be around as an archive format for far longer than that.
in its day, s.g.m.l. made all the same promises as x.m.l. does now. it couldn't keep them, so s.g.m.l. people had to invent a variant, so they could regenerate all their hype from scratch and reuse it. and sure enough, the public is gullible enough to believe it all again. of course, the same difficulties that thwarted s.g.m.l. back in the day -- sabotaging all their hype -- will return and bite x.m.l. in the butt. but by the time we figure out how we've been had this time around, all the x.m.l. proponents will have carted off their consultant cash...
Most people will never know about the master version in XML, they only will see the file formats they use to read books.
they'll "know about" that x.m.l. version indirectly; it will be the reason their books are so expensive. due to all that cash those consultants carted away.
XML is only a long term and safe archive format
hype and marketing.
Once you understand that XML is just plain text, you can use any software for processing text to work with it.
you can save a spreadsheet in "plain-text" form too, and then "use any software for processing" that too. but you're going to find yourself coming up short. likewise when working with an x.m.l. file in a plain-text editor; yes, it can be done, but you will find yourself coming up short. but x.m.l. people will continue telling us this untruth, because they want us to believe that x.m.l. is really simple. but it's not.
As long as there is a text editor, an XML document will never be lost.
of course, if it ain't human-readable in that form, it doesn't really matter if it "will never be lost". it won't need to be "lost" once it has been "tossed"...
***
i will repeat: make x.m.l. work if you want us to respect it. don't come and _tell_ us how wonderful it will be; show us. the proof is in the pudding. not in the hype and marketing.
-bowerbird