On Jan 25, 2012, at 3:28 AM, Marcello Perathoner wrote:
> The PG website already offers UTF-8 text only.
For a while I was sending up UTF-8 text only along with HTML. I stopped when it seemed that was causing a lot of work for the WWers. As of now, I send ASCII if that's sufficient, Latin-1 if it has characters in the Latin-1 set, and UTF-8 if it has characters not in Latin-1. There are two consequences: (1) everything that goes up in Latin-1 could go up in UTF-8 instead but doesn't, and (2) I don't send up plain text with curly quotes at all.

I work in UTF-8. It would be easiest for me to send up UTF-8. But the WW tools to check the submission, like gutcheck, struggle with UTF-8. The HTML I send up is UTF-8, and that survives because the WWers don't have to check it. They check the text file, which should be ASCII if it can be, and UTF-8 instead of Latin-1 only if there are characters that are absolutely necessary and not in Latin-1. Curly quotes are not viewed as necessary to the text version. Even an oe ligature isn't strong enough to justify UTF-8.

Seems to me it's all about the tools. The WWers always seem overloaded, and sending UTF-8 up makes that worse. If they had better tools to do their job, actually getting UTF-8 up on the PG website wouldn't be as problematic.

This is an outside view of the problem. I am not a WWer. I'd love to hear from a WWer about Marcello's comment. Though it is correct as stated, perhaps it would be more accurate to say "The PG website already offers UTF-8 text only, but please don't send it to us unless absolutely necessary."

I hope someday that statement will not be true, but I believe it is now.

--Roger
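
For anyone who wants to see the "ASCII if it can be, Latin-1 next, UTF-8 last" rule spelled out, here is a minimal Python sketch. It is only an illustration, not a PG or WW tool. The function names and the substitution table are assumptions made up for this example; the post says only that curly quotes and the oe ligature do not justify UTF-8, not what they get replaced with.

```python
# Hypothetical substitutions (an assumption, not a PG rule): characters the
# post says are not strong enough to justify UTF-8 in the text version.
DOWNGRADES = {
    "\u2018": "'",   # left single curly quote
    "\u2019": "'",   # right single curly quote
    "\u201c": '"',   # left double curly quote
    "\u201d": '"',   # right double curly quote
    "\u0153": "oe",  # lowercase oe ligature
    "\u0152": "OE",  # uppercase OE ligature
}


def downgrade(text: str) -> str:
    """Replace the 'not necessary' characters with plain equivalents."""
    for fancy, plain in DOWNGRADES.items():
        text = text.replace(fancy, plain)
    return text


def narrowest_encoding(text: str) -> str:
    """Return the first of ascii, latin-1, utf-8 that can encode the text."""
    for name in ("ascii", "latin-1"):
        try:
            text.encode(name)
            return name
        except UnicodeEncodeError:
            continue
    return "utf-8"
```

With this sketch, narrowest_encoding(downgrade(text)) gives the encoding a text file would go up in under the rule described above: a book with only accented Latin-1 letters lands in Latin-1, and one with Greek stays UTF-8 even after the substitutions.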