
On Sat, 8 May 2010, Marcello Perathoner wrote:

> Andrew Sly wrote:
>> Would it be possible to run some kind of automated check on all files labelled ISO-8859-1, searching for characters in the 0x80 to 0x9F range?
>
> In theory yes. In practice I've found that there are very many mislabelled files, not always so simple a case as ISO vs. WIN.
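
For illustration, the check proposed above could be sketched roughly as follows. In ISO-8859-1 the bytes 0x80 to 0x9F are C1 control codes that almost never occur in ordinary text, while Windows-1252 uses most of them for printable characters such as curly quotes and dashes. The function name `find_c1_bytes` and the sample data are assumptions made for this sketch, not part of any existing tool, and a hit only flags a file for a human to look at: as noted, the real encoding is not always Windows-1252.

```python
def find_c1_bytes(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) for every byte in the range 0x80..0x9F.

    Hypothetical helper: a non-empty result suggests the file is not
    really ISO-8859-1, but does not say what the encoding actually is.
    """
    return [(offset, value) for offset, value in enumerate(data)
            if 0x80 <= value <= 0x9F]


# Made-up sample: valid ISO-8859-1 text followed by two Windows-1252
# curly quotes (0x93 and 0x94), which fall in the suspicious range.
sample = "na\u00efve".encode("iso-8859-1") + b" \x93quoted\x94"

for offset, value in find_c1_bytes(sample):
    print(f"offset {offset}: 0x{value:02X}")
```

Working on raw bytes rather than decoded text keeps the check independent of whatever label the file carries.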
I could believe that. The one that jumps to my mind is the Swedish Bible that I prepared for re-posting back in 2005. It looked like it had been prepared on a computer using one of the old DOS code pages, but it didn't quite match any standard that I could find. Possibly it had been mangled in a file transfer somewhere. I was able to work out what the correct characters should be, do some global search/replace, and repost it as ISO-8859-1.

--Andrew
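
A global search/replace of the kind described above might look something like this sketch. The mapping is purely hypothetical: the byte values are borrowed from code page 437 for the Swedish letters as an illustration, whereas the file discussed did not match any standard code page, so its real table had to be worked out by inspection. The names `REPLACEMENTS` and `remap_to_latin1` are likewise invented for this example.

```python
# Hypothetical mapping from source byte to intended character.
# Values are taken from CP437 for illustration only; the actual
# file's mapping is not known from this thread.
REPLACEMENTS = {
    0x84: "\u00e4",  # a with diaeresis
    0x86: "\u00e5",  # a with ring above
    0x94: "\u00f6",  # o with diaeresis
    0x8E: "\u00c4",  # A with diaeresis
    0x8F: "\u00c5",  # A with ring above
    0x99: "\u00d6",  # O with diaeresis
}


def remap_to_latin1(data: bytes, replacements: dict[int, str]) -> bytes:
    """Replace each mapped byte with its ISO-8859-1 equivalent.

    Unmapped bytes pass through unchanged.
    """
    table = bytearray(range(256))  # identity table to start with
    for source_byte, char in replacements.items():
        table[source_byte] = char.encode("iso-8859-1")[0]
    return data.translate(bytes(table))


fixed = remap_to_latin1(b"G\x94ra", REPLACEMENTS)
print(fixed.decode("iso-8859-1"))
```

Using a single translation table applies all substitutions at once, so the output of one replacement can never be picked up and changed again by a later one, which is a risk with a series of separate search/replace passes.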