>Huh. Not obvious to me what he is doing in bookloupe.
What I meant was that it was not obvious to me what kind of coding approach was being used in bookloupe. What I did in gutcheck_u was a straightforward translation of the program from an 8-bit char program to a 16-bit widechar program. There were some places in gutcheck where valid glyph-range assumptions were hard-wired in for the 127-255 range – an amalgamation of code pages – and those needed to be changed to something more “reasonable”, assuming that people are using Unicode, and using it, hopefully, for some sensible reason. There were also a large number of implied-handedness tests for straight-quote, straight-double-quote and/or straight-apostrophe, and those tests all become somewhat easier and more sensible on the Unicode curly versions, which carry their handedness explicitly. In any case gutcheck_u attempts to check the sanity of both the straight and curly versions, whichever it finds, and if it finds a mix [at least within a paragraph], it should report that too. The straights tests are the same as before, assuming I didn’t accidentally break something.
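To illustrate the kind of mixed-quote check described above, here is a minimal sketch in Python. It is not the gutcheck_u code (which is C); the names `quote_styles` and `mixed_quote_paragraphs` are made up for this example, and it assumes paragraphs are separated by a blank line.

```python
# Hypothetical sketch, not taken from gutcheck_u: flag paragraphs that mix
# straight and curly quote characters.
STRAIGHT = {'"', "'"}
CURLY = {"\u2018", "\u2019", "\u201c", "\u201d"}  # single and double curly quotes


def quote_styles(paragraph):
    """Return the set of quote styles ('straight', 'curly') seen in a paragraph."""
    styles = set()
    for ch in paragraph:
        if ch in STRAIGHT:
            styles.add("straight")
        elif ch in CURLY:
            styles.add("curly")
    return styles


def mixed_quote_paragraphs(text):
    """Return 1-based numbers of the paragraphs that contain both styles.

    Assumption: paragraphs are separated by one blank line.
    """
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return [
        number
        for number, paragraph in enumerate(paragraphs, start=1)
        if quote_styles(paragraph) == {"straight", "curly"}
    ]
```

Given three paragraphs where only the second combines curly double quotes with a straight apostrophe, `mixed_quote_paragraphs` would report just that paragraph.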
Probably what gutcheck_u ought to have is some switches to say which of the Latin code pages are in use and which are not, so that the testing can be more reasoned. It should also probably do a better job of passing common non-alphas and querying uncommon ones, since there are a lot more non-alphas out there in the Unicode world, which PG submitters don’t actually use intentionally very often.
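A rough sketch of what such switches might look like, again in Python rather than C. Nothing here exists in gutcheck_u: the switch names, the `RANGES` table, and the `COMMON_NON_ALPHA` list are all assumptions chosen for illustration, and a real list of "common" characters would need to be agreed on.

```python
# Hypothetical sketch: none of these names or lists come from gutcheck_u.
# Each "switch" enables one block of accented Latin letters.
RANGES = {
    "latin1": range(0x00A0, 0x0100),       # Latin-1 Supplement
    "latin-ext-a": range(0x0100, 0x0180),  # Latin Extended-A
}

# Non-alphabetic characters assumed common enough to pass without a query.
COMMON_NON_ALPHA = (
    set(" \n\t.,;:!?()[]-'\"*_/&")
    | set("0123456789")
    | {"\u2018", "\u2019", "\u201c", "\u201d", "\u2014", "\u2026"}
)


def query_chars(text, enabled=("latin1",)):
    """Return the sorted list of characters that deserve a query."""
    allowed = [RANGES[name] for name in enabled]
    queried = set()
    for ch in text:
        cp = ord(ch)
        if ch.isalpha():
            # Letters pass if they are ASCII or fall in an enabled range.
            if cp < 0x80 or any(cp in block for block in allowed):
                continue
        elif ch in COMMON_NON_ALPHA:
            # Common punctuation, digits and whitespace pass silently.
            continue
        queried.add(ch)
    return sorted(queried)
```

With only the `latin1` switch on, a letter from Latin Extended-A or an unusual symbol such as the section sign would be queried, while ordinary accented letters, curly quotes and dashes pass.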
In any case I’m thinking that submitters are going to be wanting to submit curlies more and more often. The straights are beginning to look increasingly anachronistic.
Best. Jim A.