
james said:
I will set this project aside
nope, no time for that... :+) here's the next step, a list of some of the words that didn't pass the spellcheck. (might not be all of them, as i'm not certain i've handled all the encoding issues, but these are definitely some words that will not pass.)
this list of flagged words is ordered by _the_page_... first of all, as an overview, i don't know about you, but when _i_ see a list like that, i begin to get a feeling that i am the master of the o.c.r., rather than a victim of it... i'm no longer on a "hunt" for the errors, i have a _map_, so i know exactly where to find them and apply the fix... or, for some cases, to check them once (and for all) to make sure the name is spelled correctly, so i can ignore other cases (if it is) or do a global correction (if it's not).
that's where this particular list gives us another edge, namely that it has automatically culled a lot of names. it won't always get them all, especially in a book with american names, some of which are "good" words, like "will smith", "may brown", "mike cane", or "brook springs". but on a book like this one, it does a _very_ good job... and there's an added benefit on this book, because many of those names have diacritical marks on them, so being able to pull them out like this means that we can easily generate some global changes to fix them... so that will be one of your next jobs, james, to prepare such a list of global changes to the names listed there.
the matter of encodings also brings up another issue... at the top of this page of flags, there's a little "v?" link... you will see this on a lot of my pages, at top or bottom. if you click it, it will send the page to the "validator" site -- in a separate window -- to see if the page "validates". it's not that i care that much about "validation" (i don't), it's just that having a link like this one makes it painless.
anyway, if you click the validator link on this page, james, you'll see that it generates an error and a couple of warnings. the error is for a naked ampersand i'd "failed" to escape... (i did the others, but i left this one in, for an illustration.) the first warning is because it's html5, so you can ignore it. but the second warning is a troublesome one, which says:
Text run is not in Unicode Normalization Form C.
i don't know what that means, and i don't particularly care to have to track it down, so if you can figure it out, james, that would be dandy. the problem also occurs in your file, the one in your dropbox, so i think it's likely on your end. but there might also be parts of my workflow that are plain oblivious to utf8 issues, so i can't rule out that possibility.
but it probably has something to do with the way that you generate some utf8 characters, e.g., by "composing" them. i'm so ignorant of these issues that i don't even know when these characters _look_ right or wrong, so i can't help much. but if you can get your text to go through the w3c validator without generating a warning, that would be a good thing, and i will make sure that my apps don't screw anything up.
-bowerbird
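for reference, here is a minimal sketch of what that warning usually points at, assuming the text is utf8 and that some accented letters were stored "decomposed" (a plain letter followed by a combining mark) rather than as one precomposed character. normalization form c (nfc) is the precomposed form. the function names and the sample name below are hypothetical, made up for illustration; only python's standard `unicodedata` module is used.

```python
import unicodedata


def to_nfc(text):
    """Return text converted to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


def non_nfc_words(text):
    """List the whitespace-separated words that would change under NFC."""
    return [w for w in text.split() if to_nfc(w) != w]


# hypothetical sample: "r" + combining caron, "a" + combining acute
decomposed = "Dvor\u030ca\u0301k"
composed = to_nfc(decomposed)

# both strings display the same, but they are stored differently
print(len(decomposed), len(composed))
print(non_nfc_words("a name like " + decomposed + " gets flagged"))
```

running `non_nfc_words` over the whole file would give a list of the words to look at, and writing out `to_nfc(text)` instead of the original text is one possible way to make the warning go away, provided nothing later in the workflow decomposes the characters again.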