
On Sat, Feb 5, 2011 at 10:21 AM, Jim Adcock <jimad@msn.com> wrote:
> For example, if I send a large set of Unicode code points into it, a smaller set of Unicode code points comes back out of it -- which I don't understand. I would have thought that "Unicode is Unicode" and that Calibre would pass it through unmolested.
I don't know exactly what you're seeing, but that's not unexpected in some cases. If it applied NFD and decomposed the characters, you would get base characters plus a set of combining characters out, which means fewer distinct code points even though the text itself gets longer. It could also translate special spaces to normal ones, or even impose NFKC or NFKD, which would normalize all sorts of characters. That's questionable, because it turns things like ² into 2, but it may be expected by some programs.

-- 
Kie ekzistas vivo, ekzistas espero.
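
To make the normalization effects above concrete, here is a minimal sketch using Python's standard unicodedata module. It only illustrates what NFD and NFKC do to a few characters; it is not a claim about what Calibre actually does internally, and the sample strings and the helper name distinct_code_points are invented for the example.

```python
import unicodedata


def distinct_code_points(text):
    """Return the set of distinct code points in text, as U+XXXX labels."""
    return {f"U+{ord(ch):04X}" for ch in text}


# Hypothetical sample: three precomposed accented letters plus their bases.
sample = "\u00e9\u00e1\u00ed" + "eai"

# NFD splits each accented letter into a base letter + U+0301 (combining acute).
decomposed = unicodedata.normalize("NFD", sample)

before = distinct_code_points(sample)
after = distinct_code_points(decomposed)

# The text gets longer, but the set of distinct code points shrinks.
print(len(sample), len(before))
print(len(decomposed), len(after))

# Compatibility normalization is more aggressive: superscript two becomes a
# plain digit and a no-break space becomes an ordinary space.
superscript_two = unicodedata.normalize("NFKC", "\u00b2")
plain_space = unicodedata.normalize("NFKC", "\u00a0")

# Canonical normalization (NFD) leaves superscript two alone.
untouched = unicodedata.normalize("NFD", "\u00b2")
```

Comparing the distinct code points of the input and output this way, and then trying each normalization form on the input, is one way to narrow down which transformation a tool is applying.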