
On 02/05/2011 07:21 PM, Jim Adcock wrote:
> Calibre is very slow and takes a fair amount of work to use in practice.
Calibre might be a good choice for the end user. There are a lot of knobs you can turn. For mass conversion it is too fat and too slow, and the knobs won't help you unless you can find a setting that works for all books. And good luck with that.
> It converts the input file format to its own internal format, and then to the designated output file format, and in the process seems to apply a bunch of heuristics and assumptions which don't seem to work out very well in practice for me. For example if I send a large set of Unicode code points into it, a smaller set of Unicode code points comes back out of it -- which I don't understand. I would have thought that "Unicode is Unicode" and that Calibre would pass it through unmolested.
Not necessarily. Many accented characters have both a precomposed and a decomposed form. If you use the precomposed forms, you'll use a lot more distinct codepoints (one for each accented character), while in the decomposed form you'll use one codepoint for each base character stripped of its accent and one for each accent. A converter that normalizes to the decomposed form will therefore emit a smaller set of distinct codepoints than it received, even though the text is equivalent.

The decomposed form might get better results on the limited fonts you find in ereaders. OTOH, precomposed chars as a rule look better than decomposed chars because the font designer can put the accent in the exact right place, while the decomposed accents get placed algorithmically (and YMMV).

-- 
Marcello Perathoner
webmaster@gutenberg.org
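
To illustrate the precomposed/decomposed point, here is a minimal sketch using Python's standard unicodedata module. The sample string and the variable names are made up for illustration, and nothing here claims to show which normalization Calibre actually applies; it only shows how the set of distinct codepoints can shrink while the text stays canonically equivalent.

```python
import unicodedata

# Hypothetical sample: 14 accented letters built on the bases a, e, o.
sample = "àáâãäèéêëòóôõö"

# NFC = precomposed form, NFD = decomposed form (base letter + combining accent).
nfc = unicodedata.normalize("NFC", sample)
nfd = unicodedata.normalize("NFD", sample)

# Count the distinct codepoints used by each form.
distinct_nfc = set(nfc)
distinct_nfd = set(nfd)

print("precomposed:", len(nfc), "chars,", len(distinct_nfc), "distinct codepoints")
print("decomposed: ", len(nfd), "chars,", len(distinct_nfd), "distinct codepoints")

# List the decomposed set by name: three base letters plus five combining accents.
for ch in sorted(distinct_nfd):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# The two forms are canonically equivalent: recomposing gives the original back.
print("round trip ok:", unicodedata.normalize("NFC", nfd) == nfc)
```

For this sample the precomposed text uses 14 distinct codepoints and the decomposed text only 8 (a, e, o plus grave, acute, circumflex, tilde and diaeresis), although the decomposed string is twice as long.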