[gutvol-d] Re: Calibre: Open Source Software for Managing eBook Collections

5 Feb 2011

      On 02/05/2011 07:21 PM, Jim Adcock wrote:
...
> Calibre is very slow and takes a fair amount of work to use in practice.

Calibre might be a good choice for the end user. There are a lot of knobs 
you can turn.

For mass conversion it is too fat and too slow, and the knobs won't help 
you any unless you can set them to a position that works for all books. 
And good luck with that.
...
> It converts the input file format to its own internal format, and then to
> the designated output file format, and in the process seems to apply a bunch
> of heuristics and assumptions which don't seem to work out very well in
> practice for me.  For example if I send a large set of Unicode code points
> into it, a smaller set of Unicode code points comes back out of it -- which
> I don't understand.  I would have thought that "Unicode is Unicode" and that
> Calibre would pass it through unmolested.

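Not necessarily. Many accented characters have both a precomposed and a 
decomposed form. If you use the precomposed forms, you'll use a lot more 
distinct codepoints (one for each accented character), while in the 
decomposed form you'll use one codepoint for each base character stripped 
of its accent and one for each accent. So if the text gets normalized to 
the decomposed form somewhere along the way, a smaller set of distinct 
codepoints comes out than went in.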
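
To make the counting concrete, here is a minimal Python sketch using the 
standard unicodedata module. It only illustrates the precomposed (NFC) vs. 
decomposed (NFD) arithmetic; it does not claim to show what Calibre 
actually does internally. The sample string and the helper name 
distinct_code_points are made up for the illustration.

```python
import unicodedata


def distinct_code_points(text, form):
    """Return the set of distinct code points in `text` after
    normalizing it to the given Unicode normalization form."""
    return set(unicodedata.normalize(form, text))


# Hypothetical sample: three base letters (e, a, o), each carrying
# an acute, grave, circumflex, and diaeresis accent.
sample = "éèêëáàâäóòôö"

nfc = distinct_code_points(sample, "NFC")  # precomposed forms
nfd = distinct_code_points(sample, "NFD")  # decomposed forms

# Precomposed: one code point per accented character.
# Decomposed: one per base letter plus one per combining accent.
print(len(nfc), len(nfd))
```

With this sample the precomposed set has 12 distinct codepoints and the 
decomposed set has 7 (3 base letters plus 4 combining accents). The text 
is canonically equivalent either way; only the set of codepoints differs.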

The decomposed form might get better results on the limited fonts you 
find in ereaders.

OTOH, precomposed chars as a rule look better than decomposed chars 
because the font designer can put the accent in the exact right place, 
while the decomposed accents get placed algorithmically (and YMMV).

-- 
Marcello Perathoner
webmaster@gutenberg.org