>so you have to let the user specify;
>see how feedbooks.com does it...
The way “feedbooks.com does it” is to dumb down books to their txt-only subset, and then pretty-print that subset into a number of different nearly-txt-only formats. It is easy to find feedbooks examples where a book contains even a trivially non-txt element, such as an embedded quotation, and the feedbooks rendering gets it totally wrong.