
greg said:
Not all input books or types can be reasonably accurately converted to any possible format, especially for the older titles with no well-formed & valid HTML version.
as i've said for years now, with a small commitment from you to consistent formatting, i could take plain-ascii files as input, automatically apply the typographic niceties that are expected, and output the results to .pdf and to .html, such that the .html can be converted to a large number of other auxiliary formats.

of course, i'm not unique. david moynihan has done it for years. david was willing to make a small commitment to edit the files himself so as to obtain that consistent formatting. i think it is more important to teach you how to fish than to give you fish.

check with 3 tool-makers from distributed proofreaders -- thundergnat, donovan, and bill flis -- and they'll confirm that a clear path for ascii-to-(x)html conversion is quite workable -- due to the fact that d.p. now has the required consistency -- even with their current programs, and that if they worked on it a bit more, they could make it into a regular part of the workflow. there is no need for the more-complex switch to a .tei workflow.

-bowerbird

p.s. if you only would have accepted moynihan's offer of his files when he made it to you, you'd already _have_ a consistent library.
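
to make the ascii-to-html idea concrete, here is a minimal sketch in python. it is an illustration only, not the code of any tool named above, and every convention in it is an assumption: paragraphs are separated by blank lines, `_word_` marks italics, `--` stands for a dash, and straight double quotes come in matched pairs within a paragraph. the function names `typographic_niceties` and `ascii_to_html` are hypothetical.

```python
import html
import re


def typographic_niceties(s):
    """Apply a few typographic upgrades to one already-escaped paragraph."""
    # assumed convention: "--" stands for an em dash
    s = s.replace("--", "\u2014")
    # assumed convention: straight double quotes are paired within a paragraph
    s = re.sub(r'"([^"]*)"', "\u201c\\1\u201d", s)
    # assumed convention: _word_ marks italics
    s = re.sub(r"_([^_]+)_", r"<i>\1</i>", s)
    return s


def ascii_to_html(text):
    """Turn consistently formatted plain text into a run of <p> elements."""
    # blank lines separate paragraphs
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p]
    out = []
    for p in paragraphs:
        # rejoin hard-wrapped lines into one flowing line
        p = " ".join(p.split())
        # escape &, < and > first; leave quotes alone for the curling step
        p = html.escape(p, quote=False)
        out.append("<p>" + typographic_niceties(p) + "</p>")
    return "\n".join(out)


if __name__ == "__main__":
    sample = 'he said "no" -- twice.\n\nit was _very_\nodd & loud.'
    print(ascii_to_html(sample))
```

the point of the sketch is that each rule is a one-liner only because the input is consistent. if the underscores or blank lines are used inconsistently, the rules misfire, which is why the small commitment to formatting has to come first.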