james said:
> I do have the book as 1 text page per file.
ok. (i think.)
> I got it this way by downloading the page images,
> making TIFFs out of them, then running tesseract
oh dear. that was a waste of time.
> I can give you the separate text files
> in a Zip archive if you wish.
are these the files after you made your corrections?
if so, then yes, those are exactly the files that i need.
zip 'em up, and put the archive in your dropbox.
> My work method is to use guiguts to
> remove page numbers and reformat paragraphs first.
oh dear. more wasted time. oh well.
(also, removing page numbers is the _last_ thing to do.
they help you keep track of where you are in the book.)
> The link you gave gives me a 404 error.
yes, here's the correct one:
> http://z-m-l.com/go/bhaga/bhagap123.html
sorry about that...
> I'm not sure what you mean by online.
i mean you do your corrections on the web...
which means that other people can help you.
(at least if you give them the web-address.)
but if you prefer to work offline, you can do that.
> I thought you would
> provide a command line utility
i'm a mac person, james. we believe in a friendly interface.
only a sadist seeks to saddle you with command-line crap...
> that would convert ZML to the various formats.
but first you have to get your text _into_ .zml format.
> Were you thinking of something like DP uses?
"something like" that is a fairly accurate description.
my system isn't nearly as convoluted or bureaucratic.
-bowerbird