Bowerbird,
james said:
ok. (i think.)
> I do have the book as 1 text page per file.
oh dear. that was a waste of time.
> I got it this way by downloading the page images,
> making TIFFs out of them, then running tesseract
are these the files after you made your corrections?
> I can give you the separate text files
> in a Zip archive if you wish.
if so, then yes, those are exactly the files that i need.
zip 'em up, and put it in your dropbox.
oh dear. more wasted time. oh well.
> My work method is to use guiguts to
> remove page numbers and reformat paragraphs first.
(also, removing pagenumbers is the _last_ thing to do.
they help let you be aware where you are in the book.)
yes, here's the correct one:
> The link you gave gives me a 404 error.
> http://z-m-l.com/go/bhaga/bhagap123.html
sorry about that...
i mean you do your corrections on the web...
> I'm not sure what you mean by online.
which means that other people can help you.
(at least if you give them the web-address.)
but if you prefer to work offline, you can do that.
i'm a mac person, james. we believe in a friendly interface.
> I thought you would
> provide a command line utility
only a sadist seeks to saddle you with command-line crap...
but first you have to get your text _into_ .zml format.
> that would convert ZML to the various formats.
"something like" that is a fairly accurate description.
> Were you thinking of something like DP uses?
my system isn't nearly as convoluted or bureaucratic.
-bowerbird
_______________________________________________
gutvol-d mailing list
gutvol-d@lists.pglaf.org
http://lists.pglaf.org/mailman/listinfo/gutvol-d