
Bowerbird,

This looks pretty neat. There are a couple of points to mention, though.

1) I don't have corrected pages. I started doing the page-at-a-time thing and gave up. Your pages are already better than mine because I used Tesseract and archive.org uses ABBYY FineReader.

2) This book really requires a way to enter UTF-8 characters. If I could just stick a circumflex above a's, u's, and i's (both lower and upper case), that would be 99% of what I need. That's why I use jEdit: there is a plugin that makes a docked window for entering these characters.

As you can see, the OCR actually worked pretty well. Most of what I'm doing now (after de-hyphenating, joining split paragraphs, and re-wrapping everything) is putting in those circumflexes.

James Simmons

On Wed, Dec 21, 2011 at 4:14 AM, <Bowerbird@aol.com> wrote:
for want of better text, i've now loaded in the original o.c.r. for the "book of james".
this isn't the edit interface, it's just a skeleton to display the o.c.r./scan combo for each page.
resizing your browser-window resizes the scan.
so james, if you want me to pursue this at all, send me your corrected files. otherwise, i will just be done with it.
i'd love to see an xhtml tutorial on this book...
***
roger said:
yes! roger's back! :+)
different users with different browsers.
iphone, no. ipad, also no, unfortunately.
doesn't appear to size itself very well...
http://z-m-l.com/misc/cinema-screen-shot.png
lotta unused space on my 23-inch screen.
anyway, all you people who bellyache about "open source" need to step up to the plate... go over and help roger code something cool. show that you really _deserve_ open source...
***
don said:
Is this closer?
i don't think so. i'd say you're getting colder. but i _will_ read that story about the turnip...
***
keith said:
elves
aha! that's great. i didn't think the gerbils would have enough manual dexterity... but _elves_ are very handy with their... hands...
Now, if I could only get them to program.
oh, they don't have to be able to code at all.
just tell them to run these commands:
pandoc -i criterion.html -o markdown.md
pandoc -i criterion.html -o rst.rst
pandoc -i markdown.md -o markdown.html
pandoc -i rst.rst -o rst.html
if criterion.html and markdown.html and rst.html are not "sufficiently similar", let me know why not.
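one rough way to put a number on "sufficiently similar" is to strip the markup from each page and compare the words that are left. the sketch below is only an illustration using python's standard library, not part of the test as posed: the names `visible_words` and `similarity` are made up here, and whatever cutoff counts as "similar enough" is left to the reader.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document, ignoring the markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Note: this also receives the contents of <script> and <style>,
        # which is acceptable for simple book pages but not for web apps.
        self.chunks.append(data)


def visible_words(html):
    """Return the whitespace-separated words found in the text of `html`."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with a space means a word split by inline markup
    # (e.g. wor<b>ld</b>) is counted as two words.
    return " ".join(parser.chunks).split()


def similarity(html_a, html_b):
    """Ratio in [0, 1] of how closely the word sequences of two pages match."""
    matcher = SequenceMatcher(
        None, visible_words(html_a), visible_words(html_b), autojunk=False
    )
    return matcher.ratio()
```

to use it, read criterion.html, markdown.html and rst.html into strings and call `similarity` on each pair. it only looks at the words, so it will not notice lost headings, italics, or lists; those would need a comparison of the tags as well.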
***
carlo said:
djvutxt --page=123 djvufile.djvu 123.txt
ok, i get it now. it's on _my_ machine. i thought you issued it on archive.org.
i can't thank you enough for that, carlo; this ends a years-long headache for me.
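for a whole book, the same command can be repeated once per page. this is a minimal sketch, assuming djvutxt (from djvulibre) is installed on the local machine and that the page range is already known; the function names are hypothetical, and the output naming simply copies the `123.txt` pattern from carlo's command.

```python
import subprocess


def djvutxt_command(djvu_path, page):
    """Build the argv list for dumping one page, mirroring the quoted command."""
    return ["djvutxt", f"--page={page}", djvu_path, f"{page}.txt"]


def dump_pages(djvu_path, first_page, last_page):
    """Run djvutxt once per page; requires djvutxt on the local PATH."""
    for page in range(first_page, last_page + 1):
        # check=True stops at the first page that djvutxt fails on.
        subprocess.run(djvutxt_command(djvu_path, page), check=True)
```

calling `dump_pages("djvufile.djvu", 1, 10)` would write 1.txt through 10.txt into the current directory.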
-bowerbird
_______________________________________________
gutvol-d mailing list
gutvol-d@lists.pglaf.org
http://lists.pglaf.org/mailman/listinfo/gutvol-d