gardner said:
Where was that Sourceforge project again?
there is no sourceforge project. my source code has never been open-source. my enemies will need to write their own code.
I know you've talked about tools that do more/better checking than Gutcheck
actually, i don't think i've ever compared my stuff with any other software directly, because any tool is better than no tool. gutcheck has some charms. my tools do different things, and do things differently, but whether they do "more" or "better" is not an issue.
and have automated fixing and such.
some have, yes. it's also important to remember that i do lots of experimentation, with quick-and-dirty code that serves to test the usefulness of a particular feature, but which might never be implemented further, perhaps because it doesn't prove to be worthy, or because the generalized code would take more time than i can give, or simply because that task just hasn't been done yet...
I would like to try them out. Where can I get my hands on this stuff?
i'll be happy to send you a copy, gardner, since you are an independent producer -- that's my target sweetspot. until i release the program generally, which might be very soon but also might not, you'll have to agree not to distribute the app any further, since i want to know who has it so that i can engage them in dialog about it, but that's the only restriction at this point.
you'll also need to tell me what version you want -- mac or p.c. or linux. your signature-block screams out linux, which is fine, but you'd be one of my first linux users, so if you want the more-well-tested windows version, say so.
finally, please give a short description -- frontchannel -- of your _current_ workflow. how do you do your books? do you use an editor, or some other tool? use gutcheck? what kind of preprocessing do you do on the raw o.c.r.? if you need to view a scan to check the text on some page, how do you do that? how do you find errors, with reg-ex?, or via a word-by-word proof of every page? anything else?
if anyone else wants to get a copy of my program, say so, either frontchannel or back. the same conditions apply...
also, if you're interested, you should check out don's app:
http://code.google.com/p/dp50/downloads/list
his tool is similar in many ways, and you might like it too.
-bowerbird
On Thu, Feb 18, 2010 at 01:47:26PM -0500, Bowerbird@aol.com wrote:
if anyone else wants to get a copy of my program, say so, either frontchannel or back. the same conditions apply...
I believe I've already said so, for Linux, at least twice. Each time I was told I'd have to join some "Yahoo!" listserv, which is too far to go for an unproven piece of software... I generally try pretty hard to keep my information out of the clutches of Yahoo!
On 18-Feb-2010 13:47, Bowerbird@aol.com wrote:
actually, i don't think i've ever compared my stuff with any other software directly, because any tool
Perhaps not, but over time you have described checks that your tools can do and fixes that you can automatically make that sound a little to me like a super-duper gutcheck. Also the workflow I picture is a little like gutcheck -- I am thinking of text-in text-out command line tools, not something that needs to look at image scans or makes me talk to it in a fancy U/I. This is perhaps an inaccurate impression I have. If the comparison is totally inappropriate, I'm sorry.
you'll also need to tell me what version you want -- mac or p.c. or linux. your signature-block screams out linux,
Probably something that would run in FreeBSD would be most useful -- a Linux build would, I think. Windows would be fine too.
finally, please give a short description -- frontchannel -- of your _current_ workflow. how do you do your books?
This is still fairly accurate:
http://www.gutenberg.org/wiki/Gutenberg:Volunteers%27_Voices#Gardner_Buchana...
...although I have a nicer flatbed scanner now. I used to always page-by-page scan, OCR and first-proof books that I was doing from physical copies. The last couple of books I've done instead by scanning, bulk OCR and then proofing from the scans and raw OCR text, which I can do on the road with my laptop or anywhere I can mount a USB key for a couple of hours.
After OCR I have a few basic things that I do via regular expressions in vi: I find and fix spaced punctuation, and find and fix M-dashes. If there are any obvious, consistent scannos -- the Heavysege item I just finished had Ys that looked to Finereader more like Vs, for example -- I will have a crack at finding those. I have been known to write a one-off perl script to get at something that bugs me enough.
The thing is that I do not have a specific set of checks and fixes that I consistently do. I rely a lot on jeebies and gutcheck. I would like something perhaps with a wider range of things that it can find so I don't have to know all the things to look for. Over the years you have mentioned several automated checks and fixes that sounded sensible enough to me. I'm not keen enough to go back through the archives, find them and implement them -- but I am nevertheless interested in trying a tool like this out on a project to see if it adds value for what I do.
Heck, you can grab http://www.gutenberg.org/dirs/3/1/2/1/31212/31212-8.txt and just tell me what you find. I have no doubt there is lots to find.
============================================================
Gardner Buchanan <gbuchana@teksavvy.com>
Ottawa, ON
FreeBSD: Where you want to go. Today.
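
For illustration only, the two regular-expression passes mentioned in the message above (spaced punctuation and M-dashes) might look roughly like this as a text-in, text-out script. This is a minimal sketch, not Gardner's actual vi commands and not part of gutcheck or any tool discussed in this thread. The function names are hypothetical, and it assumes that M-dashes are typed as two or more hyphens and should end up as a closed-up "--".

```python
import re

# Assumption: "spaced punctuation" means stray spaces before , ; : ! ?
# The period is left out on purpose so spaced ellipses (". . .") survive.
SPACED_PUNCT = re.compile(r"(?<=\S) +([,;:!?])")

# Assumption: an M-dash is two or more hyphens between word characters,
# possibly padded with spaces. Rules made only of hyphens are not matched.
SPACED_DASH = re.compile(r"(?<=\w) *-{2,} *(?=\w)")


def fix_spaced_punctuation(line: str) -> str:
    """Remove spaces that sit directly before closing punctuation."""
    return SPACED_PUNCT.sub(r"\1", line)


def normalize_dashes(line: str) -> str:
    """Close up a dash between two words and write it as exactly '--'."""
    return SPACED_DASH.sub("--", line)


def report(lines):
    """Yield (line_number, original, fixed) for each line a fix would change."""
    for number, line in enumerate(lines, start=1):
        fixed = normalize_dashes(fix_spaced_punctuation(line))
        if fixed != line:
            yield number, line, fixed


if __name__ == "__main__":
    sample = ["He paused ; then spoke -- slowly .", "Nothing to fix here."]
    for number, before, after in report(sample):
        print(f"{number}: {before!r} -> {after!r}")
```

Reporting proposed changes rather than rewriting the file in place keeps a human in the loop, which matters because either pattern can misfire on verse, tables, or deliberately spaced typography.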
participants (3): Bowerbird@aol.com, Gardner Buchanan, Joey Smith