
al said:
from what I've read here the past few weeks, I think you've taken on a project that's more than your current skills/experience will let you handle easily.
well, al, i underestimated james based on that, too. so i can't say i blame you. but i found i was wrong. his book showed that he possesses lots of expertise.
james has done a ton of work to learn his way around. he is kind of "google-smart", rather than experienced, but google-smart's a good way to get yourself started.
i mentioned his strengths in a recent post, and forgot to include that he obviously has a love for digitizing, and -- deeper still -- both love and respect for books. and that's a great cornerstone on which you can build.
i also forgot he built himself his own scanning station.
so, as for his "shortcomings", i think it boils down to 2.
the first is that he's "frugal", as in cheap.
the second is an irrationality about open-source, which could probably be related to being cheap, except that irrationality can't be spun as a virtue.
(but, of course, such irrationality is fairly common with open-source advocates, which unfortunately mocks the intelligence bedrocking the philosophy.)
the combination of these two things leads james to value his time at a very low worth, and that is sad...
the proof of this is that, even after he was told -- point-blank, by the d.p./p.g. infrastructure, as he reports in his book -- that tesseract is worthless, he _continued_ to use it for his following projects.
as he put it in his book:
since my original page images were of poor quality (due to the age of the book and inadequate lighting) it was difficult to convince anyone that ABBYY Fine Reader would not have done a better job on the OCR than Tesseract. It is possible it would have.
read that again:
It is possible it would have.
there are two ways to interpret that. the first is that he hasn't done the research to verify it. well, hey, james, if that's the case, research it! (or else take the word of the people who did.)
the second, more troubling, interpretation is "they say it is so, but i question their wisdom." and _that_ boils down to a sheer irrationality bordering on pure stupidity, akin to "they say the earth is round, but i don't know about that."
there's no question about the fact of the matter, james, and you need to face that fact head-on: abbyy finereader does better o.c.r. than tesseract. out of 100 books, finereader will do 95-99 better. i wish it wasn't so lopsided. but that's what it is...
***
so, al...
those are the two "shortcomings" i see in james -- hey, i'm sure he can see a lot more in me! -- :+) but neither of them affects his ability to pull off a digitization of this book. he has the tenacity...
i agree that his inability to see the difficulty that he was causing himself by rewrapping the text _before_ he did the proofing was quite glaring, but i think he now sees the error of the practice. plus i hope he now knows the truth of tesseract.
so i think we need to give him full credit now...
(as if my "credit" will buy him anything!) ;+)
***
james, the other "problem" i had with your book has nothing to do with the book per se, but with your co-author -- rebecca "webchick" malamud.
back when i was trying to communicate with the internet archive people, she was working there...
at first, she was all _nice_ and everything to me, but once she learned that i wasn't going to just "go away" after lodging my complaints, she got rather mean and even did some attacks on me...
so until she apologizes to me, i don't like her...
that, of course, doesn't have to have any bearing on _our_ relationship... unless you want it to...
but it sounds like she might not be a close friend.
(if she is, you should ask her if she has finereader, and if she'll be willing to do some o.c.r. for you.)
***
more to come shortly, as we get back to the work...
-bowerbird

Bowerbird,
I am cheap. I'll cop to that. I do mention ABBYY Fine Reader in the book and list its advantages, but actually Tesseract has worked pretty well for me on the following books:
The Big Book Of Aviation For Boys
The Big Sleep
Benchley Beside Himself
plus a manuscript I typed up thirty years ago and am now revising for publication. It was less good for Ancient Manners, absolutely useless for the book we're doing now, and I used OCR from archive.org for the others.
I agree that spending money to make the work go faster is always an option. My book was written for people with more time than money. I was deliberately looking for cheap options. As a professional programmer I use very expensive gear and bitch about it not being better. At home I use reconditioned computers and Linux and am happy. My programming book was for people who had no choice but to use cheap equipment.
I am very interested in ways to get better OCR data from archive.org, as has been discussed in other emails. The two books I did with the existing lousy OCR actually went pretty well.
I have given page scans of six or so books to PG and PG Canada so they can do their own OCR and their own proofing. I don't expect to live long enough to see the e-books resulting from these donations. Three of these were books by Raymond Chandler! Who wouldn't want to proof those? So I'm dubious about other people scanning my page images.
I sell my books and some PD titles on the Kindle Store. I've made a grand total of twelve dollars doing that. Clearly, what I'm doing will always be just a hobby. If I ever do another Purana (archive.org has a bunch of them) I'll consider getting ABBYY Fine Reader. That does not seem likely.
For the record, Rebecca Malamud is not someone I know that well. I met her when I was trying to figure out how to write an Activity to download books from archive.org to XO laptops for OLPC. I came up with something that really simplifies the process and she helped. She copied some stuff the RDC had written for another project as a chapter for the book. I ended up rewriting most of it. She did, however, do a few really good things:
1). Got a talented young artist to do original art for my book.
2). Showed me a good way to put fancy dropcaps in my pages.
3). Published a limited edition (50 printed copies) of the book for contributors to her mentoring project. The book was laid out by one of her other mentees.
James Simmons
On Tue, Jan 17, 2012 at 1:41 PM, <Bowerbird@aol.com> wrote:
al said:
from what I've read here the past few weeks, I think you've taken on a project that's more than your current skills/experience will let you handle easily.
well, al, i underestimated james based on that, too. so i can't say i blame you. but i found i was wrong. his book showed that he possesses lots of expertise.
james has done a ton of work to learn his way around. he is kind of "google-smart", rather than experienced, but google-smart's a good way to get yourself started.
i mentioned his strengths in a recent post, and forgot to include that he obviously has a love for digitizing, and -- deeper still -- both love and respect for books. and that's a great cornerstone on which you can build.
i also forgot he built himself his own scanning station.
so, as for his "shortcomings", i think it boils down to 2.
the first is that he's "frugal", as in cheap.
the second is an irrationality about open-source, which could probably be related to being cheap, except that irrationality can't be spun as a virtue.
(but, of course, such irrationality is fairly common with open-source advocates, which unfortunately mocks the intelligence bedrocking the philosophy.)
the combination of these two things leads james to value his time at a very low worth, and that is sad...
the proof of this is that, even after he was told -- point-blank, by the d.p./p.g. infrastructure, as he reports in his book -- that tesseract is worthless, he _continued_ to use it for his following projects.
as he put it in his book:
since my original page images were of poor quality (due to the age of the book and inadequate lighting) it was difficult to convince anyone that ABBYY Fine Reader would not have done a better job on the OCR than Tesseract. It is possible it would have.
read that again:
It is possible it would have.
there are two ways to interpret that. the first is that he hasn't done the research to verify it. well, hey, james, if that's the case, research it! (or else take the word of the people who did.)
the second, more troubling, interpretation is "they say it is so, but i question their wisdom." and _that_ boils down to a sheer irrationality bordering on pure stupidity, akin to "they say the earth is round, but i don't know about that."
there's no question about the fact of the matter, james, and you need to face that fact head-on: abbyy finereader does better o.c.r. than tesseract. out of 100 books, finereader will do 95-99 better. i wish it wasn't so lopsided. but that's what it is...
***
so, al...
those are the two "shortcomings" i see in james -- hey, i'm sure he can see a lot more in me! -- :+) but neither of them affects his ability to pull off a digitization of this book. he has the tenacity...
i agree that his inability to see the difficulty that he was causing himself by rewrapping the text _before_ he did the proofing was quite glaring, but i think he now sees the error of the practice. plus i hope he now knows the truth of tesseract.
so i think we need to give him full credit now...
(as if my "credit" will buy him anything!) ;+)
***
james, the other "problem" i had with your book has nothing to do with the book per se, but with your co-author -- rebecca "webchick" malamud.
back when i was trying to communicate with the internet archive people, she was working there...
at first, she was all _nice_ and everything to me, but once she learned that i wasn't going to just "go away" after lodging my complaints, she got rather mean and even did some attacks on me...
so until she apologizes to me, i don't like her...
that, of course, doesn't have to have any bearing on _our_ relationship... unless you want it to...
but it sounds like she might not be a close friend.
(if she is, you should ask her if she has finereader, and if she'll be willing to do some o.c.r. for you.)
***
more to come shortly, as we get back to the work...
-bowerbird
_______________________________________________
gutvol-d mailing list
gutvol-d@lists.pglaf.org
http://lists.pglaf.org/mailman/listinfo/gutvol-d
participants (2)
- Bowerbird@aol.com
- James Simmons