slightly off topic, first post, scanning

This is my first post to the list. I come here for advice not directly related to PG. I have a number of physical books that have had their spines cut off, and I need to scan them so they can be OCR'd and stored digitally. I am encountering two problems. All of the bulk page-feed scanners I have tried have one or both of the following issues:

1. Duplex scanning happens simultaneously, so extreme contrast changes (such as borders) on one side of the page show through (about 5%) to the other side.
2. Remnants of glue left on the edge of a page (from the removed spine) get stuck to the inside of the scanner or feeder, ruining the scans of subsequent pages until the scanner is cleaned.

Can anyone recommend a solution to either of these problems that can hopefully be applied automatically or in a batch operation? Reduction of human intervention/effort is my goal.
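For the show-through problem, one batch-friendly idea is to subtract an estimate of the reverse side from each scan. The sketch below is only an illustration under stated assumptions, not a tested remedy for any particular scanner: it assumes both sides are available as equally sized, already aligned grayscale images (lists of rows, 0 = black, 255 = white), that the reverse side appears mirrored left to right when seen through the paper, and that the leak is a fixed fraction of the reverse side's darkness (the 5% figure from the post). The function name `reduce_show_through` is hypothetical.

```python
def reduce_show_through(front, back, alpha=0.05, white=255):
    """Return a copy of `front` with an estimate of the reverse side removed.

    front, back: grayscale pages as lists of rows (0 = black, white = paper).
    alpha: assumed fraction of the reverse side's darkness that leaks through.
    """
    corrected = []
    for front_row, back_row in zip(front, back):
        # Assumption: the reverse side shows through mirrored left to right,
        # so flip each row of the back scan before lining it up.
        mirrored = back_row[::-1]
        row = []
        for f, b in zip(front_row, mirrored):
            leak = alpha * (white - b)  # darkness assumed to leak through
            # Brighten by the estimated leak, never exceeding paper white.
            row.append(min(white, round(f + leak)))
        corrected.append(row)
    return corrected


if __name__ == "__main__":
    # One-row toy pages: the black pixel on the back darkens the front's
    # first pixel slightly; the correction lifts it back toward white.
    print(reduce_show_through([[242, 255, 0]], [[255, 255, 0]]))
```

Real scans would also need registration between the two sides and a measured value for `alpha`; both are outside this sketch.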

On Sat, 13 Feb 2010, Sparr wrote:
Remnants of glue left on the edge of a page (from the removed spine) get stuck to the inside of the scanner or feeder, ruining the scans of subsequent pages until the scanner is cleaned.
I use a knife to cut the binding off rather than try to separate the pages. A plough knife is actually made for this. I've had pretty good results with a standard construction razor knife.

--
Greg Weeks
http://durendal.org:8080/greg/

FWIW, since I bought myself a digital camera for general use, plus a copy of Omniscan, my scanner has been pretty well idle. As it happens, the camera is a 12 megapixel model, but for most purposes I find it better to set it to 8 MP or even less. Also, for most books I set the mode to black and white.

For mass input it is best either to get a tripod as well, or to buy some sort of cheap plastic stand and mutilate it into a camera stand. I have been using a kindergarten table into the top of which I cut a camera-shaped hole with a hobby knife and due caution. Avoid buying a cheerfully coloured stand, because if you happen to need colour shots it can seriously affect the picture. The best is translucent white or grey, or possibly transparent. Grey or black are not too bad if illumination is no problem. Then it is just a matter of setting manual focus and clicking away till done. The table is very light and firm and I have had no problems with unsteadiness. Obviously one chooses a suitable surface to work on, so that glue and similar pollutants are not a consideration.

There are of course umpteen variations on the theme. You might prefer stands and clips to hold the objects erect. You might buy a second-hand camera economically, but do make sure that it will take a suitable memory module, the larger the better. SD cards are very good, especially if you have a USB card reader. I got one pretty cheap. The main regret is that I didn't get a mains adapter to power the camera while one was still available. As it stands I simply use rechargeable NiMH batteries of the right size. Remember: the power burden is much heavier than in most other photographic activities.

There are some definite advantages over the scanner, even though modern scanners are remarkably good. Fewer moving parts, for one. (Once you have the camera set up, it is only the button and the shutter that move!) Unless you have a scanner with an automatic feed, the speed is better too; plus, there are few books that you need to mutilate in order to photograph them.

Another luxury, though I have not in practice needed it, is that the camera can be set to various degrees of resolution. For most purposes very modest resolution is far more than adequate, but if you should need more than you can get from a single shot, then set it up to take only part of a page at a time, and you can magnify your material till the limiting factor is not the camera, but the quality of the printing.

Is my choice unusual in any way?

Jon

On 2010/04/14 13:44 PM, Greg Weeks wrote:
On Sat, 13 Feb 2010, Sparr wrote:
Remnants of glue left on the edge of a page (from the removed spine) get stuck to the inside of the scanner or feeder, ruining the scans of subsequent pages until the scanner is cleaned.
I use a knife to cut the binding off rather than try to separate the pages. A plough knife is actually made for this. I've had pretty good results with a standard construction razor knife.
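Jon's remark that modest resolution is usually more than adequate can be checked with a rough pixel-density estimate. The following is a minimal sketch and not something from the thread: the 4:3 aspect ratio, the 11 x 18 cm paperback page and the function name `pixels_per_inch` are assumptions made for illustration, and it supposes the page exactly fills the frame, which a real setup will only approximate.

```python
import math


def pixels_per_inch(megapixels, page_width_cm, page_height_cm, aspect=(4, 3)):
    """Estimate pixel density when one page just fits inside the frame."""
    total_pixels = megapixels * 1_000_000
    long_ratio, short_ratio = aspect
    # Frame dimensions in pixels, derived from pixel count and aspect ratio.
    long_px = math.sqrt(total_pixels * long_ratio / short_ratio)
    short_px = total_pixels / long_px
    # Page dimensions in inches, long side matched to the frame's long side.
    page_long_in = max(page_width_cm, page_height_cm) / 2.54
    page_short_in = min(page_width_cm, page_height_cm) / 2.54
    # Whichever side fits more tightly limits the usable density.
    return min(long_px / page_long_in, short_px / page_short_in)


if __name__ == "__main__":
    for mp in (12, 8, 5):
        ppi = pixels_per_inch(mp, 11, 18)  # hypothetical paperback page
        print(f"{mp} MP: roughly {ppi:.0f} pixels per inch")
```

Photographing half a page at a time, as Jon suggests for demanding material, halves both page dimensions in this model and therefore doubles the estimated density.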

Hi Jon,

Nope, I think you're part of a pretty popular movement there. There's a whole cottage industry of building home-made book scanners that consist of a jig to hold the book and a pair of digital cameras positioned to capture the two facing pages. Look at http://www.diybookscanner.org/

Personally, I still use a flatbed, but that's because I'm a Luddite.

On 14-Apr-2010 10:42, Jon Richfield wrote:
FWIW, Since I bought myself a digital camera for general use, plus a copy of Omniscan, my scanner has been pretty well idle. [...]
Is my choice unusual in any way?
============================================================ Gardner Buchanan <gbuchana@teksavvy.com> Ottawa, ON FreeBSD: Where you want to go. Today.

I bought a crappy digital camera to use "most of the time". It didn't even come with a manual - just a URL to download it.

But it did have instructions on how to scan a book.

Don

On Wed, Apr 14, 2010 at 6:07 PM, Gardner Buchanan <gbuchana@teksavvy.com> wrote:
Hi Jon,
Nope, I think you're part of a pretty popular movement there. There's a whole cottage industry of building home-made book scanners that consist of a jig to hold the book and a pair of digital cameras positioned to capture the two facing pages. Look at http://www.diybookscanner.org/
Personally, I still use a flatbed, but that's because I'm a Luddite.
_______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/mailman/listinfo/gutvol-d

Thanks Don,

I suppose I could have done the research to find that out myself, but it never occurred to me to do so. What you say shows that the concept is by now very routine. Hardly surprising; in university libraries and even surreptitiously in bookshops, energetic anti-Luddites are busy snapping away at books and articles.

Gardner,

Thanks for the URL. Many of the devices it illustrates certainly are impressive, but all demand more elaborate mechanisms than I have hitherto so much as considered. I am not saying that this disqualifies them from reasonable consideration, and certainly if I am working in poor light it is necessary to scrounge a reading light, but so far I have managed acceptably with a hand-held camera for a few pages at a time for corrections or bits of newsprint etc, and the mutilated polystyrene table for a stand. I have occasionally used a tripod, which can be more suitable for some purposes.

Another trick that is useful for more portable requirements is to use a stick such as a walking stick as a means of steadying the camera, a sort of monopodal tripod. It enables one to keep the camera steady enough for most purposes, and to control the distance well enough to use manual focus, which conserves battery power, gives more consistent results and increases speed.

If I had to do it again, I probably would have chosen something taller; my little table is only 40 cm high, and 60-80 cm would give less distortion and more even focus. It is no problem for little paperbacks, but large pages are not so good, so I have to work out something to raise the level. I haven't done much scanning lately, but soon I may consider carving up an inverted dustbin or something if I can't find a higher small table.

One thing I have not yet found is anything that I can use as a non-reflective overlay to flatten the pages without degrading the image or causing reflections. Picture glass doesn't work, unless there is a new grade that I don't know of.

Something I have not yet got round to obtaining or jury-rigging is one of those nice little cable-attached plungers, or better, a foot pedal for taking the snaps. After a few hundred pages, groping for the button is a nuisance.

Sparr,
> I have a few hundred thousand pages to scan, so a diy camera-style book scanner isn't appropriate, nor is a flatbed scanner. Thanks for the ideas, though.

You are welcome. I did wonder. Then I assume that you are using a mechanical feed scanner. If so, it is simply a matter of guillotining or otherwise amputating the gluey bits.

Someone once said something like: "If you can neither avoid it nor fix it, don't worry about it; it isn't a problem; it's reality." Or as they said long ago, "What can't be cured must be endured."

Now, I don't know your circumstances, so everything I say is highly context sensitive, and please don't bite me if I tell you obviosities that have nothing to do with your needs and constraints (not to mention tastes, as Gardner instanced). BUT if the material cannot reasonably be chopped or automatically handled, then it might be time to reconsider. How many pages per second . . . AVERAGE, INCLUDING dealing with jams and messes . . . does your automated glue-hating system read clean? If you cannot comfortably produce properly readable, OCRable pages at better than one per second, then you had better think of a few hundred thousand seconds.

One page per second, or one every two seconds, is in any case what a camera with a system like mine could give you, once you are up to speed. I have occasionally torn glued pages apart for photographic work, but for me that was no problem, so guillotining and trimming did not come into it. At eight hours per day, you should be able to capture more than 100000 pages per 5-day week. It certainly is not nice, but it beats a "faster" system that does not work, or at least does not work faster.

Just thoughts, together with the thought: "Sooner you than me!" ;-)

Go well folks,

Jon

On 2010/04/15 03:09 AM, don kretz wrote:
I bought a crappy digital camera to use "most of the time". It didn't even come with a manual - just a url to download it.
But it did have instructions on how to scan a book.
Don
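Jon's throughput estimate is easy to reproduce. The sketch below only restates his arithmetic: the 8-hour day and 5-day week come from his message, while the function names and the 300,000-page total (a stand-in for "a few hundred thousand") are assumptions made for illustration. At a sustained one page per second it gives 144,000 pages per week, in line with his "more than 100000"; at one page every two seconds it gives 72,000.

```python
def pages_per_week(pages_per_second, hours_per_day=8, days_per_week=5):
    """Pages captured in one working week at a sustained average rate."""
    working_seconds = hours_per_day * 3600 * days_per_week
    return pages_per_second * working_seconds


def weeks_needed(total_pages, pages_per_second, hours_per_day=8, days_per_week=5):
    """Working weeks required to capture `total_pages` at the given rate."""
    weekly = pages_per_week(pages_per_second, hours_per_day, days_per_week)
    return total_pages / weekly


if __name__ == "__main__":
    total = 300_000  # hypothetical stand-in for "a few hundred thousand pages"
    for rate in (1.0, 0.5):
        print(f"{rate} pages/s: {pages_per_week(rate):,.0f} pages per week, "
              f"{weeks_needed(total, rate):.1f} weeks for {total:,} pages")
```

The rate that matters is the average over a whole session, including time lost to jams and cleaning, which is the point of Jon's question about the feed scanner.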

I have a few hundred thousand pages to scan, so a diy camera-style book scanner isn't appropriate, nor is a flatbed scanner. Thanks for the ideas, though.

On Wed, Apr 14, 2010 at 10:42 AM, Jon Richfield <richfield@telkomsa.net> wrote:
FWIW, Since I bought myself a digital camera for general use, plus a copy of Omniscan, my scanner has been pretty well idle.
On Wed, Apr 14, 2010 at 9:07 PM, Gardner Buchanan <gbuchana@teksavvy.com> wrote:
Nope, I think you're part of a pretty popular movement there. There's a whole cottage industry of building home-made book scanners that consist of a jig to hold the book and a pair of digital cameras positioned to capture the two facing pages. Look at http://www.diybookscanner.org/
Personally, I still use a flatbed, but that's because I'm a Luddite.