
Jon Noring wrote:
In my system, the person only needs to read the PublisherPageID and enter that without having to figure out any letter prefixes -- this is easier and more reliable. It will also handle cases your (and Marcello's) system won't handle, such as backward-numbered pages and where page numbers are repeated (this example was actually brought up.)
Wrong. My system will handle backward-numbered and duplicated pages with a vengeance.

But before I demonstrate this, I want to say that your better software architect (me) will design a system that handles 99% of the cases in a simple, intuitive and straightforward manner, and not a system that handles 100% of the cases -- but only theoretically, because it is so incredibly complicated and awkward that nobody can use it.

Your system is incredibly complicated, awkward and fundamentally broken because you need too much information to successfully link to a page, and once the link is set it will not survive the slightest reorganisation of the files.

But now let me demonstrate how my system handles your abnormal cases with very little manual workaround.

BACKWARD-NUMBERED PAGES

1. Scan the book "backwards", i.e. starting with page 1. If you have a sheet-feeder this will take no more work than flipping the whole book over once.

2. Your scan software will save the pages as 1.tiff, 2.tiff etc., with every file containing the "real" page 1, 2, etc. You have instant feedback on the correctness of your scanning: if you put on page 314, your software should offer to save it to 314.tiff. If not, you know you have botched it and can go figure.

3. You run a perl-script that compresses 1.tiff to p0001.djvu, etc.

4. You assemble the multi-page djvu file backwards:

   djvm -c 12345.djvu `ls -r *djvu`

Done. A "backward" book will take you about 5 seconds longer than a straight one. (If you need help on the perl-script drop me a mail.)

DUPLICATED PAGE NUMBERS

The only thing my system doesn't handle gracefully right out of the box -- being based on the real page number as key -- is duplicated page numbers. But there is an easy workaround: you have to manually edit the filename of the second page "42" to "p0042a".
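To make the naming scheme concrete, here is a minimal sketch in Python of how the scan names map to the zero-padded djvu names, with an optional letter suffix for a repeated page number. This is not the perl-script mentioned above; the function names djvu_name and is_regular_page are made up for illustration, and the code only computes file names without running any compressor.

```python
import re

def djvu_name(tiff_name: str, suffix: str = "") -> str:
    """Map a scan such as '42.tiff' to 'p0042.djvu' (hypothetical helper).

    Pass suffix='a' for the second occurrence of a repeated page number.
    """
    match = re.fullmatch(r"(\d+)\.tiff?", tiff_name)
    if match is None:
        raise ValueError(f"not a page scan: {tiff_name!r}")
    page = int(match.group(1))
    # Zero-padding to four digits makes alphabetical order equal page order.
    return f"p{page:04d}{suffix}.djvu"

def is_regular_page(djvu_file: str) -> bool:
    """Roughly mirror the shell glob *[0-9].djvu (hypothetical helper):
    true when the character right before '.djvu' is a digit."""
    return re.fullmatch(r".*[0-9]\.djvu", djvu_file) is not None

names = [
    djvu_name("1.tiff"),
    djvu_name("42.tiff"),
    djvu_name("42.tiff", "a"),   # the duplicated page "42"
    djvu_name("314.tiff"),
]

# Files that a *[0-9].djvu glob would pick up, in page order.
regular = sorted(n for n in names if is_regular_page(n))

# Reverse alphabetical order, as `ls -r` would list them for a backward book.
backward = sorted(regular, reverse=True)
```

Because the names are zero-padded, a plain alphabetical sort is enough for both the forward and the backward case, and the letter-suffixed file stays out of the digit-only selection until it is inserted by hand.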
After assembling the multi-page djvu file you have to insert the duplicate pages like this:

   djvm -c 12345.djvu *[0-9].djvu
   djvm -i 12345.djvu p0042a.djvu 44
   djvm -i 12345.djvu p0043a.djvu 45

Done. This will take you about 5 minutes longer than a "correct" book.

-- 
Marcello Perathoner
webmaster@gutenberg.org