
Geoff wrote:
Jon asked:
Anyway, what other unusual page numbering systems do we find in the old books? I'm sure the PG and DP veterans can share some unusual things they've encountered.
Denzinger's _Enchiridion Symbolorum et Definitionum_ has:
Prefatory material with Roman numerals (fortunately starting with the title page as page I)
Text (Arabic numerals)
An appendix with a separate Arabic numeration scheme (page numbers followed by asterisks)
An index, with a separate Arabic numeration scheme (page numbers in square brackets)
As an additional wrinkle, each paragraph is individually numbered, and the work is cited by paragraph number, not by page.
Wow!

It seems to me that a page scan naming system has to include the following metadata (or allow it to be inferred or calculated by machine processing):

1) The sequence number of the page in the scanning project. This way we know the exact position of each page scan among all the pages scanned. This would include blank pages that do not contribute to the publisher-supplied page numbering. Of course, as usual, some foldouts (especially exotic ones) throw us a few curve balls.

2) How the publisher named/numbered a particular page.

So maybe we might do something like (strawman example):

00035-28

where 00035 is the 35th page in the page sequence of the book (including blank pages), and "28" is what the original publisher used for the page number ("blank" could be used for a totally blank page).

If there is really oddball publisher/author page naming (like the examples Geoff gives), that whole string could be incorporated into the second part of the filename (properly escaped, if needed).

If there are applications that require a different page scan naming convention (such as DjVu), a script can be run to automatically change the filenames for the particular use.

As noted above, foldouts and other similar oddities might cause some difficulties with this numbering system. So, will this system work in general, or will it cause problems?

Handling of references to paragraphs, verses, etc., can be done within XML markup (and publisher/author page numbers can also be put within markup).

Jon
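
As a rough illustration of the strawman naming scheme above, here is a minimal Python sketch. The function names (make_scan_name, parse_scan_name), the five-digit padding, and the choice of percent-encoding as the escaping method are all assumptions made for the example; the proposal itself only says the publisher's label should be "properly escaped".

```python
from urllib.parse import quote, unquote


def make_scan_name(seq, label=None, width=5):
    """Build '<sequence>-<publisher label>' for one page scan.

    seq   -- 1-based position of the scan in the whole project
    label -- the page number as printed by the publisher, or None
             for a totally blank page
    """
    if seq < 1:
        raise ValueError("sequence numbers start at 1")
    # Assumption: percent-encode anything that is awkward in a filename,
    # e.g. the asterisks and square brackets in Geoff's example.
    part = "blank" if label is None else quote(label, safe="")
    return f"{seq:0{width}d}-{part}"


def parse_scan_name(name):
    """Split a scan name back into (sequence number, publisher label)."""
    # Only the first hyphen separates the two parts, so a label that
    # itself contains a hyphen survives the round trip.
    seq, _, part = name.partition("-")
    return int(seq), unquote(part)


print(make_scan_name(35, "28"))        # 00035-28
print(make_scan_name(412, "12*"))      # 00412-12%2A
print(make_scan_name(500, "[7]"))      # 00500-%5B7%5D
print(parse_scan_name("00412-12%2A"))  # (412, '12*')
```

Because the sequence number always comes first and is zero-padded, a plain alphabetical sort of the filenames gives the scanning order, and a renaming script for another convention only needs the first field.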