Page scans draft policy

As I said, there was subsequent discussion about the details of
formatting... Here's some info about page scan formats. I note that
point 4 is somewhat different than what I just typed in my other
message, and seems a whole lot smarter.
  -- Greg

----- Forwarded message from Jim Tinsley <jtinsley@pobox.com> -----

From: "Jim Tinsley" <jtinsley@pobox.com>
To: "Posted Etexts for Project Gutenberg" <posted@listserv.unc.edu>
Subject: [posted] Posted (#12973, Butler) !
Date: Tue, 20 Jul 2004 20:24:32 -0700 (PDT)

Personal Recollections of Pardee Butler, by Pardee Butler          12973
  [Editor: Mrs. Rosetta B. Hastings]
  [Contributor: Mrs. Rosetta B. Hastings]
  [Contributor: Elder John Boggs]
  [Contributor: Elder J. B. McCleery]
  [Link: http://www.gutenberg.net/1/2/9/7/12973 ]
  [Files: 12973.txt; 12973-h.htm; 12973-page-images]

Thanks to Roger for finding and scanning this book.

This is the first PG book to be posted with page images.

We are now beginning to accept page images along with the regular
postings. Of course, DP has always preserved its page images, and
those will eventually be uploaded in a big batch, or series of
batches, but non-DP contributions may now begin adding page images.

For now, we're setting the following guidelines for page image
postings:

1. PG is now accepting page images of books posted. Page images will
   be posted _only_ as an addition to an etext posted in the normal
   way -- we will not post page images without plain text.

2. Page images are an option; they are not and will not be required
   for the posting of a text.

3. All page images should be good enough to work reasonably well with
   OCR packages, up to 600 dpi, and should be stored as
   black-and-white TIFFs with CCITT-4 (aka ITU-G4 or Fax Group 4)
   compression. This is important, so that we keep the overall file
   size down to a sustainable level. With this compression, a typical
   600 dpi page can be stored for about 40KB. Our ability to post
   these images depends on the file sizes staying fairly reasonable.

   Pages such as color pictures or greyscale photos that cannot
   reasonably be stored as black-and-white only should be stored as
   TIFF or JPEG with the best compression you can get for that image.

   (Note: Irfanview for Windows does this nicely individually or in
   batch. ImageMagick v 6.x:

       convert myimage.png -compress group4 myimage.tif
   )

4. Each page image should be a separate file and named with the page
   number within the set; e.g. 001.tif, 002.tif, etc.

   Separate, non-page images, such as covers or color images scanned
   separately from the pages, should have suitable names, such as
   "cover.jpg" or "072-image.tif".

   All page images for the book will be zipped into one file, to be
   called FILENUMBER-page-images, e.g. 12345-page-images.piz (reverse
   the extension) for etext #12345, and stored in the main directory
   for that etext. It will unzip to a subdirectory ./page-images, but
   we will not post separate page images in that directory, since
   that would double the space used, and we believe that people who
   want to consult the images will probably want them all.

So, for now at least, if you want the images, you download the PIZ
(backwards again) file.

jim

----- End forwarded message -----
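
For anyone preparing a set of page images by hand, the naming and packaging rules in point 4 can be sketched roughly as follows. This is only an illustration, not a PG tool: the function names are made up for the example, the ".zip" extension is my reading of "reverse the extension", and storing the entries without further compression is my own assumption (Group 4 TIFFs and JPEGs are already compressed).

```python
import zipfile
from pathlib import Path


def page_image_name(page_number: int, ext: str = "tif") -> str:
    """Return a zero-padded page file name such as 001.tif (hypothetical helper)."""
    return f"{page_number:03d}.{ext}"


def archive_name(etext_number: int) -> str:
    """Return the archive name for an etext, e.g. 12345-page-images.zip.

    Assumes "piz" in the message is "zip" written backwards.
    """
    return f"{etext_number}-page-images.zip"


def build_page_image_archive(image_dir: Path, etext_number: int, out_dir: Path) -> Path:
    """Zip every file in image_dir so that it unpacks into ./page-images/."""
    archive_path = out_dir / archive_name(etext_number)
    # ZIP_STORED is an assumption of this sketch: the images are already
    # compressed, so deflating them again would gain very little.
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for path in sorted(image_dir.iterdir()):
            # Skip subdirectories, and skip the archive itself in case
            # out_dir and image_dir are the same directory.
            if path.is_file() and path != archive_path:
                zf.write(path, arcname=f"page-images/{path.name}")
    return archive_path
```

Called as build_page_image_archive(Path("scans"), 12345, Path(".")), where "scans" is a hypothetical directory holding 001.tif, 002.tif, cover.jpg and so on, this would write 12345-page-images.zip in the current directory. At the roughly 40KB per page quoted in point 3, a hypothetical 300-page book would come to about 12MB.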