Re: [PGCanada] James website and more news
----- Original Message -----
From: Jon Noring <jon@noring.name>
Date: Thursday, November 11, 2004 10:28 am
Subject: Re: [PGCanada] James website and more news
Wallace wrote:
Jon Noring wrote:
3) Scan these texts, collect the metadata/catalog-info, and place the page scans online. (Optionally, OCR can be done on these scans, and the raw, uncorrected OCR text can be used to enable a "temporary" full-text-search capability of the collection of page scans.)
This last -- use of the raw, uncorrected OCR output -- is what drives projects like canadiana.org, ourroots.ca, newspaperarchive.com, and the cold north wind/paperofrecord.com family of products.
Well, it shows PG-Canada can find partners in the long-term endeavor to convert the public domain works of Canadiana into high quality digital texts.

Where PG-Canada should depart from PG-MotherShip is to work closely with other groups, not go it alone. PG-Canada can help bring all these other efforts together under one tent, to follow a single metadata standard, and to begin the process of conversion to digital texts. This long-term thinking of PG against the world needs to stop.

I see PG-Canada organizing as a non-profit and building a strong Board of Trustees which includes representatives from other notable organizations and notable Canadians. This will increase the likelihood of both private and public (Canadian Government) funding, some of which can go to DP to help them develop the next generation, XML-based system.

Jon
participants (2)
- Jon Noring
- Wallace J. McLean