Some humble suggestions... Re: [PGCanada] Re: PG-Canada / List of tasks to do

Jon Noring jon at
Thu Jan 27 16:44:12 PST 2005

Andrew Sly wrote:

> I'd suggest that getting content and finding Public domain books
> is not something we need to worry about. Having enough interested
> volunteers to help produce them will probably be more of a challenge.

What interests me is to split up the project into three areas/groups,
each of which can have its own volunteer base, its own group leaders,
and a level of autonomy from the other groups. The three are:

1) Scanning
2) Cataloging and Copyright Clearance
3) Conversion to structured digital text

The first, scanning, can be done independently. It would seek out old
texts to scan which, if public domain, can be placed online. Ask for
donations (with a tax deduction to the donor) of old books which are
otherwise falling apart, chop them, then run them through a sheet feed
scanner. The chopped books would then be put into ziploc bags with a
desiccant (or whatever other method is recommended) and archived away
in case there's interest in rescanning. I think it is even possible to ask
Brewster Kahle at the Internet Archive for a donation of sheet feed
scanners in return for donating copies of the scans to IA. These
scanners, even rugged professional level models, are not overly
expensive (not like the orbital or robotic scanners, for example.)

(In other messages, I referred to the scanning project as Distributed

The second, cataloging/copyright clearance, will take the scans which
have been done, and put together MARC (or equivalent) records for the
works (a lot of data can be taken from other libraries.) In addition,
the group can research the copyright status of the works, a process in
which the cataloging information is of course important. And
finally, this group can look over the scans to determine if any pages
are missing or badly scanned (a sort of QC function). It may be
possible to find trained librarian volunteers to help out in this
group. Since there exists *excellent* commercial software for
cataloging, again the Internet Archive may be willing to buy licenses
for that software for the group to use in return for help in cataloging
IA's scanning project in Toronto. Alev Akman is the expert in the area
of cataloging who should be further consulted for this project (she is
a head librarian at CSU Fresno, and has an MLIS degree.) She highly
recommends using commercial software for generating the cataloging
records in MARC or MARC-XML -- she also recommends the project develop
an authority database for the various fields, such as author names.

And the third area I don't need to discuss since that is the area of
focus at this time.

Interestingly, I would expect the scanning group to greatly outpace
the group producing structured digital texts, at least early on. This
doesn't matter, really. I think from a political standpoint
(particularly with regard to influencing the Canadian government on
proper copyright policy), it is wise for the scanning group
to go hog wild and get as many scanned books online as possible -- get
a half dozen sheet feed scanners and keep them running 24-7! This will
catch the attention of a lot of people, including the Internet Archive,
and lead to good things, such as closer association with the Canadian
government and various archives, great PR, and other benefits
(possibly even major long-term funding.)

Jon Noring
