First in a series of notes I'm assembling on the blog as a basis for a discussion of project 10X.
Currently the identification of books to digitize and convert to
ebooks is entirely up to the public. And it's not apparent that
people are beating down the doors with titles they demand be
published. The two publicly accessible DP sites often plead for new
material for the beginning rounds. We see that even the production we
do get is frequently a new version of an existing ebook. Yet we know
there are major, well-known titles that are not available. My own
project, the volumes of Encyclopedia Britannica edition (the most
renowned and most recently public-domain-available), is not online
in any quality digital form. There is no digital copy of Newton's
Principia Mathematica in the English language. No copy of Ptolemy's
Almagest.
One wonders how many historically significant books are being
irretrievably lost in the destruction and violence in Syria and Egypt
and Libya, literarily among the most historically active areas in the
world.
So identifying titles may not be the most critical step in the
process, but it will need to change and grow to support 10X, if the
other steps progress.
Even at the current rate, it would help to have some easily
accessible set of lists of culturally valuable works, in all
languages and traditions, for people to use to search for sources,
and a place to just post the images and/or URLs if they find them.
We now have a number of productive harvest sources for both images
and crappy text digitization, but I've not seen a list of
potential PG candidates, or a checklist of ebooks already in the
existing catalog to guide anyone wanting to cull. (In fact, is there
a checklist of existing titles in PG that's exhaustive and
accessible? The RDF files don't qualify for casual use.)
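To make the idea concrete, here is a rough sketch of how a plain checklist could be used to cull a candidate list. It is not anything PG or DP provides: the function names are hypothetical, the matching rule (strip accents, punctuation and a leading English article) is my own assumption, and the sample titles are made up for illustration rather than taken from the real catalog.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Reduce a title to a rough comparison key (assumed matching rule)."""
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase, then replace anything that is not a letter, digit or space.
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    words = text.split()
    # Ignore a leading English article so "The X" matches "X".
    if words and words[0] in {"the", "a", "an"}:
        words = words[1:]
    return " ".join(words)


def split_candidates(candidates, catalog_titles):
    """Return (missing, present): candidates absent from / found in the checklist."""
    known = {normalize_title(t) for t in catalog_titles}
    missing, present = [], []
    for title in candidates:
        if normalize_title(title) in known:
            present.append(title)
        else:
            missing.append(title)
    return missing, present


if __name__ == "__main__":
    # Illustrative data only; not a statement about the actual catalog.
    checklist = ["The Origin of Species", "Moby-Dick; or, The Whale"]
    wanted = ["Almagest", "Origin of Species"]
    missing, present = split_candidates(wanted, checklist)
    print("Not in checklist:", missing)
    print("Already in checklist:", present)
```

Exact matching on a normalized title will miss translations, subtitles and variant spellings, so a real tool would need author matching and some fuzziness; the point is only that a flat list of titles is enough to start with.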
Nearly every page of Encyclopedia Britannica is thick with
bibliographies and with articles about authors and texts that are
largely undigitized. I imagine many of them are already lost. How
many other ebooks in the catalog would be similarly rich in authors
and titles?
As DP-Europe brought to our attention, there are entire cultures
and languages whose written records are fast disappearing.
We often take the identification and selection of books to be
something we can assume takes care of itself; but there's good reason
to think curation in this area is important and in fact requires more
attention if PG is to grow.
Some will argue that we shouldn't be concerned about new texts
while the existing catalog is in questionable condition. But that assumes
that one needs to choose between the two; and that we will continue
to suffer for the lack of a decent process for incremental
improvement - something we'll need to discuss in further installments.
But the most compelling counter-argument is PG's original mission
statement - more free books for more people - combined with the
known fact that books are disappearing from our reach daily.