
This is the first of a series of notes I'm assembling in the blog to build from for a discussion of project 10X. Currently the identification of books to digitize and convert to ebooks is entirely up to the public, and it's not apparent that people are beating down the doors with titles they demand be published. The two publicly accessible DP sites often plead for new material for the beginning rounds. Even the production we do get is frequently a new version of an existing ebook.

Yet we know there are major, well-known titles that are not available. The subject of my own projects, the Encyclopedia Britannica edition (the most renowned and the most recent available in the public domain), is not online in any quality digital form. There is no digital copy of Newton's Principia Mathematica in the English language. No copy of Ptolemy's Almagest. One wonders how many historically significant books are being irretrievably lost in the destruction and violence in Syria and Egypt and Libya, literarily among the most historically active areas in the world.

So identifying titles may not be the most critical step in the process, but it will need to change and grow to support 10X, if the other steps progress. Even at the current rate, it would help to have some easily accessible set of lists of culturally valuable works, in all languages and traditions, for people to use to search for sources, and a place to just post the images and/or urls if they find them. We now have a number of productive harvest sources for both images and crappy text digitization, but I've not seen a list of potential PG candidates, or a checklist of ebooks already in the existing catalog to guide anyone wanting to cull. (In fact, is there a checklist of existing titles in PG that's exhaustive and accessible? The rdf files don't qualify for casual use.)

Nearly every page of Encyclopedia Britannica is thick with both bibliographies and articles about authors and texts that are largely undigitized.
I imagine many of them are already lost. How many other ebooks in the catalog would be similarly rich in authors and titles?

As DP-Europe brought to our attention, there are entire cultures and languages whose written records are fast disappearing. We often assume that the identification and selection of books takes care of itself, but there's good reason to think curation in this area is important and in fact requires more attention if PG is to grow.

Some will argue that we shouldn't be concerned about new texts while the existing catalog is in questionable condition. But that assumes that one needs to choose between the two, and that we will continue to suffer for the lack of a decent process for incremental improvement - something we'll need to discuss in further installments. The most compelling counter-argument, though, is PG's original mission statement - more free books for more people - combined with the known fact that books are disappearing from our reach daily.