Crowdsourcing

19 Feb 2012

Very interesting. It appears that Wikisource has implemented a more
refined, publicly accessible text proofing process, and at least one of
PG's texts is part of their subject corpus.

Take, for example, PG #32607; particularly the article about diazo
compounds.

http://www.gutenberg.org/files/32607/32607-h/32607-h.htm#ar23

Notice the embedded images; for example

http://www.gutenberg.org/files/32607/32607-h/images/img174b.jpg

Here are two versions of the same text on eb.tbicl.org.

First, a clone of the PG ebook:

http://eb.tbicl.org/vol08/4/#ar23

and then after processing it into an article:

http://eb.tbicl.org/diazo-compounds

As we have discussed previously, I have enhanced the page numbers by
linking each one to the corresponding page image on TIA.

-------------------------------------------------------------

Here is the same article at Wikisource.

http://en.wikisource.org/wiki/1911_Encyclop%C3%A6dia_Britannica/Diazo_Compou...

You can see that the cropping in the embedded image is identical to PG's.

Further, they have included a linkable page number (including an additional
one at the beginning of the article). But it doesn't link to the same place.
Instead it links to

http://en.wikisource.org/wiki/Page%3AEB1911_-_Volume_08.djvu/191

which is their own editing interface for that page.
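
The page link above looks like it follows a simple pattern, so here is a
rough sketch of how such a link might be generated. The function name and
its parameters are my own invention for illustration, and I am assuming
that the trailing number is the position of the scan within the DjVu file
rather than the printed page number. Only the one URL quoted above is
actually confirmed.

```python
from urllib.parse import quote


def wikisource_page_url(volume: int, djvu_page: int) -> str:
    """Build a Wikisource page link for an EB1911 scan.

    Hypothetical helper: the pattern is inferred from a single observed
    URL, and djvu_page is assumed to be the scan's index in the DjVu file.
    """
    # Volume numbers appear zero-padded to two digits in the observed URL.
    title = f"Page:EB1911_-_Volume_{volume:02d}.djvu"
    # Percent-encode the title so that ":" becomes "%3A", as in the link above.
    return f"http://en.wikisource.org/wiki/{quote(title, safe='')}/{djvu_page}"


if __name__ == "__main__":
    print(wikisource_page_url(8, 191))
```

Mapping a printed page number to the right scan index would need a
per-volume offset, which I have not worked out.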

So there it is. The public can proof a text derived almost directly from PG,
using an interface that shows the page image alongside a text editor for
matching.

(Note to self: the image they use is from a source I wasn't aware of and the
quality is pretty good - I need to check
http://upload.wikimedia.org/wikipedia/commons/thumb/0/02/EB1911_-_Volume_08....
using a well-crafted page url template.)

One might choose to repudiate the term "crowdsourcing", but it would be
just semantics at that point.

don kretz
