does every conversation here _have_ to be tedious?
***
ok, let me 'splain you...
first, if you're gonna wait for users to report errors,
instead of proactively seeking them and fixing them,
you might as well _give_ your lunch to someone else
and then trot off to your mat so you can take a nap.
if the tech world has taught us anything at all lately,
it's that you'd better improve your product offerings
_yourself_, before any of your competitors beat you to it.
project gutenberg e-texts have one powerful benefit
-- they are cleaner than many other texts out there.
(but don't look back, because google's coming fast.)
the e-texts also have 2 significant related liabilities --
they have no provenance, so the accuracy isn't verifiable.
this means a competitor can knock you out quite easily,
and even use your own text to do it. they simply take it,
hook it up to a scan-set, and do a proof via comparison.
voilà. they have _cleaner_ text. which _has_ provenance.
then they point to your absence of any provenance, and
draw attention to your errors -- which they've fixed! --
and boom, they've done a number on your reputation...
do that a couple dozen times, with high-profile classics,
in a way that goes viral, and the blow would be crushing.
crushing.
that's if they beat you to the punch. they eat your lunch.
there's your top-down seminar...
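for anyone wondering what "a proof via comparison" might look like,
here is a rough sketch, not anybody's actual tool. it assumes you
already have the o.c.r. text for a page from the scan-set, lined up
with the same page of the e-text. the function name and the sample
sentence are made up for illustration.

```python
import difflib


def compare_page(etext_page, ocr_page):
    """Return (etext_words, ocr_words) pairs wherever the two texts disagree.

    Assumption: both arguments hold the text of the *same* page, one taken
    from the e-text and one from o.c.r. of the page scan.
    """
    a = etext_page.split()
    b = ocr_page.split()
    # autojunk=False so that common words are not discarded as "junk"
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    suspects = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # every disagreement is a spot for a human to check against the scan
            suspects.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return suspects


if __name__ == "__main__":
    etext = "it was the best of tines it was the worst of times"
    ocr = "it was the best of times it was the worst of times"
    for ours, theirs in compare_page(etext, ocr):
        print("e-text:", repr(ours), "| scan o.c.r.:", repr(theirs))
```

a disagreement doesn't say which side is wrong, since the o.c.r.
makes its own mistakes. it only tells you where to look at the scan.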
and now for the bottom-up part.
if you're going to build a system capable of pulling up
an arbitrary page from an arbitrary book in response to
an error-report regarding some of the text on that page,
then you might as well blossom it into a system that will
serve to correct all the text on every page of every book.
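a sketch of why that's so, under some loud assumptions: the
"library" here is just a hypothetical dictionary keyed by
(book, page) that holds the e-text and the scan o.c.r. for each page,
and every name in it is invented. the point is only that the lookup
which answers one error-report is the same lookup you would loop over
to check everything.

```python
import difflib


def diff_words(etext_page, ocr_page):
    """Word-level disagreements between the e-text and the scan o.c.r."""
    a, b = etext_page.split(), ocr_page.split()
    ops = difflib.SequenceMatcher(None, a, b, autojunk=False).get_opcodes()
    return [(" ".join(a[i1:i2]), " ".join(b[j1:j2]))
            for tag, i1, i2, j1, j2 in ops if tag != "equal"]


def check_page(library, book_id, page_no):
    """What an error-report needs: pull up one page of one book and check it."""
    etext_page, ocr_page = library[(book_id, page_no)]
    return diff_words(etext_page, ocr_page)


def check_everything(library):
    """The same call, run over every page of every book."""
    report = {}
    for book_id, page_no in sorted(library):
        suspects = check_page(library, book_id, page_no)
        if suspects:
            report[(book_id, page_no)] = suspects
    return report


if __name__ == "__main__":
    # hypothetical sample data: (book, page) -> (e-text, scan o.c.r.)
    library = {
        ("emma", 1): ("handsome clever and rieh", "handsome clever and rich"),
        ("moby", 1): ("call me ishmael", "call me ishmael"),
    }
    print(check_page(library, "emma", 1))
    print(check_everything(library))
```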
you dig?
-bowerbird