donovan said:
> I can see this revealing
> (or at least quantifying)
> the disturbingly high rate of
> spelling and grammatical errors.
actually, hit-differentials are
an excellent method to _detect_
spelling and grammatical errors,
so it shouldn't be that difficult to
clean the corpus quite thoroughly.
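
to make the hit-differential idea concrete, here is a minimal sketch,
not anybody's actual implementation. it assumes you already have a
table of hit counts per word (here a made-up `hits` counter standing in
for search-engine or corpus counts), and it flags a word when some
one-edit variant of it gets vastly more hits. the names `edits1`,
`likely_typos` and the `ratio` threshold are all hypothetical, and the
sketch only covers spelling, not grammar.

```python
from collections import Counter


def edits1(word):
    """Return every string one edit (delete, swap, replace, insert) away from word."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts) - {word}


def likely_typos(hits, ratio=50):
    """Map each suspicious word to the much more common variant it resembles.

    `ratio` is an assumed threshold: a word is flagged only when its most
    popular one-edit neighbour has at least `ratio` times as many hits.
    """
    flagged = {}
    for word, count in hits.items():
        # Most popular one-edit neighbour that also appears in the hit table.
        best = max(
            (v for v in edits1(word) if v in hits),
            key=hits.__getitem__,
            default=None,
        )
        if best is not None and hits[best] >= ratio * count:
            flagged[word] = best
    return flagged


# Made-up hit counts, purely for illustration.
hits = Counter({
    "receive": 9000,
    "recieve": 40,
    "their": 12000,
    "thier": 25,
    "there": 11000,
})

print(likely_typos(hits))  # prints {'recieve': 'receive', 'thier': 'their'}
```

the threshold is the design choice that matters: set it too low and
rare-but-legitimate words get flagged against their common neighbors,
set it too high and the real typos slip through.
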
but that's not the "knowledge" that
google might glean that is so scary
to me. that involves putting together
disparate pieces of information that
were never intended to be connected,
but nonetheless exist out in cyberspace
and _can_ be joined with enough "smarts".
especially if you can dip into a few "private"
databases, like the ones with credit-card info,
you could build quite a dossier on any person
(or place or thing) out there...
-bowerbird
p.s. in the news yesterday were reports that
yet another credit-card database was hacked.
does it strike anyone else as odd that "security"
can be so lax on this personal and private data
at the same time that the corporations are wanting
to "lock up" all their content? i'm beginning to think
we should just all _pretend_ that d.r.m. works great
to put them at ease, knowing that it'll all be cracked
a few years down the line, and we'll be done with it...