i said:
> p.s. plus, as voice recognition improves over the next 5 years or so,
> i expect that o.c.r. will take a back seat to voice-transcribed books...
then brad said:
> ROFL !!!
then kevin said:
> Ewe no, he mite bee rite. Wee maybe waisting oar thyme.
ya know, i never know what i'm gonna say that's gonna
set people off. (but i should have learned by now that
it'll probably be a throwaway line in the p.s. rather than
the meat of the substance in the body of the message.)
but hey, i don't mind the challenge. it helps me develop
some logic that i might not have bothered with otherwise.
obviously kevin here has never used voice-recognition,
because no system would give us the line he gives us.
that's _not_ to say voice-recognition is problem-free.
there are a lot of problems with it. a ton of problems.
but there used to be a ton of problems with o.c.r. too.
and people still slogged through it anyway, didn't they?
the reason people will slog through the problems with
voice recognition anyway is that it will be a lot more _fun_
and _easy_ to just _read_ a book aloud rather than to sit
inside an editing system, and that will make the difference.
over the past 5 years, some 35,000 people signed up at d.p.
roughly 10% -- about 3,500 -- were around when d.p. reset
its subscription base a while back. those were the top 10%,
so that wasn't a bad thing, but it does go to show that the
o.c.r. route is just a little bit too trying for the average bear.
even when you distribute the work.
but hey, if that other 90% could do their part to help out
by simply recording a book -- they _did_ once express
enough interest in the cause to sign up, remember --
then maybe they could have been retained as helpers...
and maybe a whole order of magnitude more helpers
could be _recruited_ if the means of helping were so fun.
with librivox, people are already recording old books.
audiobooks, always popular, are getting even more so.
podcasting is growing the base of recording experience
(and audience) in the user-population at a _huge_ rate.
and, for those of us keeping track, there has already been
a message posted on the distributed proofreaders forums from
a person who reported using voice-recognition software
_within_the_current_d.p._system_. now that's dedication.
and as the form-factors of our machines continue to shrink,
voice-recognition will become more and more important,
and more ingrained, and some people will rely on it entirely.
and speaking of librivox, it's important to keep in mind that
a _recording_ retains value even _after_ it has been turned into
digital text. heck, many people will prefer the .mp3 to the .txt.
there's sure a lot more player-hardware out here for the .mp3.
moreover, when a person creates a recording, that product
is _steeped_ in their contribution. with their own _voice_,
for crying out loud. can it get much more personal than that?
to some people, that will surely be more satisfying than the
simple credit-line at the top of a project gutenberg e-text...
and hey, it might mean a lot more to the _end-user_ as well!
i can tell you that i've looked at a lot of texts from jon ingram.
lots and lots of them. and they almost always look very nice.
but none of them has had the impact of the bit he recorded
for librivox, where his accent had me muttering to myself,
"hey, i forgot, that bloody bloke is from _england_, isn't he?"
there's something very endearing and personal about a voice.
even one with a heavy english accent. ;+)
so a person who records a book is giving us _two_ products;
one is a route to obtaining digital text via voice-recognition,
and the other is a recording of that book in a human voice.
it might be that down the line, the second dwarfs the first.
it's also quite important to remind ourselves that these two
products are complementary, not competing with each other.
and it's not hard to imagine that the recording will become
_especially_ useful when it gets combined with page-scans.
a recording of each page playing when the scan is displayed
might become the most typical kind of "book" in the future!
likewise, it does _not_ have to be either/or between o.c.r. and
voice-recognition; we can instead make the two work together.
we could do o.c.r. on the scans, and then cross-check the o.c.r.
against the voice-recognition results, then concentrate on the
differences to intelligently remove errors from _both_ versions.
we would expect homonym problems in the voice-recognition,
for instance, and scannos in the o.c.r., so we could control for that.
anytime you combine two different methods for the same result,
they can serve as a useful cross-check on each other. bingo.
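for the coders in the audience, here is a rough sketch of that
cross-check. it is only an illustration, not a description of any
existing d.p. tool: the function name `disagreements` and the two
sample sentences are made up, and the word-level comparison using
python's standard `difflib` module is just one assumed way to do it.

```python
import difflib


def disagreements(ocr_text, voice_text):
    """Return (ocr_span, voice_span) pairs where the two versions differ.

    Hypothetical helper: compares the texts word by word and keeps
    only the stretches that are not identical in both versions.
    """
    ocr_words = ocr_text.split()
    voice_words = voice_text.split()
    matcher = difflib.SequenceMatcher(a=ocr_words, b=voice_words, autojunk=False)
    flagged = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # a human (or a smarter rule) only needs to look at these spans
            flagged.append(
                (" ".join(ocr_words[i1:i2]), " ".join(voice_words[j1:j2]))
            )
    return flagged


# made-up sample data: one scanno ("tirne") in the o.c.r. version,
# one homonym ("write") in the voice-recognition version
ocr = "he might be right. we may be wasting our tirne."
voice = "he might be write. we may be wasting our time."

for ocr_span, voice_span in disagreements(ocr, voice):
    print(repr(ocr_span), "vs", repr(voice_span))
```

everywhere the two versions agree, we leave the text alone. only the
flagged spans need attention, and knowing which kind of error each
method tends to make (scannos versus homonyms) tells us which side
to trust in each case.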
in case you didn't know, some of the people who are obtaining
the highest accuracy in their e-texts use text-to-speech to get it.
what i'm talking about here can be viewed as the flip-side of that.
so, in summary, if you're "rolling on the floor laughing" about
voice-recognition and the possibilities it offers to digitizers,
you show your lack of vision. there's no other way to say it...
of course, your loss is the lurkers' gain,
because it gave me a reason to explain.
-bowerbird