
michael said:
Perhaps the way to think about this is to consider just how many more or fewer readers we would get if the file sizes were that much larger or smaller.
there are something like 100,000 books available at google. d.p. digitizes about 2,000 books a year. they can't keep up.
In the end, I think we should provide both.
in the end, users will turn exclusively to "digital reprints" -- digital text that mimics the scans so accurately that there's really no good reason to consult the scans at all. after 10 or 20 years of nobody downloading the scans, we'll be able to feel comfortable taking them offline...
Some operations deliberately do not put their high-resolution scans online for downloading; instead, an automated process reduces the resolution, so the posted scans are no longer suitable for OCR.
yeah, that's sad. but what are you gonna do about it?
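For concreteness, here is a minimal sketch of the kind of automated downsampling described above. The `downsample` function and the dpi figures are illustrative assumptions, not any scanning operation's actual pipeline; the point is just that sampling a 600 dpi page down to 150 dpi drops it below the roughly 300 dpi that OCR engines generally want.

```python
def downsample(page, src_dpi, dst_dpi):
    """Reduce a grayscale page (a list of rows of pixel values) from
    src_dpi to dst_dpi by keeping every Nth pixel (nearest-neighbor)."""
    step = src_dpi // dst_dpi
    return [row[::step] for row in page[::step]]

# a toy 8x8 "page" standing in for a 600 dpi scan
page = [[x + 10 * y for x in range(8)] for y in range(8)]

small = downsample(page, 600, 150)  # step = 4, so only every 4th pixel survives
print(len(small), len(small[0]))    # 2 2 -- far too little detail left for OCR
```

The reduced image is fine for on-screen reading but has lost the fine strokes an OCR engine relies on, which is exactly the effect such operations are after.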
The odds of being able to create a complete eBook, using the scans that are usually made available, are perhaps about 1 in 4 to 1 in 3, based on the reports you have probably already seen.
yeah, that's sad too. but that's a quality-control issue that i suspect the scanning operations will solve soon...
Once you go through the effort of scanning missing pages, rescanning the pages that did not work with your OCR programs, etc., it often seems worth the effort simply to rescan the entire book at higher resolution and post those scans for others to use.
i don't think -- for most books -- that will be the case. but perhaps that's because i don't see much use for high-resolution scans. i am _not_ in love with scans. like i said above, they will eventually be left behind. the important point _today_, though, is that we have a shitload of scan-sets, more than we can process now, and it's silly to ignore them when we _could_ offer them for people to _read_ now, even if they aren't digitized...
Do raw scans qualify as eBooks?
does it matter? they are what they are. no more, no less. and almost everyone sees them for exactly what they are.
This is the "quick and dirty" approach, and it doesn't cost much in terms of time, effort, or money.
um, scanning does indeed take time, effort, and money, at least if you're doing it on a scale of millions of books...
I suppose the real question comes down to our purposes for making eBooks.
i'm not sure of that. we make e-books for people to read, and so their text can be searched and easily repurposed... scans get us part of the way. digital text gets us the rest...
The various university projects still seem a great deal concerned with keeping their eBooks out of the hands of the public, as does Google, though the Google philosophy may be in the process of changing.
the michigan librarian pledged that all public-domain books scanned from their library will be made available to the public. i assume he meant the scan-sets. but from them, we will soon be able to automatically get digital text, so there's no difference.
Right now it's hard to tell what Google has chosen as their goal; will they really try to do millions of books in the next 54 months, after scanning perhaps 0.1 million in the first 18 months?
they most certainly will.
Will Google change their philosophy on letting people download scans,
if we open up negotiations with them, _maybe_. we can hope.
and/or on downloading their full-text search database?
they'll never make their text-database public, as that's the competitive edge for which they are paying many millions... do you really think they're gonna hand it over to microsoft?
Until Google decides to actually proofread eBooks,
if you mean "ensure that their digital text is highly accurate" -- which can be completely orthogonal to "proofreading" -- then you can be certain that they will "decide" to take that step. inaccurate text gives bad search results; google won't tolerate that.
My own goal has always been for the public to have their own home eLibraries, just as they have their own home computers.
that's the goal for a lot of us. -bowerbird