
joey said:
I have a 100Mb/s municipal fiber connection and almost 2 terabytes of disk space available, and "download[ing] vast portions of the library" is not an option for me.
well joey, i do look forward to your tool, when you find time to create it, because these general discussions we are having around this topic have a lot of fuzziness about them, which must all be resolved when one starts writing code. so i won't respond to all your points until i can see exactly what you meant by them. but this point here is quite easy to deal with. downloading the project gutenberg library -- even the whole thing -- can be a breeze.

first of all, as is always the default with me, i'm only concerned with one version of each -- the "master version", in z.m.l. format -- as the other versions can be spun out of it.

second, as i said, it's reasonable to eliminate big classes of e-texts from the downloading, such as the human genome files, audio/video, and books in languages that you don't read...

third, there are a lot of duplicate files where pieces of a volume were presented separately, and then the volume as a whole in another file. now that we have the information (thanks greg), those separate-piece files can easily be ignored.

fourth, there are some people who will not want the magazines that are being added increasingly.

once you've eliminated all of these files from your download queue, you find the list is much smaller.

on to the next step... i have written a program that lets a person click one button to start downloading e-texts as a background process on their machine. as soon as one e-text has been completely received, the next one is requested, thus the downloading is _relentless_, and you'd be surprised how fast it goes. for a d.s.l. person like myself, after doing the deletions i mentioned above, it will merely take _a_few_days_ to download all the e-texts. to get the _whole_ library, it might take you a week or so.

but remember, during this whole time, you will not have to do a single thing. all you had to do was click that one button. plus, you do have to enter a code every 108 minutes, but it's just this sequence of 6 numbers, no big deal. ;+)
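(to cut some of the fuzziness, here is a minimal python sketch of that kind of relentless one-at-a-time loop -- eliminate the unwanted classes of files, then fetch e-texts one after another. the catalog fields, filter rules, and filenames are assumptions for illustration only, not the actual program described above.)

```python
# sketch of a "one button" sequential downloader: filter the catalog,
# then fetch one e-text at a time, requesting the next as soon as the
# previous one has been completely received.
# the catalog entry format here is hypothetical.

import time
import urllib.request
from pathlib import Path


def wanted(entry):
    """drop the big classes of files we don't want to download."""
    if entry.get("type") in {"audio", "video", "data"}:   # e.g. human-genome data files
        return False
    if entry.get("language") != "en":                     # keep only languages you read
        return False
    if entry.get("is_piece_of_larger_volume"):            # duplicate separate-piece files
        return False
    if entry.get("is_magazine"):                          # optional: skip the magazines
        return False
    return True


def download_all(entries, dest="gutenberg"):
    """fetch e-texts one at a time into dest, skipping files already present."""
    dest = Path(dest)
    dest.mkdir(exist_ok=True)
    for entry in (e for e in entries if wanted(e)):
        target = dest / entry["filename"]
        if target.exists():                                # already have it? move on
            continue
        try:
            urllib.request.urlretrieve(entry["url"], target)
        except OSError:
            time.sleep(30)                                 # back off briefly, then keep going
            continue


# usage (catalog format is made up for illustration):
# entries = [{"url": "...", "filename": "12345.zml", "type": "text",
#             "language": "en", "is_piece_of_larger_volume": False,
#             "is_magazine": False}]
# download_all(entries)
```

run it as a background process and it just grinds through the queue until the list is done.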
I also don't agree with the implied assertion here that having the full (or even "vast portions of the") library means that users don't want help identifying and locating content within that collection.
it was only because i knew some might _infer_ such an "assertion" that i closed my post with the explicit note that this latter purpose _is_ still "handy", and therefore should be the _focus_ of this task. did you read that?
I generally avoid topics once you start weighing in on them, so I may have missed the applicable portions from the last time this topic came up.
well that's a remarkable admission. since i "weigh in" on every topic that is _interesting_ and usually "start" doing so fairly early in the thread, that must mean you're "avoiding" most of the posts, and all the interesting threads. life must be sad. :+)

at any rate, i thank you for your candor. perhaps you will thank me for mine when i tell you that if you didn't read what i have written on this topic before, you're likely to take a path that will end up biting your ass.

***

anyway, as i read your proposal, it's a social tagging scheme. as a general approach, that would be one way of doing things. again, the specifics are vital, so let us know when you have 'em.

-bowerbird