
Sorry for the length, everyone, but I wanted to try and cover in words what I was unable to cover in production software.

On Thu, Jul 13, 2006 at 05:42:29PM -0400, Bowerbird@aol.com wrote:
...
finally, i'm not sure that y'all understand the major need here. and i'm quite certain that library-school students will miss it.
answer this question: why should we categorize the e-texts?
if your response runs along the lines of "so end-users can find the book they want, and download it", you're on the wrong path.
that's the function catalogs used to serve, in the dead-tree world. ... but in our new era of high-bandwidth and terrabyte hard-drives, it's silly for a person to spend even mere seconds trying to decide _whether_or_not_ to download a book. it's _far_ more convenient to download vast portions of the library, since they can have their computer do it automatically while they are partying, or sleeping...
I disagree. I have a 100Mb/s municipal fiber connection and almost 2 terabytes of disk space available, and "download[ing] vast portions of the library" is not an option for me. I don't find it difficult to imagine that if I have a hard time accepting this answer, there are going to be others who do so as well, with far fewer resources at their command.
even the dial-up people can request the d.v.d., for free, and have the entire p.g. library sitting on their hard-disk in a week or so...
I also don't agree with the implied assertion here that having the full (or even "vast portions of the") library means that users don't want help identifying and locating content within that collection. Of course, this means that we'll want to help people who download the library get the catalog data that matches their portion of the library!
not only is it not wise to make people spend any time "choosing", it's at odds with the important concept of _unlimited_distribution_.
Having a catalog does not equate to making people use it. It's a tool for those who want to make use of it. That said, let's make sure that whatever tool(s) we come up with fit as many of the perceived needs as we possibly can!

You clearly have different ideas of the use of a catalog than I do. As you've already enumerated some of the points of *my* use, perhaps you could elaborate on your ideas? (On the other hand, if you already did this, ignore this request. I generally avoid topics once you start weighing in on them, so I may have missed the applicable portions from the last time this topic came up.)

---

So, on to my proposal. I had hoped to actually be able to provide a tool demonstrating it, but my day job interfered too much this week to allow me to realize that hope. So instead, let me see if I can lay out the concept. It's based on the tagging system known as the "Debian Package Browser" [1].

Some important parts of the idea that might be missed initially:

* Every book gets tagged initially with a placeholder value.
* Wherever we can identify existing valuable tags, they are added to the initial load. Some examples of tags I'd want include: year published in PG; Author/Creator; Language; LoC Class; Copyright Status (sounding familiar to anyone?).
* Tags need to be nestable. This is something the Debian system is not able to support, but I think it's very important. One example Bowerbird already pointed out is the Amazon.com categorization scheme.
* The default behaviour of the tagging system should be marking which of the existing tags are best applied to this book, but it also needs to be flexible enough to add new tags (and hierarchies thereof). Setting the default behaviour this way is one way of preventing the "del.icio.us syndrome" found in many folksonomies, where there are as many different ways of tagging a piece of content as there are users of the system.
* It should be easy, when viewing a particular ebook, to do any of the following actions: view tags already on this book; see a list of "suggested tags", based on a weighted list of tags attached to content that has other tags in common with the current content; view other content tagged in common; add / remove tags.
* It needs to be easy to see all content with a particular tag or tagset. I'm envisioning something akin to the Flamenco [2] system here.

I envision a lot of things coming out of this effort, including an easier way for people to suggest content for the "Best Of" DVDs so that Greg doesn't have to do so much of the leg-work himself. As people come across suggestions, they tag them, then Greg can just pull a list of ebooks with that tag.

I've done some work on a prototype, but as I said, the real world invaded and sapped my time. Then again, I know there are many others on this list who are talented software developers, so perhaps one of you will beat me to it...or propose an even better system.

[1]: http://debian.vitavonni.de/packagebrowser/
[2]: http://flamenco.berkeley.edu/

If you'd like to see Flamenco at work, but don't have the resources to set it up yourself, drop me a line off-list and I'll provide you with a URL to one I've set up.
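
To make the tagging proposal above a little more concrete, here is a minimal sketch in Python. It is not the prototype mentioned earlier and not how the Debian Package Browser or Flamenco work; it is just one way the listed behaviours could fit together. Everything named here is an assumption made for illustration: the `TagCatalog` class, the `untagged` placeholder, the slash-separated paths used to nest tags, and the sample book ids and tag names. "Suggested tags" are weighted simply by how many tags another book shares with the current one.

```python
from collections import Counter, defaultdict

PLACEHOLDER = "untagged"  # hypothetical name for the initial placeholder tag


class TagCatalog:
    """Toy in-memory tag store. Tags are slash-separated paths so they can nest."""

    def __init__(self):
        self.tags_by_book = defaultdict(set)

    def add_book(self, book_id):
        # Every book starts out carrying only the placeholder tag.
        self.tags_by_book[book_id].add(PLACEHOLDER)

    def add_tag(self, book_id, tag):
        tags = self.tags_by_book[book_id]
        tags.discard(PLACEHOLDER)  # a real tag replaces the placeholder
        tags.add(tag)

    def remove_tag(self, book_id, tag):
        tags = self.tags_by_book[book_id]
        tags.discard(tag)
        if not tags:
            tags.add(PLACEHOLDER)  # never leave a book with no tags at all

    @staticmethod
    def _matches(tag, query):
        # A query matches the tag itself or anything nested beneath it.
        return tag == query or tag.startswith(query + "/")

    def books_with(self, *queries):
        """Books carrying every tag in the given tagset (nested tags count)."""
        return sorted(
            book
            for book, tags in self.tags_by_book.items()
            if all(any(self._matches(t, q) for t in tags) for q in queries)
        )

    def suggest_tags(self, book_id):
        """Tags found on other books, weighted by tags shared with this book."""
        mine = self.tags_by_book.get(book_id, set()) - {PLACEHOLDER}
        weights = Counter()
        for other, tags in self.tags_by_book.items():
            if other == book_id:
                continue
            shared = len(mine & tags)
            if shared:
                for tag in tags - mine:
                    weights[tag] += shared
        # Heaviest first; ties broken alphabetically so the order is stable.
        return sorted(weights, key=lambda t: (-weights[t], t))


if __name__ == "__main__":
    catalog = TagCatalog()
    for book in ("book-a", "book-b"):  # made-up ids, not real PG ebook numbers
        catalog.add_book(book)
    catalog.add_tag("book-a", "language/en")
    catalog.add_tag("book-b", "language/en")
    catalog.add_tag("book-b", "best-of-dvd")
    print(catalog.books_with("best-of-dvd"))
    print(catalog.suggest_tags("book-a"))
```

Pulling the list for a "Best Of" DVD would then be a single `books_with("best-of-dvd")` call, and because a query also matches anything nested under it, asking for `subject/fiction` would find a book tagged `subject/fiction/adventure`. A real system would obviously need persistent storage and some moderation of newly created tags; this sketch ignores both.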