
carlo said:
Just a possibility: if a book quotes another, and both are available a link can be created from one to the other.
yes, an exact quote is pretty easy to locate. and of course, in most books, an exact quote will already have been credited explicitly there, so there's not even a need to do the search digitally (although it might be easier to do it programmatically than to try to locate all of these references "manually", at least if the false alarms from common expressions were manageable).
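a minimal sketch of the "do it programmatically" option, assuming both books are available as plain text. the function names and the two sample sentences are made up for illustration; the idea is simply to collect every run of n consecutive words in each text and keep the runs that appear in both.

```python
import re

def words(text):
    """Lowercase the text and split it into plain word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def shingles(text, n):
    """Return the set of all runs of n consecutive words in the text."""
    w = words(text)
    return {tuple(w[i:i + n]) for i in range(len(w) - n + 1)}

def shared_runs(text_a, text_b, n=8):
    """Runs of n words that occur in both texts: candidate exact quotes."""
    return shingles(text_a, n) & shingles(text_b, n)

# invented sample sentences, not taken from any real book
quoting = ('as the old sailor put it, "the sea keeps no record of the '
           'ships that cross it at night", and he was right.')
quoted = ('he said that the sea keeps no record of the ships that '
          'cross it at night. then he left.')

for run in sorted(shared_runs(quoting, quoted)):
    print(" ".join(run))

# with a short n, common expressions start to match: the false alarms
print(shared_runs("it was a dark night", "it was a cold day", n=2))
```

the choice of n is the knob for the false-alarm problem: long runs rarely match by accident, while runs of two or three words match on everyday expressions.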
Easy if an exact quotation is available, more intriguing if the reference is vague.
the recent cases of plagiarism in books coming out of major publishing houses come to mind here, don't they? it shouldn't be too hard to write a routine that would do a relatively good job rounding up "fuzzy" versions of quotes, as long as they had some substantial similarity...
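one way such a routine might look, as a rough sketch only: slide a window the length of the quote across the other text and score each window with `difflib.SequenceMatcher` from the python standard library. the function name `fuzzy_find`, the 0.75 threshold, and the sample sentences are assumptions chosen for illustration, and a real routine over whole books would need something faster than this brute-force scan.

```python
import re
from difflib import SequenceMatcher

def words(text):
    """Lowercase the text and split it into plain word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def fuzzy_find(quote, text, threshold=0.75):
    """Return (score, passage) pairs for windows of text that resemble quote."""
    q = words(quote)
    t = words(text)
    hits = []
    # compare the quote against every window of the same length in the text
    for i in range(max(len(t) - len(q) + 1, 1)):
        window = t[i:i + len(q)]
        score = SequenceMatcher(None, q, window).ratio()
        if score >= threshold:
            hits.append((round(score, 2), " ".join(window)))
    return hits

# invented sample sentences, not taken from any real book
quote = "the sea keeps no record of the ships that cross it at night"
reworded = ("he always claimed the ocean keeps no record of the boats "
            "that cross it by night, or so i heard")
unrelated = ("completely different words about gardening and soup "
             "recipes for the winter months ahead of us")

print(fuzzy_find(quote, reworded))
print(fuzzy_find(quote, unrelated))
```

the threshold decides what counts as "substantial similarity": here the reworded sentence swaps three words out of thirteen and still clears it, while the unrelated sentence does not.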
Another could be to find similar content even in absence of an explicit quotation, and have a link from a section of a book to a list of sections of other books. This might be a detection of keywords coupled with a keyword search.
well yes, this is what i would think most people have in mind when they talk about "the pages of books reading each other". and since we will be indexing each book anyway, it would be rather easy to set up a "semantic profile" of each book detailing any idiosyncratic words that are fairly common within its pages. books that have similar profiles could be linked to each other, and then pages or sections within one book that are similar to those in the other book could be linked together too, certainly.

amazon already does this in a fashion with their listings of the "statistically improbable phrases" -- sips -- within each book, and the "capitalized phrases" -- caps -- so that's fairly obvious. amazon's "sips" and "caps" are interesting because you can click on them, and see a list of other books where they also appear...

amazon also does an overall concordance, but i'd guess that that isn't as useful for earmarking a book in a set of its peers. (well, it looks like they formulate what's called a "tag cloud", with links to the actual occurrences, so that could be useful. still, i think the "sips" and "caps" would be more meaningful.)

lastly, amazon also lists "books on related topics" for each book. i don't know if the quality of these associations is up to snuff -- amazon's version of "collaborative filtering" is a very bad joke -- but i'd imagine that it has utility for a range of book-buyers... (ok, more exploration tells me they do indeed use an overlap of "sips" as their main tool in discerning "books on related topics".)

for those people who've never fooled around over at amazon, i've appended a sample of some of their info for a specific book.

of course, what we _really_ want is not just to be _informed_ about similar books, but to have actual, honest-to-goodness hyperlinks between 'em, so we can point-and-click at our desks, rather than just order paper-copies to be delivered to our desks.
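the "semantic profile" idea can be sketched in a few lines. this is a toy under stated assumptions, not amazon's actual method for computing sips (which isn't described here): a word goes into a book's profile if it is used at a much higher rate in that book than in the collection as a whole, and two books are compared by how much their profiles overlap. the function names, the `top` and `min_count` parameters, and the three tiny "books" are all invented for illustration.

```python
import re
from collections import Counter

def words(text):
    """Lowercase the text and split it into plain word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def profile(book, corpus, top=5, min_count=2):
    """Words used at a much higher rate in book than in the whole corpus."""
    book_counts = Counter(words(book))
    corpus_counts = Counter(w for text in corpus for w in words(text))
    book_total = sum(book_counts.values())
    corpus_total = sum(corpus_counts.values())
    scores = {}
    for word, count in book_counts.items():
        if count < min_count:
            continue  # ignore words too rare in the book to be "fairly common"
        book_rate = count / book_total
        # add-one smoothing so a word missing from the corpus cannot divide by zero
        corpus_rate = (corpus_counts[word] + 1) / (corpus_total + 1)
        scores[word] = book_rate / corpus_rate
    return set(sorted(scores, key=scores.get, reverse=True)[:top])

def overlap(profile_a, profile_b):
    """Jaccard overlap of two profiles: 0.0 is disjoint, 1.0 is identical."""
    if not profile_a and not profile_b:
        return 0.0
    return len(profile_a & profile_b) / len(profile_a | profile_b)

# three tiny invented "books" standing in for full texts
lib1 = "digital library digital library metaphor the the the book"
lib2 = "digital library project the the the reader digital library"
cook = "soup recipe soup recipe the the the kitchen"
corpus = [lib1, lib2, cook]

p1, p2, p3 = (profile(b, corpus, top=2) for b in corpus)
print(p1, p2, p3)
print(overlap(p1, p2), overlap(p1, p3))
```

the same comparison could be run section by section instead of book by book, which is what linking "pages or sections" would need; phrases rather than single words would be closer in spirit to the sips.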
Both are already possible at the present state of technology.
these and more, absolutely. tagging and annotation are other options that get thrown around a lot. the idea here would be that interested users would form a "folksonomy" that would link related books, perhaps with a commentary of their own. this would give us the type of "intelligence" that can only be exercised by actual human brains, and which might complement and/or supersede the "brute force" approach of automatic computerized semantic analysis.

in another vein, the cats at the institute for the future of the book seem fond of author/reader interaction in the actual _writing_ of the book, in a process where a book grows "organically", against the backdrop of the cyberlibrary. in this approach, links might _predate_ the content -- in essence be a "cause" of the content, rather than merely an "effect" -- which is an interesting view...

likewise, david rothman's hobbyhorse these days is "blogs inside of books". the initial version of an openreader viewer-app will support this capability, so david has been raving about this feature like it's some kind of epiphany. he is even of the opinion that amazon -- which announced such a feature will be available soon in mobipocket -- could save themselves "a fortune in development costs" by using openreader instead of mobipocket. given that it's relatively trivial to embed this capability, i'm not sure what he's thinking.

i would go over to his blog and ask him, but i've been semi-officially banned -- i'm not banned, but many of my posts have now permanently disappeared -- because i have this annoying habit of saying things that do not go along with the official spin that he likes to hype over there. so i would certainly hope that his put-a-blog-in-your-book capability allows an author to ban any "trolls", because we wouldn't want to experience any disagreement now, would we?
either way, amazon seems to have had no trouble implementing a "discussion" section feature -- currently labeled as "beta" -- on its webpage for each book. this is in addition to the "wiki" which it already had, the purpose of which i am not all that sure, and haven't investigated, because the overwhelming nature of all of the _stuff_ on each amazon page gives me bad information overload, and after a while i just feel a strong need to get the heck out of there! :+)

***

at any rate, these are some of the ideas that are bubbling at the surface. and though some of them sound interesting, to be sure, i also find that i am left wondering if all of this "books reading each other" stuff is gonna lead to something immense that leapfrogs us to the next level of super-intelligence, or whether it's all much ado about not too much... time will tell, i guess...

-bowerbird

p.s. here is some of the information that amazon gives for a 1997 book...

Internet Dreams: Archetypes, Myths, and Metaphors (Paperback)
edited by Mark Stefik, with Introduction by Vinton Cerf

First Sentence:
We are born into a world rich in art, invention, and knowledge

Statistically Improbable Phrases (SIPs):
electronic mail metaphor, digital library metaphor, electronic sketch book, digital tickets, electronic brokerage effect, digital property rights, networked libraries, digital works, marketplace metaphor, superhighway metaphor, digital library system, trusted systems, digital book, dream session, editing test, digital reality, new design methods, usage rights, digital library project, electronic hierarchies, virtual rape, warrior archetype, digital publishing, fire bringer, electronic mail address

Capitalized Phrases (CAPs):
Library of Congress, Jeremy Taylor, United States, Gutenberg Bible, America Online, British Library, New York, Vannevar Bush, World Wide Web, Boston Spa, Joshua Lederberg, Lynn Conway, Palo Alto, San Francisco, Turing Test, Bungle Affair, Carver Mead, Challenging Assumptions, The Machine Stops, Yellow Pages, Alexander Eliot, Civil War, Digital Property Trust, Internet Companion, Libraries of the Future

Concordance
These are the 100 most frequently used words in this book:
access another article available between book case changes come communication community computer copy costs course design different digital dream dreammc electronic even example experience first form get good group however idea information internet joannel2 journals kinds know knowledge large library life market may means meeting members message metaphor methods might mud need network new now number often others own paper part participants people place players problem process project provide public publishers publishing read real repositories research rights room say see sense several should social society system take technology text things thus time two use used users virtual without work world

Text Stats
These statistics are computed from the text of this book.

Readability -- Compared with books in All Categories
Fog Index: 15.9 -- 75% are easier, 25% are harder
Flesch Index: 40.0 -- 69% are easier, 31% are harder
Flesch-Kincaid Index: 13.0 -- 76% are easier, 24% are harder

Complexity
Complex Words: 18% -- 66% have fewer, 34% have more
Syllables per Word: 1.7 -- 65% have fewer, 35% have more
Words per Sentence: 21.3 -- 77% have fewer, 23% have more

Number of:
Characters: 804,122 -- 83% have fewer, 17% have more
Words: 129,664 -- 85% have fewer, 15% have more
Sentences: 6,078 -- 73% have fewer, 27% have more

Fun stats
Words per Dollar: 4,987
Words per Ounce: 5,332
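
for the curious, the three readability figures can be roughly reproduced from the complexity figures listed with them. the sketch below uses the standard published formulas for the gunning fog index, the flesch reading ease score, and the flesch-kincaid grade level; whether amazon computes its numbers exactly this way is an assumption, and the inputs are the rounded values from the listing, so small differences are expected.

```python
def fog_index(words_per_sentence, complex_word_percent):
    """Gunning fog index: 0.4 * (average sentence length + percent complex words)."""
    return 0.4 * (words_per_sentence + complex_word_percent)

def flesch_reading_ease(words_per_sentence, syllables_per_word):
    """Flesch reading ease: higher scores mean easier text."""
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word

def flesch_kincaid_grade(words_per_sentence, syllables_per_word):
    """Flesch-Kincaid grade level: approximate U.S. school grade."""
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59

# rounded inputs taken from the amazon listing
words_per_sentence = 21.3
syllables_per_word = 1.7
complex_word_percent = 18

# sanity check: words divided by sentences should give the listed 21.3
print(round(129664 / 6078, 1))

print(round(fog_index(words_per_sentence, complex_word_percent), 1))
print(round(flesch_reading_ease(words_per_sentence, syllables_per_word), 1))
print(round(flesch_kincaid_grade(words_per_sentence, syllables_per_word), 1))
```

with these inputs the formulas land near 15.7, 41.4, and 12.8, against the listed 15.9, 40.0, and 13.0, which is about as close as rounded inputs allow.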