
don said:
Slow down
let me come back to that...
read the *second* sentence.
i did read it. i wasn't impressed.
It's pure html in the mysql database.
precisely. that's my objection. and i stated it, as such, quite clearly, when i said:
so on top of the obfuscation of the text by the .html, it's now buried inside of a mysql database, meaning that anyone who wants to understand the system well needs to acquire an additional hairy set of experience.
in other words, you take clean text and gunk it up with .html, and then you also bury that gunked-up text in a database. and you want us to think that is a good system? that's like taking an art-book with nice pictures and drawing mustaches and beards on all the pretty women (that would be the .html coding), then putting the book into a bag (the database), and then expecting us to "appreciate" the artistry. um, no thanks. just leave the thing unmolested.
And if it's "impure", then blame me, because it's precisely what I imported from the DP project - for better or worse. Same for the CSS.
well, one of us is confused, that's for sure...
_i_ thought we were talking about a system that could be used by an entity like d.p. to do book-digitization, with a suggestion wordpress would do the .html/.css. in other words, we put in .text, and get out .html/.css.
but now _you_ seem to be discussing your own system, where you "imported" the .html (and the .css) from d.p. you seem to have already _started_ with the .html/.css. so one of us is confused about the topic of discussion.
i don't really care what you use for your own projects. whatever makes you happy is absolutely fine with me. but if you're going to propose a system for volunteers, i'd say it must take their skill-sets into consideration, and attempt to make things as simple as they can be.
and for book-digitization, text-files are all you need, a fact which i have demonstrated time and time again. text-files have the advantage of magical simplicity...
the weird thing is that most geeks know this very well, and practice it near-religiously in their own workflows. a very big part of the unix philosophy is "piping" text...
but all of a sudden, in dealing with _books_, which are (and always have been) predominately text (with some pictures thrown in), all of a sudden we need _markup_. what's up with that?
especially since, in their own _manuals_, geeks such as the python community use -- ta-da! -- light-markup... i guess angle-bracket-tags are only for the suckers who are too stupid to realize that they don't really need 'em...
***
don said:
Slow down
um, again, no thanks, don. i have a much better suggestion -- flip your script. let's instead have _you_ "speed up" a bit -- a lot! -- along with the rest of this listserve, if it's possible...
you've become accustomed to the glacial pace of distributed proofreaders and project gutenberg and this listserve, where the topics of conversation haven't changed all that much in a decade when -- all around us -- e-books have finally "caught on".
8 years ago, this list was hung up on .xml, acting as if that was "the important thing" for discussion... history proved that was one big stinking dead-end.
and now you're back on .html as "the big thing", but that's a dead-end too. do you really believe that in 2025, people will still be fussing with angle-brackets? get off the merry-go-round and shoot for the moon.
do you want to know the most exciting development in e-books in recent months? it's amazon's "x-ray".
Amazon invented X-Ray, a new feature that lets customers explore the "bones of the book." With a single tap, readers can see all the passages across a book that mention ideas, fictional characters, historical figures, places or topics that interest them, as well as more detailed descriptions from Wikipedia and Shelfari, Amazon's community-powered encyclopedia for book lovers. Amazon built X-Ray using its expertise in language processing and machine learning, access to significant storage and computing resources with Amazon S3 and EC2, and a deep library of book and character information. The vision is to have every important phrase in every book.
if amazon pulls off "x-ray", it could be phenomenal.
this is the kind of thing that p.g. could've "invented", since it once had the biggest public-domain corpus. but you guys here were caught up in _file-formats_, among the most tedious and irrelevant e-book trivia.
so while you've focused on markup, other entities are doing stuff that's far more interesting and important.
it's as if you were at the grand canyon, but instead of paying any attention to the marvelous natural features, you're all still at the tram arguing about what kind of upholstery will be best to use for the seats in the tram. and when i roll my eyes, you think i'm picking a fight.
you're missing out on the real action, and you don't have the slightest clue that that's what's happening.
so no, don, i _ain't_ gonna "slow down", because i'm _tired_ of going nowhere and nowhere and nowhere.
and i firmly believe that you, don, should also decide not to let your good imagination be held back by the foot-draggers over at d.p. you're smarter than them. so demonstrate it, don, by letting your creativity soar.
and believe me, a suggestion to "use wordpress" does _not_ accomplish that objective; it doesn't come close.
-bowerbird