Re: [gutvol-d] roger, your code _could_ scale just fine, if only you'd...

keith said:
This wisdom is absolutely a must!
sigh. keith, you're a perfect exemplar of the technocrats i was just talking about... and making fun of...

they saw lots and lots and lots of cases where someone used the file-system as a database and then ended up having scaling problems when it eventually grew.

so those technocrats came to believe they'd seized upon something "inevitable", and started to enforce the iron-clad rule that "you must _never_ use your file-system as a database, because it won't scale". and they're proud of themselves, and puff their chests, and slap themselves on the back for this bankable info. furthermore, they repeat the mantra to each other often, confident that they've come across some ultimate truth...

the problem is, there are some exceptions. and this is one of them.

as long as you have a script working on only one book, consisting of just a few hundred page-scan images and a few hundred corresponding text-files, and _further_, you have _no_more_ than a dozen people working on it (if that, and usually not simultaneously), then you are _not_ going to have any "scaling" issues with that script, especially if 80% of the text-files will never be changed. (and none of the graphics files will ever change, save for the occasional swapping-out of a badly-imaged page, which isn't even something the script will manage.) and even if you have a team of a dozen people working simultaneously on a book, they'll finish in half an hour.

so you are _not_ going to have any problems with "load".

and the important thing is that we _know_, for _certain_, that the per-book task will _never_ grow beyond that... so we will never be taken "by surprise" by scaling issues.

we might gain a ton of users, and eventually be working on thousands of books at the same time, but _each_book_ will still be a very small bite, easily chewed and swallowed.

more users would mean more hardware, more bandwidth, and more copies of the script running in different folders, but each script is still going to be doing a very easy task...

that is why this particular usage is _an_exception_ to the rule that you shouldn't use the file-system as a database.

so your "advice" is wrong, keith. it's just flat-out wrong.

and i was watching closely as roger developed his system, so i saw him make the error as he was making it, and i saw the ramifications of the error... all of a sudden, things were squirreled away in a database, and thus opaque... in this type of workflow, you want content out in the open.

and equally important here is the point that roger needs to know that he _can_ mount a full-on site, if he _wants_ to. he's now suffering from the misimpression that he can't. maybe he doesn't _want_ to do that, and that i'd understand, but i don't want him to think he can't, when really he can.
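[To make the per-book workflow above concrete, here is a minimal sketch, assuming a hypothetical folder layout that the thread never actually specifies: every page lives as a plain file inside the book's folder, and an ordinary script can report which pages have been touched. The folder names and the OCR-baseline comparison are illustrative assumptions, not anyone's real system.]

```python
#!/usr/bin/env python3
"""Minimal sketch of a per-book, file-system-as-database layout.

Assumed layout (hypothetical -- the thread never spells one out):
    mybook/
        scans/001.png ...   # page-scan images, never change
        ocr/001.txt   ...   # raw OCR text, the unedited baseline
        text/001.txt  ...   # proofed text, edited in place
"""
import filecmp
from pathlib import Path

def page_status(book_dir: str):
    """Yield (page, status) for every page in one book folder."""
    book = Path(book_dir)
    for ocr_page in sorted((book / "ocr").glob("*.txt")):
        proofed = book / "text" / ocr_page.name
        if not proofed.exists():
            yield ocr_page.stem, "missing"
        elif filecmp.cmp(ocr_page, proofed, shallow=False):
            yield ocr_page.stem, "untouched"   # still identical to the OCR
        else:
            yield ocr_page.stem, "edited"

if __name__ == "__main__":
    for page, status in page_status("mybook"):
        print(page, status)
```

[Nothing beyond a folder and a small script is involved, which is the scale bowerbird is describing.]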
keith said:
What most do not understand is that in order to have a "proper" database for any non-trivial task, you must add layers of abstraction to administer all the transactions and functions needed. Without them, a system will eventually bog down and will not scale well.
this is the other sickness that technocrats have.

first they introduce complexity. then, when the complexity starts to make it difficult to handle a very simple situation, they add _more_ complexity, as if that's gonna solve it all...

and _that_ is what causes things to "eventually bog down"...

-bowerbird

Hi BB,

On 19.10.2011, at 18:31, Bowerbird@aol.com wrote:
keith said:
This wisdom is absolutely a must!
sigh.

Below, you give me the very proof that a "proper" database is needed to administer etext/ebook production.
Your mistake is that you argue in terms of per-book production. Yes, for a single book it might seem like overkill. Yet what happens if you have a single user working on multiple books, or several users working on a single book or on multiple books simultaneously? I am not saying your system will not work, just that it does not scale well. Furthermore, your scripts do the work that I would have the database do!

Like I said, you have to add adequate abstraction to the database to handle the task. What you fail to see is that a database is not just a conglomerate of data and relations. It can do a lot more, if you design it properly.

Also, what you do not understand is that the database is not stored on the user's machine. The user checks out what he needs, works on it, and checks it back in. The database can then merge different edits, and do all the other neat stuff you like to have in a workflow. There is very little that has to be installed on the user's system: just a program, or scripts as you say. They are, in a sense, a mini database.

I never said that developing a "proper" database is an easy task. I avoid the term "database system" because most people equate it with a DBMS, which is a different animal altogether. The database is a system of tables, relations, and possibly other databases, all interacting. What you suggest is just one data repository; the type of database I have in mind has more layers that bind everything together, so the load is distributed across different repositories and controlled by the overlying database. That, in turn, is distributed over different processes, and possibly even different servers, all interacting with each other, to be fast and to scale.

You mention that certain texts and images will mostly never be changed. So what does that have to do with the database? It can recognize this fact and simply not create a new file. Your argument falls short of any database design considerations. Consistency is the essence of databases. If a database just gathers all the files that users check in, then it will, as you say, not scale well. That is why I say you need adequate abstraction of the administration of the data.

You mention added complexity, but the complexity is not added by the use of a database; it is part of the task of administering the edits of a text/book. All you do is pack that complexity into diverse scripts, and, if you will, a database of sorts!

regards
Keith

P.S. I am not a technocrat. I am a programmer!
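[Keith describes a check-out/check-in flow where the text lives in a central database: a user checks out a page, edits it locally, and checks it back in. Here is a minimal sketch of that flow; the sqlite3 engine, the table layout, and the function names are all assumptions made for illustration, since Keith names no concrete system.]

```python
"""Minimal sketch of a check-out / check-in page store (illustrative only)."""
import sqlite3

def open_db(path=":memory:"):
    """Open (or create) the page store."""
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS pages (
                      book TEXT, page TEXT, text TEXT,
                      checked_out_by TEXT,
                      PRIMARY KEY (book, page))""")
    return db

def checkout(db, book, page, user):
    """Hand a page to one user; refuse it if someone else already holds it."""
    row = db.execute("SELECT text, checked_out_by FROM pages "
                     "WHERE book=? AND page=?", (book, page)).fetchone()
    if row is None or (row[1] and row[1] != user):
        raise ValueError("page unavailable")
    db.execute("UPDATE pages SET checked_out_by=? WHERE book=? AND page=?",
               (user, book, page))
    db.commit()
    return row[0]                      # current text, for local editing

def checkin(db, book, page, user, new_text):
    """Store the edited text and release the page."""
    db.execute("UPDATE pages SET text=?, checked_out_by=NULL "
               "WHERE book=? AND page=? AND checked_out_by=?",
               (new_text, book, page, user))
    db.commit()
```

[Seeding the table with one row of raw OCR text per page, and adding real merge handling for conflicting edits, is the kind of abstraction layer Keith is arguing the database side has to provide; none of that is shown here.]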