
What is misquoted below is that we will not anoint any author of any markup language as "King" at the expense of all other efforts. As volunteers we encourage people to try their own alternatives. What too many people here want is to have fewer alternatives. As we have always said: "Run it up the flagpole, and if people salute, you have victory." However, WE are not going to declare you the victor beforehand; afterward, it will be obvious simply from the download statistics. Right now, for all the argumentation against them, it would appear that most downloads are .pdf files and go to Kindles... go figure!
mh
On Thu, 24 Feb 2011, Lee Passey wrote:
On Thu, February 24, 2011 1:58 am, Keith J. Schultz wrote:
On 23.02.2011 at 18:03, Lee Passey wrote:
[snip, snip]
My own view is that almost any markup language could satisfy (1), but that no markup language will be able to satisfy (2).
While I will agree pretty much with 1) the problem is 2)
The problem is that d.p simply refuses to have a concise mark-up language, that is, a language that has a fixed set of markup commands and also enforces the rules thereof. Instead they use a system where the contributors have too much freedom.
I think that the problem lies more at the feet of Project Gutenberg than Distributed Proofreaders, but I think we both agree that the problem is political, not technical. A solution is possible, it is just not acceptable.
While BowerBird, Mr. Hutchinson, Mr. Perathoner and I regularly disagree on details, I think we all agree that a single markup system, consistently adhered to, is required for any system to do automatic conversions between formats (I usually agree with Mr. Adcock, and believe that he and I could come to a meeting of the minds rather easily). But until Project Gutenberg is prepared to throw its weight behind some standard, nothing will be accomplished.
Because I do not believe that Michael Hart or the other Powers That Be that control Project Gutenberg will ever endorse any particular markup language, fully or in part, discussions about what the ideal markup would be are useless in this venue (except to the extent that they are of academic interest). People who care about electronic preservation of books and their pleasing presentation need to find a different organization to work with to accomplish that goal.
I have offered to develop a tailored language, but I also demand that it be used.
But that demand will never be accepted.
I would help with the tool chain, etc. I would work closely with them so that they would have exactly what they want.
The core controllers of Project Gutenberg only want "more". Quantity is everything, quality is nothing. Unless your tool chain can provide "more", it is of no interest to PG.
The Philosophy is simple. Develop a "medium" mark-up that can represent the content and structure of a book. Develop conversion routines that output end formats for the readers.
I agree 100%.
You can even add source information so that you have a master format that can aid in site management.
You can use RST, XML, HTML-like, etc. The only thing left to do is to design it.
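To make the "medium mark-up plus conversion routines" idea above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `Book` and `Chapter` classes, the `source` field, and the `to_html` and `to_text` functions are hypothetical names invented for this example, not anything Project Gutenberg or Distributed Proofreaders actually uses. The point is that one structured master record, carrying both content and source information, can feed several output formats.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List


@dataclass
class Chapter:
    # Hypothetical structural unit: a titled run of paragraphs.
    title: str
    paragraphs: List[str] = field(default_factory=list)


@dataclass
class Book:
    # Hypothetical master record: content plus source information.
    title: str
    author: str
    source: str  # provenance note, e.g. which printed edition was transcribed
    chapters: List[Chapter] = field(default_factory=list)


def to_html(book: Book) -> str:
    """Conversion routine 1: emit a simple HTML rendering."""
    parts = [f"<h1>{escape(book.title)}</h1>"]
    for chapter in book.chapters:
        parts.append(f"<h2>{escape(chapter.title)}</h2>")
        parts.extend(f"<p>{escape(p)}</p>" for p in chapter.paragraphs)
    return "\n".join(parts)


def to_text(book: Book) -> str:
    """Conversion routine 2: emit plain text, one blank line between blocks."""
    lines = [book.title.upper(), ""]
    for chapter in book.chapters:
        lines += [chapter.title, ""]
        for paragraph in chapter.paragraphs:
            lines += [paragraph, ""]
    return "\n".join(lines).rstrip() + "\n"
```

Both converters read the same record, so a correction made once in the master propagates to every end format. The plain-text output necessarily drops some structure, which is the "naturally with some loss" trade-off of converting downward.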
From a practical standpoint, HTML is the markup that has won in the marketplace; there's no reason not to adopt HTML as the basis for all markup. While HTML is indisputably the best markup language to adopt, by itself it cannot capture the unique structure of a book. As I believe you are suggesting, for representing books, HTML (or any other markup language) must be further constrained by standardized best practices, and refined by a standard set of semantic classes.
That should take no longer than half a year.
No, it should not. There are some things I feel passionate about (<p> should only be used for paragraphs, not for anonymous blocks) and other things I do not (what is the best representation for a thematic break, frequently represented in books by a dingbat?). But I believe that if a core group got together in a spirit of cooperation and compromise this kind of a specification could easily be created.
But this kind of specification will never be adopted, endorsed or even promoted by Project Gutenberg, so there is some question as to its value.
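As an illustration of what "HTML further constrained by best practices and a standard set of semantic classes" could look like in practice, the following Python sketch checks a fragment against a fixed tag set and a fixed class vocabulary. The tag list and class names are assumptions made up for this example; no such specification has been agreed on in this thread. The sketch uses only the standard library's `html.parser`.

```python
from html.parser import HTMLParser

# Hypothetical constraints, for illustration only.
ALLOWED_TAGS = {"h1", "h2", "p", "em", "blockquote", "hr"}
ALLOWED_CLASSES = {"chapter", "verse", "letter", "thematic-break"}


class ConstrainedHTMLChecker(HTMLParser):
    """Collects a message for every tag or class outside the allowed sets."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        line, _column = self.getpos()
        if tag not in ALLOWED_TAGS:
            self.problems.append(f"line {line}: tag <{tag}> is not allowed")
        for name, value in attrs:
            if name != "class":
                continue
            # A class attribute may hold several space-separated names.
            for cls in (value or "").split():
                if cls not in ALLOWED_CLASSES:
                    self.problems.append(
                        f"line {line}: class '{cls}' is not in the vocabulary"
                    )


def check(markup: str) -> list:
    """Return a list of problems; an empty list means the fragment conforms."""
    checker = ConstrainedHTMLChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems
```

A call such as `check('<p class="verse">...</p>')` returns an empty list under these assumed rules, while a `<font>` tag or an unknown class name is reported. A check like this covers only vocabulary; a rule such as "<p> only for paragraphs" still needs human judgment.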
So sure, there are tons of books already in the old formats. But they can be converted, naturally with some loss.
But that which is already lost breaks my heart. I believe the effort needs to be restarted with the goal of lossless digital transcriptions. A good digital edition of _Frankenstein_ seems much more important to me than a new edition of _History of the Ojibway Indians from 1830 to 1895_.
The reason that the lighter mark-up is being preferred is simple. It is not convoluted, and there are fewer possibilities for people to break out! In other words, there is not much to enforce. The restrictions and enforcement are kind of built in.
I have some degree of trepidation with the use of "light" as a modifier for "markup," because I don't understand what it means. Markup that is constrained, in that there is one, and only one, method to mark a specific construct is a good thing. Limiting the markup elements to a specific set of tags is also probably a good thing, although we must be mindful of Herr Einstein's famous recommendation: "things should be made as simple as possible; but no simpler." Discarding structure because there is not a convenient way to represent it is what got Project Gutenberg into this problem in the first place.
"Light" in the sense of "easy to overlook" is probably /not/ a good thing. I believe that markup should be explicit, not implicit. How good are you at distinguishing between 4 blank lines and 5 blank lines? Could you easily tell the difference between a line that starts with a tab character, and one that starts with eight spaces? (This distinction is important in creating Makefiles, and yet it still trips me up from time to time). You may not recognize that a line starts with one and only one space character, but you certainly can't confuse the meaning of a line that starts with <center> and ends with </center>.
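The whitespace distinctions above are hard to see by eye but trivial for a program, which is part of why implicit markup is fragile for humans. A small Python sketch, with function names (`describe_indent`, `blank_runs`) invented for this illustration:

```python
def describe_indent(line: str) -> str:
    """Spell out the leading whitespace of a line in words."""
    indent = line[: len(line) - len(line.lstrip(" \t"))]
    if not indent:
        return "no indent"
    return f"{indent.count(chr(9))} tab(s), {indent.count(' ')} space(s)"


def blank_runs(text: str) -> list:
    """Return the length of each run of consecutive blank lines."""
    runs, current = [], 0
    for line in text.splitlines():
        if line.strip() == "":
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs
```

Here `describe_indent` tells a tab-indented line apart from one indented with eight spaces, and `blank_runs` tells a gap of 4 blank lines apart from a gap of 5. A reader looking at the rendered text gets no such help.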
But these arguments (fun arguments, interesting arguments, worthwhile arguments, but arguments nonetheless) are probably best made in some other forum, because whatever the outcome they are irrelevant to Project Gutenberg.
OTOH, this is a free and easy forum, so we /could/ have those arguments here so long as we all recognize that any application or implementation of any consensus would have to be applied in an organization other than PG.
On a related note, about 7 years ago the community at alt.binary.ebooks confronted the problem of HTML support in e-book hardware and software and came up with a recommendation as to how to mark up books using HTML to be most compatible with hand-held devices. If anyone's interested I could probably drag up that spec and post it here.
Cheers, Lee
_______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/mailman/listinfo/gutvol-d