
There is an interesting question not quite being discussed in this debate: what exactly IS epub-friendly HTML?

Greg's last append, and this isn't the first one, seems to imply that if only contributors would use only structural HTML, then all would be well.

I spent much of the 1980s writing my technical reports in GML, a predecessor of HTML (it's where the <p>, <ol>, <li>, <h1>, <h2> etc. tags in HTML came from), in which you could specify little MORE than document structure. For technical reports conforming to a well-understood corporate layout policy, this worked pretty well.

When I started looking at how to improve PG epubs a year or two ago, I initially assumed that declaring document structure in this way was indeed the main objective. But it now seems to me that just declaring document structure in HTML is inadequate, because there is meaning in the layout of books which isn't easily or obviously coded in simple HTML structural tags.

One might object that HTML extended with CSS styles is much more flexible and customisable than plain old HTML: you can use the facilities of CSS to effectively create your own tags (e.g. lots of <div>s and <span>s of specifically declared classes). This is perfectly true, but the danger is that by doing so you superimpose your own conceptual model of the structure of the book's contents onto the author's original work. And what you may well actually be doing is recording your misunderstanding.

To give an example, many 19th century novels include letters in the text (Austen, the Brontës, Trollope etc.). You could come up with a standard set of styles for letter headings, openings and closings, which would then standardise how to format them. Initially I thought this was what I should do (and from a brief look at TEI that seems to be what it does), but looking a little closer, the particular style and layout of the headings, etc.
often conveys something about the formality or intimacy of the relationship between writer and correspondent. To encode this satisfactorily into a discrete set of CSS styles one would need, first, a profound understanding of the letter-writing conventions in use when the book was written and of how they changed over time, and secondly, a way to encode those conventions into HTML/CSS in a sufficiently expressive way.

My current thinking is that although much of the coarse-grained structure of a literary work (the division into front matter, main body and back matter, the division into volumes, chapters and sections, the division of much of the text into paragraphs) CAN be satisfactorily encoded using standard HTML structural tags, for the finer-grained structure it is more realistic, and probably more likely to convey the author's original meaning, to try instead simply to 'virtualise' the original layout (using CSS styles to document the virtual layout).

What I mean by virtualise is to do the markup to preserve layout in as scalable, reflowable and device-independent a way as HTML allows. For example:

- not quoting fixed font sizes
- specifying margins in scalable units like ems
- using the hanging-indent technique quoted by Jim Adcock a few months ago where appropriate

I'm sure other contributors to this forum could produce many more such techniques.

Whenever in the past I have suggested that a little standardisation of HTML might be a good idea, this has been rejected by Greg, more or less out of hand, on grounds like:

- PG is not interested in individuals who want us to admire their pretty layouts
- PG doesn't want to place artificial restrictions on submitters' HTML

Actually I agree with both of these points, but my belief is that if PG wants to produce readable epubs (something which many people agree it isn't currently doing), there isn't a simple one-line answer that achieves this.
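
As a rough illustration of the three 'virtualising' techniques listed above, something like the following CSS might do. This is only a sketch: the class names are hypothetical, and the hanging indent shown is the common margin-plus-negative-text-indent form, which may not be exactly the variant Jim Adcock quoted.

```css
/* Hypothetical class names, for illustration only. */

/* No fixed font size: 0.9em scales with whatever base size
   the reader or device has chosen. */
.letter-heading {
  font-size: 0.9em;
  text-align: right;
  margin: 1em 2em 0.5em 0;
}

/* Margins in ems rather than pixels or points, so they grow
   and shrink with the text. */
.letter-body {
  margin: 1em 2em;
}

/* Hanging indent: the block is pushed in by 2em and the first
   line pulled back out by the same amount, so continuation
   lines sit 2em to the right of the first line. */
.hanging {
  margin-left: 2em;
  text-indent: -2em;
}
```

Nothing here says what a block MEANS; the classes only record how the original page was laid out, in units that reflow on a small screen.
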
PG is going to have to step up to the plate and think about standardising its HTML contributions. And (ahem) from what I can see of their contributions, it's not totally clear to me that the current PG inner circle necessarily have all the experience and skills to do this on their own.

Bob Gibbins