
Almost exactly a year ago I put an append on this forum about what was wrong with PG-generated epubs, offering my own hand-crafted ones to replace them. The response at the time could probably be summarised as "yes, we know about that, go away".
Over the last year I've probably re-mastered (for my own consumption) about 50 of the PG-generated epubs, many of them Anthony Trollope's novels produced by Joseph Loewenstein and others. Mostly what I am doing to the actual works is, as Greg Newby described in a recent append, removing stuff to do with formatting/layout and instead tagging with structural information that relates it to layout specified in a css file. I also have standard templates for the PG front and back matter wrappers and the xml files that the epub v2 standard requires. The css file I use is not particularly complicated; it contains remarkably few styles, and those that are there are largely derived from existing PG stylesheets. I make no claims for the excellence of my markup or stylesheet. I'm still learning, and no doubt you chaps could do better, but I achieve results that are satisfactory to me without using any tools more complicated than emacs and winzip. My main point is that to do this sort of thing you don't need to be an ace compiler writer or rocket scientist.
A year later I look again at this forum and you are still arguing about it, but nothing seems to have been achieved!
Some of you want to produce a grandiloquent change management/source code control system that gives fine-grained cooperative shared access to pages in the work. (Personally I found that the only significant productivity improvement I got was to take the generated epub and stick all the individual xhtml files back together so I could do emacs global changes on them (regular expressions are wonderful!) and then split them back up into individual chapter files afterwards.) Greg Newby seems to want to shunt everyone's epub contributions into a siding where they will wither and die from not being updated. Nothing is sacrosanct, except that no one else is allowed to touch David Price's or David Widger's work.
It seems to me that really all that is required is to recognise that a lot of the html in PG's archive is pretty antiquated by modern standards and needs to be updated. Basically this is a labour-intensive task which can't be completely automated, because the html wasn't originally done to consistent standards: it was produced over a decade or two during which html and ideas about how to use it evolved.
So really what needs to be done is to call for volunteers to do the clean up. And since Greg seems to imply that he is currently deluged with offers of documents from people who think they have already done this, it seems to me that there is plenty of enthusiasm out there (including me!).
And before that happens, some PG group (contributors to this forum, Wwers, whatever) needs to get together and write a 'How To' describing at the nuts and bolts level how to do the clean up. This should include a recommended set of styles (and the css that implements them), lots of advice about what not to do for people who don't already know, and access to templates, css style sheets, etc.
As my ex-employer used to say, 'there are no technical answers to management problems, but sometimes there are management answers to technical problems'.
Bob Gibbins
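For anyone who wants to try the join, edit, split trick described in the message above without emacs, here is a minimal sketch in Python. It works on chapter text already read into memory; the marker comment, the file names, and the "center" class are all made up for illustration and are not part of any PG template.

```python
import re

# Hypothetical marker line; any string that cannot occur in the book text would do.
MARKER = "<!-- FILE: %s -->\n"
MARKER_RE = re.compile(r"<!-- FILE: (.+?) -->\n")


def join_chapters(chapters):
    """Turn a list of (file_name, text) pairs into one big string."""
    return "".join(MARKER % name + text for name, text in chapters)


def split_chapters(joined):
    """Inverse of join_chapters: one big string back to (file_name, text) pairs."""
    parts = MARKER_RE.split(joined)
    # parts[0] is whatever preceded the first marker (empty for our own output);
    # after that, file names and chapter texts alternate.
    return list(zip(parts[1::2], parts[2::2]))


# Illustrative data only: two tiny "chapters" with presentational markup.
chapters = [
    ("ch01.xhtml", '<p align="center">CHAPTER I</p>\n'),
    ("ch02.xhtml", '<p align="center">CHAPTER II</p>\n'),
]

joined = join_chapters(chapters)
# One regular-expression change applied across the whole book at once.
joined = re.sub(r'<p align="center">', '<p class="center">', joined)
cleaned = split_chapters(joined)
```

Reading the xhtml files out of the epub and zipping them back up is left out here; the point is only that one global change can be made on the joined text and the chapter boundaries recovered afterwards.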

On Tue, January 31, 2012 12:48 pm, Robert Gibbins wrote: [snip]
And before that happens, some PG group (contributors to this forum, Wwers, whatever) needs to get together and write a 'How To' describing at the nuts and bolts level how to do the clean up. This should include a recommended set of styles (and the css that implements them) lots of advice about what not to do for people that don't already know, also including access to templates, css style sheets, etc.
I, for one, would like to see the style sheets you have created. Mr. Newby, can you provide us with some sort of repository where we can share this kind of stuff?
Others have mentioned the "standard" PG style sheets or the "standard" DP style sheets. Are these documented somewhere, or are we just talking about the sometimes random style declarations that appear at the beginning of some PG documents?

And if there are (or are to be) such standards, why would we want to embed them in the document, invalidating their usefulness for differentiating among output requirements?
_______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/mailman/listinfo/gutvol-d

On Tue, January 31, 2012 1:19 pm, don kretz wrote:
And if there are (or are to be) such standards, why would we want to embed them in the document, invalidating their usefulness for differentiating among output requirements?
We absolutely wouldn't, which leads me to rule #2:
2. No XHTML file will specify a <style> block. All classification of elements must reference external style sheets. Every XHTML file must contain two link elements, one to "pg.css" followed by a link to "pguser.css". "pg.css" will contain all of the style selectors agreed to as part of this effort, together with sample styles. "pguser.css" is reserved for user style preferences.
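A rule like this is easy to check mechanically. The following is only a rough sketch, assuming the rule is read strictly (exactly two stylesheet links, "pg.css" first and "pguser.css" second); the function and class names are invented for the example, and the sample document is not a real PG file.

```python
from html.parser import HTMLParser


class Rule2Checker(HTMLParser):
    """Collects what rule #2 cares about: <style> blocks and stylesheet links."""

    def __init__(self):
        super().__init__()
        self.has_style_block = False
        self.stylesheets = []  # href values of <link rel="stylesheet">, in order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "style":
            self.has_style_block = True
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower().split():
            self.stylesheets.append(attrs.get("href") or "")


def check_rule_2(xhtml_text):
    """Return a list of problems; an empty list means the file passes."""
    checker = Rule2Checker()
    checker.feed(xhtml_text)
    checker.close()
    problems = []
    if checker.has_style_block:
        problems.append("file contains a <style> block")
    # Compare only the file name, so a path such as "../css/pg.css" is accepted too.
    names = [href.rsplit("/", 1)[-1] for href in checker.stylesheets]
    if names != ["pg.css", "pguser.css"]:
        problems.append("expected links to pg.css then pguser.css, found %r" % names)
    return problems


# Illustrative document that follows the rule.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Chapter 1</title>
  <link rel="stylesheet" type="text/css" href="pg.css" />
  <link rel="stylesheet" type="text/css" href="pguser.css" />
</head>
<body><p>Text.</p></body>
</html>"""

print(check_rule_2(SAMPLE))  # []
```

Whether additional stylesheet links should be tolerated is a policy question the rule does not settle; the strict reading is used here only to keep the check simple.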

On Tue, Jan 31, 2012 at 12:53:06PM -0700, Lee Passey wrote:
I, for one, would like to see the style sheets you have created. Mr. Newby, can you provide us with some sort of repository where we can share this kind of stuff?
I already did, and you're using it: http://lists.pglaf.org/mailman/private/gutvol-d/
-- Greg

On Tue, Jan 31, 2012 at 07:48:58PM -0000, Robert Gibbins wrote:
[snip]
A year later I look again at this forum and you are still arguing about it, but nothing seems to have been achieved!
Thanks for these comments. They are reasonably fair & accurate.
Keep in mind that (as I just sent), we DO have a way of getting changes into the collection for the master formats. We do NOT have a good way of adding hand-crafted versions of files that are, otherwise, automatically generated from the master formats. Nor do we have a good way of recreating such hand-crafted versions, should the master format change.
In short, I'd really like to be able to accept your hand-crafted epubs. The challenges focus on doing this in a scalable and sustainable manner.
-- Greg

Greg>Keep in mind that (as I just sent), we DO have a way of getting changes into the collection for the master formats.
Correct me if I am wrong, but I have been assuming that "PG" is NOT receptive to "erratas" of the nature:
"Pg11001.mobi contains obvious formatting errors when displayed on an actual mobi reader. I have tried this file on several different mobi readers and they all display the same formatting errors. I have confirmed that changing the <p> formatting statement in the css from:
margin-top: 0.50em; margin-bottom: 0.50em;
to:
margin-top: 0.51em; margin-bottom: 0.49em;
is in fact a reasonable practical solution to the problem."

On Tue, Jan 31, 2012 at 08:04:39PM -0800, Jim Adcock wrote:
Greg>Keep in mind that (as I just sent), we DO have a way of getting changes into the collection for the master formats.
Correct me if I am wrong, but I have been assuming that "PG" is NOT receptive to "erratas" of the nature:
You're fractionally wrong. We often forward such reports to Marcello (sometimes others) so that they can use it as feedback towards improvements.
But generally you are right. My response when such things arrive in help@'s mailbox is along the lines of:
- our auto-conversion is not always accurate, and is particularly thrown off by some fancy layout that is found in some of our HTML; and
- we do not currently have a mechanism to tune those files, or add custom fixes to the collection, however,
- we are hoping to have such capabilities in the future.
-- Greg