Re: [gutvol-d] I'm sorry but I don't get it...

I started e-books in the old days when PG was only plain text. Then, after quite a long lapse, I returned to discover that I could release a book in HTML if I wished, supplying a standard TXT along with it.

I am happy with this arrangement, sometimes doing both HTML and TXT, and sometimes just TXT, depending on how highly formatted the original was. I tend to work the opposite way, though, doing the HTML first (using a text editor, incidentally), then stripping the code for the TXT. It is probably not the most efficient way, but hobbies are not supposed to be efficient.

I am ignorant too about the acronyms you mentioned. I am also very pragmatic, and hope to remain totally ignorant of these until someone proves to me--with a history of examples--that it is worth it. TXT and HTML have such histories, so I shall stick with these for now.

Regarding HTML, some thoughts...

- Use the full range of tags when appropriate, but if possible stick with the older 3.2 tags unless newer ones are necessary (I always try the simplest tool first that will do the job). There was a reply about the limitations in TXT with heading hierarchies. HTML has several levels of header tags that are meant to be used for this purpose. Other tags can be used creatively to achieve other ends. A list of the 3.2 tags is at http://www.htmlhelp.com/reference/wilbur/list.html (don't forget to validate, though).
- The huge benefit of HTML (besides the text formatting that you mentioned) is the ability to insert images. Some books I would never have considered working on if I could not have done an HTML version.
- Don't forget to set the background color if you want a specific color (in the BODY tag, or style sheet). I have seen hundreds of pages where the writer assumes that white is always the default background color for everyone (not true), intending the graphics to blend into the background.
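A minimal sketch of the "HTML first, then strip the code for the TXT" workflow, in Python. Everything in it is an assumption made for illustration: the sample page, the names `TxtStripper` and `html_to_txt`, and the choice to turn italic and bold tags into the _italic_ and *bold* markers mentioned elsewhere in this thread. It uses only the standard library's `html.parser` and is not a PG tool; it does no rewrapping and drops images silently.

```python
from html.parser import HTMLParser

# Hypothetical sample page: HTML 3.2-era tags, an explicit background
# color in BODY, and two heading levels. Not taken from a real PG eBook.
SAMPLE_HTML = """<html>
<head><title>Sample Book</title></head>
<body bgcolor="#FFFFFF" text="#000000">
<h1>Sample Book</h1>
<h2>Chapter I</h2>
<p>It was a <i>very</i> dark night, and the wind was <b>loud</b>.</p>
</body>
</html>"""

# Inline tags mapped to plain-text markers (an assumed convention).
INLINE_MARKS = {"i": "_", "em": "_", "b": "*", "strong": "*"}
# Block-level tags that start and end a paragraph in the TXT.
BLOCK_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6"}
# Elements whose text should not appear in the TXT body.
SKIP_TAGS = {"title", "style", "script"}
# Internal paragraph-break sentinel; never appears in normal book text.
PARA = "\x00"


class TxtStripper(HTMLParser):
    """Collects text, replacing a few tags with plain-text equivalents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in INLINE_MARKS:
            self.parts.append(INLINE_MARKS[tag])
        elif tag in BLOCK_TAGS:
            self.parts.append(PARA)

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in INLINE_MARKS:
            self.parts.append(INLINE_MARKS[tag])
        elif tag in BLOCK_TAGS:
            self.parts.append(PARA)

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_txt(html):
    """Return plain text with one blank line between paragraphs."""
    parser = TxtStripper()
    parser.feed(html)
    parser.close()
    chunks = "".join(parser.parts).split(PARA)
    # Collapse whitespace inside each paragraph and drop empty ones.
    paragraphs = [" ".join(chunk.split()) for chunk in chunks]
    return "\n\n".join(p for p in paragraphs if p)


if __name__ == "__main__":
    print(html_to_txt(SAMPLE_HTML))
```

Run on the sample page, this prints the title, the chapter heading, and the paragraph as three blocks separated by blank lines, with `_very_` and `*loud*` marked up. Note that the heading levels are flattened in the TXT, which is exactly the limitation raised earlier in the thread.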
-----Original Message-----
From: Joshua Hutchinson <joshua@hutchinson.net>
Sent: Oct 15, 2004 7:08 AM
To: Project Gutenberg Volunteer Discussion <gutvol-d@lists.pglaf.org>
Subject: Re: [gutvol-d] I'm sorry but I don't get it...

Steve makes a good answer in another post, but I wanted to add my personal holy grail that hopefully a TEI-Lite master format will help bring about...

A single master document.

Right now, I create an ASCII version and then an HTML version. If I make the ASCII version first, it almost never fails that I find at least one more mistake when I then do the HTML version. I fix it there, but I have to remember it and go back to the ASCII version and make the fix there. And god forbid the fix requires another rewrap.

A master document format that is auto-converted to the others (at an acceptable level) would be wonderful and, imo, worth a little extra up-front effort to prepare.

If someone could get a working bit of code in place, I'd be happy to start testing it like crazy and work on old texts to get them converted to that format.

Josh

John Hagerson wrote:
Please picture this scenario:
I'm a volunteer who has scanned a public-domain book and wants to make it available through the PG distribution mechanism (free of charge, available until the Internet collapses under the weight of spam and next-generation pornography, yadda, yadda, yadda).
Today, if I can convert this book to plain text (according to some stated formatting conventions), I may submit the book. If I'm ambitious, I can create an HTML version, which presents the same information, but allows "real" formatting rather than _italic_ and *bold*.
In the background, however, there is this Whole New World(tm) of semantic tagging, which presumably will allow the book to make snacks and provide entertainment during the reading process. But, for me, as a volunteer, who spends a considerable amount of time working on books, but enjoys actually finishing one and seeing it posted, I can't get my arms around the benefits.
Except for recognizing the acronyms, I am agnostic to XML/ZML/TEI/ABC/EIEIO.
Could someone please explain the benefit of semantic tagging and why it won't horribly lengthen the amount of time required to produce an eBook?
Thank you.
_______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/listinfo.cgi/gutvol-d
---------------------------
Dennis McCarthy
nihil_obstat@mindspring.com

But PG has adopted standards which limit the range of tags and CSS you can use, so you may not be able to specify changes in background color or font, such as Alice in Wonderland. Some contributors put their HTML elsewhere, perhaps for this reason. Bad news.

nwolcott2@post.harvard.edu
Friar Wolcott, Gutenberg Abbey, Sherwood Forrest

----- Original Message -----
From: "Dennis McCarthy" <nihil_obstat@mindspring.com>
To: "Project Gutenberg Volunteer Discussion" <gutvol-d@lists.pglaf.org>
Sent: Friday, October 15, 2004 9:13 AM
Subject: Re: [gutvol-d] I'm sorry but I don't get it...

On Fri, Oct 15, 2004 at 10:53:39AM -0400, Norm Wolcott wrote:
But PG has adopted standards which limit the range of tags and CSS you can use, so you may not be able to specify changes in background color or font, such as Alice in Wonderland. Some contributors put their HTML elsewhere, perhaps for this reason. Bad news.
A slight correction: it's true that if you submit HTML files, CSS, bgcolors and other stuff are likely to be stripped out. Part of this is due to our automated "add a header" programs. Part is a desire to let the HTML be fairly generic.

But if you have an eBook that you'd really like to be displayed with particular colors, fonts, etc., just ask.

The only real "standard" is that we strongly desire valid HTML (per http://validator.w3.org). The rest is processing, programs and procedures, which might have the same impact as a standard sometimes, but should not be mistaken for one.

As MH likes to say, we're pretty well willing to try almost anything, at least in small quantities. Just ask.

-- Greg
participants (3)
- Dennis McCarthy
- Greg Newby
- Norm Wolcott