Generated HTML files

As part of my efforts to develop a consistent markup schema for HTML files, and to create a method for automatic conversion to practical formats such as ePub and Kindle, I have reviewed many of the automatic conversions now available at Project Gutenberg, as well as several of the "snowflake" versions produced by the Widger/Haines consortium. My conclusion is that for almost all purposes the automatic conversion is superior to any one of the multiple Widger versions. This conclusion leads to two questions:
1. How can I get an automated HTML conversion of a text which has a hand-crafted HTML version?
2. Does the automated HTML conversion tool set exist on readingroo.ms, so I can execute it locally in that environment?

On 02/25/2012 08:49 PM, Lee Passey wrote:
1. How can I get an automated HTML conversion of a text which has a hand-crafted HTML version?
You can't.
2. Does the automated HTML conversion tool set exist on readingroo.ms, so I can execute it locally in that environment?
If you are referring to the conversion from:
- RST: install epubmaker or use the online epubmaker,
- plain text: there is no other converter than the one at gutenberg.org.
--
Marcello Perathoner
webmaster@gutenberg.org

On 2/25/2012 1:29 PM, Marcello Perathoner wrote:
On 02/25/2012 08:49 PM, Lee Passey wrote:
1. How can I get an automated HTML conversion of a text which has a hand-crafted HTML version?
You can't.
Unfortunate, ill-advised, but not surprising. As long as we have two "snowflakes", why not add a third?
2. Does the automated HTML conversion tool set exist on readingroo.ms, so I can execute it locally in that environment?
If you are referring to the conversion from:
- RST: install epubmaker or use the online epubmaker,
No. At this point, ReStructured Text holds about the same amount of interest for me as z.m.l.; there are probably about the same number of books marked up in each of the two formats.
- plain text: there is no other converter than the one at gutenberg.org.
Mr. Newby, can you get the "plain" text converter from gutenberg.org mirrored to readingroo.ms as part of the Project Gutenberg mirror?

On Sat, Feb 25, 2012 at 02:22:52PM -0700, Lee Passey wrote:
Mr. Newby, can you get the "plain" text converter from gutenberg.org mirrored to readingroo.ms as part of the Project Gutenberg mirror?
While I have access to the full back-end of www.gutenberg.org, it's a rather twisty little maze. Marcello would be better equipped to help with this. If he can install it on pglaf.org, or point me to it, I could copy it elsewhere.
-- Greg

Lee> 1. How can I get an automated HTML conversion of a text which has a hand-crafted HTML version?
Marcello> You can't.
Well, with help apparently not forthcoming from the PG home front, check out GutenMark (www.sandroid.org/GutenMark), which works out "OK" in my experience. You might want to play around with the command-line switches and decide what you like best. Many of the non-handcrafted files on freekindlebooks.org were made via GutenMark, which I used simply because it gave me a simple path from .txt to .html; the files on freekindlebooks were generated before PG was willing to even host generated epub and mobi files. GutenMark has at least one problem: by default it tries to change straight quotes to curly quotes, but fails in non-trivial circumstances.
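To illustrate why straight-to-curly quote conversion tends to fail in non-trivial cases, here is a minimal Python sketch. It is not GutenMark's algorithm; the function name `naive_curly_quotes` and its rule (a quote at the start of the text, or after whitespace or an opening bracket, opens; every other quote closes) are assumptions made purely for illustration.

```python
import re


def naive_curly_quotes(text: str) -> str:
    """Hypothetical naive rule, NOT GutenMark's actual logic.

    A quote at the start of the text, or after whitespace or an opening
    bracket, is treated as an opening quote; every other quote closes.
    """
    # Double quotes: mark the openers first, then close whatever is left.
    text = re.sub(r'(^|[\s(\[{])"', '\\1\u201c', text)
    text = text.replace('"', '\u201d')
    # Single quotes and apostrophes get the same treatment.
    text = re.sub(r"(^|[\s(\[{])'", '\\1\u2018', text)
    text = text.replace("'", '\u2019')
    return text


# The simple case works as hoped.
print(naive_curly_quotes('"Hello," she said.'))
# A leading apostrophe is mistaken for an opening single quote.
print(naive_curly_quotes("'Tis the season"))
# A nested opening quote follows a double quote rather than whitespace,
# so the rule wrongly treats it as a closing quote.
print(naive_curly_quotes('"\'Stop!\' he cried."'))
```

The second and third calls show the kind of input where a purely positional rule goes wrong; handling them needs knowledge of elisions and quote nesting, which is presumably where a simple converter runs into trouble.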

Hi Marcello, Lee,
On 25.02.2012 at 21:29, Marcello Perathoner wrote:
On 02/25/2012 08:49 PM, Lee Passey wrote:
1. How can I get an automated HTML conversion of a text which has a hand-crafted HTML version?
You can't.
Theoretically, it is possible, but the task is quite complex and tedious. You need a lot of heuristics and intelligence in the program. There is no fast answer! It would require a humongous number of man-hours to write such a program. This, at least, is my first impression. I seriously doubt it would be worth the effort in the PG context.
regards
Keith
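As a rough illustration of the kind of heuristics a plain-text converter needs, here is a minimal Python sketch. It is not the converter used at gutenberg.org; the function `text_to_html`, the heading pattern, and the 60-character cutoff are all assumptions chosen for the example.

```python
import html
import re

# Assumed heuristic: a short block starting with one of these upper-case
# words is a heading. Real texts would need many more rules than this.
HEADING_RE = re.compile(r"^(CHAPTER|BOOK|PART)\b")


def text_to_html(text: str) -> str:
    """Convert blank-line-separated plain text into <h2>/<p> elements."""
    out = []
    for block in re.split(r"\n\s*\n", text.strip()):
        # Rejoin hard-wrapped lines into one logical paragraph.
        line = " ".join(part.strip() for part in block.splitlines())
        if not line:
            continue
        escaped = html.escape(line, quote=False)
        if HEADING_RE.match(line) and len(line) < 60:
            out.append(f"<h2>{escaped}</h2>")
        else:
            out.append(f"<p>{escaped}</p>")
    return "\n".join(out)


sample = "CHAPTER I\n\nIt was a dark\nand stormy night.\n\nTom & Jerry"
print(text_to_html(sample))
```

Even this toy version has to guess at headings and paragraph boundaries; verse, tables, footnotes, and emphasis would each need further rules, which gives some sense of the effort described above.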
participants (5)
- Greg Newby
- Jim Adcock
- Keith J. Schultz
- Lee Passey
- Marcello Perathoner