Re: [gutvol-d] Using ebookmaker to create PDFs for print on demand publishing

Eric,
I have experimented with .tex and looked at the source code for ebookmaker and have concluded that updating that code to produce print-on-demand ready files would be more trouble than it is worth. This afternoon I took one of my own RST donations, ran it against rst2xetex.py and with about a half hour of simple edits I made a .tex file that produced a PDF that met all of Kindle Direct Publishing's requirements and looked great. Any .tex file I could produce with a modified ebookmaker would still need a few manual edits to get ready for publication, so I think there would be greater benefit if I just taught people to do what I just learned to do.
I understand that RST is not a very popular way to prepare PG donations but I think it would be if more people knew about it. You do have a very short article on RST in the Wiki but it is hard to find, even harder if you don't know what RST is and just want some tips on how to do a PG donation with less work. I think I could write something more fleshed out that actually sells the idea of using RST. I could also do a second article on using RST files for print on demand.
I did my first donation back in 2011
https://www.gutenberg.org/files/36378/36378-h/36378-h.htm
and discovered RST the following year, after doing several books the hard way.
https://www.gutenberg.org/ebooks/39442
James Simmons
On Mon, 2020-05-25 at 09:55 -0500, nicestep@gmail.com wrote:
Eric,
I'm going to try publishing my book with the .tex file I created manually. That way I'll have a good Tex file to use as an example for generating automatically. So far it turns out there is more to Tex than I thought there was.
I have a fourth volume of Ramayana to finish, so I'll probably do that before I work on updating ebookmaker. I'll create my own branch to work in.
I agree that I might be the last user of RST, but there is no reason it has to stay that way. I just looked at the PG wiki pages and there are no articles on RST there at all. I can't even remember how I found out about it. My first RST donation was done back in 2012.
There is other information that I don't see on the Wiki either. For example, articles talk about getting old books from secondhand shops, but I don't see any mention of archive.org and the fact that they have page images and OCR'ed text for tons of books.
I could contribute a Wiki article on RST if you like. I have some experience writing about e-books from the One Laptop Per Child project:
https://archive.org/details/EBookEnlightenment/mode/2up
James Simmons
On Sun, 2020-05-24 at 21:38 -0400, Eric Hellman wrote:
OK, that's interesting; I'm not sure why they didn't show up on my spreadsheet.
I'm in a position now where I can fix things at PG and move them forward; don't hesitate to contact me if you think you can help. Everyone wants to do the right thing, but the code and especially the workflow are from another era. The ebookmaker software is in relatively good shape now.
Eric
On May 24, 2020, at 6:24 PM, nicestep@gmail.com wrote:
Eric,
I would question your statement that the most recent RST sourced book is 5 years old. I donated four of them more recently than that:
https://www.gutenberg.org/ebooks/61937
https://www.gutenberg.org/ebooks/57265
https://www.gutenberg.org/ebooks/57826
https://www.gutenberg.org/ebooks/60188
I have done PG donations before discovering RST, but I would not go back. The books I've been doing lately have hundreds of footnotes. RST makes this tolerable. I think if more people knew it existed more would use it. It saves a great deal of work.
I'll have to look at the code again to see how docutils is being subclassed. I hadn't noticed that. I've done Python programming in the past, for the One Laptop Per Child project, but I'm rusty at it.
James Simmons
On Sun, 2020-05-24 at 10:09 -0400, Eric Hellman wrote:
The ebookmaker rst code is basically subclassed docutils - so deeply that upgrades to docutils broke ebookmaker and I had to fix it. My impression is that Marcello, the original author of ebookmaker, was a contributor to docutils.
There are 485 RST-sourced books in PG; the most recent is 5 years old.
The intermediate tex files generated by ebookmaker during rst processing are accessible but not exposed on the PG website. For example: http://www.gutenberg.org/cache/epub/48620/pg48620.tex
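For anyone who wants to look at those intermediate files for other books, here is a minimal sketch in Python. It assumes the path pattern generalizes from the single example above, which the thread does not confirm, and the function name cached_tex_url is made up for illustration. A .tex file would presumably exist only for RST-sourced books, so a URL built this way may well return nothing.

```python
def cached_tex_url(ebook_number):
    """Build the cache URL where an intermediate .tex file might live.

    Assumption: the pattern is inferred from one example URL
    (book 48620) and may not hold for every book.
    """
    number = int(ebook_number)
    if number <= 0:
        raise ValueError("ebook number must be a positive integer")
    return f"http://www.gutenberg.org/cache/epub/{number}/pg{number}.tex"


if __name__ == "__main__":
    # Reproduces the example URL quoted in the message above.
    print(cached_tex_url(48620))
```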
In the rest of the world, there's been a lot more development of rst-ish tools outside of rst. GitHub Flavored Markdown is the most widely used, but even asciidoc has found some adoption in the publishing industry. Gitbooks is a rather nice markdown-based book generator.
By all means, if you can make improvements to the ebookmaker code, submit a PR on GitHub. No one is working on anything related to tex or rst.
https://github.com/gutenbergtools/ebookmaker
A "PODWriter" class might be what you're thinking about.
Eric
On May 24, 2020, at 8:51 AM, nicestep@gmail.com wrote:
My first impression of the PDFs created by ebookmaker was that with a little work they might be suitable for print on demand publishing, like I have done several times with Create Space. I have spent a few hours installing ebookmaker on my own computer and experimenting with Tex, and the attached files represent the kind of thing that ebookmaker might be modified to produce. They are not perfect, but I think they are good enough to criticize.
Some observations:
1. The Tex file was created by running my original RST file against both ebookmaker and rst2xetex.py. The two files that were produced were combined to produce the Tex file attached. Rst2xetex.py does not understand the custom RST elements PG uses so I had to make a version of the RST file without them to get rst2xetex to run properly.
2. The output of rst2xetex was actually closer to what I wanted than what ebookmaker produced. Ebookmaker does a table of contents that is just links with no page numbers, and the headings are not regular headings and subheadings that Tex uses. Rst2xetex gave me regular headings and subheadings that could create the TOC I wanted. I had to run xelatex against the Tex file twice to create the PDF with a TOC.
3. I think that a Tex file is a more useful output for ebookmaker than a finished PDF. Create Space wants you to have ISBN and ISBN-13 numbers on the Verso, so having a Tex file to edit is easier than trying to edit a PDF. Also, PG has a lot of short stories that could be combined to create anthologies and that would be easier with Tex files.
4. Looking at the code for ebookmaker it looks like some of it might have been adapted from code written for docutils. Some of the code is clear enough, and some is not.
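The steps in items 1 and 2 can be sketched in Python. This is a rough sketch under stated assumptions, not what ebookmaker does or exactly what was run here: the directive names in PG_ONLY_DIRECTIVES are placeholders for whichever PG-specific elements a given file uses, the helper names strip_directives, build_commands and run are made up for illustration, and rst2xetex.py and xelatex are assumed to be on the PATH. xelatex is listed twice because the first pass only records the headings, and the second pass typesets the TOC from that record.

```python
import re
import subprocess
from pathlib import Path

# Hypothetical directive names, for illustration only; substitute the
# PG-specific directives your own RST file actually uses.
PG_ONLY_DIRECTIVES = {"pgheader", "pgfooter"}

# Matches a directive line such as ".. name::" and captures its indent and name.
_DIRECTIVE = re.compile(r"^(\s*)\.\.\s+([\w-]+)::")


def strip_directives(rst_text, names=PG_ONLY_DIRECTIVES):
    """Drop each named directive together with its indented body."""
    kept = []
    skip_indent = None  # indent of the directive currently being dropped
    for line in rst_text.splitlines():
        if skip_indent is not None:
            indent = len(line) - len(line.lstrip())
            if not line.strip() or indent > skip_indent:
                continue  # blank line or body line of the dropped directive
            skip_indent = None  # back at the directive's own indent level
        match = _DIRECTIVE.match(line)
        if match and match.group(2) in names:
            skip_indent = len(match.group(1))
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


def build_commands(rst_path):
    """Return the commands: RST to .tex once, then xelatex twice."""
    rst = Path(rst_path)
    tex = rst.with_suffix(".tex")
    to_tex = ["rst2xetex.py", str(rst), str(tex)]
    to_pdf = ["xelatex", "-interaction=nonstopmode", str(tex)]
    return [to_tex, to_pdf, to_pdf]


def run(rst_path):
    """Run the commands in order, stopping at the first failure."""
    for command in build_commands(rst_path):
        subprocess.run(command, check=True)
```

With a cleaned copy of the RST saved as book.rst in the current directory, calling run("book.rst") would leave book.tex available for hand editing before the final PDF is regenerated.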
James Simmons
--

Eric,
Last week I published four print on demand books in record time using my RST files, the output of rst2xetex.py, and a little editing. The results were excellent. You can check them out at Amazon.com at these URLs:
https://www.amazon.com/dp/B089M6J5GZ
https://www.amazon.com/dp/B089HZCGQD
https://www.amazon.com/dp/B089HZJ7N2
https://www.amazon.com/dp/B089J2TV9G
In the book descriptions I provide links back to the original PG texts.
I've got one more volume of *Ramayana* to finish, then I'm quitting Hindu scriptures for a while and doing some books on early aviation. I'll publish them all the same way. It is so easy, there is no reason not to.
I have published other PG books as print on demand titles, by taking the HTML, stripping out styles, etc. and importing the page into Libre Office, then manually changing it to a word processed document with the needed page size, margins, etc. It is a LOT of work. Converting an RST file is not.
James Simmons
On Thu, May 28, 2020 at 5:53 PM <nicestep@gmail.com> wrote: