
so, james, i'm finally getting around to trying to figure out what to do about you and your book, and whether i can help you.
so i did a compare between a file of the book from december 22nd and just the other day...
and i see some interesting stuff... :+)
i expected to find you'd done a few more pages, with everything before them basically the same, and everything after them still left untouched...
but what i actually found is that you have been doing some book-wide global-change editing, including a mass fix of spacey doublequotes, and some global-change name replacements. looks like you also did a hunt for em-dashes...
well done, my friend.
i'm sure it was all your idea, and a good idea it was, sir! i applaud your accomplishments...
anyway...
i think what i would need to do is to _rewrap_ the file back to its original p-book linebreaks, and reintroduce indicators for the pagebreaks.
that's what it would take to put it in my system.
i'm willing to _try_ to do those things, to see if it's relatively _easy_ to do them, and -- if so -- (but there are no guarantees!), then i would be able to build a system for you. will you use it? if not, i won't bother. but if so, i'll give it a try.
-bowerbird
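
As an illustration of the book-wide name replacements mentioned in this message, here is a minimal sketch in Python. The thread does not say which tool was actually used or which names were changed, so the `ACCENTED` mapping and the `accent_names` function are hypothetical. The one point the sketch makes is that matching whole words keeps a global change from touching longer words that merely contain a name.

```python
import re

# Hypothetical mapping of plain spellings to accented ones.
# The real list of names is not given in the thread.
ACCENTED = {
    "Rama": "Râma",
    "Sita": "Sîtâ",
}

# One pattern for all names, anchored on word boundaries so that
# "Rama" inside "Ramayana" is left alone.
_NAME_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in ACCENTED) + r")\b"
)


def accent_names(text):
    """Replace each whole-word name in ACCENTED with its accented form."""
    return _NAME_PATTERN.sub(lambda match: ACCENTED[match.group(1)], text)
```

Running the replacement over the whole file in one pass, and then comparing the result against the previous version, is one way to review every change before keeping it.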

Bowerbird,
I have been doing some mass search-and-replace passes. I'm still putting in em-dashes by hand, but a lot of the names that need accenting I've been trying to do by mass search and replace. I am somewhat familiar with this book, so I can sometimes guess which accented words will pop up in future pages and which won't.
Roger's idea of running text through a preprocessor to try to guess which lines are poetry, which are blockquotes, etc. sounds appealing to me. A two-phase approach like that might make HTML conversions more accurate. I know that what I have now would not survive the GUIGUTS HTML converter.
I'm not devoting a lot of time to the Bhagavata Purana these days. It may be several weeks before I am finished with it. I could take a break in the middle of it to work on a Bhagavad Gita translation. That is a much shorter book, but one that is definitely worthy of PG.
If you wanted to abandon BP and instead try to create something along the same lines for Bhagavad Gita, that might be better. I could use your software from the beginning, doing it exactly the way you wish, and I would benefit from text that has em-dashes and maybe even italics and accents preserved. I cannot retrofit these things in BP easily, so I'd rather trudge on as I've been doing. BG would prove the value of your approach, I think.
BP has so many tables and family trees that it really demands the approach I'm taking. I've had to rework the family trees to make them work as 70-column ASCII text. I don't think DP could have done this, for example.
The Bhagavad Gita translation I'm leaning towards doing is this one: http://www.archive.org/details/srimadbhagavadg00swamgoog
James Simmons
On Thu, Jan 5, 2012 at 12:09 PM, <Bowerbird@aol.com> wrote:
so, james, i'm finally getting around to trying to figure out what to do about you and your book, and whether i can help you.
so i did a compare between a file of the book from december 22nd and just the other day...
and i see some interesting stuff... :+)
i expected to find you'd done a few more pages, with everything before them basically the same, and everything after them still left untouched...
but what i actually found is that you have been doing some book-wide global-change editing, including a mass fix of spacey doublequotes, and some global-change name replacements. looks like you also did a hunt for em-dashes...
well done, my friend.
i'm sure it was all your idea, and a good idea it was, sir! i applaud your accomplishments...
anyway...
i think what i would need to do is to _rewrap_ the file back to its original p-book linebreaks, and reintroduce indicators for the pagebreaks.
that's what it would take to put it in my system.
i'm willing to _try_ to do those things, to see if it's relatively _easy_ to do them, and -- if so -- (but there are no guarantees!), then i would be able to build a system for you. will you use it? if not, i won't bother. but if so, i'll give it a try.
-bowerbird
_______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/mailman/listinfo/gutvol-d
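
The first phase of the two-phase approach James mentions (a preprocessor that guesses which lines are poetry, which are blockquotes, and so on) could look roughly like the sketch below. This is not Roger's actual preprocessor, which the thread does not describe. The function names, the 70-column default, and the 0.7 and 0.8 thresholds are all assumptions chosen for illustration.

```python
def split_blocks(text):
    """Split text into blocks of consecutive non-blank lines."""
    blocks, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def classify_block(block, wrap_width=70):
    """Guess whether a block is 'poetry', 'blockquote', or 'prose'.

    Assumed heuristics, not taken from the thread:
    - poetry: nearly every line except the last ends well short of
      the wrap width, which suggests deliberate line breaks
    - blockquote: lines fill the width but every line is indented
    - prose: everything else
    """
    if not block:
        return "blank"
    leading = block[:-1]  # the last line of any paragraph may be short
    short = sum(len(line.rstrip()) < 0.7 * wrap_width for line in leading)
    if leading and short >= 0.8 * len(leading):
        return "poetry"
    if all(line.startswith((" ", "\t")) for line in block):
        return "blockquote"
    return "prose"
```

A second phase would then turn each labelled block into the matching HTML. Keeping the guessing separate from the conversion means the labels can be checked and corrected by hand before any markup is generated.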

On Thu, January 5, 2012 3:08 pm, James Simmons wrote:
> The Bhagavad Gita translation I'm leaning towards doing is this one:
Might I suggest this one? http://www.ebookcoop.net/ebookcoop/FromIA?srimadbhagavadg00swamgoog

Lee,
I didn't get any response from that URL for a long time. When I got it the text was a mess. Without page images I wouldn't know how to format it.
I suggested the one I did because the copyright date was good, it seemed to be well done, and we'd have page images and OCR'd text to work with.
I will do a Bhagavad Gita sooner or later. The important thing is to have one that can meet Bowerbird's requirements for his ZML application. That will determine if I should put my longer book project on hold.
James Simmons
On Fri, Jan 6, 2012 at 2:26 PM, Lee Passey <lee@novomail.net> wrote:
On Thu, January 5, 2012 3:08 pm, James Simmons wrote:
The Bhagavad Gita translation I'm leaning towards doing is this one:
Might I suggest this one?
http://www.ebookcoop.net/ebookcoop/FromIA?srimadbhagavadg00swamgoog

On Fri, January 6, 2012 1:43 pm, James Simmons wrote:
> Lee,
> I didn't get any response from that URL for a long time.
That's to be expected, there's lots of processing going on in the background.
> When I got it the text was a mess. Without page images I wouldn't know how to format it.
But you have the page images; they're right there at http://ia700406.us.archive.org/24/items/srimadbhagavadg00swamgoog/srimadbhag.... If the text is a mess, then you've got problems, because it is exactly the same text that you would get from IA, except italics, soft-hyphens and em-dashes have been preserved, and every word is marked if 1. it could not be found in FineReader's dictionary or 2. FineReader is uncertain if it got the OCR correct. Maybe you like going back and adding all that stuff in by hand, but I don't, so I try to profit from every advantage I can.

participants (3): Bowerbird@aol.com, James Simmons, Lee Passey