letter from blind PG user in Thailand RE: .txt vs. anything else

I just got an extremely distressed snailmail letter from a retired professor of Oriental and Buddhist studies, who is blind and lives in Thailand.
I quote him:
"Now as far as I can tell there are no more e-books available in "txt" format but only in "HTM" format. I am blind as I stated at the outset. I use an Apple computer with a screen reading programme called "OutSpoken" which conflicts with many document formats. It also conflicts with many common programmes, but that's another story which has nothing to do with you.
"My problem with the Project Gutenberg Archive in its present form is that when I download a book, the file on my computer includes many rubbish characters which make it virtually impossible to read."
<<a couple of paragraphs omitted>>
"So in conclusion I thank you for services rendered me in the past and regret that they are no longer available to me because of the remarkable technological advances that you are incorporating into your archive. Yours, frustrated and regretful, Dr. Peter Della Santina."
I have told him that most books are still available in .txt as well as in other formats, and that when they aren't, the usual reason is that they are prohibitively long and/or contain characters which we cannot post in .txt. I then told him that whenever he runs into this problem, he is to email me and I personally will send him a copy of the book in .txt; if I don't get it to him within 3 days, he is to assume I'm ill and send the request to Aaron Cannon.
I know what he's talking about because I downloaded a copy of Kipling's story "The Brushwood Boy" a couple of weeks ago in .htm form and found a lot of rubbish characters in it, but I just read around them despite feeling rather exasperated. I had been seized with an acute wish to read the story at three AM, and after getting up, booting my computer etc., downloading it, putting it on my ebook reader, shutting everything down again, and going back to bed, my desire to redownload it at that time was nonexistent. Anyway, that could not possibly have been a recent post, though for all I know it might be a recent REpost. Most .htm files work just fine on that ebook reader.
I told Dr. Santina that I appreciated his letting me know, and asked him to notify us if he ran into such problems in the future.
I think this is a strong argument for continuing Michael's practice of posting everything in English in .txt AND whatever else, rather than in whatever else by itself.
Anne

Gutenberg9443@aol.com wrote:
"Now as far as I can tell there are no more e-books available in "txt" format but only in "HTM" format. I am blind as I stated at the outset. I use an Apple computer with a screen reading programme called "OutSpoken" which conflicts with many document formats. It also conflicts with many common programmes, but that's another story which has nothing to do with you.
"My problem with the Project Gutenberg Archive in its present form is that when I download a book, the file on my computer includes many rubbish characters which make it virtually impossible to read."
He should have told us which model of Apple computer he is using and which character encoding(s) his program expects. The online recoding service offers recoding into the "Apple MacIntosh" character set.
-- Marcello Perathoner webmaster@gutenberg.org
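As an illustration of what recoding into a Macintosh character set involves, here is a minimal Python sketch. It is not the code behind the online recoding service. The function name `recode` and the assumption that the source file is ISO-8859-1 (Latin-1) are hypothetical; `latin-1` and `mac_roman` are standard Python codec names.

```python
def recode(data: bytes, source: str = "latin-1", target: str = "mac_roman") -> bytes:
    """Re-encode text bytes from one 8-bit character set to another.

    Hypothetical helper for illustration only. Characters that do not
    exist in the target character set are replaced with '?'.
    """
    text = data.decode(source)                     # bytes -> Unicode text
    return text.encode(target, errors="replace")   # Unicode text -> target bytes


# "café" stored as Latin-1 bytes, as it might appear in an 8-bit .txt file.
latin1_bytes = "café".encode("latin-1")
mac_bytes = recode(latin1_bytes)

# The accented letter now uses a different byte value, but a program that
# expects Mac OS Roman will read it back as the same character.
print(latin1_bytes, mac_bytes, mac_bytes.decode("mac_roman"))
```

The recoded file is only correct if the guess about the source encoding is right, which is why knowing what the reader's program expects matters.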

On Jan 20, 2005, at 2:32 PM, Gutenberg9443@aol.com wrote:
I know what he's talking about because I downloaded a copy of Kipling's story "The Brushwood Boy" a couple of weeks ago in .htm form and found a lot of rubbish characters in it, but I just read around them despite feeling rather exasperated. I had been seized with an acute wish to read the story at three AM, and after getting up, booting my computer etc., downloading it, putting it on my ebook reader, shutting everything down again, and going back to bed, my desire to redownload it at that time was nonexistent. Anyway, that could not possibly have been a recent post, though for all I know it might be a recent REpost. Most .htm files work just fine on that ebook reader.
It sounds like the file's bytes are being interpreted as the wrong text encoding. If I'm not mistaken, this is a problem with 8-bit extensions of ASCII, because there are various ways of using the upper 128 byte values (128 through 255) to represent characters. This is especially a problem with accented characters. Your correspondent may be using a program which assumes a different text encoding than is used in the PG files he has been opening. It may be assuming "Mac OS" encoding, when the text is in something else. This can also afflict text in an HTML file. If you go to a page in a foreign language, and the characters are not represented correctly even though you have a compatible font, it's probably the text encoding. On the Mac, the Safari browser has a submenu (View->Text Encoding) which allows you to select from a variety of text encodings. Find the right one, and the text will appear correct, without wrong characters or missing glyphs. This may be the problem with your ebook reader. Does it allow you to select the encoding used to interpret the HTML? Your friend in Thailand might also want to check for such a feature in his software.
- Jon Hendry
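The effect described above can be reproduced with a short Python sketch. It assumes, purely for illustration, that the file was written in ISO-8859-1 and that the reading program assumes Mac OS Roman; the sample text and the helper name `try_encodings` are hypothetical.

```python
# Sample text with accented characters, stored as Latin-1 bytes
# (an assumed source encoding, for illustration).
original = "déjà vu, naïve, façade"
raw = original.encode("latin-1")

# A reader that assumes Mac OS Roman maps the same bytes to other characters.
garbled = raw.decode("mac_roman")
# A reader that assumes the right encoding recovers the text.
correct = raw.decode("latin-1")


def try_encodings(data, candidates=("ascii", "latin-1", "mac_roman", "cp1252", "utf-8")):
    """Decode the same bytes under several encodings, much like picking
    entries from a browser's text-encoding menu. None means the bytes
    are not valid in that encoding."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


for name, text in try_encodings(raw).items():
    print(f"{name:10} {text!r}")
```

Only the unaccented letters survive under every encoding, which matches the observation that the rubbish characters show up mainly where accented characters should be.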

On Thu, Jan 20, 2005 at 02:32:32PM -0500, Gutenberg9443@aol.com wrote:
I just got an extremely distressed snailmail letter from a retired professor of Oriental and Buddhist studies, who is blind and lives in Thailand.
I quote him:
"Now as far as I can tell there are no more e-books available in "txt" format but only in "HTM" format.
<snip> This seems very odd to me. I just ran a check, and I can find exactly 59 etext numbers for which there is HTML but no equivalent text file. Most of these are collections of images. A few are cases where the HTML was posted as a different number from the existing .txt. We don't do that any more, but there are some old cases. One was a bad upload, which I'm hunting around for a fix for now. Seriously, are you sure he's actually looking at _our_ site, as opposed to some other site like Blackmask?
I know what he's talking about because I downloaded a copy of Kipling's story "The Brushwood Boy" a couple of weeks ago in .htm form
And this is why I'm asking. The only copy of this title I can find in PG is in "The Day's Work" collection, in file dyswk10.txt, which is plain text. I don't see how you could have downloaded this from PG in HTML format. jim

PG often gets credit, or discredit, as the case may be, for all sorts of eBooks all over the world that we had nothing to do with, other than starting the eBook idea. Michael

participants (5)
- Gutenberg9443@aol.com
- Jim Tinsley
- Jonathan Hendry
- Marcello Perathoner
- Michael Hart