Send gutvol-d mailing list submissions to
gutvol-d@lists.pglaf.org
To subscribe or unsubscribe via the World Wide Web, visit
https://lists.pglaf.org/mailman/listinfo/gutvol-d
or, via email, send a message with subject or body 'help' to
gutvol-d-request@lists.pglaf.org
You can reach the person managing the list at
gutvol-d-owner@lists.pglaf.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of gutvol-d digest..."
Today's Topics:
1. File Recode Service is not functional (Rick Tonsing)
2. UTF-8 File Names (Rick Tonsing)
3. Re: UTF-8 File Names (Greg Newby)
4. Re: File Recode Service is not functional (Greg Newby)
5. Re: UTF-8 File Names (Al Haines)
---------- Forwarded message ----------
From: Rick Tonsing <okrick@gmail.com>
To: gutvol-d@lists.pglaf.org
Cc:
Bcc:
Date: Sat, 4 Jan 2020 16:43:08 -0800
Subject: [gutvol-d] File Recode Service is not functional
FYI
I noticed that your File Recode Service is not functional. Specifically, there is no method for entering the Filename for testing. It appears to be intentional.
However, if the site is intentionally disabled then you should probably also disable the link from "V.74. What characters can I use?"
Cheers,
Rick
---------- Forwarded message ----------
From: Rick Tonsing <okrick@gmail.com>
To: gutvol-d@lists.pglaf.org
Cc:
Bcc:
Date: Sat, 4 Jan 2020 23:00:28 -0800
Subject: [gutvol-d] UTF-8 File Names
The subject of file names with xxxx-lt1.txt or xxxx-lat1.txt for Latin-1 and the xxxx-utf8.txt for UTF-8 text files came up in a DP forum.
The gist of the discussion (as I see it) is that we select the file type on the direct upload screen. Do we even need an indicator of the file type in the file name? If so, then do we need both? UTF8 is the default so xxxx.txt without the (-UTF8) should be sufficient for UTF-8 files. The -lt1 or -lat1 could remain the backup signal that the file needs to go through the upconvert program to UTF-8.
Cheers,
Rick
---------- Forwarded message ----------
From: Greg Newby <gbnewby@pglaf.org>
To: Project Gutenberg Volunteer Discussion <gutvol-d@lists.pglaf.org>
Cc:
Bcc:
Date: Sun, 5 Jan 2020 06:55:12 -0800
Subject: Re: [gutvol-d] UTF-8 File Names
Hi, Rick. The whitewashers get whatever you upload. No transformation or conversion is done automatically: it's just a .zip file that they download and process.
Informative filenames are fine. Any text should end in .txt
- Greg
On Sat, Jan 04, 2020 at 11:00:28PM -0800, Rick Tonsing wrote:
> The subject of file names with xxxx-lt1.txt or xxxx-lat1.txt for Latin-1
> and the xxxx-utf8.txt for UTF-8 text files came up in a DP forum
> <https://www.pgdp.net/phpBB3/viewtopic.php?p=1189531#p1189531>.
>
> The gist of the discussion (as I see it) is that we select the file type on
> the direct upload screen. Do we even need an indicator of the file type in
> the file name? If so, then do we need both? UTF8 is the default so
> xxxx.txt without the (-UTF8) should be sufficient for UTF-8 files. The
> -lt1 or -lat1 could remain the backup signal that the file needs to go
> through the upconvert program to UTF-8.
>
> Cheers,
>
> Rick
> _______________________________________________
> gutvol-d mailing list
> gutvol-d@lists.pglaf.org
> https://lists.pglaf.org/mailman/listinfo/gutvol-d
> Unsubscribe: https://lists.pglaf.org/mailman/options/gutvol-d
---------- Forwarded message ----------
From: Greg Newby <gbnewby@pglaf.org>
To: Project Gutenberg Volunteer Discussion <gutvol-d@lists.pglaf.org>
Cc:
Bcc:
Date: Sun, 5 Jan 2020 07:07:08 -0800
Subject: Re: [gutvol-d] File Recode Service is not functional
Hi, Rick. It looks like that was never working. It doesn't have any code for actually uploading or processing a file.
I've edited it to indicate it's no longer in use.
Concerning the FAQ, etc.: This will soon be deprecated. The new site is finally nearly ready, and can be viewed here:
https://dev.gutenberg.org
We've tried to move/remove everything outdated. Stuff that is moved to the "attic" area is clearly marked as no longer being accurate.
There are just a few more things to be fixed before we open this for more widespread testing. There are also a few things that will not move yet.
Fixes/edits can be emailed, or use a pull request here: https://github.com/gbnewby/gutenbergsite
- Greg
On Sat, Jan 04, 2020 at 04:43:08PM -0800, Rick Tonsing wrote:
> FYI
>
> I noticed that your File Recode Service
> <https://www.gutenberg.org/catalog/world/recode> is not functional.
> Specifically, there is no method for entering the Filename for testing. It
> appears to be intentional.
>
> However, if the site is intentionally disabled then you should probably
> also disable the link from "V.74. What characters can I use?
> <https://www.gutenberg.org/wiki/Gutenberg:Volunteers%27_FAQ#About_the_characters_you_use>
> "
>
> Cheers,
>
> Rick
---------- Forwarded message ----------
From: Al Haines <ajhaines@shaw.ca>
To: <gbnewby@pglaf.org>, "'Project Gutenberg Volunteer Discussion'" <gutvol-d@lists.pglaf.org>, "Joseph E. Loewenstein, M.D." <loewenstein@sssnet.com>, "'Al Haines'" <ajhaines@shaw.ca>, "'Chuck Greif'" <cbgrf@yahoo.com>, "'David Widger'" <cdwidger@gmail.com>
Cc:
Bcc:
Date: Sun, 5 Jan 2020 11:51:53 -0800
Subject: Re: [gutvol-d] UTF-8 File Names
UTF8 text files should have "-utf8" in their file names, e.g.
"myfile-utf8.txt". This forces PG's posting software to treat text
files as UTF8, rather than its defaulting to Latin1/ASCII. (This has
been a de facto standard for some years now.) It's not necessary to do
this for HTML files.
If necessary, the posting software will prompt for the correct character
set, but it's an easy prompt to skip through, or give the wrong answer
to. The presence of "-utf8" in the filename obviates the need for the
prompt. Latin1/ASCII text files don't trigger the prompt at all.
If PG's upload check reports a UTF8 text file without the "-utf8", then
I rename the file inside the zip file, before unzipping it.
Conversely, if a text file arrives with "-lat1", "-ltn1", "-iso",
"-asc", or some such Latin1/ASCII indicator, I rename the file to remove
it, since addhd handles Latin1/ASCII files correctly.
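As a rough illustration of the renaming described above, here is a minimal Python sketch. It is not PG's posting software or addhd. The function names, the marker list (taken from the suffixes mentioned in this thread), and the rule "valid UTF-8 with at least one non-ASCII byte needs the marker" are all assumptions made for the example.

```python
import os

# Latin1/ASCII markers mentioned in this thread; illustrative, not an official list.
LATIN1_MARKERS = ("-lat1", "-ltn1", "-lt1", "-iso", "-asc")


def needs_utf8_marker(data: bytes) -> bool:
    """Return True when the bytes are valid UTF-8 and contain non-ASCII characters."""
    if data.isascii():
        return False  # plain ASCII needs no marker
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False  # not UTF-8; presumably Latin1 or another 8-bit set
    return True


def normalize_text_name(name: str, data: bytes) -> str:
    """Rename a text file: drop any encoding marker, then add "-utf8" if needed."""
    stem, ext = os.path.splitext(name)
    lowered = stem.lower()
    # Strip an existing encoding marker (Latin1/ASCII or UTF8) from the end.
    for marker in LATIN1_MARKERS + ("-utf8",):
        if lowered.endswith(marker):
            stem = stem[: -len(marker)]
            break
    if needs_utf8_marker(data):
        stem += "-utf8"
    return stem + ext
```

With these assumptions, normalize_text_name("myfile.txt", "café".encode("utf-8")) gives "myfile-utf8.txt", and normalize_text_name("myfile-lat1.txt", "café".encode("latin-1")) gives "myfile.txt".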
Also, to avoid spurious files generated by the posting software, I make
sure the base name of the zip file is the same as the base names of the
text and HTML files, e.g. myfile.zip contains myfile.txt (or
myfile-utf8.txt) and myfile.htm.
Re HTML files: PG's extension for them is ".htm". When an uploaded zip
file contains an HTML file with the extension ".html", I rename it, as
mentioned above. If this isn't done, the posting software copies the
file to a new file with the extension ".htm", e.g. "myfile.html" is
copied to "myfile.htm", leaving a spurious ".html" file. (BTW, the
posting software never, ever, modifies the submitted files--it copies
them to new files with the required name, then works with the new
files.)
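The base-name and ".htm" conventions above could be checked before upload with something like the following Python sketch. The function check_zip_names and its messages are hypothetical; it only inspects names and does not reproduce what the posting software actually does.

```python
import os


def check_zip_names(zip_name: str, members: list[str]) -> list[str]:
    """List naming problems in a submission, following the conventions above."""
    problems = []
    zip_base = os.path.splitext(os.path.basename(zip_name))[0]
    for member in members:
        base, ext = os.path.splitext(os.path.basename(member))
        ext = ext.lower()
        if ext == ".html":
            problems.append(f"{member}: rename extension to .htm")
        if ext == ".txt" and base.endswith("-utf8"):
            # The "-utf8" marker is the one permitted difference in base names.
            base = base[: -len("-utf8")]
        if ext in (".txt", ".htm", ".html") and base != zip_base:
            problems.append(f"{member}: base name differs from zip base name {zip_base!r}")
    return problems
```

For example, check_zip_names("myfile.zip", ["myfile-utf8.txt", "myfile.htm"]) returns an empty list, while check_zip_names("myfile.zip", ["myfile.html"]) reports the ".html" extension.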
While I'm at it, more on file names...
As mentioned above, I rename text and HTML files, and sometimes zip
files, so that their base names are the same (except for the "-utf8"
part of text files). There's no need for text/HTML files to have some
versioning component. For example, I recently handled a zip file named
diam-pg.zip, which contained diam-b.html and diam-cc-utf8.txt. I
renamed the files to remove the "-b" and "-cc" parts, and the zip file
to remove the "-pg" part, of their respective names.
I've also seen names like "myfile-text.txt" and "myfile-html.html", and
"myfile-8.txt" and "myfile-h.html". There's no need for such
double-indicators of a file's type.
There have been any number of similar variations, some probably related
to the PPer's versioning system as they work through the PPing process;
some related to the PPing software.
In one of the DP posts are these fragments:
> All of PG uploads are currently UTF-8, even if the file is Latin-1.
No idea where this idea comes from, but it's wrong. See below.
> Or even do away with the appending altogether. The direct upload panel lets
> the White Washers know what encoding to expect. So they know to run the
> program to convert the Latin-1 characters to UTF-8.
Also wrong. The WWers don't see PG's upload screen (except for their
own projects). They also never convert Latin1 to UTF8--that's done to
posted Latin1/ASCII files by PG's behind-the-scenes software, which the
WWers have no control over.
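For readers unfamiliar with what a Latin1-to-UTF8 conversion involves at the byte level, a minimal Python sketch follows. It only illustrates the re-encoding itself; the function name is made up for the example, and nothing here is taken from PG's behind-the-scenes software.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode Latin-1 bytes as UTF-8 (pure ASCII input comes back unchanged)."""
    # Every byte value is a valid Latin-1 character, so decode() cannot fail here.
    return data.decode("latin-1").encode("utf-8")
```

For instance, the single Latin-1 byte 0xE9 ("é") becomes the two UTF-8 bytes 0xC3 0xA9, which is why a file's encoding has to be known before conversion.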
I think that's sufficient food for thought/discussion for now...
Al
> -----Original Message-----
> From: gutvol-d [mailto:gutvol-d-bounces@lists.pglaf.org] On
> Behalf Of Greg Newby
> Sent: Sunday, January 05, 2020 6:55 AM
> To: Project Gutenberg Volunteer Discussion
> Subject: Re: [gutvol-d] UTF-8 File Names
>
>
> Hi, Rick. The whitewashers get whatever you upload. No
> transformation or conversion is done automatically: it's just
> a .zip file that they download and process.
>
> Informative filenames are fine. Any text should end in .txt
> - Greg
>
>
>
> On Sat, Jan 04, 2020 at 11:00:28PM -0800, Rick Tonsing wrote:
> > The subject of file names with xxxx-lt1.txt or xxxx-lat1.txt for
> > Latin-1 and the xxxx-utf8.txt for UTF-8 text files came up in a DP
> > forum <https://www.pgdp.net/phpBB3/viewtopic.php?p=1189531#p1189531>.
> >
> > The gist of the discussion (as I see it) is that we select the file
> > type on the direct upload screen. Do we even need an indicator of the
> > file type in the file name? If so, then do we need both? UTF8 is
> > the default so xxxx.txt without the (-UTF8) should be sufficient for
> > UTF-8 files. The -lt1 or -lat1 could remain the backup signal that
> > the file needs to go through the upconvert program to UTF-8.
> >
> > Cheers,
> >
> > Rick
_______________________________________________
gutvol-d mailing list
gutvol-d@lists.pglaf.org
https://lists.pglaf.org/mailman/listinfo/gutvol-d
Unsubscribe: https://lists.pglaf.org/mailman/options/gutvol-d