[posted] Language detection error? (Re: Posted (#40466, Sifferath) !)

Al Haines ajhaines at shaw.ca
Sat Aug 11 13:21:18 PDT 2012


When a copyright clearance is submitted, the only allowed language entry
on the clearance form is a language code, e.g. "en".  When the finished
ebook files are later uploaded, that "en" is translated into "English",
which is what comes to the WWers.  

I suspect that if this translation can't be done, as apparently happened
with "oj", because the copyright submitter or uploader entered an
incorrect or unknown code, the code is passed as-is to the WWers, unless
the uploader intervenes with the full name of the language.
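
To illustrate the behaviour I suspect, here is a minimal sketch. It is
only a guess and not the actual posting software: the lookup table, the
function name language_for_wwer, and the uploader_override parameter are
all hypothetical, made up for this example.

```python
# Hypothetical code-to-name table. The real list used by the posting
# software is not shown anywhere in this thread.
LANGUAGE_NAMES = {
    "en": "English",
    "fr": "French",
    "de": "German",
}


def language_for_wwer(entry, uploader_override=None):
    """Return the language string that would reach the WWers.

    Assumed behaviour: a known code is translated to its full name,
    an unknown code is passed through unchanged, and a full name
    supplied by the uploader takes precedence over both.
    """
    if uploader_override:
        return uploader_override.strip()
    code = entry.strip()
    # Fall back to the raw entry when the code is not in the table.
    return LANGUAGE_NAMES.get(code.lower(), code)


print(language_for_wwer("en"))
print(language_for_wwer("oj"))
print(language_for_wwer("oj", uploader_override="Ojibwa"))
```

Under these assumptions, "en" comes out as "English", "oj" comes out
unchanged because it is missing from the table, and the override yields
"Ojibwa".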

Al



> -----Original Message-----
> From: Blower Nigel [mailto:NBlower at Queenedith.cambs.sch.uk] 
> Sent: Saturday, August 11, 2012 12:22 PM
> To: Al Haines; gbnewby at pglaf.org
> Cc: 'Project Gutenberg Postings Announcements'; 'Peter Podgoršek';
> dp-post at pgdp.net; 'Andrew Sly'; 'Marcello Perathoner'
> Subject: RE: Language detection error? (Re: [posted] Posted 
> (#40466, Sifferath) !)
> 
> 
> Thanks for clarifying Al.
> 
> On the upload form, where you enter the language, it actually 
> says "Main language (two-letter code or single-word language 
> name):". The reason I put "oj" instead of "Ojibwa" is that I 
> had seen it spelt Ojibwa and Ojibwe, and thought if I entered 
> the ISO code there wouldn't be any confusion, and the code 
> would match whichever spelling had been decided upon for 
> cataloguing at PG.
> 
> It appears as though I inadvertently caused confusion - 
> sorry. What I could have done, in retrospect, is to also add 
> the full language name alternatives in the note to the WWer. I 
> hadn't realised that the language part of it needed manual 
> intervention by Al.
> 
> Perhaps if the two-letter code is not good for the WWers, the 
> wording on the upload form should be changed.
> 
> Thanks to all for sorting this out
> Nigel
> 
> ________________________________________
> From: Al Haines [ajhaines at shaw.ca]
> Sent: 11 August 2012 19:42
> To: gbnewby at pglaf.org; Blower Nigel
> Cc: 'Project Gutenberg Postings Announcements'; 'Peter Podgoršek';
> dp-post at pgdp.net; 'Andrew Sly'; 'Marcello Perathoner'
> Subject: RE: Language detection error? (Re: [posted] Posted 
> (#40466, Sifferath) !)
> 
> A couple of things happened with this one's language.
> 
> Nigel says he entered "oj" on the upload form, which I 
> changed to "Ojibwa".  (I had to figure this out from the 
> uploaded text file.)  BUT, I forgot to save the change, which 
> left "oj".  I'm guessing (and *only* guessing) that the posting 
> software saw "oj", didn't understand it, and put "English", 
> which is what ended up in the catalog.  If I had saved the 
> change, "Ojibwa" would have been added to PG's list of 
> languages, and the catalog page would have been correct.
> 
> When Nigel informed me of this book's bibrec page showing 
> English, I tried to correct 40466's language to "Ojibwa", but 
> there's no mechanism (that I could see) in the catalog 
> back-end software to manually add a new language.  (You *can* 
> add new authors.)  When I did a wild-card search (*) to get 
> the full list of available languages, I found "Ojibwa, 
> Western", and used it.
> 
> 
> Hint to DP uploaders: language names should always be written 
> in full. It saves the WWers from having to figure out, or ask 
> the uploader, what a code means.
> 
> Al
> 
> 
> 
> > -----Original Message-----
> > From: Greg Newby [mailto:gbnewby at pglaf.org]
> > Sent: Saturday, August 11, 2012 11:22 AM
> > To: Blower Nigel
> > Cc: Project Gutenberg Postings Announcements; Peter Podgoršek;
> > dp-post at pgdp.net; Andrew Sly; Marcello Perathoner; Al Haines
> > Subject: Language detection error? (Re: [posted] Posted (#40466, 
> > Sifferath) !)
> >
> >
> > Thanks for this closer look, Nigel.  In response, I also just
> > took a closer look, and now wonder whether there was a glitch in
> > the Web cataloging or human cataloging.  I don't think it was Al
> > that entered "Ojibwa, Western," but the automatic post-processing
> > & cataloging that happens when new files are posted.
> >
> > Within the text (HTML and .txt) you can see the language is
> > Ojibwa, as you submitted:
> >   Language: Ojibwa
> >
> > But the bibrec page lists "Ojibwa, Western:"
> >   http://www.gutenberg.org/ebooks/40466
> >
> > It might be that Marcello's automatic cataloging somehow matched
> > on a more specific language code (perhaps simply selecting the
> > latest sorted code with a matching string).
> >
> > Based on your input, and the fact that the books do indicate Ojibwa 
> > *within* them, I think we should recode to ISO 639-3 code "oji", as 
> > you indicated below.
> >
> > I'm cc'ing Andrew Sly, who (along with Marcello and me, and a few
> > others) can "make it so" in the bibrec.  But we can see whether 
> > others have different opinions or diagnostics.
> >
> > Even if it's "Ojibwa," rather than "Ojibwa, Western", it's a new 
> > language for Project Gutenberg.  Thanks again,
> >   -- Greg
> >
> > On Sat, Aug 11, 2012 at 05:39:52PM +0100, Blower Nigel wrote:
> > > Hi all
> > >
> > > I'm not sure it is *Western* Ojibwa.
> > >
> > > The project at DP was labelled as Ojibwa, and after I PPVed it,
> > > when I uploaded to PG, I entered the 2 character language code
> > > "oj" which is the ISO 639-1 code for Ojibwa. The WWer, Al Haines,
> > > entered "Ojibwa, Western" in the Bibrec, which is ISO 639-3 code
> > > "ojw".
> > >
> > > In my ignorance, I assumed that "Western Ojibwa" was the full
> > > name for Ojibwa. Since Greg's email, I've investigated a bit
> > > more, and there are several Ojibwa dialects. Since on the title
> > > page Sifferath is described as Missionary of the Ottawa and
> > > Otchipwe Indians, and this page
> > > (http://home.kpn.nl/cvkolmes/ojibwe/Siff/Sifferath.htm)
> > > describes the book as Sifferath's Odaawaa Catechism, perhaps the
> > > language would be better described as "Ottawa", which is ISO
> > > 639-3 code "otw", or maybe just Ojibwa, ISO 639-3 code "oji"
> > > which is an inclusive code, would be sufficient.
> > >
> > > If you search for Ojibwa on the gutenberg site, some books do
> > > come up which are labelled North American Indian "nai".
> > >
> > > Sorry if any of this confusion is my fault - do let me know if
> > > you need me to do anything about it.
> > >
> > > Regards
> > > Nigel
> > >
> > > ________________________________________
> > > From: Greg Newby [gbnewby at pglaf.org]
> > > Sent: 11 August 2012 15:52
> > > To: Project Gutenberg Postings Announcements
> > > Cc: Blower Nigel; Peter Podgoršek; dp-post at pgdp.net
> > > Subject: Re: [posted] Posted (#40466, Sifferath) !
> > >
> > > This is our first eBook in the language of Western Ojibwa!
> > >   -- Greg
> > >
> > > On Thu, Aug 09, 2012 at 01:57:51PM -0700, Al Haines wrote:
> > > >
> > > > A Short Compendium of the Catechism for the Indians, by N. L. Sifferath  40466
> > > >   [Subtitle: With the Approbation of the Rt. Rev. Frederic Baraga,
> > > >               Bishop of Saut Sainte Marie]
> > > >   [Other: Frederic Baraga]
> > > >   [Language: Ojibwa]
> > > >   [Link: http://www.gutenberg.org/4/0/4/6/40466 ]
> > > >   [Files: 40466.txt; 40466-h.htm]
> > > >
> > > > Thanks to Peter Podgoršek, Heiko Evermann and the Online
> > > > Distributed Proofreading Team at http://www.pgdp.net (This book
> > > > was produced from scanned images of public domain material from
> > > > the Google Print project and from Canadiana.org)
> > > >
> > > >
> > > >
> > > > Regards,
> > > > Al
> > >
> > >
> > > Dr. Gregory B. Newby
> > > Chief Executive and Director
> > > Project Gutenberg Literary Archive Foundation www.gutenberg.org
> > > A 501(c)(3) not-for-profit organization with EIN 64-6221541
> > > gbnewby at pglaf.org
> > >
> > > The information in this email is confidential and may be legally
> > > privileged. It is intended solely for the addressee. If you
> > > receive this email by mistake please notify the sender and delete
> > > it immediately. Opinions expressed are those of the individual
> > > and do not necessarily represent the opinion of The Queens'
> > > Federation. All sent and received emails from The Queens'
> > > Federation are automatically scanned for the presence of computer
> > > viruses and security issues.
> >
> 
