Re: [gutvol-d] Language free version of guiguts?

On Wed, 18 Jan 2006 20:19:33 -0500, Jim Tinsley <jtinsley@pobox.com> wrote:

|On Wed, 18 Jan 2006 11:44:46 +0000, Dave Fawthrop <hyphen@hyphenologist.co.uk> wrote:
|
|>On Mon, 16 Jan 2006 15:54:37 -0500, Jim Tinsley <jtinsley@pobox.com>
|>wrote:
|>
|>|On Mon, 16 Jan 2006 17:13:12 +0000, Dave Fawthrop <hyphen@hyphenologist.co.uk> wrote:
|>|
|>|
|>|>I am told by the whitewashers that it is *essential* that all text for PG
|>|>pass guiguts. Because this assumes that the language scanned is American
|>|>it gives 90% plus false positive errors, on my books, which is totally
|>|>unsatisfactory for any piece of test software.
|>|>
|>|>Is there a language free version of Guiguts?
|>|
|>|I'm not quite sure which question you're asking, and about which
|>|checking tool, but I think there is some confusion somewhere, of
|>|emphasis if not of fact, and I'm continually surprised by people who
|>|don't know the origins of really quite recent procedures I remember
|>|vividly, and I've had several threads recently about this general
|>|subject of checking, so please bear with me while I regurgitate
|>|history. I hope you'll find a satisfactory answer in here somewhere.
|>
|>I could only find one tool which shows on my Win XP computer that is
|>guiguts. This as far as I can ascertain has various subroutines which
|>are very badly tied together, and in no way at all follow the Windoze
|>interface.
|>
|>|Anybody can use any programs they like to make texts, and different
|>|people do use different tools, according to their own needs or the
|>|needs of the individual texts. Considering that we get French and
|>|German and Esperanto and Chinese texts, not to mention older English,
|>|there is no one-size-fits-all solution for language.
|>
|>To get things past whitewashers one apparently must use this, or things get
|>rejected. Your assertion is therefore clearly theoretically correct, but
|>in reality absolutely wrong
|
|Now, this I can flatly deny.
There is no such thing as a standard without a test to show that the test has been passed. Schools teach to the exam, which some think wrong, but it happened to me and judging by the media Brouhaha still happens in the UK. I was an Engineer, and there a draughtsman who failed to put a test (tolerance) on anything in a drawing, was exposed to public ridicule. If such a drawing got onto the shop floor the production departments would deliberately fail to follow any reasonable tolerance.

| I can think of half-a-dozen people
|offhand, regular producers, who don't use gutcheck in any form.
|They don't need to. Their quality standards are high enough that
|it won't find any real errors. I do run it on their texts, as a
|matter of form, but I know in advance what the result will be.
|For all I know, there are others, equally good, but I don't know
|that they don't use gutcheck because the subject never comes up.
|Most of us, of course, are not that good.

You have just admitted that gutcheck is the standard on PG.

|
|Bill Flis, who wrote the GutWrench package, uses his own checkers
|exclusively, and I know equally well that I won't find any errors
|that can sanely be caught by automation in his texts either. You can
|find them, if you're interested, at http://www.pgdp.net/tools/GW.zip
|
|>|Once, there were no checking tools at all, except for spellcheckers
|>|built into Word Perfect and Word, which is what most people used, and
|>|I could tell you some stories about having to convert those!
|>|
|>|David Price and Martin Ward and I made checkers that we used for
|>|ourselves. There may have been others, but those are the ones I'm
|>|aware of. Everything else was Mark One Eyeball.
|>|
|>|I had done a lot of cleaning-up work on a lot of texts for various
|>|people, and I would then send those on to Michael for posting. They
|>|would commonly take hours of work each. In self-defense, I wrote a
|>|checker I (later) renamed to gutcheck.
|>|When the WWs were formed in
|>|2001, I brought gutcheck with me, and we all used it to find errors
|>|quickly in incoming texts.
|>
|>But gutcheck gives 90% plus false positive errors, many hundreds on my
|>texts in Yorkshire Dialect, mostly poems. It enforces the American
|>language, and American punctuation conventions. It objects to most
|>Yorkshire abbreviated words such as t' which occur dozens of times
|>in the poems I work on. It also objects to non standard punctuation
|>which occur in my texts as an example "? whereas American convention
|>apparently is ?" .
|>
|>Writing as one who has designed, written and sold language software for
|>some 20 years (see my web site). The *first* stage in the design of any
|>software involving language is how other languages will be treated. This
|>is usually done by putting all the features of one language, in a specific
|>data structure(s) and/or subroutine(s) which can be used or not as
|>required.
|>
|>All I asked for was a copy of gutcheck with the features specific to
|>American removed which should be a very short editing and recompiling job.
|
|I'm not sure how you define "American", but ALL gutcheck features are
|language-specific, one way or another. You really appreciate this when
|checking Hebrew or Tagalog! Even the relatively familiar French,
|German and Spanish have various punctuation features quite
|incompatible with gutcheck's assumptions. I'm talking with various
|LOTE producers about language-specific versions, but have not yet
|decided to take any action.

Then gutcheck should be modified to have versions for many languages. If you read the Subject of this thread, you will find: "Language free version of guiguts?"

|
|>Worse the only way to view output is on a screen. Copy does not work so it
|>is impossible to copy the output to a text file and edit the repeated false
|>positives out of the list.
|>It is totally unacceptable to distribute a GUI
|>program where the standard Copy and Paste functions do not work
|>
|>Worse still and absolutely ***unforgivable*** in any GUI program the
|>settings places the settings file on ***THE DESKTOP***. Deleting it loses
|>all settings.
|>
|
|I can't comment on GuiGuts. As a command-line guy, I don't use it all
|that much, except sometimes, when I find some specific feature
|invaluable. If you want to comment, the appropriate place is in the
|GuiGuts thread of the Tool Development forum at DP, which Steve reads
|and answers questions and requests in.
|http://www.pgdp.net/phpBB2/viewforum.php?f=13

I have asked the question here. I do not do forums.

|>|Up till then, there was really no difference between DP and Other
|>|texts, though because the people who mostly submitted from DP were
|>|experienced, and because DP favored simple texts
|>
|>DP is by its nature not suitable for my texts, because the language is as
|>different from American as say French. A non Tyke (yorkshireman) as has
|>been shown in the past, has extreme difficulty understanding the text.
|
|Well, considering that they regularly do several languages, I doubt if
|Yorkshire dialect would stand out much. Right now, in round 1, I find:
|English, German (math, with LaTeX), Finnish, French with Scots, Middle
|English, Middle French, Portuguese, English with Ancient Greek,
|Spanish, Italian, Dutch, German, English with Breton, French, Tagalog,
|Latin, and I just know there's some Esperanto around somewhere. I know
|they've also done Irish (sean-litriú), because I had a hell of a time
|finding all the correct characters for the UTF-8 version (and I'm
|still not convinced about Tironian-et). Of course, if you want real
|variety, you need to hit the European DP.
|
|>|And GuiGuts and gutcheck have accreted features ever since.
|>|If you have GuiGuts, then you have gutcheck, since Steve bundles it with
|>|GuiGuts -- and you also have a large number of other tools that may
|>|or may not be useful for the particular text you're working on.
|>|
|>|There are many other checkers available as well, and I'd love to
|>|ramble on about them, but this is too long already, and it doesn't
|>|bear on your question.
|>
|>|This is how it comes -- by evolution, not by fiat --
|>
|>Untrue!
|>I am *forced* to use guiguts/gutcheck by the Whitewashers.
|
|I say again: not everyone does. Just eradicate all mistakes and nobody
|will ever know what you used.
|
|>Gutcheck does not work on Windoze.
|
|It runs in a Win32 command prompt, but it doesn't have a GUI on any
|platform.

"You have to be joking MAN"

|
|>| that incoming
|>|texts are checked with _several_ tools, according to what seems
|>|appropriate for the text, but most commonly with gutcheck and/or
|>|GuiGuts.
|>
|>
|>Finally guiguts is as it stands unusable on my texts. No doubt I will find
|>other equally drastic problems
|>
|>As all my work goes on my own web site, and gets copied from there onto
|>many other sites, PG is just a nice add on and could be ditched if it were
|>to take too much effort.
|>
|>The text which WW objected to so strongly has been on my site for a couple
|>of years, and absolutely *nobody* has noticed the ?errors? People read it
|>for the dialect, not the punctuation. I have however had several
|>appreciative emails.
|
|Well, I'm very familiar with that condition, but that's a whole
|'nother argument. A text does not have to be perfect to be valuable.
|We have many older texts, especially, that have many errors. That
|doesn't make them useless. I handle most of the errata reports for PG,
|and nearly all of them express appreciation for the availability of
|the text, along with their handful of reported errors. I may find
|another hundred or so problems when I check the text out, but these
|readers never noticed them.
|Two million downloads a month, with (I
|estimate) about one million errors among 17,000 books, and we get
|about one errata report per day.
|
|And there are many people who do want to make etexts but don't want
|to live within the constraints of PG -- some don't want the
|quality-checking, some complain that we don't quality-check enough,
|some don't want to work in plain text, some don't want to go through
|the clearance procedures, and so on.
|
|We have 40 to 60 submitted texts in the average week, and three WWs
|active to take them at the moment. If everything in an incoming text
|is perfect, one of us will spend about an hour on it. Plus a load of
|time on other activities. We can't accommodate everyone on everything,
|and there is no doubt that the quality gets higher as time goes on,
|because of the processing that we do. This is what we have to do,
|to keep the operation moving and the quality high. Not everyone is
|going to be happy with the process. Some will choose not to send their
|texts to PG. I'm sorry about that.
|
|>|(Now tell me that all you wanted was the -t switch. :-)
|>
|>I am not going back to the bad old Unix days, when each program had to be
|>learned individually. Come back Bill Gates. All is forgiven.
|
|Well, I say again, if you don't want to use it, you don't have to;
|not everyone does, and especially not everyone does for all texts.
|It's essentially a collection of regexes, selected to give, on
|average, the best results for the most common type of PG files.
|Many DPers who work on other types of texts just put together their own set
|of regexes, and run them through GuiGuts or GutWrench or from a *nix
|command line, whichever they prefer.

I do not do windoze programming. You are essentially saying that a non programmer can work for PG. :-( Did you really mean that?

You have agreed with me above that gutcheck is the standard which must be passed to get.
I am just trying to find a version of that standard which will run on my machine, with the text.

As I understand it, the answer from PG to a perfectly reasonable request (see Subject) was:

******************
***GET STUFFED.***
******************

I will look for a workaround.

--
Dave Fawthrop <dave hyphenologist co uk>
17,000 free e-books at Project Gutenberg! http://www.gutenberg.net
For Yorkshire Dialect go to www.hyphenologist.co.uk/songs/

Some time ago I did some work to build a multilingual form of gutcheck (and I still think that it is a very reasonable aim), but I stopped when Jim refused the very idea that this should be done. I am still using my (now obsolete) version of gutcheck with the French customization.

My idea was that some constants (for example, the list of vowels and the list of strings suspicious inside a word), instead of being hard-wired in the code, should be contained in constants defined in header files included at compile time.

If you want, I can try to update my version, and discuss extensions to other languages.

Carlo Traverso
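The separation described above (language constants kept apart from the checking logic) can be sketched roughly as follows. This is an illustration in Python rather than C header files, and nothing in it comes from gutcheck itself: the `LanguageProfile` type, the vowel sets, and the "suspicious" strings are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageProfile:
    """Hypothetical per-language constants, kept apart from the checking code."""
    name: str
    vowels: frozenset
    suspicious_in_word: tuple  # strings assumed to be unlikely inside a real word


# Illustrative values only; they are NOT taken from gutcheck.
ENGLISH = LanguageProfile("en", frozenset("aeiouy"), ("cb", "gbt", "tii"))
FRENCH = LanguageProfile("fr", frozenset("aeiouyàâéèêëîïôùûü"), ("cb", "tii"))


def suspicious_words(text, profile):
    """Return words with no vowel of the language, or containing a suspicious string."""
    flagged = []
    for raw in text.split():
        word = raw.strip(".,;:!?\"'()").lower()
        if not word.isalpha():
            continue  # skip numbers, empty strings, words with inner punctuation
        no_vowel = len(word) > 1 and not (set(word) & profile.vowels)
        odd = any(s in word for s in profile.suspicious_in_word)
        if no_vowel or odd:
            flagged.append(word)
    return flagged
```

The same word can then pass or fail depending only on which profile is supplied: `suspicious_words("été", FRENCH)` flags nothing, while the English profile flags it because it knows no accented vowels.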

I have developed programs to help me proof faster/better. I work mainly in French but they seem to work well in other Latin-alphabet languages (I tried them a little in English and Spanish).

http://www.pgdp.net/phpBB2/viewtopic.php?p=158673#158673 (get in touch with me if you want to give them a try; the CVS-committed version is not the very latest one)

I use them to do R1/R2, P1/P2, and, as of recently, P0, that is to say quick preparation of OCR'd texts before publication on PGDP Int'l.

I define language-related things (constants, suffixes, prefixes). Right now, since I am apparently the only user and developer of these programs, there are many special cases for French. But it could be easy to add things for other languages.

As an example, a French rule is: the word is accepted if it starts with "j'" and continues with a vowel and the rest is an accepted word. For example: "j'aime" (I love) is accepted because "aime" (love) is. "j'arbre" (I tree) is accepted because "arbre" (tree) is. This means nothing of course, but a proofer is bound to spot that: it is not a scanno (and not likely to happen in OCR anyway).

Kicking some grammatical checks in would be the next step. Right now the programs are just working on a syntactical basis. I have a list of French words with all their possible grammatical natures (noun / adjective / conjugated verb for this tense and this person...) but unfortunately it was published by ABU under a restrictive license which makes it difficult for me to repackage and reuse. The free list of words I found in Debian packages is very incomplete (it is missing many passé simple conjugated verbs, most if not all subjonctif imparfaits...).

In English we could for example decide "<word>'s" is accepted if "<word>" is (and does not finish with an "s").

I am planning to think and develop or reuse things to do PM later on, probably focusing more or less on producing XML TEI.
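The two acceptance rules described above can be sketched as below. This is not the code from the programs mentioned in the message; the function names, the vowel set, and the tiny word lists are assumptions made for illustration. The French rule follows the message literally (a vowel must follow "j'"), so words starting with a mute "h" are not covered.

```python
# Hypothetical vowel set; the real programs may define it differently.
FRENCH_VOWELS = frozenset("aeiouyàâéèêëîïôœùûü")


def accept_french(word, lexicon):
    """Accept a known word, or "j'" + vowel-initial known word (e.g. "j'aime")."""
    w = word.lower()
    if w in lexicon:
        return True
    if w.startswith("j'") and len(w) > 2 and w[2] in FRENCH_VOWELS:
        return w[2:] in lexicon
    return False


def accept_english(word, lexicon):
    """Accept a known word, or "<word>'s" when <word> is known and has no final s."""
    w = word.lower()
    if w in lexicon:
        return True
    if w.endswith("'s"):
        stem = w[:-2]
        return stem in lexicon and not stem.endswith("s")
    return False
```

With a lexicon of `{"aime", "arbre", "parle"}`, both "j'aime" and the meaningless "j'arbre" are accepted, as in the message, while "j'parle" is rejected because "p" is not a vowel.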

On Thu, 19 Jan 2006 11:21:45 +0100, Carlo Traverso <traverso@dm.unipi.it> wrote:
Some time ago I did some work to build a multilingual form of gutcheck (and I still think that it is a very reasonable aim), but I stopped when Jim refused the very idea that this should be done.
"Opinions about Tolstoy and his work differ, but on one point there surely might be unanimity. A writer of world-wide reputation should be at least allowed to know how to spell his own name. Why should any one insist on spelling it "Tolstoi" (with one, two or three dots over the "i"), when he himself writes it "Tolstoy"? The only reason I have ever heard suggested is, that in England and America such outlandish views are attributed to him, that an outlandish spelling is desirable to match those views." Love that quote. From Louise Maude's Translator's Preface to "Resurrection". I really must re-scan that, if only to capture the image of Tolstoy's signature -- with a "y" -- above those words. I'm not a writer of world-wide reputation, of course, but I've recently heard such outlandish views attributed to me that I'm beginning to think of signing myself "Jim Tinslei", with, possibly, three decadent dots over the "i". So, did I ever "refuse the very idea that this should be done"? The society that is one of the referents of "Project Gutenberg", as I understand it -- and I'm not at all sure that I do -- is a pretty good model of a Libertarian society. It's even better than a Real Life Libertarian society, since anyone can opt out; try doing that next time your local tax-collector sends you a letter. People do (a) what they want to do and (b) what they think should be done strongly enough that they're willing to spend the hours of their lives doing it. Some people also do (c) what Other People want them to do. Occasionally, or usually. PPVs and WWs spend a lot of time on projects in which they, personally, have no interest. Toolmakers try to accommodate the people who use their tools. DP admins solve problems. Gravediggers move difficult projects along. Like that. The complex society that exists today in and around PG would all fall apart if some people didn't offer themselves as something of a "public utility" in some limited sphere. 
So I'm used to the idea that people write to me out of the blue asking for help or advice, just as I ask other "public utilities" within the project for help and advice. But in such cases I, or they, are free to refuse, or do something different. Project Gutenberg, however you define it, doesn't sign anybody's paycheck, or make anybody do anything at all. Neither do Michael or Greg as individuals. In fact, those two worthies would walk a country mile in tight shoes to assure you -- gesticulatingly -- that they have about as much influence over what I do as Uri Geller's daily horoscope has over the shape of Reese Witherspoon's toenails. What's more, having the experience of being a public utility in PG yourself, you know this better than most, which is why, when I dug your original email on the subject in July 2002 out of the dumpster of my archives, I was a little annoyed all over again that you had copied Michael and Greg on it. 'Sfunny: I didn't remember the thread, but when I saw the e-mail, I did remember that little sting of annoyance at the assumption that either of them had anything at all to do with my decisions about gutcheck. Which is, of course, as nothing to the annoyance gutcheck has inflicted over the years on various producers, so I guess, karmically, I have it comin'. People who "grew up" in DP were "born" with others looking over their shoulders, and so expect their homework to be corrected unmercifully. Everybody there has fully internalized the knowledge that they, and everyone else, makes mistakes, or, as Juliet more correctly and insightfully remarked, _overlooks_ mistakes. Your own recent excellent work on quantifying that will be invaluable in several ways. Producers who had been making certain kinds of mistakes for years without being aware of it, or having anyone correct them, though, fully appreciated the pun in the name. It is no fun at all having these things pointed out to you for the first time by someone else. 
Dave's comments are really quite temperate compared to many of the love-notes I received back in 2000-2002. My favorite was "DON'T YOU DARE RUN THIS THING OVER MY OLD TEXTS!!" I was more than somewhat sick myself when I first exposed some of my old work to jeebies, and saw the full extent of my own heebieness, but at least in that case, nobody else saw my shame, and I had no-one to be annoyed at but myself. I mention my annoyance because on re-reading what I wrote in response to your proposal, it does jump off the screen at me, and I apologize belatedly for that. It doesn't, however, have any bearing on my decisions then or now. What you actually proposed was that you should carve up gutcheck into separate files, dealing with separate languages. If there had ever been a day when I decided to sit down and write gutcheck, that's what I might well have done from Day One, but there never was such a day. To me, it's just a handy platform into which I can plug checks that I find useful. As I said at the time, and so often before and since, I don't actually think that the language-specific typo-checking functions should really be in there at all; every text needs a spellcheck, and for texts that have been spellchecked, these functions are only a source of false positives. That's why I added the -t switch when I started sending it out to other people. For me, they were handy as a quick way of getting a hint whether an incoming file had been spellchecked or not. Unfortunately, some producers lulled into a false sense of security by not seeing typos flagged in gutcheck didn't do the spellcheck, which was a problem I had to address around that time, but it seems to be resolved now. That's what I said, and that's what I still believe. I do think that punctuation checks for LOTE are an appropriate add-in, but as a devout monoglot, I'm in no position to define them. I don't have the experience of finding certain error-patterns by hand in LOTE texts. 
People have suggested specific changes like this from time to time, and I have usually incorporated them, where they don't cause problems somewhere else. A few days ago, I asked any PPVs who want certain punctuation type checks (or removal of existing checks) for LOTE to define some for me. We'll see what comes out of that. Until I see what the requested checks are, I'm not going to decide how to make the changes. I'm certainly not going to refactor the code, or commit myself to working with somebody else's refactored code, in advance of knowing in what way it needs to be changed. Reading over old emails is weird; it brings back context. You wrote to me when I was just setting up the SF site, to get it installed before I released the FAQ and to give the Software Site a permanent link. Up until then, people had got gutcheck directly from me, and often asked for individualized versions, which I mostly made for them. If the checks seemed good by my usual tests, I added them to "my" gutcheck as well. That was the way it worked in that era. I looked forward, at that time, to getting the damthing OUT, so that people could do their own customizing, and I would be free. Free!! Bwahahahah!! Heh. I wished you well in your own customization, and I still do. The volume of LOTE is much greater than it was then, and maybe somebody working in that area (those areas?) will do their own thing. Great! Maybe they'll ask me to customize some specific checks. Very occasionally, people do that still. Maybe some PPVs in specific languages will get together and suggest a coherent agenda to make gutcheck (or some variant thereof) friendly to those languages. I hope they do. They haven't yet. Until that happens, I have more than enough things I want to do, and think should be done, not to spend my limited PG time chasing Other People to tell me things they want me to do . . . 
or, for that matter, self-indulging in writing long posts to the vandalized wasteland that was once a productive resource for people making etexts. I really have been very lazy since the Christmas break. Back to the grindstone. jim

On Sun, 22 Jan 2006 13:28:33 -0500, Jim Tinsley <jtinsley@pobox.com> wrote:

|What you actually proposed was that you should carve up gutcheck into
|separate files, dealing with separate languages. If there had ever
|been a day when I decided to sit down and write gutcheck, that's what
|I might well have done from Day One, but there never was such a day.
|To me, it's just a handy platform into which I can plug checks that I
|find useful.

IMO with the advent of huge memory in even the entry level computers, All tests should be in the one program, and the different language versions should be handled by simple switches/radio buttons, as with the various sorts of angle brackets ATM. OK the switches will inevitably become complex and difficult.

--
Dave Fawthrop <dave hyphenologist co uk>

On Sun, Jan 22, 2006 at 09:26:24PM +0000, Dave Fawthrop wrote:
IMO with the advent of huge memory in even the entry level computers, All tests should be in the one program, and the different language versions should be handled by simple switches/radio buttons, as with the various sorts of angle brackets ATM. OK the switches will inevitably become complex and difficult.
You're right, of course. I _think_ you might even go one better. I've used the occurrence of 50 instances of something recognizable as the English word "the" as an indicator that a file is (at least partly) in English, and a high number of certain types of characters to suggest that the file is in ISO-8859 or UTF-8, and a high number of strings within <> to indicate some flavor of *ML. I suspect that a similar technique might be useful in multilingual checkers in general, and if I wrote one I would certainly consider it. jim
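The heuristics described above can be sketched like this. Only the figure of 50 occurrences of "the" comes from the message; the other two thresholds, the function name, and the result keys are assumptions for illustration, and this is not gutcheck's actual code.

```python
import re


def guess_properties(text, the_threshold=50, tag_threshold=20,
                     high_char_threshold=20):
    """Guess coarse properties of a file from simple counts.

    the_threshold follows the message (50 instances of "the");
    tag_threshold and high_char_threshold are made-up illustrative values.
    """
    the_count = len(re.findall(r"\bthe\b", text, flags=re.IGNORECASE))
    tag_count = len(re.findall(r"<[^<>\n]+>", text))   # strings within <>
    non_ascii = sum(1 for ch in text if ord(ch) > 127)  # beyond 7-bit ASCII
    return {
        "probably_english": the_count >= the_threshold,
        "probably_markup": tag_count >= tag_threshold,
        "probably_8bit_or_utf8": non_ascii >= high_char_threshold,
    }
```

Each guess is independent, so a file can be flagged as both English and marked up; a caller would use the flags to decide which groups of checks to switch on.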

On Sun, 22 Jan 2006 16:39:44 -0500, Jim Tinsley <jtinsley@pobox.com> wrote:

|On Sun, Jan 22, 2006 at 09:26:24PM +0000, Dave Fawthrop wrote:
|>
|>IMO with the advent of huge memory in even the entry level computers, All
|>tests should be in the one program, and the different language versions
|>should be handled by simple switches/radio buttons, as with the various
|>sorts of angle brackets ATM. OK the switches will inevitably become
|>complex and difficult.
|
|You're right, of course.
|
|I _think_ you might even go one better. I've used the occurrence
|of 50 instances of something recognizable as the English word "the"
|as an indicator that a file is (at least partly) in English, and
|a high number of certain types of characters to suggest that the
|file is in ISO-8859 or UTF-8, and a high number of strings within
|<> to indicate some flavor of *ML.
|
|I suspect that a similar technique might be useful in multilingual
|checkers in general, and if I wrote one I would certainly consider it.

There has been a lot of academic work on detecting language by counting frequently used short words. All languages have a different set of frequently used short words. IIRC it is not particularly accurate, and naturally falls down on text in two or more languages. I have a book in Yorkshire and English on my desk ATM.

IMO asking the user which language he/she is using would be easier and more reliable.

--
Dave Fawthrop <dave hyphenologist co uk>
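Both positions in this exchange can be combined in one small sketch: count frequent short words per language, but let an explicit choice from the user always win. The word lists below are tiny hypothetical samples, not a real stopword resource, and the function names are invented for the example.

```python
import re
from collections import Counter

# Tiny illustrative samples of frequent short words; real lists would be longer.
STOPWORDS = {
    "en": {"the", "and", "of", "to", "in", "is"},
    "fr": {"le", "la", "les", "et", "de", "un", "est"},
    "de": {"der", "die", "das", "und", "ist", "nicht"},
}


def score_languages(text):
    """Count how many words of the text fall in each language's short-word list."""
    words = Counter(re.findall(r"[^\W\d_]+", text.lower()))
    return {lang: sum(words[w] for w in sw) for lang, sw in STOPWORDS.items()}


def choose_language(text, user_choice=None):
    """Prefer the language the user names; otherwise fall back to the counts."""
    if user_choice is not None:
        return user_choice
    scores = score_languages(text)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

The weakness mentioned in the message shows up directly: `score_languages("the cat et le chien")` gives a score to both English and French, and a dialect with no word list (such as Yorkshire) can only be handled through `user_choice`.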

Would anyone like to do a little good old-fashioned proof-reading in French? (Or perhaps what would be called "smooth reading" by DPers.)

I have the text of two short novels by Laure Conan already prepared. But then, I found an edition which contains these as well as three short stories. I've gone ahead and typed out the first two of these stories and run them through a spell check. (The third is in progress.) I would greatly appreciate a read-through from someone who knows the language.

They can be found at: http://www.victoria.tc.ca/~sly/conan4.txt

Thanks,
Andrew

Andrew Sly a écrit :
Would anyone like to do a little good old-fashioned proof-reading in French? (Or perhaps what would be called "smooth reading" by DPer's)
I have the text of two short novels by Laure Conan already prepared. But then, I found an edition which contains these as well as three short stories. I've gone ahead and typed out the first two of these stories and run them through a spell check. (The third is in progress) I would greatly appreciate a read-through from someone who knows the language.
They can be found at: http://www.victoria.tc.ca/~sly/conan4.txt
Thanks, Andrew
I am a French speaker and I could perhaps do it if the texts are not too long, but the link seems broken.

koxinga

First, some general musing here...

It's interesting to see how as the number of people involved with PG in one way or another keeps growing, so does a general misunderstanding about the nature of the project. Somehow, the impression of some people is that it all runs like clockwork, and all little ambiguities are swiftly and efficiently dealt with. I can understand that when someone sees the sheer amount of what has been accomplished so far, it can be easy to assume that "of course, _this_ has been done--it wouldn't make sense otherwise."

Realistically, the processes that are in place grew up over time, with volunteers doing their best to deal with the demands of the moment and string something together that would work. And it is not static either, it keeps changing. I'm not kidding when I say that the few people who do the majority of the back-end stuff that keeps PG growing have a backlog of years of PG-related tasks to tackle.

So, on the specific topic at hand...

On Thu, 19 Jan 2006, Dave Fawthrop wrote:
You have just admitted that gutcheck is the standard on PG.
Yes, it is used a lot. Often it's a very useful tool (sometimes even for non-English texts). However, I would not call it a standard in the sense of being a "test" that a given text has to "pass" (such as a test for valid markup on an HTML file). Rather, it is a tool which just about every text being added to the collection is run through, as a way of 1) assessing the overall level of the text, and 2) guarding against last-minute gremlins that do unexpected things to a text (and yes, interesting things do happen sometimes).

I have submitted some German and French texts to PG which I have reformatted from other sources, and, as expected, a run through gutcheck resulted in many places being questioned that were just fine in the given languages. So, if I thought it needed, I just added a note when submitting the texts that "gutcheck flags a lot of false positives on this one."

It looks like the source for gutcheck is available at http://gutcheck.sourceforge.net/ if you are interested in modifying it for your own uses. (If you are just dealing with one or two texts, it might not be worth the bother, but if you foresee working through lots of Yorkshire text, it could be more worthwhile.)

......

So, will the conditions I discussed above change? Well, PG is certainly more organized in some ways than it used to be, and I could see it going further in that direction. However, I don't realistically see it ceasing to be run by volunteers, which does set some of the tone.

I'm not pretending that I think PG is perfect here. Like anyone else who is involved, I have my own issues (one of my pet peeves is if the stated character encoding in the header does not match what is actually in the text), but I know they will not likely be dealt with unless I go ahead and try to work on them.

I've found a good approach is building consensus with others. From the cataloging point of view, I've regularly had help from native speakers of various languages (Finnish and Tagalog spring to mind) which has helped me to make bibliographic data more precise than I ever could have managed on my own.
As well I've occasionally sent queries to the reference desks of libraries in many corners of the world. If I can get it organized, I'm hoping to make a sub-project where I can target a few wikipedia users who have indicated they have fluency in both English and Chinese, and give them a way to help improve the consistency of the author data for our Chinese texts. I'd better stop now, before I meander off-topic too much... But I hope this has helped somewhat. And thanks for caring about Project Gutenberg. :) Andrew
participants (6)
- Andrew Sly
- Carlo Traverso
- Dave Fawthrop
- Jim Tinsley
- koxinga
- Sebastien Blondeel