[gutvol-p] Re: Getting Involved
john_redmond at optusnet.com.au
Fri Dec 18 18:35:10 PST 2009
Hello Al, Juliet and Jim:
Thanks for your detailed replies. I now start to understand the train of
dependencies. The credit line was what had bothered me in particular,
given that a PDF file cannot be altered after submission.
More generally, I now understand that, if I can contribute at all, it
would be to help in packaging content for others to convert to PDF, XML,
or whatever other format may become fashionable. In this connection, the
reply from Juliet looks very enticing.
To explain: I am one of the XML true believers, and I think TEI, or
something like it, is ultimately the way to go. But TEI seeks to be
all-inclusive and so is doomed to be very big and complicated (think
SGML). And working with
XML is so painful and error-prone for humans. I don't know how big the
PGTEI subset is, but there is a good chance that it might be expressible
in lightly marked-up text, which can easily be parsed into XML. If that
were the case, I could become usefully involved at the DP end.
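As a rough illustration of the "lightly marked-up text parsed into XML" idea, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the markup conventions (a leading `# ` for a heading, `_word_` for emphasis, blank lines between paragraphs) and the function name `light_markup_to_xml` are hypothetical, and the output only borrows the generic TEI element names `head`, `p` and `hi`. It is not PGTEI, whose spec is not quoted in this thread.

```python
import re
import xml.etree.ElementTree as ET


def light_markup_to_xml(text):
    """Convert a tiny, hypothetical light markup into TEI-like XML."""
    root = ET.Element("text")
    body = ET.SubElement(root, "body")

    # Blocks (headings or paragraphs) are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        # Collapse hard line breaks inside a block into single spaces.
        block = " ".join(block.split())

        if block.startswith("# "):
            elem = ET.SubElement(body, "head")
            content = block[2:]
        else:
            elem = ET.SubElement(body, "p")
            content = block

        # Splitting on _emphasis_ with one capture group yields
        # [plain, emphasised, plain, emphasised, ..., plain].
        pieces = re.split(r"_([^_]+)_", content)
        elem.text = pieces[0]
        for i in range(1, len(pieces), 2):
            hi = ET.SubElement(elem, "hi", rend="italic")
            hi.text = pieces[i]
            hi.tail = pieces[i + 1]

    return ET.tostring(root, encoding="unicode")


sample = "# Chapter 1\n\nIt was the _best_ of\ntimes."
print(light_markup_to_xml(sample))
```

The point of the sketch is only that a human edits the plain text while the XML is generated mechanically; a real converter would need rules for every element the target subset allows.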
To state Basil Fawlty's bleedingly obvious, it might then become PG's
long-term aim to provide PGTEI versions of all texts, from which all
styled versions can be derived--and the only version that needs to be
maintained. But where is a spec for PGTEI? And samples? If I could have
a look at them, I could very quickly decide whether I could be of any
use.
In summary, without being very clear about it, I had thought that I
might be able to contribute to PG by generating more refined documents
from existing books (gratuitous, I admit); but now I suspect that I
might be more useful by wrestling with the software.
Notes for Juliet:
1. I could not find a spec for PGTEI on the pgdp.net site. Is one
available?
2. I am a Linux user by choice, but should I presume that all software
is required to run on Windows?
Note for all responders: Thanks for your thoughtful responses; I am
starting to learn the issues!
In conclusion: many thanks to Al, Juliet and Jim for their detailed
replies.
On Wed, 2009-12-16 at 15:58 -0800, Al Haines (shaw) wrote:
> If a PDF, or any other format, is generated from an existing PG text, it
> won't get a new number. It would be bundled in with all other files for
> that etext number, and would appear in PG's catalog as an additional filetype.
> To use Copperfield as an example, if it was in PG originally as only a text
> file, then at a later date an HTML version was generated from the text file,
> the text and HTML files would appear as two filetype entries under that
> particular Copperfield. If a PDF file was then added, generated from either
> that Copperfield's text or HTML file, the PDF file would appear as another
> New numbers are given to ebooks that are new to PG, or that are created
> from a different edition than a current PG ebook, with significant
> enhancements or differences.
> On occasion, a new number is assigned if a new set of files is created from
> the same source edition, but the new version has significant enhancements,
> e.g. illustrations, an index, etc., that may have been omitted from the
> current PG version. This usually applies only to PG's oldest texts, before
> HTML/images/ISO files were commonly provided.
> One other point, again with Copperfield as the example. You say that yours
> was generated from one of PG's editions, but you appear to have stripped out
> the producer's credit line ("Produced by..."). Some PG files, usually the
> older ones, may not have originally had such a credit line, but if the file
> is cleaned up and reposted some time after its original submission, it's
> standard practice to add "Produced by an anonymous Project Gutenberg
> Whatever the case, stripping out a credit line is a distinct no-no. The
> original producers always get credit for the original production, with the
> producer of the new format getting additional credit. For example, PG#552
> (The People that Time Forgot) was produced in 1996 by Judith Boss. In July
> 2008, I created an HTML version from her text file. She retains basic
> credit; I took credit only for the HTML file. If you created a PDF file
> from either of those two files, your credit would be added to the other two.
> These credit lines are respected by most harvesters of PG files.
> ----- Original Message -----
> From: "John Redmond" <john_redmond at optusnet.com.au>
> To: "Al Haines (shaw)" <ajhaines at shaw.ca>
> Sent: Wednesday, December 16, 2009 2:42 PM
> Subject: Re: [gutvol-p] Getting Involved
> > Hello Al:
> > Thanks for responding. I will certainly work through all the links that
> > you have listed. I can't help feeling, though, that what I want to do is
> > somewhat different from the usual:
> > 1. I see my contribution, apart from providing the software, as adding
> > value to existing books. For example, the files on my site
> > (www.limpidsoft.com) are derived from PG books, but I presume that they
> > will have new catalog numbers.
> > 2. I can provide XHTML files -- although there is no shortage of them in
> > PG. So my particular contribution would be PDF files, possibly with the
> > associated LaTeX files. Now, because PDF files are locked, it will not
> > be possible to include any statements after they are built. As I see it,
> > then, I would need to tie up all this detail before submitting.
> > 3. Plain text versions are automatically accounted for (see 1. above),
> > but it would probably be appropriate to identify these somewhere in the
> > PDF.
> > John Redmond