(sending this time to the proper list address)

Prior to etext #10000, Gutenberg filenames were strings that had nothing to do with the etext numbers (except that the number indirectly determined which etext* directory the files would be in, since both the numbers and the etext* directories were tied to release date). As these early etexts get revised, they typically move into the new number-based filename system. However, there are still slightly more than 6000 Gutenberg etexts that use the old-style filenames.

Information about the filenames, new and old, is included in the RDF file. For instance, if you are interested in etext #3167, you'll find a metadata record for it early in the file, in a pgterms:etext element with ID "etext3167". Later in the RDF file, you'll see two pgterms:file elements that have an isFormatOf relationship with the ID "etext3167". Those elements in turn specify information about the two files associated with that etext, including name, MIME type, length, and last-modified date. The name of the file, in particular, is in the rdf:about attribute of the pgterms:file element. In this case, you'll find that the two files associated with etext #3167 are etext02/wsxpm10.txt and etext02/wsxpm10.zip (relative to the top-level Gutenberg text directory).

If for some reason the RDF itself is too big for you to handle easily, it looks like it's auto-generated, so you could probably write a script in Perl or some other suitable text-crunching language to extract only the information you're interested in, in some more compact form.

I have my own independent copy of some of this information, in my own format, which I assembled before the RDF directory was made available. But if I were to start over again, I'd probably just pull straight from the RDF. (And try to be more proactive about getting Gutenberg to fix its metadata at various spots instead of just fixing it on my end.)

I hope this helps.

John
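
P.S. For what it's worth, here is a rough sketch of the kind of extraction script I have in mind, in Python rather than Perl. It streams through the RDF and builds a map from etext ID to the filenames found in the rdf:about attributes of the matching pgterms:file elements. Treat it as illustrative only: the function name files_by_etext and the tiny SAMPLE document are made up for this message, the pgterms namespace URI in the sample is my assumption, and the code matches elements by local name so that it does not depend on that URI. I have not run it against the real catalog, which may differ in details such as how the rdf:about values are written.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Standard RDF namespace, needed for the rdf:about and rdf:resource attributes.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical miniature of the catalog, for illustration only.
# The pgterms namespace URI below is an assumption.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:pgterms="http://www.gutenberg.org/rdfterms/">
  <pgterms:etext rdf:ID="etext3167"/>
  <pgterms:file rdf:about="etext02/wsxpm10.txt">
    <dcterms:isFormatOf rdf:resource="#etext3167"/>
  </pgterms:file>
  <pgterms:file rdf:about="etext02/wsxpm10.zip">
    <dcterms:isFormatOf rdf:resource="#etext3167"/>
  </pgterms:file>
</rdf:RDF>"""


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def files_by_etext(source):
    """Map each etext ID to the list of filenames that are formats of it.

    `source` is a filename or a binary file object containing the RDF.
    """
    result = defaultdict(list)
    # iterparse streams the document, so the whole tree is never held at once.
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local(elem.tag)
        if name == "file":
            filename = elem.get(f"{{{RDF_NS}}}about")
            for child in elem:
                if local(child.tag) == "isFormatOf":
                    target = child.get(f"{{{RDF_NS}}}resource", "")
                    # rdf:resource is written as "#etext3167"; drop the "#".
                    result[target.lstrip("#")].append(filename)
        if name in ("file", "etext"):
            # Free the finished record to keep memory use low.
            elem.clear()
    return dict(result)


if __name__ == "__main__":
    mapping = files_by_etext(io.BytesIO(SAMPLE.encode("utf-8")))
    for etext_id, names in mapping.items():
        print(etext_id, names)
```

To try it on the real thing, you would pass the path of the downloaded RDF file to files_by_etext instead of the in-memory sample, and then write out only the fields you care about.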