
On Tue, Jul 28, 2009 at 02:16:15PM +0000, Joshua Hutchinson wrote:
Any chance of creating on the fly zips of some of the books? For instance, the audio books are huge and usually divided along chapter lines. Single file zips are very useful (and something we've done on some of them manually) but the space waste is huge. On the fly zipping of those files would save huge in storage space.
Josh
Somebody would need to write the software :)

Zipping an mp3 is not a winning strategy: they really don't compress much, if at all. Putting multiple mp3 files for a single eBook in one file, on the fly, would be a great move - making it easier to download a group of files.

A more general approach would be to let visitors to www.gutenberg.org put their selected files (including those generated on-the-fly) on a bookshelf (i.e., shopping cart), then download in one big file, or several small ones.

This would involve some fairly significant additions to the current PHP-based back-end at www.gutenberg.org, but is certainly not a huge technical feat.

-- Greg
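To illustrate the point about bundling rather than compressing, here is a minimal sketch in Python (not the PHP back-end mentioned above, and not anything Project Gutenberg is known to run). It packs several files into one zip archive using the "stored" method, which adds zip headers but does not try to compress data that is already compressed, such as mp3. The function name `bundle_files` and the file names in the usage note are hypothetical.

```python
import io
import zipfile


def bundle_files(files, compress=False):
    """Pack (name, bytes) pairs into a single in-memory zip archive.

    With compress=False the members are written with ZIP_STORED, i.e.
    copied as-is behind zip headers. That suits already-compressed data
    such as mp3, where deflating costs CPU time and saves little.
    """
    method = zipfile.ZIP_DEFLATED if compress else zipfile.ZIP_STORED
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=method) as archive:
        for name, data in files:
            # Hypothetical member names; a real service would read the
            # chapter files from the collection instead.
            archive.writestr(name, data)
    return buf.getvalue()
```

A caller would pass the chapter files of one audio book, for example `bundle_files([("chapter01.mp3", data1), ("chapter02.mp3", data2)])`, and send the returned bytes as a single download. Building the archive per request means only the individual chapter files need to be kept on disk.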
On Jul 28, 2009, Greg Newby <gbnewby@pglaf.org> wrote:
On Tue, Jul 28, 2009 at 09:16:41AM +0200, Ralf Stephan wrote:
> I confirm that neither the Plucker nor the Mobile formats
> are mentioned in the catalog file. Do you have an
> explanation, Marcello?
I believe Marcello is out on vacation for 2 weeks.
But I know the explanation: the epub, mobi and a few other formats are not part of the Project Gutenberg collection's files, so not part of the database.
They are generated on-demand (or cached if they were generated recently enough), from HTML or text.
We are planning many more "on the fly" conversion options for the future. I have one for a mobile eBook format (for cell phones), and hope to have a PDF converter (with lots of options). We've been working on some text-to-speech converters, too, but that work has gone slowly.
The catalog file only tracks the actual files that are stored as part of the collection (stuff you can view while navigating the directory tree via FTP or other methods).

-- Greg
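The "generated on demand, or cached if generated recently enough" behaviour described above can be sketched roughly as follows. This is only an illustration in Python under stated assumptions: the thread does not say how the real converters or their cache work, so the function name `get_generated_file`, the `generate` callback, and the one-week lifetime are all hypothetical.

```python
import os
import time


def get_generated_file(cache_path, generate, max_age_seconds=7 * 24 * 3600):
    """Return a derived format, regenerating it when the cached copy is stale.

    cache_path      -- where the generated file is cached (assumed layout)
    generate        -- callable returning the converted file as bytes
    max_age_seconds -- assumed cache lifetime; one week is an arbitrary choice
    """
    try:
        age = time.time() - os.path.getmtime(cache_path)
        if age <= max_age_seconds:
            # Cached copy is recent enough: serve it without converting.
            with open(cache_path, "rb") as fh:
                return fh.read()
    except OSError:
        pass  # no cached copy yet (or unreadable): fall through and convert

    data = generate()  # e.g. convert the stored HTML or text on the fly
    os.makedirs(os.path.dirname(cache_path) or ".", exist_ok=True)
    with open(cache_path, "wb") as fh:
        fh.write(data)
    return data
```

Because files produced this way live only in the cache and not in the collection's directory tree, a catalog that lists stored files would not mention them, which matches the explanation given above.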
> On Jul 27, 2009, at 8:42 PM, David A. Desrosiers wrote:
>
>> On Mon, Jul 27, 2009 at 1:45 PM, Ralf Stephan <[1]ralf@ark.in-berlin.de> wrote:
>>> My, can't we admit that XPath is a bit over our head,
>>> so we prefer confronting the admin we're supposed
>>> to be cooperating with? Wrt resources, my guess it's
>>> about par traffic-wise (1-5k per book vs. megabytes
>>> of RDF) but much better CPU-wise. That is, if you don't
>>> want the RDF for other fine things like metadata etc.
>>
>> I think you've missed my point.
>>
>> The RDF flat-out cannot tell me which of the target _formats_ are
>> available for immediate download to the users. I'm not looking for
>> which _titles_ are available in the catalog, I'm looking for which
>> _formats_ are available. Also note that I'm already parsing the feeds
>> to see what the top 'n' titles are already, so parsing XML via
>> whatever methods I need is not the blocker here.
>>
>> Let me give you an example of two titles available in the catalog:
>>
>> Vergänglichkeit by Sigmund Freud
>> [2]http://www.gutenberg.org/cache/plucker/29514/29514
>>
>> The Lost Word by Henry Van Dyke
>> [3]http://www.gutenberg.org/cache/plucker/4384/4384
>>
>> Both of these _titles_ are available in the Gutenberg catalog, but the
>> second one is not available in the Plucker _format_ for immediate
>> download. Big difference from parsing title availability from the
>> catalog.rdf file.
>>
>> Make sense now?
>> _______________________________________________
>> gutvol-d mailing list
>> [4]gutvol-d@lists.pglaf.org
>> [5]http://lists.pglaf.org/mailman/listinfo/gutvol-d
>
> Ralf Stephan
> [6]http://www.ark.in-berlin.de
> pub 1024D/C5114CB2 2009-06-07 [expires: 2011-06-06]
> Key fingerprint = 76AE 0D21 C06C CBF9 24F8 7835 1809 DE97 C511 4CB2
References

Visible links
 1. mailto:ralf@ark.in-berlin.de
 2. http://www.gutenberg.org/cache/plucker/29514/29514
 3. http://www.gutenberg.org/cache/plucker/4384/4384
 4. mailto:gutvol-d@lists.pglaf.org
 5. http://lists.pglaf.org/mailman/listinfo/gutvol-d
 6. http://www.ark.in-berlin.de/
 7. mailto:gutvol-d@lists.pglaf.org
 8. http://lists.pglaf.org/mailman/listinfo/gutvol-d
 9. mailto:gutvol-d@lists.pglaf.org
10. http://lists.pglaf.org/mailman/listinfo/gutvol-d