
There is a new DVD image some folks might like to check out. You can get it interactively here:
http://snowy.arsc.alaska.edu/pgjun05
or download the full ISO here (size=4668391424 bytes, MD5sum=eb9d00a4b1e4cb30d801709ced6da282):
ftp://snowy.arsc.alaska.edu/pub/gbn/pgjun05.iso

This is the first major output of Craig Stephenson's program to allow people to build their own CD/DVD ISOs. I'll send a URL to the program in another week or two (it's not quite ready yet for multiple users).

We started with the Best Of CD titles as core, getting updated files with an emphasis on HTML. Then, we blindly added lots more HTML, uncompressed, for a pleasurable "unzip-free" reading experience. I also made sure a few particular authors were included, in the Best Of tradition.

There are a few things I know are problematic, but please inform me of any others that you spot:

- a few copyrighted files snuck in (some MP3 audio and a Kafka)
- the author/title index files are mixed case, and would be better in a subdirectory
- there might be some Complete volumes that are partially duplicated by individual volumes. If you spot any, let me know
- the author/title index pages need something like a "Link: " label for the eBook file, and also a "Language: " field. We might add a "by-language" index, in addition to the Author and Title indexes.

Although I made a bunch of these for Michael Hart's visit to Alaska (public talk=Wednesday June 22 at the Fairbanks Public Library 7:00 pm), and to try to give away to AK libraries, I don't expect this to be quite polished enough to redistribute en masse. But I hope it might be the core of a new DVD option to supplement our "PG 10K Special" from December 2004. (That DVD, which is eBook 11800, is mostly zipped .txt files -- about 9400 titles). This new DVD image contains about 5100 eBooks.

In a nutshell, Craig's program parses the RDF/XML catalog into a MySQL database. Then, PHP is used to provide a user with an iterative, interactive set of steps to add and delete eBooks and their formats from the ISO. Building an online browsable prototype of the ISO is simple and fast, because we use hard links (on the same filesystem as the collection mirror). Once it looks good, the actual ISO is built with mkisofs (which takes a little while) and becomes available for download via FTP (or HTTP if it's < 2GB).

We'll be doing features etc., and making the code widely available (though it basically requires a complete PG mirror to work).

Enjoy, and please send feedback!

-- Greg
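The hard-link step described above can be sketched roughly as follows. This is an illustration in Python, not Craig's PHP code: the function names, the directory layout, and the mkisofs options (-r, -J, -o) are assumptions chosen for the example, since the message does not say which options are actually used.

```python
import os
from pathlib import Path


def stage_with_hard_links(mirror_root, staging_root, relative_paths):
    """Hard-link the selected mirror files into a staging tree.

    Returns the number of new links created. Both trees must live on the
    same filesystem, because hard links cannot cross filesystems.
    """
    mirror_root = Path(mirror_root)
    staging_root = Path(staging_root)
    linked = 0
    for rel in relative_paths:
        src = mirror_root / rel
        dst = staging_root / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        if dst.exists():
            continue  # already staged by an earlier pass
        os.link(src, dst)  # no data is copied, only a new directory entry
        linked += 1
    return linked


def mkisofs_command(staging_root, iso_path):
    """Build (but do not run) a mkisofs command line for the staging tree.

    The option set here is an assumption for illustration only.
    """
    return ["mkisofs", "-r", "-J", "-o", str(iso_path), str(staging_root)]
```

Because only directory entries are created, staging a few thousand files this way is fast and uses almost no extra disk space, which fits the "simple and fast" prototype described in the message. The slow part is left to the later mkisofs run.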

Greg Newby wrote:
In a nutshell, Craig's program parses the RDF/XML catalog into a MySQL database. Then, PHP is used to provide a user with an iterative, interactive set of steps to add and delete eBooks and their formats from the ISO.
Who do we target, the PG DVD team or the user at large? Where is this program supposed to run when it is ready?

-- 
Marcello Perathoner
webmaster@gutenberg.org

On Thu, Jun 09, 2005 at 11:38:11AM +0200, Marcello Perathoner wrote:
Greg Newby wrote:
In a nutshell, Craig's program parses the RDF/XML catalog into a MySQL database. Then, PHP is used to provide a user with an iterative, interactive set of steps to add and delete eBooks and their formats from the ISO.
Who do we target, the PG DVD team or the user at large?
The user at large. But there are benefits for the DVD team and other purposes, as well. For example, someone will be able to "save" their ISO configuration, then return later to get *updated* files for the same eBooks. This will be particularly useful for doing things like quarterly updates of "theme" CDs or DVDs, such as Col Choat's idea of an "explorers" collection.
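One way the "save, then return for updated files" idea could work is sketched below. This is only a guess at a mechanism, not the program's actual design: the JSON file format, the function names, and the use of file modification times to detect updates are all assumptions made for the example.

```python
import json
import os


def save_selection(config_path, ebook_files):
    """Save a selection as JSON, recording each file's modification time.

    ebook_files maps a path inside the ISO to the file's location on the
    mirror (a hypothetical layout for this sketch).
    """
    snapshot = {rel: os.path.getmtime(full) for rel, full in ebook_files.items()}
    with open(config_path, "w", encoding="utf-8") as fh:
        json.dump(snapshot, fh, indent=2, sort_keys=True)


def updated_since_save(config_path, ebook_files):
    """Return the selected files that are newer on the mirror than when saved."""
    with open(config_path, encoding="utf-8") as fh:
        snapshot = json.load(fh)
    return sorted(
        rel
        for rel, full in ebook_files.items()
        if os.path.getmtime(full) > snapshot.get(rel, 0.0)
    )
```

For a quarterly "theme" disc, someone would call save_selection once, then run updated_since_save before each new release to see which titles need refreshing.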
Where is this program supposed to run when it is ready?
On a beefy server. Right now it's on snowy.arsc.alaska.edu, and I imagine snowy will be suitable for relatively large-scale use. I hope the program will be available at other mirror sites, too. I think it will be too intensive in disk & CPU for iBiblio, but you never know... If offering this to the general reader sounds computationally unrealistic to you, read my work .sig below :-)

-- Greg

Dr. Gregory B. Newby, Chief Scientist, Arctic Region Supercomputing Center
Univ of Alaska Fairbanks-909 Koyukuk Dr-PO Box 756020-Fairbanks-AK 99775-6020
e: newby AT arsc.edu v: 907-450-8663 f: 907-450-8601 w: www.arsc.edu/~newby

Greg Newby wrote:
For example, someone will be able to "save" their ISO configuration, then return later to get *updated* files for the same eBooks. This will be particularly useful for doing things like quarterly updates of "theme" CDs or DVDs, such as Col Choat's idea of an "explorers" collection.
This will be great for the DVD team. I don't know about the users at large though.

Some people (not mirrors!) are roboting our whole site once a week in search of new books. I wonder how the DVD maker will scale under similar load conditions.

I was just wondering if it wasn't more realistic to use jigdo on the user's side. People who burn DVDs do have a little knowledge so they could manage to install that.

Jigdo advantages: no big single chunk file transfers. jigdo will get the ebook files from the ftp server and build the DVD image on the user's PC. On updates the user has to transfer just the changed files, not the whole DVD image. By building our own jigdo files we could round robin the ftp load to different mirrors.

Jigdo disadvantages: user has to install the jigdo client. We have to somehow build a jigdo control file (but jigdo is open source, so we can figure that out.)
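The two jigdo advantages named above (transfer only changed files, spread the FTP load across mirrors) can be illustrated with a small sketch. This does not use the jigdo client or reproduce its control-file format. The manifest layout, function names, and mirror URLs are hypothetical and exist only to show the idea.

```python
from itertools import cycle


def changed_files(old_manifest, new_manifest):
    """Return paths that are new or whose checksum differs from the old image.

    Each manifest maps a path inside the image to a checksum string.
    """
    return sorted(
        path
        for path, digest in new_manifest.items()
        if old_manifest.get(path) != digest
    )


def assign_mirrors(paths, mirrors):
    """Pair each path with a mirror in round-robin order and return full URLs."""
    return [
        mirror.rstrip("/") + "/" + path
        for path, mirror in zip(paths, cycle(mirrors))
    ]


# Example with made-up checksums and placeholder mirror names.
old = {"1/2/12.txt": "aaa", "1/3/13.txt": "bbb"}
new = {"1/2/12.txt": "aaa", "1/3/13.txt": "ccc", "1/4/14.txt": "ddd"}
to_fetch = changed_files(old, new)
urls = assign_mirrors(to_fetch, ["ftp://mirror-a.example.org/pg", "ftp://mirror-b.example.org/pg"])
```

In this example only two of the three files would be fetched for the update, and each comes from a different mirror.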
Where is this program supposed to run when it is ready?
On a beefy server. Right now it's on snowy.arsc.alaska.edu, and I imagine snowy will be suitable for relatively large-scale use. I hope the program will be available at other mirror sites, too. I think it will be too intensive in disk & CPU for iBiblio, but you never know... If offering this to the general reader sounds computationally unrealistic to you, read my work .sig below :-)
Of course, if you throw a NetApp terabyte server at the problem... You'll need a place to store all those custom DVD images until the user has retrieved them. (How to detect that? You can't rely on the user notifying you.) Retrieving DVD images has been a PITA even with fast DSL modems, so you'll have to save the images for at least a couple of days.
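Since the server cannot reliably tell when a user has finished downloading, one simple fallback is to keep every image for a fixed period and then delete it. The sketch below assumes that policy. The three-day default, the flat directory of .iso files, and the function name are illustrative assumptions, not anything the thread specifies.

```python
import os
import time

SECONDS_PER_DAY = 86400


def expired_images(directory, max_age_days=3, now=None):
    """Return paths of .iso files in directory older than max_age_days.

    Age is judged by modification time; 'now' can be passed in for testing.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    expired = []
    for entry in os.scandir(directory):
        if (
            entry.is_file()
            and entry.name.endswith(".iso")
            and entry.stat().st_mtime < cutoff
        ):
            expired.append(entry.path)
    return sorted(expired)
```

A periodic job could call expired_images and remove each returned path with os.remove. Listing and deleting are kept separate here so the list can be reviewed or logged first.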
Dr. Gregory B. Newby, Chief Scientist, Arctic Region Supercomputing Center
That's a good idea. You will save big on your air-conditioning bill. :-)

-- 
Marcello Perathoner
webmaster@gutenberg.org

On Fri, Jun 10, 2005 at 11:54:53AM +0200, Marcello Perathoner wrote:
Greg Newby wrote:
For example, someone will be able to "save" their ISO configuration, then return later to get *updated* files for the same eBooks. This will be particularly useful for doing things like quarterly updates of "theme" CDs or DVDs, such as Col Choat's idea of an "explorers" collection.
This will be great for the DVD team. I don't know about the users at large though.
Some people (not mirrors!) are roboting our whole site once a week in search of new books. I wonder how the DVD maker will scale under similar load conditions.
We will see, but I don't think the DVD maker will be robot-able at all. There are also provisions for load balancing. For example, when a user has the CD/DVD contents specified and says, "make me the ISO file," the ISO is built on an "as-available" basis, and the user gets email when it's ready. It's not going to be a viable tool for resource discovery.
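The "as-available" behaviour could look something like the following first-in, first-out queue. This is a sketch under assumptions, not the program's real scheduler: the class name and the build and notify callables are hypothetical stand-ins for the mkisofs run and the email step.

```python
from collections import deque


class BuildQueue:
    """Queue ISO build requests and process them one at a time."""

    def __init__(self, build, notify):
        # build(selection) -> path of the finished ISO (stand-in for mkisofs)
        # notify(email, iso_path) -> tells the user the ISO is ready
        self._pending = deque()
        self._build = build
        self._notify = notify

    def submit(self, email, selection):
        """Add a request and return its position in the queue (1 = next)."""
        self._pending.append((email, selection))
        return len(self._pending)

    def run_next(self):
        """Build the oldest pending request; return False if none is waiting."""
        if not self._pending:
            return False
        email, selection = self._pending.popleft()
        iso_path = self._build(selection)
        self._notify(email, iso_path)
        return True
```

Because requests only advance when the server calls run_next, a burst of submissions lengthens the wait rather than the load, which is the point of building on an as-available basis.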
I was just wondering if it wasn't more realistic to use jigdo on the user's side. People who burn DVDs do have a little knowledge so they could manage to install that.
Jigdo advantages: no big single chunk file transfers. jigdo will get the ebook files from the ftp server and build the DVD image on the user's PC. On updates the user has to transfer just the changed files, not the whole DVD image. By building our own jigdo files we could round robin the ftp load to different mirrors.
Jigdo disadvantages: user has to install the jigdo client. We have to somehow build a jigdo control file (but jigdo is open source, so we can figure that out.)
I'm 100% in favor of jigdo, and can set you up on snowy if you (or someone else) would like to get it configured.

-- Greg
_______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/listinfo.cgi/gutvol-d