
We are ready to migrate the web site to the new fast file server.
Also, some slight changes were made to the online catalog to make it more cacheable:
The dynamic authrec pages have been dropped in favour of the static browse-by-author pages. Browse-by-author now includes all information from the authrec pages. Redirects are in place.
The search has been optimized to redirect simple searches (searches for author only, title only) to the appropriate browse-by-author and browse-by-title pages.
A preview is online at: www-dev.gutenberg.org
Please test and report any oddities.
--
Marcello Perathoner
webmaster@gutenberg.org
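
The redirect rule for simple searches described above can be pictured with a minimal sketch. It is not the site's actual code: the function names, the `other_terms` parameter, and the `/browse/authors/` and `/browse/titles/` paths are assumptions made for illustration, and the query is assumed to start with the author's last name or the title's first word.

```python
def letter_page(text):
    """Map a name or title to its one-letter browse page, or "other"."""
    first = text.strip()[:1].lower()
    return first if "a" <= first <= "z" else "other"


def redirect_target(author="", title="", other_terms=""):
    """Return a static browse path for a simple search, else None."""
    author, title = author.strip(), title.strip()
    if other_terms.strip():
        return None  # any extra criterion still needs the database search
    if author and not title:
        return "/browse/authors/" + letter_page(author)  # hypothetical path
    if title and not author:
        return "/browse/titles/" + letter_page(title)    # hypothetical path
    return None  # empty or combined search: no redirect


print(redirect_target(author="Twain"))                      # /browse/authors/t
print(redirect_target(author="Twain", title="Tom Sawyer"))  # None
```

Only author-only and title-only queries are turned into a static path; everything else returns `None` and would fall through to the normal search.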

On Wed, 2 Mar 2005, Marcello Perathoner wrote:
We are ready to migrate the web site to the new fast file server.
Also, some slight changes were made to the online catalog to make it more cacheable:
I have an issue with the way that following an author name from a bibrec page leads to an anchor in an "author by first letter of last name" page. To me, this does not look like a long-term solution. It could work for a while, but as the collection continues to grow, these files will inevitably get too large to be easily useful for general browsing.
Take a look at the New General Catalog of Old Books and Authors, where Phillip has begun to break some of the files of author records into smaller sub-groupings. We could certainly do something like that here as well, but that would create extra work to identify which files are largest and what the best way to split them up would be.
Andrew

Andrew Sly wrote:
I have an issue with the way that following an author name from a bibrec page leads to an anchor in an "author by first letter of last name" page.
To me, this does not look like a long-term solution. It could work for a while, but as the collection continues to grow, these files will inevitably get too large to be easily useful for general browsing.
The old author pages had the problem that there were too many of them to generate statically (5000+) and they were very database-intensive to generate on the fly. We have a fair share of obnoxious robots visiting us (kids on a DSL line who want to grab everything and don't respect robots.txt), and every such visit costs us 5000+ heavy database hits. (The bibrec pages are much lighter on database resources.)
I'll try this way to see how it performs. The script uses a list of regexes to fill the pages with authors. If the "B" page (currently 219 KB) gets too big, we'll split it into "BA" and "BM".
Also, modern browsers will request compression, so the 219 KB page will boil down to a ~50 KB transmission. Many other web sites use images that big.
Here are the actual sizes.
-rw-r--r-- 1 marcello pgweb 120171 Mar 2 15:47 a.php
-rw-r--r-- 1 marcello pgweb 219237 Mar 2 15:47 b.php
-rw-r--r-- 1 marcello pgweb 168136 Mar 2 15:48 c.php
-rw-r--r-- 1 marcello pgweb 124726 Mar 2 15:48 d.php
-rw-r--r-- 1 marcello pgweb  54900 Mar 2 15:48 e.php
-rw-r--r-- 1 marcello pgweb  68002 Mar 2 15:49 f.php
-rw-r--r-- 1 marcello pgweb  93415 Mar 2 15:49 g.php
-rw-r--r-- 1 marcello pgweb 182640 Mar 2 15:50 h.php
-rw-r--r-- 1 marcello pgweb  17617 Mar 2 15:50 i.php
-rw-r--r-- 1 marcello pgweb  61671 Mar 2 15:50 j.php
-rw-r--r-- 1 marcello pgweb  52031 Mar 2 15:50 k.php
-rw-r--r-- 1 marcello pgweb 132947 Mar 2 15:50 l.php
-rw-r--r-- 1 marcello pgweb 184111 Mar 2  2005 m.php
-rw-r--r-- 1 marcello pgweb  29596 Mar 2  2005 n.php
-rw-r--r-- 1 marcello pgweb  38429 Mar 2  2005 o.php
-rw-r--r-- 1 marcello pgweb   9530 Mar 2 03:20 other.php
-rw-r--r-- 1 marcello pgweb 110174 Mar 2 03:19 p.php
-rw-r--r-- 1 marcello pgweb  11253 Mar 2 03:19 q.php
-rw-r--r-- 1 marcello pgweb  85506 Mar 2 03:19 r.php
-rw-r--r-- 1 marcello pgweb 195736 Mar 2 03:19 s.php
-rw-r--r-- 1 marcello pgweb  88693 Mar 2 03:19 t.php
-rw-r--r-- 1 marcello pgweb  29340 Mar 2 03:20 u.php
-rw-r--r-- 1 marcello pgweb 148515 Mar 2 03:20 v.php
-rw-r--r-- 1 marcello pgweb 139151 Mar 2 03:20 w.php
-rw-r--r-- 1 marcello pgweb   7759 Mar 2 03:20 x.php
-rw-r--r-- 1 marcello pgweb  18127 Mar 2 03:20 y.php
-rw-r--r-- 1 marcello pgweb  15734 Mar 2 03:20 z.php
--
Marcello Perathoner
webmaster@gutenberg.org
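
The regex-list approach mentioned in this message, including the possible split of "B" into "BA" and "BM", might look roughly like the sketch below. The real script is not shown in the thread, so the function names, the exact patterns, and the choice of "ba" covering second letters a to l are all assumptions; the sample author headings are illustrative only.

```python
import re
import string
from collections import defaultdict


def build_rules(split_b=False):
    """Build an ordered list of (page, regex); the first match wins."""
    rules = []
    for letter in string.ascii_lowercase:
        if letter == "b" and split_b:
            # Assumed split point: "ba" takes Ba..Bl, "bm" takes the rest of B.
            rules.append(("ba", re.compile(r"^b[a-l]", re.IGNORECASE)))
            rules.append(("bm", re.compile(r"^b", re.IGNORECASE)))
        else:
            rules.append((letter, re.compile("^" + letter, re.IGNORECASE)))
    return rules


def page_for(author, rules):
    """Return the page an author heading belongs on, or "other"."""
    for page, pattern in rules:
        if pattern.match(author):
            return page
    return "other"


def fill_pages(authors, rules):
    """Group author headings by page, sorted case-insensitively."""
    pages = defaultdict(list)
    for author in sorted(authors, key=str.lower):
        pages[page_for(author, rules)].append(author)
    return dict(pages)


authors = ["Balzac, Honoré de", "Byron, George Gordon", "Austen, Jane", "Æsop"]
print(fill_pages(authors, build_rules()))
print(fill_pages(authors, build_rules(split_b=True)))
```

Because the rules are an ordered list, splitting a page that has grown too large only means replacing one rule with two narrower ones; names that match no rule land on the "other" page.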

On Wed, 2 Mar 2005, Marcello Perathoner wrote:
We are ready to migrate the web site to the new fast file server.
Also, some slight changes were made to the online catalog to make it more cacheable:
I got an email from one person who suggested that how to volunteer should be listed up with the donation information in addition to where it is in the "In Depth" section [marked <<< below]. Apparently some people don't read "In Depth" until they are already involved, and this person just wanted to know how to volunteer.
+ Donate. How to make a donation to Project Gutenberg.
+ News and Events. The news.
+ Contacts. How to get in touch.
+ Partners, Affiliates and Resources. A collection of links.
+ Credits. Thanks to our most prominent volunteers.
* In Depth Information. All you ever wanted to know about Project Gutenberg. <<<
+ Volunteering. How you can help Project Gutenberg. <<<

Michael Hart wrote:
I got an email from one person who suggested that how to volunteer should be listed up with the donation information in addition to where it is in the "In Depth" section [marked <<< below]. Apparently some people don't read "In Depth" until they are already involved, and this person just wanted to know how to volunteer.
+ Donate. How to make a donation to Project Gutenberg.
+ News and Events. The news.
+ Contacts. How to get in touch.
+ Partners, Affiliates and Resources. A collection of links.
+ Credits. Thanks to our most prominent volunteers.
* In Depth Information. All you ever wanted to know about Project Gutenberg. <<<
+ Volunteering. How you can help Project Gutenberg. <<<
Duplicating menu entries just creates confusion. We could move "Volunteering" into the "About" section, but I think it's better placed in the "In Depth" section.
--
Marcello Perathoner
webmaster@gutenberg.org

I suppose while these updates are going on, we should also update 13,000 to 15,000 in the opening:
Project Gutenberg is the oldest producer of free electronic books (eBooks or etexts) on the Internet. Our collection of more than 13.000 <<<
eBooks was produced by hundreds of volunteers.

It's too bad we can't make that dynamic, feeding off of a database =)
-brandon
Michael Hart wrote:
I suppose while these updates are going on, we should also update 13,000 to 15,000 in the opening:
Project Gutenberg is the oldest producer of free electronic books (eBooks or etexts) on the Internet. Our collection of more than 13.000 <<< eBooks was produced by hundreds of volunteers.
_______________________________________________
gutvol-d mailing list
gutvol-d@lists.pglaf.org
http://lists.pglaf.org/listinfo.cgi/gutvol-d

Brandon Galbraith wrote:
I suppose while these updates are going on, we should also update 13,000 to 15,000 in the opening:
It's too bad we can't make that dynamic, feeding off of a database =)
Not worth the trouble ... First, we would have to agree on what counts as an ebook in its own right.
E.g., we have a Bible in the collection where every chapter got its own ebook number. Also, many books are posted in parts, and every part got its own number besides the complete book.
To get a meaningful count of ebooks, we would first have to get rid of such shameless stuffings.
--
Marcello Perathoner
webmaster@gutenberg.org
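
The difference between counting eBook numbers and counting works can be illustrated with a small sketch. The `Ebook` record, its `part_of` field, and the sample entries are invented for illustration and do not reflect the actual catalog schema or real eBook numbers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ebook:
    number: int                    # eBook number (made-up values below)
    title: str
    part_of: Optional[int] = None  # number of the complete book, if a part


def count_numbers(catalog):
    """Count every assigned eBook number."""
    return len({book.number for book in catalog})


def count_works(catalog):
    """Count only entries that are not parts of another entry."""
    return len({book.number for book in catalog if book.part_of is None})


catalog = [
    Ebook(1, "Complete Book"),
    Ebook(2, "Complete Book, Part 1", part_of=1),
    Ebook(3, "Complete Book, Part 2", part_of=1),
    Ebook(4, "Another Book"),
]
print(count_numbers(catalog))  # 4
print(count_works(catalog))    # 2
```

The first count needs no agreement on definitions; the second depends entirely on how each record is marked as a part, which is the decision that would have to be made first.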

On Fri, Mar 04, 2005 at 09:09:59PM +0100, Marcello Perathoner wrote:
Brandon Galbraith wrote:
I suppose while these updates are going on, we should also update 13,000 to 15,000 in the opening:
It's too bad we can't make that dynamic, feeding off of a database =)
Not worth the trouble ... First, we would have to agree on what counts as an ebook in its own right.
E.g., we have a Bible in the collection where every chapter got its own ebook number. Also, many books are posted in parts, and every part got its own number besides the complete book.
To get a meaningful count of ebooks, we would first have to get rid of such shameless stuffings.
That's an unwarranted poke, Marcello.
We do have a count: the eBook #s, which are used as the primary access point to our files.
Agreeing on what counts as an eBook is not necessary. We know how many eBook #s we have, even if there is disagreement on what counts as an eBook. There are plenty of words (in GUTINDEX.ALL and elsewhere) to augment this simplistic number.
-- Greg

We resisted the temptation to divide the Bible and Shakespeare into various sections when others were claiming AEsop's Fables each as an individual eBook to pad their bibliographies.
However, when people started requesting individual Shakespeare plays and books of the Bible for research purposes, we did as they asked, which we nearly always try to do for our readers.
I'm sure some people would also try to prevent paper publishers and libraries from publishing individual Shakespeare plays or books of the Bible.
BTW, I think we put all the shortest books in one file; at least that was my intention.
However, when someone donates a Shakespeare or Bible in their own particular favorite format and breakdown, that's totally up to them, and I'm not about to fight with them about it. . . .
If someone wants a verse by verse eBible, I think we should zip it all in one huge file, but still let it unzip in the manner they prefer.
mh
On Fri, 4 Mar 2005, Greg Newby wrote:
On Fri, Mar 04, 2005 at 09:09:59PM +0100, Marcello Perathoner wrote:
Brandon Galbraith wrote:
I suppose while these updates are going on, we should also update 13,000 to 15,000 in the opening:
It's too bad we can't make that dynamic, feeding off of a database =)
Not worth the trouble ... First, we would have to agree on what counts as an ebook in its own right.
E.g., we have a Bible in the collection where every chapter got its own ebook number. Also, many books are posted in parts, and every part got its own number besides the complete book.
To get a meaningful count of ebooks, we would first have to get rid of such shameless stuffings.
That's an unwarranted poke, Marcello.
We do have a count: the eBook #s, which are used as the primary access point to our files.
Agreeing on what counts as an eBook is not necessary. We know how many eBook #s we have, even if there is disagreement on what counts as an eBook. There are plenty of words (in GUTINDEX.ALL and elsewhere) to augment this simplistic number. -- Greg
participants (5)
- Andrew Sly
- Brandon Galbraith
- Greg Newby
- Marcello Perathoner
- Michael Hart