HTML and text statistics

15 Feb 2012

Having recently downloaded the PG 2010 DVD (thank you very much, PG), I
started looking at the books:
- The DVD seems to contain about 34,000 zip files.
- A small number of these (about 50) seem to be zipped mp3 files (*m.zip).
- About 1800 files seem to be zipped html files (*h.zip).
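
For anyone who wants to repeat the count, here is a minimal sketch in Python. It assumes only the naming patterns mentioned above (*m.zip for mp3, *h.zip for html); the function names and the idea of pointing it at a mounted DVD directory are my own, and the suffix test could over-count any file whose name merely happens to end in "h" or "m".

```python
from collections import Counter
from pathlib import Path


def classify(name: str) -> str:
    """Bucket a file name using the suffix patterns from the post (assumed)."""
    lower = name.lower()
    if lower.endswith("m.zip"):
        return "mp3"    # zipped mp3 files (*m.zip)
    if lower.endswith("h.zip"):
        return "html"   # zipped html files (*h.zip)
    if lower.endswith(".zip"):
        return "other"  # every other zip, presumably plain text
    return "not-zip"


def count_zips(names) -> Counter:
    """Count zip files per bucket, ignoring anything that is not a zip."""
    counts = Counter(classify(n) for n in names)
    counts.pop("not-zip", None)
    return counts


def count_zips_on_disk(root: str) -> Counter:
    """Walk a directory tree (hypothetical DVD mount point) and count zips."""
    return count_zips(p.name for p in Path(root).rglob("*") if p.is_file())
```

Calling count_zips_on_disk with the path where the DVD is mounted should give the three totals in one pass.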

Does this mean that PG (at that time) only had about 1800 'real' (as opposed
to generated-from-the-text-file) html books, or have I misunderstood
something?
Presumably many more books have been added since. I think I recall an append
from Greg saying that the number is now more like 40,000, and I get the
impression that most of the newer ones will come from DP and have html
versions. But even so, this seems to imply that there are about 28-30,000
books as text files for which no html version exists other than the
generated one.

Is this number about right, or have I missed something obvious?

Bob Gibbins
