
PG is a repository of digitised books with an emphasis on pure text. Why? Michael thought far ahead; look at how other formats keep changing all the time, which is why you go on having your religious wars without end or solution (I'm not going to get involved!).
First of all, what PG requires is, sadly, NOT just a "text" file. For example, I can create a "text" file from an html file in two seconds by doing "cut and paste." The result is a text file. Is it something that PG will accept as their "PG-text" file? Sadly, no. Michael thought the details of text format choices didn't matter, and then proceeded to make a "fatal error" in the detailed text format he chose to standardize on. Namely, he insisted on detailed and gratuitous rules for including extraneous newlines. Even his rules for how "PG-text" files work have changed over the years! Even his "bottom-line" standard isn't standard!
Secondly, the theory is, or was, that one could move forward from a PG text file to more modern formats with a relatively small, finite amount of work, such that the text files represent a "baseline" format. Well, is this true? Not really. The tale of 76.txt shows that it is not. While working forward from an existing PG txt file to a "modern" ebook file format is certainly LESS work than starting from scratch, it is still a tremendous amount of work, and it requires continued access to, and comparison against, the original page images.
If one insisted on a true baseline format, what would it be? Answer: high resolution digitized page images. How useful would that be? Well, I've read old books in digitized page image form, and it wasn't a whole lot of fun. Almost better to read the raw OCR'ed versions.
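To make the newline complaint above concrete, here is a minimal Python sketch. It assumes the rules in question are the familiar PG-text habit of hard-wrapping each paragraph at a fixed width, with a blank line between paragraphs (that is my reading, not something spelled out above). The function name `unwrap_paragraphs` and the sample text are made up for illustration.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines back into one line per paragraph.

    Assumption: a blank line separates paragraphs, and every other
    newline is just a wrap point with no meaning of its own.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Collapse each paragraph's internal newlines and runs of spaces.
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)


# Hypothetical sample: one wrapped prose paragraph, then two lines of verse.
sample = (
    "It was a dark and stormy night; the rain\n"
    "fell in torrents.\n"
    "\n"
    "Roses are red,\n"
    "Violets are blue.\n"
)

result = unwrap_paragraphs(sample)
print(result)
```

The prose paragraph comes back fine, but the two lines of verse get merged into one, because in the file a wrap newline and a meaningful line break look exactly the same. Telling them apart is the sort of thing that sends you back to the page images.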