
----- Original Message -----
From: "Jim Adcock" <jimad@msn.com>
To: "'Project Gutenberg Volunteer Discussion'" <gutvol-d@lists.pglaf.org>
Sent: Monday, September 14, 2009 8:31 PM
Subject: [gutvol-d] Re: In search of a more-vanilla vanilla TXT
....but as long as you get the words, who cares what the quote marks look like?
There are a lot of texts where you cannot "get" the words from just the words. There are also texts with quotes within quotes, where if you don't care what the quote marks look like _you cannot read it!_
I think I, and any other followers of this thread, will need an example of "not getting the words from the words". I've seen any number of instances of nested quotes, mostly nested doublequotes, lots of triple-nested double-single-double quotes, and some triple-nested single-double-single quotes (mostly in British-published books) and I have yet to encounter any that I couldn't read, either in the original source or when they've been etexted.
Certainly a text like Tristram Shandy demonstrates there are books which are NOT just about the words -- where rather, the artistry of representing words on paper -- including careful choice of fonts, puncs, etc. -- is a central part of the artistry -- as one can easily see by comparing a bad publication of this work to a good one! The good publications represent the work of the artist, the bad ones clearly do not. And a txt representation would be just so many chicken scratchings in the mud.
I've looked at PG's text and HTML version of Shandy, and several PDFed scansets in Internet Archive. Unless I'm missing something, they all look like standard prose to me. If you've got an edition as difficult to transcribe as you seem to indicate, and it's not in Internet Archive, you should scan it, and if you have no interest in producing it yourself, upload the zipped scanset via FTP to PG (I can give exact instructions to you privately). As long as it's clearable, it may be possible to arrange for it to go into PG's Preprints page where it'll be available as a project for someone.
I'm sure there are many here who would say "but I don't like Tristram Shandy" -- and that would be my point. By bringing a prejudice to the table that only texts worth representing in txt are worth representing, you prejudice what books PG is allowed to preserve, and you censor the choice of artists that others are permitted to preserve. You represent some artists, and consign the others to oblivion.
Personally, I'm book-agnostic--as long as it's in English, a book is a book is a book. I would assume that those who produce books for PG in other languages feel the same way about books in those languages. Distributed Proofreaders, at least once, has produced a book in a language none of its proofers understood (#27120).