
gardner said:
I'm going to take this as a jumping-off point for a more general question about whether pagination of a published edition is worth saving. Obviously there is a range of opinion. I'll give you mine.
yes, there is a range of opinion. i can give arguments -- even good arguments -- on both sides. which obviously means that _some_ people have good arguments for retaining pagination. and you disenfranchise those people entirely when you throw out the pagination, no matter how good your intentions might be for doing so. i'd rather not disenfranchise those people... so i think it's necessary to include the pagination, and the original linebreaks, with end-line hyphenates. now, i think it's _imperative_ that we give people tools that enable them to discard that pagination, and unwrap those original linebreaks, and rejoin end-line hyphenates. to do otherwise would be to disenfranchise _those_ people, and i'd rather not disenfranchise them either. so, for me, the answer to the question is extremely simple.
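
to make "tools that enable them to discard" concrete, here is a minimal sketch of such a tool in python. it is only an illustration under stated assumptions, not anyone's actual implementation: the `[page N]` marker format and the names `reflow` and `join_lines` are hypothetical, paragraphs are assumed to be separated by blank lines, and every end-line hyphen is assumed to be a line-break hyphen (a real tool would need a dictionary check so it doesn't mangle true compounds like "end-line").

```python
import re

# hypothetical page-marker format: "[page 87]" on a line by itself
PAGE_MARKER = re.compile(r"^\[page \d+\]$")


def join_lines(lines):
    """Join the original lines of one paragraph into a single line."""
    out = ""
    for line in lines:
        if out.endswith("-") and not out.endswith("--"):
            # naive assumption: a single trailing hyphen is a line-break hyphen
            out = out[:-1] + line
        elif out:
            out += " " + line
        else:
            out = line
    return out


def reflow(text):
    """Drop page markers, rejoin end-line hyphenates, unwrap linebreaks."""
    paragraphs = []
    current = []
    for raw in text.splitlines():
        line = raw.strip()
        if PAGE_MARKER.match(line):
            continue  # discard pagination; a page break does not end a paragraph
        if not line:
            # a blank line closes the current paragraph
            if current:
                paragraphs.append(current)
                current = []
            continue
        current.append(line)
    if current:
        paragraphs.append(current)
    return "\n\n".join(join_lines(p) for p in paragraphs)
```

the point of the sketch is the asymmetry: going from the preserved form to the reflowed form is a few dozen lines of code, while going the other way is impossible without returning to the page scans.
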
What I believe, philosophically, I am shooting for is to capture the core content, and reject the details that have mainly to do with the medium of publication.
i can understand that perspective. i can also understand the other perspective. and i see no reason that anyone has to be unhappy here. it's very important to understand that this does _not_ have to be an either/or question. we _can_ do _both_...
Maybe there are possible future uses of my text that would want the things that I left out. I tend to doubt that this could ever be very important.
well, then, your imagination is starting to lag behind... :+) because we are now right in the middle of a situation here where "the things that you left out" _are_ "very important", namely a reproofing of your book, to test your accuracy... it's an order of magnitude more difficult to proof a book when the text has lost all of its linebreaks and pagination. are you of the opinion that the future will simply _accept_ that you did a perfect job in the digitization of your books? or do you think they will want to verify the quality of them? if you make it too difficult for them to undertake that job, they will just toss out your text and start anew. your loss...
As an individual contributor I do not feel that my time is best spent capturing and encoding that, and so I don't.
except the info was already captured. then you threw it away.
And I am happy that PG finds my efforts acceptable despite this deficiency.
except the future will throw out all the d.p. works because your deficiency is shared by the entire d.p. corpus, sadly... (even the d.p. people who save pagination toss linebreaks.)
I haven't done any sort of real research, but a quick look shows me that not many texts attempt to preserve line endings in any way. Preserving line endings seems quite unpopular.
the future needs to future-proof tens of millions of books... they can afford to throw out everything done up to this point, if they feel they need to, and they will, they most certainly will. (and advances in o.c.r. and o.c.r. correction will make it easy.) (well, as i pointed out, they might use some of the current texts to proof the new o.c.r. that they do, but then they'll toss them.)
Bowerbird wants to keep both
actually, i don't need to take a personal stand on the issue, not as an end-user. and that's a good thing, because often i don't have a need for the original linebreaks or pagination. so i definitely want the option of discarding that information. what i am saying is that, as a "best practice" for digitization, the discarding of such information is clearly a terrible mistake. and if you're doing it simply because _you_ don't see a need for that information, then you're being selfish and shortsighted. plus you're giving an ultimatum to people who want that info: they're forced to toss your text as failing to meet their needs. and i am positive that you're going to lose that bet, gardner...
-bowerbird
p.s. by the way, i found 23 discrepancies in the paragraphing between your version of "the advocate" and archive.org's o.c.r. however, 21 were errors in the o.c.r., and only 2 in your book. the two errors in your book, both of them missed paragraphs:
http://z-m-l.com/go/gardn/gardnp087.html "What, take the bird back to the bush where we
http://z-m-l.com/go/gardn/gardnp101.html Yet who shall blame the sun and moon for that?