Re: so what is so important about pagination?

michael said:
I had to have read the whole thing to get to the part I quoted. . .duh!
you can quote something without reading it. but i should have said i wish you would have _understood_ what i was saying, because then you would have understood that your response didn't address the point that i was making.
When you ask people to pay attention, it helps to PAY ATTENTION.
i am paying attention, michael. even though i've heard what you're saying lots and lots and lots of times before. and it makes sense when you are addressing the people who are making _other_ points. but they left long ago...
It addresses EXACTLY the point you made that I quoted. . . .
no, it doesn't.
Then SAY that!!! Right up front in plain language!!!
i _have_ said it, every time i've talked about this issue... including once when marcello brought up this same point you brought up (think about that, michael), about editions. (it was back in september of 2007, if you're curious.) marcello pointed out that the various different editions of "pride and prejudice" had different pagination, and he asked which of the editions should be used to do the pagination... here's the reply i made to marcello's point:
if you don't care to read all of it, here are the guts of that reply:
the answer to the question as to which set of linebreaks and pagebreaks to use is this: the ones in the edition you digitize.
plain old common sense. if you didn't already know the answer, perhaps you might want to exercise your brain a little bit more...
if you're digitizing the 1844, use its linebreaks and pagebreaks. if you're digitizing the 1853, use its linebreaks and pagebreaks. if you're digitizing the 1870, use its linebreaks and pagebreaks. if you're digitizing the 1892, use its linebreaks and pagebreaks.
and here is a web-page showing the first page of those 4 editions:
and yes, in case you're wondering, if a p-book was important enough to go through different editions, we should digitize _every_ edition...
i'm not going to tell you which of those 4 editions you _should_ use, which one is the "right" one. whichever one you want to use is "right". and you should be able to determine if any specific e-book _does_ or _does_ not match the edition you want to use, or some other edition.
***
by the way, there's another web-page of interest in this directory:
this fascinating page shows some work done by jose menendez. jose adopted my suggestion that the e-book be able to mimic the p-book, and he created a series of .pdf books that did just that... shown on this web-page are some screenshots of his .pdf-books, compared to the page-scans from those pages. he did a great job. of course, since a .pdf-book is unable to reflow its text, jose's work doesn't fit the more-important criterion of reflowability, but it does show that the ability to mimic the original can be extremely valuable.
However, that still relegates us to being a Xerox machine, no?
no. because a xerox machine can't do reflow. or fix typos. or pull in spacey contractions. or change the font, or size. look, i understand the appeal of digital text _extremely_ well. i've made all of the arguments myself, so there's no need for you to repeat 'em back at me, you're just wasting your breath. but there's a problem looming here, a problem that the future will have to face, and solve, and i'm telling you what you need to do, so that you can _help_ the future _solve_ that problem, such that your e-texts will continue to be used, and not tossed. i'm on _your_ side, michael... i've got your back, good buddy... so you need to get that through your skull and start listening...
I'm never going to get into any of these semantic arguments!!!!!!! Mimic means to copy as closely as possible. . . . Synonym: copy.
it's not a semantic argument, michael. it's protective coloration. if your copy isn't capable of _assuming_the_look_and_feel_ of the thing that it _purports_ to be copying, nobody will trust it. you seem to be forgetting that you are claiming to _be_ a copy. perhaps you are an "improved" copy, but you are _still_ a copy. certainly if you came out and said "we rewrote parts of the middle, because the original was too boring", you would expect that people would throw you away. but what if someone points out a few errors in your work, and says "see, you can't trust this work, it hasn't been faithfully transcribed," then what is your defense? you can say that "it was just a few errors", but what if they then point out a few more, and a few more after that? at what point can you no longer expect the end-user to believe you?
As I have said before, if you would listen, I am not AGAINST keeping a copy with such pagination for such purposes
well, good, and bully for you, and all that, but the fact of the matter is that project gutenberg is not, at this point in time, actually doing that.
but I draw the lines, pun intended, at keeping every character in the same page position when there is no need for pages, in all available PG editions.
and, as i have said before, if you would listen, i'm not suggesting, in any way, shape, or form, that pagination and linebreaks need to be kept "in all available p.g. editions". that'd be stupid, absolutely and totally and ridiculously _stupid_, and i don't usually feel a need to rule out stuff that is absolutely, totally, ridiculously stupid...
I want our eBooks to be optimally readable: Minimal end of line hyphenation. No page headers or footers. Just plain reading.
i'm 100% in favor of that. and i have demonstrated before, and will be happy to demonstrate once again, any time that you like, how p.g. could save its texts in a format that allows verification of the type that i am talking about, _and_ allows the end-user to have the text exactly like you specify.
Once again, I have no stance AGAINST people who want pagination, I just don't want to force any such arbitrary formats on anyone and neither should you or anyone else. STOP TRYING TO FORCE YOUR OPINIONS ON OTHERS, MAKE THEM OPTIONS!
see, that's precisely why i said you're not listening to me, because there's no way in the world i would try to "force" this on end-users. i've never, ever, said _anything_ even _remotely_ like that, in all the years i've been on this listserve, or the decades i have done e-books, so i don't know who you're having this conversation with, michael, but it's obviously not me.
I CAN tell you that most of the paper editions' page numbers will fade along with the hyphenation.
no, they won't. because our cultural heritage is full of references to page-numbers, and it'll be several orders of magnitude cheaper and more efficient to keep track of those page-numbers than to attempt to re-do all those references using hyperlinks or whatever.
Last time I looked there were still pretty ubiquitous programs to lay out all such differences.
IFF you have such deep interests, you can simply put up two editions side by side when you look at them. . .I do. . . .
If not, then you aren't really that interested. . .it's all smoke.
this is very amusing. i do this kind of work, michael. regularly. and i can tell you that it's not nearly as simple as you make it sound.
"_RIGHT_" copy??? Now you've contradicted yourself back into the ivory tower. . . . "_RIGHT_" copy, indeeeeed. . . .
you just can't _wait_ to jump to the wrong conclusion, can you? by "right" copy, i mean the one that the person _wants_ to see.
This will ONLY do you any good if you manage to find that edition, out of all the other paper editions in the world.
again, "that edition" is whatever edition the person wants to use. i have a paper copy of "catcher in the rye" on my bookshelf now. let's say, 10 years on, i can find a dozen digital versions online. let's also say that some analysis shows differences between them. i haven't compared all, not in full, but i know there's some diffs. i don't want a dozen different versions. i want the one that matches the paper copy that has been sitting on my bookshelf for 4 decades. how do i determine which one -- _if_any_ -- is the same version as the one that is sitting on my shelf? that's the difficult question.
Sorry, but I anticipated ALL of these questions when I first started, and have answered, and will continue to answer, at length.
no, you didn't answer the question. so i just asked it to you again.
Why can't you just propose your ideas as OPTIONS, not CARVED IN STONE?
stop making the thread ridiculous. no one can carve anything in stone any more. -bowerbird
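To illustrate the kind of format argued for above (a text that keeps the linebreaks and pagebreaks of the edition it was digitized from, yet can still be reflowed for plain reading), here is a minimal sketch. It is not the format proposed in the thread; the `[[page N]]` marker, the function names, and the page split in the sample are all assumptions made up for the example, and end-of-line hyphenation is deliberately not handled.

```python
import re

# Hypothetical marker: a line such as "[[page 12]]" starts printed page 12.
PAGE_MARK = re.compile(r"^\[\[page (\d+)\]\]$")

# Sample text; the line and page breaks here are invented for illustration.
SAMPLE = """[[page 1]]
It is a truth universally acknowledged,
that a single man in possession of a
[[page 2]]
good fortune, must be in want of a wife.

However little known the feelings or
views of such a man may be,"""


def pages(marked):
    """Map page number -> list of printed lines, for checking against scans."""
    result = {}
    current = None
    for line in marked.splitlines():
        m = PAGE_MARK.match(line)
        if m:
            current = int(m.group(1))
            result[current] = []
        elif current is not None and line.strip():
            result[current].append(line)
    return result


def reflow(marked):
    """Drop page markers and printed line breaks; keep paragraph breaks."""
    paragraphs = [[]]
    for line in marked.splitlines():
        if PAGE_MARK.match(line):
            continue  # a page break does not end a paragraph
        if line.strip():
            paragraphs[-1].extend(line.split())
        elif paragraphs[-1]:
            paragraphs.append([])  # blank line: start a new paragraph
    return [" ".join(words) for words in paragraphs if words]
```

The same stored file serves both uses: `pages(SAMPLE)` gives the page-by-page view for comparison with a scan, and `reflow(SAMPLE)` gives unbroken paragraphs with no page artifacts.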

On Mon, 22 Feb 2010, Bowerbird@aol.com wrote:
michael said:
I had to have read the whole thing to get to the part I quoted. . .duh!
you can quote something without reading it.
YOU can. . .or at least without understanding it.
but i should have said i wish you would have _understood_ what i was saying, because then you would have understood that your response didn't address the point that i was making.
Since it is YOUR point, only YOU "could have understood. . . ."
When you ask people to pay attention, it helps to PAY ATTENTION.
i am paying attention, michael. even though i've heard what you're saying lots and lots and lots of times before.
Please elucidate, without all the verbiage. . . .
and it makes sense when you are addressing the people who are making _other_ points. but they left long ago...
It addresses EXACTLY the point you made that I quoted. . . .
no, it doesn't.
As above. . . .please.
Then SAY that!!! Right up front in plain language!!!
i _have_ said it, every time i've talked about this issue...
you're not saying it here and now. . . .
including once when marcello brought up this same point you brought up (think about that, michael), about editions. (it was back in september of 2007, if you're curious.)
marcello pointed out that the various different editions of "pride and prejudice" had different pagination, and he asked which of the editions should be used to do the pagination...
here's the reply i made to marcello's point:
if you don't care to read all of it, here are the guts of that reply:
the answer to the question as to which set of linebreaks and pagebreaks to use is this: the ones in the edition you digitize.
plain old common sense. if you didn't already know the answer, perhaps you might want to exercise your brain a little bit more...
if you're digitizing the 1844, use its linebreaks and pagebreaks. if you're digitizing the 1853, use its linebreaks and pagebreaks. if you're digitizing the 1870, use its linebreaks and pagebreaks. if you're digitizing the 1892, use its linebreaks and pagebreaks.
and here is a web-page showing the first page of those 4 editions:
If you had mentioned. . . .
and yes, in case you're wondering, if a p-book was important enough to go through different editions, we should digitize _every_ edition...
I'm still happy to put out one edition that gets across 99% of meaning, at least until we pass into the millions of books, with few exceptions. Since I actually remember some of what I read in P&P, I am not at all sure I would have gotten an extra 1% out of it, no matter how well done. Not sure I could say the same about Shakespeare, though, though THAT was botched hugely in at least one edition, eh? Italians say you only need ONE edition of Dante. . . .
i'm not going to tell you which of those 4 editions you _should_ use, which one is the "right" one. whichever one you want to use is "right". and you should be able to determine if any specific e-book _does_ or _does_ not match the edition you want to use, or some other edition.
As mentioned elsewhere, this is someone's SUBJECTIVE DECISION. Sorry, that's where I get off this train of thought.
***
by the way, there's another web-page of interest in this directory:
this fascinating page shows some work done by jose menendez.
jose adopted my suggestion that the e-book be able to mimic the p-book, and he created a series of .pdf books that did just that...
shown on this web-page are some screenshots of his .pdf-books, compared to the page-scans from those pages. he did a great job.
of course, since a .pdf-book is unable to reflow its text, jose's work doesn't fit the more-important criterion of reflowability, but it does show that the ability to mimic the original can be extremely valuable.
My comments about .pdf are well known.
However, that still relegates us to being a Xerox machine, no?
no. because a xerox machine can't do reflow. or fix typos. or pull in spacey contractions. or change the font, or size.
OK, a fancy Xerox, but still confined to one edition, still makes it hard to be sure. What about "freindship" or whatever?
look, i understand the appeal of digital text _extremely_ well. i've made all of the arguments myself, so there's no need for you to repeat 'em back at me, you're just wasting your breath.
Then there's no sense talking, is there?
but there's a problem looming here, a problem that the future will have to face, and solve, and i'm telling you what you need to do, so that you can _help_ the future _solve_ that problem, such that your e-texts will continue to be used, and not tossed.
i'm on _your_ side, michael... i've got your back, good buddy...
so you need to get that through your skull and start listening...
I'm never going to get into any of these semantic arguments!!!!!!! Mimic means to copy as closely as possible. . . . Synonym: copy.
it's not a semantic argument, michael. it's protective coloration.
if your copy isn't capable of _assuming_the_look_and_feel_ of the thing that it _purports_ to be copying, nobody will trust it.
This is exactly what they said about The Gutenberg Press. . . . And why Gutenberg wasted so much time putting in scriptorium marks. "Nobody will trust it"??? When we get to the point where people are arguing about which eBook of P&P is the best, and I have given away more than one edition, it becomes moot: eBooks will have already won the day when this is the case with all eBooks, or even most of them. . . . At that point _I_ am willing to shout "GAME OVER. . .I WIN. . .!!!" And leave the field of play to the nitpicker ivory tower types. My audience are those who are being exposed to Shakespeare, Austen, Dante, Doyle, etc., not those who are already infected. . . .
you seem to be forgetting that you are claiming to _be_ a copy. perhaps you are an "improved" copy, but you are _still_ a copy.
certainly if you came out and said "we rewrote parts of the middle, because the original was too boring", you would expect that people would throw you away.
Non-sequitur.
but what if someone points out a few errors in your work, and says "see, you can't trust this work, it hasn't been faithfully transcribed," then what is your defense? you can say that "it was just a few errors", but what if they then point out a few more, and a few more after that? at what point can you no longer expect the end-user to believe you?
Just a byte sequitur. Same true of paper editions.
As I have said before, if you would listen, I am not AGAINST keeping a copy with such pagination for such purposes
well, good, and bully for you, and all that, but the fact of the matter is that project gutenberg is not, at this point in time, actually doing that.
Only a bother to a minimal portion of the audience.
but I draw the lines, pun intended, at keeping every character in the same page position when there is no need for pages, in all available PG editions.
and, as i have said before, if you would listen, i'm not suggesting, in any way, shape, or form, that pagination and linebreaks need to be kept "in all available p.g. editions". that'd be stupid, absolutely and totally and ridiculously _stupid_, and i don't usually feel a need to rule out stuff that is absolutely, totally, ridiculously stupid...
Then say it is an OPTION. . . .
I want our eBooks to be optimally readable: Minimal end of line hyphenation. No page headers or footers. Just plain reading.
i'm 100% in favor of that. and i have demonstrated before, and will be happy to demonstrate once again, any time that you like, how p.g. could save its texts in a format that allows verification of the type that i am talking about, _and_ allows the end-user to have the text exactly like you specify.
Only in that limited specification did I specify, not in general.
Once again, I have no stance AGAINST people who want pagination, I just don't want to force any such arbitrary formats on anyone and neither should you or anyone else. STOP TRYING TO FORCE YOUR OPINIONS ON OTHERS, MAKE THEM OPTIONS!
see, that's precisely why i said you're not listening to me, because there's no way in the world i would try to "force" this on end-users.
Then say OPTION!!!
i've never, ever, said _anything_ even _remotely_ like that, in all the years i've been on this listserve, or the decades i have done e-books, so i don't know who you're having this conversation with, michael, but it's obviously not me.
Learn to speak more clearly and lucidly to get the answers you want.
I CAN tell you that most of the paper editions' page numbers will fade along with the hyphenation.
no, they won't. because our cultural heritage is full of references to page-numbers, and it'll be several orders of magnitude cheaper and more efficient to keep track of those page-numbers than to attempt to re-do all those references using hyperlinks or whatever.
Only for that minimal audience following such references. However, if references were made to PHRASES and NOT PAGES, this would work EVER SO MUCH BETTER!!!!!!!
Last time I looked there were still pretty ubiquitous programs to lay out all such differences.
IFF you have such deep interests, you can simply put up two editions side by side when you look at them. . .I do. . . .
If not, then you aren't really that interested. . .it's all smoke.
this is very amusing. i do this kind of work, michael. regularly.
and i can tell you that it's not nearly as simple as you make it sound.
But it's not DONE by people "nearly as simple as you make it sound."!!!
"_RIGHT_" copy??? Now you've contradicted yourself back into the ivory tower. . . . "_RIGHT_" copy, indeeeeed. . . .
you just can't _wait_ to jump to the wrong conclusion, can you?
by "right" copy, i mean the one that the person _wants_ to see.
SUBJECTIVE DECISION. . .not my bailiwick. I'm not getting into making THAT kind of choice.
This will ONLY do you any good if you manage to find that edition, out of all the other paper editions in the world.
again, "that edition" is whatever edition the person wants to use.
Not MY person. . .YOUR person. YOU address YOUR audience, _I_ will address MINE. You will have the field all to yourself all too soon as I shuffle off.
i have a paper copy of "catcher in the rye" on my bookshelf now.
let's say, 10 years on, i can find a dozen digital versions online. let's also say that some analysis shows differences between them. i haven't compared all, not in full, but i know there's some diffs.
Not that it will take you that long to do so. . . .
i don't want a dozen different versions. i want the one that matches the paper copy that has been sitting on my bookshelf for 4 decades.
I think you meant "matches the most closely," eh? Rots of ruck finding one that really matches, eh?
how do i determine which one -- _if_any_ -- is the same version as the one that is sitting on my shelf? that's the difficult question.
Not if you just compare the first and last page. If you are NOT willing to do that, you have no right to an answer. You have to LOOK AT THE QUESTION to get an answer, even as a child, though the steps will be smaller.
Sorry, but I anticipated ALL of these questions when I first started, and have answered, and will continue to answer, at length.
no, you didn't answer the question. so i just asked it to you again.
You did??? Did you make it obvious??? Does ANYONE here know the question???
Why can't you just propose your ideas as OPTIONS, not CARVED IN STONE?
stop making the thread ridiculous.
no one can carve anything in stone any more.
Then don't make it sound like that's what you mean. . . . I'm trying to show/teach you how to make your points. . . . You will appreciate that more after I am gone. . . . mh
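The "compare the first and last page" check suggested above is easy to sketch, and the sketch also shows where the disagreement lies. This is only an illustration under assumptions: the function names and the tiny stand-in texts are hypothetical, and real page text would have to be typed in or OCR'ed from the paper copy.

```python
def normalize(text):
    """Lowercase and collapse whitespace so line breaks and spacing are ignored."""
    return " ".join(text.lower().split())


def ends_match(candidate, first_page, last_page):
    """True if the e-text starts with first_page and ends with last_page."""
    body = normalize(candidate)
    return (body.startswith(normalize(first_page))
            and body.endswith(normalize(last_page)))


# Hypothetical stand-ins for two e-texts and a hand-typed first/last page.
etext_a = "Chapter one.\nIt was a dark night.\nThe middle.\nAnd so it ended."
etext_b = "Chapter one.\nIt was a dark night.\nA DIFFERENT middle.\nAnd so it ended."
first = "Chapter one. It was a dark night."
last = "And so it ended."

print(ends_match(etext_a, first, last))  # True
print(ends_match(etext_b, first, last))  # True as well
```

The check is cheap and rules out texts whose opening or closing differs, but both stand-in texts pass it even though their middles differ, so by itself it cannot say which one is the same version as the paper copy.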

The real issue at play here is, I think, one of manuscript preservation versus text preservation. Manuscript preservation is only really relevant to academia, but text preservation is relevant to anyone interested in content. What Bowerbird has been talking about preserving (pagination for each edition, hyphenation, etc.) are manuscript details. Preserving them for each edition seems extreme: how many bargain-basement editions have there been of Austen? Or Twain? Few of these have any critical significance.
Generic machine-readable formats like HTML and plain text are terrible for manuscript preservation. They make wonderful text preservation and reading formats, but there is no way to accurately reproduce the nuances of a manuscript in them. The sad fact is that the discerning reader who needs this information would be much better off examining page scans than using a text or HTML version of the book. Images are, and always will be, the only way to put all of the print information in the user's hands.
Again, let us not fool ourselves. The only people who will be interested in the manuscripts are academics and the oddball home enthusiast. Readers do not care about the original pages. There have been many editions of Twain or Shakespeare. If I read an etext, I cannot give a page number. But if I read a paper copy, the page number does not do much good in ordinary conversation. It is only in an academic atmosphere, where the ability to check sources is crucial, that this matters. In any sort of informal discussion, the answer would be to use textual landmarks: "Chapter III of Huck Finn", "Book I, line 120 of Paradise Lost", etc.
The ultimate form of manuscript preservation is neither digitization nor transcription, but rounding up all the manuscripts to be preserved and sinking them in a huge concrete bunker (like the US Archives do the Declaration of Independence and the Constitution).
So far, PG has been about text preservation. To the larger world, this is of far more importance. Few of us care about the manuscripts of the Iliad, but many of us (including me!) would love the missing texts of the epic cycle.
It comes down, then, to what you are interested in. There is a place, I think, for both, but the intersection is a little peculiar. Scans without any further human intervention are the most time-effective way to digitize a volume. They also have the smallest use case. Transcription (whether through human-proofed OCR or manual typing is irrelevant) has the widest use case, but the greatest time investment.
So, Bowerbird, are you interested in text preservation or in manuscript preservation? If the former, then artifacts of print are immaterial. Not only will they change with digital, but they have changed and will change with print as well. If the latter, then why not create DJVU scans of all pages, replacing the OCR with highly proofed text (drawn, perhaps, from the PG archive)? This would, after all, get you the best of both worlds: a searchable, print-faithful, computer-readable file.
-- Michael McDermott www.mad-computer-scientist.com

On Tue, Feb 23, 2010 at 2:22 PM, Michael McDermott <mmcdermott@mad-computer-scientist.com> wrote:
Readers do not care about the original pages. There have been many editions of Twain or Shakespeare.
But a lot of books aren't Twain or Shakespeare. Most non-fiction books are littered with page numbers that probably should be converted to hyperlinks, but that's a lot of work. And non-fiction books reference page numbers in other books.
-- Kie ekzistas vivo, ekzistas espero.
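For references within a single book, the conversion of page numbers to hyperlinks mentioned above could be roughed out as follows. This is a sketch under assumptions, not anything PG or DP does: the `#page-N` anchor scheme and the function name are hypothetical, only the forms "page N" and "p. N" are recognized, and it presumes the e-text kept an anchor for each original page.

```python
import re

# Matches "page 42" or "p. 42"; other citation styles are not handled.
PAGE_REF = re.compile(r"\b(page|p\.)\s+(\d+)", re.IGNORECASE)


def link_page_refs(text):
    """Wrap in-book page references in links to hypothetical #page-N anchors."""
    def repl(match):
        number = match.group(2)
        return '<a href="#page-{0}">{1}</a>'.format(number, match.group(0))
    return PAGE_REF.sub(repl, text)


print(link_page_refs("See page 42 and p. 7."))
# See <a href="#page-42">page 42</a> and <a href="#page-7">p. 7</a>.
```

References to page numbers in other books cannot be resolved this way, and each link is only as good as the page anchors kept in the transcription.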

....are you interested in text preservation or in manuscript preservation?
PG & DP, while they do good work for society, don't actually do either of those things. What they do is transcription of a book into ASCII or something close to ASCII -- even when transcribing into HTML or ISO. The end result is usually something that is readable and recognizable as being somehow more-or-less related to what the original author wrote and the original author published.
Is it "correct"? Of course not -- one cannot talk about "correctness" when something is
1) intended to be readable by today's audience, and
2) has been transcribed into something that is a small subset of what was available to publishers even by the 1700s
3) the chosen subset is primarily dictated by what can be easily input from a standard IBM chiclet keyboard and more-or-less OCR'ed by standard OCR software
4) a subset of punctuation and simplified punctuation rules have been adopted in practice which differ somewhat from that which obviously the author and publisher put in their books.
One might be tempted to say that what PG & DP actually do is "word preservation", but actually they don't even really do that either. It's really re-interpretation and republishing from one format (on paper, by professional publishers, a long long time ago) into another format: either a PG-specific non-reflowable electronic format built around "teletype" standards of the early 1970s ("ASCII, 70 chars more-or-less per line", similar to AP wire format), or HTML for lowest-common-denominator browsers. In practice that last constraint is more likely set by the HTML-to-EPUB and/or HTML-to-MOBI converter routines and the limitations of EPUB and/or MOBI stand-alone reader hardware. And the republishing is done in a way that might actually be read by one or another target audience on said devices.
Are these efforts successful? I think so. For example, a friend of mine has bought a new iPad and is happily reading a text I produced for PG prior to the iPad's announcement, and my friend didn't even realize that I wrote it in HTML and PG published it -- because of course Apple strips out the PG header and transcriber acknowledgements before converting it to Apple DRM'ed EPUB and redistributing it as "Apple's own free book available only from the Apple iPad Store"! [ Thank You Jobs -- who's "1984" now??? ]

On Fri, Apr 16, 2010 at 7:05 PM, James Adcock <jimad@msn.com> wrote:
Of course not -- one cannot talk about "correctness" when something is 1) intended to be readable by today's audience, and 2) has been transcribed into something that is a small subset of what was available to publishers even by the 1700s 3) the chosen subset is primarily dictated by what can be easily input from a standard IBM chiclet keyboard and more-or-less OCR'ed by standard OCR software 4) a subset of punctuation and simplified punctuation rules have been adopted in practice which differ somewhat from that which obviously the author and publisher put in their books.
One can always talk about correctness; it comes in many different levels and varieties. Just because the New Testament was written in Greek, doesn't mean we can't call an English translation wrong where the Gospel of John starts: "Send David all your money in small unmarked bills." I rather like that translation, but objectively speaking, it doesn't represent the original Greek in any way, shape or form.
-- Kie ekzistas vivo, ekzistas espero.
participants (5)
- Bowerbird@aol.com
- David Starner
- James Adcock
- Michael McDermott
- Michael S. Hart