re: [gutvol-d] the end of the line

26 Jun 2006

      jeroen said:
...
Although I agree with Michael that there is no need 
   to preserve things as linebreaks in most texts --
ok, well you and michael agree.   that's good.          :+)

but what do you say to end-users who want that info?

somehow, "tough luck, kid, _we_ don't think it's necessary"
doesn't sound like the kind of thing _i_ want to tell people.

because that's the type of statement that makes people
go off to a different cyberlibrary.   that's my whole point.

(and to all of the other people who responded on similar
"theoretical" grounds, i'm truly sorry you missed the point.)
...
if you really need to go to that level of detail, there 
   is always the original or the scans to fall back upon
well, neither of those gives you the flexibility of digital text.

but yes, a tight coupling of the two forms is the best method.
you will note that those "digital reprints" from jose menendez
allow a reader to summon the scan of the page with one click.

(since the page already looks like the scan anyway, there might
be little reason to do it, though, except to verify that similarity.
but this constant willingness to demonstrate the verisimilitude
will be the proof that makes people comfortable with the use of
the smaller-sized digital reprint, with its expanded functionality,
as opposed to the bigger, slower, dumber collection of scans.
anyone who has proofread a scan against reflowed text knows
the reflowing makes that task immensely more difficult though,
so you'll never attain the same confidence in the text's accuracy.)
...
I want to make a case for preserving page numbers, 
   if not at least as recognisable anchors in text, and only for 
   those books being referenced to regularly by other books.
page-numbers are retained in many e-texts these days...

but i'm sure you remember we all had this same argument
about page-numbers.   i'm confident that -- down the line --
sentiment will similarly change to be in favor of line-breaks.

in general, i've just been content to wait it out until the change;
but seeing all the e-texts as they cross my screen downloading
made me realize again the sadness of the discarded line-breaks.
...
This excludes most fiction, but is particularly important for 
   scientific works, which have constructed a kind of paper web 
   with cross references mainly based on page numbers.
there are plenty of cross-references made to works of fiction.

and the concept of "books reading each other" would require
that _all_ of our books are brought under the same umbrella...
...
In long term, such references of course should give way to proper 
   references to the actual paragraph or sentence being referenced
good!   you recognize the need for a finer-grained pointer than the page.

because that's the kind of thinking that leads to line-break retention.

you can narrow things down rather specifically when you point to the
range that's represented from page-19-line-7 to page-21-line-14, 
or from page-87-line-6 to page-87-line-8, can't you?   not only that,
this kind of reference also works for the person who only has the paper
copy of the book, not the e-book, if the two are duplicates of each other.
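
(a minimal sketch of that kind of pointer, in python. it is not the code
from my viewer-program. it assumes each book is held as a dict that maps
a page-number to the list of lines on that page, with line-breaks kept,
and the names parse_ref and lines_in_range are made up for illustration.)

```python
import re

# A reference looks like "page-19-line-7" (format taken from the post above).
REF = re.compile(r"page-(\d+)-line-(\d+)")


def parse_ref(ref):
    """Turn 'page-19-line-7' into the tuple (19, 7)."""
    match = REF.fullmatch(ref)
    if match is None:
        raise ValueError("not a page-line reference: " + repr(ref))
    return int(match.group(1)), int(match.group(2))


def lines_in_range(pages, start_ref, end_ref):
    """Return every line from start_ref through end_ref, inclusive.

    `pages` is an assumed structure: {page_number: [line1, line2, ...]},
    with lines numbered from 1 as they are on the printed page.
    """
    start = parse_ref(start_ref)
    end = parse_ref(end_ref)
    selected = []
    for page_no in sorted(pages):
        for line_no, text in enumerate(pages[page_no], start=1):
            # Tuples compare page first, then line, which is the order we want.
            if start <= (page_no, line_no) <= end:
                selected.append(text)
    return selected
```

the same two references mean the same thing to a person holding the paper
copy, which is the whole reason to keep the line-breaks in the first place.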

and that's precisely the type of capability i'll have in my viewer-program.

even in a traditional browser, it wouldn't be hard to implement something
roughly equivalent, though.    the user could specify some text with a link,
and after going to the precise point of the link, the browser could then
execute a "find" command for the specified text.   it wouldn't be hard at all,
and would seem to give a rather exact form of pointing to a specific place.
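
(here is a rough sketch of that "jump to the link, then find the text" idea,
in python. it is only an illustration, not what any browser actually does.
it assumes the same made-up book structure, a dict from page-number to the
list of lines on that page, and resolve_link is a hypothetical name.)

```python
def resolve_link(pages, page_no, line_no, phrase):
    """Go to (page_no, line_no), then 'find' phrase from there onward.

    Returns the (page, line) where the phrase begins, or None if it does
    not occur at or after the anchor. `pages` is an assumed structure:
    {page_number: [line1, line2, ...]}, lines numbered from 1.
    """
    positions = []
    chunks = []
    for p in sorted(pages):
        for n, text in enumerate(pages[p], start=1):
            if (p, n) >= (page_no, line_no):
                positions.append((p, n))
                chunks.append(text)

    # Join lines with a single space so a phrase that runs across a
    # line-break can still be found.
    joined = " ".join(chunks)
    offset = joined.find(phrase)
    if offset < 0:
        return None

    # Walk the lines again to see which one holds the first matched character.
    start = 0
    for position, text in zip(positions, chunks):
        end = start + len(text)
        if offset <= end:
            return position
        start = end + 1  # skip the joining space
    return None
```

the link itself carries only a position and a few words of text, so nothing
has to be added to the book in order to point into it.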

it has the benefit of being implemented entirely outside of the document,
as well, which i see as being tremendously important.   if all our links need
markup in the original document to be implemented, as is the present case,
we're _never_ going to be able to quickly get to a point of profuse interlinks.
we'll get thoroughly bogged down in the quicksand of heavy markup first...
(for an example of that, take a look at the markup which jon noring posted,
and then read through that particular diversion of this thread.   the horrors!)
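
(to make "outside of the document" concrete, a small python sketch follows.
the links live in their own list, and the book files are never touched.
the Link fields and the links_from helper are names i made up here, and the
book ids in the usage note are just placeholders.)

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One link, stored apart from both the source and the target text."""
    source_book: str   # id of the book the link starts in
    source_ref: str    # e.g. "page-19-line-7"
    target_book: str   # id of the book the link points into
    target_ref: str    # e.g. "page-87-line-6"
    phrase: str = ""   # optional text to "find" after arriving


def links_from(links, book):
    """All links that start in the given book."""
    return [link for link in links if link.source_book == book]


def links_into(links, book):
    """All links that point into the given book (the incoming ones)."""
    return [link for link in links if link.target_book == book]
```

adding a thousand links this way means adding a thousand records to the
list, with no markup at all going into the thousand books they point at.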
...
but as a practical ad-interim solution, staying with page numbers will 
   increase the number of texts we can digitize with our limited means.
it doesn't cost anything to retain the line-break information.
...
I would however, like to see the collection be incorporated in a kind of
...
wiki-like system, where people can add -- without tampering with the static
   source texts -- annotations, add tagging and create live cross references
i've had a demo up for some time now showing "continuous proofreading".
...
http://users.aol.com/bowerbird/proof_wiki.html
i also used a similar template in these demo-books:
...
http://www.greatamericannovel.com/mabie/mabiep001.html
   http://www.greatamericannovel.com/myant/myantc001.html
  http://www.greatamericannovel.com/ahmmw/ahmmwc001.html
   http://www.greatamericannovel.com/sgfhb/sgfhbc001.html
this system could easily be elaborated upon to build what you requested here.

indeed, i will be pouring all of the p.g. texts that i'll be handling --
perhaps some 5000-6000, as near as i can tell -- into just this type of
system, within the next 6 months, and i would be open to any ideas that
you might have...

heck, design a webpage to do what you want, and i will use it as the template.
you know me, i don't even care if it "validates", as long as it's easy and
it works.

***

andrew said:
...
There are places such as wikisource.org, where you could add the texts 
   and start providing links such as you mention here immediately.
i'll check out wikisource.org to see what kind of capabilities they offer.

in the past, when i've looked at existing sites, it has seemed that wikis
aren't geared to do things -- like populate pages -- on a massive scale.

even rather fundamental things like batch f.t.p. are sometimes missing.
and when you're dealing with thousands, or tens of thousands, of files,
it becomes absolutely necessary to deal with them in a template fashion.
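
(by "template fashion" i mean something along these lines. this is only a
python sketch using the standard string.Template class. the page layout
and the render_pages name are invented for the example, and it is not the
template behind the demo-books.)

```python
import html
from string import Template

# One page layout, reused for every text. The layout is a made-up example.
PAGE = Template(
    "<html><head><title>$title</title></head>\n"
    "<body><pre>$body</pre></body></html>\n"
)


def render_pages(texts):
    """Build one html page per text.

    `texts` maps a short name to the plain text of that book or page.
    Returns {filename: page_source}; writing the files out, or uploading
    them in a batch, is left to the caller.
    """
    pages = {}
    for name, body in texts.items():
        pages[name + ".html"] = PAGE.substitute(
            title=html.escape(name),
            # <pre> keeps the line-breaks of the text exactly as they are.
            body=html.escape(body),
        )
    return pages
```

the point is that the loop does not care whether it is handed five texts or
five thousand, which is what a hand-edited wiki page cannot offer.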

i also think there's a good reason jeroen asked for a "wiki-like system",
and not a wiki per se, as indicated by his concern about "tampering"
with the static source texts.   the thought is that the original source
-- and indeed, the string of comments as well -- must be inviolate.

that's because the idea is to build a body of thought around a text,
of which links -- intrasystem, and outgoing and incoming -- are
a very crucial aspect.   and it's not possible to link into a wiki proper,
because what was there yesterday might well be gone today, only to
reappear in different form tomorrow.   you can't link into a pile of sand.

oh sure, you could instruct users to leave link markup untouched.
and they might even follow your instructions.   (yeah, right.)   still,
that will interfere with refactoring, and get very crufty before long.

besides, a good part of the give-and-take of this kind of conversation
involves letting all of the arguments stand, rather than editing them.
(and especially rather than "editing them by deletion".)   let the future
examine all the arguments, and see which ones stand the test of time.

so you need to have stability for the process itself, not just for the links.

-bowerbird

p.s.   jeroen, if you want to provide me a template, i could use it sooner
rather than later, the better to architect it into my overall work-flow...


Bowerbird＠aol.com