New subject: Basic simple test case.

10 Oct 2012

Why? To discuss as an illustration of proper handling of the issues you
list, and others; and to see what alternative markup schemes look like in
action.

Sent from my Phone
------------------------------
From: James Adcock
Sent: 10/10/2012 4:53 AM
To: Project Gutenberg Volunteer Discussion
Subject: Re: [gutvol-d] Basic simple test case.

Re 14668:

Well, the first question would be: Why?

Contrary to the idea that PG needs to scale up efforts 10X and “do
everything” maybe the right answer is to scale DOWN things by 10X and fix
the books that people actually want to read, but which are currently
hopelessly gone moldy, rather than offer more kiddie readers?

Secondly, one needs to get page scans, which are at least available from
Google in a variety of editions; you’d have to pick one.

In terms of the current “automagic” HTML conversion from txt, this txt
shows the problem that PG isn’t even currently “correctly” specifying that
similar <p> formatting be used on each device.  Seems given the PG txt
conventions, PG should be specifying “no indent, 1em of white space between
paragraphs” for the <p> styling – so at least the basics match the txt
styling.  This is important because txt “formatters” implicitly are using
the txt formatting rules as an element of the formatting – i.e. syntax vs.
semantics **cannot** be uniquely determined automagically by examining a PG
txt file, so the best one can hope to do is to emulate the PG txt layout.
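
To make the "emulate the PG txt layout" idea concrete, here is a minimal Python sketch. It is not PG's actual conversion code; the function name, the inline style string, and the assumption that paragraphs are separated by blank lines are all illustrative. Each paragraph gets no indent and 1em of white space above it.

```python
import html
import re

# Hypothetical style: no first-line indent, 1em of space above each paragraph.
P_STYLE = "text-indent: 0; margin: 1em 0 0 0;"


def txt_to_html_paragraphs(text):
    """Convert blank-line-separated plain text into styled <p> elements."""
    # Normalize line endings, then split wherever one or more blank lines occur.
    text = text.replace("\r\n", "\n")
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        if not block.strip():
            continue  # skip empty input
        # Re-join the hard-wrapped lines of a paragraph into one line.
        body = " ".join(line.strip() for line in block.split("\n"))
        paragraphs.append('<p style="%s">%s</p>' % (P_STYLE, html.escape(body)))
    return "\n".join(paragraphs)


if __name__ == "__main__":
    print(txt_to_html_paragraphs("First line\nof one paragraph.\n\nA second one."))
```

This treats every block as an ordinary paragraph, which is exactly the limitation described above: the layout can be copied, but the meaning of a block cannot be recovered from the txt alone.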

In terms of hand-recoding the html/epub/mobi there appear to be no great
problems other than understanding and dealing with the issue of
merged/rounded top/bottom margins or not, which can be dealt with in the
standard manner of using top margins only.

In terms of design issues, there appear to be minor issues of poetry – not
hard since the poetry lines are short.  (How to “correctly” autowrap lines
of poetry remains problematic in html, since html doesn’t support poetry.)

There are issues of quasi-table listings of words, where the traditional
solution is simply to linearize the lists.  I.e., these word lists were
“packed” on paper to save paper, but on ebook devices vertical landscape is
“free” [horizontal landscape however definitely **is not**] so the word
lists can simply be “unpacked.”

And there appears to be a minor issue of plain rules vs. decorative rules.

* *

But all this would still beg the first question: Why?

Who is the customer?

What parent would want this for their kid today? Seriously?

Is some researcher interested in this for historical reasons?  Well –
frankly they would be better off examining the bitmap scans.

Fundamentally, you can’t code anything reasonable unless you decide who the
customer is, and how they are actually going to be using your efforts.
