Re: [gutvol-d] Zero markup

5 Jan 2012

      On Jan 5, 2012, at 2:30 AM, Keith J. Schultz wrote:
...
I believe I can help you on setting up the heuristics that you 
will be needing as this kind of work is right up my alley ...
What I am uncertain about is if you are using scans or text files.

Hi Keith. I'm using a single text file. In particular, I made it work with
the concatenated text file that comes out of DP. My goal is to make it
easier for someone to get into post-processing. I know many want to
but are overwhelmed by PPing, especially the parts that involve formats
other than plain text.

The format recognition code actually works pretty well already and
I'm surprised how fast I can go from DP text file to HTML. I haven't written
back-end generators for other formats, but that would be next. Since it
is a two-step process and since I can manually mark up the infrequent
special cases, it's not that important to work on the heuristics to get it
all just right automagically. It would be important if this were going to be
applied as part of a script to a whole catalog collection, which I won't be
doing anytime soon.
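
A minimal sketch of what a two-step pipeline of this shape could look like
in Python 3. Everything in it is hypothetical: the function names
(classify_blocks, to_html), the heuristics, and the tuple-based intermediate
form are illustrations only, not the actual code or the actual intermediate
format, which this message does not document.

```python
import html
import re


def classify_blocks(text):
    """Step 1 (hypothetical): split plain text into blank-line-separated
    blocks and guess a kind for each one with simple heuristics."""
    blocks = []
    for raw in re.split(r"\n\s*\n", text.strip("\n")):
        if not raw.strip():
            continue  # skip empty chunks, e.g. from empty input
        lines = [ln.rstrip() for ln in raw.splitlines()]
        if len(lines) == 1 and re.match(r"(?i)^chapter\b", lines[0].strip()):
            kind = "heading"   # a lone line starting with "Chapter"
        elif all(ln.startswith("    ") for ln in lines):
            kind = "verse"     # every line indented: keep the line breaks
        else:
            kind = "para"      # default: an ordinary reflowable paragraph
        blocks.append((kind, [ln.strip() for ln in lines]))
    return blocks


def to_html(blocks):
    """Step 2 (hypothetical): generate HTML from the intermediate form."""
    out = []
    for kind, lines in blocks:
        safe = [html.escape(ln) for ln in lines]
        if kind == "heading":
            out.append("<h2>" + safe[0] + "</h2>")
        elif kind == "verse":
            out.append('<p class="verse">' + "<br/>\n".join(safe) + "</p>")
        else:
            out.append("<p>" + " ".join(safe) + "</p>")
    return "\n".join(out)


if __name__ == "__main__":
    sample = (
        "CHAPTER I\n\n"
        "It was a dark & stormy\nnight.\n\n"
        "    Roses are red,\n    Violets are blue."
    )
    blocks = classify_blocks(sample)
    # Between the two steps, any misclassified block can be fixed by hand,
    # for example: blocks[2] = ("para", blocks[2][1])
    print(to_html(blocks))
```

Because the intermediate list sits between the two steps, a wrong guess in
the first pass can be corrected manually before any output is generated, and
a generator for another format would only need to replace the second step.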

So thanks for the offer to improve the first pass (text to 'p-code' equivalent).
I could send you the source code for this (Python 3) anytime; I would just
have to document the intermediate form and you could make it more
accurate. However, unless that is just academically interesting to you,
it's probably not necessary.

--Roger

Roger Frank