jim said:
>   Perhaps you should start by examining what PG has *already*
>   implemented for txt unwrapping to generated HTML,
>   find out what works and what doesn't work, and
>   what requirements this puts on txt submission
>   in order to make it all work right?

the code that does unwrapping right now is marcello's.

i don't know if he has updated it since i looked at it, but
when i did, it was just what i'd expect from a technocrat:
he made the problem much more difficult than it is, and
consequently his code is overwrought, _and_ it backfires.

(for instance, he was using a rhyming dictionary to try to
determine if a set of lines constituted a poem; good luck.)

all in all, once you approach the problem intelligently,
it's not that difficult to unwrap most p.g. files correctly,
even the ones which have not been formatted correctly,
because you can detect the lines that should be indented.
i could run a script that auto-fixes most p.g. e-texts, with
few introduced errors; too bad p.g. doesn't work that way;
the whitewashers insist on fixing the books one-at-a-time.
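
to make that concrete, here's a minimal sketch in python.
it is an illustration only: it's not marcello's code, and
it's not any actual p.g. tool.  it assumes a blank line
ends a paragraph, and that any line starting with a space
or a tab is meant to stay on a line of its own (verse,
tables, and so on).  the name `unwrap` is made up here.

```python
def unwrap(text):
    """Join wrapped lines into paragraphs; keep indented lines as-is.

    Assumptions (hypothetical, for illustration only): a blank line
    ends a paragraph, and a line that starts with a space or a tab
    must never be joined to its neighbors.
    """
    out = []    # finished output lines
    para = []   # pieces of the paragraph currently being collected

    def flush():
        # Emit the collected paragraph, if any, as one long line.
        if para:
            out.append(" ".join(para))
            para.clear()

    for line in text.splitlines():
        if not line.strip():        # blank line: paragraph break
            flush()
            out.append("")
        elif line[0] in " \t":      # indented line: keep it verbatim
            flush()
            out.append(line.rstrip())
        else:                       # ordinary line: part of a paragraph
            para.append(line.strip())
    flush()
    return "\n".join(out)


sample = "a b\nc d\n\n  poem one\n  poem two\n\nend"
result = unwrap(sample)
```

the sketch covers only the files that are already indented
properly; detecting which lines _should_ have been indented
in a badly-formatted file would need a separate pass on top.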


>   Otherwise PG will end up with
>   two conflicting text unwrapping standards,
>   which will make the submitter's task
>   even more confusing.

marcello had to code his unwrapper precisely because
p.g. doesn't enforce its existing policy on text indents,
or have the foresight to expand it to cover other cases.

his code won't scale.  and the indentation policy _will_...
(and it'll replace his kludge code with something simple.)

so there's no issue with "two conflicting standards" here.


>   To the extent that you guys are heading more-and-more
>   towards the "unobtrusive" marking up of txt files, please
>   note that Python has already got very good efforts in this regard
>   called "reStructured Text" -- and the tools existing to support it! 

you're a few years behind the threads here, jim...

"restructured text" is a light-markup format, just like z.m.l.

the main difference is z.m.l. is geared directly toward p.g.,
whereas restructured text has a provenance that's muddled,
so if you were gonna choose between the two, choose z.m.l.
(tools for any light markup system are _not_ hard to build.)

but hey, if you can get p.g. to go for restructured text, do it!

-bowerbird