
jim said:
You make my point for me.
yes, i do, jim. and i make your point _much_better_ than you do, because i don't do the sabotage thing to it first.
When one relies on automagical tools to try to recreate semantic information discarded by the PG TXT representation, more or less often one ends up with something that looks like sh*t – your word not mine.
actually, i got the word from the dictionary, so you should feel free to use it without me. your tools are neither "auto" nor "magical" enough. it is only when you've improved them to the point that they cannot be improved any more that you earn the right to bitch about what others are doing.
When the results break one is told “Oh you did it wrong, you should have done something else instead.”
i didn't say "you did it wrong" as some kind of mystical power intended to make you go away. i said that you did it wrong because you did it wrong. it turns out that, with the right tool, and if you do it right, the p.g. e-text format works perfectly well, or at least it _can_, if the whitewashers really did what they say they do. and sometimes they do. for instance, if you had taken your "hamlet" text from the newest version in the library, and put it into my unwrap site listed above, you would have seen that it works just fine. so there is no _shortcoming_ of the p.g. plain-text format that needs to be "overcome". there are only some _flaws_ -- a portion of which seem to be intentionally inflicted -- which need to be corrected, so that the format can shine...
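to make "unwrap" concrete, here is a minimal python sketch of the general idea. it is not the code behind the unwrap site. it assumes two conventions that a given e-text may or may not follow: paragraphs are separated by blank lines, and any line that starts with whitespace (verse, stage directions, tables) is meant to keep its own line break. the function name `unwrap` is made up for illustration.

```python
def unwrap(text):
    """Join hard-wrapped lines into one line per paragraph.

    Assumptions (not guaranteed for every e-text):
    - a blank line ends a paragraph
    - a line starting with a space or tab is kept on its own line
    """
    out = []
    para = []

    def flush():
        # Emit the paragraph collected so far as a single line.
        if para:
            out.append(" ".join(para))
            para.clear()

    for line in text.splitlines():
        if not line.strip():
            flush()
            out.append("")             # keep the paragraph separator
        elif line[0] in " \t":
            flush()
            out.append(line.rstrip())  # indented line: leave its break alone
        else:
            para.append(line.strip())
    flush()
    return "\n".join(out)
```

whether this "works just fine" on a particular file depends entirely on whether the file sticks to those conventions, which is the point about flaws versus shortcomings.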
once one relies on human intervention to “fix” the problem when a particular algorithm breaks, then one does not have an automatic algorithm.
i agree. but that's not what is at issue here.
Ultimately what one should do if one wants to “get it right” is to abandon attempts at automagical tools which work sometimes and end up looking like sh*t other times and instead take the PG TXT file, take the original page scans, look at the page scans to figure out where the PG TXT files gratuitously entered line breaks where the author didn’t intend line breaks, and take them back out.
see, jim, here's where you get things half-right-but-kinda-wrong. you just haven't thought these things through well enough to explain them clearly, so it comes out in this mumbo-jumbo.
After the gratuitous page breaks are taken back out (the work of a few days – trust me on this!)
again, you're severely unclear here. (and please, please, please, if anyone thinks that jim _is_ being "clear", do jump in and say so and help provide an explanation.)
then one can either, if one has a machine, such as a teletype, incapable of reflow, run the now gratuitous-line-break free TXT back through a simple unambiguous algorithm to insert a line break at the appropriate point for your machine
ok, here's a relatively straightforward description of the process. but, really, jim, there's no need for it. we programmers _know_ how to do this. it's not difficult. the guy who coded "eucalyptus" did a fine job of this, and he is using the p.g. text-files, so they don't really present the insurmountable problem you think...
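for anyone who wants to see how small the "insert a line break at the appropriate point for your machine" step is, here is a rough python sketch using the standard library's `textwrap`. it assumes the input already has one line per paragraph, with blank lines between paragraphs. the name `rewrap` and the default of 72 are illustrative only, and this says nothing about how eucalyptus actually does it.

```python
import textwrap


def rewrap(text, width=72):
    """Re-insert line breaks so that no line is longer than `width`.

    Assumes paragraphs are separated by blank lines. Whitespace inside
    a paragraph is collapsed, so indented verse would need separate
    handling before this step.
    """
    paragraphs = text.split("\n\n")
    return "\n\n".join(textwrap.fill(p, width=width) for p in paragraphs)
```

the same function covers a 72-column teletype and a 20-column screen. only the `width` argument changes.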
Or tolerate slightly ugly word spacing on machines that force right justify (sigh.) Better yet, we should ask our technologist friends to include not only reflow but also automatic hyphenation routines in our machines.
again, not to beat a dead horse, but eucalyptus does ragged-right or justification, whichever the user prefers, and hyphenation too, so everything you're asking for has already been done at least once. rather than harping about the format -- which does just peachy, thank you very much -- you need to complain about the coders who are not giving you the type of tools you would like to have...
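justification is also just a few lines once the text has been wrapped. the sketch below pads the gaps between words so each line except the last reaches the full width. it is a naive illustration and not eucalyptus's method; it does no hyphenation at all, and the names `justify_line` and `justify` are hypothetical.

```python
import textwrap


def justify_line(words, width):
    """Spread extra spaces across the gaps so the line is `width` wide."""
    gaps = len(words) - 1
    spaces = width - sum(len(w) for w in words)
    if gaps == 0 or spaces < gaps:
        # One word, or not enough room: fall back to single spaces.
        return " ".join(words)
    base, extra = divmod(spaces, gaps)
    pieces = []
    for i, word in enumerate(words[:-1]):
        # The leftmost `extra` gaps get one additional space each.
        pieces.append(word + " " * (base + (1 if i < extra else 0)))
    pieces.append(words[-1])
    return "".join(pieces)


def justify(paragraph, width=72):
    """Wrap one paragraph and justify every line except the last."""
    lines = textwrap.wrap(paragraph, width=width)
    if not lines:
        return ""
    body = [justify_line(line.split(), width) for line in lines[:-1]]
    return "\n".join(body + [lines[-1]])
```

ragged-right is what you get by skipping `justify_line`, so offering the user both is a matter of one switch.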
Is it too much, for example, to ask PG to provide the option to the rare user who actually WANTS line breaks at char 72, or for that matter actually wants line breaks at char 20, is it too much to ask PG to provide a filter to insert such “gratuitous” line breaks? Consider: PG *already* provides literally 40 different such filter programs to help people with various strange obscure legacy machines.
more sloppy thinking, jim. what do you really _mean_ when you say "provide the option" or "provide a filter"? i take it to mean "give your users a tool that does that". and if you started saying it that way, you'd come to realize that your beef is not with the p.g. text-file format at all, but rather the fact that p.g. isn't supplying users with the tools we need... -bowerbird