[gutvol-d] Re: why the plain-text format is the most useful format for eliciting beauty (and more)

12 Sep 2009

Sankar Viswanathan wrote:
...
> The final output from DP is a text. This is processed through Guiguts. Most of
> the Post Processors in DP use Guiguts for post processing. The html is
> generated from this text file.
If this is true, it's all the more wasteful.

If you output a text file from the OCR and later use a human to
re-create the HTML, this is more work than letting the OCR output the HTML
directly.

And all this crooked workflow is needed because PG requires a txt file 
for hysterical reasons.

No wonder Google is eating our lunch ... they know how to put software 
to work instead of people.
...
> So no additional work is involved in producing a text file.
Nice sophism. Additional work is required to produce the HTML file. So what?
...
> Again there is no additional work in White Washing because of the text file.
I don't believe you.

Working on 2 files (3, maybe 4) IS more work than working on one file. Even if
you just open the file to see if it is the right one, it's work.

-- 
Marcello Perathoner
webmaster@gutenberg.org