
Want the simple way? Try unzipping to the Apple text format. . . .

On Wed, 9 Sep 2009, David A. Desrosiers wrote:
On Wed, Sep 9, 2009 at 8:12 AM, Marcello Perathoner <marcello@perathoner.de> wrote:
ROTFL! Apply that algorithm to Hamlet and see.
See if you can come up with an algorithm that doesn't make mincemeat of the following small excerpt. The algorithm should at least:
As you already know, parsing HTML is a much easier matter than parsing semi-freeflow text (which was the original poster's request).
Also remember, I do this all the time for the spiders we write for Plucker. I slice, I dice, and I make beautiful, automated works of art from the worst, most semantically incorrect HTML out there. See some examples here:
http://projects.plkr.org/

_______________________________________________
gutvol-d mailing list
gutvol-d@lists.pglaf.org
http://lists.pglaf.org/mailman/listinfo/gutvol-d