joey said:
> For those who are interested,
> I use the open-source library "itextsharp"
> to generate PDFs from an XML master that also
> provide what Bowerbird calls "clean copied text".
i knew somebody who wasn't thinking
was gonna fall into that exact trap, and
-- sure enough -- it took joey 8 minutes.
joey. maybe. you. should. reflect. on. things.
for. a. few. minutes. more. just. a. suggestion.
how long do you think it took me to _write_
that 55k thesis? let me assure you that it was
more than 8 minutes. a lot more.
so hey, next time, spend _at_least_ as much time
ruminating about what i wrote as i spent writing it.
ok?
it'll help keep you from falling into obvious pits like this.
you might be able to get "clean text" out of a .pdf.
in and of itself, that's not all that difficult to achieve.
after all, all you have to do is create some dummy lines.
now that i've made it clear how to work around that flaw,
and the "itextsharp" people have made it clear as well,
i expect that everyone will be able to copy out clean text.
if you cannot, even though the "solution" is known to all,
then your app is particularly brain-dead, we would assess...
but "clean text" is _not_ the main achievement here.
oh sure, it's nice and all, especially considering the pain
that repurposing dirty text has imposed on people so far.
but a more _important_ aspect of "round-tripping" is that
the end-user is getting the _master_ copy from select-all.
i highlighted that, but you weren't quick enough to get it:
when _you_ select-all and copy out of a .pdf, do you
get back _your_ "master" -- i.e., the original .tei file?
um, no, you don't. you might get out clean text.
but you don't get out your "master". not even close.
so you'd have to re-apply all your markup to that text.
that reapplication will take more than my 2 minutes...
but me? when i get out my clean text, i am getting
back _my_master_. and if the clean-up is automatic?
think about that. no need to "re-apply" a darn thing.
it's ready to go. ready to go right into the zml-viewer.
where -- just like in the earlier round -- formatting will be
auto-applied to it, based on its structure, to make it pretty...
that's round-tripping. power in the hands of the end-user.
in fact, you could call it "power tripping" for short. really!
so you weren't paying enough attention, joey.
hey wait! you're not the "joey" from that "friends" show,
are you? because if you are, then i know that your "dense"
mental capacities are just an act you use to get the chicks.
so tell me, are you _that_ joey? i might live near hollywood,
but -- to tell the truth -- i only know 10 (or so) celebrities...
but if you're _that_ joey, i'd peg my new number at 11...
-bowerbird