james said:
> One thing I spotted was the use of
i don't need details. just show me the utf8. :+)
> However, I thought you should be aware that
> your word extraction program is doing this
> and it is wrong.
i know it's wrong. that's the whole point.
don't explain. show me the right version.
> I have a thought on looking up the words.
> PDFs and DjVus from archive.org
> have text contained in them.
> I should be able to put in
> a questionable word from the right column
> and see what it should be on the left,
> then fix it.
you're a programmer, right?
start thinking like one.
i don't know exactly what you mean by
"put in a questionable word",
but it sounds uncomfortably _manual_.
ditto with doing "find" in a .pdf or .djvu.
you have a list of the bad words in a file.
and you have the actual e-book in a file,
with page numbers pointing to the scans.
so...
think like a programmer, and write code
that _automates_ the process for you, so
you just have to click a button or two and
maybe -- in the extreme case -- edit text
in a text-field by using your (ick) keyboard.
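
as a rough sketch of that automation, and nothing more:
the python below takes the list of bad words and the lines of the e-book,
and reports each hit with its page number, so the matching scan
can be pulled up without any manual "find".
the "[page 12]" marker format and the name find_bad_words are
assumptions made for illustration; the real file may mark its pages differently.

```python
import re

# ASSUMPTION: pages are marked in the e-book text by a line like "[page 12]".
# adjust this pattern to whatever the real file uses.
PAGE_MARKER = re.compile(r"^\[page (\d+)\]$")

# a "word" here is a run of letters, optionally joined by ' or -
WORD = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def find_bad_words(bad_words, book_lines):
    """Return (word, page, line_number, line) for every bad word found.

    bad_words  -- iterable of questionable words (hypothetical input list)
    book_lines -- the e-book text as a list of lines
    """
    wanted = set(bad_words)
    hits = []
    page = None  # stays None until the first page marker is seen
    for line_number, line in enumerate(book_lines, start=1):
        marker = PAGE_MARKER.match(line.strip())
        if marker:
            page = int(marker.group(1))
            continue
        for token in WORD.findall(line):
            if token in wanted:
                hits.append((token, page, line_number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    book = [
        "[page 1]",
        "the quick brovvn fox",
        "[page 2]",
        "jumped over tbe lazy dog",
    ]
    for word, page, line_number, line in find_bad_words(["brovvn", "tbe"], book):
        print(f"page {page}, line {line_number}: {word!r} in {line!r}")
```

from a list like that, a small interface could show the scan for
the page next to the line, with a button to accept the fix
and a text-field only for the hard cases.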
think like a programmer.
i _will_ repeat this. if i _have_to_ repeat it.
but james, i don't want to have to repeat it...
-bowerbird