
Generally, proofraiding (or the new PC term, harvesting) refers to grabbing page images (and optionally the text, though that is usually mediocre-to-poor raw OCR). Grabbing only the text is seldom worth it: there is nothing to compare it against. Blind format conversions are discouraged unless you have access to the original book. And as you say, clearing a text-only copy is more difficult. IIRC, it requires access to a paper copy and a fairly lengthy comparison, which is only worthwhile IMO if the text is very clean and/or the book OCRs particularly poorly. (To be honest, I had forgotten this option when I wrote the previous post.)

Back to the gist of my question: does anyone know of image archives of Spanish fiction?

On 12/14/05, Joshua Hutchinson <joshua@hutchinson.net> wrote:
----- Original Message -----
From: "Robert Cicconetti" <grythumn@gmail.com>
AFAIK, we don't simply repackage existing text-only copies available on the web.
R C
Actually we do and have, Robert. Ok, DP doesn't, but PG volunteers do. We even have a name for it. Proofraiding. ;)
The hard part is making sure it is well proofed, meets our formatting standards, and is clearable by our standards.
Josh (JHutch at DP)
_______________________________________________
gutvol-d mailing list
gutvol-d@lists.pglaf.org
http://lists.pglaf.org/listinfo.cgi/gutvol-d