on the issue of hand-crafted versus computer-generated .html versions
i noticed that e-text #16663 was "reposted" the other day. in .tei form. with a .pdf, an .html file, and a flock of text versions.
so i took a look at it, and did some comparisons with the "original" versions, which were released earlier this month, on september 5th.
i found the two .html versions looked quite a bit different. i preferred the look of the original, but that's personal taste. the second one -- by virtue of being the result of a conversion -- has the benefit of being "standardized", at least to some degree. (well, if we can assume that the conversion won't get changed.) the first one looked more like the .html had been hand-crafted, perhaps after an initial run through one of the .html converters. (though of course i can't say that with any degree of certainty.)
if it's gonna be a practice to "deprecate" idiosyncratic .html versions that are being created by individuals over at distributed proofreaders, replacing them with versions that have been converted by machine, someone should start informing people not to make the original effort.
when i brought up this very point a little while back, on the d.p. forums, juliet made a sharp public rebuke telling me not to "discourage" people. (i know, it's just not like her to do that in public, is it? but she did...)
personally, i think _all_ the time people have spent over there making hand-tooled .html versions has been a big waste of energy; it would've been far smarter to perfect a text-to-html conversion routine instead...
i see other instances, too, where distributed proofreaders appears to prefer to waste the time of volunteers to "keep 'em busy and happy", rather than ensuring that the workflow is as efficient as it could be... it must be nice to have that much energy at your command...
one more thing: are we assured that the conversion routine is solid? or are "repostings" going to start happening with greater frequency? if you have to repost every separate variation from every .tei e-text every time the converter is changed, you could be doing that all day.
for instance, in this version, there's a glitch on the chapter 18 header. when that's fixed -- which will only take a few seconds -- and the files are regenerated, which will also only take a couple seconds, will there be a whole other "reposting" then? because that will end up taking much more than a few seconds of time for more than a few people, including the people who mirror the library.
i don't know the solution, other than to wonder why each of these variant files needs to be put in the library at all, if users will truly be able to generate them at will. which _is_ still the plan, isn't it? that they can generate them at will? setting all of their own options, in a simple way with a friendly interface? when do we get to see that part of the puzzle?
and will marcello be telling people they can't have the font of their choice because p.g. "doesn't have a license to distribute it"? that would be a real bummer.
this particular glitch is trivial from the viewpoint of usability of this text, but might be a sign that your quality-control checks need improvement. i caught this one because it had a black-on-white visible manifestation, but there could be many that don't, and they could come to haunt you.
-bowerbird
Bowerbird@aol.com wrote:
personally, i think _all_ the time people have spent over there making hand-tooled .html versions has been a big waste of energy; it would've been far smarter to perfect a text-to-html conversion routine instead...
It would have been smarter to do them in TEI, but everything takes time to develop. As of today, the TEI people started churning out books, whereas we are still waiting for your much-announced all singing all dancing Wunderkind reader program.
one more thing: are we assured that the conversion routine is solid? or are "repostings" going to start happening with greater frequency? if you have to repost every separate variation from every .tei e-text every time the converter is changed, you could be doing that all day.
Who says you have to repost anything if the converter changes? You have to repost if you want to fix errors in the text. And fixing *one* text manually and regenerating the others is faster than fixing 3 texts manually.
... to wonder why each of these variant files needs to be put in the library at all, if users will truly be able to generate them at will.
Because we start in a more conventional way: we post all pre-generated versions of the files alongside the TEI file. So if the converter should break, we can go back to a process of manually editing all posted files, like of old.
which _is_ still the plan, isn't it? that they can generate them at will? setting all of their own options, in a simple way with a friendly interface? when do we get to see that part of the puzzle?
When the conversion process is deemed stable and we have got enough server-power to let thousands of people play with the options.
and will marcello be telling people they can't have the font of their choice because p.g. "doesn't have a license to distribute it"? that would be a real bummer.
Of course I will tell people they cannot download a font which we cannot legally distribute. If they want that, they can buy the font and generate the PDF at home. Or they could simply get the HTML version, which takes the font from the browser settings.
i caught this one because it had a black-on-white visible manifestation,
Pray tell, what *is* a "black-on-white visible manifestation"? Do you see such things often? -- Marcello Perathoner webmaster@gutenberg.org
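The single-source workflow argued over in this message (fix the master once, then regenerate every derived format) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny TEI-like markup, the function names, and the two output formats are hypothetical and are not the actual Project Gutenberg converter, whose details the thread does not describe.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified TEI-like master (real TEI files are far richer).
MASTER = """<text>
  <div><head>Chapter 18</head><p>It was a dark &amp; stormy night.</p></div>
</text>"""


def to_plain_text(root):
    """Render each chapter as an upper-cased heading followed by its paragraphs."""
    lines = []
    for div in root.iter("div"):
        lines.append(div.findtext("head", default="").upper())
        lines.extend(p.text or "" for p in div.findall("p"))
    return "\n\n".join(lines)


def to_html(root):
    """Render each chapter as an <h2> heading followed by <p> paragraphs."""
    parts = []
    for div in root.iter("div"):
        parts.append("<h2>%s</h2>" % html.escape(div.findtext("head", default="")))
        parts.extend("<p>%s</p>" % html.escape(p.text or "") for p in div.findall("p"))
    return "\n".join(parts)


def regenerate(master_source):
    """Parse the master once and derive every output format from it."""
    root = ET.fromstring(master_source)
    return {"txt": to_plain_text(root), "html": to_html(root)}


# One correction in the master propagates to every derived format.
fixed = MASTER.replace("Chapter 18", "Chapter XVIII")
outputs = regenerate(fixed)
```

Under these assumptions, a header fix is made in one place and both `outputs["txt"]` and `outputs["html"]` pick it up on the next run. Whether each regenerated file should then be reposted is the policy question the thread is arguing about, and the sketch does not settle it.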
At 11:33 AM 9/29/2005, you wrote:
if it's gonna be a practice to "deprecate" idiosyncratic .html versions that are being created by individuals over at distributed proofreaders, replacing them with versions that have been converted by machine, someone should start informing people not to make the original effort.
You are assuming that all of the typographic information from the html files will be thrown out. I dare say you are mistaken.
when i brought up this very point a little while back, on the d.p. forums, juliet made a sharp public rebuke telling me not to "discourage" people.
And in true Bowerbird fashion, you never considered for a moment that she might have a point.
personally, i think _all_ the time people have spent over there making hand-tooled .html versions has been a big waste of energy; it would've been far smarter to perfect a text-to-html conversion routine instead...
What you think is moot, as has previously been demonstrated.
i see other instances, too, where distributed proofreaders appears to prefer to waste the time of volunteers to "keep 'em busy and happy", rather than ensuring that the workflow is as efficient as it could be...
Of course, if we left it to you, DP would still be in beta and the only way you could participate would be to join a yahoogroup.
one more thing: are we assured that the conversion routine is solid?
Well, whether it is or isn't, at least we've got the source code, so we can fix it. Others have addressed the rest, so I won't. Aaron Cannon -- E-mail: cannona@fireantproductions.com
Bowerbird@aol.com wrote:
if it's gonna be a practice to "deprecate" idiosyncratic .html versions that are being created by individuals over at distributed proofreaders, replacing them with versions that have been converted by machine, someone should start informing people not to make the original effort.
when i brought up this very point a little while back, on the d.p. forums, juliet made a sharp public rebuke telling me not to "discourage" people.
Do you have a pointer to this exchange? I searched for posts from Juliet containing the word "discourage" and didn't find it. -Michael
participants (4)
- Aaron Cannon
- Bowerbird@aol.com
- Marcello Perathoner
- Michael Dyck