Re: !@! #17135 Twas The Night Before Christmas

dakretz said:
How accurate is this assessment?
it's half-assed accurate. mostly because it's looking at the p.g. corpus from the standpoint of its two major file-types. but we'll want to look at it from the perspective of the users who will access it, and how... (the answer to that is mobile, mobile, and mobile.) moreover, some of your points, even as given, are wrong...
1. original plain-text. significant but declining strongly.
no. significant and _increasing._ (a rising tide lifts all boats.)
2. original .html files. significant but declining slightly.
no. significant and _increasing._ (same tide, different boat.)
3. plain-text derivatives. declining.
dead wrong. this is the segment that is increasing fastest. most of the places that make derivatives use the plain-text.
4. .html derivatives. significant and increasingly so.
well, not quite "dead-wrong", but still wrong nonetheless. the .html files have far too little consistency to be used in a systematic creation of derivatives, not without glitches... some places use the .html file, but then "fall back" to the plain-text version if they see problems with the derivative. but most can't spend that much energy on quality-control, so they've resigned themselves to using the plain-text files. which is not that big of a sacrifice, to be perfectly honest... indeed, the system giving the most consistently best results is the iphone viewer-app "eucalyptus", which utilizes _only_ the plain-text files; his converter is giving very good output.
and, to help get people's heads on, and completely straight, it's good to do the reminder that many of the .html files are the result of a straight-out conversion of the plain-text file. and these files, because they're machine-generated, _are_ consistent enough to be used in a systematic conversion... it's the "hand-crafted" ones that cause all of the problems, which is something that i first pointed out many years ago. when problems with the auto-generated .html files do occur, it's usually due to an underlying glitch in the plain-text file.
so auto-conversion of plain-text is the best way to proceed. and i've maintained for 7 years now that such a conversion is not just _possible_, but our best course of action to follow... for several years after i started, i left my argument unproven, just to see who would jump at the bait and try to dispute it... after destroying all that opposition, i have since proven that it is indeed possible to use a plain-text file as your "master". why y'all continue to ignore this proof, i simply do not know. but i'll keep making the case, until all of you can see it clearly.
-bowerbird

While I agree more with BB's conjecture than Don's, I have seen no real statistical evidence on either side. My own experience, which is very old now, is of encountering titles in Palm-compatible formats that had manifestly been derived mechanically from the PG plain-text versions. This is just an anecdotal point, but it matches BB's "eucalyptus" data point.
This doesn't seem too hard to research, though. For grins I rummaged around for e-book versions of something I am familiar with. I found two separate conversions of _Sunshine Sketches_ by Leacock. Despite the existence of a nice HTML version by David Widger, both the PDF and HTML versions I found were based on the PG text version, using the text version of the TOC and having the double-hyphen version of em dashes. So that's two more random data points in BB's column.
On 14-Apr-2010 18:42, Bowerbird@aol.com wrote:
dakretz said:
How accurate is this assessment?
it's half-assed accurate.
============================================================
Gardner Buchanan <gbuchana@teksavvy.com>
Ottawa, ON
FreeBSD: Where you want to go. Today.
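The double-hyphen observation above lends itself to a quick check. What follows is only a minimal sketch of that idea, not anything a poster in this thread actually ran: the function names `dash_profile` and `looks_text_derived` are made up for illustration, and the assumption that "double hyphens present, no typographic em dashes" points to the plain-text edition is a rough heuristic that a hand-edited derivative could easily defeat.

```python
import re

# Assumption: PG plain text renders an em dash as exactly two hyphens.
# The lookarounds skip longer hyphen runs (e.g. "----" rules) and the
# "<!--" / "-->" delimiters of HTML comments.
ASCII_DASH = re.compile(r"(?<![-!])--(?![->])")
TRUE_DASH = "\u2014"  # the typographic em dash character


def dash_profile(text: str) -> dict:
    """Count ASCII-style double hyphens and typographic em dashes."""
    return {
        "double_hyphen": len(ASCII_DASH.findall(text)),
        "em_dash": text.count(TRUE_DASH),
    }


def looks_text_derived(text: str) -> bool:
    """Guess (hypothetical heuristic) that a file came from the .txt edition."""
    profile = dash_profile(text)
    return profile["double_hyphen"] > 0 and profile["em_dash"] == 0
```

Running `looks_text_derived` over the body of a downloaded derivative gives one more anecdotal data point per title; it says nothing about titles that contain no dashes at all.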

On Wed, 14 Apr 2010, Gardner Buchanan wrote:
While I agree more with BB's conjecture than Don's, I have seen no real statistical evidence on either side.
My own experience, which is very old now, is of encountering titles in Palm-compatible formats that had manifestly been derived mechanically from the PG plain-text versions. This is just an anecdotal point, but it matches BB's "eucalyptus" data point.
Another couple of anecdotal points. There are two paper publishers I've worked with a bit. Not recently, but a couple of years ago. Both had scripts to take the plain text and allow them to typeset in a couple of hours. They never used the html because it threw too many exceptions that required hand input to resolve, and therefore took a lot longer to get typeset. The proofread afterward, to make sure nothing got messed up, took longer than the typesetting itself.
-- Greg Weeks http://durendal.org:8080/greg/

Does anyone know of any epublisher other than PG that *does* distribute the html we provide?
Don
On Wed, Apr 14, 2010 at 5:48 PM, Greg Weeks <greg@durendal.org> wrote:
On Wed, 14 Apr 2010, Gardner Buchanan wrote:
While I agree more with BB's conjecture than Don's, I have seen no real statistical evidence on either side.
My own experience, which is very old now, is of encountering titles in Palm-compatible formats that had manifestly been derived mechanically from the PG plain-text versions. This is just an anecdotal point, but it matches BB's "eucalyptus" data point.
Another couple of anecdotal points. There are two paper publishers I've worked with a bit. Not recently, but a couple of years ago. Both had scripts to take the plain text and allow them to typeset in a couple of hours. They never used the html because it threw too many exceptions that required hand input to resolve, and therefore took a lot longer to get typeset. The proofread afterward, to make sure nothing got messed up, took longer than the typesetting itself.
-- Greg Weeks http://durendal.org:8080/greg/
_______________________________________________ gutvol-d mailing list gutvol-d@lists.pglaf.org http://lists.pglaf.org/mailman/listinfo/gutvol-d

Does anyone know of any epublisher other than PG that *does* distribute the html we provide?
Not sure exactly what you are asking, but Apple, for example, takes the PG html, strips out the PG legalese and acknowledgment of the volunteers, converts it to EPUB with DRM, and redistributes it "free" [where "Free" in this case means being only able to get the book in DRM form and only being able to get it directly from the Steve Jobs iPad monopoly]. One knows they are not working from the txt versions of the files because the Apple redistributions contain chars and formatting found only in the HTML versions. FreeKindleBooks redistributes in HTML form converted to MOBI and retaining all the PG legalese and requirements. Mobileread has volunteers who take the HTML, usually heavily reformat it, strip it, and republish it in MOBI and EPUB formats while cackling about how much better their versions are! Many other sites appear to "down-convert" to a least-common-denominator ASCII format before "up-converting" back to HTML, MOBI, EPUB, etc. Presumably they are working from an ASCII version of an old DVD distribution - getting "working" EPUB and MOBI from the HTML formats tends to be "non-trivial", not to mention that some sites republish in, say, two dozen different formats.

I did some checking too. The conclusion I have provisionally arrived at is that there are relatively few beneficiaries of our expectation of an increasingly elegant HTML version of each project, an expectation that is also one of the major drags on the post-processing stage and a major contributor to the increasing residency period of projects on DP. It appears to me that the only people who enjoy the full pleasure of our finest work are a) those who read the whole thing online at PG, and b) those who personally download the HTML version and install it locally so they can read it with a device (probably a PC with a full-width screen, including laptops and similar). Which would be - what - 10% or less? In fact, it appears that secondary distributors treat the removal of all or part of the HTML as part of their value-add.
Don
On Fri, Apr 16, 2010 at 4:52 PM, James Adcock <jimad@msn.com> wrote:
Does anyone know of any epublisher other than PG that *does* distribute the html we provide?
Not sure exactly what you are asking, but Apple, for example, takes the PG html, strips out the PG legalese and acknowledgment of the volunteers, converts it to EPUB with DRM, and redistributes it “free” [where “Free” in this case means being only able to get the book in DRM form and only being able to get it directly from the Steve Jobs iPad monopoly]. One knows they are not working from the txt versions of the files because the Apple redistributions contain chars and formatting found only in the HTML versions. FreeKindleBooks redistributes in HTML form converted to MOBI and retaining all the PG legalese and requirements. Mobileread has volunteers who take the HTML, usually heavily reformat it, strip it, and republish it in MOBI and EPUB formats while cackling about how much better their versions are! Many other sites appear to “down-convert” to a least-common-denominator ASCII format before “up-converting” back to HTML, MOBI, EPUB, etc. Presumably they are working from an ASCII version of an old DVD distribution – getting “working” EPUB and MOBI from the HTML formats tends to be “non-trivial”, not to mention that some sites republish in, say, two dozen different formats.
participants (5)
- Bowerbird@aol.com
- don kretz
- Gardner Buchanan
- Greg Weeks
- James Adcock