James, this is great!! I did not realize that so much work had been put into it already. Question: can this system include some sort of search engine (publication date, author, title, etc.)? Also, can we support pdf, doc and sxw formats as well? Listen, if no one else is stepping up to the plate to do the work, then no one can complain about how it's done. You have a server, you have the software, you have me (an eager novice). What more do we need? Let's just do it!
cheers, darryl
James Linden wrote:
If there are little things here and there that I can do, feel free to ask. An hour here or there I can do. :-)
---------------------------------------------------------
I have about 350 texts in a database (and the source code to use said database), but in order to use the system, the content must be in a certain pseudo-markup. This markup takes about 5 minutes to apply to an average novel-type text.
Demo can be seen at http://ibiblio.org/edison/engine/catalog.browse.php -- incidentally, this was the demo of the FIRST working markup-based system for PG, which was rejected by the masses as too much work (about 3.5 years ago). Ironically, I wrote the engine in about 20 hours, and put the 350/ish texts in, by hand, in about 15 hrs. That's ~23 texts per hour - far less work than it currently takes.
At that rate (let's say 20 per hour), 10 volunteers could put PG's entire current collection (about 12,000 distinct/usable items) into the system in about 60 hours. As they do this work, the catalog, etc. is automatically built, and various alternate formats become automatically available. New formats can be added later, with each previously entered text automatically available in that format.
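For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function names are made up for illustration, and the numbers are just the ones quoted above (350 texts in 15 hours, 12,000 items, 20 texts per hour, 10 volunteers); it assumes every volunteer works at the same steady rate, which is of course optimistic.

```python
def texts_per_hour(texts: int, hours: float) -> float:
    """Observed entry rate: texts entered divided by hours spent."""
    return texts / hours


def hours_to_load(items: int, rate_per_hour: float, volunteers: int) -> float:
    """Elapsed hours if all volunteers work in parallel at the same rate."""
    return items / (rate_per_hour * volunteers)


# Rate from the demo: 350 texts entered by hand in about 15 hours.
observed_rate = texts_per_hour(350, 15)

# Conservative planning rate of 20 per hour, 10 volunteers, 12,000 items.
elapsed = hours_to_load(12_000, 20, 10)

print(f"observed rate: {observed_rate:.1f} texts/hour")
print(f"elapsed time for the collection: {elapsed:.0f} hours each")
```

Note that the 60 hours is per volunteer, so the total effort under these assumptions is about 600 person-hours.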
Some rather basic Java applets and a bit of cooperation with DP could have put this post processing right into the DP system, with the automatic format conversion the end result, instead of various mismatched HTML/TXT, etc outputs.
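To illustrate the "enter once, convert to every format" idea, here is a rough sketch in Python. It is NOT the actual engine: the real pseudo-markup and its code are not shown in this message, so the `Text` record, the `converter` registry and the two output formats below are all hypothetical stand-ins. The point is only that when each format is a registered function over one stored representation, adding a format later makes it available for every text already entered, and the catalog falls out of the same data.

```python
from dataclasses import dataclass
from html import escape
from typing import Callable, Dict, List, Tuple


@dataclass
class Text:
    """Hypothetical stored form of one text after markup has been applied."""
    title: str
    author: str
    paragraphs: List[str]


# Registry of output formats: format name -> function producing that format.
CONVERTERS: Dict[str, Callable[[Text], str]] = {}


def converter(name: str):
    """Decorator that registers a conversion function under a format name."""
    def register(func: Callable[[Text], str]) -> Callable[[Text], str]:
        CONVERTERS[name] = func
        return func
    return register


@converter("txt")
def to_txt(text: Text) -> str:
    header = f"{text.title}\nby {text.author}\n\n"
    return header + "\n\n".join(text.paragraphs) + "\n"


@converter("html")
def to_html(text: Text) -> str:
    body = "\n".join(f"<p>{escape(p)}</p>" for p in text.paragraphs)
    return (f"<h1>{escape(text.title)}</h1>\n"
            f"<h2>{escape(text.author)}</h2>\n{body}\n")


def render_all(text: Text) -> Dict[str, str]:
    """Produce every currently registered format for one stored text."""
    return {name: func(text) for name, func in CONVERTERS.items()}


def build_catalog(texts: List[Text]) -> List[Tuple[str, str]]:
    """Catalog entries (author, title), built from the same stored records."""
    return sorted((t.author, t.title) for t in texts)


sample = Text("A Sample Novel", "A. N. Author", ["First paragraph.", "Second & last."])
outputs = render_all(sample)
print(sorted(outputs))
```

Registering another function with `@converter("...")` later would make `render_all` produce that format for `sample` and every other stored text, with no re-entry of the texts themselves.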
------------------------------------------------------------------
Basically, my lack of forward motion on PG related matters is more a result of being completely fed up with PG as a whole, and not being willing to _waste_ my time for nothing. I get paid for my expertise in dealing with data formats / knowledge repositories / collaborative data management. While I'd love to give the same to PG, I'm not going to be insulted while I'm doing it. (Note: my problems with PG started circa 2000.) It was decided that my formatting/management engine wasn't workable for PG, by whoever makes those decisions, so basically, continuing to work on the engine isn't worth my time and effort. I _still_ don't understand this, because in direct conversations with both Greg and Michael, I distinctly remember _both_ of them thinking it was a good idea.
------------------------------------------------------------------
As I see it, the big problem with PG is that everyone wants to be in charge of something, so PG is broken up into a zillion micro-managed pieces. It is this complete lack of process order that creates MORE work and slows everything down.
When Michael first talked to me about getting PG Canada started, there was a set of specific items that we both agreed on:
1) PG CAN would be completely independent of PG USA;
2) PG CAN would implement a "next generation" system, for PG CAN's own use and as a proof of concept for PG USA.
This next generation system includes collaborative processing, automatic format conversion, enhanced cataloging, capability for language translation, backend for voice synthesis (and not that garbage that's currently in PG's archive), and a few other things.
Not only have I spent years researching and experimenting, but I've created proofs of concept for all of it, at various times and in various pieces. What I am suggesting as the "next generation" for PG is not only possible, but well worth the effort.
Unfortunately, getting a half dozen programmers with proper skills and a similar vision has proven very difficult. So far, we have ONE, and he's only marginally available due to university and work. That leaves me, and my availability is only a little better, but around life and running a business, what time I do have that I can put into PG is uniformly wasted in political BS, turf wars, format wars, etc.
------------------------------------------------------------------
So, I should clarify / sum up -- I _HAVE_ time for PG related work, but it's limited to about 6 hrs per week. I don't want to waste that 6 hrs in turf wars and political crap, but it's not enough time to make headway with a development project of the appropriate size. This basically means that my 6 hrs is better spent doing other things right now. Understand, if someone paid me a small salary to cover my bills (so I wouldn't have to work as a contractor), I'd work for PG full time -- building this system. Once built, PGs could spring up all over the place (Russia, Africa, Asia, etc), just by installing said system and doing some basic configuration (default interface language, logo, etc). And, ALL output from ALL of them would be uniform and cross compatible! (And yeah, there would be TEI output...)
------------------------------------------------------------------
I'm going to stop my rant, even though there is a lot more to say...
Some docs about my ideas/system are available online: http://www.kodekrash.com/index.php?p=11
-- James
Darryl Moore wrote:
That's too bad James. I was hoping we could get some help from you, however, I fully understand. My time has been really tight too, however I have been continuing to work on this a bit here and there. Not nearly as fast as I'd like though.
I have a wiki almost set up on my home site on which I'd like to collaborate with others on creating the corporate documentation (some of which is already in rough draft). As I've said before, I think a registered Not-For-Profit (then hopefully charity) will get more attention on The Hill than 'a bunch of guys with a web site'.
Also, I found an interesting open source document database at: http://docdb.sourceforge.net/ I am trying to install it on my home server for testing purposes. If we can add to this some way to convert XML<-->ASCII when submitting/retrieving documents, this might be all we need (with a few static web pages) to get a basic PG up and running. And running in James's preferred incarnation. Thoughts???
------- As an aside, as I've been getting more and more annoyed with MS over recent years, and as I've learned more and more about IP issues, I've finally taken the big step and moved all my home and work machines over to Linux. Wouldn't have been nearly so painful if I weren't at the same time trying to set up all kinds of servers, implement networking, remote 'X' server/clients etc... Oh well, I'm almost at the top of the initial learning curve now. -------
cheers, darryl
James Linden wrote:
Several people have recently voiced a desire to get PG Canada going. I'd like to invite them to do so. I am not currently at a place in my life where I can devote any time to PG, so I'm going to fully step aside and let everyone else go at it without me in the way.
I currently have control of the projectgutenberg.ca domain, and I'll point DNS to whoever is going to host the site; just give me the DNS server names.
I will be staying on the mailing lists so as to still be available if the need arises.
May PG Canada succeed (and kick PG USA's butt!)... EH!
Regards,
James Linden
http://www.kodekrash.com/
http://www.eidix.org/
_______________________________________________
Project Gutenberg of Canada
Website: http://www.projectgutenberg.ca/
List: pgcanada@lists.pglaf.org
Archives: http://lists.pglaf.org/private.cgi/pgcanada/