kevin said:
>   Instead of embedding it into the e-book,
>   I think it would work better as a separate file.

if i had to choose between the two, i'd agree with you.
but there's no reason we can't do it both ways.

>   If you embed it into the ebooks, you will need to
>   put it in all the versions (html, text, pdf, tei, etc.),
>   and keep ALL of them up-to-date.

you put it in the master version (z.m.l.) and
then re-propagate the auxiliary versions...

>   Also, if you want to make it "user" editable,
>   however you want to define "user", it would be
>   better as a separate file, so that the original files
>   don't constantly get flagged as modified.

social tagging is an ongoing process, so yes,
it doesn't make sense to put that into the file,
because your files will be constantly changing.

you could roll social tags into your documents on
a regular basis, however, and that might be useful.

(and every e-text should have a changelog anyway.
until you install that, you'll never have a good handle
on controlling the contents of your library. never.)

but until we see a social tagging system that really
works for our purposes, this planning is premature.

>   make it easy to join the meta-files into a single file
>   (cat *.meta > all.meta would be ideal)

yes, of course. indeed the single-file version should
be the one that is public-facing, for easy download.
we can give 'em a tool that splits it on their machine.
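
here's a rough sketch of that join-and-split round trip, in python.
note that a plain "cat" only splits back cleanly if each record carries
its own identifier, so this assumes every .meta file starts with a line
like "id: 00011". that convention is my own invention for illustration
(nothing in this thread fixes the format), and the function names
join_meta and split_meta are hypothetical too.

```python
from pathlib import Path


def join_meta(meta_dir, combined_path):
    """Concatenate every *.meta file in meta_dir, like `cat *.meta > all.meta`.

    Returns the number of files joined.
    """
    combined = Path(combined_path).resolve()
    parts = []
    for path in sorted(Path(meta_dir).glob("*.meta")):
        if path.resolve() == combined:
            continue  # don't fold an old combined file back into itself
        text = path.read_text(encoding="utf-8")
        if not text.endswith("\n"):
            text += "\n"  # keep the next record's "id:" line on its own line
        parts.append(text)
    combined.write_text("".join(parts), encoding="utf-8")
    return len(parts)


def split_meta(combined_text):
    """Split the combined text back into {id: record_text}.

    ASSUMPTION: each record begins with a line of the form "id: NNNNN".
    """
    records = {}
    current_id = None
    for line in combined_text.splitlines(keepends=True):
        if line.startswith("id: "):
            current_id = line[len("id: "):].strip()
            records[current_id] = ""
        if current_id is not None:
            records[current_id] += line
    return records
```

the splitter is the "tool" a downloader would run on their own machine:
read all.meta, call split_meta, write each record out under its id.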

>   The format could be text, or XML, or even tei. If you use an
>   XML based version, a text version could be easily created.

at one time, i looked at the x.m.l. version of the catalog.
what a bloated crufty mess!   kevin, please demonstrate
that there is some reality behind what you have said here
by showing us "the text version that could be easily created".

because in order to make any of these plans really _work_,
we will need a simple list of the e-texts. i'd like to see one
with about 20,000 lines, each line looking something like this:
>   00011 -- alice's adventures in wonderland -- lewis carroll
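
to pin down what i mean by that line format, here's a minimal python
sketch that writes and reads it. the five-digit zero-padded number and
the " -- " separator come straight from the sample line; the function
names are hypothetical, and so is the rule that the author is whatever
follows the last " -- " (so a title may contain the separator, but an
author may not).

```python
SEP = " -- "


def format_line(number, title, author):
    """Build one catalog line: zero-padded number, title, author."""
    return f"{number:05d}{SEP}{title}{SEP}{author}"


def parse_line(line):
    """Split one catalog line back into (number, title, author).

    ASSUMPTION: the author is everything after the LAST separator.
    """
    number, rest = line.rstrip("\n").split(SEP, 1)
    title, author = rest.rsplit(SEP, 1)
    return int(number), title, author
```

with one such line per e-text, the whole list stays greppable and
sortable with ordinary text tools.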

>   Instead of just category, you could store all sorts of information
>   in the "meta" file. Author's name, copyright date(s), categories
>   (science fiction, horticulture, cook-book), available formats
>   (text, html, tei, pdf, etc.), language(s), links to web sites,
>   link to author meta file, and any other information
>   that you would expect to find in a card catalog.

you'll find much of that data in the existing x.m.l. catalog.
so have at it. show us what you can do with it.

>   Which one has the most correct information,
>   the text version or the html one?

if such a difference comes into existence, you have a bigger problem,
which is that your workflow has some bug in it that needs to be fixed.

>   If you wanted to search for all Polish math books,
>   how would you write the query program so that
>   you would get all of them, without duplicates because
>   of the different formats, and without wasting a
>   lot of CPU cycles? Not all texts have a .txt version.

good question. got an answer?
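
for what it's worth, here's one possible shape of an answer, sketched
in python. it assumes the query runs over one meta record per e-text
(not one per file), with the formats listed inside the record, along
the lines of the "meta" file proposed above. the field names ("id",
"languages", "categories", "formats") and the function name are all
made up for illustration.

```python
def find_books(records, language, category):
    """Return matching e-texts, one hit per e-text id.

    records: iterable of dicts, one per e-text. Formats are just a field
    on the record, so an e-text with html and pdf but no .txt still
    shows up exactly once.
    """
    seen = set()
    hits = []
    for rec in records:
        if rec["id"] in seen:
            continue  # guard against a record that got listed twice
        if language in rec["languages"] and category in rec["categories"]:
            seen.add(rec["id"])
            hits.append(rec)
    return hits
```

that's a single pass over the records, and it never opens the e-texts
themselves, so the cpu cost scales with the size of the catalog rather
than the size of the library.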

-bowerbird