
Hi Keith
With the development of a structured database, we will have to compromise: that is, cover the basic cases and, in certain cases, hand-edit the fields involved. These special cases will be harder to find, but there will be a set of rules to help us look for them. To make things easier, we could use cross-references, as in library catalogues.
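A minimal sketch of the cross-reference idea, assuming a hand-maintained table of "see" references like those in a library catalogue. The table contents and the `resolve` helper are made up for illustration, not taken from any real catalogue.

```python
# Hypothetical "see" references: each variant spelling points to one
# canonical heading. Keys are lower-cased with whitespace collapsed.
SEE = {
    "tchaikovsky, peter": "Tchaikovsky, Pyotr Ilyich",
    "tschaikowsky, peter": "Tchaikovsky, Pyotr Ilyich",
}


def resolve(name: str) -> str:
    """Return the canonical heading for a name, or the name itself if no
    cross-reference exists for it."""
    key = " ".join(name.lower().split())
    return SEE.get(key, name.strip())
```

A lookup that finds no entry returns the name unchanged, so the table only has to list the special cases, not every record.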
There is no magic bullet. As an example, take a look at iTunes. It has a field for sorting by Artist. They use a database, and for my own CDs the information comes from a different database. I have my own notion of how things should be sorted, so I edit the "sort for Artist" field. The only problem is that for classical music, sorting or indexing by Artist is not viable. I prefer to use the Composer field, so I have to use a different index.
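Roughly, the override-plus-fallback arrangement could look like the sketch below. The `Track` record and its field names are assumptions for illustration, not the actual iTunes schema: a hand-edited sort field wins when present, and classical entries are keyed on composer instead.

```python
from dataclasses import dataclass


@dataclass
class Track:
    # Hypothetical record layout, not the real iTunes one.
    title: str
    artist: str
    composer: str = ""
    sort_artist: str = ""  # hand-edited override; empty means "use artist"
    genre: str = ""


def sort_key(track: Track) -> str:
    """Classical tracks sort by composer; everything else sorts by the
    hand-edited sort field, falling back to the plain artist name."""
    if track.genre.lower() == "classical" and track.composer:
        return track.composer.casefold()
    return (track.sort_artist or track.artist).casefold()


tracks = [
    Track("Abbey Road", "The Beatles", sort_artist="Beatles"),
    Track("Goldberg Variations", "Glenn Gould",
          composer="Bach, Johann Sebastian", genre="Classical"),
    Track("Kind of Blue", "Miles Davis", sort_artist="Davis, Miles"),
]
ordered = [t.title for t in sorted(tracks, key=sort_key)]
```

Here `ordered` comes out as Goldberg Variations, Abbey Road, Kind of Blue, because the keys are "bach, ...", "beatles" and "davis, miles".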
I take your point, but I reckon that with a bit of definition of canonical fields and formats one should be able to clean the lot up, with the exception of cases where previous manual record entry violated sensible rules. Most of the problems could be cleaned up automatically, and only the horrible examples (basically errors) need get special manual treatment. Trying to construct special rules for your database to negotiate would fall foul of the ingenuity of fools. Whether you really need a "formal database" or not is an open question. Some direct access to properly sorted and indexed files can be startlingly effective.
Jon
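As a rough illustration of the automatic clean-up with a manual remainder described above, here is a sketch that assumes the canonical format for a name is "Surname, Forenames". That format, the `CANONICAL` pattern and the `clean` helper are all assumptions for the example. Anything the simple rules cannot handle is flagged rather than guessed at.

```python
import re

# Assumed canonical format: "Surname, Forenames" with exactly one comma.
CANONICAL = re.compile(r"^[^,]+, [^,]+$")


def clean(raw: str) -> tuple[str, bool]:
    """Return (value, needs_manual_review) for one name field."""
    text = " ".join(raw.split())  # collapse stray whitespace
    if CANONICAL.match(text):
        return text, False        # already canonical
    parts = text.split(" ")
    if "," not in text and len(parts) == 2:
        # Plain "Forename Surname" can safely be flipped automatically.
        return f"{parts[1]}, {parts[0]}", False
    return text, True             # anything else goes to a person
```

For instance, `clean("  Miles   Davis ")` gives `("Davis, Miles", False)`, while `clean("Ludwig van Beethoven")` comes back flagged for manual treatment because the two-word rule does not apply.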