
Michael Dyck wrote:
Jon Noring wrote:
So, does DP keep an internal identifier and associated metadata for each project it works on?
Yup.
Well, as I asked in my reply to Juliet's message, is the identifier simply a sequential integer, or does it encode some metadata itself? Also, what metadata is recorded for each project, and is it in some normalized (machine-readable) form, such as Dublin Core?
And will it be pretty easy to associate a scan set with that DP identifier, or will it require some human intervention to make the association?
If the scan set knows its DP project id, then the association is there. If it doesn't, then someone will have to find out what it is. I'm not sure I understand the question.
I apologize for not being more precise in my question. How many scan sets (or maybe what percentage of scan sets) will require a human being to intervene to determine the associated DP project ID? Intervention includes actually looking at the title page scan and then using that information to manually look up which DP project it is associated with.
And does the DP identifier correlate to a PG text identifier?
The DP database maintains the relation. Mostly it's one-to-one, sometimes many-to-one, and rarely one-to-many or many-to-many. The DP database can handle the first two.
Juliet went into even gorier detail on this! But despite the complexity of the PG <--> DP id mappings, once a DP project ID is associated with a particular scan set, the association to a PG text can be traced by machine. Am I right on this?
Thanks.
Jon
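
To make the question concrete, here is a rough Python sketch of the kind of machine trace I have in mind. It is only an illustration under my own assumptions, not a description of how the DP database is actually organized: every identifier (dp-0001, pg-101, scans-a, ...) and every name (DP_TO_PG_PAIRS, SCAN_SET_TO_DP, trace_pg_texts, needs_human) is made up. The DP <--> PG relation is held as a plain list of pairs, which happens to cover all four of the cases mentioned above.

```python
from collections import defaultdict

# Hypothetical data: every identifier below is invented for illustration.
# Each entry is a (dp_project_id, pg_text_id) pair. A list of pairs can
# express one-to-one, many-to-one, one-to-many and many-to-many relations.
DP_TO_PG_PAIRS = [
    ("dp-0001", "pg-101"),  # one-to-one
    ("dp-0002", "pg-102"),  # many-to-one: two DP projects, one PG text
    ("dp-0003", "pg-102"),
    ("dp-0004", "pg-103"),  # one-to-many: one DP project, two PG texts
    ("dp-0004", "pg-104"),
]

# Scan sets and the DP project ID each one carries. None marks a scan set
# that does not know its ID, so a human would have to look it up.
SCAN_SET_TO_DP = {
    "scans-a": "dp-0001",
    "scans-b": "dp-0003",
    "scans-c": "dp-0004",
    "scans-d": None,
}


def build_index(pairs):
    """Group the pairs into a dict of dp_project_id -> set of pg_text_ids."""
    index = defaultdict(set)
    for dp_id, pg_id in pairs:
        index[dp_id].add(pg_id)
    return index


def trace_pg_texts(scan_set, scan_to_dp, dp_index):
    """Return the sorted PG text IDs for a scan set.

    Returns None when the scan set has no known DP project ID, which is
    the case that needs human intervention.
    """
    dp_id = scan_to_dp.get(scan_set)
    if dp_id is None:
        return None
    return sorted(dp_index.get(dp_id, ()))


def needs_human(scan_to_dp):
    """List the scan sets whose DP project ID is unknown."""
    return sorted(s for s, dp_id in scan_to_dp.items() if dp_id is None)


DP_INDEX = build_index(DP_TO_PG_PAIRS)
```

With this toy data, trace_pg_texts("scans-c", SCAN_SET_TO_DP, DP_INDEX) gives both PG texts for dp-0004, and needs_human(SCAN_SET_TO_DP) lists only scans-d. The share of scan sets that end up in that second list is the number I am asking about.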