Title: An abstract model for DCMI metadata descriptions
1. An abstract model for DCMI metadata descriptions
DC Usage Board meeting at DC2003, Seattle
September/October 2003
- Andy Powell
- a.powell_at_ukoln.ac.uk
- UKOLN, University of Bath, UK
- http://www.ukoln.ac.uk/
UKOLN is supported by
2. I am going to
- assume people have read the current Abstract Model working draft
- propose a revised (more generic) abstract model
- look at some of the issues that have been raised
- encourage discussion of the revised model and the issues
- consider what happens next with the abstract model document
3. Major issues
- why develop an abstract model?
- what is qualified DC? why limit to DCMI properties?
- what is a record?
- what is simple DC? why limit to DCMES?
- what is a value?
- where does DCSV fit in?
- relationship to application profiles?
- relationship to RDF?
- abstract model and dumb-down?
4. Why?
- non-syntax-based view of what constitutes a DC metadata description
- need to understand what kinds of descriptions we are trying to encode
- best done without reference to any particular syntax
- allows us to compare and contrast the capabilities of different encodings
  - syntax X supports feature Y but syntax Z doesn't
- supports better mappings between syntaxes
5. What is qualified DC?
- general feeling that limiting abstract model for qualified DC to DCMI properties is too limiting
- real world applications typically go beyond this
- therefore, need to re-model at more generic level
- DCMI Abstract Model
6. DCMI abstract model
- a description is made up of one or more properties and their associated values
- each property is an attribute of the resource being described
- properties may be repeated
- a record is a set of descriptions about one or more related resources
therefore each description is about one, and only one, resource (the 1:1 principle)
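The description/record structure above can be sketched as a small data structure. This is only an illustrative sketch, not part of the working draft: the class names, the `example.org` identifiers and the `my:` prefix are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Description:
    """A description of exactly one resource (the 1:1 principle)."""
    resource: str  # hypothetical identifier of the resource being described
    # (property, value) pairs; a list, so a property may be repeated
    statements: list = field(default_factory=list)

    def add(self, prop: str, value: str) -> None:
        self.statements.append((prop, value))


@dataclass
class Record:
    """A set of descriptions about one or more related resources."""
    descriptions: list = field(default_factory=list)

    def resources(self) -> set:
        # Each description contributes exactly one resource.
        return {d.resource for d in self.descriptions}


# One description per resource: the document and its creator are kept apart.
doc = Description("http://example.org/doc")
doc.add("dc:creator", "Andy Powell")
doc.add("dc:subject", "metadata")
doc.add("dc:subject", "abstract models")  # repeated property

author = Description("http://example.org/people/andy")
author.add("my:name", "Andy Powell")

record = Record([doc, author])
```

Keeping `resource` as a single field on `Description` is what enforces the 1:1 principle here: a second resource needs a second description in the same record.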
7. DCMI abstract model (2)
- each value is a resource
- each value may be denoted by a value string
- each value string may have an associated encoding scheme
- each encoding scheme is identified by an encoding scheme URI
- each value string may have an associated language (e.g. en-GB)
a value string is a simple, human-readable string
8. DCMI abstract model (3)
- each value may be identified by a value URI
- each value may have an associated rich value (some marked-up text, an image, a video, some audio, etc. or some combination thereof)
- each value may have some associated related metadata
related metadata is a description of a related resource, e.g. metadata about the person who is the creator of a document
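The facets of a value listed on the last two slides could be gathered into one structure along the following lines. This is a sketch under assumptions: the field names, the choice of `bytes` for a rich value, the `dict` for related metadata and the `example.org` URI are all illustrative, not taken from the working draft.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Value:
    """A value resource and the optional ways of pointing at it."""
    value_string: Optional[str] = None         # simple, human-readable string
    language: Optional[str] = None             # e.g. "en-GB"
    encoding_scheme_uri: Optional[str] = None  # identifies the encoding scheme
    value_uri: Optional[str] = None            # identifies the value itself
    rich_value: Optional[bytes] = None         # marked-up text, image, audio...
    related_metadata: Optional[dict] = None    # description of the value resource

    def __post_init__(self):
        # Language and encoding scheme qualify the value string,
        # so they make no sense without one.
        if self.value_string is None and (self.language or self.encoding_scheme_uri):
            raise ValueError("language and encoding scheme need a value string")


creator = Value(
    value_string="Andy Powell",
    language="en-GB",
    value_uri="http://example.org/people/andy",  # hypothetical value URI
    related_metadata={"my:email": "a.powell_at_ukoln.ac.uk"},
)
```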
9. What is a record?
- a record is a set of descriptions about one or more related resources, e.g.
  - a description of a resource and a description of its creator
  - a description of a resource, a rights statement about the resource and a description of the description
- note: a description is about a single resource and is made up of one or more properties and their associated values
10. What is a value?
- a value is the physical or conceptual entity that is associated with a property when it is used to describe a resource
  - a person (physical)
  - an organisation (physical)
  - a subject (conceptual)
  - a country (physical)
  - a type (conceptual)
  - etc.
- therefore, in the abstract model, a value is always a resource
11. A value is always a resource
- in the DCMI abstract model, a value is always a resource
- the value resource may
  - be identified by a value URI
  - be denoted by a string value and/or a rich value
  - have some associated related metadata
- but the value is always a resource!
- I think this has an impact on the RDF encodings??
12. But some problems
- some problems with wording of existing DCMES definitions
  - CCP element values defined to be a resource
  - relation, identifier and source defined to be a reference to a resource
  - rights defined to be either a resource or a link to a service that provides a resource
- problem: too much of the model is embedded into the definition!
13. What is qualified DC?
- a qualified DC record is
  - any record that
    - conforms to the DCMI abstract model
    - contains a description that uses at least one DCMI term
however, this means that it is probably not possible to define a single XML schema for qualified DC records, but we can provide a template XML schema
14. What is simple DC?
- a simple DC record is
  - any record that
    - conforms to the DCMI abstract model
    - comprises only a single description
    - uses only properties taken from DCMES
    - makes no use of value URIs, encoding schemes, rich values or related metadata
15. or to put it differently
- a simple DC record is made up of a single description
- that description is made up of one or more properties and their associated values
- each property is an attribute of the resource being described
- each property must be one of the 15 DCMES elements
- properties may be repeated
- each value is denoted by a value string
- each value string may have an associated language (e.g. en-GB)
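The simple DC constraints listed above lend themselves to a mechanical check. The sketch below assumes a hypothetical in-memory layout (a record is a list of descriptions, a description is a list of statement dicts); only the 15 DCMES element names are taken as given.

```python
# The 15 DCMES elements.
DCMES = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Features of the abstract model that simple DC makes no use of.
DISALLOWED = ("value_uri", "encoding_scheme", "rich_value", "related_metadata")


def is_simple_dc(record):
    """Return True if `record` meets the simple DC constraints.

    Assumed layout: record = [description, ...] and
    description = [{"property": ..., "value_string": ..., ...}, ...].
    """
    if len(record) != 1:             # a single description only
        return False
    statements = record[0]
    if not statements:               # one or more properties
        return False
    for s in statements:
        if s.get("property") not in DCMES:
            return False
        if not s.get("value_string"):   # every value is denoted by a string
            return False
        if any(s.get(k) is not None for k in DISALLOWED):
            return False
    return True


simple = [[
    {"property": "title", "value_string": "An abstract model", "language": "en-GB"},
    {"property": "creator", "value_string": "Andy Powell"},
    {"property": "creator", "value_string": "A. N. Other"},  # repetition is fine
]]
not_simple = [[
    {"property": "subject", "value_string": "D08.586.682.075.400",
     "encoding_scheme": "dcterms:MESH"},
]]
```

Here `is_simple_dc(simple)` holds, while `not_simple` fails because it carries an encoding scheme.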
16. or to put it differently
- simple DC is an application profile that only uses terms taken from the DCMES
17. simple DC and value URIs
- all values in simple DC are denoted using only a value string
- the value string can be a URI
- but there is nothing to formally indicate that the value string is a URI
- simple DC software applications may choose to guess which value strings are URIs and which aren't
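As an illustration of the kind of guessing a simple DC application might do, here is one possible heuristic. It is a sketch only: the list of schemes and the "no whitespace" rule are assumptions chosen for the example, and any such guess can be wrong in both directions.

```python
from urllib.parse import urlparse

# Assumed set of schemes worth recognising; a real application would tune this.
KNOWN_SCHEMES = {"http", "https", "ftp", "mailto", "urn", "doi", "info"}


def looks_like_uri(value_string: str) -> bool:
    """Guess whether a simple DC value string is meant as a URI."""
    s = value_string.strip()
    # Human-readable strings usually contain spaces; URIs do not.
    if not s or any(c.isspace() for c in s):
        return False
    try:
        parsed = urlparse(s)
    except ValueError:
        # urlparse rejects some malformed inputs (e.g. bad IPv6 brackets).
        return False
    return parsed.scheme in KNOWN_SCHEMES


candidates = [
    "http://www.ukoln.ac.uk/",
    "Andy Powell",
    "D08.586.682.075.400",
    "urn:isbn:0451450523",
]
guessed = [s for s in candidates if looks_like_uri(s)]
```

Restricting the guess to a known scheme list avoids treating strings such as `title:subtitle` as URIs, at the cost of missing less common schemes.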
18. Simple DC and audience
- why isn't dcterms:audience included in simple DC?
- because single namespace is simpler than multiple namespaces
  - dc:xxx and dcterms:xxx
- because static definition is simpler than one that grows over time
  - audience
- because, arguably, audience not part of the core
  - the t-shirt problem
19. Abstract model and DCSV?
- DCSV provides mechanism for encoding markup in value string
- thus DCSV runs slightly counter to the abstract model
- DCSV better handled as related metadata
  - e.g. Period provides related metadata about a conceptual period in time
- impact? XML encoding good, string encoding bad?
- suggest no new proposals based on DCSV for the time being
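To make the "DCSV as related metadata" idea concrete, the sketch below splits a DCSV-style string into named components and treats them as related metadata about the value, leaving a plain value string behind. It is deliberately simplified and rests on assumptions: backslash escaping and unlabelled components are ignored, the function names are hypothetical, and using the `name` component as the value string is just one possible choice.

```python
def parse_dcsv(value_string):
    """Parse 'name=value; name=value' components into a dict.

    Simplified: no backslash escapes, unlabelled components are skipped.
    """
    components = {}
    for part in value_string.split(";"):
        name, sep, value = part.partition("=")
        if sep and name.strip():
            components[name.strip()] = value.strip()
    return components


def dcsv_to_related_metadata(value_string):
    """Return (plain value string, related metadata or None)."""
    components = parse_dcsv(value_string)
    if not components:
        # Not structured: keep the string as an ordinary value string.
        return value_string, None
    # Assumption: the 'name' component, if present, is the readable label.
    return components.get("name", value_string), components


# A Period-style string describing a conceptual period in time.
period = "name=The Great Depression; start=1929; end=1939;"
label, related = dcsv_to_related_metadata(period)
```

After this step the value string is an ordinary human-readable label, and the start and end dates travel as a description of the period rather than as markup inside the string.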
20. What is a DCAP?
- a Dublin Core Application Profile (as currently defined) declares the properties and encoding schemes used to construct a description as used within a particular application
- problems
  - DCAPs don't currently cover the whole abstract model
  - DCAPs define what a description is but most applications need defining at the record level
21. RDF vs. abstract model
- what is the relationship between RDF and the abstract model?
- RDF provides richest encoding syntax currently
  - full encoding of all features of the model
- but expect to see model fully implemented in XML as well
- (expect HTML syntax to always be a partial implementation)
22. Dumb-down
- intelligent vs. dumb, element vs. value
- element dumb-down (dumb)
  - ignore anything that isn't DCMES/an element
- element dumb-down (intelligent)
  - resolve sub-properties until you get to DCMES/an element
- value dumb-down (dumb)
  - use value URI or value string as value string
- value dumb-down (intelligent)
  - use knowledge of related metadata, or value string to create new value string
  - resolve sub-classes/broader terms
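The element and value dumb-down variants above can be sketched as follows. This is an illustration under assumptions: the sub-property table is a tiny hand-written excerpt, the dict layout for a value is hypothetical, and preferring the value string over the value URI in the dumb case is a choice the slide leaves open. Intelligent value dumb-down is not attempted here.

```python
DCMES = {"dc:" + e for e in (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)}

# Small excerpt of sub-property relationships, for illustration only.
SUBPROPERTY_OF = {
    "dcterms:abstract": "dc:description",
    "dcterms:created": "dc:date",
    "dcterms:alternative": "dc:title",
}


def element_dumb_down(prop, intelligent=False):
    """Return the DCMES element for `prop`, or None if it must be ignored."""
    seen = set()
    while prop not in DCMES:
        # Dumb: ignore anything that is not already a DCMES element.
        # Intelligent: walk up the sub-property chain (guarding against loops).
        if not intelligent or prop not in SUBPROPERTY_OF or prop in seen:
            return None
        seen.add(prop)
        prop = SUBPROPERTY_OF[prop]
    return prop


def value_dumb_down(value):
    """Dumb value dumb-down: fall back to the value URI if there is no string."""
    return value.get("value_string") or value.get("value_uri")


def dumb_down(prop, value, intelligent=False):
    """Reduce one statement to a (DCMES element, value string) pair, or None."""
    element = element_dumb_down(prop, intelligent)
    string = value_dumb_down(value)
    if element is None or string is None:
        return None
    return element, string


statement = ("dcterms:abstract", {"value_string": "A revised abstract model."})
dumb = dumb_down(*statement)
smart = dumb_down(*statement, intelligent=True)
```

With this table, the dumb variant drops the `dcterms:abstract` statement entirely, while the intelligent variant keeps it as a `dc:description`.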
23. sub-properties and classes
- RDFS and human-readable declarations of DCMI terms refer to sub-properties and sub-classes
- however, these don't formally appear in the abstract model (except as part of dumb-down)
- where do these fit into the model?
- I think they belong in the grammatical principles document
25. Example 1: dc:creator
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www."
         xmlns:rdfs="http://www.w3.org/"
         xmlns:dc="http://purl.org/dc/"
         xmlns:my="http://purl.org">
  <rdf:Description>
    <dc:creator>
      <rdf:Description>
        <rdf:value>Andy Powell</rdf:value>
        <my:email>a.powell_at_ukoln.ac.uk</my:email>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
Example RDF description using dc:creator
26. Example 1: dc:creator
[Diagram: RDF graph for Example 1, with arcs labelled dc:creator, rdfs:label, my:name, my:email and my:affiliation]
and the RDF model it represents.
27. Example 1: dc:creator
[Diagram: the same RDF graph, now with a relatedMetadata label]
But we don't want to embed all this information into every instance metadata record, do we?
28. Example 1: dc:creator
[Diagram: the same RDF graph as on slide 26]
Need to separate part of the information out and store it in a single place, in this case in a directory service
29. Example 1: dc:creator
[Diagram: the same RDF graph, now with valueURI labels]
To do this we need to assign a URI (the valueURI) to the anonymous value node
30. Example 1: dc:creator
[Diagram: the same RDF graph, now with valueURI and relatedMetadataURI labels]
The document containing this information is itself an RDF resource (the relatedMetadata) and has a URI
31. Example 1: dc:creator
[Diagram: the same RDF graph, now with valueURI and relatedMetadataURI labels and an rdfs:seeAlso arc]
Use rdfs:seeAlso to form linkage between description and relatedMetadata
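The end state of Example 1 can be sketched with plain (subject, predicate, object) tuples. Everything under `example.org` (the resource, the value URI, the related metadata URI and the `my` namespace) is a hypothetical stand-in, and attaching rdfs:seeAlso to the value node is an assumption about how the linkage is drawn.

```python
DC = "http://purl.org/dc/elements/1.1/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
MY = "http://example.org/my#"  # hypothetical namespace for my:name etc.

RESOURCE = "http://example.org/doc"                          # hypothetical
VALUE_URI = "http://example.org/people/andy"                 # hypothetical valueURI
RELATED_METADATA_URI = "http://example.org/people/andy.rdf"  # hypothetical

# The instance metadata record: small, and free of directory detail.
instance_triples = [
    (RESOURCE, DC + "creator", VALUE_URI),
    (VALUE_URI, RDFS + "label", "Andy Powell"),
    (VALUE_URI, RDFS + "seeAlso", RELATED_METADATA_URI),
]

# The related metadata, stored once (e.g. by a directory service).
directory_documents = {
    RELATED_METADATA_URI: [
        (VALUE_URI, MY + "name", "Andy Powell"),
        (VALUE_URI, MY + "email", "a.powell_at_ukoln.ac.uk"),
        (VALUE_URI, MY + "affiliation", "UKOLN, University of Bath, UK"),
    ],
}


def related_metadata_for(node, triples, documents):
    """Follow rdfs:seeAlso from `node` and collect the triples found there."""
    found = []
    for subject, predicate, obj in triples:
        if subject == node and predicate == RDFS + "seeAlso":
            found.extend(documents.get(obj, []))
    return found


details = related_metadata_for(VALUE_URI, instance_triples, directory_documents)
```

Because both documents use the same value URI as subject, an application that follows the rdfs:seeAlso link can merge the two sets of triples back into the single graph shown on the earlier slides.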
32. Example 2: dc:subject
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www."
         xmlns:rdfs="http://www.w3.org/"
         xmlns:dc="http://purl.org/dc/"
         xmlns:dcterms="http://purl.org">
  <rdf:Description>
    <dc:subject>
      <dcterms:MESH>
        <rdf:value>D08.586.682.075.400</rdf:value>
        <rdfs:label>Formate Dehydrogenase</rdfs:label>
      </dcterms:MESH>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>
Example RDF description using dc:subject (taken from Qualified DC in RDF recommendation)
33. Example 2: dc:subject
[RDF/XML as on slide 32]
[Diagram: RDF graph for Example 2, with arcs labelled dc:subject, rdfs:label, rdf:value and rdf:type, and a dcterms:MESH class node]
and the RDF model it represents.
34. Example 2: dc:subject
[RDF/XML as on slide 32]
[Diagram: the same RDF graph, now with a relatedMetadata label]
But we don't want to embed all this information into every instance metadata record, do we?
35. Example 2: dc:subject
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www."
         xmlns:rdfs="http://www.w3.org/"
         xmlns:dc="http://purl.org/dc/"
         xmlns:dcterms="http://purl.org">
  <dcterms:MESH>
    <rdf:value>D08.586.682.075.400</rdf:value>
    <rdfs:label>Formate Dehydrogenase</rdfs:label>
  </dcterms:MESH>
</rdf:RDF>
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www."
         xmlns:rdfs="http://www.w3.org/"
         xmlns:dc="http://purl.org/dc/"
         xmlns:dcterms="http://purl.org">
  <rdf:Description>
    <dc:subject>
      <dcterms:MESH>
        <rdf:value>D08.586.682.075.400</rdf:value>
      </dcterms:MESH>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>
[Diagram: the RDF graph split in two, each part with rdf:type arcs to dcterms:MESH; arcs labelled dc:subject and rdfs:label]
Need to separate part of the information out and store it in a single place, in this case with the terminology owner
36. Example 2: dc:subject
[RDF/XML documents as on slide 35]
[Diagram: the same split RDF graph, now with valueURI labels]
To do this we need to assign a URI (the valueURI) to the anonymous value node
37. Example 2: dc:subject
[RDF/XML documents as on slide 35]
[Diagram: the same split RDF graph, now with valueURI and relatedMetadataURI labels]
The document containing this information is itself an RDF resource (the relatedMetadata) and has a URI
38. Example 2: dc:subject
[RDF/XML documents as on slide 35]
[Diagram: the same split RDF graph, now with valueURI and relatedMetadataURI labels and an rdfs:seeAlso arc]
Use rdfs:seeAlso to form linkage between description and relatedMetadata
39. Abstract DC model
[RDF/XML documents as on slide 35]
[Diagram: the same split RDF graph, annotated with the abstract model terms resource, property, valueURI, valueString (valueStringLang), encodingScheme, relatedMetadata and relatedMetadataURI]
In terms of abstract DC model we now have resource, property, valueURI, valueString (and valueStringLang), encodingScheme, relatedMetadata