Title: Forum for metadata schema implementers
1. Forum for metadata schema implementers
- SCHEMAS and the Semantic Web
- SCHEMAS Workshop
- Budapest, 10 May 2001
- Thomas Baker, GMD
2. Semantic Web vision
3. What is it?
- Standards-making activity of the World Wide Web Consortium (W3C)
- Making Web-accessible data easier to process
- Making machine-readable statements about all kinds of things (Web pages, organisations, people, concepts, products, etc.) and the links between them
- Agreeing on explicit terms for characterizing relationships between things described on the Web
4. Core architectural principles
- Simple linked data model
  - A Web link means "has something to do with"
- URIs: everything has a unique address
  - Both resources and the metadata terms used to describe them
- XML: universal file format
- XML namespaces: unique addresses for metadata vocabulary terms
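A minimal sketch of how a namespace prefix turns a short vocabulary term into a unique address. The prefix table and the `expand` helper are hypothetical names introduced for illustration; the two namespace URIs are the RDF and Dublin Core namespaces declared in the example on slide 8.

```python
# Hypothetical prefix table (illustration only).
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def expand(qname: str) -> str:
    """Turn a prefixed name such as 'dc:creator' into a full URI."""
    prefix, _, local = qname.partition(":")
    if not local or prefix not in NAMESPACES:
        raise KeyError(f"unknown or missing prefix in {qname!r}")
    return NAMESPACES[prefix] + local

print(expand("dc:creator"))  # http://purl.org/dc/elements/1.1/creator
```

Two projects that both write `dc:creator` therefore point at the same address, whatever local prefix they happen to choose.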
5. Why?
- Not only display Web data for people, but process it automatically (by software)
- To share data between programs and resources designed independently
- Essential trait of a massively distributed Web
- Incorporate, reuse, re-purpose data for unanticipated objectives
- Allow diverse communities to communicate on the basis of partial (imperfect) understanding
6. How?
- Create webs of information about related things using explicit statements
- Statements follow a common model and use machine-processable vocabularies
- URIs ensure that vocabulary terms are tied to unique definitions that everyone can find on the Web
7. "RDF"
- Basic model for making statements about
- Resources: anything named with a URI
- Description: stating the properties of the resource using terms named by URIs
- Framework: a common model (grammar) for statements using diverse vocabularies
- Statements most commonly written in XML
8. Example
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description about="http://docs">
    <dc:creator>Joe Smith</dc:creator>
  </rdf:Description>
</rdf:RDF>
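The example above can be read back as a single statement. The following is a rough sketch using Python's standard `xml.etree.ElementTree`; the `extract_statements` helper is a hypothetical name, it handles only literal-valued properties directly inside a Description, and it is not a full RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description about="http://docs">
    <dc:creator>Joe Smith</dc:creator>
  </rdf:Description>
</rdf:RDF>"""

def extract_statements(xml_text):
    """Return (subject, predicate, object) tuples for literal-valued properties."""
    root = ET.fromstring(xml_text)
    statements = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        # Accept the unqualified 'about' used on the slide as well as 'rdf:about'.
        subject = desc.get("about") or desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree spells tags as '{namespace}local'; join them into one URI.
            namespace, local = prop.tag[1:].split("}")
            statements.append((subject, namespace + local, (prop.text or "").strip()))
    return statements

for statement in extract_statements(RDF_XML):
    print(statement)
```

The predicate comes out as the full URI `http://purl.org/dc/elements/1.1/creator`, not the local prefix `dc`, which is what makes the statement comparable with statements written elsewhere.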
9. URIs: Everything defined uniquely on the Web
Tim Berners-Lee: The most fundamental specification of Web architecture, while one of the simpler, is that of the Universal Resource Identifier, or URI. The principle that anything, absolutely anything, on the Web should be identified distinctly is core.
10. URIs as anchors for merging data
- URIs are fixed points on the global Web for:
  - Metadata vocabulary terms, defined with "dictionary entries" in namespace schemas
  - The resources described by those terms
- These points can be used to superimpose graphs, merging statements
- Creates a market for aggregation, data merging, annotation, and filtering services
11. A set of statements
12. ...merged with others
Node and arc model can use multiple vocabularies, distinguished by URI
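A small sketch of what merging statements from two sources can look like when shared URIs act as the fixed points. Statements are modelled as plain (subject, predicate, object) tuples; every URI and value below is made up for illustration except the Dublin Core namespace, and `VCARD` is a placeholder rather than the real vCard namespace.

```python
DC = "http://purl.org/dc/elements/1.1/"
VCARD = "http://example.org/vcard#"  # placeholder namespace (assumption)

DOC = "http://example.org/doc1"
JOE = "http://example.org/people/joe"

# Two independently produced sets of statements.
source_a = {(DOC, DC + "creator", JOE)}
source_b = {
    (JOE, VCARD + "fn", "Joe Smith"),
    (DOC, DC + "title", "A report"),
}

# Because nodes and arcs are identified by URIs, merging is a set union.
merged = source_a | source_b

def follow(graph, subject, predicate):
    """Return all objects linked from a subject by a given predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]

# Cross a link that neither source could answer alone:
# the formatted name of the creator of the document.
creators = follow(merged, DOC, DC + "creator")
names = [name for c in creators for name in follow(merged, c, VCARD + "fn")]
print(names)  # ['Joe Smith']
```

The two vocabularies never need to know about each other; the shared URI for the person is enough to superimpose the graphs.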
13. URIs anchor concepts independently of language
[Figure: dc:creator, shown with DCMI Server, Server in Germany, Server in Jakarta]
14. Elementary sentences
- In RDF, meaning is expressed by Subject, Predicate, and Object
- This simple structure can be used to describe most of the data processed by machines
- More complex structures will not interoperate in the diverse Web environment
- Instead of asking machines to understand people's language, ask people to make the extra effort (Tim Berners-Lee)
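As an illustration only, the Subject, Predicate, Object structure can be written as a three-field record. The `Statement` type and its field names are assumptions made for this sketch, not terms defined by RDF.

```python
from typing import NamedTuple

class Statement(NamedTuple):
    subject: str    # URI of the resource being described
    predicate: str  # URI of the vocabulary term
    obj: str        # a URI or a literal value

s = Statement(
    subject="http://docs",
    predicate="http://purl.org/dc/elements/1.1/creator",
    obj="Joe Smith",
)
print(s.predicate)
```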
15. Partial understanding
- As in the real world, anyone on the Web can assert (in the RDF sense) anything about anything
- Assumption: you will understand some of these statements but not all
- Ignore the ones you don't understand
- Tolerance of inconsistency and errors
- The Web itself: "Error 404 File not found", but unchecked exponential growth
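One way to picture "ignore the ones you don't understand" is a consumer that keeps only statements whose predicate it recognises. This is a sketch under assumptions: the `understood` helper, the set of known predicates, and the sample statements are all invented for illustration.

```python
# Predicates this hypothetical application knows how to handle.
KNOWN_PREDICATES = {
    "http://purl.org/dc/elements/1.1/creator",
    "http://purl.org/dc/elements/1.1/title",
}

def understood(statements, known=KNOWN_PREDICATES):
    """Keep statements with a recognised predicate; skip the rest without failing."""
    return [st for st in statements if st[1] in known]

incoming = [
    ("http://example.org/doc1", "http://purl.org/dc/elements/1.1/creator", "Joe Smith"),
    ("http://example.org/doc1", "http://example.org/unknown#mood", "cheerful"),
]

usable = understood(incoming)
print(len(usable), "of", len(incoming), "statements understood")
```

The unknown statement is dropped rather than treated as an error, so the application keeps working as new vocabularies appear.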
16. Walk before we run
- W3C Semantic Web activity priorities
- Enable simple applications now
- Focus on general principles: Web automation, data aggregation, identity...
- ...while planning for future complexity
17. Pie-in-the-sky...
[Figure labels: Some day... maybe...; Longer-term goal; SCHEMAS Registry; This workshop]
18. "Schemas"
- Declare, like a dictionary, the names and definitions ("semantics") of vocabulary terms
- From official standards to project-specific schemas
- Printed documentation or Web pages
- Or machine-processable schemas in XML or XML/RDF
19. "Namespace schemas"
- One type of schema
- For officially declaring a unique set of elements and definitions
- Ideally, addressed on the Web with a URI
- May be an XML or XML/RDF schema
- In the SCHEMAS Project, namespace schemas by definition only "declare" names and definitions of vocabulary terms
20. "Application profiles"
- A type of schema
- Declare which vocabulary terms a particular application or project uses in its metadata
- May "mix and match" terms from different namespaces
- In the SCHEMAS Project, a Profile by definition only reuses terms defined somewhere in a namespace
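The rule that a profile only reuses terms declared in some namespace can be sketched as a simple check. Everything here is hypothetical: the dictionary layout, the `undeclared_terms` helper, the placeholder vCard URI, and the definitions, which are illustrative wording rather than official text.

```python
# Hypothetical namespace schemas: namespace URI -> {term name: definition}.
NAMESPACE_SCHEMAS = {
    "http://purl.org/dc/elements/1.1/": {
        "creator": "who made the resource (illustrative wording)",
        "subject": "what the resource is about (illustrative wording)",
    },
    "http://example.org/vcard#": {  # placeholder URI, not the real namespace
        "fn": "formatted name (illustrative wording)",
    },
}

def undeclared_terms(profile_terms, schemas=NAMESPACE_SCHEMAS):
    """Return the profile terms that no namespace schema declares."""
    missing = []
    for namespace, name in profile_terms:
        if name not in schemas.get(namespace, {}):
            missing.append((namespace, name))
    return missing

# A profile that mixes terms from two namespaces.
profile = [
    ("http://purl.org/dc/elements/1.1/", "creator"),
    ("http://example.org/vcard#", "fn"),
    ("http://example.org/vcard#", "nickname"),  # not declared above
]

print(undeclared_terms(profile))
```

An empty result would mean the profile stays within the rule; here the undeclared `nickname` term is reported.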
21. What profiles say
- Simple statements about the semantics used in a project or application
- My project http://project.org uses vcard:fn.
- My project uses vcard:fn in describing people.
- My project uses dcq:LCSH to qualify dc:Subject.
- My profile relabels cmu:conductor as "Dirigent" (in German).
22. For what purpose
- Statements can be merged for exploratory queries of an entire landscape of schemas
- What elements do projects use to describe people?
- What controlled vocabularies do projects use for Subject elements?
- Which projects use elements from the vcard namespace?
- How do geospatial projects use elements for Date?
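A sketch of one such exploratory query over merged profile statements. The statements are modelled as (project, relation, term) tuples; the "uses" relation name, the project URIs other than http://project.org, and the `projects_using_namespace` helper are assumptions made for illustration.

```python
# Merged profile statements from several hypothetical projects.
profile_statements = [
    ("http://project.org", "uses", "vcard:fn"),
    ("http://project.org", "uses", "dc:subject"),
    ("http://example.org/geo", "uses", "dc:date"),
    ("http://example.org/music", "uses", "cmu:conductor"),
]

def projects_using_namespace(statements, prefix):
    """Answer 'Which projects use elements from this namespace?'"""
    return sorted({
        project
        for project, relation, term in statements
        if relation == "uses" and term.startswith(prefix + ":")
    })

print(projects_using_namespace(profile_statements, "vcard"))  # ['http://project.org']
print(projects_using_namespace(profile_statements, "dc"))
```

Because every profile makes its statements in the same simple form, one query function covers projects that were designed independently.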
23. What RDF profiles do not say
- Not for describing the tag structure of complex documents (e.g. theses, order forms), as XML Schema does
- XML Schema is useful for validating metadata records ("Must have at least one, but no more than four, authors.")
- But nested structures are hard to merge
- Software must be hand-coded for a particular document type: "Unless you use my document schema, your software cannot use the data."
24. Existing schemas today
- Project schemas are often designed as documents
- "These are the fields we want to see"
- But if the schema was not designed with a well-defined data model, it may not have one
- Information nested in 100 different ways
- Conceptually "messy" or complexly nested schemas cannot be merged
25. Making schemas comparable
- Declaration in RDF: "Here's what our schema says"
- Uniform model makes schemas comparable
- Simple model: "walk before we run"
- Documentary and social purpose
- Define semantic coherence across applications
- Support schema design and harmonization
26. Thomas.Baker_at_gmd.de