Title: The Metadata Landscape: Conventions for Semantics, Syntax, and Structure in the Internet Commons
1 The Metadata Landscape: Conventions for Semantics, Syntax, and Structure in the Internet Commons
Interfaces to Scientific Data Archives, Pasadena, California
- Stuart Weibel
- Senior Research Scientist
- OCLC Office of Research
- March 24-27, 1998
2 Outline of Today's Talk
- Motivations for developing new conventions for resource description
- The Dublin Core Metadata Initiative: semantics for resource description
- The Resource Description Framework: encoding and transport for Web metadata
3 Metadata: structured data about data
A resource description community is characterized
by common semantic, structural, and syntactic
conventions for exchange of resource description
information
4 The Internet Commons embraces many formal and
informal Resource Description Communities
5 Interoperability requires conventions about
- Semantics
  - The meaning of the elements
- Structure
  - human-readable
  - machine-parseable
- Syntax
  - grammars to convey semantics and structure
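The three layers above can be sketched in a few lines of Python. This is only an illustration, not a convention from the talk: the record values are placeholders, "Creator" is shorthand for the Author/Creator element, and the `DC.` prefix on `<meta>` names follows the common practice for embedding Dublin Core in HTML. The element names carry the semantics, the mapping from element to a list of values is the structure, and the two serializers are two syntaxes for the same description.

```python
from html import escape

# Semantics: the element names and what they mean (names taken from the talk).
# Structure: each element maps to a list of values.
# The values below are illustrative placeholders.
record = {
    "Title": ["The Metadata Landscape"],
    "Creator": ["Stuart Weibel"],
    "Date": ["1998"],
}


def to_html_meta(rec):
    """Syntax A: one HTML <meta> tag per value."""
    return "\n".join(
        '<meta name="DC.%s" content="%s">' % (element, escape(value, quote=True))
        for element, values in rec.items()
        for value in values
    )


def to_text(rec):
    """Syntax B: plain 'Element: value' lines."""
    return "\n".join(
        "%s: %s" % (element, value)
        for element, values in rec.items()
        for value in values
    )


def from_text(text):
    """Parse syntax B back into the shared structure."""
    rec = {}
    for line in text.splitlines():
        element, _, value = line.partition(": ")
        rec.setdefault(element, []).append(value)
    return rec


if __name__ == "__main__":
    print(to_html_meta(record))
    print(to_text(record))
```

Two applications that agree on the semantics and structure can exchange this record even if each prefers a different syntax, provided a parser such as `from_text` exists for the syntax on the wire.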
6 The Dublin Core Metadata Workshop Series
- How to improve resource discovery on the Web?
  - simple resource description semantics
- Build an interdisciplinary consensus about a core element set for resource discovery
  - simple and intuitive
  - cross-disciplinary
  - international
  - flexible
7 Metadata Workshop Series and
Related Events
- Chicago WWW Conference, Oct 1994
- OCLC/NCSA Metadata Workshop, Mar 1995
- OCLC/UKOLN Warwick Workshop, April 1996
- W3C Dist. Indexing and Searching, May 1996
- CNI/OCLC Image Metadata Workshop, Sep 1996
- DC-4, Canberra, Australia, Mar 1997
- Meta Access Summit at RLG, July 1997
- EPA Metadata Registry Workshop, July 1997
- DC-5, Helsinki, Finland, Oct 1997
8 The Dublin Core Metadata Element Set
- Title
- Author/Creator
- Subject/Keywords
- Description
- Publisher
- Other Contributor
- Date
- Resource Type
- Format
- Resource Identifier
- Source
- Language
- Relation
- Coverage
- Rights Management
9 Central Characteristics of the Dublin Core Metadata Element Set
- Descriptive metadata for resource discovery (15 elements)
- All elements optional
- All elements repeatable
- Extensible (a starting place for richer description)
- Interdisciplinary (semantic interoperability)
- International (10 languages and growing)
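As a rough sketch of what "all elements optional, all elements repeatable" means for a data structure, the Python class below stores a list of values per element and accepts an empty record. The short element identifiers are an assumption made for brevity (they map to the labels on the element-set slide), and `DCRecord` is a hypothetical name, not part of any Dublin Core specification.

```python
from collections import defaultdict

# Short identifiers (an assumption for this sketch) mapped to the slide's labels.
DC_ELEMENTS = {
    "Title": "Title",
    "Creator": "Author/Creator",
    "Subject": "Subject/Keywords",
    "Description": "Description",
    "Publisher": "Publisher",
    "Contributor": "Other Contributor",
    "Date": "Date",
    "Type": "Resource Type",
    "Format": "Format",
    "Identifier": "Resource Identifier",
    "Source": "Source",
    "Language": "Language",
    "Relation": "Relation",
    "Coverage": "Coverage",
    "Rights": "Rights Management",
}


class DCRecord:
    """A hypothetical record type: every element is optional and repeatable."""

    def __init__(self):
        self._values = defaultdict(list)

    def add(self, element, value):
        if element not in DC_ELEMENTS:
            raise KeyError("not a Dublin Core element: %r" % (element,))
        self._values[element].append(value)  # repeatable: values accumulate

    def get(self, element):
        if element not in DC_ELEMENTS:
            raise KeyError("not a Dublin Core element: %r" % (element,))
        return list(self._values.get(element, []))  # optional: may be empty


if __name__ == "__main__":
    rec = DCRecord()
    rec.add("Subject", "metadata")
    rec.add("Subject", "resource discovery")
    print(rec.get("Subject"), rec.get("Title"))
```

Nothing in the class requires any element to be present, so an empty `DCRecord` is still valid; that is the design consequence of making the whole set optional.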
10 Extensibility
- Refined semantics (Ukrainian Doll model)
  - improve sharpness of description with qualifiers that refine semantics (controlled vocabularies, encoding standards)
- Extended semantics (Lego block model)
  - additional elements
  - complementary packages of metadata (administrative, rights management, discipline-specific, etc.)
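One way to picture the two extensibility models is the Python sketch below. It is an assumption-laden illustration, not a defined Dublin Core mechanism: `Statement`, `unqualified`, `core_view`, the package names, the `UsageTerms` element, and the `ExampleThesaurus` scheme are all hypothetical. A qualifier (here, a scheme) sharpens a value without changing which element it belongs to, so a consumer that ignores the qualifier still gets a usable description; extra packages sit beside the core without altering it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    element: str                   # base element, e.g. "Date"
    value: str
    scheme: Optional[str] = None   # qualifier: controlled vocabulary or encoding standard


def unqualified(statements):
    """Refined semantics: drop the qualifiers, keep element and value."""
    return [(s.element, s.value) for s in statements]


# Extended semantics: complementary packages stored side by side.
# "rights" and its element name are hypothetical.
description = {
    "core": [
        Statement("Date", "1998-03-24", scheme="ISO 8601"),
        Statement("Subject", "metadata", scheme="ExampleThesaurus"),
        Statement("Title", "The Metadata Landscape"),
    ],
    "rights": [Statement("UsageTerms", "example terms")],
}


def core_view(desc):
    """What an application that understands only the core package would see."""
    return unqualified(desc.get("core", []))


if __name__ == "__main__":
    print(core_view(description))
```

An application that knows about schemes can interpret `1998-03-24` as an ISO 8601 date; one that does not still sees a Date value and can ignore the `rights` package entirely.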
11 What might Extensibility mean for Scientific Data Communities?
- High-level descriptors to describe data sets
- Use of domain-specific schemes
  - Schemes to refine the semantics of Subject, Description, Format, Relation, Coverage
  - Controlled vocabularies
- There is no magic: if you couldn't play the piano before your operation, you won't be able to after.
12 Steps Toward Standardization
- IETF informational RFCs of Dublin Core semantics and syntax for unqualified DC
- NISO standardization initiated
- ISO standardization is under consideration as well
13 DC Implementation Projects
- 50 major implementation projects in 10 countries (see the DC home page)
- Australian Government Locator Service
- Danish Online Government Information and Danish National Bibliography
- Corporate interest in document management semantics
  - e.g. Boeing, Ford, Nokia
- Metadata for Digital Object Identifiers (DOIs)
14 Why consider the Dublin Core?
- You have a rich standard, need a simple one (probably for cost reasons)
- You want to reveal your data to other communities (via the Web) using commonly understood semantics
- You want to provide unified access to databases with different underlying schemas
- You need core description semantics and don't feel compelled to invent them anew
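The "unified access to databases with different underlying schemas" case can be sketched as a crosswalk: each database keeps its own field names, and a small mapping projects its rows onto shared core elements. Everything named below (the two archives, their field names, the sample rows, and the helper functions) is hypothetical and chosen only to show the idea.

```python
# Hypothetical crosswalks from two local schemas to shared core elements.
CROSSWALKS = {
    "archive_a": {
        "dataset_name": "Title",
        "principal_investigator": "Creator",
        "obs_date": "Date",
    },
    "archive_b": {
        "name": "Title",
        "author": "Creator",
        "keywords": "Subject",
    },
}


def to_core(source, row):
    """Project one local row onto core elements; unmapped fields are skipped."""
    mapping = CROSSWALKS[source]
    record = {}
    for field, value in row.items():
        element = mapping.get(field)
        if element is None:
            continue  # local-only field with no core equivalent
        values = value if isinstance(value, list) else [value]
        record.setdefault(element, []).extend(values)
    return record


def search(records, element, text):
    """Case-insensitive substring search over one core element."""
    needle = text.lower()
    return [
        r for r in records
        if any(needle in v.lower() for v in r.get(element, []))
    ]


if __name__ == "__main__":
    records = [
        to_core("archive_a", {"dataset_name": "Solar Wind Survey",
                              "principal_investigator": "A. Example",
                              "instrument_id": "X-1"}),
        to_core("archive_b", {"name": "Wind Tunnel Logs",
                              "author": "B. Example",
                              "keywords": ["wind", "aerodynamics"]}),
    ]
    print(search(records, "Title", "wind"))
```

A single query on `Title` reaches both archives even though neither stores a field by that name; the richer local fields (such as `instrument_id` here) remain available in the source database.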
15 Resource Description Framework: An architecture for metadata on the Web
- W3C Initiative (Formal Working Group)
- Conventions to support interoperability among applications that exchange metadata
- Syntax expressed in XML
- Semantics defined by others
- Promote expression of semantics in syntax that can be processed by machines as well as humans
16 RDF Working Groups
- Model and Syntax Working Group (baked)
  - conventions for encoding arbitrary varieties of metadata (semantics defined by others)
- Schemas Working Group (half-baked)
  - conventions for defining interoperable schemas
- Search Protocols (looking for the recipe)
  - conventions for indexing and searching protocols
17 The RDF Data Model
18 RDF (pseudo) Syntax Example
<?XML:Namespace HREF="http://purl.org/RDF/RDFCore" AS="RDF"?>
<?XML:Namespace HREF="http://oclc.org/DublinCore" AS="DC"?>
<RDF:Description HREF="http://purl.oclc.org/metadata/dublin_core">
  <DC:Title> ... </DC:Title>
  <DC:Creator> ... </DC:Creator>
  <DC:Date> ... </DC:Date>
  <DC:Subject> ... </DC:Subject>
  <DC:Subject> ... </DC:Subject>
</RDF:Description>
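For readers who want something runnable, the sketch below builds a comparable description with Python's standard `xml.etree.ElementTree`. It is not the syntax on the slide: it assumes the RDF/XML and Dublin Core namespace URIs that were standardized after this talk, the property values are placeholders, and `describe` and `triples` are hypothetical helper names. It also shows the underlying data model, in which each nested element becomes a (resource, property, value) statement.

```python
import xml.etree.ElementTree as ET

# Namespace URIs from the later W3C and Dublin Core specifications
# (an assumption: they differ from the ones shown on the slide).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe(resource_url, properties):
    """Build an rdf:RDF tree with one Description of resource_url."""
    root = ET.Element("{%s}RDF" % RDF_NS)
    desc = ET.SubElement(
        root, "{%s}Description" % RDF_NS, {"{%s}about" % RDF_NS: resource_url}
    )
    for element, value in properties:
        child = ET.SubElement(desc, "{%s}%s" % (DC_NS, element))
        child.text = value
    return root


def triples(root):
    """Read the tree back as (resource, property URI, value) statements."""
    out = []
    for desc in root.findall("{%s}Description" % RDF_NS):
        subject = desc.get("{%s}about" % RDF_NS)
        for child in desc:
            # child.tag has the form "{namespace}localname"
            namespace, _, local = child.tag[1:].partition("}")
            out.append((subject, namespace + local, child.text))
    return out


if __name__ == "__main__":
    root = describe(
        "http://purl.oclc.org/metadata/dublin_core",
        [("title", "Example title"),
         ("subject", "metadata"),
         ("subject", "resource discovery")],
    )
    print(ET.tostring(root, encoding="unicode"))
    for statement in triples(root):
        print(statement)
```

Because element names are qualified by namespace, properties from other element sets could be added to the same `Description` without colliding with the Dublin Core ones.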
19 RDF: Why is it important?
- Market demand for deployment
- Software infrastructure will be ubiquitous
- RDF provides a model and syntactical framework for metadata in this infrastructure
- Will support independently developed and maintained metadata element sets
  - e.g. MARC, DC, TEI, EAD, CIMI, GILS, IAFA, content ratings...
20 XML-Data: complementary or competitive?
- W3C Technical Note released in January 1998
- Provides for strong data typing within XML
- Semantic and syntactic schemas within XML (overlapping RDF functionality)
- Tim Berners-Lee (Feb 10, 1998): "The relationship between the roles of XML level (structural) schemas and RDF level (semantic) schemas is not yet clear."
21 Finally...
- We will have means for expressing highly structured data and metadata on the Web
- Tools will be integrated into Web infrastructure for creating and managing metadata (possibly this year)
- The foundations for extensible semantics are in place
- The biggest challenge is to promote consistent deployment
22 Additional Information on Dublin Core and RDF
- Dublin Core Homepage
  - http://purl.org/metadata/dublin_core
- RDF Working Group Home Page
  - http://www.w3.org/RDF