Title: Use of RDF/OWL in Ingrid
1Use of RDF/OWL in Ingrid
M. Benno Blumenthal and John del Corral
International Research Institute for Climate and Society
http://iridl.ldeo.columbia.edu/ontologies/
2Why RDF?
- Make implicit semantics explicit
- Web-based system for interoperating semantics
- RDF/OWL is an emerging technology, so tools are
being built that help solve the semantic problems
in handling data
3Standard Metadata
Standard Metadata Schema/Data Services
Datasets
Tools
Users
4Many Data Communities
5Super Schema
Standard metadata schema
6Super Schema direct
Standard metadata schema/data service
7Flaws
- A lot of work
- Super Schema/Service is the Lowest-Common-Denominator
- Science keeps evolving, so that standards either fall behind or constantly change
8RDF Standard Data Model Exchange
[Diagram labels: Standard metadata schema; RDF (repeated six times)]
9RDF Data Model Exchange
Standard metadata schema
RDF
10Why is this better?
- Maps the original dataset metadata into a standard format that can be transported and manipulated
- Still the same impedance mismatch when mapped to the least-common-denominator standard metadata, but
- When a better standard comes along, the original complete-but-nonstandard metadata is already there to be remapped, and late semantic binding means everyone can use the new semantic mapping
- Can use enhanced mappings between models that have common concepts beyond the least-common-denominator
- EASIER tools to enhance the mapping process, mappings build on other mappings
11RDF Architecture
Virtual (derived) RDF
12Example Search Interface
Additional Semantics
Dataset Ontology
Search Ontology
Datasets
Search Interface
Users
13Sample Tool: Faceted Search
http://iridl.ldeo.columbia.edu/ontologies/query2.pl?...
14Distinctive Features of the search
- Search terms are interrelated
- Terms that describe the set of returns are displayed (spanning and not)
- Returned items also have structure (sub-items and superseded items are not shown)
15Architectural Features of the search
http://iridl.ldeo.columbia.edu/ontologies/query2.pl
- Multiple search structures possible
- Multiple languages possible
- Search structure is kept in the database, not in
the code
16RDF framework for writing connections
- Triplets of
- Subject
- Property (or Predicate)
- Object
- URIs identify things, i.e. most of the above
- Namespaces are used as a convenient shorthand for
the URIs
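As a minimal sketch of the points above, a triple can be modeled as a (subject, property, object) tuple, and a namespace prefix as shorthand that expands to a full URI. The `dc` and `rdfs` namespace URIs are the published ones; the `iridl` URI, the `expand` helper, and the placement of WOA and Grid-1x1 in the `iridl` namespace are placeholders assumed for illustration, not the Data Library's actual definitions.

```python
# Namespace prefixes are shorthand for URI stems.
NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    # Placeholder URI, assumed for illustration only.
    "iridl": "http://example.org/iridl#",
}


def expand(qname: str) -> str:
    """Expand a prefixed name such as 'dc:title' into a full URI."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in NAMESPACES:
        raise ValueError(f"unknown prefix in {qname!r}")
    return NAMESPACES[prefix] + local


# A triple is (subject, property, object); here all three are URIs.
triple = (
    expand("iridl:WOA"),
    expand("iridl:isContainerOf"),
    expand("iridl:Grid-1x1"),
)

print(expand("dc:title"))
```

Keeping the prefix table separate from the triples means the same statements can be written compactly and still resolve to globally unique identifiers.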
17Datatype Properties
- WOA dc:title "NOAA NODC WOA01"
- WOA dc:description "NOAA NODC WOA01: World Ocean Atlas 2001, an atlas of objectively analyzed fields of major ocean parameters at monthly, seasonal, and annual time scales. Resolution: 1x1; Longitude: global; Latitude: global; Depth: 0 m,5500 m; Time: Jan,Dec; monthly"
18Object Properties
WOA iridl:isContainerOf Grid-1x1,
Grid-1x1 iridl:isContainerOf Monthly
19WOA01 diagram
20Standard Properties
- WOA dcterm:hasPart Grid-1x1,
- Grid-1x1 dcterm:hasPart MONTHLY
- Alternatively
- WOA iridl:isContainerOf Grid-1x1,
- iridl:isContainerOf rdfs:subPropertyOf dcterm:hasPart
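The alternative above relies on the RDFS sub-property rule: if P is a sub-property of Q, every statement made with P also holds with Q. A minimal sketch of that inference follows, using plain tuples with prefixed names rather than a real triple store; the function name `infer_subproperty` is an assumption made for this example.

```python
SUBPROPERTY = "rdfs:subPropertyOf"

# The specific statements plus the one sub-property declaration.
triples = {
    ("WOA", "iridl:isContainerOf", "Grid-1x1"),
    ("Grid-1x1", "iridl:isContainerOf", "MONTHLY"),
    ("iridl:isContainerOf", SUBPROPERTY, "dcterm:hasPart"),
}


def infer_subproperty(graph):
    """Return graph plus the triples implied by rdfs:subPropertyOf."""
    result = set(graph)
    changed = True
    while changed:  # repeat until no new triple appears
        changed = False
        parents = {(s, o) for s, p, o in result if p == SUBPROPERTY}
        for s, p, o in list(result):
            for child, parent in parents:
                if p == child and (s, parent, o) not in result:
                    result.add((s, parent, o))
                    changed = True
    return result


closed = infer_subproperty(triples)
for t in sorted(closed - triples):
    print(t)
```

With this rule in place, only the more specific `iridl:isContainerOf` statements need to be stored; the standard `dcterm:hasPart` statements are derived.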
21Data Structures in RDF
SST rdf:type cfatt:non_coordinate_variable,
SST cfobj:standard_name cf:sea_surface_temperature,
SST netcdf:hasDimension longitude
- Object properties provide a framework for explicitly writing down relationships between data objects/components, e.g. vague meaning of nesting is made explicit
- Properties also can be related, since they are objects too
22Virtual Triples
- Use Conventions to connect concepts to established sets of concepts
- Generate additional virtual triples from the original set and semantics
- RDFS: some property/class semantics
- OWL: additional property/class semantics, more sophisticated (ontological) relationships
- SWRL: rules for constructing virtual triples
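One way to picture virtual triples is a small forward-chaining loop: rules with variables are matched against the stored triples, and whatever they conclude is added until nothing new appears. The sketch below is an illustration only, not the Sesame/SWRL machinery itself; the class `ex:Variable` and all function names are hypothetical, and the two rules are the standard RDFS sub-class and sub-property rules.

```python
def match(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings or None."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):  # variable: bind it or check the earlier binding
            if new.setdefault(p, t) != t:
                return None
        elif p != t:  # constant: must match exactly
            return None
    return new


def apply_rule(premises, conclusion, graph):
    """Yield the conclusion for every way all premises hold in graph."""
    candidates = [{}]
    for pattern in premises:
        candidates = [
            b2
            for b in candidates
            for t in graph
            if (b2 := match(pattern, t, b)) is not None
        ]
    for b in candidates:
        yield tuple(b.get(term, term) for term in conclusion)


def virtual_triples(graph, rules):
    """Forward-chain rules to a fixpoint; return only the derived triples."""
    closed = set(graph)
    while True:
        new = {t for prem, concl in rules
               for t in apply_rule(prem, concl, closed)} - closed
        if not new:
            return closed - set(graph)
        closed |= new


RULES = [
    # Sub-class rule: members of a class are members of its superclass.
    ([("?c", "rdfs:subClassOf", "?d"), ("?x", "rdf:type", "?c")],
     ("?x", "rdf:type", "?d")),
    # Sub-property rule: a statement with P also holds with its super-property.
    ([("?p", "rdfs:subPropertyOf", "?q"), ("?s", "?p", "?o")],
     ("?s", "?q", "?o")),
]

stored = {
    ("SST", "rdf:type", "cfatt:non_coordinate_variable"),
    # Hypothetical superclass, assumed for illustration.
    ("cfatt:non_coordinate_variable", "rdfs:subClassOf", "ex:Variable"),
    ("WOA", "iridl:isContainerOf", "Grid-1x1"),
    ("iridl:isContainerOf", "rdfs:subPropertyOf", "dcterm:hasPart"),
}

derived = virtual_triples(stored, RULES)
```

Here `derived` holds only the virtual triples, which keeps the distinction between what was written down and what was inferred.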
23OWL
- Language for expressing ontologies, i.e. the semantics are very important. However, even without a reasoner to generate the implied RDF statements, OWL classes and properties are a more sophisticated version of RDF Schema
- There are many world views on how to express concepts, though: concepts as classes vs concepts as individuals vs concepts as predicates
24Define terms
- Attribute Ontology
- Object Ontology
- Term Ontology
25Attribute Ontology
- Subjects are the only object-type resources
- Predicates are attributes
- Objects are datatype values
- Isomorphic to simple data tables
- Isomorphic to netcdf attributes of datasets
- Some faceted browsers: predicate = facet
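The table isomorphism can be sketched directly: each row is a subject, each column an attribute (predicate), and each cell a literal value. The helper names below are assumptions for this example, and the sketch assumes each attribute has at most one value per subject.

```python
def row_to_triples(subject, row):
    """Turn one table row (column -> value) into attribute triples."""
    return {(subject, attribute, value) for attribute, value in row.items()}


def triples_to_rows(triples):
    """Rebuild the table: one row per subject, one column per predicate."""
    rows = {}
    for subject, attribute, value in triples:
        rows.setdefault(subject, {})[attribute] = value
    return rows


row = {"dc:title": "NOAA NODC WOA01", "nc:Conventions": "CF-1.0"}
triples = row_to_triples("WOA", row)
table = triples_to_rows(triples)
```

The attribute `nc:Conventions` and its value are illustrative stand-ins for a netcdf global attribute; only the `dc:title` value comes from the slides.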
26Object Ontology
- Objects are object-type
- Isomorphic to "belongs to"
- Isomorphic to multiple data tables connected by keys
- Express the concept behind netcdf attributes which name variables
- Concepts as objects can be cross-walked
- Concepts as objects can be interrelated
27Example controlled vocabulary
- variable cfatt:standard_name string
- Where string has to belong to a list of possibilities.
- variable cfobj:standard_name stdnam
- Where stdnam is an individual of the class cfobj:StandardName
28Example controlled vocabulary
- Bidirectional crosswalk between the two is somewhat trivial, which means all my objects will have both
- cfatt:standard_name
- and
- cfobj:standard_name
29Example controlled vocabulary
- If I am writing software to read/write netcdf files, I use the cfatt ontology and in particular cfatt:standard_name
- If I am making connections/cross-walks to other variable naming standards, I use
- cfobj:standard_name
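The bidirectional crosswalk between the attribute form and the object form can be sketched as a pair of inverse functions over a controlled list. The two names in `STANDARD_NAMES` are real CF standard names but stand in for the full table; the `cf:` prefix for individuals and the function names are assumptions made for this example.

```python
# Tiny stand-in for the CF standard name table (the controlled list).
STANDARD_NAMES = {"sea_surface_temperature", "air_temperature"}


def att_to_obj(name):
    """Map an attribute string to the individual that represents it."""
    if name not in STANDARD_NAMES:
        raise ValueError(f"{name!r} is not in the controlled list")
    return "cf:" + name


def obj_to_att(individual):
    """Map an individual back to the plain attribute string."""
    prefix, _, local = individual.partition(":")
    if prefix != "cf" or local not in STANDARD_NAMES:
        raise ValueError(f"{individual!r} is not a known standard name")
    return local


def crosswalk(graph):
    """Add the missing form so every variable carries both properties."""
    out = set(graph)
    for s, p, o in graph:
        if p == "cfatt:standard_name":
            out.add((s, "cfobj:standard_name", att_to_obj(o)))
        elif p == "cfobj:standard_name":
            out.add((s, "cfatt:standard_name", obj_to_att(o)))
    return out


both = crosswalk({("SST", "cfatt:standard_name", "sea_surface_temperature")})
```

Software that reads or writes netcdf files can keep using the string form, while cross-walks to other naming standards attach to the individual.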
30Term Ontology
- Concepts as individuals
- Simple Knowledge Organization System (SKOS) is a prime example
- The ontology used here is slightly different: facets are classes of terms rather than being top_concepts
31Nuanced tagging
- Concepts as objects can be interrelated: specific terms imply broader terms
- Objects end up being tagged with terms ranging from general to specific.
- Search can then be nuanced
- Tagging can proceed in the absence of perfect information
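The idea that specific terms imply broader ones can be sketched as a closure over an implication table: tag an object with the most specific term known, and the broader tags follow. The terms and the table below are hypothetical examples, not the Data Library's actual term ontology.

```python
from collections import deque

# Hypothetical implication table: each term maps to its directly implied terms.
DIRECTLY_IMPLIES = {
    "sea_surface_temperature": {"temperature", "ocean"},
    "temperature": {"physical_property"},
}


def all_terms(tags, implies=DIRECTLY_IMPLIES):
    """Return the tags plus every broader term they imply, transitively."""
    seen = set(tags)
    queue = deque(tags)
    while queue:
        term = queue.popleft()
        for broader in implies.get(term, ()):
            if broader not in seen:
                seen.add(broader)
                queue.append(broader)
    return seen


specific = all_terms({"sea_surface_temperature"})
vague = all_terms({"temperature"})
```

A dataset tagged only as `temperature` is still found by a search on `physical_property`, so tagging can proceed with partial knowledge and be refined later.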
32Mapping to Object Oriented Programming
33Faceted Search Explicated
34Search Interface
- Items (datasets/maps)
- Terms
- Facets
- Taxa
35Search Interface Semantic API
- item dc:title dc:description rss:link iridl:icon
- dcterm:isPartOf item2
- dcterm:isReplacedBy item2
- item trm:isDescribedBy term
- term a facet of taxa of trm:Term,
- facet a trm:Facet, taxa a trm:Taxa,
- term trm:directlyImplies term2
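A faceted search over this API can be sketched with plain tuples: items are selected by the terms that describe them, sub-items and superseded items are hidden, and the terms of the remaining items are grouped by facet. The items, terms, and facet names below are hypothetical, and hiding an item only when its parent or replacement is also in the result set is an assumption of this sketch rather than documented behavior.

```python
graph = {
    ("item1", "trm:isDescribedBy", "sst"),
    ("item1", "trm:isDescribedBy", "monthly"),
    ("item2", "trm:isDescribedBy", "sst"),
    ("item2", "dcterm:isReplacedBy", "item1"),
    ("item3", "trm:isDescribedBy", "monthly"),
    ("item3", "dcterm:isPartOf", "item1"),
    ("item4", "trm:isDescribedBy", "precipitation"),
    ("item4", "trm:isDescribedBy", "monthly"),
    # "term a facet": each term is an instance of its facet class.
    ("sst", "rdf:type", "Variable"),
    ("precipitation", "rdf:type", "Variable"),
    ("monthly", "rdf:type", "TimeStep"),
}


def search(graph, selected):
    """Return (visible items, facet -> terms describing those items)."""
    described = {}
    for s, p, o in graph:
        if p == "trm:isDescribedBy":
            described.setdefault(s, set()).add(o)
    # An item matches when it carries every selected term.
    matching = {i for i, terms in described.items() if selected <= terms}
    # Hide sub-items and superseded items whose parent is also returned.
    hidden = {
        s for s, p, o in graph
        if p in ("dcterm:isPartOf", "dcterm:isReplacedBy")
        and s in matching and o in matching
    }
    visible = matching - hidden
    facet_of = {s: o for s, p, o in graph if p == "rdf:type"}
    facets = {}
    for item in visible:
        for term in described[item]:
            if term in facet_of:
                facets.setdefault(facet_of[term], set()).add(term)
    return visible, facets


items, facets = search(graph, {"monthly"})
```

Because the facet structure lives in the triples, changing the search structure means changing data, not code.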
36Faceted Search w/Queries
http://iridl.ldeo.columbia.edu/ontologies/query2.pl?...
37RDF Architecture
Virtual (derived) RDF
38IRI RDF Architecture
Data Servers
MMI
Ontologies
JPL
Start Point
bibliography
Standards Organizations
RDF Crawler
Location Canonicalizer
RDFS Semantics
OWL Semantics
SWRL Rules
SeRQL CONSTRUCT
Time Canonicalizer
Sesame
Search Queries
Search Interface
39Cast of Characters
- NC: netcdf data file format
- CF: Climate and Forecast metadata convention for netcdf
- SWEET - Semantic Web for Earth and Environmental Terminology (OWL Ontology)
- IRIDL: IRI Data Library
40CF attributes
NC basic attributes
IRIDL attributes/objects
CF data objects
CF Standard Names (RDF object)
SWEET Ontologies (OWL)
Location
IRIDL Terms
CF Standard Names As Terms
SWEET as Terms
Search Terms
Gazetteer Terms
41Thoughts
- Pure RDF framework seems currently viable for a moderate collection of data
- Potential for making a lot of implicit data conventions explicit
- Explicit conventions can improve interoperability
- Simple RDF concepts can greatly impact searches
42Future Work Possibilities
- More Usable Search Interface
- Tagging Interface that uses tag interrelationships to simplify choices
- Data Format translation using semantics
- Related Object Browsing: given a dataset, find related data, papers, images
- Document/execute/create analysis trees
- Stovepipe conventions/bash-to-fit
- Less Monolithic IRI Data Library
43Implications for Curator/Metafor
- Reproducibility implies complete metadata
- Non-standard complete metadata just needs to be mapped to more standard schemes
- A multiple-scheme system like RDF retains reproducibility even with partial mapping to standards
- Should be able to measure the misfit: find the space of the unexplained, which gives guidance for developing standards.
44Stovepipe Conventions
- Fixed Schema
- Agreed upon metadata domain
- Agreed upon data domain
- Designed to be a partial solution
- General server software needs to decide whether data legitimately fits the standard
- User contemplates bash-to-fit