Title: Developing Ontologies and more
1 Developing Ontologies (and more)
- Peter Fox (NCAR)
- ESIP Winter Meeting (TIWG)
- January 9, 2008, Washington, D.C.
2 Ontology Spectrum
Figure: the ontology spectrum, from least to most expressive
- Catalog/ID
- Terms/glossary
- Thesauri "narrower term" relation
- Informal is-a
- Formal is-a
- Formal instance
- Frames (properties)
- Value restrictions
- Selected logical constraints (disjointness, inverse, ...)
- General logical constraints
Originally from the AAAI 1999 Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; updated by McGuinness. Description in www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
3 Ontology - declarative knowledge
- The triple subject-predicate-object
- interferometer is-a optical instrument
- Fabry-Perot is-a interferometer
- Optical instrument has focal length
- Optical instrument is-a instrument
- Instrument has instrument operating mode
- Data archive has measured parameter
- SO2 concentration is-a concentration
- Concentration is-a parameter
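The triples above can be sketched in a few lines of Python. This is only an illustration, not how an RDF store works internally: the tuple layout, the lower-cased term spellings, and the helper names `ancestors` and `inherited_properties` are assumptions made for the example.

```python
# Triples from the slide, stored as (subject, predicate, object) tuples.
# Term spellings are lower-cased here so that subjects and objects match.
TRIPLES = [
    ("interferometer", "is-a", "optical instrument"),
    ("Fabry-Perot", "is-a", "interferometer"),
    ("optical instrument", "has", "focal length"),
    ("optical instrument", "is-a", "instrument"),
    ("instrument", "has", "instrument operating mode"),
    ("data archive", "has", "measured parameter"),
    ("SO2 concentration", "is-a", "concentration"),
    ("concentration", "is-a", "parameter"),
]


def ancestors(term, triples=TRIPLES):
    """Return every class reachable from `term` by following is-a links."""
    found = []
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "is-a" and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found


def inherited_properties(term, triples=TRIPLES):
    """Collect `has` properties declared on `term` or on any of its ancestors."""
    classes = [term] + ancestors(term, triples)
    return sorted(o for s, p, o in triples if p == "has" and s in classes)


if __name__ == "__main__":
    print(ancestors("Fabry-Perot"))
    print(inherited_properties("Fabry-Perot"))
```

Following the is-a chain is what lets a Fabry-Perot pick up "focal length" and "instrument operating mode" without those facts being stated for it directly.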
4 Semantic Web Layers
http://www.w3.org/2003/Talks/1023-iswc-tbl/slide26-0.html, http://flickr.com/photos/pshab/291147522/
5 Terminology
- Ontology (n.d.). The Free On-line Dictionary of Computing. http://dictionary.reference.com/browse/ontology
- An explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them.
- Semantic Web
- An extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation, www.semanticweb.org
- Primer: http://www.ics.forth.gr/isl/swprimer/
- Languages
- OWL 1.0 (Lite, DL, Full) - Web Ontology Language (W3C)
- RDF - Resource Description Framework (W3C)
- OWL-S/SWSL - Web Services (W3C)
- WSMO/WSML - Web Services (EC/W3C)
- SWRL - Semantic Web Rule Language, RIF - Rule Interchange Format
- Editors: Protégé, SWOOP, CoE, VOM, Medius, SWeDE, ...
6 OWL and RDF
- OWL
- Lite
- DL
- Full
- RDF
- Services
- OWL-S
- SWSL
- WSML
- SAWSDL - (WSDL-S)
- Rules
- SWRL
7 Developing Ontologies
- Approach
- Bottom-up
- Top-down (upper-level or foundational)
- Mid-level (use case)
- Using tools
- Coding and testing
- Iterating
- Maintaining and evolving (curation, preservation)
8 GRDDL - bottom up
- GRDDL - Gleaning Resource Descriptions from Dialects of Languages
- Pretty much XML/XHTML (for example) into RDF via XSLT
- Good support, e.g. Jena
- Handles microformats
- Active community
- How to categorize, use, re-use (parts of)?
9 Collecting
- RDFa extends XHTML by
- extending link and meta to include child elements
- adding metadata to any element (a bit like the class in micro-formats, but via dedicated properties)
- It is very similar to micro-formats, but with more rigor
- it is a general framework (instead of an agreement on the meaning of, say, a class attribute value)
- terminologies can be mixed more easily
- ATOM (used with RSS)
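To make the RDFa idea of "dedicated properties" concrete, here is a deliberately simplified sketch that pulls triples out of a small XHTML fragment. It is not a conformant RDFa processor: it only looks at the `about`, `property` and `content` attributes, it leaves prefixed names such as `dc:title` unexpanded, and the fragment and its example.org URL are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment carrying RDFa-style attributes.
XHTML = """
<div xmlns:dc="http://purl.org/dc/elements/1.1/" about="http://example.org/talk">
  <span property="dc:title">Developing Ontologies</span>
  <span property="dc:creator">Peter Fox</span>
</div>
"""


def glean(xhtml):
    """Return (subject, property, value) triples found in the markup."""
    triples = []

    def walk(element, subject):
        # An `about` attribute sets the subject for this element and its children.
        subject = element.get("about", subject)
        prop = element.get("property")
        if prop is not None:
            # Prefer an explicit `content` attribute, else use the element text.
            value = element.get("content", (element.text or "").strip())
            triples.append((subject, prop, value))
        for child in element:
            walk(child, subject)

    walk(ET.fromstring(xhtml), None)
    return triples


if __name__ == "__main__":
    for triple in glean(XHTML):
        print(triple)
```

Because the property names are qualified by a vocabulary prefix rather than agreed class values, terms from several vocabularies can sit side by side in the same page.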
10 Foundational Ontologies
- CONTENTS
- General concepts and relations that apply in all domains
- physical object, process, event, ..., inheres, participates, ...
- Rigorously defined
- formal logic, philosophical principles, highly structured
- Examples
- DOLCE, BFO, GFO, SUMO, CYC, (Sowa)
Courtesy Boyan Brodaric
11 Foundational Ontologies
PURPOSE: help integrate domain ontologies
Courtesy Boyan Brodaric
12 Foundational Ontologies
PURPOSE: help organize domain ontologies
Courtesy Boyan Brodaric
13 Problem scenario
- Little work done on linking foundational ontologies with geoscience ontologies
- Such linkage might benefit various scenarios requiring cross-disciplinary knowledge, e.g.
- water budgets: groundwater (geology) and surface water (hydro)
- hazards risk: hazard potential (geology, geophysics) and items at threat (infrastructure, people, environment, economic)
- health: toxic substances (geochemistry) and people, wildlife
- many others
Courtesy Boyan Brodaric
14 DOLCE
15 DOLCE + SWEET
- Benefits
- full coverage
- rich relations
- home for orphans
- single superclasses
- Issues
- individuals (e.g. Planet Earth)
- roles (contaminant)
- features (SeaFloor)
Courtesy Boyan Brodaric
16 Conclusions
- Surprisingly good fit amongst ontologies
- so far no show-stopper conflicts, a few difficult conflicts
- DOLCE richness benefits geoscience ontologies
- good conceptual foundation helps clear some existing problems
- Unresolved issues in modeling science entities
- modeling classifications, interpretations, theories, models, ...
- Same procedure with GeoSciML
Courtesy Boyan Brodaric
17 SUMO - Suggested Upper Merged Ontology
- Physical
- Object
- SelfConnectedObject
- ContinuousObject
- CorpuscularObject
- Collection
- Process
- Abstract
- SetOrClass
- Relation
- Proposition
- Quantity
- Number
- PhysicalQuantity
- Attribute
20 Using SNAP/SPAN
21 GeoSciOnt?
23 Using SWEET
- Plug-in (import) domain detailed modules
- Lots of classes, few relations (properties)
24 Mix-n-Match
- The IRI example
- Collect a lot of different ontologies representing different terms, levels of concepts, etc. into a base form: RDF
- See Benno's talk in session 1b.
- MMI
- Others
25 CF attributes
Figure (Blumenthal), diagram labels: NC basic attributes; IRIDL attributes/objects; CF data objects; CF Standard Names (RDF object); SWEET Ontologies (OWL); Location; IRIDL Terms; CF Standard Names As Terms; SWEET as Terms; Search Terms; Gazetteer Terms
26 IRI RDF Architecture
Figure (Blumenthal), diagram labels: Data Servers; MMI; Ontologies; JPL; Start Point; bibliography; Standards Organizations; RDF Crawler; Location Canonicalizer; Time Canonicalizer; RDFS Semantics, OWL Semantics, SWRL Rules, SeRQL CONSTRUCT; Sesame; Search Queries; Search Interface
27 Mid-Level: Developing ontologies
- Use cases and small team (7-8: 2-3 domain experts, 2 knowledge experts, 1 software engineer, 1 facilitator, 1 scribe)
- Identify classes and properties (leverage controlled vocab.)
- Start with narrower terms, generalize when needed or possible
- Adopt a suitable conceptual decomposition (e.g. SWEET)
- Import modules when concepts are orthogonal
- Review, vet, publish
- Only code them (in RDF or OWL) when needed (CMAP, ...)
- Ontologies: small and modular
28 Use Case example
- Plot the neutral temperature from the Millstone-Hill Fabry Perot, operating in the vertical mode during January 2000 as a time series.
- Objects
- Neutral temperature is a (temperature is a) parameter
- Millstone Hill is a (ground-based observatory is a) observatory
- Fabry-Perot is a interferometer is a optical instrument is a instrument
- Vertical mode is a instrument operating mode
- January 2000 is a date-time range
- Time is a independent variable/coordinate
- Time series is a data plot is a data product
29 Class and property example
- Parameter
- Has coordinates (independent variables)
- Observatory
- Operates instruments
- Instrument
- Has operating mode
- Instrument operating mode
- Has measured parameters
- Date-time interval
- Data product
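One way to read the class and property list above is as a small schema in which each property points from one class to another. The sketch below encodes that reading and searches for a chain of properties between two classes. The camelCase property names, the `IndependentVariable` class name, and the assumption that each property's value is the class named in the bullet are illustrative choices, not part of the original ontology.

```python
from collections import deque

# class -> {property name: class of the property's value}
# Names are hypothetical spellings of the slide's bullets.
SCHEMA = {
    "Parameter": {"hasCoordinates": "IndependentVariable"},
    "Observatory": {"operatesInstrument": "Instrument"},
    "Instrument": {"hasOperatingMode": "InstrumentOperatingMode"},
    "InstrumentOperatingMode": {"hasMeasuredParameter": "Parameter"},
    "DateTimeInterval": {},
    "DataProduct": {},
}


def property_path(start, goal, schema=SCHEMA):
    """Breadth-first search for a chain of properties linking two classes."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        cls, path = queue.popleft()
        if cls == goal:
            return path
        for prop, target in schema.get(cls, {}).items():
            if target not in seen:
                seen.add(target)
                queue.append((target, path + [prop]))
    return None  # no chain of properties connects the two classes


if __name__ == "__main__":
    print(property_path("Observatory", "Parameter"))
```

A search like this shows whether the ontology can already answer a use case (getting from an observatory to the parameters it measures) or whether a property is still missing, as it is here for Data product.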
33 Higher level use case
- Find data which represents the state of the neutral atmosphere above 100km, toward the arctic circle at any time of high geomagnetic activity
34 Translating the Use-Case - non-monotonic?
GeoMagneticActivity has ProxyRepresentation
GeophysicalIndex is a ProxyRepresentation (in Realm of Neutral Atmosphere)
Kp is a GeophysicalIndex
hasTemporalDomain daily
hasHighThreshold xsd_number 8
Date/time when Kp > 8
Specification needed for query to CEDARWEB: Instrument, Parameter(s), Operating Mode, Observatory, Date/time, Return-type: data
- Input
- Physical properties: State of neutral atmosphere
- Spatial
- Above 100km
- Toward arctic circle (above 45N)
- Conditions
- High geomagnetic activity
- Action: Return Data
35 Translating the Use-Case - ctd.
NeutralAtmosphere is a subRealm of TerrestrialAtmosphere
hasPhysicalProperties NeutralTemperature, Neutral Wind, etc.
hasSpatialDomain 0,360,0,180,100,150
hasTemporalDomain
NeutralTemperature is a Temperature (which) is a Parameter
FabryPerotInterferometer is a Interferometer, (which) is a Optical Instrument (which) is a Instrument
hasFilterCentralWavelength Wavelength
hasLowerBoundFormationHeight Height
ArcticCircle is a GeographicRegion
hasLatitudeBoundary
hasLatitudeUpperBoundary
GeoMagneticActivity has ProxyRepresentation
GeophysicalIndex is a ProxyRepresentation (in Realm of Neutral Atmosphere)
Kp is a GeophysicalIndex
hasTemporalDomain daily
hasHighThreshold xsd_number 8
Date/time when Kp > 8
Specification needed for query to CEDARWEB: Instrument, Parameter(s), Operating Mode, Observatory, Date/time, Return-type: data
Input: Physical properties: State of neutral atmosphere; Spatial: Above 100km, Toward arctic circle (above 45N); Conditions: High geomagnetic activity; Action: Return Data
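The translation of "high geomagnetic activity" into "dates when Kp exceeds its high threshold" can be sketched as follows. The threshold of 8, the 100 km altitude floor and the 45N latitude floor come from the slides; the daily Kp values are invented for illustration, and the query dictionary is a hypothetical layout, not the actual CEDARWEB interface.

```python
from datetime import date

KP_HIGH_THRESHOLD = 8    # from the slide: hasHighThreshold 8
MIN_ALTITUDE_KM = 100    # from the slide: above 100km
MIN_LATITUDE_DEG = 45    # from the slide: toward arctic circle (above 45N)

# Hypothetical daily Kp values, invented for this example only.
DAILY_KP = {
    date(2000, 1, 10): 3,
    date(2000, 1, 11): 9,
    date(2000, 1, 12): 8,
    date(2000, 1, 13): 9,
}


def high_activity_dates(daily_kp, threshold=KP_HIGH_THRESHOLD):
    """Return the sorted dates whose Kp is strictly greater than the threshold."""
    return sorted(day for day, kp in daily_kp.items() if kp > threshold)


def build_query(daily_kp):
    """Assemble a query specification (hypothetical field names)."""
    return {
        "parameter": "NeutralTemperature",
        "min_altitude_km": MIN_ALTITUDE_KM,
        "min_latitude_deg": MIN_LATITUDE_DEG,
        "dates": high_activity_dates(daily_kp),
        "return_type": "data",
    }


if __name__ == "__main__":
    print(build_query(DAILY_KP))
```

The point of the sketch is that the user never mentions Kp: the ontology supplies the proxy index and its threshold, and only then does the request become a concrete date filter.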
36 Tools - Using Protégé
37 Creating Ontologies - visual
- UML - new release of ODM/MOF
- Ontology Definition Metamodel/Meta Object Facility (OMG) for UML
- Provides standardized notation
- CMAP Ontology Editor (concept mapping tool from IHMC)
- Drag/drop visual development of classes, subclass (is-a) and property relationships
- Reads and writes OWL
- Formal convention (OWL/RDF tags, etc.)
- White board, text file
38 Using CMAP/COE
40 Is OWL the only option? No
- SKOS - Simple Knowledge Organization System
- Annotations (RDFa)
- Atom
- Natural Language (read results from a web search and transform to a usable form)
- CL (Common Logic)
- Rabbit, e.g. ShellfishCourse is a Meal Course that (if has drink) always has drink Potable Liquid that has Full body and which either has Moderate or Strong flavour
- PENG (Processable English)
41 Is OWL the only option II? No
- Natural Language (NL)
- Read results from a web search and transform to a usable form
- Find/filter out inconsistencies, concepts/relations that cannot be represented
- Popular options
- CLCE (Common Logic Controlled English)
- Rabbit, e.g. ShellfishCourse is a Meal Course that (if has drink) always has drink Potable Liquid that has Full body and which either has Moderate or Strong flavour
- PENG (Processable English)
- Really need PSCI - process-able science
42 Creating Ontologies - verbal
- Translating use cases
- E.g. Find data which represents the state of the neutral atmosphere above 100km, toward the arctic circle at any time of high geomagnetic activity
- Can this be expressed as an ontology?
- CLCE, Rabbit, PENG, Sydney syntax
- Notice something about the next examples?
43 Sydney syntax
- If X has Y as a father then Y is the only father of X.
- The class person is equivalent to male or female, and male and female are mutually exclusive.
- equivalent to
- The classes male and female are mutually exclusive. The class person is fully defined as anything that is a male or a female.
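The three Sydney syntax statements above correspond to a functional property, a disjointness axiom and a class defined as a union. A rough way to see what each one demands is to check it against a handful of individuals using plain Python sets. The individuals are invented, and checking a fixed list of individuals is a closed-world simplification: an OWL reasoner works under the open-world assumption and would treat these statements as axioms rather than as tests.

```python
def mutually_exclusive(a, b):
    """True if the two classes share no members (disjointness)."""
    return a.isdisjoint(b)


def fully_defined_as_union(cls, *parts):
    """True if `cls` has exactly the members of the union of `parts`."""
    return cls == set().union(*parts)


def functional(pairs):
    """True if no subject is paired with two different values."""
    seen = {}
    for subject, value in pairs:
        # setdefault records the first value seen for each subject.
        if seen.setdefault(subject, value) != value:
            return False
    return True


# Hypothetical individuals, invented for illustration.
male = {"Bob", "Tom"}
female = {"Ann"}
person = {"Ann", "Bob", "Tom"}
has_father = [("Ann", "Bob"), ("Tom", "Bob")]

if __name__ == "__main__":
    print(mutually_exclusive(male, female))
    print(fully_defined_as_union(person, male, female))
    print(functional(has_father))
```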
44 PENG - Processable English
- If X is a research programmer then X is a programmer.
- Bill Smith is a research programmer who works at the CLT.
- Who is a programmer and works at the CLT?
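The PENG example above has the shape rule, fact, query. A minimal sketch of how the question gets answered is to apply the rule to the facts until nothing new is derived, then filter. This is a hand-rolled illustration of forward chaining over triples, not PENG's actual processing pipeline; the predicate names `is-a` and `works-at` are assumptions.

```python
# Facts from the second sentence, as (subject, predicate, object) triples.
FACTS = {
    ("Bill Smith", "is-a", "research programmer"),
    ("Bill Smith", "works-at", "CLT"),
}

# Rule from the first sentence: if X is-a premise then X is-a conclusion.
RULES = [("research programmer", "programmer")]


def forward_chain(facts, rules):
    """Apply the is-a rules repeatedly until no new fact is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for subject, predicate, obj in list(derived):
                new_fact = (subject, "is-a", conclusion)
                if predicate == "is-a" and obj == premise and new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


def who(facts, cls, workplace):
    """Answer: who is a `cls` and works at `workplace`?"""
    return sorted(
        s for s, p, o in facts
        if p == "is-a" and o == cls and (s, "works-at", workplace) in facts
    )


if __name__ == "__main__":
    print(who(forward_chain(FACTS, RULES), "programmer", "CLT"))
```

Without the rule, the stated facts alone do not answer the question, since nothing says directly that Bill Smith is a programmer.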
45 CLCE - Common Logic Controlled English
- CLCE: If a set x is the set of (a cat, a dog, and an elephant), then the cat is an element of x, the dog is an element of x, and the elephant is an element of x.
- PC: (∀x:Set)(∀x1:Cat)(∀x2:Dog)(∀x3:Elephant)(Set(x,x1,x2,x3) ⊃ (x1∈x ∧ x2∈x ∧ x3∈x))
46 Use Case
- Provide a decision support capability for an analyst to determine an individual's susceptibility to avian flu without having to be precise in terminology (-nyms)
49 Using ThManager
50 Services
- Ontologies of services provide answers to:
- What does the service provide for prospective clients? The answer to this question is given in the "profile," which is used to advertise the service. To capture this perspective, each instance of the class Service presents a ServiceProfile.
- How is it used? The answer to this question is given in the "process model." This perspective is captured by the ServiceModel class. Instances of the class Service use the property describedBy to refer to the service's ServiceModel.
- How does one interact with it? The answer to this question is given in the "grounding." A grounding provides the needed details about transport protocols. Instances of the class Service have a supports property referring to a ServiceGrounding.
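The three-part structure described above (a Service that presents a profile, is described by a process model, and supports a grounding) can be mirrored with plain Python dataclasses. This is only a structural sketch: the field names inside each part and the sample plotting service are invented, and a real service description would be written in OWL rather than Python.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServiceProfile:
    """What the service provides; used to advertise it."""
    advertises: str


@dataclass
class ServiceModel:
    """How the service is used (the process model)."""
    steps: List[str]


@dataclass
class ServiceGrounding:
    """How to interact with the service, e.g. transport protocol details."""
    protocol: str


@dataclass
class Service:
    presents: ServiceProfile      # property name from the slide
    described_by: ServiceModel    # "describedBy" on the slide
    supports: ServiceGrounding    # property name from the slide


def three_questions(service: Service) -> dict:
    """Map each of the slide's three questions to the part that answers it."""
    return {
        "What does the service provide?": service.presents.advertises,
        "How is it used?": " -> ".join(service.described_by.steps),
        "How does one interact with it?": service.supports.protocol,
    }


# Hypothetical service, invented for illustration.
plot_service = Service(
    presents=ServiceProfile("time-series plots of measured parameters"),
    described_by=ServiceModel(["select parameter", "select date range", "render plot"]),
    supports=ServiceGrounding("HTTP"),
)

if __name__ == "__main__":
    for question, answer in three_questions(plot_service).items():
        print(question, answer)
```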
51 Developing a service ontology
- Use case: find and display in the same projection, sea surface temperature and land surface temperature from a global climate model.
- Classes/concepts
- Temperature
- Surface (sea/ land)
- Model
- Climate
- Global
- Projection
- Display
52 Service ontology
- Climate model is a model
- Model has domain
- Climate Model has component representation
- Land surface is-a component representation
- Ocean is-a component representation
- Sea surface is part of ocean
- Model has spatial representation (and temporal)
- Spatial representation has dimensions
- Latitude-longitude is a horizontal spatial representation
- Displaced pole is a horizontal spatial representation
- Ocean model has displaced pole representation
- Land surface model has latitude-longitude representation
- Lambert conformal is a geographic spatial representation
- Reprojection is a transform between spatial representations
- ...
53 Service ontology
- A sea surface model has grid representation displaced pole and land surface model has grid representation latitude-longitude and both must be transformed to Lambert conformal for display
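The statement above amounts to a small inference: compare each model's grid representation with the display projection and list the reprojections that are needed. The sketch below does just that comparison with a lookup table. The dictionary and function names are hypothetical, and no actual reprojection is performed.

```python
# Grid representation of each model, as stated on the slide.
GRID_REPRESENTATION = {
    "sea surface model": "displaced pole",
    "land surface model": "latitude-longitude",
}


def reprojections_needed(models, target, grid=GRID_REPRESENTATION):
    """List (model, source, target) transforms needed so all models share `target`."""
    return [
        (model, grid[model], target)
        for model in models
        if grid[model] != target  # no transform when already in the target form
    ]


if __name__ == "__main__":
    models = ["sea surface model", "land surface model"]
    for step in reprojections_needed(models, "Lambert conformal"):
        print(step)
```

If the display target were latitude-longitude instead, only the sea surface model would need a transform, which is the kind of decision a service ontology lets software make without hand-coding each case.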
54 Best practices
- Ontologies/vocabularies must be shared and reused - swoogle.umbc.edu, www.planetont.org
- Examine core vocabularies to start with
- SKOS Core: about knowledge systems
- Dublin Core: about information resources, digital libraries, with extensions for rights, permissions, digital rights management
- FOAF: about people and their organizations
- DOAP: on the descriptions of software projects
- DOLCE seems the most promising to match science ontologies
- Go Lite as much as possible, then DL and only if you have to Full - balancing expressibility vs. implementability
- Minimal properties to start, add only when needed
55 Tutorial Summary
- Many different options for ontology development and encoding
- Tools are in reasonable shape, no killer-tool
- Best practices DO exist
- PLEASE DO NOT just start coding OWL!
- Use case should drive the functional requirements of both your ontology and how you will build one
- PARTNER with someone already familiar
56 More information
- OWL-S - http://www.w3.org/Submission/OWL-S
- SWSO/F/L - Semantic Web Services Ontology/Framework/Language - http://www.w3.org/Submission/SWSF/
- WSMO/X/L - Web Services Modeling Ontology/Execution/Language - http://www.w3.org/Submission/WSMX/ www.wsmo.org, www.wsmx.org
- SAWSDL - (WSDL-S)
57 Other tools
- Reasoners
- Pellet, Racer, Medius KBS, FACT, fuzzyDL, KAON2, MSPASS, QuOnto
- Query Languages
- SPARQL, XQUERY, SeRQL, OWL-QL, RDFQuery
- Other Tools for Semantic Web
- Search: SWOOGLE swoogle.umbc.edu
- Collaboration: www.planetont.org
- Other: Jena, SeSAME/SAIL, Mulgara, Eclipse, KOWARI
- Semantic wiki: OntoWiki, SemanticMediaWiki
58 Editors
- Protégé (http://protege.stanford.edu)
- SWOOP (http://mindswap.org/2004/SWOOP)
- Altova SemanticWorks (http://www.altova.com/download/semanticworks/semantic_web_rdf_owl_editor.html)
- SWeDE (http://owl-eclipse.projects.semwebcentral.org/InstallSwede.html), goes with Eclipse
- Medius
- TopBraid Composer and other commercial tools
- Visual Ontology Modeler (VOM) - Sandpiper
- CMAP Ontology Editor (COE) (http://cmap.ihmc.us/coe)
59 What about Earth Science?
- SWEET (Semantic Web for Earth and Environmental Terminology)
- http://sweet.jpl.nasa.gov
- based on GCMD terms
- modular using faceted and integrative concepts
- VSTO (Virtual Solar-Terrestrial Observatory)
- http://vsto.hao.ucar.edu
- captures observational data (from instruments)
- modular using domains
- MMI
- http://marinemetadata.org
- captures aspects of marine data, ocean observing systems
- partly modular, mostly by developed project
- GeoSciML
- http://www.opengis.net/GeoSciML/
- is a GML (Geography ML) application language for Geoscience
- modular, in packages