Title: XML, RDF, and OWL
1XML, RDF, and OWL
- The Derivation of Web Ontology Language
2Acknowledgments
- This presentation uses several researchers' previous examples.
- Special thanks to Roger L. Costello and David B. Jacobs at MITRE Corporation,
  Hamish Cunningham and Kalina Bontcheva at the University of Sheffield,
  David De Roure of the GGF Semantic Grid Research Group, and one anonymous
  researcher who provided an excellent explanation of RDF syntax.
3The Holy Grail
Hamish Cunningham and Kalina Bontcheva,
Ontology-Aware Information Extraction, 2002
4Semantic Web Wedding Cake
5XML document = labeled tree
<course date="...">
  <title>...</title>
  <teacher>...</teacher>
  <name>...</name>
  <http>...</http>
  <students>...</students>
</course>
- XML Schema: grammars for describing legal trees and datatypes
- Why not use XML to represent semantics?
6Syntax and Semantics
- Syntax: the structure of the data
- Semantics: the meaning of the data
- Two conditions are necessary for interoperability:
  - Adopt a common syntax: this enables applications to parse the data.
  - Adopt a means for understanding the semantics: this enables applications to use the data.
7Can XML represent semantics?
- <title> ... </title>
- title: a heading that names a statute or legislative bill.
- title: the name of a work of art or literary composition etc.
- title: a general or descriptive heading for a section of a written work.
- title: the status of being a champion.
- title: a legal document signed and sealed and delivered to effect a transfer of property and to show the legal right to possess it.
- (from WordNet)
8XML limitations for semantic markup
- XML makes no commitment on:
  - (1) Domain-specific ontological vocabulary
  - (2) Ontological modeling primitives
- Requires pre-arranged agreement on (1) and (2)
- Only feasible for closed collaboration
  - agents in a small stable community
  - pages on a small stable intranet
- Not suited for sharing Web resources
9What is the purpose of RDF?
- The purpose of RDF (Resource Description Framework) is to give a standard way of specifying data "about" something.
- Here's an example of an XML document that specifies data about China's Yangtze river:

<?xml version="1.0"?>
<River id="Yangtze" xmlns="http://www.geodesy.org/river">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

"Here is data about the Yangtze River. It has a length of 6300 kilometers. Its startingLocation is western China's Qinghai-Tibet Plateau. Its endingLocation is the East China Sea."
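The plain-English reading above can be reproduced mechanically. The following is a minimal Python sketch, assuming the flat record layout shown on this slide (one element for the thing, one child element per property); the helper names `local_name` and `describe` are made up for illustration. It shows that a parser recovers the structure, while the reading "thing has property: value" is a convention the program has to be told.

```python
import xml.etree.ElementTree as ET

XML_DOC = """<?xml version="1.0"?>
<River id="Yangtze" xmlns="http://www.geodesy.org/river">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>"""

def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]

def describe(xml_text):
    """Return (thing, property, value) tuples read off a flat XML record."""
    root = ET.fromstring(xml_text)
    thing = root.get("id")
    return [(thing, local_name(child.tag), child.text.strip()) for child in root]

statements = describe(XML_DOC)
for thing, prop, value in statements:
    print(f"{thing} has {prop}: {value}")
```

Nothing in the XML itself says that `id` names the thing being described or that child elements are its properties; that agreement lives only in `describe`. RDF standardizes exactly this convention.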
10From XML to RDF
11Internationalized Resource Identifier (IRI)
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

1. RDF provides an ID attribute for identifying the resource being described.
2. The ID attribute is in the RDF namespace.
3. Add the "fragment identifier symbol" (#) to the namespace.
12Namespaces
- Newest version: W3C Recommendation of February 4, 2004 (Namespaces in XML 1.1)
- A simple method for qualifying element and attribute names used in XML documents
- Identified by IRI references
13RDF Namespace
14RDF Framework Model
[Figure: RDF Description model, with the labels Resource, IRI, Property, Property Type, and Value]
15The RDF Format
<?xml version="1.0"?>
<Class rdf:ID="Resource"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="uri">
  <property>value</property>
  <property>value</property>
  ...
</Class>
16More Interpretation
17Uniquely Identify the Resource
- RDF is very concerned about uniquely identifying the type (class) and the properties. RDF is also very concerned about uniquely identifying the resource, e.g.,

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

Callout on rdf:ID="Yangtze": This is the resource being described. We want to uniquely identify this resource.
18rdf:ID
- The value of rdf:ID is a "relative URI".
- The "complete URI" is obtained by concatenating the URL of the XML document with "#" and then the value of rdf:ID, e.g.,

Yangtze.rdf:
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

Suppose that this RDF/XML document is located at this URL: http://www.china.org/geography/rivers. Thus, the complete URI for this resource is http://www.china.org/geography/rivers#Yangtze
19xml:base
- By default, the URL of the document provides the base URI.
- Depending on the location of the document is brittle: it will break if the document is moved or copied to another location.
- A more robust solution is to specify the base URI in the document, e.g.,

<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#"
       xml:base="http://www.china.org/geography/rivers">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>

Resource URI = concatenation(xml:base, '#', rdf:ID)
             = concatenation(http://www.china.org/geography/rivers, '#', "Yangtze")
             = http://www.china.org/geography/rivers#Yangtze
20rdf:about
- Instead of identifying a resource with a relative URI (which then requires a base URI to be prepended), we can give the complete identity of a resource. In that case we use rdf:about rather than rdf:ID, e.g.,

<?xml version="1.0"?>
<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/river#">
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</River>
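The three ways of naming a resource (rdf:ID against the document URL, rdf:ID against xml:base, and rdf:about) can be compared in a short Python sketch. It is a simplification under stated assumptions: it only looks at the root element, it treats rdf:about as already absolute, and it uses the plain concatenation rule from these slides rather than full URI resolution. The function name `resource_uri` and the mirror URL are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace bound to the xml: prefix

def resource_uri(xml_text, document_url):
    """Return the full URI of the resource described by the root element."""
    root = ET.fromstring(xml_text)
    about = root.get(f"{{{RDF_NS}}}about")
    if about is not None:
        return about  # rdf:about already carries the complete URI
    # xml:base, when present, replaces the document URL as the base URI
    base = root.get(f"{{{XML_NS}}}base", document_url)
    return base + "#" + root.get(f"{{{RDF_NS}}}ID")

WITH_ID = """<River rdf:ID="Yangtze"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://www.geodesy.org/river#"/>"""

WITH_BASE = """<River rdf:ID="Yangtze"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://www.geodesy.org/river#"
  xml:base="http://www.china.org/geography/rivers"/>"""

WITH_ABOUT = """<River rdf:about="http://www.china.org/geography/rivers#Yangtze"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns="http://www.geodesy.org/river#"/>"""

home = "http://www.china.org/geography/rivers"
mirror = "http://mirror.example.org/copy"  # hypothetical location of a copied file

print(resource_uri(WITH_ID, home))      # identity derived from the document URL
print(resource_uri(WITH_ID, mirror))    # same file copied elsewhere: identity changes
print(resource_uri(WITH_BASE, mirror))  # xml:base keeps the identity stable
print(resource_uri(WITH_ABOUT, mirror)) # rdf:about never depends on location
```

The second call illustrates the brittleness described above: copying the file changes the resource's identity unless xml:base or rdf:about pins it down.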
21rdf:Description + rdf:type
- There is another way of representing the XML. This way makes it very clear that you are describing something, and it makes it very clear what the type (class) is of the thing you are describing:

<?xml version="1.0"?>
<rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze"
                 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns="http://www.geodesy.org/river#">
  <rdf:type rdf:resource="http://www.geodesy.org/river#River"/>
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</rdf:Description>
22RDF Triple Model
- RDF statements consist of
  - resources (nodes)
  - which have properties
  - which have values (nodes, strings)
- Each statement is a triple: subject, predicate, object

Read as a sentence:
http://www.china.org/geography/rivers#Yangtze has a http://www.geodesy.org/river#length of 6300 kilometers

subject:   http://www.china.org/geography/rivers#Yangtze
predicate: http://www.geodesy.org/river#length
object:    6300 kilometers
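To make the triple model concrete, here is a small Python sketch that reads a single rdf:Description and lists its subject, predicate, object triples. It is not a conforming RDF/XML parser: it assumes one description with rdf:about, and properties that are either plain text or an rdf:resource reference. The helper names are illustrative.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze"
                 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns="http://www.geodesy.org/river#">
  <rdf:type rdf:resource="http://www.geodesy.org/river#River"/>
  <length>6300 kilometers</length>
  <startingLocation>western China's Qinghai-Tibet Plateau</startingLocation>
  <endingLocation>East China Sea</endingLocation>
</rdf:Description>"""

def qname_to_uri(tag):
    """ElementTree writes tags as '{namespace}local'; RDF concatenates the two parts."""
    namespace, local = tag[1:].split("}", 1)
    return namespace + local

def triples(xml_text):
    """List (subject, predicate, object) for one rdf:Description element."""
    node = ET.fromstring(xml_text)
    subject = node.get(f"{{{RDF_NS}}}about")
    result = []
    for prop in node:
        predicate = qname_to_uri(prop.tag)
        obj = prop.get(f"{{{RDF_NS}}}resource")  # object is another resource...
        if obj is None:
            obj = prop.text.strip()              # ...or a literal string
        result.append((subject, predicate, obj))
    return result

result = triples(RDF_XML)
for s, p, o in result:
    print(s, p, o, sep="\n  ")
```

This is also why the namespace ends in "#": concatenating the namespace with the element name yields the property URI http://www.geodesy.org/river#length.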
23RDF Graph Model
[Graph: one subject node with three labeled arcs]
http://www.china.org/geography/rivers#Yangtze
  -- http://www.geodesy.org/river#length --> 6300 Kilometers
  -- http://www.geodesy.org/river#startingLocation --> western China's Qinghai-Tibet Plateau
  -- http://www.geodesy.org/river#endingLocation --> East China Sea
24Naming Convention
- The convention is to use a capital letter to start a type (class) name, and a lowercase letter to start a property name.
- This helps the eye quickly discern the striping pattern.
25Complex Values
- RDF/XML can also represent graphs that include nodes without IRI references, i.e., blank nodes.
- Syntactically, values can be embedded (i.e., lexically in-line) or referenced (linked).
26Complex Values (RDF code)
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.geodesy.org/river#">
  <rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze">
    <rdf:type rdf:resource="http://www.geodesy.org/river#River"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze">
    <location rdf:nodeID="abc"/>
  </rdf:Description>
  <rdf:Description rdf:nodeID="abc">
    <starting>western China's Qinghai-Tibet Plateau</starting>
    <ending>East China Sea</ending>
  </rdf:Description>
</rdf:RDF>
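A blank node shows up in the triples as a subject or object that has only a document-local label. The Python sketch below lists the triples of the complex-value example, writing the blank node in the common "_:label" notation. It assumes the three descriptions are wrapped in a single rdf:RDF root so that the text parses as one well-formed document, and it handles only the attributes used here (rdf:about, rdf:resource, rdf:nodeID); the function names are illustrative.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.geodesy.org/river#">
  <rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze">
    <rdf:type rdf:resource="http://www.geodesy.org/river#River"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.china.org/geography/rivers#Yangtze">
    <location rdf:nodeID="abc"/>
  </rdf:Description>
  <rdf:Description rdf:nodeID="abc">
    <starting>western China's Qinghai-Tibet Plateau</starting>
    <ending>East China Sea</ending>
  </rdf:Description>
</rdf:RDF>"""

def node_label(element, uri_attribute):
    """Return the node's URI, or '_:label' for a blank node, or None if neither is present."""
    uri = element.get(RDF + uri_attribute)
    if uri is not None:
        return uri
    node_id = element.get(RDF + "nodeID")
    return None if node_id is None else "_:" + node_id

def graph_triples(xml_text):
    """Collect triples from every rdf:Description directly under rdf:RDF."""
    found = []
    for description in ET.fromstring(xml_text).findall(RDF + "Description"):
        subject = node_label(description, "about")
        for prop in description:
            namespace, local = prop.tag[1:].split("}", 1)
            obj = node_label(prop, "resource")
            if obj is None:
                obj = prop.text.strip()  # plain literal value
            found.append((subject, namespace + local, obj))
    return found

graph = graph_triples(DOC)
for triple in graph:
    print(triple)
```

The label "abc" only links the statements inside this one document; unlike a URI, it gives other documents no way to refer to the location node.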
27rdf:ID versus rdf:about
- When should rdf:ID be used? When should rdf:about be used?
- When you want to introduce a resource and provide an initial set of information about it, use rdf:ID.
- When you want to extend the information about a resource, use rdf:about.
- The RDF philosophy is akin to the Web philosophy: anyone, anywhere, anytime can provide information about a resource.
28RDF Description
[Figure: graph of Resource 1, Resource 2, and Resource 3 connected by arcs labeled PropertyType1 through PropertyType4, with two Atomic Value nodes]
29RDF Parser
- There is a useful RDF validation Web service at the W3C Web site, which will tell you whether your XML is in the proper RDF format:
  http://www.w3.org/RDF/Validator/
30Notes of using the RDF Format
- Constrained: the RDF format constrains how you design your XML (i.e., you can't design your XML in any arbitrary fashion).
- RDF uses namespaces to uniquely identify types (classes), properties, and resources. Thus, you must have a solid understanding of namespaces.
- Another XML vocabulary to learn: to use the RDF format you must learn the RDF vocabulary.
31Two Main Areas of RDF
[Figure: diagram with the labels RDF Schema, RDF Syntax, RDF, and XML]
32RDF Schema (RDFS)
- Defines a small vocabulary for RDF:
  - Class, subClassOf, type
  - Property, subPropertyOf
  - domain, range
- The vocabulary can be used to define other vocabularies for your application domain.
- The benefit of RDFS is that it facilitates inferences on your data and enhanced searching.
[Figure: example RDFS graph with the classes Person, Student, and Researcher (two subClassOf arcs to Person), the property hasSupervisor with its domain and range arcs, and the individuals Jeen and Frank, each with a type arc and linked by hasSupervisor]
33NaturallyOccurringWaterSource
[Figure: class hierarchy rooted at NaturallyOccurringWaterSource, containing BodyOfWater, Stream, Ocean, River, Tributary, Brook, Lake, Sea, and Rivulet]
Properties: length (Literal), emptiesInto (BodyOfWater)

Yangtze.rdf, given to the Inference Engine:
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/water/naturally-occurring#">
  <length>6300 kilometers</length>
  <emptiesInto rdf:resource="http://www.china.org/geography#EastChinaSea"/>
</River>

Inferences:
- Yangtze is a Stream
- Yangtze is a NaturallyOccurringWaterSource
- http://www.china.org/geography#EastChinaSea is a BodyOfWater
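The three inferences follow from two rules: class membership propagates up subClassOf links, and the object of a property belongs to the property's range. A minimal Python sketch of such an inference step, using short names instead of full URIs and only the subClassOf and range facts stated in these slides, might look like this. It is a toy forward-chaining loop, not a full RDFS reasoner.

```python
# Schema facts taken from the slides (short names stand in for full URIs).
SUBCLASS_OF = {("River", "Stream"), ("Stream", "NaturallyOccurringWaterSource")}
RANGE = {"emptiesInto": "BodyOfWater"}

# Instance data from Yangtze.rdf.
DATA = {
    ("Yangtze", "type", "River"),
    ("Yangtze", "emptiesInto", "EastChinaSea"),
}

def infer(data, subclass_of, ranges):
    """Apply the subClassOf and range rules repeatedly until nothing new appears."""
    facts = set(data)
    while True:
        new = set()
        for s, p, o in facts:
            if p == "type":
                # x type C, C subClassOf D  =>  x type D
                new |= {(s, "type", sup) for sub, sup in subclass_of if sub == o}
            elif p in ranges:
                # x p y, p range C  =>  y type C
                new.add((o, "type", ranges[p]))
        if new <= facts:
            return facts
        facts |= new

inferred = infer(DATA, SUBCLASS_OF, RANGE)
for fact in sorted(inferred - DATA):
    print(fact)
```

The loop stops when a pass adds no new facts, which is what lets the two-step chain River, Stream, NaturallyOccurringWaterSource come out without listing it explicitly.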
34NaturallyOccurringWaterSource
[Figure: class hierarchy rooted at NaturallyOccurringWaterSource, containing BodyOfWater, Stream, Ocean, River, Tributary, Brook, Lake, Sea, and Rivulet]
Properties: length (Literal), emptiesInto (BodyOfWater)

Query to the Search Engine: "Show me all documents that contain info about Streams"

Yangtze.rdf:
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/water/naturally-occurring#">
  <length>6300 kilometers</length>
  <emptiesInto rdf:resource="http://www.china.org/geography#EastChinaSea"/>
</River>

Results:
- Yangtze is a Stream, so this document is relevant to the query.
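A taxonomy-aware search can be sketched in a few lines of Python: a document matches a query class if the class it describes is that class or one of its subclasses. The document index below is hypothetical (only Yangtze.rdf comes from the slides), and each class is assumed to have at most one superclass to keep the lookup simple.

```python
# subClassOf links stated in the slides; a dict suffices when each class has one parent.
SUBCLASS_OF = {"River": "Stream", "Stream": "NaturallyOccurringWaterSource"}

# Hypothetical index: document name -> class of the resource it describes.
DOCUMENTS = {"Yangtze.rdf": "River", "Pacific.rdf": "Ocean"}

def with_superclasses(cls):
    """Return the class followed by all of its superclasses."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain

def search(query_class):
    """Documents whose resource is an instance of query_class, directly or by inheritance."""
    return sorted(doc for doc, cls in DOCUMENTS.items()
                  if query_class in with_superclasses(cls))

print(search("Stream"))
print(search("Ocean"))
```

A keyword search for "Stream" would miss Yangtze.rdf, because the word never appears in the file; the match comes entirely from the schema.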
35RDF Schema is all about defining taxonomies (class hierarchies)

NaturallyOccurringWaterSource.rdfs (snippet):
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xml:base="http://www.geodesy.org/water/naturally-occurring">
  <rdfs:Class rdf:ID="River">
    <rdfs:subClassOf rdf:resource="#Stream"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="Stream">
    <rdfs:subClassOf rdf:resource="#NaturallyOccurringWaterSource"/>
  </rdfs:Class>
  ...
</rdf:RDF>

1. All classes and properties are defined within rdf:RDF.
2. xml:base assigns a namespace to the taxonomy!
3. The first rdfs:Class element defines the River class.
4. The second rdfs:Class element defines the Stream class.
5. Since the Stream class is defined in the same document, we can reference it using a fragment identifier (#Stream).

This is read as: "I hereby define a River Class. River is a subClassOf Stream." "I hereby define a Stream Class. Stream is a subClassOf NaturallyOccurringWaterSource." ...
36rdfs:Class
- This type is used to define a class.
- The rdf:ID provides a name for the class.
- The contents are used to indicate the members of the class.
- The contents are ANDed together.
37rdfs:subClassOf
[Figure: Venn diagram with the River set drawn inside the Stream set]
- Stream: this represents the set of Streams, i.e., the set of instances of type Stream.
- River: this represents the set of Rivers, i.e., the set of instances of type River.
38Multiple rdfs:subClassOf Properties
<rdfs:Class rdf:ID="River">
  <rdfs:subClassOf rdf:resource="#Stream"/>
  <rdfs:subClassOf rdf:resource="http://www.containers.org#SedimentContainer"/>
</rdfs:Class>
[Figure: Venn diagram of Stream, SedimentContainer, and River]
- A River is both a Stream and a SedimentContainer.
The conjunction (AND) of two subClassOf statements is a subset of the intersection of the
classes.
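Treating each class as the set of its instances makes the subset claim easy to check. The following Python sketch uses made-up instance sets (only "Yangtze" appears in the slides) and a hypothetical helper, `respects_subclass_axioms`, to show that satisfying both subClassOf statements is the same as lying inside the intersection.

```python
# Hypothetical instance sets; only "Yangtze" comes from the slides.
stream = {"Yangtze", "Mill Brook"}
sediment_container = {"Yangtze", "Settling Basin"}
river = {"Yangtze"}

def respects_subclass_axioms(members, *superclasses):
    """True if every member also belongs to each declared superclass."""
    return all(members <= superclass for superclass in superclasses)

both = stream & sediment_container  # things that are Streams AND SedimentContainers

print(respects_subclass_axioms(river, stream, sediment_container))
print(river <= both)
```

Note the direction of the claim: every River is in the intersection, but something in the intersection need not be a River, because subClassOf states a necessary condition, not a definition.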
39rdf:Property
40Example of multiple rdfs:range
41Example of multiple rdfs:domain
42Class and Property: different namespaces
- Class is in the rdfs namespace.
- Property is in the rdf namespace.
43Properties are defined separately from classes
- The RDF Schema approach is to define a class, and then separately define properties and state that they are to be used with the class.
- The advantage of this approach is that anyone, anywhere, anytime can create a property and state that it is usable with the class!
44Problems
- Equivalent classes
- Cardinality constraints
- More
- no precisely described meaning
- no inference model
45Beyond RDF
- OIL (Ontology Inference Layer) extends RDF Schema to a fully-fledged knowledge representation language:
  - logical expressions
  - data-typing
  - cardinality
  - quantifiers
  - http://www.ontoknowledge.org
- DAML (DARPA Agent Markup Language): US sister of OIL
- Merged as DAML+OIL in 2001
- Became OWL, a W3C Recommendation, on February 10, 2004
46DARPA's DAML / W3C's OWL Language
47OWL: Web Ontology Language
48OWL cannot be a simple semantic extension of RDF/S
- Relationship between layers
  - Syntactically: no restriction
  - Semantically: preserve meanings
- Russell's paradox
  - A very large collection of built-in sets
  - These built-in sets include the set consisting of those sets that do not contain themselves
  - Is this set a member of itself?
    - Yes? Then it contains itself, so it fails its own membership condition: no
    - No? Then it does not contain itself, so it meets its own membership condition: yes
  - This violates the very principle of set theory: set membership should be a well-defined relationship
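Written in set notation, the contradiction is a single line. The symbol R is just a label chosen here for "the set of all sets that do not contain themselves":

```latex
R = \{\, x \mid x \notin x \,\}
\qquad\Longrightarrow\qquad
\bigl( R \in R \iff R \notin R \bigr)
```

Substituting R for x in its own defining condition gives the equivalence on the right, which no truth assignment can satisfy.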
49OWL cannot be a simple semantic extension of RDF/S
- If OWL is layered on top of RDF/S as a same-syntax extension:
  - There has to be a large collection of built-in classes in any model
  - This is needed when we want to make the logical foundations of classes in the extension work correctly
  - This collection includes the class that is defined as those resources that do not belong to the class
  - Russell's paradox
- RDF/S does not fall into this paradox because it does not need a large collection of built-in classes
  - The RDFS theory of classes and properties is very weak
  - It is not possible to give a class a formula or to determine which resources belong to it
- OWL is designed to allow for defined classes and more relationships between classes
  - This richer theory clashes with the underlying principle of RDF/S
50OWL Extends RDF
- RDF-schema
- Class, subclass
- Property, subproperty
- Restrictions
- Range, domain
- Local, global
- Existential
- Cardinality
- Combinators
  - Union, Intersection
  - Complement
  - Symmetric, transitive
- Mapping
  - Equivalence
  - Inverse
51Again! What is an Ontology?
- An ontology answers questions that are implicit in your data.

<GunLicense>
  <registeredGun>
    <Gun>
      <serial>ABCD</serial>
    </Gun>
  </registeredGun>
  <holder>
    <Person>
      <driversLicenseNumber>ZXYZXY</driversLicenseNumber>
    </Person>
  </holder>
</GunLicense>

Questions raised by this data:
- How many guns/people are registered in a gun license?
- How many guns can have this serial number?
- Can this gun be registered in other gun licenses?
- How many people can have this driver's license number?
52Gun License Ontology answers the Questions!
<GunLicense>
  <registeredGun>
    <Gun>
      <serial>ABCD</serial>
    </Gun>
  </registeredGun>
  <holder>
    <Person>
      <driversLicenseNumber>ZXYZXY</driversLicenseNumber>
    </Person>
  </holder>
</GunLicense>

Answers supplied by the ontology:
- A gun license registers one gun to one person.
- Only one gun can have this serial number.
- A gun can be registered in only one gun license.
- Only one person can have this driver's license number.
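Statements such as "only one gun can have this serial number" correspond to uniqueness constraints on properties. The Python sketch below checks two such constraints over a handful of made-up triples. It deliberately treats different names as different individuals; an OWL reasoner makes no such assumption and would typically conclude that two guns sharing a serial number are the same gun, rather than report an error. All identifiers (`license1`, `gun2`, the function names) are hypothetical.

```python
from collections import defaultdict

# Hypothetical data: two gun records that share a serial number.
TRIPLES = [
    ("license1", "registeredGun", "gun1"),
    ("license1", "holder", "person1"),
    ("gun1", "serial", "ABCD"),
    ("gun2", "serial", "ABCD"),
    ("person1", "driversLicenseNumber", "ZXYZXY"),
]

def shared_values(triples, prop):
    """Map each value of `prop` held by more than one subject to those subjects."""
    owners = defaultdict(set)
    for subject, predicate, value in triples:
        if predicate == prop:
            owners[value].add(subject)
    return {value: subjects for value, subjects in owners.items() if len(subjects) > 1}

def multiple_values(triples, prop):
    """Map each subject with more than one value for `prop` to those values."""
    values = defaultdict(set)
    for subject, predicate, value in triples:
        if predicate == prop:
            values[subject].add(value)
    return {subject: vals for subject, vals in values.items() if len(vals) > 1}

# The serial "ABCD" is claimed by both gun1 and gun2.
print(shared_values(TRIPLES, "serial"))
# license1 registers a single gun, so nothing is reported.
print(multiple_values(TRIPLES, "registeredGun"))
```

`shared_values` corresponds to the "only one subject per value" reading (one gun per serial number, one person per driver's license number), and `multiple_values` to the "only one value per subject" reading (one gun per license).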
53Ontologies vs. Markups
- Ontologies contain persistent information
- Markups: data about specific instances of classes and properties
- E.g., general knowledge about the class River (ontology) vs. data about a specific river in a country (markup)
- OWL does not enforce this separation
54Ontologies vs. Markups
Ontology:
<owl:Class rdf:ID="River">
  <rdfs:subClassOf rdf:resource="#Stream"/>
</owl:Class>

Markup:
<?xml version="1.0"?>
<River rdf:ID="Yangtze"
       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
       xmlns="http://www.geodesy.org/water/naturally-occurring#">
  <connectsTo rdf:resource="http://www.china.org/rivers#Wu"/>
  <connectsTo rdf:resource="http://www.china.org/geography#EastChinaSea"/>
</River>
55Properties
<owl:ObjectProperty rdf:ID="emptiesInto">
  <rdfs:domain rdf:resource="#River"/>
  <rdfs:range rdf:resource="#BodyOfWater"/>
</owl:ObjectProperty>

<owl:DatatypeProperty rdf:ID="length">
  <rdfs:domain rdf:resource="#River"/>
  <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#nonNegativeInteger"/>
</owl:DatatypeProperty>
56OWL Full, OWL DL, and OWL Lite
- Description Logics provide a careful balance between expressivity and computational complexity
- OWL provides sublanguages with reduced expressivity and computational complexity
[Figure: nested sublanguages, OWL Lite inside OWL DL inside OWL Full]
57Language Constructs: OWL Lite
- Class
- rdf:Property
- rdfs:subClassOf
- rdfs:subPropertyOf
- rdfs:domain
- rdfs:range
- Individual
- equivalentClass
- equivalentProperty
- sameAs
- differentFrom
- allDifferent
- inverseOf
- TransitiveProperty
- SymmetricProperty
- FunctionalProperty
- InverseFunctionalProperty
- allValuesFrom
- someValuesFrom
- minCardinality (only 0 or 1)
- maxCardinality (only 0 or 1)
- cardinality (only 0 or 1)
- intersectionOf
- Imports
- priorVersion
- more
58Language Constructs: OWL DL and OWL Full
- oneOf
- disjointWith
- equivalentClass
  - (applied to class expressions)
- rdfs:subClassOf
  - (applied to class expressions)
- unionOf
- intersectionOf
- complementOf
- Arbitrary Cardinality
- minCardinality
- maxCardinality
- cardinality
- hasValue
59More OWL example
<owl:Class rdf:ID="Large-Format">
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Class rdf:about="#Camera"/>
    <owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <owl:Restriction>
          <owl:onProperty rdf:resource="#optics"/>
          <owl:allValuesFrom rdf:resource="#Micro"/>
        </owl:Restriction>
        <owl:Restriction>
          <owl:onProperty rdf:resource="#optics"/>
          <owl:allValuesFrom rdf:resource="#Macro"/>
        </owl:Restriction>
        <owl:Restriction>
          <owl:onProperty rdf:resource="#optics"/>
          <owl:allValuesFrom rdf:resource="#Normal"/>
        </owl:Restriction>
      </owl:unionOf>
    </owl:Class>
  </owl:intersectionOf>
</owl:Class>
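The nesting is easier to follow in description-logic notation. Assuming the usual reading, in which owl:intersectionOf on a named class gives a full definition and owl:allValuesFrom corresponds to a universal restriction, the class expression above can be sketched as:

```latex
\text{Large-Format} \;\equiv\; \text{Camera} \;\sqcap\;
\bigl(\,
  \forall\,\text{optics}.\text{Micro}
  \;\sqcup\; \forall\,\text{optics}.\text{Macro}
  \;\sqcup\; \forall\,\text{optics}.\text{Normal}
\,\bigr)
```

In words: a Large-Format is a Camera whose optics are all Micro, or all Macro, or all Normal. This is narrower than saying each of its optics is one of the three kinds, which would be a single restriction over a union of classes.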
60More OWL example
<owl:DatatypeProperty rdf:ID="f-stop">
  <owl:equivalentProperty rdf:resource="#aperture"/>
  <rdfs:domain rdf:resource="#Lens"/>
  <rdfs:range rdf:resource="&xsd;string"/>
</owl:DatatypeProperty>
61Differences
- OWL Lite
  - Supports a classification hierarchy and simple constraint features
  - Tool support is simple
  - Provides a quick migration path for thesauri and other taxonomies
  - Supports cardinality constraints, but only 0 or 1
- OWL DL
  - Supports maximum expressiveness without losing computational completeness and decidability of reasoning systems
  - Supports the existing Description Logic business segment
  - A class cannot also be an individual or property; a property cannot also be an individual or class
- OWL Full
  - Maximum expressiveness and the syntactic freedom of RDF, with no computational guarantees
  - Allows an ontology to augment the meaning of the pre-defined (RDF or OWL) vocabulary
  - A class can be treated simultaneously as a collection of individuals and as an individual in its own right
62OWL Validator
- OWL Validator
  - http://owl.bbn.com/validator/
  - Web-based or command-line utility
  - Performs basic validation of OWL files
- OWL Ontology Validator
  - http://phoebus.cs.man.ac.uk:9999/OWL/Validator
  - A "species validator" that checks use of OWL Lite, OWL DL, and OWL Full constructs