Title: Semantic query, rules, tools (Inference, triple stores, etc.)
1 Semantic query, rules, tools (Inference, triple stores, etc.)
- Peter Fox (RPI)
- ESIP Winter Meeting
- Washington, D.C., January 6, 2009, 4:00-5:30 pm
2 Semantic Web Methodology and Technology Development Process
- Establish and improve a well-defined methodology vision for Semantic Technology based application development
- Leverage controlled vocabularies, etc.
[Figure: development process diagram. Stage labels: Use Case; Small Team, mixed skills; Analysis; Develop model/ontology; Use Tools; Science/Expert Review & Iteration; Rapid Prototype; Adopt Technology Approach; Leverage Technology Infrastructure; Open World: Evolve, Iterate, Redesign, Redeploy]
3 Semantic Web Layers
- http://www.w3.org/2003/Talks/1023-iswc-tbl/slide26-0.html
- http://flickr.com/photos/pshab/291147522/
4 Terminology
- Ontology
  - An explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them.
  - Source: Ontology (n.d.). The Free On-line Dictionary of Computing. http://dictionary.reference.com/browse/ontology
- Semantic Web
  - An extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation (www.semanticweb.org)
  - Primer: http://www.ics.forth.gr/isl/swprimer/
5 Ontology Spectrum
[Figure: ontology spectrum, from least to most expressive: Catalog/ID; Terms/glossary; Thesauri "narrower term" relation; Informal is-a; Formal is-a; Formal instance; Frames (properties); Value Restrs.; Selected Logical Constraints (disjointness, inverse, ...); General Logical constraints]
- Originally from AAAI 1999 Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; updated by McGuinness.
- Description in www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html
6 Ontology - declarative knowledge
- The triple: subject-predicate-object
  - Interferometer is-a optical instrument
  - Fabry-Perot is-a interferometer
  - Optical instrument has focal length
  - Optical instrument is-a instrument
  - Instrument has instrument operating mode
  - Data archive has measured parameter
  - SO2 concentration is-a concentration
  - Concentration is-a parameter
- A query: select all optical instruments which have operating mode vertical
- An inference: infer operating modes for a Fabry-Perot interferometer which measures neutral temperature
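The is-a and has statements above can be tried out with a toy in-memory triple set. The following is only a sketch under stated assumptions: the tuple layout and the helper names `superclasses` and `inherited_properties` are invented for illustration and are not part of any triple-store API.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple
# copied from the slide. The helper names below are hypothetical.
TRIPLES = {
    ("interferometer", "is-a", "optical instrument"),
    ("Fabry-Perot", "is-a", "interferometer"),
    ("optical instrument", "has", "focal length"),
    ("optical instrument", "is-a", "instrument"),
    ("instrument", "has", "instrument operating mode"),
    ("SO2 concentration", "is-a", "concentration"),
    ("concentration", "is-a", "parameter"),
}


def superclasses(term, triples):
    """Return every class reachable from `term` by following is-a links."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "is-a" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def inherited_properties(term, triples):
    """Infer the 'has' properties of `term`, including inherited ones."""
    classes = {term} | superclasses(term, triples)
    return {o for s, p, o in triples if p == "has" and s in classes}


fabry_perot_classes = superclasses("Fabry-Perot", TRIPLES)
fabry_perot_properties = inherited_properties("Fabry-Perot", TRIPLES)
print(sorted(fabry_perot_classes))
print(sorted(fabry_perot_properties))
```

Nothing in the data says directly that a Fabry-Perot has an operating mode; that fact is inferred by walking the is-a chain up to instrument, which is the kind of inference the slide describes.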
7 What is Query?
- http://esw.w3.org/topic/SPARQL
- Languages
  - SPARQL for RDF (http://www.sparql.org/ and http://www.w3.org/TR/rdf-sparql-query/)
  - RDFQuery for RDF
  - SeRQL for RDF (Sesame)
  - OWL-QL for OWL (http://projects.semwebcentral.org/projects/owl-ql/)
  - XQuery for XML
- Few as yet for natural language representations (ROO: Dolbear et al.)
8 SPARQL
- W3C Recommendation, January 2008
- SPARQL has 4 result forms
  - SELECT: Return a table of results.
  - CONSTRUCT: Return an RDF graph, based on a template in the query.
  - DESCRIBE: Return an RDF graph, based on what the query processor is configured to return.
  - ASK: Ask a boolean query.
- The SELECT form directly returns a table
- DESCRIBE and CONSTRUCT use the outcome of matching to build RDF graphs.
9 SPARQL Solution Modifiers
- Pattern matching produces a set of solutions. This set can be modified in various ways:
  - Projection: keep only selected variables
  - OFFSET/LIMIT: restrict the number of solutions (best used with ORDER BY)
  - ORDER BY: sorted results
  - DISTINCT: yield only one row for one combination of variables and values.
- The solution modifiers OFFSET/LIMIT and ORDER BY always apply to all result forms.
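The modifiers listed above can be sketched over a plain list of solutions. This is a minimal illustration, assuming each solution is a Python dict of variable bindings; `apply_modifiers` and the sample rows are hypothetical, and the order of the steps (ORDER BY, projection, DISTINCT, then OFFSET/LIMIT) follows the order in which SPARQL applies them.

```python
def apply_modifiers(solutions, select, distinct=False, order_by=None,
                    descending=False, offset=0, limit=None):
    """Apply SPARQL-style solution modifiers to a list of binding dicts."""
    rows = list(solutions)
    if order_by is not None:
        # ORDER BY: sort on one variable (stable sort)
        rows.sort(key=lambda row: row[order_by], reverse=descending)
    # Projection: keep only the selected variables
    rows = [tuple(row.get(var) for var in select) for row in rows]
    if distinct:
        # DISTINCT: drop duplicate rows while preserving order
        rows = list(dict.fromkeys(rows))
    # OFFSET / LIMIT: slice the remaining rows
    end = None if limit is None else offset + limit
    return rows[offset:end]


# Hypothetical solutions, as pattern matching might produce them
solutions = [
    {"name": "Liz Somebody", "year": 2005},
    {"name": "Jon Foobar", "year": 2004},
    {"name": "Jon Foobar", "year": 2005},
    {"name": "M Benn", "year": 2003},
]

page = apply_modifiers(solutions, select=["name"], distinct=True,
                       order_by="name", offset=1, limit=2)
print(page)
```

Without ORDER BY the rows returned by OFFSET/LIMIT depend on an arbitrary solution order, which is why the slide recommends combining them.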
10 Query examples
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?url
    FROM <bloggers.rdf>
    WHERE {
      ?contributor foaf:name "Jon Foobar" .
      ?contributor foaf:weblog ?url .
    }
11 What happens
- These triples together comprise a graph pattern.
- The query attempts to match the triples of the graph pattern to the model.
- Each matching binding of the graph pattern's variables to the model's nodes becomes a query solution, and the values of the variables named in the SELECT clause become part of the query results.
- In this example, the first triple in the WHERE clause's graph pattern matches a node with a foaf:name property of "Jon Foobar," and binds it to the variable named ?contributor.
- In the bloggers.rdf model, ?contributor will match the foaf:Agent blank node at the top of the figure.
- The graph pattern's second triple matches the object of the contributor's foaf:weblog property.
- This is bound to the ?url variable, forming a query solution.
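The matching steps described above can be sketched in a few lines. This is an illustration only, not how ARQ or any other engine is implemented; the function names, the tiny graph and the weblog URLs are made up for the example.

```python
def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variable: bind it, or check it agrees with an earlier binding
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            # Constant: must match the triple exactly
            return None
    return extended


def query(patterns, graph):
    """Return every binding that satisfies all triple patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in graph:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Hypothetical stand-in for the bloggers.rdf model
graph = [
    ("_:a", "foaf:name", "Jon Foobar"),
    ("_:a", "foaf:weblog", "http://foobar.xx/blog"),
    ("_:b", "foaf:name", "Liz Somebody"),
    ("_:b", "foaf:weblog", "http://example.net/liz"),
]

solutions = query(
    [("?contributor", "foaf:name", "Jon Foobar"),
     ("?contributor", "foaf:weblog", "?url")],
    graph,
)
urls = [solution["?url"] for solution in solutions]
print(urls)
```

The first pattern binds ?contributor to the node named "Jon Foobar"; the second pattern is then only allowed to match triples whose subject agrees with that binding, which is what yields a single ?url.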
12 Using SPARQL with Jena
- Jena calls RDF graphs "models" and triples "statements" because that is what they were called at the time the Jena API was first designed
- ARQ's query engine can also parse queries expressed in RDQL or its own internal query language. ARQ is under active development, and is not yet part of the standard Jena distribution.
- http://jena.sourceforge.net/ARQ/Tutorial/data.html
- Can also use SPARQL from the command line
13 com.hp.hpl.jena.query package
    // Requires java.io.*, com.hp.hpl.jena.rdf.model.* and com.hp.hpl.jena.query.*

    // Open the bloggers RDF graph from the filesystem
    InputStream in = new FileInputStream(new File("bloggers.rdf"));

    // Create an empty in-memory model and populate it from the graph
    Model model = ModelFactory.createMemModelMaker().createModel();
    model.read(in, null); // null base URI, since model URIs are absolute
    in.close();

    // Create a new query
    String queryString =
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/> " +
        "SELECT ?url " +
        "WHERE { " +
        "  ?contributor foaf:name \"Jon Foobar\" . " +
        "  ?contributor foaf:weblog ?url . " +
        "}";
    Query query = QueryFactory.create(queryString);

    // Execute the query and obtain results
    QueryExecution qe = QueryExecutionFactory.create(query, model);
    ResultSet results = qe.execSelect();

    // Output query results
14 More complex queries
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .

    _:a foaf:name "Jon Foobar" ;
        foaf:mbox <mailto:jon@foobar.xx> ;
        foaf:depiction <http://foobar.xx/2005/04/jon.jpg> .
    _:b foaf:name "A. N. O'Ther" ;
        foaf:mbox <mailto:a.n.other@example.net> ;
        foaf:depiction <http://example.net/photos/an-2005.jpg> .
    _:c foaf:name "Liz Somebody" ;
        foaf:mbox_sha1sum "3f01fa9929df769aff173f57dec2fe0c2290aeea" .
    _:d foaf:name "M Benn" ;
        foaf:depiction <http://mbe.nn/pics/me.jpeg> .
15 Querying FOAF data with an optional block
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?depiction
    WHERE {
      ?person foaf:name ?name .
      OPTIONAL {
        ?person foaf:depiction ?depiction .
      } .
    }

    name              depiction
    "A. N. O'Ther"    <http://example.net/photos/an-2005.jpg>
    "Jon Foobar"      <http://foobar.xx/2005/04/jon.jpg>
    "Liz Somebody"
    "M Benn"          <http://mbe.nn/pics/me.jpeg>
16 Query with alternative matches, and its results
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    SELECT ?name ?mbox
    WHERE {
      ?person foaf:name ?name .
      {
        { ?person foaf:mbox ?mbox } UNION { ?person foaf:mbox_sha1sum ?mbox }
      }
    }

    name              mbox
    "Jon Foobar"      <mailto:jon@foobar.xx>
    "A. N. O'Ther"    <mailto:a.n.other@example.net>
    "Liz Somebody"    "3f01fa9929df769aff173f57dec2fe0c2290aeea"
17 Filter to retrieve RSS feed items published in April 2005
    PREFIX rss: <http://purl.org/rss/1.0/>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT ?item_title ?pub_date
    WHERE {
      ?item rss:title ?item_title .
      ?item dc:date ?pub_date .
      FILTER ( xsd:dateTime(?pub_date) >= "2005-04-01T00:00:00Z"^^xsd:dateTime &&
               xsd:dateTime(?pub_date) < "2005-05-01T00:00:00Z"^^xsd:dateTime )
    }
18 Find people described in two named FOAF graphs
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    SELECT ?name
    FROM NAMED <jon-foaf.rdf>
    FROM NAMED <liz-foaf.rdf>
    WHERE {
      GRAPH <jon-foaf.rdf> {
        ?x rdf:type foaf:Person .
        ?x foaf:name ?name .
      } .
      GRAPH <liz-foaf.rdf> {
        ?y rdf:type foaf:Person .
        ?y foaf:name ?name .
      } .
    }
19 Which graph describes different people
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    SELECT ?name ?graph_uri
    FROM NAMED <jon-foaf.rdf>
    FROM NAMED <liz-foaf.rdf>
    WHERE {
      GRAPH ?graph_uri {
        ?x rdf:type foaf:Person .
        ?x foaf:name ?name .
      }
    }

    name              graph_uri
    "Liz Somebody"    <file://.../jon-foaf.rdf>
    "A. N. O'Ther"    <file://.../jon-foaf.rdf>
    "Jon Foobar"      <file://.../liz-foaf.rdf>
    "A. N. O'Ther"    <file://.../liz-foaf.rdf>
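The two named-graph queries can be mimicked with a dictionary that maps each graph name to its own list of triples. This is only a sketch: the blank-node labels and helper names are invented, and the graph contents are chosen to reproduce the result table above.

```python
# Hypothetical dataset: graph name -> triples held in that named graph
DATASET = {
    "jon-foaf.rdf": [
        ("_:a", "rdf:type", "foaf:Person"), ("_:a", "foaf:name", "Liz Somebody"),
        ("_:b", "rdf:type", "foaf:Person"), ("_:b", "foaf:name", "A. N. O'Ther"),
    ],
    "liz-foaf.rdf": [
        ("_:c", "rdf:type", "foaf:Person"), ("_:c", "foaf:name", "Jon Foobar"),
        ("_:d", "rdf:type", "foaf:Person"), ("_:d", "foaf:name", "A. N. O'Ther"),
    ],
}


def names_by_graph(dataset):
    """Like GRAPH ?graph_uri { ... }: report which graph describes whom."""
    rows = []
    for graph_uri, triples in dataset.items():
        persons = {s for s, p, o in triples
                   if p == "rdf:type" and o == "foaf:Person"}
        rows.extend((o, graph_uri) for s, p, o in triples
                    if p == "foaf:name" and s in persons)
    return rows


def names_in_every_graph(dataset):
    """Like two GRAPH blocks sharing ?name: people described in all graphs."""
    rows = names_by_graph(dataset)
    per_graph = [{name for name, uri in rows if uri == graph_uri}
                 for graph_uri in dataset]
    return set.intersection(*per_graph)


rows = names_by_graph(DATASET)
shared = names_in_every_graph(DATASET)
print(rows)
print(shared)
```

The join in the two-graph query happens on the shared ?name variable, not on the blank nodes, which is why the same person can be recognised across files that use different node identifiers.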
20 Personalized feed by query filter
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    PREFIX rss: <http://purl.org/rss/1.0/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT ?title ?known_name ?link
    FROM <http://planetrdf.com/index.rdf>
    FROM NAMED <phil-foaf.rdf>
    WHERE {
      GRAPH <phil-foaf.rdf> {
        ?me foaf:name "Phil McCarthy" .
        ?me foaf:knows ?known_person .
        ?known_person foaf:name ?known_name .
      } .
      ?item dc:creator ?known_name .
      ?item rss:title ?title .
      ?item rss:link ?link .
      ?item dc:date ?date .
    }
    ORDER BY DESC(?date) LIMIT 10
21 Returning as XML
- SPARQL allows query results to be returned as XML, in a simple format known as the SPARQL Variable Binding Results XML Format.
- This schema-defined format acts as a bridge between RDF queries and XML tools and libraries.
- There are a number of potential uses for this capability. You could transform the results of a SPARQL query into a Web page or RSS feed via XSLT, access the results via XPath, or return the result document to a SOAP or AJAX client.
- To output query results as XML, use the ResultSetFormatter.outputAsXML() method, or specify --results rs/xml on the command line.
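As one example of consuming such a result document with ordinary XML tooling, the sketch below reads the bindings with Python's standard ElementTree module. The sample document is hand-written for illustration (the URL in it is made up); only the namespace `http://www.w3.org/2005/sparql-results#` and the sparql/results/result/binding element layout come from the W3C results format.

```python
import xml.etree.ElementTree as ET

# Namespace of the SPARQL Query Results XML Format
SPARQL_NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Hand-written sample result document (illustrative values only)
RESULT_XML = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="url"/></head>
  <results>
    <result>
      <binding name="url"><uri>http://foobar.xx/blog</uri></binding>
    </result>
  </results>
</sparql>"""


def read_bindings(xml_text):
    """Turn each <result> element into a dict of variable name -> value."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("sr:results/sr:result", SPARQL_NS):
        row = {}
        for binding in result.findall("sr:binding", SPARQL_NS):
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name")] = value.text
        rows.append(row)
    return rows


bindings = read_bindings(RESULT_XML)
print(bindings)
```

The same document could equally be fed to an XSLT stylesheet or queried with XPath, as the slide suggests.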
22 Final example
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX rss: <http://purl.org/rss/1.0/>
    SELECT ?link ?title
    FROM <http://rss.slashdot.org/Slashdot/slashdotScience>
    FROM <http://www.nature.com/nprot/current_issue/rss/index.html>
    WHERE {
      ?i rss:link ?link .
      ?i dc:date ?date . FILTER (?date > "2008-08-31")
      ?i rss:description ?desc . FILTER regex(?desc, "biolog|mathematic", "i")
      ?i rss:title ?title
    }
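The two FILTER clauses in this query can be imitated over a list of feed items. A minimal sketch, assuming the garbled pattern is meant as the alternation biolog|mathematic and that dates are ISO-formatted strings (so plain string comparison orders them correctly); the feed items themselves are invented.

```python
import re

# Hypothetical feed items standing in for the two RSS sources
items = [
    {"title": "Modelling cell growth", "date": "2008-09-15",
     "desc": "A Mathematical model of growth"},
    {"title": "Old protocol", "date": "2008-08-01",
     "desc": "Biological assay protocol"},
    {"title": "Telescope news", "date": "2008-10-02",
     "desc": "New observatory opens"},
    {"title": "Protein folding", "date": "2008-09-01",
     "desc": "biology of folding"},
]

# The "i" flag of SPARQL regex() corresponds to re.IGNORECASE here
PATTERN = re.compile(r"biolog|mathematic", re.IGNORECASE)


def keep(item):
    """Both filters must hold: recent enough and description matches."""
    return item["date"] > "2008-08-31" and PATTERN.search(item["desc"]) is not None


selected = [item["title"] for item in items if keep(item)]
print(selected)
```

An item that matches the pattern but is too old ("Old protocol") is rejected, as is a recent item whose description does not match ("Telescope news").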
23 2-page reference guide
- http://www.dajobe.org/2005/04-sparql/SPARQLreference-1.8-us.pdf
24 Using Protégé
- SPARQL plug-in to run queries on your ontology
25 Semantic Web with Rules
- Metalog
- RuleML
- SWRL
- WRL
- Cwm
- N3: http://hydrogen.informatik.tu-cottbus.de/wiki/index.php/N3_Notation
- Jess
- Jena
- RIF
26 Rules - expressing logic
- Notation, e.g. Horn rules
  - (P1 ∧ P2 ∧ ...) → C
  - parent(?x,?y) ∧ brother(?y,?z) ⇒ uncle(?x,?z)
  - For any X, Y and Z: if Y is a parent of X, and Z is a brother of Y, then Z is the uncle of X
27 Examples from http://www.w3.org/Submission/SWRL/
- A simple use of these rules would be to assert that the combination of the hasParent and hasBrother properties implies the hasUncle property. Informally, this rule could be written as
    hasParent(?x1,?x2) ∧ hasBrother(?x2,?x3) ⇒ hasUncle(?x1,?x3)
- In the abstract syntax the rule would be written like
    Implies(Antecedent(hasParent(I-variable(x1) I-variable(x2))
                       hasBrother(I-variable(x2) I-variable(x3)))
            Consequent(hasUncle(I-variable(x1) I-variable(x3))))
- From this rule, if John has Mary as a parent and Mary has Bill as a brother then John has Bill as an uncle.
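The John, Mary and Bill example can be reproduced with a naive forward-chaining loop. This is a sketch of the general idea only, not the algorithm of any particular SWRL or Jena engine; facts are stored as (property, subject, object) tuples and the function names are hypothetical.

```python
# Asserted facts, written as (property, subject, object)
facts = {
    ("hasParent", "John", "Mary"),
    ("hasBrother", "Mary", "Bill"),
}


def uncle_rule(known):
    """hasParent(?x1,?x2) and hasBrother(?x2,?x3) imply hasUncle(?x1,?x3)."""
    derived = set()
    for p1, x1, x2 in known:
        if p1 != "hasParent":
            continue
        for p2, subject, x3 in known:
            if p2 == "hasBrother" and subject == x2:
                derived.add(("hasUncle", x1, x3))
    return derived


def forward_chain(facts, rules):
    """Apply every rule repeatedly until no new facts appear."""
    known = set(facts)
    while True:
        new = set().union(*(rule(known) for rule in rules)) - known
        if not new:
            return known
        known |= new


closure = forward_chain(facts, [uncle_rule])
print(sorted(closure))
```

The loop stops as soon as a pass over the rules adds nothing new, so the result contains the two asserted facts plus the single derived hasUncle fact.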
28 Examples
- An even simpler rule would be to assert that Students are Persons, as in
    Student(?x1) ⇒ Person(?x1)
    Implies(Antecedent(Student(I-variable(x1))) Consequent(Person(I-variable(x1))))
- However, this kind of use for rules in OWL just duplicates the OWL subclass facility. It is logically equivalent to write instead
    Class(Student partial Person)
  or
    SubClassOf(Student Person)
  which would make the information directly available to an OWL reasoner.
29 Rule Interchange Format (RIF)
- Leading candidate for W3C Recommendation
- Interlingua (similar to KIF)
- http://www.w3.org/2005/rules/wiki/RIF_Working_Group
- Tools starting (just) to emerge
30 Test an interchanged RIF rule set
- testQuery(Literal): test the literal (rule head or fact)
- testNotQuery(Literal): negatively test the literal with default negation
- testNegQuery(Literal): negatively test the literal with explicit negation
- testNumberOfResults(Literal, Number): test that the number of results derived for the literal equals the stated value
- testNumberOfResults(Literal, Var, Number): test the number of results for the variable in the literal
- testNumberOfResultsMore(Literal, Number): test that the number of results for the literal is greater than the given value
- testNumberOfResultsLess(Literal, Number): test that the number of results for the literal is less than the given value
- testNumberOfResultsMore(Literal, Var, Number): test that the number of results for the variable in the literal is greater than the given value
- testNumberOfResultsLess(Literal, Var, Number): test that the number of results for the variable in the literal is less than the given value
31 More RIF testing
- testResult(QueryLiteral, ResultLiteral): test if the second literal is an answer of the query literal
- testResults(Literal, Var, <BindingList>): test if the list of binding results for the variable in the literal can be derived
- testResultsOrder(Literal, Var, <BindingList>): test if the list of ordered binding results for the variable in the literal can be derived
- testQueryTime(Literal, MaxTime): test if the literal can be derived in less than the stated time in milliseconds
- testNotQueryTime(Literal, MaxTime): test if the literal can be derived negatively by default in less than the stated time in milliseconds
- testNegQueryTime(Literal, MaxTime): test if the literal can be derived strongly negative in less than the stated time in milliseconds
- getQueryTime(Literal, Time): get the query time for the literal
- getNotQueryTime(Literal, Time): get the default negated query time for the literal
- getNegQueryTime(Literal, Time): get the explicitly negated query time for the literal
32 Testing class membership
    Document(
      Prefix(fam <http://example.org/family#>)
      Group (
        Forall ?X ?Y (
          fam:isFatherOf(?Y ?X) :- And (fam:isSonOf(?X ?Y) fam:isMale(?Y) ?X#fam:Child ?Y#fam:Parent)
        )
        fam:isSonOf(fam:Adrian fam:Uwe)
        fam:isMale(fam:Adrian)
        fam:isMale(fam:Uwe)
        fam:Adrian#fam:Child
        fam:Uwe#fam:Parent
      )
    )
- Conclusion: fam:isFather(fam:Uwe fam:Adrian)
33 XML for conclusion
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE Document [
      <!ENTITY rif "http://www.w3.org/2007/rif#">
      <!ENTITY xs "http://www.w3.org/2001/XMLSchema#">
      <!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    ]>
    <Atom xmlns="&rif;">
      <op>
        <Const type="&rif;iri">http://example.org/family#isFather</Const>
      </op>
      <args>
        <Const type="&rif;iri">http://example.org/family#Uwe</Const>
        <Const type="&rif;iri">http://example.org/family#Adrian</Const>
      </args>
    </Atom>
    <!--XML document generated on Tue Dec 30 12:08:16 EST 2008-->
34 Language options that you can implement
- JenaRules is based on RDF(S) and uses the triple
representation of RDF descriptions (see also N3
Notation and Turtle Syntax).
35 Examples
- Any driver living in New York who holds a driver training certificate is eligible for insurance.
    <ex:Driver rdf:about="http://example.com/John">
      <ex:state>New York</ex:state>
      <ex:hasTrainingCertificate rdf:datatype="http://www.w3.org/2001/XMLSchema#boolean">true</ex:hasTrainingCertificate>
    </ex:Driver>

    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
    @prefix ex: <http://example.com/>.
    @prefix xs: <http://www.w3.org/2001/XMLSchema#>.

    [eligibleDriver: (?d rdf:type ex:EligibleDriver)
        <-
        (?d rdf:type ex:Driver)
        (?d ex:state "New York")
        (?d ex:hasTrainingCertificate "true"^^xs:boolean)]
36 A driver is young if aged between 18 and 25.
    <ex:Driver rdf:about="http://example.com/John">
      <ex:age rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">21</ex:age>
    </ex:Driver>

    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
    @prefix ex: <http://example.com/>.
    @prefix xs: <http://www.w3.org/2001/XMLSchema#>.

    [youngDriver: (?d rdf:type ex:YoungDriver)
        <-
        (?d rdf:type ex:Driver)
        (?d ex:age ?a)
        greaterThan(?a, 18)
        lessThan(?a, 25)]
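One detail of this rule is easy to miss: greaterThan and lessThan are strict comparisons, so drivers aged exactly 18 or exactly 25 are not classified as young. The sketch below mirrors the rule body as a plain Python predicate; apart from John (age 21, from the slide) the drivers and ages are invented for illustration.

```python
# Hypothetical driver ages; only ex:John's age of 21 comes from the slide
ages = {
    "ex:John": 21,
    "ex:Ann": 18,
    "ex:Raj": 25,
    "ex:Mia": 24,
}


def is_young(age):
    """Mirror of greaterThan(?a, 18) and lessThan(?a, 25): both strict."""
    return 18 < age < 25


young_drivers = sorted(driver for driver, age in ages.items() if is_young(age))
print(young_drivers)
```

If the boundary ages were meant to count as young, the rule would need non-strict comparisons instead.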
37 Negation
    <ex:Driver rdf:about="http://example.com/John">
      <ex:name>John Smith</ex:name>
    </ex:Driver>

    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
    @prefix ex: <http://example.com/>.

    [eligibleDriver: (?d rdf:type ex:TypicalDriver)
        <-
        (?d rdf:type ex:Driver)
        noValue(?d rdf:type ex:YoungDriver)
        noValue(?d rdf:type ex:SeniorDriver)]
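The noValue conditions behave like negation as failure: a driver counts as typical because no YoungDriver or SeniorDriver statement is present, not because such a statement is known to be false. A small sketch over a set of triples, where `no_value` and `typical_drivers` are hypothetical helpers and ex:Ann is an invented second driver:

```python
facts = {
    ("ex:John", "rdf:type", "ex:Driver"),
    ("ex:Ann", "rdf:type", "ex:Driver"),
    ("ex:Ann", "rdf:type", "ex:YoungDriver"),
}


def no_value(known, subject, predicate, obj):
    """True when the triple is absent from the known facts."""
    return (subject, predicate, obj) not in known


def typical_drivers(known):
    """Drivers with neither a YoungDriver nor a SeniorDriver type."""
    drivers = {s for s, p, o in known
               if p == "rdf:type" and o == "ex:Driver"}
    return {d for d in drivers
            if no_value(known, d, "rdf:type", "ex:YoungDriver")
            and no_value(known, d, "rdf:type", "ex:SeniorDriver")}


before = typical_drivers(facts)
# Learning one more fact withdraws an earlier conclusion
after = typical_drivers(facts | {("ex:John", "rdf:type", "ex:SeniorDriver")})
print(before, after)
```

Because adding a fact can remove a conclusion, rules of this kind are non-monotonic, which is worth keeping in mind when such rules are combined with open-world OWL reasoning.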
38 Multiple rules, split disjunction
    <ex:Driver rdf:about="http://example.com/John">
      <ex:state>Vancouver</ex:state>
      <ex:accidentsNumber rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">1</ex:accidentsNumber>
    </ex:Driver>

    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
    @prefix ex: <http://example.com/>.

    [eligibleDriver_1: (?d rdf:type ex:EligibleDriver)
        <-
        (?d rdf:type ex:Driver)
        (?d ex:state "New York")
        (?d ex:accidentsNumber ?an)
        lessThan(?an, 2)]

    [eligibleDriver_2: (?d rdf:type ex:EligibleDriver)
        <-
        (?d rdf:type ex:Driver)
        (?d ex:state "Vancouver")
        (?d ex:accidentsNumber ?an)
        lessThan(?an, 2)]
39 Using Protégé
- SWRL plugin for editing rules
- Jena (instructions for running the rule engine and using inference: http://hydrogen.informatik.tu-cottbus.de/wiki/index.php/JenaRules)
40 Inference structure
41 Lastly and briefly
- Jess rules (LISP-like)
- Jess rules engine: http://herzberg.ca.sandia.gov/jess/
- http://www.jessrules.com/jess/docs/Jess71p2.pdf
42 Implementation
- Cover language representation choices, and knowledge engineering
- Pull apart the use case
- Tools and services
- Architecture considerations and design choices
43 Languages
- OWL
- RDFS
- SKOS
- RIF
- SPARQL
- OWL-S
44 RDFS
- Note: XMLS is not an ontology language
  - Changes format of DTDs (document schemas) to be XML
  - Adds an extensible type hierarchy
    - Integers, Strings, etc.
    - Can define sub-types, e.g., positive integers
- RDFS is recognisable as an ontology language
  - Classes and properties
  - Sub/super-classes (and properties)
  - Range and domain (of properties)
45 However
- RDFS too weak to describe resources in sufficient detail
  - No localized range and domain constraints
    - Can't say that the range of hasChild is person when applied to persons and elephant when applied to elephants
  - No existence/cardinality constraints
    - Can't say that all instances of person have a mother that is also a person, or that persons have exactly 2 parents
  - No transitive, inverse or symmetrical properties
    - Can't say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that touches is symmetrical
- Difficult to provide reasoning support
  - No native reasoners for non-standard semantics
  - May be possible to reason via First Order axiomatisation
46 OWL requirements
- Desirable features identified for Web Ontology Language
  - Extends existing Web standards
    - Such as XML, RDF, RDFS
  - Easy to understand and use
    - Should be based on familiar KR idioms
  - Formally specified
  - Of adequate expressive power
  - Possible to provide automated reasoning support
47 The OWL language
- Three species of OWL
  - OWL Full is union of OWL syntax and RDF
  - OWL DL restricted to FOL fragment (≈ DAML+OIL)
  - OWL Lite is an easier to implement subset of OWL DL
- Semantic layering
  - OWL DL ≈ OWL Full within DL fragment
  - DL semantics officially definitive
- OWL DL based on SHIQ Description Logic
  - In fact it is equivalent to SHOIN(Dn) DL
- OWL DL benefits from many years of DL research
  - Well defined semantics
  - Formal properties well understood (complexity, decidability)
  - Known reasoning algorithms
  - Implemented systems (highly optimized)
49 OWL Class Constructors
50 OWL axioms
51 SKOS properties
- skos:note
  - e.g. Anything goes.
- skos:definition
  - e.g. A long curved fruit with a yellow skin and soft, sweet white flesh inside.
- skos:example
  - e.g. A bunch of bananas.
- skos:scopeNote
  - e.g. Historically members of a sheriff's retinue armed with pikes who escorted judges at assizes.
- skos:historyNote
  - e.g. Deleted 1986. See now Detention, Institutionalization (Persons), or Hospitalization.
- skos:editorialNote
  - e.g. Confer with Mr. X. re deletion.
- skos:changeNote
  - e.g. Promoted "love" to preferred label, demoted "affection" to alternative label, Joe Bloggs, 2005-08-09.
52 SKOS core and RDFS/OWL
- Disjoint?
  - Should skos:Concept be disjoint with
    - rdf:Property?
    - rdfs:Class?
    - owl:Class?
- DL?
  - Should SKOS Core be an OWL DL ontology?
  - Means not allowing flexibility in range of documentation props
  - It is now (2008)!
53 OWL 2
- http://www.w3.org/2007/OWL/wiki/OWL_Working_Group
- http://www.w3.org/2007/OWL/wiki/Image:Owl2-refcard_2008-09-24.pdf
54 Tutorial Summary
- Many different options for ontology querying; none are standard
- RDF query is most advanced
- Inference needs and choice will depend on descriptive requirements (e.g. DL, Full, RDF, etc.)