RDFS Reasoning and Query Answering on Top of DHTs

1
RDFS Reasoning and Query Answering on Top of DHTs
  • Zoi Kaoudi, Iris Miliaraki and Manolis Koubarakis
  • Department of Informatics and Telecommunications
  • National and Kapodistrian University of Athens

2
Outline
  • Introduction
  • Background
  • Algorithms
  • Evaluation
  • Future Work

3
Introduction
  • RDFS reasoning is essential for Semantic Web
    applications
  • Centralized RDF stores
    • Forward chaining
    • Backward chaining
    • Hybrid approach
  • Time-space trade-off

4
Introduction
  • DHT-based RDF stores
  • No support for RDFS reasoning (RDFPeers, papers
    by Liarou et al., etc.)
  • Only BabelPeers [Battre06] has considered RDFS
    reasoning, and only with a forward chaining approach

5
Our work
  • Implementation of both forward chaining and
    backward chaining algorithms in a real DHT system
    that enables RDFS reasoning
  • Comparative study of algorithms
    • Analytically
    • Experimentally

6
Background in DHTs
  • Structured overlay networks
  • Solve the item location problem in a distributed
    and dynamic network of nodes (in O(log N) hops)
  • Let x be some data item. Find the node that holds
    x!
  • For data items we use a key (K) to compute an
    identifier (id)
  • Distributed version of the hash table data structure
  • id = Hash(K)
  • Main operations
    • Put(K, x): given a key K (for a data item x), map
      the key onto a node and store x there.
    • Get(K): return the data item with the given key.
    • Both take O(log N) hops
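
The Put/Get interface above can be sketched in a single process. This is only an illustration under stated assumptions: identifiers come from SHA-1, a key belongs to the first node whose identifier follows it on the ring (as in Chord-like DHTs), and the name `ToyDHT` and its methods are hypothetical. A real DHT routes in O(log N) hops instead of consulting a global node list.

```python
import hashlib
from bisect import bisect_left


def hash_id(key: str, bits: int = 32) -> int:
    """id = Hash(K): map a key onto the identifier ring."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class ToyDHT:
    """Hypothetical in-memory stand-in for a DHT (no networking, no routing)."""

    def __init__(self, node_names):
        # Nodes get identifiers from the same hash function as the keys.
        self.ring = sorted((hash_id(name), name) for name in node_names)
        self.storage = {name: {} for name in node_names}

    def responsible_node(self, key: str) -> str:
        # First node whose identifier is >= id, wrapping around the ring.
        ids = [node_id for node_id, _ in self.ring]
        pos = bisect_left(ids, hash_id(key)) % len(self.ring)
        return self.ring[pos][1]

    def put(self, key: str, item) -> None:
        node = self.responsible_node(key)
        self.storage[node].setdefault(key, []).append(item)

    def get(self, key: str):
        node = self.responsible_node(key)
        return self.storage[node].get(key, [])
```

For example, `ToyDHT(["n1", "n2", "n3"]).put("painter", triple)` stores the triple at whichever of the three nodes is responsible for Hash("painter"), and `get("painter")` asks that same node.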

7
Data Model
  • RDF data and RDFS descriptions can be written as
    RDF triples
  • RDF(S) database
  • RDFS entailment rules from W3C RDF Semantics
  • Not considered in this paper
    • Axiomatic triples
    • Rules with blank nodes

8
RDFS Entailment Rules
  • subClass(X, Y) :- triple(X, rdfs:subClassOf, Y).
  • subClass(X, Y) :- triple(X, rdfs:subClassOf, Z),
    subClass(Z, Y).
  • subProperty(X, Y) :- triple(X, rdfs:subPropertyOf,
    Y).
  • subProperty(X, Y) :- triple(X, rdfs:subPropertyOf,
    Z), subProperty(Z, Y).
  • type(X, Y) :- triple(X, rdf:type, Y).
  • type(X, Y) :- type(X, Z), subClass(Z, Y).
  • type(X, Y) :- triple(X, P, Z), triple(P,
    rdfs:domain, Y).
  • type(X, Y) :- triple(Z, P, X), triple(P,
    rdfs:range, Y).

edb relation: triple
idb relations: subClass, subProperty, type
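
To make the eight rules concrete, here is a minimal centralized sketch that evaluates them bottom-up until nothing new is derived. It is not the distributed algorithm of the talk; the function name `rdfs_closure` and the returned dictionary layout are assumptions made for illustration.

```python
SC, SP, TYPE = "rdfs:subClassOf", "rdfs:subPropertyOf", "rdf:type"
DOMAIN, RANGE = "rdfs:domain", "rdfs:range"


def rdfs_closure(triples):
    """Naive fixpoint over the idb relations subClass, subProperty and type."""
    edb = set(triples)
    direct_sc = {(s, o) for s, p, o in edb if p == SC}
    direct_sp = {(s, o) for s, p, o in edb if p == SP}
    domains = {(s, o) for s, p, o in edb if p == DOMAIN}
    ranges = {(s, o) for s, p, o in edb if p == RANGE}

    sub_class = set(direct_sc)                         # non-recursive subClass rule
    sub_prop = set(direct_sp)                          # non-recursive subProperty rule
    types = {(s, o) for s, p, o in edb if p == TYPE}   # type from rdf:type triples
    for s, p, o in edb:
        types |= {(s, c) for q, c in domains if q == p}   # domain rule
        types |= {(o, c) for q, c in ranges if q == p}    # range rule

    changed = True
    while changed:
        before = (len(sub_class), len(sub_prop), len(types))
        # Recursive rules: join a direct edge with what is already derived.
        sub_class |= {(x, y) for x, z in direct_sc
                      for z2, y in sub_class if z == z2}
        sub_prop |= {(x, y) for x, z in direct_sp
                     for z2, y in sub_prop if z == z2}
        types |= {(x, y) for x, z in types
                  for z2, y in sub_class if z == z2}
        changed = before != (len(sub_class), len(sub_prop), len(types))
    return {"subClass": sub_class, "subProperty": sub_prop, "type": types}
```

With the painter/artist/person hierarchy used later in the talk plus a triple such as (rubens, rdf:type, flemish), the result contains both subClass(flemish, person) and type(rubens, person).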
9
Indexing
triple t = (s1, p1, o1)
Index identifiers: Hash(s1), Hash(p1), Hash(o1)
Stored at: responsible node for s1, responsible node
for p1, responsible node for o1
query q = (?s, p1, ?o)
Index identifier: Hash(p1)
Sent to: responsible node for p1
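
The indexing scheme on this slide can be sketched as follows. The helper names are hypothetical, SHA-1 is an assumed hash function, and routing a query by its first constant is an assumption: the slide only shows a query whose single constant is the predicate.

```python
import hashlib


def hash_id(term: str) -> int:
    """Assumed hash function; any uniform hash over terms would do."""
    return int.from_bytes(hashlib.sha1(term.encode("utf-8")).digest(), "big")


def index_identifiers(triple):
    """A triple (s, p, o) is indexed three times, once per component."""
    return [hash_id(term) for term in triple]


def query_identifier(pattern):
    """Route a triple pattern by a constant; variables start with '?'."""
    for term in pattern:
        if not term.startswith("?"):
            return hash_id(term)   # assumption: the first constant is used
    raise ValueError("pattern has no constant to route on")
```

Because q = (?s, p1, ?o) is routed with Hash(p1), it reaches the node that also received t = (s1, p1, o1) through its predicate index.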
10
Algorithms
  • Forward chaining
    • Compute all inferences a priori
  • Backward chaining
    • Compute inferences on demand

11
Distributed Forward Chaining
[Figure: rdfs:subClassOf (sc) hierarchy over the
classes person, artist, painter and flemish]
12
Distributed Forward Chaining
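[Figure: sc hierarchy flemish sc painter sc artist sc
person, with the triples stored at each node]
  • Node responsible for painter:
    (painter, rdfs:subClassOf, artist),
    (flemish, rdfs:subClassOf, painter)
  • Node responsible for artist:
    (painter, rdfs:subClassOf, artist),
    (artist, rdfs:subClassOf, person)
  • Node responsible for flemish:
    (flemish, rdfs:subClassOf, painter)
  • Node responsible for person:
    (artist, rdfs:subClassOf, person)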
13
Distributed Forward Chaining
[Figure: sc hierarchy and per-node triples; one node
is shown applying the rules to its local triples]
triples
  (painter, rdfs:subClassOf, artist)
  (artist, rdfs:subClassOf, person)
rules
  subClass(X, Y) :- triple(X, rdfs:subClassOf, Y).
  subClass(X, Y) :- triple(X, rdfs:subClassOf, Z),
  subClass(Z, Y).
infer
  (painter, rdfs:subClassOf, person)
14
Distributed Forward Chaining
[Figure: sc hierarchy and per-node triples; arrows
show inferred triples being sent to other nodes]
  • Node responsible for painter infers and sends
    (flemish, rdfs:subClassOf, artist)
  • Node responsible for artist infers and sends
    (painter, rdfs:subClassOf, person)
15
Distributed Forward Chaining
[Figure: sc hierarchy and per-node triples after the
first inferred triples have been stored]
  • Node responsible for painter:
    (painter, rdfs:subClassOf, artist),
    (flemish, rdfs:subClassOf, painter),
    (painter, rdfs:subClassOf, person);
    infers (flemish, rdfs:subClassOf, person)
  • Node responsible for artist:
    (painter, rdfs:subClassOf, artist),
    (artist, rdfs:subClassOf, person),
    (flemish, rdfs:subClassOf, artist);
    infers (flemish, rdfs:subClassOf, person)
  • Node responsible for flemish:
    (flemish, rdfs:subClassOf, artist),
    (flemish, rdfs:subClassOf, painter)
  • Node responsible for person:
    (painter, rdfs:subClassOf, person),
    (artist, rdfs:subClassOf, person)
16
Distributed Forward Chaining
[Figure: sc hierarchy and per-node triples; two nodes
infer the same triple]
  • Node responsible for painter infers
    (flemish, rdfs:subClassOf, person)
  • Node responsible for artist infers
    (flemish, rdfs:subClassOf, person)
The same triple is generated in two nodes!!
The same triple is sent to be stored twice!!
17
Distributed Forward Chaining
[Figure: sc hierarchy and final per-node triples;
(flemish, rdfs:subClassOf, person) arrives twice]
  • Node responsible for painter:
    (painter, rdfs:subClassOf, artist),
    (flemish, rdfs:subClassOf, painter),
    (painter, rdfs:subClassOf, person)
  • Node responsible for artist:
    (painter, rdfs:subClassOf, artist),
    (artist, rdfs:subClassOf, person),
    (flemish, rdfs:subClassOf, artist)
  • Node responsible for flemish:
    (flemish, rdfs:subClassOf, artist),
    (flemish, rdfs:subClassOf, painter),
    (flemish, rdfs:subClassOf, person)
  • Node responsible for person:
    (painter, rdfs:subClassOf, person),
    (artist, rdfs:subClassOf, person),
    (flemish, rdfs:subClassOf, person)
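
The forward chaining walkthrough on the preceding slides can be reproduced with a small round-based simulation. This is a sketch under assumptions, not the authors' implementation: each "node" is a dictionary entry keyed by a class name, only the subject and object indexes are modelled, only the transitive rdfs:subClassOf rule is applied, and the function name is hypothetical.

```python
from collections import defaultdict

SC = "rdfs:subClassOf"


def distributed_forward_chaining(triples):
    """Each node joins the sc triples that meet at its own key, then sends results."""
    store = defaultdict(set)      # key -> triples held by the node responsible for key
    producers = defaultdict(set)  # inferred triple -> keys of nodes that generated it

    def put(triple):
        s, _, o = triple
        store[s].add(triple)      # subject index
        store[o].add(triple)      # object index (predicate index omitted)

    for t in triples:
        put(t)

    while True:
        outbox = []               # messages sent during this round
        for key in list(store):
            local = store[key]
            subs = [s for s, p, o in local if p == SC and o == key]
            supers = [o for s, p, o in local if p == SC and s == key]
            for x in subs:
                for y in supers:
                    inferred = (x, SC, y)
                    if key not in producers[inferred]:
                        producers[inferred].add(key)
                        outbox.append(inferred)
        if not outbox:
            break                 # fixpoint: no node produced anything new
        for t in outbox:
            put(t)

    all_triples = set().union(*store.values())
    return all_triples, producers
```

On the three original triples of the example, the simulation ends with six distinct triples, and `producers` records that (flemish, rdfs:subClassOf, person) was generated by both the painter node and the artist node, which is the redundancy the slides point out.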
18
Querying after FC
Query: Find all subclasses of person
q = (X, rdfs:subClassOf, person)
[Figure: per-node triples after forward chaining; the
query reaches the node responsible for person]
Triples at the node responsible for person:
  (painter, rdfs:subClassOf, person)
  (flemish, rdfs:subClassOf, person)
  (artist, rdfs:subClassOf, person)
Answer is found!
19
Algorithms
  • Forward chaining
    • Compute all inferences a priori
  • Backward chaining
    • Compute inferences on demand

20
Data Model - revisited
  • Recursive rules
  • Rule adornment from recursive query processing
    • Good orderings for evaluating predicates
    • e.g. subClass(X, artist) → subClass^fb(X, Y)
  • Extended adornment
    • Ordered string of f, b, k
    • k: an argument that is bound and is the key
    • b: bound argument (not the key)
    • f: free argument
    • e.g. at the node responsible for key artist:
      triple(X, rdf:type, artist) → triple^fbk(X,
      rdf:type, Y)
    • Good ordering for evaluating predicates in a
      distributed environment
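
As a small illustration of the extended adornment, the following sketch derives the f/b/k string for an atom. The function names and the convention that variables start with an upper-case letter are assumptions chosen to match how the rules are written on the slides.

```python
def is_variable(term: str) -> bool:
    # Assumption: variables are written with a leading upper-case letter.
    return term[:1].isupper()


def adorn(args, key=None, bound_vars=()):
    """Return the adornment string: f = free, b = bound, k = bound and the key."""
    letters = []
    for arg in args:
        if is_variable(arg) and arg not in bound_vars:
            letters.append("f")
        elif arg == key:
            letters.append("k")
        else:
            letters.append("b")
    return "".join(letters)
```

For instance, `adorn(("X", "rdf:type", "artist"), key="artist")` gives `"fbk"`, matching the example for the node responsible for key artist, while `adorn(("X", "artist"))` gives the classic adornment `"fb"`.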

21
RDFS Entailment Rules - revisited
  • subClass^kf(X, Y) :- triple^kbf(X,
    rdfs:subClassOf, Y).
  • subClass^kf(X, Y) :- triple^kbf(X,
    rdfs:subClassOf, Z), subClass^ff(Z, Y).
  • subClass^fk(X, Y) :- triple^fbk(X,
    rdfs:subClassOf, Y).
  • subClass^fk(X, Y) :- subClass^ff(X, Z),
    triple^fbk(Z, rdfs:subClassOf, Y).
  • subProperty^kf(X, Y) :- triple^kbf(X,
    rdfs:subPropertyOf, Y).
  • subProperty^kf(X, Y) :- triple^kbf(X,
    rdfs:subPropertyOf, Z), subProperty^ff(Z, Y).
  • subProperty^fk(X, Y) :- triple^fbk(X,
    rdfs:subPropertyOf, Y).
  • subProperty^fk(X, Y) :- subProperty^ff(X, Z),
    triple^fbk(Z, rdfs:subPropertyOf, Y).
  • type^kf(X, Y) :- triple^kbf(X, rdf:type, Y).
  • type^kf(X, Y) :- triple^kff(X, P, Z),
    triple^fbf(P, rdfs:domain, Y).
  • type^kf(X, Y) :- triple^ffk(Z, P, X),
    triple^fbf(P, rdfs:range, Y).
  • type^kf(X, Y) :- triple^kbf(X, rdf:type, Z),
    subClass^ff(Z, Y).
  • type^fk(X, Y) :- triple^fbk(X, rdf:type, Y).
  • type^fk(X, Y) :- triple^fff(X, P, Z),
    triple^fbk(P, rdfs:domain, Y).
  • type^fk(X, Y) :- triple^fff(Z, P, X),
    triple^fbk(P, rdfs:range, Y).
  • type^fk(X, Y) :- type^ff(X, Z), triple^fbk(Z,
    rdfs:subClassOf, Y).

22
Distributed Backward Chaining
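Query: Find all subclasses of person
q = (X, rdfs:subClassOf, person)
[Figure: rdfs:subClassOf (sc) hierarchy and the
triples stored at each node]
  • Node responsible for painter:
    (painter, rdfs:subClassOf, artist),
    (flemish, rdfs:subClassOf, painter)
  • Node responsible for sculptor:
    (sculptor, rdfs:subClassOf, artist)
  • Node responsible for artist:
    (painter, rdfs:subClassOf, artist),
    (artist, rdfs:subClassOf, person),
    (sculptor, rdfs:subClassOf, artist)
  • Node responsible for flemish:
    (flemish, rdfs:subClassOf, painter)
  • Node responsible for person:
    (artist, rdfs:subClassOf, person)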
23
Distributed Backward Chaining
Query: Find all subclasses of person
q = (X, rdfs:subClassOf, person)
[Figure: rdfs:subClassOf (sc) hierarchy and the
triples stored at each node, next animation step]
24
Distributed Backward Chaining
subClass^fk(X, person)
25
Distributed Backward Chaining
Which predicate should we evaluate first?
subClass^fk(X, person)
  r1: triple^fbk(X, rdfs:subClassOf, person)
  r2: triple^fbk(Z, rdfs:subClassOf, person),
      subClass^ff(X, Z)
(r1) subClass^fk(X, Y) :- triple^fbk(X,
     rdfs:subClassOf, Y).
(r2) subClass^fk(X, Y) :- subClass^ff(X, Z),
     triple^fbk(Z, rdfs:subClassOf, Y).
We choose to evaluate first the predicate that has a
k in its adornment
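
The ordering choice stated on this slide can be written as a one-line heuristic. The sketch below assumes a rule body is a list of hypothetical `(predicate, adornment, args)` tuples; it moves atoms with a k in their adornment to the front and otherwise keeps the written order.

```python
def order_body(body):
    """Evaluate k-adorned atoms first; sorted() is stable, so other atoms keep their order."""
    return sorted(body, key=lambda atom: "k" not in atom[1])


# Body of r2 as written: subClass^ff(X, Z), triple^fbk(Z, rdfs:subClassOf, Y)
r2_body = [
    ("subClass", "ff", ("X", "Z")),
    ("triple", "fbk", ("Z", "rdfs:subClassOf", "person")),
]
ordered_r2 = order_body(r2_body)
```

Here `ordered_r2` starts with the `triple` atom, so the lookup that can be answered locally at the node responsible for person runs before the recursive subClass atom.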
26
RDFS Entailment Rules - revisited
  • subClass^kf(X, Y) :- triple^kbf(X,
    rdfs:subClassOf, Y).
  • subClass^kf(X, Y) :- triple^kbf(X,
    rdfs:subClassOf, Z), subClass^ff(Z, Y).
  • subClass^fk(X, Y) :- triple^fbk(X,
    rdfs:subClassOf, Y).
  • subClass^fk(X, Y) :- subClass^ff(X, Z),
    triple^fbk(Z, rdfs:subClassOf, Y).
  • subProperty^kf(X, Y) :- triple^kbf(X,
    rdfs:subPropertyOf, Y).
  • subProperty^kf(X, Y) :- triple^kbf(X,
    rdfs:subPropertyOf, Z), subProperty^ff(Z, Y).
  • subProperty^fk(X, Y) :- triple^fbk(X,
    rdfs:subPropertyOf, Y).
  • subProperty^fk(X, Y) :- subProperty^ff(X, Z),
    triple^fbk(Z, rdfs:subPropertyOf, Y).
  • type^kf(X, Y) :- triple^kbf(X, rdf:type, Y).
  • type^kf(X, Y) :- triple^kff(X, P, Z),
    triple^fbf(P, rdfs:domain, Y).
  • type^kf(X, Y) :- triple^ffk(Z, P, X),
    triple^fbf(P, rdfs:range, Y).
  • type^kf(X, Y) :- triple^kbf(X, rdf:type, Z),
    subClass^ff(Z, Y).
  • type^fk(X, Y) :- triple^fbk(X, rdf:type, Y).
  • type^fk(X, Y) :- triple^fff(X, P, Z),
    triple^fbk(P, rdfs:domain, Y).
  • type^fk(X, Y) :- triple^fff(Z, P, X),
    triple^fbk(P, rdfs:range, Y).
  • type^fk(X, Y) :- type^ff(X, Z), triple^fbk(Z,
    rdfs:subClassOf, Y).

Notice that every predicate that has a k in its
adornment is the edb relation triple!
27
Distributed Backward Chaining
subClass^fk(X, person)
  r1: triple^fbk(X, rdfs:subClassOf, person)
  r2: triple^fbk(Z, rdfs:subClassOf, person),
      subClass^ff(X, Z)
    Z / artist
    subClass^fk(X, artist)
      r1: triple^fbk(X, rdfs:subClassOf, artist)
      r2: triple^fbk(Z, rdfs:subClassOf, artist),
          subClass^ff(X, Z)
        Z / sculptor
        Z / painter
28
Distributed Backward Chaining
subClass^fk(X, person)
  r1: triple^fbk(X, rdfs:subClassOf, person)
  r2: triple^fbk(Z, rdfs:subClassOf, person),
      subClass^ff(X, Z)
    Z / artist
    subClass^fk(X, artist)
      r1: triple^fbk(X, rdfs:subClassOf, artist)
      r2: triple^fbk(Z, rdfs:subClassOf, artist),
          subClass^ff(X, Z)
        Z / painter
        subClass^fk(X, painter)
          r1: triple^fbk(X, rdfs:subClassOf, painter)
          r2: triple^fbk(Z, rdfs:subClassOf, painter),
              subClass^ff(X, Z)
        Z / sculptor
        subClass^fk(X, sculptor)
          r1: triple^fbk(X, rdfs:subClassOf, sculptor)
          r2: triple^fbk(Z, rdfs:subClassOf, sculptor),
              subClass^ff(X, Z)
Evaluation
  • Analytical cost model
  • Experimental evaluation

30
Experimental Setup
  • Both algorithms have been implemented as a real
    distributed system using Bamboo DHT
  • Experiments were conducted in PlanetLab (123
    nodes available at the time of the experiments)
  • Synthetic data from RBench generator
    [Theoharis05]
    • number of instances: 10^3, 10^4
    • RDFS class hierarchy: tree depth 2-6 (7 to 128
      classes)
    • distribution: both uniform and Zipf (z = 1)
  • Query: give me the instances of the root class

31
Metrics
  • Network traffic
    • number of messages sent
    • bandwidth
  • Storage load
    • total number of triples stored
  • Storage time
  • Query response time

32
Network traffic (while storing)
33
Storage load
34
Query response time
35
Comparison
  • Forward chaining
    • Query response time
    • Storage load
    • Storage time
    • Network traffic due to generated redundancies
  • Backward chaining
    • Storage load
    • Storage time
    • No redundancies
    • Query response time
    • Query processing load

36
Summary
  • How to implement forward and backward chaining in
    a distributed environment
  • Both algorithms have been integrated in the
    conjunctive query processing algorithms of our
    system Atlas (http://atlas.di.uoa.gr)
  • What techniques we need to extend to make this
    implementation feasible
  • How these algorithms perform in a real
    decentralized environment (PlanetLab)
  • Our algorithms could be adapted for general
    recursive query processing

37
Future Work
  • Ongoing work
    • Support for all RDFS entailment rules
    • Experimenting with complex queries (LUBM
      benchmark)
  • Future work
    • Optimize forward chaining
    • Hybrid approach
    • Network churn

38
Thank you!!
Questions?