RDF Aggregate Queries and Views

Title: RDF Aggregate Queries and Views


1
RDF Aggregate Queries and Views
  • Edward Hung, Yu Deng, V.S. Subrahmanian
  • University of Maryland, College Park

2
Maintenance of RDF Aggregate Views
  • Introduction of RDF and RDQL
  • RDQL Extension for Aggregate Views
  • Aggregate View Maintenance Algorithms AMX
  • Implementation and Experiments
  • Related Work

3
Publication
  • Edward Hung, Yu Deng, V.S. Subrahmanian, "RDF
    Aggregate Queries and Views", to appear in the
    Proc. of the 21st International Conference on
    Data Engineering (ICDE), Tokyo, Japan, 2005.

4
Introduction
  • Resource Description Framework (RDF)
  • W3C Recommendation
  • Represents metadata about resources identifiable
    on the web (by Uniform Resource Identifier (URI))
  • Triple (Resource, Property, Value)
  • (Artist, rdf:type, rdfs:Class)
  • (Painter, rdf:type, rdfs:Class)
  • (Painter, rdfs:subClassOf, Artist)
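
A minimal sketch of the three triples above, assuming plain Python tuples stand in for RDF statements. The helper `superclasses` is hypothetical and only illustrates how the subClassOf edge can be followed; it is not part of RDF or of the presentation.

```python
# Each RDF statement is a (resource, property, value) triple.
triples = {
    ("Artist", "rdf:type", "rdfs:Class"),
    ("Painter", "rdf:type", "rdfs:Class"),
    ("Painter", "rdfs:subClassOf", "Artist"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls via rdfs:subClassOf edges."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


print(superclasses("Painter", triples))  # {'Artist'}
```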

5
RDF Schema
  • <?xml version="1.0"?>
  • <!DOCTYPE rdf:RDF [<!ENTITY xsd
    "http://www.w3.org/2001/XMLSchema#">]>
  • <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xml:base="http://www.auctionschema.com/schema1">
  • <rdfs:Class rdf:ID="Artist"/>
  • <rdfs:Class rdf:ID="Painter"><rdfs:subClassOf
    rdf:resource="#Artist"/></rdfs:Class>
  • <rdfs:Datatype rdf:about="&xsd;string"/>
  • <rdf:Property rdf:ID="fname">
  • <rdfs:domain rdf:resource="#Artist"/>
  • <rdfs:range rdf:resource="&xsd;string"/>
  • </rdf:Property>
  • </rdf:RDF>

RDF Instance
  • <?xml version="1.0"?>
  • <!DOCTYPE rdf:RDF [<!ENTITY xsd
    "http://www.w3.org/2001/XMLSchema#">]>
  • <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  • xmlns:ns1="http://www.auctionschema.com/schema1#">
  • <rdf:Description rdf:about="http://www.artist.net#guyrose">
  • <rdf:type rdf:resource="ns1:Painter"/>
  • <ns1:fname rdf:datatype="&xsd;string"> Guy
    </ns1:fname>

6
  • (Same RDF schema and instance as on slide 5, shown
    next to their graph.)
  • Graph labels: Painter subClassOf Artist; fname from
    Artist to String; r1 fname "Guy"
  • r1 = http://www.artist.net#guyrose
8
RDQL: RDF Query Language
  • SELECT ?highprice
  • WHERE (?artist, <ns1:lname>, "Rose"),
  • (?artist, <ns1:fname>, "Guy"),
  • (?artist, <ns1:creates>, ?artifact),
  • (?artifact, <ns1:estimated>, ?price),
  • (?price, <ns1:high>, ?highprice),
  • (?artifact, <ns1:presented>, ?date)
  • AND 2004-04-01 < ?date < 2004-04-30
  • USING ns1 FOR <http://www.auctionschema.com/schema1#>
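
The query above can be read as a set of triple patterns matched against the RDF graph. The following is a rough sketch of that reading, assuming an in-memory list of triples; the `match` function, the resource names (`r1`, `a1`, `p1`, ...) and the dates are invented for illustration, and this is not how the Jena RDQL engine is implemented.

```python
def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # A variable must keep the value it was first bound to.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


# Hypothetical instance data: two artifacts by Guy Rose, one presented in April.
triples = [
    ("r1", "ns1:lname", "Rose"),
    ("r1", "ns1:fname", "Guy"),
    ("r1", "ns1:creates", "a1"),
    ("a1", "ns1:estimated", "p1"),
    ("p1", "ns1:high", 800000),
    ("a1", "ns1:presented", "2004-04-15"),
    ("r1", "ns1:creates", "a2"),
    ("a2", "ns1:estimated", "p2"),
    ("p2", "ns1:high", 500000),
    ("a2", "ns1:presented", "2004-05-20"),
]

patterns = [
    ("?artist", "ns1:lname", "Rose"),
    ("?artist", "ns1:fname", "Guy"),
    ("?artist", "ns1:creates", "?artifact"),
    ("?artifact", "ns1:estimated", "?price"),
    ("?price", "ns1:high", "?highprice"),
    ("?artifact", "ns1:presented", "?date"),
]

# The AND clause becomes a filter on the bound ?date (ISO dates sort as strings).
results = [
    b["?highprice"]
    for b in match(patterns, triples)
    if "2004-04-01" < b["?date"] < "2004-04-30"
]
print(results)  # [800000]
```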

view pattern
9
RDQL Extension for Aggregates and Views
  • CREATEVIEW AS
  • SELECT max(?highprice)
  • WHERE (?artist, <ns1:lname>, "Rose"),
  • (?artist, <ns1:fname>, "Guy"),
  • (?artist, <ns1:creates>, ?artifact),
  • (?artifact, <ns1:estimated>, ?price),
  • (?price, <ns1:high>, ?highprice),
  • (?artifact, <ns1:presented>, ?date)
  • AND 2004-04-01 < ?date < 2004-04-30
  • USING ns1 FOR <http://www.auctionschema.com/schema1#>

11
  • We extend the syntax of RDQL to allow constants in
    the SELECT clause, which in effect creates new
    resources and properties from those constants.
  • For example, the previous query can be modified
    as follows:
  • CREATEVIEW AS
  • SELECT <ns1:works_by_guyrose>, <ns1:maxprice>,
    max(?highprice)
  • WHERE (?artist, <ns1:lname>, "Rose"),
  • (?artist, <ns1:fname>, "Guy"),
  • (?artist, <ns1:creates>, ?artifact),
  • (?artifact, <ns1:estimated>, ?price),
  • (?price, <ns1:high>, ?highprice),
  • (?artifact, <ns1:presented>, ?date)
  • AND 2004-04-01 < ?date < 2004-04-30
  • USING ns1 FOR <http://www.auctionschema.com/schema1#>
  • The result is a valid RDF statement:
    (<ns1:works_by_guyrose>, <ns1:maxprice>,
    "800000"^^ns1:USD)

14
Aggregate View Maintenance
  • Relational Approach
  • Store all triples in a relational table with
    schema (Resource, Property, Value)
  • OR
  • Store resources and values of the same property
    in a separate relational table with schema
    (Resource, Value)
  • Number of self-joins = (number of triples in the
    WHERE clause) - 1
  • Large number of delta rules during relational
    view maintenance, hence expensive
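
To make the self-join count concrete, here is a small sketch of the single-table layout using Python's built-in `sqlite3` module. The table name, column names, and sample rows are assumptions for illustration; the presentation does not give a concrete relational schema beyond (Resource, Property, Value).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (resource TEXT, property TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("r1", "ns1:lname", "Rose"),
        ("r1", "ns1:creates", "a1"),
        ("a1", "ns1:estimated", "p1"),
        ("p1", "ns1:high", "800000"),
    ],
)

# Four triple patterns need four aliases of the same table,
# i.e. three self-joins.
query = """
SELECT t4.value
FROM triples t1
JOIN triples t2 ON t2.resource = t1.resource
JOIN triples t3 ON t3.resource = t2.value
JOIN triples t4 ON t4.resource = t3.value
WHERE t1.property = 'ns1:lname' AND t1.value = 'Rose'
  AND t2.property = 'ns1:creates'
  AND t3.property = 'ns1:estimated'
  AND t4.property = 'ns1:high'
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('800000',)]
```

Each additional triple pattern in the WHERE clause adds one more alias and one more join condition, which is where the growing number of delta rules for view maintenance comes from.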

15
Aggregate View Maintenance
  • Graph-structured DB (GSDB) [Zhuge, Garcia-Molina,
    ICDE 1998]
  • GSDB assumes a rooted graph model, while RDF is a
    general graph
  • A GSDB view contains a set of nodes, while our RDF
    views can contain nodes, edges, or any
    combination of the two.

16
Aggregate View Maintenance
  • Our Approach
  • Localized search in RDF graphs
  • breadth-first search starting at the
    inserted/deleted edge
  • auxiliary data are needed for certain aggregate
    views
  • min, max, avg
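
The idea of a localized search can be sketched as a bounded breadth-first search from the endpoints of the changed edge. This is only an illustration of the general idea under assumed data structures (a set of triples, an undirected traversal, a hop limit chosen from the view pattern); the function `local_neighbourhood` is hypothetical and is not the AMI or AMD algorithm from the paper.

```python
from collections import deque


def local_neighbourhood(triples, edge, max_hops):
    """Collect triples reachable within max_hops of either endpoint of edge."""
    adjacency = {}
    for t in triples:
        s, _, o = t
        adjacency.setdefault(s, []).append(t)
        adjacency.setdefault(o, []).append(t)

    start_s, _, start_o = edge
    seen = {start_s, start_o}
    queue = deque([(start_s, 0), (start_o, 0)])
    local = {edge}
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for t in adjacency.get(node, []):
            local.add(t)
            for nxt in (t[0], t[2]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
    return local


# Hypothetical graph after inserting the edge (r1, ns1:creates, a2).
new_edge = ("r1", "ns1:creates", "a2")
graph = {
    new_edge,
    ("a2", "ns1:estimated", "p2"),
    ("p2", "ns1:high", 60000),
    ("r2", "ns1:creates", "b1"),
    ("b1", "ns1:estimated", "q1"),
}
local = local_neighbourhood(graph, new_edge, max_hops=2)
```

Only the triples in `local` would need to be matched against the view pattern; the unrelated subgraph around `r2` is never visited.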

17
Compute Aggregates Algorithm CAA
18
view pattern
19
BAG
20
BAG 800000
21
BAG 800000, 500000
SELECT max(?highprice)
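
The slides above show a bag of ?highprice values being filled before the aggregate is applied. A minimal sketch of that step follows, assuming the bindings have already been produced by pattern matching; `compute_aggregate` and the `AGGREGATES` table are hypothetical names, and the transcript does not include the actual CAA pseudocode.

```python
from collections import Counter

# Aggregate functions applied to the list of values in the bag.
AGGREGATES = {
    "max": max,
    "min": min,
    "sum": sum,
    "count": len,
    "avg": lambda values: sum(values) / len(values),
}


def compute_aggregate(bindings, variable, func):
    """Collect the bag of values bound to `variable`, then aggregate it."""
    bag = Counter(b[variable] for b in bindings)
    values = list(bag.elements())
    return bag, AGGREGATES[func](values)


bindings = [{"?highprice": 800000}, {"?highprice": 500000}]
bag, result = compute_aggregate(bindings, "?highprice", "max")
print(result)  # 800000
```

A `Counter` is used so that duplicate values are kept, which matters for sum, count, and avg, and later for deletions.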
22
Aggregate View Maintenance Algorithms AMX
  • AMI: Insertion
  • AMD: Deletion
  • AMT: Triple Modification
  • AMR: Resource Modification

23
BAG 800000, 500000
Update: Insertion
paints
24
BAG 800000, 500000
paints
25
BAG 800000, 500000, 60000
SELECT max(?highprice)
paints
26
AMI for Insertion
28
Distributive Aggregate Function
  • An aggregate function f is distributive w.r.t. a
    source update operation if and only if, after such
    an operation, the updated value of the function
    can be computed from its old value and the
    value(s) of the source update, without reference
    to the source.
  • More formally, f is distributive w.r.t. an update
    operation U if and only if there exists a
    function g such that f(I') = g(f(I), v), where
    f(I) is the aggregate value, I' is the updated
    instance after the update operation U(I, v), and
    v is the value(s) used in the update (e.g., the
    new value to add, the old value to remove, etc.).
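
As a small worked check of f(I') = g(f(I), v) for insertion, the sketch below pairs each aggregate f with a candidate g and compares both sides. The table `INSERT_RULES` and the helper `holds_for_insertion` are illustrative names, and the sample values are taken from the BAG shown on the slides.

```python
# For each aggregate f, a function g with f(I + [v]) == g(f(I), v).
INSERT_RULES = {
    "sum": (sum, lambda old, v: old + v),
    "count": (len, lambda old, v: old + 1),
    "max": (max, lambda old, v: max(old, v)),
    "min": (min, lambda old, v: min(old, v)),
}


def holds_for_insertion(name, instance, v):
    """Check that the old aggregate plus v is enough to get the new one."""
    f, g = INSERT_RULES[name]
    return f(instance + [v]) == g(f(instance), v)


instance = [800000, 500000]
for name in INSERT_RULES:
    print(name, holds_for_insertion(name, instance, 60000))
```

No such g exists for max under deletion: after removing 800000 from the instance, the old maximum and the removed value alone do not determine the new maximum.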

29
Distributive Aggregate Function
  • Examples of distributive aggregate functions:
  • count, sum, and average w.r.t. insertion, deletion,
    and update
  • For average, we need an additional attribute,
    size, which stores the size of S (in line 3 of
    CAA), in order to compute the correct updated
    value (alternatively, we can maintain sum and
    count and derive the average from them)
  • max and min are distributive w.r.t. insertion,
    but not w.r.t. deletion or update
  • Auxiliary data computed from the source (such as
    S) lets us maintain non-distributive aggregate
    functions without referring back to the source.
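
A minimal sketch of how auxiliary data can support these two cases, assuming the bag S is kept as a multiset alongside the view. The classes `MaxView` and `AvgView` are hypothetical illustrations and not the TMaintainI or TMaintainD procedures named on the slides.

```python
from collections import Counter


class MaxView:
    """max view that keeps the bag of values as auxiliary data."""

    def __init__(self, values):
        self.bag = Counter(values)
        self.value = max(self.bag) if self.bag else None

    def insert(self, v):
        self.bag[v] += 1
        if self.value is None or v > self.value:
            self.value = v  # distributive case: old max and v suffice

    def delete(self, v):
        if self.bag[v] == 0:
            raise KeyError(v)
        self.bag[v] -= 1
        if self.bag[v] == 0:
            del self.bag[v]
        if v == self.value and v not in self.bag:
            # Non-distributive case: recompute from the bag, not the source.
            self.value = max(self.bag) if self.bag else None


class AvgView:
    """avg view maintained from a running sum and size."""

    def __init__(self, values):
        self.total = sum(values)
        self.size = len(values)

    def insert(self, v):
        self.total += v
        self.size += 1

    def delete(self, v):
        self.total -= v
        self.size -= 1

    @property
    def value(self):
        return self.total / self.size if self.size else None


view = MaxView([800000, 500000])
view.insert(60000)
view.delete(800000)
print(view.value)  # 500000
```

The sequence mirrors the slides: the bag grows to 800000, 500000, 60000 on insertion, and after deleting 800000 the maximum falls back to 500000 without touching the RDF source.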

30
TMaintainI
31
BAG 800000, 500000, 60000
Update: Deletion
paints
32
BAG 800000, 500000, 60000
paints
33
BAG 500000, 60000
SELECT max(?highprice)
paints
34
AMD for Deletion
35
TMaintainD
36
Implementation and Experiment
  • Implemented in Java
  • Uses the Jena RDQL engine from HP
  • Compared with the relational approach (standard
    view maintenance algorithm on relational tables)
  • Counting algorithm from Gupta et al., "Maintaining
    Views Incrementally", SIGMOD 1993
  • Dataset: Chef Moz Project RDF dump
  • Data stored in memory

38
Other Related Work
  • Volz et al. [DBFUSION 2002]
  • the first to introduce a view mechanism for RDF
    data
  • Their views require that
  • the results contain class instances (i.e., a
    subject or object variable), or
  • the result itself has the pattern of an RDF
    statement (i.e., a triple containing subject,
    predicate, and object).
  • Magkanaraki et al. [ISWC 2003]
  • proposed RVL, a view definition language that can
    also create virtual RDF schemas and restructure
    class and property hierarchies, so that new
    resources, property values, classes, and property
    types can be created.
  • Neither of these works specifically addresses (i)
    aggregates in RDF or (ii) the problem of
    maintaining aggregate RDF views.

39
Summary
  • RDQL Extension for Views and Aggregates
  • Compute Aggregates Algorithm CAA
  • Aggregate View Maintenance Algorithms AMX