Query languages II: equivalence

Slides: 19
Provided by: off9
Transcript and Presenter's Notes


1
Query languages II: equivalence & containment
(Motivation: rewriting queries using views)
  • Conjunctive queries (CQs)
  • Extensions of CQs

2
Conjunctive queries: equivalence & containment
  • For CQs q1, q2, with the same head predicate
  • Decision problems: is q1 contained in q2? Is q1 equivalent to q2?
  • The two problems are equivalent: solve one, and you have
    solved the other

3
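Solution for containment ⇒ solution for equivalence:
q1 is equivalent to q2 iff q1 is contained in q2 and q2 is contained in q1.
Solution for equivalence ⇒ solution for containment:
for q1: p(x) :- r1(..), ..., rk(..) and q2: p(x) :- s1(..), ..., sm(..),
q1 is contained in q2 iff q1 is equivalent to
p(x) :- r1(..), ..., rk(..), s1(..), ..., sm(..),
with the non-head variables of q2 renamed apart (here, the ri and sj are db
predicates, not necessarily different)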
4
  • Characterizations for containment: assume q1,
    q2 are given
  • A mapping h from the variables of q2 to the
    variables/constants of q1 (extended naturally to
    constants and atoms) is a
    homomorphism from q2 to q1 if:
  • It maps head(q2) to head(q1)
    (assuming same heads ⇒ identity on
    head vars)
  • It maps each atom of q2 to an atom of q1
  • If there are constraints on the side, Ci in qi,
    then h(C2) is implied by C1
  • Notation

5
Thm: The following are equivalent for CQs w/o
built-in preds:
(i) q2 contains q1
(ii) there is a homomorphism from q2 to q1
Proof: (ii) ⇒ (i) is easy (and
holds even with b.i. preds). Every valuation v from
q1 into a db D can be composed with h to a
valuation from q2. Hence, every answer of q1 on
D is also an answer of q2 on D.
(Diagram: h maps q2 to q1, and v maps q1 into D)
6
  • For (i) ⇒ (ii):
  • The body of a CQ (w/o b.i. preds) can be viewed as a
    db
  • consider each variable as a constant, different
    from all constants in the CQ and the other
    variables
  • or, replace each variable x by a distinct
    constant c_x
  • Denote this db by db(q)
  • Obviously, q(db(q)) contains the head of q (or
    its image)
  • Example:
  • Q: q(d) :- movies(t,d,a),
    directory(Plaza, t, 1930)
  • db(Q): movies(c_t, c_d, c_a),
    directory(Plaza, c_t, 1930)
  • Obviously, applying Q to this db, one obtains
    q(c_d) (use the identity
    valuation)
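
The canonical db construction above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: atoms are encoded as (predicate, argument-tuple) pairs, a term counts as a variable iff it starts with a lowercase letter, and the hypothetical helper `freeze` names the fresh constant for x as `C_x`. The evaluator is deliberately naive (it tries every valuation).

```python
from itertools import product

def is_var(term):
    # Assumed convention: a term is a variable iff it starts with a lowercase letter.
    return term[0].islower()

def freeze(term):
    # Replace variable x by the fresh constant C_x; constants stay as they are.
    return "C_" + term if is_var(term) else term

def canonical_db(body):
    # db(q): the body atoms with every variable frozen into a constant.
    return {(pred, tuple(freeze(t) for t in args)) for pred, args in body}

def evaluate(head, body, db):
    # Naive evaluation: try every valuation of the query variables
    # into the constants that occur in the db.
    variables = sorted({t for _, args in body for t in args if is_var(t)})
    domain = sorted({c for _, args in db for c in args})
    answers = set()
    for values in product(domain, repeat=len(variables)):
        v = dict(zip(variables, values))
        if all((pred, tuple(v.get(t, t) for t in args)) in db
               for pred, args in body):
            answers.add(tuple(v.get(t, t) for t in head[1]))
    return answers

# The example Q from the slide.
head = ("q", ("d",))
body = [("movies", ("t", "d", "a")), ("directory", ("Plaza", "t", "1930"))]
db = canonical_db(body)
answers = evaluate(head, body, db)
```

Here `answers` holds the single tuple for the frozen head variable, which corresponds to q(c_d) in the slide's notation.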

7
  • (i) ⇒ (ii) (q2 contains q1 ⇒ there is a homomorphism from q2 to
    q1)
  • Clearly, q1(db(q1)) contains head(q1)
  • Since q1 is contained in q2, q2(db(q1)) contains
    head(q1)
  • The valuation from q2 to db(q1) that yields this
    answer is a homomorphism
  • Example:
  • q1: p(d) :- movies(t,d,Jane),
    directory(Plaza, t, 1930), location(Plaza,
    a, 01-58776655)
  • q2: p(z) :- movies(t,z,a),
    directory(Plaza, t, 1930)
  • Obviously, q1 is contained in q2, with h: t ↦ t,
    z ↦ d, a ↦ Jane,
  • which maps the two atoms of body(q2) to the first
    two of body(q1), and head(q2) to head(q1)
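
To make the example concrete, the following sketch checks that the mapping h given above really is a homomorphism from q2 to q1. The encoding is an assumption made for illustration: a query is a (head, body) pair, atoms are (predicate, argument-tuple) pairs, and variables are the terms starting with a lowercase letter. The name `is_containment_mapping` is hypothetical.

```python
def is_var(term):
    # Assumed convention: variables start with a lowercase letter.
    return term[0].islower()

def apply(h, atom):
    # Extend h to atoms: map variables through h, leave constants fixed.
    pred, args = atom
    return (pred, tuple(h.get(t, t) if is_var(t) else t for t in args))

def is_containment_mapping(h, source, target):
    # h must send the head of source to the head of target,
    # and every body atom of source to some body atom of target.
    (head_s, body_s), (head_t, body_t) = source, target
    return apply(h, head_s) == head_t and all(
        apply(h, atom) in body_t for atom in body_s)

q1 = (("p", ("d",)),
      [("movies", ("t", "d", "Jane")),
       ("directory", ("Plaza", "t", "1930")),
       ("location", ("Plaza", "a", "01-58776655"))])
q2 = (("p", ("z",)),
      [("movies", ("t", "z", "a")),
       ("directory", ("Plaza", "t", "1930"))])

h = {"t": "t", "z": "d", "a": "Jane"}
forward = is_containment_mapping(h, q2, q1)

# One candidate in the other direction; it fails because the constant Jane
# cannot be mapped onto the variable a.
h_back = {"d": "z", "t": "t", "a": "a"}
backward = is_containment_mapping(h_back, q1, q2)
```

With this encoding `forward` is true and `backward` is false. The second check only rules out one candidate mapping; it does not by itself show that no mapping from q1 to q2 exists.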

8
  • Because of this characterization, such a
    homomorphism is also called a
    containment mapping from q2 to q1
  • Intuition: q1 is contained in q2 iff
  • q1 has the same or more atoms
  • q1 may have some constants where q2 has variables

9
Another characterization: For a rule p(..)
:- r1(..), ..., rk(..), a model is a set of facts
over p, r1, ..., rk that satisfies the rule as a
logical formula (assuming all variables are
universally quantified).
Thm: the following are equivalent
The important & useful characterization:
homomorphism, i.e., containment mapping
10
  • Algorithm and complexity
  • To decide if q1 is contained in q2, search for a
    containment mapping from the variables of q2 to
    the variables and constants of q1:
    easy & fast in many
    cases, exponential in the worst case
  • Containment is in NP:
  • given a mapping on the variables of q2, it
    is easy to check that it is a homomorphism to q1
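
The search described above can be sketched as a brute-force loop over all candidate mappings. This is a minimal illustration, not an optimized procedure: it enumerates every assignment of q2's variables to terms of q1, so it shows the exponential worst case directly. The encoding ((head, body) pairs, lowercase-initial variables) and the function names are assumptions made for this sketch.

```python
from itertools import product

def is_var(term):
    # Assumed convention: variables start with a lowercase letter.
    return term[0].islower()

def apply(h, atom):
    pred, args = atom
    return (pred, tuple(h.get(t, t) if is_var(t) else t for t in args))

def find_containment_mapping(source, target):
    # Search for a homomorphism from source to target; return it or None.
    (head_s, body_s), (head_t, body_t) = source, target
    variables = sorted({t for _, args in [head_s] + body_s
                        for t in args if is_var(t)})
    terms = sorted({t for _, args in [head_t] + body_t for t in args})
    # |terms| ** |variables| candidates: exponential in the worst case.
    for values in product(terms, repeat=len(variables)):
        h = dict(zip(variables, values))
        # Checking one candidate takes polynomial time (the NP certificate check).
        if apply(h, head_s) == head_t and all(
                apply(h, atom) in body_t for atom in body_s):
            return h
    return None

def contained_in(q1, q2):
    # q1 is contained in q2 iff there is a containment mapping from q2 to q1.
    return find_containment_mapping(q2, q1) is not None

# Hypothetical queries over a binary predicate e.
q1 = (("p", ("x",)), [("e", ("x", "y")), ("e", ("y", "x"))])
q2 = (("p", ("x",)), [("e", ("x", "y"))])
```

For these two queries, `contained_in(q1, q2)` holds (q1 only adds an atom), while `contained_in(q2, q1)` does not.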

11
  • It is NP-hard:
  • given a graph G, it is 3-colorable iff
    there is a homomorphism from G (represented as an
    edge relation) to the 3-clique
  • one can represent G as the body of q2 (using
    distinct variables for distinct nodes), the
    3-clique as the body of q1
  • for both, the head can be q( )
  • Hence, containment & equivalence are NP-complete
    (even for queries with no head variables)
  • Note: this is expression complexity, not data
    complexity (here there is no db
    actually)
  • (when such a query is applied to a db, it
    returns either (), or ∅)
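
The reduction above can be tried out on small graphs. In this sketch the graph G becomes the body of a Boolean query q2 and the 3-clique becomes the body of q1, so G is 3-colorable iff a homomorphism from q2 to q1 exists. The predicate name `e`, the node names, and the helper names are assumptions for illustration; node names must start with a lowercase letter so that they are treated as variables. The search is brute force.

```python
from itertools import product

def boolean_query(edges):
    # Head q() with no variables; one body atom e(u, v) per edge.
    return (("q", ()), [("e", (u, v)) for u, v in edges])

def find_homomorphism(source, target):
    # Every term of source is a variable here (graph nodes),
    # so a candidate h assigns each node of source a node of target.
    (_, body_s), (_, body_t) = source, target
    nodes = sorted({t for _, args in body_s for t in args})
    targets = sorted({t for _, args in body_t for t in args})
    for values in product(targets, repeat=len(nodes)):
        h = dict(zip(nodes, values))
        if all((pred, tuple(h[t] for t in args)) in body_t
               for pred, args in body_s):
            return h
    return None

# The 3-clique as an edge relation (both directions of every edge).
K3 = boolean_query([(a, b) for a in "rgb" for b in "rgb" if a != b])

def three_coloring(edges):
    # A homomorphism into the 3-clique is exactly a proper 3-coloring.
    return find_homomorphism(boolean_query(edges), K3)

triangle = [("n1", "n2"), ("n2", "n3"), ("n3", "n1")]
k4 = [("n1", "n2"), ("n1", "n3"), ("n1", "n4"),
      ("n2", "n3"), ("n2", "n4"), ("n3", "n4")]
```

`three_coloring(triangle)` returns some assignment of r, g, b to the three nodes, whereas `three_coloring(k4)` returns None, since the 4-clique is not 3-colorable.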


12
  • Minimization of CQs
  • For q, define a minimal equivalent query as any
    equivalent q' with a minimal number of body atoms
  • Thm: the minimal equivalent query of q
  • is unique up to isomorphism,
  • and can be obtained by removing some atoms from
    body(q)
  • Proof

13
Thus, for every CQ Q, there is a subset of the
body that gives a minimal equivalent query, called
a core of Q. It is not necessarily unique
(different subsets may yield cores), but all
cores are isomorphic.
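
One way to compute a core, sketched under the same assumed encoding as before ((head, body) pairs, lowercase-initial variables): try to drop each body atom in turn, and keep the smaller body whenever the current query still maps homomorphically into it. Dropping an atom can only make a query more general, so this single homomorphism test is enough to confirm equivalence. The function names are hypothetical and the search is brute force.

```python
from itertools import product

def is_var(term):
    # Assumed convention: variables start with a lowercase letter.
    return term[0].islower()

def apply(h, atom):
    pred, args = atom
    return (pred, tuple(h.get(t, t) if is_var(t) else t for t in args))

def find_homomorphism(source, target):
    (head_s, body_s), (head_t, body_t) = source, target
    variables = sorted({t for _, args in [head_s] + body_s
                        for t in args if is_var(t)})
    terms = sorted({t for _, args in [head_t] + body_t for t in args})
    for values in product(terms, repeat=len(variables)):
        h = dict(zip(variables, values))
        if apply(h, head_s) == head_t and all(
                apply(h, atom) in body_t for atom in body_s):
            return h
    return None

def minimize(head, body):
    # Greedy pass: remove an atom whenever the query maps into what remains.
    body = list(body)
    for atom in list(body):
        smaller = [a for a in body if a != atom]
        if find_homomorphism((head, body), (head, smaller)) is not None:
            body = smaller
    return body

# Hypothetical query p(x) :- r(x,y), r(x,z): one of the two atoms is redundant.
core = minimize(("p", ("x",)), [("r", ("x", "y")), ("r", ("x", "z"))])
```

Which atom survives depends on the order in which atoms are tried; here the pass drops r(x,y) first and keeps r(x,z). The other choice would give an isomorphic core.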
14
Containment & equivalence for extensions of CQs
  • Extension to UCQs: let
    q1 = r1: p(x) :- body1,1   ...   rk: p(x) :- body1,k
    q2 = s1: p(x) :- body2,1   ...   sm: p(x) :- body2,m
  • Thm: q1 is contained in q2 iff each ri is
    contained in some sj
  • Proof: ⇐ is obvious
  • ⇒: if q1 is contained in q2, then each ri is
    contained in q2
  • so q2(db(ri)) contains p(x)
  • so for some sj, sj(db(ri)) contains p(x)
  • ⇒ sj contains ri
15
Containment algorithm: For each ri, loop over
sj, and search for a containment mapping from sj
to ri. Still exponential in size (of both
queries).
Complexity: The containment problem is
now
Explanation: A relation R(..) is ptime if
membership can be verified in ptime
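
The loop just described can be sketched as follows, with a UCQ encoded as a list of CQs and each CQ as a (head, body) pair. As in the earlier slides, the encoding, the lowercase-variable convention, and the function names are assumptions made for illustration, and the mapping search is brute force.

```python
from itertools import product

def is_var(term):
    # Assumed convention: variables start with a lowercase letter.
    return term[0].islower()

def apply(h, atom):
    pred, args = atom
    return (pred, tuple(h.get(t, t) if is_var(t) else t for t in args))

def has_containment_mapping(source, target):
    (head_s, body_s), (head_t, body_t) = source, target
    variables = sorted({t for _, args in [head_s] + body_s
                        for t in args if is_var(t)})
    terms = sorted({t for _, args in [head_t] + body_t for t in args})
    for values in product(terms, repeat=len(variables)):
        h = dict(zip(variables, values))
        if apply(h, head_s) == head_t and all(
                apply(h, atom) in body_t for atom in body_s):
            return True
    return False

def ucq_contained_in(ucq1, ucq2):
    # For each rule ri of ucq1, look for some rule sj of ucq2
    # with a containment mapping from sj to ri.
    return all(any(has_containment_mapping(s, r) for s in ucq2)
               for r in ucq1)

# Hypothetical example: one rule with two atoms vs. a union of two rules.
u1 = [(("p", ("x",)), [("a", ("x",)), ("b", ("x",))])]
u2 = [(("p", ("x",)), [("a", ("x",))]),
      (("p", ("x",)), [("b", ("x",))])]
```

Here `ucq_contained_in(u1, u2)` holds, while `ucq_contained_in(u2, u1)` does not: the rule p(x) :- a(x) is not contained in the single rule of u1.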
16
For a UCQ Q we can also consider the canonical
db of Q, denoted db(Q), obtained by taking the
bodies of all the rules together as a db (with
different existential variables in different
rules).
Here also, Thm: Q1 is contained in
Q2 iff Q2(db(Q1)) contains head(Q1) (this also
gives an algorithm for checking containment,
which boils down to finding containment
mappings)
17
  • Another extension of CQs: b.i. preds in the
    body
  • Example:
  • Q1: p(x, y) :- q(x, y), r(u, v), u < v
  • Q2: p(x, y) :- q(x, y), r(u, v), r(v, u)
  • Is Q2 contained in/equivalent to Q1?
  • Q2 is equivalent to the union of
  • Q2,1: p(x, y) :- q(x, y), r(u, v), r(v, u),
    u < v
  • Q2,2: p(x, y) :- q(x, y), r(u, v), r(v, u),
    v < u
  • Clearly, Q2,1 and Q2,2 are both contained in Q1
  • This can be generalized to an algorithm that
    reduces containment to that of UCQs (omitted)

18
Containment of a UCQ Q and a (recursive)
Datalog program P: Still decidable, but double
exponential time (upper & lower bound).
Here also, Thm: P contains Q iff P(db(Q)) contains
head(Q). This gives an algorithm for checking
containment: apply P to db(Q), see if you
obtain head(Q) (do you see exponentials in this
algorithm?)
Containment of Datalog programs:
undecidable
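
The "apply P to db(Q)" test can be sketched with a naive bottom-up evaluator. This illustration takes Q to be a single CQ for simplicity and uses a hypothetical transitive-closure program; the encoding (rules as (head, body) pairs, lowercase-initial variables, frozen constants named `C_x`) and all function names are assumptions made for this sketch.

```python
def is_var(term):
    # Assumed convention: variables start with a lowercase letter.
    return term[0].islower()

def freeze(atom):
    # Replace each variable x by the fresh constant C_x.
    pred, args = atom
    return (pred, tuple("C_" + t if is_var(t) else t for t in args))

def match(atom, fact, env):
    # Try to extend env so that atom becomes equal to fact; None on failure.
    pred, args = atom
    if pred != fact[0] or len(args) != len(fact[1]):
        return None
    env = dict(env)
    for term, value in zip(args, fact[1]):
        if is_var(term):
            if env.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return env

def derive(rule, facts):
    # All head facts obtainable by one application of the rule.
    head, body = rule
    envs = [{}]
    for atom in body:
        extended = []
        for env in envs:
            for fact in facts:
                new_env = match(atom, fact, env)
                if new_env is not None:
                    extended.append(new_env)
        envs = extended
    return {(head[0], tuple(e.get(t, t) for t in head[1])) for e in envs}

def fixpoint(program, db):
    # Naive evaluation: apply all rules until no new fact appears.
    facts = set(db)
    while True:
        new = set().union(*(derive(r, facts) for r in program)) - facts
        if not new:
            return facts
        facts |= new

def program_contains(program, query):
    head, body = query
    return freeze(head) in fixpoint(program, {freeze(a) for a in body})

# Hypothetical program P: transitive closure of edge.
P = [(("path", ("x", "y")), [("edge", ("x", "y"))]),
     (("path", ("x", "y")), [("edge", ("x", "z")), ("path", ("z", "y"))])]
# Q1 asks for a path of three edges; Q2 has two unconnected edges.
Q1 = (("path", ("x", "y")),
      [("edge", ("x", "u")), ("edge", ("u", "v")), ("edge", ("v", "y"))])
Q2 = (("path", ("x", "y")), [("edge", ("x", "u")), ("edge", ("v", "y"))])
```

In this encoding `program_contains(P, Q1)` holds and `program_contains(P, Q2)` does not, because the two edges of Q2 are not connected in db(Q2).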