Containment of Conjunctive Queries over Databases with Null Values - PowerPoint PPT Presentation

1 / 37
About This Presentation
Title:

Containment of Conjunctive Queries over Databases with Null Values

Description:

Introduced for conjunctive queries (Chandra/Merlin 1977) Investigated for a wealth of query types ... However, nulls are ubiquitous in SQL databases. ICDT - 12 ... – PowerPoint PPT presentation

Number of Views:75
Avg rating:3.0/5.0
Slides: 38
Provided by: Werne67
Category:

less

Transcript and Presenter's Notes

Title: Containment of Conjunctive Queries over Databases with Null Values


1
Containment of Conjunctive Queries
overDatabases with Null Values
Carles Farré 1 Werner Nutt 2 Ernest Teniente
1 Toni Urpí 1 1 Universitat Politecnica to
Catalunya 2 Freie Universität Bozen
2
Containment of Queries
  • Introduced for conjunctive queries
    (Chandra/Merlin 1977)
  • Investigated for a wealth of query types
  • Underlying assumption
  • databases contain constants
  • but no null values
  • However, nulls are ubiquitous in SQL databases

3
Our Topic
  • Containment
  • of conjunctive queries
  • (without comparisons)
  • evaluated under set-semantics
  • over databases that may contain null values
  • possibly with a test is null
  • according to the semantics of SQL for null values
  • ? Single block SQL queries with SELECT DISTINCT

4
Nulls in SQL
  • residence(person,loc)
  • Returns an answer if the WHERE clause evaluates
    to true
  • Equalities involving NULL have the logical value
    unknown
  • Conjunctions of conditions are true iff all
    conjuncts are true
  • ? q returns locations ?NULL

5
Null-containment Example
  • Over databases w/ null values q ? q?, but not
    q? ? q
  • Over databases w/o null values q q?

6
Query Rules
  • Rule notation
  • q(x) ? residence(y1,x), residence(y2,x)
  • The constant ? does not occur in queries
  • A join variable is a variable that
    occurs at least twice in the query body

7
Query Evaluation
  • A database D is a set of ground atoms, which may
    contain the value ?
  • A query
  • q(X) ? B
  • is satisfied by an assignment ? over D if
  • ?B ? D
  • no join variable of q is mapped to ?
  • Over D, the query q returns the set of answers
  • q(D) ?X ? satisfies q over D

8
Null-containment
  • q is null-contained in q? (q ?? q?)
  • iff
  • q(D) ? q?(D) for all databases D possibly
    containing nulls
  • Clearly,
  • q ?? q? implies q ? q?, but not vice versa

9
Checking Null-containment q ?? q??
  • Given q(X) ? B, q?(X) ? B?
  • Test Databases
  • View B as a database Dq (freeze the variables)
  • Let ? be a substitution that replaces some
    non-join variables of q with ?
  • The database ?Dq is a null-version of Dq
  • Theorem
  • q ?? q? iff ?X ? q?(?Dq) for all
    null-versions ?Dq of Dq

10
Null Versions Example
  • q?(x) ? residence(y1,x), residence(y2,x)
  • q(x) ? residence(y,x)
  • Test database Dq residence(y, x)
  • Null versions of Dq are
  • D1 residence(y, x)
  • D2 residence( ?, x)
  • D3 residence(y, ?)
  • D4 residence( ?, ?)
  • We have
  • q(D3) ?, but q?(D3) ?
  • Hence q ?? q?

11
Upper Complexity Bound
  • Corollary Null-containment of conjunctive
    queries is in ?P2
  • Questions
  • Is the upper bound tight?
  • Are there reasonable sufficient criteria
    that are less
    expensive?

12
Homomorphisms
  • Consider q(X) ? B, q?(X) ? B?
  • A substitution ? for the variables of q? is
  • a condition homomorphism if ?B? ? B
  • a query homomorphism if in addition ?X X
  • a J-homomorphism if in addition ? maps every
    join variable of q?
  • either to a join variable of q
  • or to a constant

13
Sufficient Condition
  • Proposition If there is a J-homomorphism ? from
    q? to q,
  • then q ?? q?
  • Proof If ? satisfies q over D, then ?? satisfies
    q? over D

  • When is this also necessary?

14
Null-containment of Boolean Queries
  • Theorem Let q(), q?() be boolean queries. Then
  • q ?? q? ? there is a J-homomorphism from q?
    to q
  • Intuition One test database ??Dq is enough,
  • where ?? replaces every non-join variable with ?
  • Corollary For boolean queries,
  • null-containment is NP-complete

15
Null-containment Example
q?(x) ? p(x, v1, w1), p(u, v2, w2), p(u, v3,
w3), r(v1, w2)
q?(x) ? p(x, v1, w1), p(u, v2, w2), p(u, v3,
w3), r(v1, w2)
q(x) ? p(x, y1, z1), p(x2, y2, z2), p(x3, y3,
x3), r(y1, z1), r(y1, z2), r(y2, x3)
  • Note
  • q? has three occurrences of p
  • q? has three join variables,
  • u occurs twice in a p-atom
  • We look
    for homomorphisms from q? to q

v1 , w2 and u
16
Null-containment Example
q?(x) ? p(x, v1, w1), p(u, v2, w2), p(u, v3,
w3), r(v1, w2)
?1
q(x) ? p(x, y1, z1), p(x2, y2, z2), p(x3, y3,
x3), r(y1, z1), r(y1, z2), r(y2, x3)
  • ?1 is a query homomorphism, but not a
    J-homomorphism

17
Null-containment Example
q?(x) ? p(x, v1, w1), p(u, v2, w2), p(u, v3,
w3), r(v1, w2)
?2
q(x) ? p(x, y1, z1), p(x2, y2, z2), p(x3, y3,
x3), r(y1, z1), r(y1, z2), r(y2, x3)
  • ?1 is a query homomorphism, but not a
    J-homomorphism
  • ?2 is a query homomorphism, but not a
    J-homomorphism

18
Null-containment Example
q?(x) ? p(x, v1, w1), p(u, v2, w2), p(u, v3,
w3), r(v1, w2)
?3
q(x) ? p(x, y1, z1), p(x2, y2, z2), p(x3, y3,
x3), r(y1, z1), r(y1, z2), r(y2, x3)
  • ?1 is a query homomorphism, but not a
    J-homomorphism
  • ?2 is a query homomorphism, but not a
    J-homomorphism
  • ?3 is a condition homomorphism, but not a query
    homomorphism

19
Null-containment Example
q?(x) ? p(x, v1, w1), p(u, v2, w2), p(u, v3,
w3), r(v1, w2)
?
q(x) ? p(x, y1, z1), p(x2, y2, z2), p(x3, y3,
x3), r(y1, z1), r(y1, z2), r(y2, x3)
  • ?1 is a query homomorphism, but not a
    J-homomorphism
  • ?2 is a query homomorphism, but not a
    J-homomorphism
  • ?3 is a condition homomorphism, but not a query
    homomorphism
  • There is no other condition homomorphism from q?
    to q
  • Hence There is no J-homomorphism from q? to q

  • What about containment?

20
Null-containment Example
q?(x) ? p(x, v1, w1), p(u, v2, w2), p(u, v3,
w3), r(v1, w2)
q(x) ? p(x, y1, z1), p(x2, y2, z2), p(x3, y3,
x3), r(y1, z1), r(y1, z2), r(y2, x3)
  • Suppose ? satisfies q over a database D and
    retrieves ?x
  • Case 1 ?x ? ?
  • Then ??1 satisfies the body of q? and ??1x ?x

21
Null-containment Example
q?(x) ? p(x, v1, w1), p(u, v2, w2), p(u, v3,
w3), r(v1, w2)
q(x) ? p(x, y1, z1), p(x2, y2, z2), p(x3, y3,
x3), r(y1, z1), r(y1, z2), r(y2, x3)
?
D
  • Suppose ? satisfies q over a database D and
    retrieves ?x
  • Case 2 ?x ? and ?x2 ? ?
  • Then ??2 satisfies the body of q? and ??2x ?x

22
Null-containment Example
q?(x) ? p(x, v1, w1), p(u, v2, w2), p(u, v3,
w3), r(v1, w2)
q(x) ? p(x, y1, z1), p(x2, y2, z2), p(x3, y3,
x3), r(y1, z1), r(y1, z2), r(y2, x3)
?
D
  • Suppose ? satisfies q over a database D and
    retrieves ?x
  • Case 3 ?x ? and ?x2 ?
  • Then ??3 satisfies the body of q? and ??3x ?x2
    ? ?x

23
Observations
  • Existence of J-homomorphisms is not necessary
    for null-containment
  • We may have to consider condition homomorphisms,
    not only query homomorphisms
  • Sometimes, several homomorphisms together give
    rise to containment

24
Complexity
  • Theorem Null-containment of conjunctive queries
    is
  • ?P2-complete
  • When is it impossible for
    such weird things to happen?

25
Binary Queries
  • Definition A query q(X) ? B is binary, if no
    predicate
  • occurs more than twice in B
  • Theorem (Sagiv/Saraiya 92) For binary queries,
    existence
  • of a query homomorphism can be checked in
    polynomial
  • time
  • None of the queries in the last
    example was binary

26
Null-containment of Binary Queries
  • Theorem Suppose q ?? q?. If q is binary, then
  • there exists a J-homomorphism from q? to q
  • Intuition
  • Weird things as in the last example can only
    happen
  • if some predicate occurs at least three times

27
Complexity for Binary Queries
  • Lemma
  • Existence of J-homomorphisms can be reduced in
    PTIME to the existence of query homomorphisms
  • The reduction translates binary queries into
    binary queries
  • Theorem For binary queries,
  • null containment can be decided in polynomial time

28
Test Is Null
  • In SQL, we can test whether an attribute value is
    null
  • att is null
  • We model this by a new built-in predicate
  • isNull(y)
  • How difficult is containment
    if we add isNull?

29
Test Is Not Null
  • In a conjunctive query, we can force a variable y
  • to have only non-null values
  • Transform q() ? , p(x, y, w),
  • into
  • q() ? , p(x, y, w), p(x?, y, w?),
  • where x?, w? are new variables.

30
Checking Null-containment with isNull
  • Modify the previous approach
  • Consider only null versions of test databases
    where
  • ?y ? if isNull(y)
  • Consequence Containment is still in ?P2
  • Sufficient conditions?

31
J-Homomorphisms
  • Now
  • J-Homomorphism
  • respects join variables
  • maps isNull to isNull
  • Existence of a J-homomorphism continues
    to be sufficient for
    null-containment

32
Null Tests Example
  • Also for boolean queries, existence of a
    J-homomorphism
  • is no more necessary for null-containment
  • q?() ? p(c,u1), p(u1,u2),
  • r(u1,w1), isNull(w1),
  • r(u2,w2), r(w3,w2)
  • q() ? p(c,v1), p(c,v2), p(v1,v2), p(v2,v3),
  • r(v1,z1), isNull(z1),
  • r(v2,z2),
  • r(w3,w2), r(w3,w2)

33
Null Tests Example
  • Note
  • w1 must be bound to ?
  • w2 must be bound to a non-null value
  • z1 must be bound to ?
  • z3 must be bound to a non-null value
  • There is no constraint on z2

q?()
q()
34
Null Tests Example
  • There is no
  • J-homomorphism

q?()
q()
35
Null Tests Example
  • Suppose, ? satisfies q
  • over a database D.
  • Case 1 ?z2 ? ?
  • ??1 satisfies q?
  • Case 2 ?z2 ?
  • ? ??2 satisfies q?

q?()
q()
36
Null Tests Complexity
  • Theorem Null-containment of boolean conjunctive
    queries
  • with null tests is ?P2-complete
  • Observation We only used the fact that a
    variable
  • is either bound to ?, or
  • to a value ??
  • The proof can be adapted to
  • x ? c, x lt c
  • x c, x ? c
  • p(x), ?p(x)

37
Summary
  • Formalisation of containment in the presence of
    SQL nulls
  • Null-containment is
  • ?P2-complete for conjunctive queries with null
    tests
  • ?P2-hard for
  • boolean queries with null tests
  • queries w/o null tests with distinguished
    variables
  • NP-complete for boolean queries w/o null tests
  • polynomial for binary queries w/o null tests
  • There is a sufficient PTIME test, which is
    necessary for binary queries
  • Containment is ?P2-hard for conjunctive queries
    with negated atoms
Write a Comment
User Comments (0)
About PowerShow.com