Title: View Matching for Outer-Join Views
1View Matching for Outer-Join Views
- Paul Larson and Jingren ZhouMicrosoft Research
2Outline of talk
- Motivation
- Definitions and notation
- Join-disjunctive normal form
- View-matching algorithm
- Experimental results
3Outer-join usage
- Outer joins can be used for several purposes
- OLAP queries that preserve tuples from dimension
tables - Construct hierarchical views (XML) that preserve
objects with no children - Computing negative subqueries (NOT EXISTS, NOT
IN, ALL)
SELECT Department.dname FROM Department WHERE NOT
EXISTS (SELECT FROM Employee WHERE
Department.dno Employee.dno)
SELECT Department.dname FROM Department
LEFT OUTER JOIN Employee ON
(Department.dno Employee.dno) WHERE
Employee.dno IS NULL
4Motiving example
SELECT ... FROM Part LEFT OUTER JOIN
Lineitem ON (p_partkeyl_partkey)
?
?
View V SELECT FROM Part LEFT OUTER JOIN
(Orders LEFT OUTER JOIN Lineitem ON
(o_orderkeyl_orderkey)) ON
(p_partkeyl_partkey) NOTEl_orderkey -gt
o_orderkey l_partkey -gt p_partkey
RewriteSELECT ... FROM VWHERE p_partkey is not
null
?
?
SELECT ... FROM Orders LEFT OUTER JOIN
Lineitem ON (o_orderkeyl_orderkey)
5Definitions and notation (1)
- Predicate P is strong (null-rejecting) if P is
false when any referenced column is null - Outer union R ? S
- Schema is union of columns in R and S
- Null-extend input tuples and take union
6Definitions and notation (2)
- A tuple t1 subsumes as tuple t2 if
- t1 agrees with t2 on all non-null columns and
- t1 contains fewer null values than t2
- Removal of subsumed tuples T? eliminates all
subsumed tuples from T - Minimum union R ? S (R ? S)?
- Is associative and commutative
- Left outer join T1 o?p T2 (T1 ?p T2) ? T1
- Full outer join T1 o?op T2 (T1 ?p T2) ? T1 ?
T2
7Join-disjunctive normal form
Full outer join
P2 is null-rejecting
sp2(P,O), sp1?p2(P,O,L) P O, sp1(O,L)
Left outer join
o?op2
Part
o?p1
O sp1(O,L)
Orders
Lineitem
Minimum union
sp1?p2(POL)
? sp1(OL) ? O ?
P
C. Galindo-Legaria, Sigmod, 1994
8Matching term by term
View V
Query 1
Query 2
o?p2
o?p1
o?p2
Part
Part
Lineitem
o?p1
Orders
Lineitem
Foreign-key join from L to O
Orders
Lineitem
sp1(O,L) ? O
sp2(P,L) ? P
sp1?p2(P,O,L) ? P
9Outer-join view matching
View
Step 1 Convert to normal form
Step 2 Check containment of terms
Step 3 Recover required terms -
selects with not-null predicates
Step 4 Select desired tuples -
apply residual query predicates
Step 5 Eliminate subsumed tuples -
selects duplicate elimination
Step 6 Combine partial results -
outer union of partial results -
reduce no of scans
Query
10Example view and query
View V1 Select lok, ln, lq, lp, ok, od, otp, ck,
cn, cnkfrom (select from C where cnk lt 10) Cr
right outer join ((select from O
where otp gt 50 ) Or full outer join
(select from L where lq lt 100) Lr
on (ok lok) ) OLj on (ck
ock) Query Q1 Select lok, lq, lp, od, otpfrom
(select from O where otp gt 150 ) Or
right outer join (select from L where
lq lt 100) Lr on (ok lok)
11Step 1 Convert to normal form
V1 scnklt10 otp gt50 lq lt100 jp_co
jp_ol (C,O,L) ?
scnklt10 otp gt50 jp_co (C,O) ?
sotp gt50 lq lt 100
jp_ol (O,L) ?
sotp gt50 (O) ?
slq lt 100 (L)
Q1 sotp gt150
lq lt 100 jp_ol (O,L) ?
slq lt
100 (L)
12Step 2 Check containment
Is every query tuple contained in the view? Test
each term pred(Q) ?pred(V) ?
?
(O,L)-term (otp gt150 lq lt 100 jp_ol) ?(otp
gt50 lq lt 100 jp_ol)
?
(L)-term (lq lt 100) ? (lq lt 100)
13Step 3 Term recovery
- (O,L)-term
- Duplicate elimination not required because the
(C,O,L) term has the same hub as the (O,L)
term - Non-null columns available for O and L
- sok ? null ln ? null V1
- (L)-term
- Duplicate elimination not required because the
terms (C,O,L) and (O,L) have the same hub as
the (L) term - Non-null columns available for L
- sln ? null V1
14Step 4 Select desired tuples
- (O,L)-term
- View predicate (otp gt 50 lq lt 100 jp_ol)
- Query predicate (otp gt150 lq lt 100 jp_ol)
- sotp gt 150 sok ? null ln ? null V1
- (L)-term
- View predicate (lq lt 100)
- Query predicate (lq lt 100)
- sln ? null V1
15Step 5 Eliminate subsumed tuples
- (O,L)-term
- Maximal term contains no subsumed tuples
- sotp gt 150 ok ? null ln ? null V1
- (L)-term
- An (L)-tuple may be subsumed by a tuple in the
(O,L)-term - Discard all tuples that also satisfy predicate
of (O,L)-term - sln ? null (otp gt 150 ok ? null ln ?
null) V1
16Step 6 Combine partial results
- Project each term onto required columns
- Combine using outer union
- Q1 ? lok, lq, lp, od, otp sotp gt 150 ok ?
null ln ? null V1 ? ? lok, lq, lp
sln ? null (otp gt 150 ok ? null ln ?
null) V1
- The two predicates are mutually exclusive
- Combine the two scans to produce final
substitute expression - Q1 ? lok, lq, lp, cstmt sln ? null ) V1
- cstmt case when ln ? null (otp gt 150
ok ? null ln ? null)
then null, null else
od, otp end
17Experimental results
Database TPC-R, 1GB View s(C)? s(C,O) ?
s(C,O,L) Query 1 s(C,O,L) Query 2 s(C,O) ?
s(C,O,L) Query 3 s(C,O) ? s(C,O,L),
two scans, one with duplicate
elimination Query 4 s(C,O),
duplicate elimination
18Conclusion
- First general view-matching algorithm for
outer-join views - Produces efficient substitute expressions
- Extended to outer-join views with aggregation
(see paper and tech report) - Additional results and proofs in tech report
(MSR-TR-2005-78) available at www.research.microso
ft.com