Title: Using Association Rules to Add or Eliminate Query Constraints Automatically
1Using Association Rules to Add or Eliminate Query
Constraints Automatically
Greek State Scholarships Foundation
National Bank of Greece
- Agathoniki Trigoni and Ken Moody
- July 2001
2Background Semantic Optimization
- Semantic Optimization Heuristics
- H1 Removing constraints
- H2 Adding constraints on indices
select x from Employees as x where
x.salarygt35,000
select x from Employees as x where
x.salarygt35,000 and x.year_of_birthlt1975
and x.year_of_employmentlt1995
3Motivation
a) What if association rules have exceptions?
Eoid2,oid5
salarygt35,000 ? year_of_birthlt1975
4Motivation
b) What if none of the rules can be used to apply
the heuristics?
select x from Employees x where x.salarygt35,000
and x.year_of_birthlt1975
- salarygt35,000 ? positionmanager
- position manager ? year_of_birthlt1970
5Objectives
What if
rules have exceptions?
none of the rules is applicable?
The proposed algorithm applies the semantic
optimization heuristics
- identifying direct or indirect rules useful for
the optimization - considering the exceptions of all rules involved
in the optimization solution
6Outline
- Mapping query predicates to rule constraints
- Identifying associations between constraints
- Identifying optimization solutions
- Transforming OQL queries
- Benefits of the optimization heuristics
7Query Predicates ? Rule Constraints H1
Constraint Elimination
select x from Employees as x where
x.salarygt35,000 and x.year_of_birthlt1975
Rules
year_of_birthlt1960
salarygt30,000
Rules
salarygt40,000 salary in 32,000,40,000
year_of_birthlt1960
Sources
Targets
8Query Predicates ? Rule Constraints H2
Constraint Introduction
select x from Employees as x where
x.salarygt35,000
Rules
year_of_employment lt1995
salarygt30,000
Rules
salarygt40,000 salary in 32,000,40,000
year_of_employment lt1995
Sources
Targets
9Algorithm - Step 1Backtracking
C1-R13
C2-R21
C5-R52
C1-R11
C4-R41
Ccomp2
C5-R54
Ccomp1
C3-R31
C5-R53
C5-R51
Path Annotations from C1-R11 to C4-R41
10Step 2 Combining annotations
E4
E2
C2-R21
C5-R52
E5
C1-R11
C4-R41
Ccomp2
E1
E3
C5-R54
Ccomp1
C3-R31
Epath1E1
Epath2E2
Epath3E2
Epath3f(E2,E4)
Epath(12) f(E1,E2)
Epath(12)f(E1,E2,E3)
Epath(123)f(E1,E2,E3,E4)
Epath(123)f(E1,E2,E3,E4,E5)
11Output of the algorithm
C1-R11
C4-R41
Epath(123)f(E1,E2,E3,E4,E5)
E(A?B)
E(B?C)
A
B
C
E(A?C) e e ?E(A?B), not(C(e)) union e
e ?E(B?C), A(e)
12Transforming OQL Queries
H1 Constraint Elimination Heuristic
select x from Employees as x where C1-R11 and
C4-R41
solutionilt C1-R11 , Epath(123)gt
(select x from Employees as x where C1-R11
) except Epath(123)
13Transforming OQL Queries
H2 Constraint Introduction Heuristic
select x from Employees as x where C1-R11
solutionilt C4-R41 , Epath(123) gt
(select x from Employees as x where C1-R11 and
C4-R41 ) union Epath(123)
14Selecting the Optimal Solution
Constraint Elimination Heuristic
select from where c1 and c2 and c3 c1 ? c2
? c3 (E1) c1 ? c3 ? c2 (E2) (select from
where c1 and c2 ) except E1 (select from
where c1 and c3 ) except E2
Solution with the Fewest Exceptions
15Selecting the Optimal Solution
select from where c1 and c2 c1 ? c31
(E1) c2 ? c32 (E2) (select from where c1
and c2 and c31) union E1 (select from where
c1 and c2 and c32) union E2
Constraint Introduction Heuristic
Solution with the Highest Index Selectivity
16Benefits of the heuristics
- Constraint Elimination Heuristic
- CPU-related benefits
- except operation (for exceptions)
- queries optimized once, executed frequently
- Constraint Introduction Heuristic
- data access time benefits
- union operation (for exceptions)
- ad-hoc queries
17Conclusion
- We provided an algorithm that applies the
optimization heuristics - using association rules with exceptions instead
of integrity rules, - taking advantage of indirect associations.
- The implementation of this framework showed that
- benefits (H2) gtgt benefits (H1)
- cost (H1 and H2) depends on the complexity of
the graph of constraints and the number of
exceptions per rule
18Algorithm Step 3Select useful associations
Relaxed ?Target
Con in RC ?Target
Ci ? C0
C ? C0
C C0
C C0
Relaxed ?Target
Con in RC / Relaxed ?Target
C1Cn ? C0
Combine two previous cases
C1 Cn C0
19Query Predicates ? Rule Constraints
select x from Employees as x where
x.salarygt35,000 and x.year_of_birthlt1975
- Relaxed constraints
- Relaxed combinations of constraints
- Tightened constraints (H1)
- Tightened combinations of constraints (H1)
- Index constraints (H2)
S
T
20Identifying solutions Heuristic 1
E1
E4
C1-index
C1-R1
C6-R6
E5
E2
E3
C2-index
C3-R3
C4-R4
C3-index
Solutions
21Algorithm - Step 2
Step 2.1 Absorbing Simple Constraints
E1
E2
E3
C1-R11
C2-R21
Ccomp2
C4-R41
C5-R54
E
E3
C1-R11
Ccomp2
C4-R41
C5-R54
Ee e?E1, not(C5-R54(e)) union e e?E2,
C1-R11(e)
22Algorithm Step 2
Step 2.2 Combining Path Annotations
E3
Ccomp2
C4-R41
C5-R52
C1-R11
Epath1
E3
Ccomp2
C5-R52
C4-R41
Epath2
C2-R22
E3
Ccomp2
C3-R31
Epath3
C5-R51
C4-R41
Epath4e e ? Epath1, C5-R51(e) union e
e ? Epath3, C1-R11(e)
23Transforming OQL Queries
H2 Constraint Introduction Heuristic
select x from Employees as x where C1 and and
Cn
solutionilt Ci-index, Ei gt
(select x from Employees as x where C1 and and
Cn and Ci-index) union Ei
24Transforming OQL Queries
H1 Constraint Elimination Heuristic
select x from Employees as x where C1 and and
Cn
solutionilt Ci1, , Cim, Ei gt
(select x from Employees as x where Ci1 and and
Cim) except Ei