ICS 214A: Database Management Systems Winter 2004 - PowerPoint PPT Presentation

1 / 30

About This Presentation

Title:

ICS 214A: Database Management Systems Winter 2004

Description:

1. ICS 214A: Database Management Systems. Winter 2004. Lecture ... for sets, not bags. Example: sp1vp2 (R) = sp1(R) sp2(R) is true for sets, but not for bags. ... – PowerPoint PPT presentation

Number of Views:25

Avg rating:3.0/5.0

Slides: 31

Provided by: che7

Category:

more less

Transcript and Presenter's Notes

Title: ICS 214A: Database Management Systems Winter 2004

1
ICS 214A Database Management Systems Winter 2004

Lecture 15 Query Optimization (cont)
Professor Chen Li

2
Some transformation rules work for sets, not bags

Example sp1vp2 (R) sp1(R) ? sp2(R) is true for
sets, but not for bags.
For instance Ra,a,b,b,b,c, P1 satisfied by
a,b P2 satisfied by b,c
sp1(R) a,a,b,b,b
sp2(R) b,b,b,c
sp1vp2 (R) a,a,b,b,b,c
sp1(R) ? sp2(R) a,a,b,b,b,b,b,b,c ltgtsp1vp2
(R) !

3
Rules s combined

Let p predicate with only R attributes
q predicate with only S attributes
m predicate with only R, S attributes
sp (R S)
sq (R S)
Example
R(dept, id, salary), S(id, city, tel)
p depttoy, q city irvine, m id gt 500

sp (R) S R sq (S)
4
Rules s combined (continued)

Some rules can be derived

sp?q (R S) sp sq (R S) sp R
sq (S) sp (R) sq (S)
5
Derive the following by yourself

sp?q?m (R S) sm (sp R) (sq S)
spvq (R S) (sp R) S?R (sq S)

6
Rules p, s combined

Let x subset of R attributes
z attributes in predicate p (subset of R
attributes)
pxsp (R)
Example
R(dept, id, salary)
x salary
p depttoy and id gt 500 ? z dept, id

sp px (R)
7
Rules p, combined
Let x subset of R attributes y subset
of S attributes z intersection of R,S
attributes pxy (R S)pxypxz (R) pyz
(S)
Example R(dept, id, salary), S(id, city,
tel) x dept, y city zid
8

pxy sp (R S)

Example R(dept, id, salary), S(id, city,
tel) x dept, y city, zid p (id gt
500 city irvine)
9
Rules s, ? combined

sp(R ? S) sp(R) ? sp(S)
sp(R - S) sp(R) - S sp(R) - sp(S)

10
Some basic transformations

sp1?p2 (R) ? sp1sp2 (R)
sp (R S) ? sp (R) S
R S ? S R
px sp(R) ? pxsp pxz (R)

11
Conventional wisdom do projects early

Example R(A,B,C,D,E) xE P
(A3) ? (Bcat)
px sp (R) versus pE sppABE(R)

12
Query Optimizer

Find the best physical plan
Comprises the following
Plan space (search space)
huge number of alternative, semantically
equivalent plans
computationally expensive to examine all
Conventional wisdom avoid bad plans
need to include plans that have low cost
Cost model
facilitate comparisons of alternative plans
has to be accurate
Enumeration algorithm
search strategy (optimization algorithm) that
searches through the plan space
has to be efficient (low optimization overhead)

13
Plan Space

Left-deep trees right child has to be a base
table
Right-deep trees left child has to be a base
table
Deep trees one of the two children is a base
table
Bushy tree unrestricted

Deep tree
Bushy tree
Left-deep tree
14
Search Algorithms

Exhaustive enumerate each possible plan, and
pick the best
Greedy Techniques
Randomized/Transformation Techniques
System-R approach
Dynamic programming with pruning

15
1. Greedy Algorithm (Example)

Smallest relation next
Suppose Ri lt Rk for i lt k

R1
All plans must begin with R1
R2
R1
R3
R2
R4
R5
R3
R4
R5
All plans beginning with R2-R5 have been pruned!
16
2. Randomized Techniques

Employ randomized/transformation techniques for
query optimization
Hill climbing
State space -- space of plans, State -- plan
Each state has a cost associated with it
determined by some cost model
A move is a perturbation applied to a state to
get to another state
a move set is the set of moves available to go
from one state to another
any one move is chosen from this move set
randomly
each move set has a probability associated to
indicate the probability of selecting the move

17
Randomized Algorithm (Example)

We have covered many equivalence-preservation
rules.

R4
R4
R3
R3
R1
R2
R2
R1
R4
R4
R3
R3
R1
R2
R2
R1
18
Search space
19
Randomized Techniques (cont)

Two states are neighboring states if one move
suffices to go from one state to the other
A local minimum in the state space is a state
such that its cost is lower than that of all
neighboring states
A global minimum is a state which has the lowest
cost among all local minima
at most one global minimum value (could have
multiple states)
A move that takes one state to another with a
lower cost is called a downward move otherwise
it is an upward move
in a local/global minimum, all moves are upward
moves

20
3. Optimization in System-R

A.k.a. Selinger-style query optimization
The classic paper on query optimization (Selinger
et al. 1979)
Originally used in the IBM System-R project
Now widely used
Basic ideas
Left-deep trees only
Bottom-up generation of plans
Use dynamic-programming paradigm
Considering interesting orders

21
Bottom-up plan generation

Assumption 1 Once we have joined k relations,
the method of joining this result further with
another relation is independent of the previous
join methods (true?)
Assumption 2 Any subplan of an optimal plan must
also be optimal (otherwise we could replace the
subplan to get a better overall plan). (ture?)
Bottom-up generation of optimal plans
Compute the optimal plan for joining k relations
Suboptimal plans are pruned
From these plans, derive the optimal plans for
joining k1 relations

22
Basic version dynamic programming

The algorithm proceeds by considering
increasingly larger subsets of the set of all
relations.
Plans for a set of cardinality i are constructed
as extensions of the best plan for a set of
cardinality i-1
Search space can be pruned based on the principal
of optimality
if two plans differ only in a subplan, then the
plan with the better subplan is also the better
plan

23
Dynamic Programming (Cont)

1 2 3
4
1 2 1 3 1 4 2 3
2 4 3 4
1 2 3 1 2 4 2 3
4 1 3 4
1 2 3 4
24
Example

Consider joining relations R,S,T,U. Each with
1000 tuples.

S
R
T
U
Cost taken to be size of intermediate results.
Actually should take I/O cost or some other cost
metric.
25
Example (cont)
S
R
T
U
Only left-deep trees considered.
26
Dynamic Programming (cont)

Pass 1 find the best single-relation plans
Pass 2 find the best two-relation plans by
considering each single-relation plan (from pass
1) as the outer relation and every other relation
as the innner relation
Pass k Find the best k-relation plans by
considering each (k-1)-relation plan (from Pass
k-1) as the outer relation and every other
relation as the inner relation
Until k of relations in the query

27
Improvement interesting order

Is it always true that locally optimal is
globally optimal? NO!
Example R(A,B) JOIN S(A,C) JOIN T(A,D)
Best plan for R JOIN S hash join (beats
sort-merge join)
Best overall plan sort-merge join R and S, and
then sort-merge join with T
Subplan of the optimal plan is not optimal
Why?
The result of the sort-merge join of R and S is
sorted on A
This is an interesting order that can be
exploited by later processing (e.g., join,
duplicate elimination, GROUP BY, ORDER BY, etc.)!

28
Dealing with interesting orders

When picking the optimal plan
Comparing their costs is not enough
Plans are not totally ordered by cost anymore
Comparing interesting orders is also needed
Plans are now partially ordered
Plan X is better than plan Y if
Cost of X is lower than Y
Interesting orders produced by X subsume those
produced by Y
Need to keep a set of optimal plans for joining
every combination of k relations
Typically one for each interesting order

29
Summary

Query optimization is hard.
Instead of finding the best, the objective is to
avoid the bad plans
Many different optimization strategies have been
proposed
greedy heuristics are fast but may generate plans
that are far from optimal
dynamic programming is effective at the expense
of high optimization overhead

30
Summary of this course
SQL statements
Access plan
Query Processor
optimizer
Read write records, scan relations
Record-oriented file system
Get page containing tuples
Buffer manager
Basic file system
Read/write file pages
Hardware
Other related DB courses ICS214B, ICS215

Write a Comment

User Comments (0)