ICS 214A: Database Management Systems Winter 2004 - PowerPoint PPT Presentation

1 / 30
About This Presentation
Title:

ICS 214A: Database Management Systems Winter 2004

Description:

1. ICS 214A: Database Management Systems. Winter 2004. Lecture ... for sets, not bags. Example: sp1vp2 (R) = sp1(R) sp2(R) is true for sets, but not for bags. ... – PowerPoint PPT presentation

Number of Views:25
Avg rating:3.0/5.0
Slides: 31
Provided by: che7
Category:

less

Transcript and Presenter's Notes

Title: ICS 214A: Database Management Systems Winter 2004


1
ICS 214A Database Management Systems Winter 2004
  • Lecture 15 Query Optimization (cont)
  • Professor Chen Li

2
Some transformation rules work for sets, not bags
  • Example sp1vp2 (R) sp1(R) ? sp2(R) is true for
    sets, but not for bags.
  • For instance Ra,a,b,b,b,c, P1 satisfied by
    a,b P2 satisfied by b,c
  • sp1(R) a,a,b,b,b
  • sp2(R) b,b,b,c
  • sp1vp2 (R) a,a,b,b,b,c
  • sp1(R) ? sp2(R) a,a,b,b,b,b,b,b,c ltgtsp1vp2
    (R) !

3
Rules s combined
  • Let p predicate with only R attributes
  • q predicate with only S attributes
  • m predicate with only R, S attributes
  • sp (R S)
  • sq (R S)
  • Example
  • R(dept, id, salary), S(id, city, tel)
  • p depttoy, q city irvine, m id gt 500

sp (R) S R sq (S)
4
Rules s combined (continued)
  • Some rules can be derived

sp?q (R S) sp sq (R S) sp R
sq (S) sp (R) sq (S)
5
Derive the following by yourself
  • sp?q?m (R S) sm (sp R) (sq S)
  • spvq (R S) (sp R) S?R (sq S)

6
Rules p, s combined
  • Let x subset of R attributes
  • z attributes in predicate p (subset of R
    attributes)
  • pxsp (R)
  • Example
  • R(dept, id, salary)
  • x salary
  • p depttoy and id gt 500 ? z dept, id

sp px (R)
7
Rules p, combined
Let x subset of R attributes y subset
of S attributes z intersection of R,S
attributes pxy (R S)pxypxz (R) pyz
(S)
Example R(dept, id, salary), S(id, city,
tel) x dept, y city zid
8
  • pxy sp (R S)

Example R(dept, id, salary), S(id, city,
tel) x dept, y city, zid p (id gt
500 city irvine)
9
Rules s, ? combined
  • sp(R ? S) sp(R) ? sp(S)
  • sp(R - S) sp(R) - S sp(R) - sp(S)

10
Some basic transformations
  • sp1?p2 (R) ? sp1sp2 (R)
  • sp (R S) ? sp (R) S
  • R S ? S R
  • px sp(R) ? pxsp pxz (R)

11
Conventional wisdom do projects early
  • Example R(A,B,C,D,E) xE P
    (A3) ? (Bcat)
  • px sp (R) versus pE sppABE(R)

12
Query Optimizer
  • Find the best physical plan
  • Comprises the following
  • Plan space (search space)
  • huge number of alternative, semantically
    equivalent plans
  • computationally expensive to examine all
  • Conventional wisdom avoid bad plans
  • need to include plans that have low cost
  • Cost model
  • facilitate comparisons of alternative plans
  • has to be accurate
  • Enumeration algorithm
  • search strategy (optimization algorithm) that
    searches through the plan space
  • has to be efficient (low optimization overhead)

13
Plan Space
  • Left-deep trees right child has to be a base
    table
  • Right-deep trees left child has to be a base
    table
  • Deep trees one of the two children is a base
    table
  • Bushy tree unrestricted

Deep tree
Bushy tree
Left-deep tree
14
Search Algorithms
  • Exhaustive enumerate each possible plan, and
    pick the best
  • Greedy Techniques
  • Randomized/Transformation Techniques
  • System-R approach
  • Dynamic programming with pruning

15
1. Greedy Algorithm (Example)
  • Smallest relation next
  • Suppose Ri lt Rk for i lt k

R1
All plans must begin with R1
R2
R1
R3
R2
R4
R5
R3
R4
R5
All plans beginning with R2-R5 have been pruned!
16
2. Randomized Techniques
  • Employ randomized/transformation techniques for
    query optimization
  • Hill climbing
  • State space -- space of plans, State -- plan
  • Each state has a cost associated with it
  • determined by some cost model
  • A move is a perturbation applied to a state to
    get to another state
  • a move set is the set of moves available to go
    from one state to another
  • any one move is chosen from this move set
    randomly
  • each move set has a probability associated to
    indicate the probability of selecting the move

17
Randomized Algorithm (Example)
  • We have covered many equivalence-preservation
    rules.

R4
R4
R3
R3
R1
R2
R2
R1
R4
R4
R3
R3
R1
R2
R2
R1
18
Search space
19
Randomized Techniques (cont)
  • Two states are neighboring states if one move
    suffices to go from one state to the other
  • A local minimum in the state space is a state
    such that its cost is lower than that of all
    neighboring states
  • A global minimum is a state which has the lowest
    cost among all local minima
  • at most one global minimum value (could have
    multiple states)
  • A move that takes one state to another with a
    lower cost is called a downward move otherwise
    it is an upward move
  • in a local/global minimum, all moves are upward
    moves

20
3. Optimization in System-R
  • A.k.a. Selinger-style query optimization
  • The classic paper on query optimization (Selinger
    et al. 1979)
  • Originally used in the IBM System-R project
  • Now widely used
  • Basic ideas
  • Left-deep trees only
  • Bottom-up generation of plans
  • Use dynamic-programming paradigm
  • Considering interesting orders

21
Bottom-up plan generation
  • Assumption 1 Once we have joined k relations,
    the method of joining this result further with
    another relation is independent of the previous
    join methods (true?)
  • Assumption 2 Any subplan of an optimal plan must
    also be optimal (otherwise we could replace the
    subplan to get a better overall plan). (ture?)
  • Bottom-up generation of optimal plans
  • Compute the optimal plan for joining k relations
  • Suboptimal plans are pruned
  • From these plans, derive the optimal plans for
    joining k1 relations

22
Basic version dynamic programming
  • The algorithm proceeds by considering
    increasingly larger subsets of the set of all
    relations.
  • Plans for a set of cardinality i are constructed
    as extensions of the best plan for a set of
    cardinality i-1
  • Search space can be pruned based on the principal
    of optimality
  • if two plans differ only in a subplan, then the
    plan with the better subplan is also the better
    plan

23
Dynamic Programming (Cont)

1 2 3
4
1 2 1 3 1 4 2 3
2 4 3 4
1 2 3 1 2 4 2 3
4 1 3 4
1 2 3 4
24
Example
  • Consider joining relations R,S,T,U. Each with
    1000 tuples.

S
R
T
U
Cost taken to be size of intermediate results.
Actually should take I/O cost or some other cost
metric.
25
Example (cont)
S
R
T
U
Only left-deep trees considered.
26
Dynamic Programming (cont)
  • Pass 1 find the best single-relation plans
  • Pass 2 find the best two-relation plans by
    considering each single-relation plan (from pass
    1) as the outer relation and every other relation
    as the innner relation
  • Pass k Find the best k-relation plans by
    considering each (k-1)-relation plan (from Pass
    k-1) as the outer relation and every other
    relation as the inner relation
  • Until k of relations in the query

27
Improvement interesting order
  • Is it always true that locally optimal is
    globally optimal? NO!
  • Example R(A,B) JOIN S(A,C) JOIN T(A,D)
  • Best plan for R JOIN S hash join (beats
    sort-merge join)
  • Best overall plan sort-merge join R and S, and
    then sort-merge join with T
  • Subplan of the optimal plan is not optimal
  • Why?
  • The result of the sort-merge join of R and S is
    sorted on A
  • This is an interesting order that can be
    exploited by later processing (e.g., join,
    duplicate elimination, GROUP BY, ORDER BY, etc.)!

28
Dealing with interesting orders
  • When picking the optimal plan
  • Comparing their costs is not enough
  • Plans are not totally ordered by cost anymore
  • Comparing interesting orders is also needed
  • Plans are now partially ordered
  • Plan X is better than plan Y if
  • Cost of X is lower than Y
  • Interesting orders produced by X subsume those
    produced by Y
  • Need to keep a set of optimal plans for joining
    every combination of k relations
  • Typically one for each interesting order

29
Summary
  • Query optimization is hard.
  • Instead of finding the best, the objective is to
    avoid the bad plans
  • Many different optimization strategies have been
    proposed
  • greedy heuristics are fast but may generate plans
    that are far from optimal
  • dynamic programming is effective at the expense
    of high optimization overhead

30
Summary of this course
SQL statements
Access plan
Query Processor
optimizer
Read write records, scan relations
Record-oriented file system
Get page containing tuples
Buffer manager
Basic file system
Read/write file pages
Hardware
Other related DB courses ICS214B, ICS215
Write a Comment
User Comments (0)
About PowerShow.com