Memory Cognizant Query Optimization - PowerPoint PPT Presentation

1 / 29
About This Presentation
Title:

Memory Cognizant Query Optimization

Description:

... give optimal execution time as the plan generated may itself be sub-optimal for ... For a given query no single plan may be optimal in entire memory range ... – PowerPoint PPT presentation

Number of Views:50
Avg rating:3.0/5.0
Slides: 30
Provided by: ARU52
Category:

less

Transcript and Presenter's Notes

Title: Memory Cognizant Query Optimization


1
Memory Cognizant Query Optimization
  • Arvind Hulgeri
  • S. Seshadri
  • S. Sudarshan
  • Department of Computer Science and Engineering
  • I.I.T. Bombay

2
Outline
  • Why Memory Cognizant Optimizer?
  • Contributions
  • 2-Phase versus 1-Phase Optimization
  • Related Work
  • Details of 1-Phase Approach
  • Experimental Evaluation
  • Conclusion and Future Work

3
Why Memory Cognizant Optimizer?
  • Memory intensive operators
  • Typical optimizers assume all the memory to be
    available to each operator in the query tree
    Wrong assumption

4
Why Memory Cognizant Optimizer? (contd)
  • Memory will get divided amongst all the operators
    running simultaneously in a pipeline
  • Cost of a plan depends upon this memory division

5
Contributions
  • We develop efficient techniques to divide the
    available memory optimally among operators in a
    pipeline
  • We show how to make a cost-based decision of
    breaking (i.e. converting) a pipelined edge into
    a blocking edge
  • We develop practical memory cognizant query
    optimization algorithm

6
2-Phase versus 1-Phase Approach
  • 2-Phase Approach (2PO)
  • Phase 1 Optimize the query using conventional
    optimizer
  • Phase 2 Distribute available memory optimally
    amongst the operators in a pipeline
  • Problem
  • May not give optimal execution time as the plan
    generated may itself be sub-optimal for the given
    memory

7
2-Phase versus 1-Phase Approach (contd)
  • 1-Phase Approach (1PO)
  • We propose to modify the traditional optimizer to
    make it memory cognizant
  • Plan chosen takes into account the division of
    memory amongst operators

8
Related Work
  • Two Phase Approach
  • Phase 1 Optimize the query to get an optimal
    query plan
  • Phase 2 Schedule the query plan generated by
    phase 1
  • Static Scheduling
  • Bouganim,et.al., CIKM'98, Nag, DeWitt,
    CIKM'98
  • Dynamic Scheduling Bouganim,et.al., CIKM'98

9
1-Phase Approach
  • We propose to modify the traditional optimizer to
    make it memory cognizant
  • Plan chosen takes into account the division of
    memory amongst operators
  • We show how to optimize the query given the cost
    versus memory allocated function for each operator

10
Overview of 1PO
  • Optimal Division of Available Memory
  • Approximating Cost Functions
  • Extended Cost Function and Plan
  • Extensions to Query Optimizer
  • Breaking Pipelined Edges

11
Optimal Division of Available Memory
  • All the operators in a pipeline run
    simultaneously in given space
  • The cost of running such a pipeline depends upon
    how much memory each operator in the pipeline gets

12
Optimal Division of Available Memory (contd)
  • Need to optimally divide the available memory
    amongst all the operators in the pipeline
  • Function OptMerge divides memory between two
    operators/plans in a pipeline and generates
    combined cost function

13
function OptMergefor general form of cost
function
  • for m 1 to MaxAvailMem do
  • mergeCost(m) cost_\infty
  • for mx 0 to m do
  • my m - mx
  • costxy costx(mx) costy(my)
  • if costxy lt mergeCost(m)
  • mergeCost(m) costxy
  • return mergeCost

14
Approximating Cost Functions
  • General Form of Cost Function
  • Time complexity of various operations on cost
    function depends on available memory. e.g.
  • OptMerge O(m2)
  • MinMerge O(m)
  • The routine MinMerge compares two input cost
    functions for the entire memory range and at each
    memory point picks up the lower cost value

15
Approximating Cost Functions (contd)
  • Piecewise Linear Cost Function
  • Time complexity is independent of available
    memory,
  • And is a function of number of segments in the
    cost functions. e.g.
  • OptMerge O((sum of segemts) 2) ltlt change
  • MinMerge O(sum of segemts)

16
function OptMerge for Piecewise Linear cost
Functions
  • mergeCost infinity
  • for each change-over point (m,c) in costx do
  • costy' costy shifted by (m,c)
  • mergeCost MinMerge(mergeCost, costy')
  • for each change-over point (m,c) in costy do
  • costx costx shifted by (m,c)
  • mergeCost MinMerge(mergeCost, costx)
  • return mergeCost

17
Extended Cost Function
  • Define cost as a function of memory allocated
    instead of a single cost value for each operator
    and plan

18
Extended Plan
  • For a given query no single plan may be optimal
    in entire memory range
  • Define optimal Plan as list of pairs
  • m1, m2, optPlan
  • where m1, m2 is a range of memory and optPlan
    is the optimal plan in that range

19
Extensions to Query Optimizer
  • Comparing alternative operators or plans
  • One plan is consistently better than the other in
    the entire memory range retain the one plan with
    less cost
  • One plan is better at some memory range and the
    other one is better at some other memory range
    maintain both of them indicating which one is
    better in which range

20
Breaking Pipelined Edges
  • A plan P feeding an operator O through a
    pipeline edge E will have a trade-off
  • Option I E is pipelined
  • P gets less memory, costs more
  • no IO incurred at E
  • Option II E is broken
  • (converted into a blocking edge)
  • P gets full memory, costs less
  • IO incurred at E

21
Breaking Pipelined Edges (contd)
  • Let the cost of P be PPC(i)
  • Option I
  • Cost of P PPC(i)
  • Option II
  • Cost of P BBC(i) PPC(M 1) IO
  • Where IO cost of IO at E when blocked
  • M total available memory

22
Breaking Pipelined Edges (contd)
  • The optimal cost of plan P with blocking decision
    incorporated is given by MinMerge(PPC, BPC)
  • The routine MinMerge compares two input cost
    functions for the entire memory range and at each
    memory point picks up the lower cost value

23
Breaking Pipelined Edges (contd)
24
Experimental Evaluation
  • Comparison 1PO against 2PO
  • Metric Estimated cost of the optimal plan
    produced by the optimizer
  • Tested the algorithms with around 20,000 randomly
    generated queries on a TPCD-1 based star schema

25
Experimental Evaluation (contd)
  • Each generated query is of the form
  • select sum(quantity)
  • from orders, supplier, part, customer, time
  • where join-list and select-list
  • group by groupby-list
  • Memory available is chosen randomly between 10
    blocks to 10,000 blocks

26
Estimated Cost Reduction
27
Optimization Times
28
Conclusion
  • For the class of queries we considered, 1PO gives
    benefits, but generally 2PO performs about as
    well as 1PO
  • The preliminary results indicate that using 1PO
    for query optimization may not be beneficial
    good news!!!

29
Future Work
  • Parametric Query Optimization
  • Memory Cognizant Query Optimization for
    ORDB(Object Oriented Databases)
Write a Comment
User Comments (0)
About PowerShow.com