Query Optimization - PowerPoint PPT Presentation

About This Presentation
Title:

Query Optimization

Description:

Query Optimization Allison Griffin Importance of Optimization Time is money Queries are faster Helps everyone who uses the server Solution to speed lies in the ... – PowerPoint PPT presentation

Number of Views:32
Avg rating:3.0/5.0
Slides: 17
Provided by: 03348
Learn more at: https://s2.smu.edu
Category:

less

Transcript and Presenter's Notes

Title: Query Optimization


1
Query Optimization
  • Allison Griffin

2
Importance of Optimization
  • Time is money
  • Queries are faster
  • Helps everyone who uses the server
  • Solution to speed lies in the algorithm
  • Different performance improvements with different
    database engines and schemas

3
Brief History
  • Before 1970s Dark days, manual optimization
  • Late 70s to mid 80s
  • Birth of relational data model and declarative
    SQL
  • Optimization is job of system
  • System R-beginning work on join order
    optimization
  • Dynamic Programming Heuristic Optimizers
  • Mid 80s to early 90s
  • Extensible query optimization (Exodus)
  • Mid 90s to late 90s
  • Materialized Views

4
Volcano Extensible Query Optimizer Generator
  • General purpose cost based query optimizer, based
    on equivalence rules in algebra
  • Equivalences join associativity, select push
    down, aggregate push down
  • Extensible new operations and equivalences can
    be easily added
  • Developed by Graefe and McKenna 1993

5
Materialized Views
  • Can materialize (pre-compute and store) views to
    speed up queries
  • Incremental maintenance
  • when database is updated, propagate updates to
    materialized view without complete re-computation
  • Deciding when to use materialized views
  • even if query does not refer to materialized
    view, optimizer can figure out it can be used

6
Deciding What to Materialize
  • Maintenance cost and query cost
  • Workload depends on what is materialized
  • queries and update transactions
  • weights for each component of workload
  • Goal find set of views that gives minimum cost
    if materialized, subject to space constraints

7
What we already know
  • Query optimizer analyzes set of query execution
    plans and gives optimal (least cost)
  • Heavily dependent on optimizers estimate for
    number of rows that will result at each step of
    QEP
  • Estimates rely on statistics typically stored in
    histograms

8
Recent Approaches to Improve Statistics
  • Paper Distinct-Value Synopses for Multiset
    Operations by Kevin Beyer, Rainer Gemulla, Peter
    J. Haas, Berthold Reinwalk, and Yannis Sismanis,
    2007
  • IBMs LEO (Learning Empirical Results in Query
    Optimization), 2001

9
Summary of Paper Results
  • Addresses the problem of efficient estimate of
    number of distinct values of an attribute
  • Builds on leveraging of randomized algorithms
  • Claim to have unbiased estimator for distinct
    values with lower mean squared error
  • Past attempts tend to by higher than the actual
    number so they have come up with way to cut that
    number down to be more reasonable

10
Distinct-Value Estimation
  • Propose summary structure (synopsis) for a
    relation
  • Synopsis can be used to estimate number of DVs in
    the partition
  • Synopses can be combined to create synopses for
    compound partitions created from base partitions
    using multiset union, intersection or difference
    operations
  • Updates can be performed on compound partitions
    by using synopses from base relations

11
LEO - Learning Emperical Results in Query
Optimization
  • Autonomic feedback loops that create a
    self-tuning database query optimizer
  • Self-validates and adjusts to improve query
    optimization and execution without requiring user
    interaction to repair incorrect statistics or
    cardinality estimates
  • Reduces the total cost of owning database
    management systems by simplifying database
    administration

12
How it works
  • Monitors queries as they execute
  • Compares the optimizers estimates with actuals
    at each step in a QEP
  • Then computes adjustments to its estimates that
    may be used during future optimizations of
    similar queries
  • Moreover, estimation errors can also trigger
    re-optimization of a query in mid-execution.

13
(No Transcript)
14
Challenges in Research of LEO
  • (1) ensuring stability and convergence of the
    autonomic system
  • (2) guaranteeing consistency of the overall
    optimizer's model upon refinements

15
Results
  • Reduction of query execution time by orders of
    magnitude at negligible additional run-time cost
  • Reduced administration time
  • Fewer problem queries
  • Overall improved query performance with increased
    robustness and predictability of query response
    times

16
Bibliography
  • LEO-Learning Empirical Results in Query
    Optimization. IBM. lthttp//domino.watson.ibm.com/
    comm/research.nsf/pages/r.datamgmt.innovation.html
    gt.
  • Optimizing for Query Speed. SQL.
    lthttp//www.devshed.com/c/a/MySQL/Optimizing-for-Q
    uery-Speed/1/
  • Optimizing Database Queries. IBM.
    lthttp//www.stevengould.org/portfolio/developerWor
    ks/efficientPHP/wa-effphp/wa-effphp-4-1.htmlgt.
  • Optimize Queries Theory in Practice.
    lthttp//www.serverwatch.com/tutorials/article.php/
    2175621/How-to-Optimize-Queries-Theory-an-Practice
    .htmgt.
  • Beyer, Kevin, Gemulla, Rainer, Haas, Peter J.,
    Reinwald, Berthold, Sismani, Yannis.
    Distinct-Value Synopses for Multiset
    Operations. Communications of the ACM. Vol. 52.
    October 2009.
  • Chaudhuri, Surajit. Technical Perspective
    Relational Query Optimization-Data Management
    Meets Statistical Estimation. Communications of
    the ACM. Vol. 52. October 2009.
Write a Comment
User Comments (0)
About PowerShow.com