Using State Modules for Adaptive Query Processing

Provided by: webC
Learn more at: http://web.cs.wpi.edu

1
Using State Modules for Adaptive Query Processing
  • Vijayshankar Raman
  • IBM Almaden Research Center
  • Amol Deshpande
  • Joseph M. Hellerstein
  • University of California,
    Berkeley

2
  • All the material is taken directly or adapted
    from the paper
  • Using State Modules for Adaptive Query
    Processing
  • by Vijayshankar Raman, Amol Deshpande, Joseph M.
    Hellerstein

3
Contents
  • Background
  • Overview
  • Framework
  • Variations in Adaptations
  • Illustrated Examples and Experiments
  • Conclusion

4
Background
  • Uncertainties in query execution
  • Cardinality estimates are highly imprecise
  • Demands on memory, system load and network
    bandwidth are typically unknown at runtime
  • Data distributions and rates often cannot be
    known in advance
  • User preferences in interactive systems change
    over time
  • Adaptive execution is a necessity in stream systems

5
Background
  • The Federated Facts and Figures (FFF) query system
    combines data from diverse, distributed data
    sources
  • Volatility of distributed data sources
  • Volatility of user interests during online query
    processing

6
What do you mean by adaptability?
  • No static execution plan: the plan changes
    dynamically as the environment changes, while
    still guaranteeing that the result is correct
  • Adaptability requires flexibility, such as
  • Choice of access methods (AMs) and join algorithms
  • Ordering of operators
  • Choice of query spanning tree

7
How to achieve flexibility?
  • Proposed Solutions
  • Refine the granularity of query models
  • Break down large operators, exposing their
    internals to the control of the optimizer
  • Separate the state data structures from the join
    operator and encapsulate them
  • The optimizer has more decisions to make
  • Consequence
  • The optimizer gains flexibility and freedom at
    the expense of assuming more responsibilities

8
Overview
  • Adaptive execution of select-project-join (SPJ)
    queries
  • Routing constraints are needed for on-the-fly
    adaptation
  • Focus on the adaptive processing of joins
  • Introduce the framework of dynamic routing
  • Keep adding flexibility in execution by
    gradually revising and relaxing the routing
    constraints
  • Support other join algorithms
  • Support multiple access methods
  • Support cyclic queries
  • Non-symmetric treatment of input relations

9
Join Operator
  • Logical construct, a black box
  • Typically involves multiple physical operations
  • Q1: Which physical operations are involved?

10
Different Levels of Adaptation
  • Join of three tables
  • Q2: What is the advantage of (b) compared to (a)?
  • Q3: What is the advantage of (c) compared to (b)?

11
Comparison Discussion
  • Both (a) and (b) use only the index access
    method on T and pre-chosen implementations
    for the R ⋈ S and S ⋈ T joins
  • (c) allows all access methods (tuples from AMs are
    routed to SteMs, rather than to joins) and allows a
    variety of routing decisions that permit
    different join algorithms and join orders
  • Q4: Decomposing the join operator enables
    adaptation. Why?

12
  • Does the routing framework work?
  • The paper's appendix shows that every SPJ query can
    be executed by carefully routing tuples between
    AMs, SteMs and selections
  • Caution
  • Arbitrary routing can result in:
  • Duplicate results
  • Missing results
  • Infinite loops
  • Solution
  • Flexibility comes at the price of routing
    constraints
  • Proposed routing constraints

13
Framework Components (overview)
  • Four kinds of modules
  • Selection modules (SMs): query predicates
  • Access modules (AMs): access methods over data
    sources
  • State modules (SteMs): encapsulate the data
    structures of traditional join algorithms
  • Eddy modules: route tuples between the other
    modules
  • Each module runs asynchronously

14
Functionality of Main Modules

15
Query Planning
  • Check that the query is valid
  • Create an AM for each access method
  • Create an SM for each predicate
  • Create a SteM for each base table
  • Create any seed tuples needed for scans
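
The planning steps above can be sketched in a few lines of Python. This is only an illustration: the `Query` class, the `plan` function, the module naming scheme and the validity rule (every table needs at least one AM) are assumptions made for the example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Query:
    # Hypothetical query description used only for this sketch.
    tables: dict      # table name -> list of AM kinds, e.g. ["scan", "index"]
    predicates: list  # selection predicates as plain strings


def plan(query):
    """Instantiate one module name per item, following the planning steps."""
    # Validity check (assumed rule: every table needs at least one AM).
    for table, ams in query.tables.items():
        if not ams:
            raise ValueError(f"table {table} has no access method")
    return {
        # One AM per access method.
        "AM": [f"AM({t}.{am})" for t, ams in query.tables.items() for am in ams],
        # One SM per predicate.
        "SM": [f"SM({p})" for p in query.predicates],
        # One SteM per base table.
        "SteM": [f"SteM({t})" for t in query.tables],
        # One seed tuple per scan AM to start that scan.
        "seeds": [f"seed({t})" for t, ams in query.tables.items() if "scan" in ams],
    }
```

For a query over R (scan only) and S (scan and index), `plan` returns three AMs, two SteMs and two seed tuples; the Eddy would then route tuples among these modules.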

16
Example of N-way Symmetric Hash Join
  • Demonstrate how to implement n-way symmetric join
    with SteMs
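
A minimal sketch of an n-way symmetric hash join built from SteMs is shown below. It makes several simplifying assumptions that are not in the slides: rows are Python dicts, join columns share the same name across tables, the query is connected and acyclic, and the routing order is fixed rather than chosen adaptively by an Eddy. The names `SteM`, `n_way_shj`, `arrivals` and `join_columns` are hypothetical.

```python
from collections import defaultdict


class SteM:
    """State module for one table, hash-indexed on each of its join columns."""

    def __init__(self, columns):
        self.index = {c: defaultdict(list) for c in columns}

    def build(self, row):
        for col, buckets in self.index.items():
            buckets[row[col]].append(row)

    def probe(self, partial):
        """Return `partial` concatenated with every stored row that matches it."""
        shared = [c for c in self.index if c in partial]
        if not shared:
            return []
        first = shared[0]
        candidates = self.index[first].get(partial[first], [])
        return [{**partial, **row} for row in candidates
                if all(row[c] == partial[c] for c in shared)]


def n_way_shj(arrivals, join_columns):
    """arrivals: (table, row) pairs in arrival order.
    join_columns: table name -> list of its join columns."""
    stems = {t: SteM(cols) for t, cols in join_columns.items()}
    results = []
    for table, row in arrivals:
        stems[table].build(row)              # build into the tuple's own SteM first
        partials = [dict(row)]
        remaining = set(stems) - {table}
        while remaining and partials:
            known = set(partials[0])
            # Fixed routing choice: the first remaining SteM that shares a column.
            nxt = next((t for t in sorted(remaining)
                        if known & set(join_columns[t])), None)
            if nxt is None:
                break                        # disconnected query: not handled here
            partials = [m for p in partials for m in stems[nxt].probe(p)]
            remaining.discard(nxt)
        if not remaining:
            results.extend(partials)         # tuple now spans every table
    return results
```

Because each tuple is built before it probes, and each SteM holds only tuples that arrived earlier, every result is produced exactly once: when its last constituent tuple arrives.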

Q5: Comparing (ii) with (i), which one would you
choose?

17
Executing Arbitrary SPJ Queries with SteMs
  • 1. Acyclic SPJ queries with a single scan AM on
    each table
  • Example: n-ary SHJ
  • Required Rules
  • SteMs are implemented with hash indices.
  • The Eddy obeys the routing constraints:
  • BuildFirst: A singleton tuple from table T must
    first be routed to build into SteM_T.
  • SteM BounceBack: All build tuples and no probe
    tuples are bounced back.
  • Atomicity: Build and probe are coupled.
  • BoundedRepetition: No tuple is routed to the same
    module more than once.
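
Two of the constraints above, BuildFirst and BoundedRepetition, are simple enough to check mechanically on the path one singleton tuple takes through the modules. The sketch below is an illustration only; the function `check_trace` and the trace format (a list of `(module, operation)` pairs) are hypothetical and do not come from the paper.

```python
def check_trace(trace, home_stem):
    """Return the names of the constraints violated by one tuple's route.

    trace: list of (module_name, operation) pairs in visiting order.
    home_stem: name of the SteM on the tuple's own table.
    """
    violations = []
    # BuildFirst: the first step must be a build into the tuple's own SteM.
    if not trace or trace[0] != (home_stem, "build"):
        violations.append("BuildFirst")
    # BoundedRepetition: no module may be visited more than once.
    modules = [module for module, _ in trace]
    if len(modules) != len(set(modules)):
        violations.append("BoundedRepetition")
    return violations
```

A route such as build into SteM_R followed by a probe into SteM_S passes both checks, while probing SteM_S twice is reported as a BoundedRepetition violation.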

18
Relax Constraints to Allow Other Join Algorithms
  • SteMs need not be implemented with hash indices.
  • Build and probe operations are decoupled.
  • Potential problems?

19
  • 2. Competitive AMs
  • Example: Queries with more than one AM.
  • Goal: Run multiple AMs per source and let the Eddy
    dynamically choose one AM or switch between AMs
  • Duplication problem?
  • Required Rules
  • SteM BounceBack: A SteM_S must bounce back a build
    tuple s unless it is a duplicate of another tuple
    s' that is already in SteM_S.
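
The build-side BounceBack rule above amounts to duplicate elimination inside the SteM. A minimal sketch, assuming rows are hashable Python tuples and using a hypothetical `DedupSteM` class:

```python
class DedupSteM:
    """SteM that absorbs duplicate build tuples from competing AMs."""

    def __init__(self):
        self.rows = set()

    def build(self, row):
        """Return the row if it is bounced back, or None if it is a duplicate."""
        if row in self.rows:
            return None       # duplicate: absorbed, so it leaves the dataflow
        self.rows.add(row)
        return row            # new tuple: bounced back to the Eddy
```

If a scan AM and a second AM on the same source both deliver the same row, only the first copy is bounced back and continues through the dataflow, so no duplicate results are produced from it.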

20
  • 3. Index AMs
  • When a data source has an index AM.
  • Potential problem?
  • Required Rules
  • SteM BounceBack
  • A SteM_S must bounce back a build tuple s unless
    it is a duplicate of another tuple s' that is
    already in SteM_S.
  • A SteM_S must bounce back a probe tuple r unless S
    has a scan AM, or SteM_S already contains all
    matches for r.
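
The probe-side rule above can be read as a small decision made on every probe. The sketch below is an assumption-laden illustration: the `IndexedSteM` class, its per-key storage, and the `mark_complete` method (a stand-in for however the SteM learns that an index lookup has returned all of its matches) are hypothetical.

```python
class IndexedSteM:
    """SteM on a table S that may be reachable only through an index AM."""

    def __init__(self, has_scan_am):
        self.has_scan_am = has_scan_am
        self.matches = {}           # join key -> rows built so far
        self.complete_keys = set()  # keys whose matches are all present

    def build(self, key, row):
        self.matches.setdefault(key, []).append(row)

    def mark_complete(self, key):
        # Hypothetical hook: called once every match for `key` has been built.
        self.complete_keys.add(key)

    def probe(self, key):
        """Return (matches found so far, whether the probe tuple is bounced back)."""
        found = list(self.matches.get(key, []))
        bounced = not (self.has_scan_am or key in self.complete_keys)
        return found, bounced
```

A bounced-back probe tuple signals to the Eddy that it still has to be sent to an index AM on S to fetch the missing matches.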

21
  • More Adaptation
  • 4. Cyclic Queries
  • A static spanning tree choice hurts in two ways
  • The spanning tree choice is typically made based
    on selectivities
  • A static spanning tree choice can also constrain
    the generation of partial query results
  • Required Rules
  • ProbeCompletion Constraint: A tuple t that has
    been bounced back after probing into SteM_S must
    not probe into any other SteM afterwards. The
    routing policy must, however, keep t in the
    dataflow, routing it to other modules, until it
    has been probed into an AM on S.
  • Prior Probers and Probe Completion Table
  • 5. Relaxing the BuildFirst Constraint
  • What if one of the input tables is much larger
    than the others?
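
The ProbeCompletion constraint above restricts where the routing policy may send a tuple once a SteM has bounced it back. A minimal sketch of that restriction follows; the functions `allowed_destinations` and `probe_completed`, and the `modules` dictionary format, are hypothetical names chosen for this example.

```python
def allowed_destinations(modules, bounced_from=None):
    """List the modules a tuple may be routed to next.

    modules: name -> (kind, table), with kind in {"SteM", "AM", "SM"}.
    bounced_from: table S whose SteM bounced the tuple back after a probe,
                  or None if no probe is pending completion.
    """
    if bounced_from is None:
        return sorted(modules)
    # Pending completion: no further SteM probes are allowed.
    return sorted(name for name, (kind, _) in modules.items() if kind != "SteM")


def probe_completed(destination, modules, bounced_from):
    """True once the tuple has been probed into an AM on the pending table."""
    kind, table = modules[destination]
    return kind == "AM" and table == bounced_from
```

Until `probe_completed` holds for some routing step, the tuple stays in the dataflow and may visit only AMs and selection modules.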

22
Summary of Constraints

23
Conclusion
  • The salient points of the paper's experimental
    study are as follows
  • Even a simple join algorithm like the index join
    encapsulates multiple physical operations, and
    this causes a head-of-line blocking problem. This
    problem can be avoided by breaking the join
    module into SteMs.
  • SteMs allow the Eddy to efficiently learn which of
    several competitive access methods to use, while
    doing almost no redundant work.
  • SteMs allow the Eddy to dynamically choose the
    join spanning tree for cyclic queries.
  • SteMs allow the Eddy to dynamically switch
    between an index join algorithm and a symmetric
    hash join algorithm during query execution.
  • With SteMs, the Eddy can adaptively choose the
    way it reorders tuples in interactive
    environments.
  • Thank you!