Title: Query Optimization for Object-Relational Database Systems
1Query Optimization for Object-Relational Database
Systems
- Navin Kabra
- Advisor Professor David J. DeWitt
2Query Optimization is
3Problem Areas
- Difficult to implement a new optimizer
- Difficult to extend/modify existing optimizer
- Difficult to debug optimizer bugs
- Optimizer doesnt always work
4Difficult times ahead
- Complex Queries
- Decision support, OLAP, Mining, Terabytes of data
- Complex Data
- User-defined types (UDTs), User-defined
functions/methods (UDFs)
5Road Map
- Introduction
- The Opt Architecture
- Extensibility features of Opt
- Debugging Support
- Conclusions
6OPT Basic Idea
Code Independent of Query Algebra
Well-Defined Interface
Code Dependent on Query Algebra
7OPT Overview
Code provided with OPT
Search Strategy
Abstract Classes
RUNTIME BINDING
Code written by Optimizer Implementor
Derived Classes
8Abstract Classes
- Operator class (logical algebra)
- Algorithm class (physical algebra)
- Generator classes (search space)
9An Example Query
Select from Emp, Dept where Emp.name Lee
AND Emp.dno Dept.dno
10An Operator Tree
11An Access Plan
12Operator Tree Access Plan
13Tree Descriptors
14Plan Descriptors
15Operators and Algorithms
16Expand - Example
17Expand - continued
18Expand - continued
19Going from a Tree to a Plan
20Generators
Search Strategy
Plan Generator
Operators
Algorithm
Tree Generator
HashJoin Gen
Select
Join
Index Gen
SelectPushdown
Index Scan
LeftDeep Gen
Hash Join
21Example Merge Join
22Example Merge Join - continued
23Enforcing Sorts
24Enforcing Sorts
25System-R style search strategy
26Changing the Search Strategy
Search Strategy
2PO
Bottom Up
SA
II
Transformative
Plan Generator
Operators
Algorithm
Tree Generator
HashJoin Gen
Select
Join
Index Gen
SelectPushdown
Index Scan
LeftDeep Gen
Hash Join
27A Transformation
28Another Transformation
29Transformation-based Search Strategies
- Volcano-style
- Randomized
- Iterative Improvement
- Simulated Annealing
- Two-phase Optimization
30Road Map
- Introduction
- The Opt Architecture
- Extensibility features of Opt
- Debugging Support
- Conclusions
31Optimizer Components
Search Strategy
Search Strategy
2PO
Bottom Up
SA
II
Transformative
Plan Generator
Operators
Algorithm
Tree Generator
HashJoin Gen
Select
Join
Index Gen
SelectPushdown
Index Scan
LeftDeep Gen
Hash Join
Search Space
Query Algebra
32Extending an Optimizer (Query Algebra)
- Add Operator and Algorithm classes
- Add Generator classes
- No changes to search strategy required
33Adding an Algorithm
Search Strategy
Search Strategy
2PO
Bottom Up
SA
II
Transformative
Plan Generator
Operators
Algorithm
Tree Generator
Select
Join
Index Scan
HashJoin Gen
SelectPushdown
Index Gen
LeftDeep Gen
Hash Join
Search Space
Query Algebra
34Adding an Operator
Search Strategy
Search Strategy
2PO
Bottom Up
SA
II
Transformative
Plan Generator
Operators
Algorithm
Tree Generator
Select
Join
HashJoin Gen
Index Scan
Index Gen
SelectPushdown
LeftDeep Gen
Hash Join
Search Space
Query Algebra
35Extending an Optimizer (Search Strategy)
- Add new SearchStrategy class
- Possibly add new Generator classes
- No changes to Query Algebra component
36Adding a Tree Generator
Search Strategy
Search Strategy
2PO
Bottom Up
SA
II
Transformative
Plan Generator
Operators
Algorithm
Tree Generator
SelectPushdown
HashJoin Gen
Select
Join
Index Gen
LeftDeep Gen
Index Scan
Hash Join
Search Space
Query Algebra
37Extending an Optimizer (Search Space)
- Add or remove appropriate Generator classes
- No changes to Search Strategy
- No changes to Query Algebra
38Adding a Search Strategy
Search Strategy
2PO
Search Strategy
SA
Bottom Up
II
Transformative
Plan Generator
Operators
Algorithm
Tree Generator
HashJoin Gen
Select
Join
Index Gen
SelectPushdown
Index Scan
LeftDeep Gen
Hash Join
Search Space
Query Algebra
39Road Map
- Introduction
- The Opt Architecture
- Extensibility features of Opt
- Debugging Support
- Conclusions
40Debugging Support
- Lots of time spent debugging the optimizer
- Non-optimal plans produced due to
- Bugs in enumeration logic
- Bugs in cost model
41Debugging Support - continued
- Optimizer Implementor indicates expected plan
- Optimizer indicates why expected plan was not
produced - Sets automatic breakpoints (inside a debugger) at
appropriate location - Only for Bottom Up Search Strategy
- Partially useful for other search strategies
42Bug in Cost Model
Expected Plan
Generated Plan
43Bug in Plan Enumeration
Expected Plan
Generated Plan
Generated Tree
44Bug in Tree Enumeration
Expected Plan
Generated Tree
45Partial Plan Specification
- Optimizer Implementor does not want to specify
full plan - Allow specification of a plan fragment
- A plan tree with incomplete information (and
Dont Cares)
46Partial Plan - Join Order
47Partial Plan - Index Select
48Partial Plan - General
49Partial Plan - Complex
50Using Partial Plans
- Allow Breakpoints using Partial Plans
- Force Plans using Partial Plans (discussed next)
51Force Plans Using Partial Plans
- Over-ride the Optimizer
- Useful because
- During cost model validation
- Debugging and tuning the system
- End user wants to force a plan
- Optimizer tries to produce a plan that matches
the specification
52Force Plans - DML
Select from Emp, Dept where Emp.name
Lee AND Dept.dno Emp.dno hint algorithm
hashjoin leftinput algorithm indexscan
53Conclusions
- The Opt architecture simplifies
- implementation
- maintenance
- experimentation
- Techniques for debugging an optimizer
- very useful
- more work needed