BI515: Query Processing in Adaptive Server IQ - PowerPoint PPT Presentation

1 / 33

About This Presentation

Title:

BI515: Query Processing in Adaptive Server IQ

Description:

Designed from the ground up to maximize the performance of many users running ad ... All possible bushy join plans that avoid cartesian product joins are considered ... – PowerPoint PPT presentation

Number of Views:29

Avg rating:3.0/5.0

Slides: 34

Provided by: sybas

Category:

more less

Transcript and Presenter's Notes

Title: BI515: Query Processing in Adaptive Server IQ

1
BI515Query Processing in Adaptive Server IQ
Steven Kirk Architect ASIQ Query Engine
Group stevek_at_sybase.com
2
IQ Query Processing Topics

IQ Query Engine
Differences from traditional database engines
Differences from traditional query engines
Query engine architecture
Query optimizer
IQ Query Plans
How to get query plans, and how to read them
IQ Query Performance

3
IQ Query Engine

Different Goals than Traditional Database Engines
Designed from the ground up to maximize the
performance of many users running ad-hoc queries
Prioritized IQ Design Goals
Query Performance - very high priority
Bulk Update Speed - high priority
Small Update Speed - very low priority

4
IQ Query Engine

Different Goals Yeild a Different Design than
Traditional Databases
Column-oriented storage instead of row-oriented
IQ reads only the data for columns referenced
Enables efficient data compression
Most data pages compressed when written
In practice, fully indexed databases are usually
60 to 120 of the raw ASCII data size
Single row inserts and deletes are painfully slow

5
IQ Query Engine

Different Design than Traditional Databases
(cont.)
Column-oriented storage instead of row-oriented
Unique row identifiers for the life of a row
Bit-map representation for sets of row
identifiers
Multiple patented bit-mapped index types
Assumes majority of columns will have multiple
indexes defined

6
IQ Query Engine

Differences from Traditional Query Engines
Effects of most predicates local to a table can
be combined with each other before any tuples are
actually projected from the table.
Numerous query operators are available directly
within the indexes, including
Grouping and distincting
Inter-column comparisons
Hash containment

7
IQ Query Engine

Similarities with Traditional Query Engines
Once IQ has completed all the work it can in the
indexes for each table, tuples are projected and
then joined or grouped in the same ways as in
other query engines
Whenever multiple access paths (or operator
algorithms) are available, then a cost-based
decision must be made, often based on predicate
selectivity estimates

8
IQ Query Engine Architecture
IQ Server Front End Shared with
ASAnywhere Handles TDS and ODBC Parses Incoming
Statements Text Substitution Transformations Cross
-DB Decomposition (CIS) Security Checking Java
Support Stored Procedures
IQ Optimizer Predicate Inference Predicate
Selectivity Estimation Join Optimization Grouping
Algorithm Selection Subquery Optimization Index
Access Selection
IQ Run-Time Engine Prefetch Manager Predicate
Execution Tuple Projection Join
Execution Grouping Execution Sorting Subquery
Execution
9
IQ Query Optimizer

Predicate Inference
Inference of additional predicates via transitive
closure
Inference of additional local predicates from
complex multi-table predicates
Elimination of any redundant predicates after
join order has been decided

10
IQ Query Optimizer

Predicate Selectivity Estimations
Because so many columns are indexed, more
predicates have accurate selectivity estimates
If no complex indexes are available, or if the
predicate expression is too complicated, then we
use guesses based on the type of predicate
Users can supply their own selectivity estimates
thereby overriding the optimizers estimates

11
IQ Query Optimizer

Join Optimization
Applicable join indexes are considered
All possible bushy join plans that avoid
cartesian product joins are considered
Evaluated join sub-plans are stored in a hash
table
Alpha-pruning is used to limit the search space

12
IQ Query Optimizer

Join Optimization (cont.)
Join Algorithm Selection from among
Classic nested-loop
Sort-merge
Classic hash
Nested-loop push-down
Hash push-down semi-join
Sort-merge push-down semi-join (new
in 12.4.2)

13
IQ Query Optimizer

Grouping Algorithm Selection
Three different algorithms are compared for each
GROUP BY or DISTINCT operator
Sort-based
Classic hash-based
Index-based

14
IQ Query Optimizer

Subquery Optimization
Some EXISTS subqueries are flattened into
distinct joins
Results of uncorrelated subqueries are used
directly in the indexes of tables in the parent
query blocks
Correlated subqueries
Are migrated to minimize subquery re-executions
Have a 1-value result cache, and outer references
are ordered to further minimize re-executions
Are optimized to maximize invariant operators

15
IQ Query Optimizer

Index Access Selection
The optimizer does a cost-based selection of what
order to execute predicates and which index to
use for each predicate, considering
Predicate type
Indexes available on that column (or columns)
Estimated selectivity of this predicate
Estimated number of rows passing earlier
predicates
Whether invariant predicate results can be cached

16
IQ Query Plans

IQ uses a Volcano style data-flow query engine
Queries are transformed into an upside-down tree
of operator nodes, each of which can accept
tuples from any nodes immediately below it, and
can pass tuples up to the node above it
Each node transforms the tuple stream in some
way, such as filtering some tuples out, unioning
or joining multiple tuple streams together, etc.
A Query Plan is the textual display of a query
tree

17
IQ Query Plans
The Query Tree
The Query
select s_suppkey, count(ps_partkey) fr
om partsupp, supplier where ps_suppkey
s_suppkey and ps_availqty gt 100 group by
s_suppkey
18
IQ Query Plans
The Query Tree
The Query Plan
Query Plan 1 04 Group By (Hash)
Child Node 1 03 Query_Name
techwave_example_1 Output Vector 2
entries (12 data bytes) 2 . 03 Join (Hash)
Left Child Node 01 Right
Child Node 02 Valid Join Algorithms
NLJ, NLPDJ, SMJ, SMPDJ, HJ, HPDJ Left
Input Table 1 partsupp Right Input
Table 1 supplier Join Result
Constraint Many to 1 Condition 1
(partsupp.ps_suppkey supplier.s_suppkey)
Output Vector 2 entries (8 data bytes) 3
. . 01 Vertical Cursor Table Name
partsupp Table Row Count 8000
Condition 1 (Invariant)
(partsupp.ps_availqty gt 100)
Condition 1 Selectivity 0.98951089
Output Vector 2 entries (8 data bytes) 3 . .
02 Vertical Cursor Table Name
supplier Table Row Count 100
Output Vector 1 entries (4 data bytes)
19
IQ Query Plans
The Query Tree
The Query Plan
Query Plan 1 04 Group By (Hash)
Child Node 1 03 Query_Name
techwave_example_1 Output Vector 2
entries (12 data bytes) 2 . 03 Join (Hash)
Left Child Node 01 Right
Child Node 02 Valid Join Algorithms
NLJ, NLPDJ, SMJ, SMPDJ, HJ, HPDJ Left
Input Table 1 partsupp Right Input
Table 1 supplier Join Result
Constraint Many to 1 Condition 1
(partsupp.ps_suppkey supplier.s_suppkey)
Output Vector 2 entries (8 data bytes) 3
. . 01 Vertical Cursor Table Name
partsupp Table Row Count 8000
Condition 1 (Invariant)
(partsupp.ps_availqty gt 100)
Condition 1 Selectivity 0.98951089
Output Vector 2 entries (8 data bytes) 3 . .
02 Vertical Cursor Table Name
supplier Table Row Count 100
Output Vector 1 entries (4 data bytes)
20
IQ Query Plans

Query plans are enabled by setting an option
Query plans appear in the server log (.iqmsg)
file
All users query plans plus all other server
messages are interleaved in the server log file
All plan lines in the server log are prefixed
like
2000-06-20 204523 0001 20535
Prefix is composed of date, time, connection
id, message number for query plan lines 20535

set temporary option Query_Plan on
21
IQ Query Plans

Other useful options with query plans
Query_Name - adds one line to the query plan with
the name string supplied makes it much easier to
find your query plan in the midst of the server
log file
Query_Detail - adds lots of extra info to the
query plan that is occasionally useful, like the
data types, distinct counts, and indexes
available on each column used
NoExec - generates a plan without actually
executing
Query_Plan_After_Run - delays query plan printing
so that actual rows projected, etc. can be
included in plans

22
IQ Query Plan Elements

Leaf Node one table or one join index

3 . . 01 Vertical Cursor Table
Name partsupp Table Row Count
8000 Condition 1 (Invariant)
(partsupp.ps_availqty gt 100)
Condition 1 Selectivity 0.98951089
Output Vector 2 entries (8 data bytes)
23
IQ Query Plan Elements

Leaf Node with Query_Plan_After_Run

3 . . 01 Vertical Cursor
Generated Result Rows 7916 Table
Name partsupp Table Row Count
8000 Condition 1 (Invariant)
(partsupp.ps_availqty gt 100)
Condition 1 Selectivity 0.98951089
Output Vector 2 entries (8 data bytes)
24
IQ Query Plan Elements
3 . . 01 Vertical Cursor Table
Name partsupp Table Row Count
8000 Condition 1 (Invariant)
(partsupp.ps_availqty gt 100)
Condition 1 Selectivity 0.98951089
Condition 1 Index HNG ps_availqty_hng
Output Vector 2 entries (8 data bytes)
Output 1 partsupp.ps_partkey
Output 1 Data Type INTEGER (10, 0)
Output 1 Base Distincts 2000
Output 1 Indexes FP, HG
Output 2 partsupp.ps_suppkey
Output 2 Data Type INTEGER (10, 0)
Output 2 Base Distincts 100
Output 2 Indexes FP, HG
Column 1 partsupp.ps_availqty
Column 1 Data Type INTEGER (10, 0)
Column 1 Base Distincts 5497
Column 1 Indexes FP, HNG, HG
Index Output 1 (Projected) partsupp.ps_partkey
Index Output 2 (Projected)
partsupp.ps_suppkey

Leaf Node with Query_Detail

25
IQ Query Plan Elements

Join Node

2 . 03 Join (Hash) Left Child
Node 01 Right Child Node 02
Valid Join Algorithms NLJ, NLPDJ, SMJ,
SMPDJ, HJ, HPDJ Left Input Table 1
partsupp Right Input Table 1
supplier Join Result Constraint Many
to 1 Condition 1 (partsupp.ps_suppkey
supplier.s_suppkey) Output Vector
2 entries (8 data bytes)
26
IQ Query Plan Elements

Join Node with Query_Plan_After_Run

2 . 03 Join (Hash) Left Child
Node 01 Right Child Node 02
Generated Result Rows 7916
Valid Join Algorithms NLJ, NLPDJ, SMJ, SMPDJ,
HJ, HPDJ Left Input Table 1
partsupp Right Input Table 1
supplier Join Result Constraint Many
to 1 Condition 1 (partsupp.ps_suppkey
supplier.s_suppkey) Hash actual
number of keys 100 Output Vector 2
entries (8 data bytes)
27
IQ Query Plan Elements

Join Node with Query_Detail

Join Node Hash Push-Down

2 . 03 Join (Hash PushDown) Left
Child Node 01 Right Child Node
02 Valid Join Algorithms NLJ,
NLPDJ, SMJ, SMPDJ, HJ, HPDJ Left Input
Table 1 partsupp Right Input Table
1 supplier Join Result Constraint
Many to 1 Condition 1
(partsupp.ps_suppkey supplier.s_suppkey)
Condition 2 (Pushed) (partsupp.ps_suppkey
IN BV(0)) Output Vector 2 entries (8
data bytes)
29
IQ Query Plan Elements

Join Node Sort-Merge Push-Down

2 . 03 Join (Sort-Merge PushDown)
Left Child Node 05 Right Child
Node 06 Valid Join Algorithms
NLJ, NLPDJ, SMJ, SMPDJ, HJ, HPDJ Left
Input Table 1 partsupp Right Input
Table 1 supplier Join Result
Constraint Many to 1 Condition 1
(partsupp.ps_suppkey supplier.s_suppkey)
Condition 2 (Pushed) (partsupp.ps_suppkey
PROBABLY_IN BV(0)) Output Vector 2
entries (8 data bytes)
30
IQ Query Plans

Other query plan node types
Group By - grouping operator seen earlier
Order By - sort operator used by sort-merge
joins, subquery outer reference ordering, and
sort-based grouping, as well as the ORDER BY
clause
Vertical Grouping Cursor - leaf node where
grouping is being done in the indexes
Root - only appears when Query_Detail is on site
of internal data type to external data type
conversions

31
IQ Query Plans

Other query plan node types (cont.)
Scrolling Cursor Store - results cache that
appears at the top of a query tree to support
scrolling cursor access (see the
Force_No_Scroll_Cursors option description)
Store - results cache for invariant portions of
correlated subqueries
Subquery - executes correlated subqueries, or
uncorrelated subqueries that could not be used
directly with the parent query blocks indexes
Filter - site of non-pushable non-join predicates

32
IQ Query Performance Hints

Use a sensible schema design
Follow the recommendations in the manual for
where to create indexes, and what types to
create, including HG indexes on all join columns
Define the IQ UNIQUE attribute on all columns
with modest expected distinct value counts (lt
10000)
Do not use tiny page sizes. For big benchmarks
on 1Gb memory machines we often use 128K pages
RTFM and RTFRN

33
IQ Query Performance Experimentation

The IQ optimizer usually does a far better job of
creating a good query plan than we can, but there
will be times when you have a specific query that
just runs too slow (IYHO). Things you can try
If there is any local predicate where the
estimated selectivity is way off, supply the
actual selectivity.
You can often influence the choice of algorithms
by experimenting with the options
Join_Preference, Aggregation_Preference, and
Index_Preference
You can force the optimizer to accept your
specified join order exactly by setting
Join_Optimization off.