Title: QUERY PROCESSING and OPTIMIZATION
1QUERY PROCESSING and OPTIMIZATION
- PRESENTED DECEMBER 5, 2006 BY
- JOHN KRETTEK
- ED BANTI
- YUYANG CHEN
2Agenda of Discussion
- I. Query Processing and Optimization Why?
- II. Steps of Processing
- III. Methods of Optimization
- Heuristic (Logical Transformations)
- Transformation Rules
- Heuristic Optimization Guidelines
- Cost Based (Physical Execution Costs)
- Data Storage/Access Refresher
- Catalog Costs
- IV. What All This Means To YOU?
3Query Processing Optimization
- What is Query Processing?
- Steps required to transform high level SQL query
into a correct and efficient strategy for
execution and retrieval. - What is Query Optimization?
- The activity of choosing a single efficient
execution strategy (from hundreds) as determined
by database catalog statistics.
4Questions for Query Optimization
- Which relational algebra expression, equivalent
to the given query, will lead to the most
efficient solution plan? - For each algebraic operator, what algorithm (of
several available) do we use to compute that
operator? - How do operations pass data (main memory buffer,
disk buffer,)? - Will this plan minimize resource usage?
(CPU/Response Time/Disk)
5Query Processing Who needs it?
A motivating example
Identify all managers who work in a London
branch SELECT FROM Staff s, Branch b WHERE
s.branchNo b.branchNo AND s.position
Manager AND b.city london
- Results in these equivalent relational algebra
statements - (1)?s(positionManager)(cityLondon)(Sta
ff.branchNoBranch.branchNo) (Staff X Branch) - (2) s(positionManager)(cityLondon)
(Staff wvStaff.branchNo Branch.branchNo Branch) - (3) s(positionManager) (Staff)
wvStaff.branchNo Branch.branchNo
s(cityLondon) (Branch)
6A Motivating Example (cont)
- Assume
- 1000 tuples in Staff.
- 50 Managers
- 50 tuples in Branch.
- 5 London branches
- No indexes or sort keys
- All temporary results are written back to disk
(memory is small) - Tuples are accessed one at a time (not in blocks)
7Motivating Example Query 1 (Bad)
- s(positionManager)(cityLondon)(Staff.branc
hNoBranch.branchNo) (Staff X Branch) - Requires (100050) disk accesses to read from
Staff and Branch relations - Creates temporary relation of Cartesian Product
(100050) tuples - Requires (100050) disk access to read in
temporary relation and test predicate - Total Work (100050) 2(100050)
- 101,050 I/O operations
8Motivating Example Query 2 (Better)
s(positionManager)(cityLondon) (Staff
wvStaff.branchNo Branch.branchNo Branch)
- Again requires (100050) disk accesses to read
from Staff and Branch - Joins Staff and Branch on branchNo with 1000
tuples - (1 employee 1 branch )
- Requires (1000) disk access to read in joined
relation and check predicate - Total Work (100050) 2(1000)
- 3050 I/O operations
- 3300 Improvement over Query 1
9Motivating Example Query 3 (Best)
s(positionManager) (Staff)
wvStaff.branchNo Branch.branchNo
s(cityLondon) (Branch)
- Read Staff relation to determine Managers (1000
reads) - Create 50 tuple relation(50 writes)
- Read Branch relation to determine London
branches (50 reads) - Create 5 tuple relation(5 writes)
- Join reduced relations and check predicate (50
5 reads) - Total Work 1000 2(50) 5 (50 5)
- 1160 I/O operations
- 8700 Improvement over Query 1
- Consider if Staff and Branch relations were 10x
size? 100x? Yikes!
10Three Major Steps of Processing
- (1) Query Decomposition
- Analysis
- Derive Relational Algebra Tree
- Normalization
- (2) Query Optimization
- Heuristic Improve and Refine relational algebra
tree to create equivalent Logical Query Plans - Cost Based Use database statistics to estimate
physical costs of logical operators in LQP to
create Physical Execution Plans - (3) Query Execution - Go Go Query Cheapest!
11Processing Steps
12Query Decomposition
- ANALYSIS
- Lexical Is it even valid SQL?
- Syntactic Do the relations/attributes exist and
are the operations valid? -
- Result is internal tree representation of SQL
query (Parse Tree)
13Query Decomposition (cont)
- RELATIONAL ALGEBRA TREE
- Root The desired result of query
- Leaf Base relations of query
- Non-Leaf Intermediate relation created from
relational algebra operation - NORMALIZATION
- Convert WHERE clause into more easily manipulated
form - Conjunctive Normal Form(CNF) (a v b) Ù (c v d)
Ù e Ù f (more efficient) - Disjunctive Normal Form(DNF) Ú
Ú
14Heuristic Optimization
- GOAL
- Use relational algebra equivalence rules to
improve the expected performance of a given query
tree. - Consider the example given earlier
- Join followed by Selection ( 3050 disk reads)
- Selection followed by Join ( 1160 disk reads)
15Relational Algebra Transformations
- Cascade of Selection
- (1) sp Ù q Ù r (R) sp(sq(sr(R)))
- Commutativity of Selection Operations
- (2) sp(sq(R)) sq(sp(R))
- In a sequence of projections only the last is
required - (3) PLPMPN(R) PL(R)
- Selections can be combined with Cartesian
Products and Joins - (4) sp( R x S ) R wvp S
- (5) sp( R wvq S ) R wvq p S
Note The above is an incomplete List! For a
complete list see the text.
16More Relational Algebra Transformations
- Join and Cartesian Product Operations are
Commutative and Associative - (6) R x S S x R
- (7) R x (S x T) (R x S) x T
- (8) R wvp S S wvp R
- (9) (R wvp S) wvq T R wvp (S wvq T)
- Selection Distributes over Joins
- If predicate p involves attributes of R only
- (10) sp( R wvq S ) sp(R) wvq S
- If predicate p involves only attributes of R and
q involves only attributes of S - (11) spq(R wvr S) sp(R) wvr sq(S)
17Optimization Uses The Following Heuristics
- Break apart conjunctive selections into a
sequence of simpler selections (preparatory step
for next heuristic). - Move s down the query tree for the earliest
possible execution (reduce number of tuples
processed). - Replace s-x pairs by wv (avoid large intermediate
results). - Break apart and move as far down the tree as
possible lists of projection attributes, create
new projections where possible (reduce tuple
widths early). - Perform the joins with the smallest expected
result first
18Heuristic Optimization Example
- What are the ticket numbers of the pilots flying
to France on 01-01-06? -
- SELECT p.ticketno
- FROM Flight f , Passenger p, Crew c
- WHERE f.flightNo p.flightNo AND
- f .flightNo c.flightNo AND
- f.date 01-01-06 AND
- f.to FRA AND
- p.name c.name AND
- c.job Pilot
Canonical Relational Algebra Expression
19Heuristic Optimization (Step 1)
20Heuristic Optimization (Step 2)
21Heuristic Optimization (Step 3)
22Heuristic Optimization (Step 4)
23Heuristic Optimization (Step 5)
24Heuristic Optimization (Step 6)
25Physical Execution Plan
- Identified optimal Logical Query Plans
- Every heuristic not always best transform
- Heuristic Analysis reduces search space for cost
evaluation but does not necessarily reduce costs - Annotate Logical Query Plan operators with
physical operations (1 ) - Binary vs. Linear search for Selection?
- Nested-Loop Join vs. Sort-Merge Join?
- Pipelining vs. Materialization?
- How does optimizer determine cheapest plan?
26Physical Searching
27Physical Storage
- Record Placement
- Types of Records
- Variable Length
- Fixed Length
- Record Separation
- Not needed when record size lt block size
- Fixed records dont need it
- If needed, indicate records with special marker
and give record lengths or offsets
28Record Separation
- Unspanned
- Records must stay within a block
- Simpler, but wastes space
- Spanned
- Records are across multiple blocks
- Require pointer at the end of the block to the
next block with that record - Essential if record size gt block size
29Record Separation
- Mixed Record Types Clustering
- Different record types within the same block
- Why cluster? Frequently accessed records are in
the same block - Has performance downsides if there are many
frequently accessed queries with different
ordering - Split Records
- Put fixed records in one place and variable in
another block
30Record Separation
- Sequencing
- Order records in sequential blocks based on a key
- Indirection
- Record address is a combination of various
physical identifiers or an arbitrary bit string - Very flexible but can be costly
31Accessing Data
- What is an index?
- Data structure that allows the DBMS to quickly
locate particular records or tuples that meet
specific conditions - Types of indicies
- Primary Index
- Secondary Index
- Dense Index
- Sparse Index/Clustering Index
- Multilevel Indicies
32Accessing Data
- Primary Index
- Index on the attribute that determines the
sequencing of the table - Guarantees that the index is unique
- Secondary Index
- An index on any other attribute
- Does not guarantee unique index
33Accessing Data
- Dense Index
- Every value of the indexed attribute appears in
the index - Can tell if record exists without accessing files
- Better access to overflow records
- Clustering Index
- Each index can correspond to many records
34Dense Index
35Accessing Data
- Sparse Index
- Many values of the indexed attribute dont appear
- Less index space per record
- Can keep more of index in memory
- Better for insertions
- Multilevel Indices
- Build an index on an index
- Level 2 Index -gt Level 2 Index -gt Data File
36Sparse Index
37B Tree
- Use a tree model to hold data or indices
- Maintain balanced tree and aim for a bushy
shallow tree
100
120 150 180
30
3 5 11
120 130
180 200
100 101 110
150 156 179
30 35
38B Tree
- Rules
- If root is not a leaf, it must have at least two
children - For a tree of order n, each node must have
between n/2 and n pointers and children - For a tree of order n, the number of key values
in a leaf node must be between (n-1)/2 and (n-1)
pointers and children
39B Tree (cont)
- Rules
- The number of key values contained in a non-leaf
node is 1 less than the number of pointers - The tree must always be balanced that is, every
path from the root node to a leaf must have the
same length - Leaf nodes are linked in order of key values
40Hashing
- Calculates the address of the page in which the
record is to be stored based on more or more
fields - Each hash points to a bucket
- Hash function should evenly distribute the
records throughout the file - A good hash will generate an equal number of keys
to buckets - Keep keys sorted within buckets
41Hashing
. . .
records
key ? h(key)
. . .
42Hashing
- Types of hashing
- Extensible Hashing
- Pro
- Handle growing files
- Less wasted space
- No full reorganizations
- Con
- Uses indirection
- Directory doubles in size
43Hashing
- Types of hashing
- Linear Hashing
- Pro
- Handle growing files
- Less wasted space
- No full reorganizations
- No indirection like extensible hashing
- Con
- Still have overflow chains
44Indexing vs. Hashing
- Hashing is good for
- Probes given specific key
- SELECT FROM R WHERE R.A 5
- Indexing is good for
- Range searches
- SELECT FROM R WHERE R.A gt 5
45Cost Model
46Cost Model
- To compare different alternatives we must
evaluate and estimate how expensive (resource
intensive) a specific execution is - Inputs to evaluate
- Query
- Database statistics
- Resource availability
- Disk Bandwidth/CPU costs/Network Bandwidth
47Cost Model
- Statistics
- Held in the system catalog
- nTuples(R), number of tuples in a relation
- bFactor(R), number of tuples that fit into a
block - nBlocks(R), number of blocks required to store R
- nDistinctA(R), number of distinct values that
appear for attribute in relation - minA(R), maxA(R), min/max possible values for
attribute in relation - SCA(R), average number of tuples that satisfy an
equality condition
48Selection Operation
- Example
- Branch (branchNo, street, city, postcode)
- Staff (staffNo, fName, lName, branchNo)
- Staff
- 100,000 tuples (denoted as CARD(S))
- 100 tuples per page
- 1000 pages (StaffPages)
- Branch
- 40,000 tuples (denoted as CARD(B))
- 80 tuples per page
- 500 pages (BranchPages)
49Selection Operation
- SELECT FROM Branch WHERE city B
- General Form
- Without Index
- Search on attributes BranchPages 500
- Search on primary key attribute BranchPages
500/2 250 - Using Clustered B Tree
- Costs
- Path from root to the leftmost leaf with
qualifying data entry - Retrieve all leaf pages fulfilling search
criteria - For each leaf page get corresponding data pages
- Each data page only retrieved once
50Selection Operation
- Using Index for Selections (Clustered B Tree)
- Example 1
- SELECT FROM Branch WHERE branchNo 6
- 1 tuple match (1 data page)
- Cost 1 index leaf page 1 data page
- Example 2
- SELECT FROM Branch WHERE street LIKE c
- 100 tuple matches (2 data pages)
- Cost 1 index leaf page two data pages
- Example 3
- SELECT FROM Branch WHERE street lt c
- 10,000 tuple matches (125 data pages)
- Cost 2 index leaf pages 125 data pages
51Selection Operation
- Using Index for Selections (Non-Clustered B
Tree) - Example 1
- SELECT FROM Branch WHERE branchNo 6
- 1 tuple match (1 data page)
- Cost 1 index leaf page 1 data page
- Example 2
- SELECT FROM Branch WHERE street LIKE c
- 100 tuple matches (80 data pages)
- Cost 1 index leaf page 100 data pages (some
pages are retrieved twice) - Example 3
- SELECT FROM Branch WHERE street lt c
- 10,000 tuple matches (490 data pages)
- Cost 2 index leaf pages 10,000 data pages
(pages will be retrieved several
times)
52Projection Operation
Extracts vertical subset of relation R, and
produces single relation S
- Steps to implementing the Projection Operation
- 1. Remove the attributes that are not required
- Eliminate any duplicate tuples from the required
attribute - Only required if the attributes do not include
the key of relation) - Done by sorting or hashing.
53Projection Operation
- Estimation of Cardinality of result set
- Projection contains key
- ntuples(S) nTuples(R)
- Projection contains non-key attribute A
- ntuples(S) SCA(R)
54Projection Operation
- Duplicate elimination using sorting
- Sort tuples of the reduced relation using all
remaining attributes as a sort key. - Duplicates are adjacent and can be easily
removed. - To remove unwanted tuples, need to read all the
tuples of R and copy the required attributes to a
temporary relation. - Estimated Cost
- nBlocks(R) nBlocks(R) log2(nBlocks(R))
55Projection Operation
- Duplicate elimination using hashing
Useful if there is a large number of buffer
blocks for R.
- Partitioning
- Allocate one buffer block for reading R allocate
(nBuffer 1) blocks for the output. - For each tuple in R,
- remove the unwanted attributes
- apply hash function h to the combination of the
remaining attributes - write the reduced tuple to the hashed value.
- h chosen so that tuples are uniformly distributed
to on of the (nBuffer 1) partitions. Then, two
tuples belonging to different partitions are not
duplicates
56Projection Operation
- Read each of the (nBuffer 1) partitions in
turn. - Apply a different hash function h2() to each
tuple as is read. - Insert the computed hash value into an in-memory
hash table. - If the tuple hashes to the same value as some
other tuple, check whether the two are the same,
and eliminate the new one if it is a duplicate. - Once a partion has been removed, write the tuples
in the hash table to the result file.
57Projection Operation
- Estimated Cost nBlocks(R) nb
- nb number of blocks required for a tempory
table that results from Projection on R before
duplicate elimination. - Assume hashing has no overflow
- Exclude cost of writing result relation
58Join Operation
- Join is very expensive and must be carefully
optimized - Ways the processor can join
- Simple nested loop join
- Block nested loop join
- Indexed nested loop join
- Sort-merge join
- Hash join
59Simple Nested Loop Join
- For each tuple in the outer relation, we scan
the entire inner relation (scan each tuple in
inner). - Simplest algorithm
- Can easily improve on this using Block Nested
Loop - Basic unit of read/write is disk block
- Add two additional loops that process blocks
60Block Nested Loop Join
- Outer loop iterates over one table, inner loop
iterates over the other (two additional loops on
the outside for the disk blocks) - for ib 1nBlocks(R)
- for jb 1nBlocks(S)
- for i 1nTuples
- for j 1nTuples
- Cost depends on buffer for outer block loops
- nBlocks(R) (nBlocks(R) nBlocks(S))
- If buffer has only one block for R and S
- nBlocks(R) nBlocks(S) (nBlocks(R) /
(nBuffer-2)) - If (nBuffer-2) blocks for R (same number of R
blocks, less for S) - nBlocks(R) nBlocks(S),
- If all blocks of R can be read into database
buffer (no outside loops)
61Block Nested Loop Join
62Indexed Nested Loop Join
- For each tuple in R, use index to lookup matching
tuples in S - avoids enumeration of the Cartesian product of R
and S) - Cost depends on indexing method for example
- nBlocks(R) nTuples(R) (nLevelsA(I) 1),
- If join attribute A in S is the primary key
- nBlocks(R) nTuples(R) (nLevelsA(I) SCA(R)
/ bFactor(R)), - For clustering index I on attribute A
63Sort- Merge Join
For Equijoins, the most efficient join occurs
when both relations are sorted on the join
attributes. We can look for quality tuples of R
and S by merging the two relations.
- If the tables are not sorted on key values,then
sort first and merge the table (log(N)) - If tables are sorted on key values,then just
merge (linear time) - Cost
- nBlocks(R) log2(nBlocks(R) nBlocks(S)
log2(nBlocks(S)), for sorts - nBlocks(R) nBlocks(S), for merge
64Hash Join
- Use hash map for indexing into other table
- Cost
- 3(nBlocks(R) nBlocks(S))
- If hash index is held in memory
- Read R S to partition
- Write R S to disk
- Read R S again to find matching tuples
- 2(nBlocks(R) nBlocks(S)) lognBuffer-1(nBlocks
(S))-1 Blocks(R) nBlocks(S) - Otherwise (if hash index cannot be held in memory)
65Hash Join
66Example Cost Estimation for Join Operation
- Assumptions
- There are separate hash indexes with no overflow
on the primary key attributes staffNo of Staff
and branchNo of Branch - There are 100 database buffer blocks
- The system catalog holds the following
statistics - nTuples(Staff) 6000
- bFactor(Staff) 30 ? nBlocks(Staff) 200
- nTuples(Branch) 500
- bFactor(Branch) 50 ? nBlocks(Branch) 10
- nTuples(PropertyForRent) 100,000
- bFactor(PropertyForRent) 50 ?
nBlocks(PropertyForRent) 2000 - J1 Staff XstaffNo PropertyForRent
- J2 Branch XbranchNo PropertyForRent
67Estimated I/O Costs of Join Operations
68Questions?
- Thank you for your time.
- Questions? Comments?