Title: Query Execution
1Query Execution
2Resolving an SQL query
- Since our SQL queries are very high level, the
query processor must do a lot of additional
processing to supply all of the missing details. - In practice, an SQL query is translated
internally into a relational algebra expression. - One advantage of using relational algebra is that
it makes alternative forms of a query easier to
explore. - The different algebraic expressions for a query
are called logical query plans. - We will focus first on the methods for executing
the operations of the relational algebra. - Then we will focus on how transform logical query
plans.
3Preview
- Parsing read SQL, output relational algebra tree
- Query rewrite Transform tree to a form, which is
more efficient to evaluate - Physical plan generation
- select implementation for each operator in
tree, - and for passing results up the tree.
- In this chapter we will focus on the
implementation for each operator.
4Relational Algebra recap
- RA union, intersection and difference correspond
to UNION, INTERSECT, and EXCEPT in SQL - Selection ? corresponds to the WHERE-clause in
SQL - Projection ? corresponds to SELECT-clause
- Product ? corresponds to FROM-clause
- Joins corresponds to JOIN, NATURAL JOIN, and
OUTER JOIN in the SQL2 standard - Duplicate elimination ? corresponds to DISTINCT
in SELECT-clause - Grouping ? corresponds to GROUP BY
- Sorting ? corresponds to ORDER BY
5Expression trees
MovieStar(name,addr,gender,birthdate) StarsIn(titl
e, year, starName) SELECT title, birthdate FROM
MovieStar, StarsIn WHERE year 1996 AND Gender
F AND starName name
6How to generate such alternative expression trees
will be Chapter 16.
- Join method?
- Can we pipeline the result of one or both
selections, and avoid storing the result on disk
temporarily? - Are there indexes on MovieStar.gender and/or
StarsIn.year that will make the ?'s efficient?
7Relational algebra for real SQL
- Keep in mind the following fact
- A relation in algebra is a set, while a relation
in SQL is probably a bag - In short, a bag allows duplicates.
- Not surprisingly, this can effect the cost of
related operations.
8Bag union, intersection, and difference
- Card(t,R) means the number of occurrences of
tuple t in relation R - Card(t, R?S) Card(t,R) Card(t,S)
- Card(t,R?S) minCard(t,R), Card(t,S)
- Card(t,RS) maxCard(t,R)Card(t,S), 0
- Example R A,B,B, S C,A,B,C
- R ? S A,A,B,B,B,C,C
- R ? S A,B
- R S B
9Beware Bag Laws ! Set Laws
- Not all algebraic laws that hold for sets also
hold for bags. - For one example, the commutative law for union
- R ? S S ? R does hold for bags.
- Since addition is commutative, adding the number
of times that tuple x appears in R and S doesnt
depend on the order of R and S. - Set union is idempotent, meaning that S ? S S.
- However, for bags, if x appears n times in S,
then it appears 2n times in S ? S. - Thus S ? S ! S in general.
10Physical operators
- Physical query plans are built from physical
operators. - In most cases, the physical operators are direct
implementations of the relational algebra
operators. - However, there are several other physical
operators for various supporting tasks. - Table-scan (the most basic operation we want to
perform in a physical query plan) - Index-scan (E.g. if we have an index on some
relation R we can retrieve the blocks of R by
using the index) - Sort-scan (takes a relation R and a specification
of the attributes on which the sort is to be
made, and produces R in sorted order)
11Computational Model
- When comparing algorithms for the same operations
we do not consider the cost of writing the
output. - Because the cost of writing the output on the
disk depends on the size of the result, not on
the way the result was computed. In other words,
it is the same for any computational alternative. - Also, we can often pipeline the result to other
operators when the result is constructed in main
memory. So a final output phase may not even be
required.
12Cost parameters
- In order to evaluate or estimate the cost of
query resolution, we need to identify the
relevant parameters. - Typical cost parameters include
- R the relation on disk
- M number of main memory buffers available
(1buffer 1block) - B(R) number of blocks of R
- T(R) number of tuples of R
- V(R, a) number of distinct values in column a
of R - V(R, L) number of different tuples in R (where
L is a list of attributes or columns) - Simple cost estimate
- Basic scan B(R)
- 2PMMS 3B(R)
- Recall that final output is not counted
13Algorithms for implementing RA-operators
- Classification of algorithms
- Sort based methods
- Hash based methods
- Index based methods
- Degree of difficultness of algorithms
- One pass (when one relation can fit into main
memory) - Two pass (when no relation can fit in main
memory, but again the relations are not very
extremely large) - Multi pass (when the relations are very extremely
large) - Classification of operators
- Tuple-at-a-time, unary operations(s, p)
- Full-relation, unary operations (d, g)
- Full-relation, binary operations (union, join,)
14Selection --
- The condition C might involve
- Arithmetic (,-, ?) or string operators such as
LIKE - Comparison between terms, e.g. a lt b or ab 10.
- Boolean connectives AND, OR, and NOT
- Example R
a b ---- 2 3 4 5 2 3
a b ---- 0 1 2 3 4 5 2 3
a b ---- 4 5
One pass, tuple-at-a-time
15Projection --
- Argument L of ? is a sequence of elements of the
following form - A single attribute in R, or
- An expression x ? y, where x and y are attribute
names, or - An expression E ? z, where E is an expression
involving attributes in R and z is a new
attribute name not in R - Example R
a b c ------ 0 1 2 0 1 2 3 4 5
a x ---- 0 3 0 3 3 9
x y ---- 1 1 1 1 1 1
One pass, tuple-at-a-time
16One pass, tuple-at-a-time
- Selection and projection
- Cost B(R) or T(R)
- (if the relation is
- not clustered)
- Space requirement
- M ? 1 block
- Principle
- Read one block (or one tuple if the relation is
not clustered) at a time - Filter in or out the tuples of this block.
17Duplicate Elimination
- R1 ?(R2).
- R1 consists of one copy of each tuple that
appears in R2 one or more times.
R A B 1 2 3 4 1 2
One pass, unary full-relation operations
18One pass, unary full-relation operations
- Duplicate elimination
- for each tuple decide
- seen before ignore
- new output
- Principle
- It is the first time we have
- seen this tuple, in which case we copy it to the
output. - We have seen the tuple before,in which case we
must not output this tuple. - We need a Main Memory hash-table to be efficient.
- Requirement ?
19One pass, unary full-relation operations
20One pass, unary full-relation operations
21One pass, unary full-relation operations
- Grouping Accumulate the information on groups in
main memory. - Need to keep one entry for each value of the
grouping attributes, through a main memory
search structure (hash table). - Then, we need for each group to keep an
aggregated value (or values if the query asks for
more than one aggregation). - For MIN/MAX we keep the min or max value seen so
far for the group. - For COUNT aggregation we keep a counter which is
incremented each time we encounter a tuple
belonging to the group. - For SUM, we add the value if the tuple belongs to
the group. - For AVG?
- MM requirement.
- Typically, a (group) tuple will be smaller than a
tuple of the input relation, - Typically, the number of groups will be smaller
than the number of tuples in the input relation.
The number of groups is - Requirement ? lt M
22One pass, binary operators
- Requirement min(B(R),B(S)) M
- Exception bag union
- Cost B(R) B(S)
- Assume R is larger than S.
- How to perform the operations below
- Set union, set intersection, set difference
- Bag intersection, bag difference
- Cartesian product, natural join
- All these operators require reading the smaller
of the relations into main memory using there a
search scheme (e.g. main memory hash table) for
easy search and insertion.
23Set Union
- Let R and S be sets.
- We read S into M-1 buffers of main memory.
- All these tuples are also copied to the output.
- We then read each block of R into the Mth buffer,
one at a time. - For each tuple t of R we see if t is in S, and if
not, we copy t to output. - At the end we output S from main mem.
24Set Intersection
- Let R and S be sets.
- The result will be set.
- We read S into M-1 buffers of main memory.
- We then read each block of R into the M-th
buffer, one at a time. - For each tuple t of R we see if t is in S, and if
so, we copy t to output.
25Set Difference
- Let R and S be sets.
- Since difference is not a commutative operator,
we must distinguish between R-S and S-R. - Read S into M-1 buffers of main memory.
- Then read each block of R into the Mth buffer,
one at a time. - To compute R-S
- for each tuple t of R we see if t is not in S,
and if so, we copy t to output. - To compute S-R
- for each tuple t of R we see if t is is in S, we
delete t from S in such a case. At the end we
output those tuples of S that remain.
26Bag Intersection
- Let R and S be bags.
- Read S into M-1 buffers of main memory.
- Also, associate with each tuple a count, which
initially measures the number of times the tuple
occurs in S. - Then read each block of R into the M-th buffer,
one at a time. - For each tuple t of R we see if t is in S. If not
we ignore it. - Otherwise, if the counter is greater than zero,
we output t and decrement the counter.
27Bag Difference
- We read S into M-1 buffers of main memory.
- Also, we associate with each tuple a count, which
initially measures the number of times the tuple
occur in S. - We then read each block of R into the M-th
buffer, one at a time. - To compute S-R
- for each tuple t of R we see if t is is in S, we
decrement its counter. - At the end we output those tuples of S that
remain with counter positive. - To compute R-S
- we may think of the counter c for tuple t as
having c reasons to not output t. - Now, when we process a tuple of R we check to see
if that tuple appears in S. If not we output t. - Otherwise, we check to see the counter c of t. If
it is 0 we output t. - If not, we dont output t, and we decrement c.
28Product
- We read S into M-1 buffers of main memory. No
special structure is needed. - We then read each block of R into the M-th
buffer, one at a time. And combine each tuple
with all the tuples of S.
29(No Transcript)
30Natural Join
- We read S into M-1 buffers of main memory and
build a search structure where the search key is
the shared attributes X of R and S. - We then read each block of R into the M-th
buffer, one at a time. For each tuple t of R we
look for tuples t in S, and if tX uX, we
copy t.u to the output.
31Nested-Loop joins
- one-and-a-half pass method, since one relation
will be read just once. - Tuple-Based Nested-loop Join Algorithm
- FOR each tuple s in S DO
- FOR each tuple r in R DO
- IF r and s join to make a tuple t THEN
- output t
- Improvement to Take Advantage of Disk I/O Model
- Instead of retrieving tuples of R, T(S) times,
load memory with as many tuples of S as can fit,
and match tuples of R against all S tuples in
memory.
32Block-based nested loops
- Assume B(S) B(R), and B(S) gt M
- Read M-1 blocks of S into main memory and compare
to all of R, block by block - FOR each chunk of M-1 blocks of S DO
- FOR each block b of R DO
- FOR each tuple t of b DO
- find the tuples of S in memory that join
with t - output the join of t with each of these
tuples
33Example
- B(R) 1000, B(S) 500, M 101
- Outer loop iterates 5 times
- At each iteration we read M-1 (i.e. 100) blocks
of S and all of R (i.e. 1000) blocks. - Total time 5(100 1000) 5500 I/Os
- Question What if we reversed the roles of R and
S? - We would iterate 10 times, and in each we would
read 100500 blocks, for a total of 10(100500)
6000 I/Os. - Compare with one-pass join, if it could be done!
- We would need 1500 disk I/Os if B(S) ? M-1
34Analysis of blocks nested loops
- Number of disk I/Os
- B(S)/(M-1) (M-1 B(R))
- or
- B(S) (B(S)B(R))/(M-1)
- or approximately (B(S)B(R))/M
35Summary of one-pass algorithms
36Two-pass algorithms based on sorting
- This special case of multi-pass algorithms is
sufficient for most of the relation sizes. - Main idea for unary operations on R
- Suppose B(R) ? M (main memory size in blocks)
- First pass
- Read M blocks of R into Main Memory
- Sort the content of Main Memory
- Write the sorted result (sublist/run) into M
blocks on disk. - Second pass create final result
37Duplicate elimination ? using sorting
- In the second phase (merging) we dont sort but
copy each tuple just once. - We can do that because the identical tuples will
show up at the same time, i.e. they will be all
the first ones at the buffers (for the sorted
sublists). - As usual, if one buffer gets empty we refill it.
38Duplicate-Elimination using Sorting Example
- Assume M3, each buffer holds 2 records and
relation R consists of the following 17 tuples - 2, 5, 2, 1, 2, 2, 4, 5, 4, 3, 4, 2, 1, 5, 2, 1,
3 - After the first pass the following sorted
sub-lists are created - 1, 2, 2, 2, 2, 5
- 2, 3, 4, 4, 4, 5
- 1, 1, 2, 3, 5
- In the second pass we dedicate a memory buffer to
each sub-list.
39Example (Contd)
40Example (Contd)
41Example (Contd)
42Analysis of ?(R)
- 2B(R) to create sorted sublists, B(R) to read
each sublist in phase 2. Total 3B(R) - How large can R be?
- There can be no more than M sublists since we
need one buffer for each one. - So, B(R)/M M, (B(R)/M is the number of
sublists) -
- i.e. B(R) M2
- To compute ?(R) we need at least sqrt(B(R))
blocks of Main Memory.
43Sort-based ?, ?, -
Example set union.
- Create sorted sublists of R and S
- Use input buffers for sorted sublists of R and S,
one buffer per sublist. - Output each tuple once.
- We can do that since all the identical tuples
appear at the same time.
- Analysis 3(B(R) B(S)) disk I/Os
- Condition B(R) B(S) M2
- Similar algorithms for sort based intersection
and difference (bag or set versions).
44Simple sort-based join
- For R(X,Y) S(Y,Z) with M buffers of memory
- Completely sort R on Y, sort S on Y
- Merge phase
- Use 2 input buffers 1 for R, 1 for S.
- Pick tuple t with smallest Y value in the buffer
for R - If t doesnt match with the first tuple in the
buffer for S, then just remove t. - Otherwise, read all the tuples from R with the
same Y value as t and put them in the M-2 part of
the memory. - When the input buffer for R is exhausted fill it
again and again. - Then, read the tuples of S that match. For each
one we produce the join of it with all the tuples
of R in the M-2 part of the memory.
45Example of sort join
- B(R) 1000, B(S) 500, M 101
- To sort R, we need 4B(R) I/Os, same for S.
- Number of I/Os 4(B(R) B(S))
- Doing the join in the merge phase
- Number of I/Os B(R) B(S)
- Total disk I/Os 5(B(R) B(S)) 7500
- Memory Requirement?
- To be able to do the sort, we should have B(R)
M2 and B(S) M2 - Recall for nested-loop join, we needed 5500 disk
I/Os, but the memory requirement was quadratic
(it is linear, here), i.e., nested-loop join is
not good for joining relations that are much
larger than main memory.
46Potential problem ...
S(Y, Z) --------- a z1 a z2 ...
a zm
R(X , Y) ----------- x1 a x2 a
xn a
What if Size of n tuples gt M-2 and Size of m
tuplesgt M-2?
- If the tuples from R (or S) with the same value y
of Y do not fit in M-2 buffers, then we use all
M-2 buffers to do a nested-loop join on the
tuples with Y-value y from both relations. - Observe that we can smoothly continue with the
nested loop join when we see that the R tuples
with Y-value y do not fit in M-2 buffers.
47Can We Improve on Sort Join?
- Do we really need the fully sorted files?
- Suppose we are not worried about many common Y
values
48A more efficient sort-based join
- Suppose we are not worried about many common Y
values - Create Y-sorted sublists of R and S
- Bring first block of each sublist into a buffer
(assuming we have at most M sublists) - Find smallest Y-value from heads of buffers. Join
with other tuples in heads of buffers, use other
possible buffers, if there are many tuples with
the same Y values.
- Disk I/O 3(B(R) B(S))
- Requirement B(R) B(S) M2
49Example
- B(R) 1000, B(S) 500, M 101
- Total of 15 sorted sublists
- If too many tuples join on a value Y, use the
remaining 86 MM buffers for a one pass join on Y - Total cost 3(1000 500) 4500 disk I/Os
- M2 10201 gt B(R) B(S), so the requirement is
satisfied
50Summary of sort-based algorithms