Title: How the CBO works
1. How the CBO works
- Jonathan Lewis
- www.jlcomp.demon.co.uk
2. Who am I?
Independent Consultant. 18 years
experience. Design, Strategy, Reviews,
Briefings, Seminars, Tutorials,
Trouble-shooting. www.jlcomp.demon.co.uk
3. Highlights
- A Puzzle
- Basic Costs
- Correcting Oracle's assumptions
- Oracle 9 learns
- (Join Mechanics - time permitting)
- Q and A
4. A puzzle (v8.1 - 4K)
create table t1 as
select
    trunc((rownum-1)/15)    n1,
    trunc((rownum-1)/15)    n2,
    rpad('x',100)           v1
from all_objects where rownum <= 3000;

create table t2 as
select
    mod(rownum,200)         n1,
    mod(rownum,200)         n2,
    rpad('x',100)           v1
from all_objects where rownum <= 3000;
We construct two sets of data with identical
content, although we do use two different
mathematical methods to get 15 rows each for 200
different values.
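A quick sanity check - in both tables each of the 200 values should appear 15 times:

select n1, count(*)
from t1
group by n1
order by n1;        -- expect 200 rows, each with count(*) = 15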
5. A puzzle - indexed
- create index t_i1 on t1(n1)
- create index t_i2 on t2(n1)
- analyze table t1 compute statistics
- analyze table t2 compute statistics
We create indexes and generate statistics. In
newer versions we should use the dbms_stats
package, not analyze; a sketch follows. (Note:
compute is often overkill.)
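A minimal dbms_stats equivalent (cascade => true gathers the index statistics as well):

begin
    dbms_stats.gather_table_stats(user, 'T1', cascade => true);
    dbms_stats.gather_table_stats(user, 'T2', cascade => true);
end;
/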
6. A puzzle - checking data
USER_TABLES

TABLE_NAME  BLOCKS  NUM_ROWS  AVG_ROW_LEN
----------  ------  --------  -----------
T1              96      3000          111
T2              96      3000          111

USER_TAB_COLUMNS

TAB  COL  LOW_VALUE  HIGH_VALUE  NUM_DISTINCT
---  ---  ---------  ----------  ------------
T1   N1   80         C20264               200
T2   N1   80         C20264               200
We can check that statistics like the number of
rows, column values and column counts are
identical. The data content is identical across
the two tables. The queries used are sketched below.
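Queries along these lines produce the figures above:

select table_name, blocks, num_rows, avg_row_len
from   user_tables
where  table_name in ('T1','T2');

select table_name, column_name, low_value, high_value, num_distinct
from   user_tab_columns
where  table_name in ('T1','T2')
and    column_name = 'N1';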
7. A puzzle - the problem
select * from t1 where n1 = 45;

SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=15)
  TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=2 Card=15)
    INDEX (RANGE SCAN) OF 'T_I1' (NON-UNIQUE) (Cost=1 Card=15)

select * from t2 where n1 = 45;

SELECT STATEMENT Optimizer=CHOOSE (Cost=15 Card=15)
  TABLE ACCESS (FULL) OF 'T2' (Cost=15 Card=15)
We now run exactly the same query against the two
sets of data - with autotrace switched on, as shown
below - and find that the execution plans are different.
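In SQL*Plus the plans come from something like:

set autotrace traceonly explain
select * from t1 where n1 = 45;
select * from t2 where n1 = 45;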
8. A puzzle - force it
select /*+ index(t2) */ * from t2 where n1 = 45;

SELECT STATEMENT Optimizer=CHOOSE (Cost=16 Card=15)
  TABLE ACCESS (BY INDEX ROWID) OF 'T2' (Cost=16 Card=15)
    INDEX (RANGE SCAN) OF 'T_I2' (NON-UNIQUE) (Cost=1 Card=15)

select * from t2 where n1 = 45;

SELECT STATEMENT Optimizer=CHOOSE (Cost=15 Card=15)
  TABLE ACCESS (FULL) OF 'T2' (Cost=15 Card=15)
Why has Oracle ignored the index on T2? Put in
the hint(s) to make it happen, and see if we get
any clues. The tablescan is simply cheaper!
9. A puzzle - the detail
select
    table_name                 tab,
    num_rows                   num_rows,
    avg_leaf_blocks_per_key    l_blocks,
    avg_data_blocks_per_key    d_blocks,
    clustering_factor          cl_fac
from user_indexes;

TAB  NUM_ROWS  L_BLOCKS  D_BLOCKS  CL_FAC
---  --------  --------  --------  ------
T1       3000         1         1      96
T2       3000         1        15    3000
Why is the tablescan cheaper ? We look at the
data scattering, rather than the data content,
and find the answer. The clustering is different.
10. A puzzle - the difference
The data on the left shows the effect of the
trunc() function; the data on the right shows the
mod() effect. The statistics describe the data
perfectly.
11. The arithmetic
- T2 by index: one index block + 15 data blocks = 16
- T2 by scan: 96 blocks / 8 (multiblock read) = 12 (this is a first approximation)
- T1 by index: one index block + one data block = 2

Silly assumption 1: every logical request turns
into a physical read. Silly assumption 2: a
multiblock read is just as fast as a single block
read.
12. Multiblock Read
96 / 6.589 = 14.57, which rounds up to the
observed tablescan cost of 15 (and add 1 in 9.2).
The cost of a tablescan uses an 'adjusted'
db_file_multiblock_read_count. Under
'traditional' costing the v9 cost is always one
more than the v8 cost.
13. Multiblock Read Count
select * from t2 where n1 = 45;        -- tablescan cost was 15

alter session set db_file_multiblock_read_count = 4;

SELECT STATEMENT Optimizer=CHOOSE (Cost=16 Card=15)
  TABLE ACCESS (BY INDEX ROWID) OF 'T2' (Cost=16 Card=15)
    INDEX (RANGE SCAN) OF 'T_I2' (NON-UNIQUE) (Cost=1 Card=15)

(tablescan cost would be 23)

alter session set db_file_multiblock_read_count = 16;

SELECT STATEMENT Optimizer=CHOOSE (Cost=10 Card=15)
  TABLE ACCESS (FULL) OF 'T2' (Cost=10 Card=15)
We can affect access paths by changing
db_file_multiblock_read_count - but this is a
drastic change to make on a production system.
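The 'adjusted' read counts behind these costs (widely published figures for traditional costing - treat them as illustrative) make the arithmetic consistent:

db_file_multiblock_read_count =  4: 96 /  4.175 = 23.0  -> cost 23
db_file_multiblock_read_count =  8: 96 /  6.589 = 14.57 -> cost 15
db_file_multiblock_read_count = 16: 96 / 10.398 =  9.23 -> cost 10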
14. Single-block adjustment
select * from t2 where n1 = 45;        -- index access cost was 16

alter session set optimizer_index_cost_adj = 50;

SELECT STATEMENT Optimizer=CHOOSE (Cost=8 Card=15)
  TABLE ACCESS (BY INDEX ROWID) OF 'T2' (Cost=8 Card=15)
    INDEX (RANGE SCAN) OF 'T_I2' (NON-UNIQUE) (Cost=1 Card=15)

alter session set optimizer_index_cost_adj = 25;

SELECT STATEMENT Optimizer=CHOOSE (Cost=4 Card=15)
  TABLE ACCESS (BY INDEX ROWID) OF 'T2' (Cost=4 Card=15)
    INDEX (RANGE SCAN) OF 'T_I2' (NON-UNIQUE) (Cost=1 Card=15)
Under Oracle 9 the numbers are slightly different.
We can fix one of Oracle's silly assumptions -
let it know that single block reads are cheaper
(faster) than multiblock reads - by setting a
percentage cost adjustment.
15. Single-block adjustment
select event, average_wait
from v$system_event                    -- or v$session_event
where event like 'db file s%read';

EVENT                        AVERAGE_WAIT
db file sequential read              1.05
db file scattered read               3.72

sequential read time / scattered read time = 1.05 / 3.72 = 0.28226

alter session set optimizer_index_cost_adj = 28;
-- or set it in the init.ora, or in a logon trigger
Tim Gorman (www.evdbt.com) - The search for
intelligent life in the CBO. But see also Garry
Robinson: http://www.oracleadvice.com/Tips/optind.htm
The really nice thing about this is that we can
set a genuine and realistic cost factor by
checking recent, or localised, history (snapshot
v$session_event). A query that derives the figure
directly is sketched below.
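A sketch that derives the percentage in a single query (assuming both wait events have occurred since instance startup):

select round(100 * seq.average_wait / scat.average_wait) pct
from   (select average_wait from v$system_event
        where  event = 'db file sequential read') seq,
       (select average_wait from v$system_event
        where  event = 'db file scattered read') scat;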
16. Join cost-adjustment
select t2.n1, t1.n2
from t2, t1
where t2.n2 = 45
and   t2.n1 = t1.n1;

SELECT STATEMENT (Cost=31 Card=225)
  HASH JOIN (Cost=31 Card=225)
    TABLE ACCESS (FULL) OF T2 (Cost=15 Card=15)
    TABLE ACCESS (FULL) OF T1 (Cost=15 Card=3000)

31 = 15 + 15 + a bit
We can even see the effect of this price-fixing
in joins. Unhinted, or unfixed, the optimizer
chooses a hash join as the cheapest way to join
our two tables.
17. Hash Join (1)
The first table is hashed in memory, the second
table is used to probe the hash (build) table for
matches. In simple cases the cost is easy to
calculate.
18. Force a nested loop
select
    /*+ ordered use_nl(t1) index(t1) */
    t2.n1, t1.n2
from t2, t1
where t2.n2 = 45
and   t2.n1 = t1.n1;

NESTED LOOPS (Cost=45 Card=225)
  TABLE ACCESS (FULL) OF T2 (Cost=15 Card=15)
  TABLE ACCESS (BY ROWID) OF T1 (Cost=2 Card=3000)
    INDEX (RANGE SCAN) OF T_I1 (NON-UNIQUE) (Cost=1)
As usual, to investigate why a plan is going
wrong, we hint it to make it do what we want -
and then look for clues in the resulting cost
lines.
19. Forced NL cost
alter session set optimizer_index_cost_adj = 100;   -- the default

NESTED LOOPS (Cost=45 Card=225)
  TABLE ACCESS (FULL) OF T2 (Cost=15 Card=15)
  TABLE ACCESS (BY ROWID) OF T1 (Cost=2 Card=3000)
    INDEX (RANGE SCAN) OF T_I1 (NON-UNIQUE) (Cost=1)

T2 - cost 15, estimated rows 15.
For each row from T2 we access T1 by complete key value.
T1 - cost per access = 2; cost for 15 accesses = 15 x 2 = 30.
Total cost of query = cost of T2 + total cost of T1 = 15 + 30 = 45.
The nested loop algorithm is: for each row in
the outer table, use the value in that row to
access the inner table - hence the simple formula,
shown in general form below.
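The general form of the arithmetic, for this style of costing:

cost(nested loop) = cost(outer access) + card(outer) x cost(one inner access)
                  = 15 + 15 x 2 = 45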
20. Nested Loop

(Diagram: rows from the driving table feeding indexed probes into the second table.)
The basic arithmetic of the nested loop join is
visible in the picture. We do three indexed
accesses into T2, but need the three driving rows
from T1 first.
21. Nested Loops - recosted
alter session set optimizer_index_cost_adj = 50;

NESTED LOOPS (Cost=30 Card=225)
  TABLE ACCESS (FULL) OF T2 (Cost=15 Card=15)
  TABLE ACCESS (BY ROWID) OF T1 (Cost=1 Card=3000)
    INDEX (RANGE SCAN) OF T_I1 (NON-UNIQUE) (Cost=1)

T2 - cost 15, estimated rows 15.
For each row from T2 we access T1 by complete key value.
T1 - cost per access = 2 x 50% = 1; cost for 15 accesses = 15 x 1 = 15.
Total cost of query = cost of T2 + total cost of T1 = 15 + 15 = 30.
What happens to the cost when we tell Oracle that
single block reads cost half as much as it would
otherwise charge ?
22. Index Caching (NL only)
Basic nested loop cost (hinted):

NESTED LOOPS (Cost=45 Card=225)
  TABLE ACCESS (FULL) OF 'T2' (Cost=15 Card=15)
  TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=2 Card=3000)
    INDEX (RANGE SCAN) OF 'T_I1' (NON-UNIQUE) (Cost=1)

alter session set optimizer_index_caching = 100;

NESTED LOOPS (Cost=30 Card=225)
  TABLE ACCESS (FULL) OF 'T2' (Cost=15 Card=15)
  TABLE ACCESS (BY INDEX ROWID) OF 'T1' (Cost=1 Card=3000)
    INDEX (RANGE SCAN) OF 'T_I1' (NON-UNIQUE)
We can improve on silly assumption 1 (every
logical I/O is also a physical I/O): index blocks
are often cached, so tell the optimizer how good
our cache is.
23. Simplifications
- What about the blevel ?
- What about multi-column indexes ?
- What about unbounded ranges ?
- What about unique indexes ?
- What about bitmap indexes ?
This was a visually helpful introduction. The
walk-through is intended to give you a
gut-feeling for how the optimizer works, but
there are plenty of special cases and bits of
funny arithmetic.
24. An improved approximation
cost = blevel
     + selectivity x leaf_blocks
     + selectivity x clustering_factor

(see Wolfgang Breitling's paper to IOUG-A 2002)

For equality on all index columns:
    avg_leaf_blocks_per_key = selectivity x leaf_blocks
    avg_data_blocks_per_key = selectivity x clustering_factor

For multi-column indexes, or when using a range
scan, we need more precise arithmetic - but even
our example was a special case of the general
formula, worked through below.
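Plugging in the T2 figures as a check: selectivity = 1/200 = 0.005, so selectivity x clustering_factor = 0.005 x 3000 = 15 data block visits, plus one leaf block, giving the cost of 16 seen earlier. The blevel contributes nothing here - a widely reported observation (not documented behaviour) is that a blevel of 1 is ignored when the predicate is an equality on all index columns.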
25. Adjusted cost
cost = (   (blevel + selectivity x leaf_blocks)
         x (1 - optimizer_index_caching/100)        -- the index bit
         + selectivity x clustering_factor          -- the table bit
       ) x optimizer_index_cost_adj / 100

The formula that Wolfgang Breitling proposed has
to be adjusted to handle the two 'fudge factor'
parameters. This formula seems to be about right,
and reproduces the earlier results (see below).
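Checking against the earlier experiments (for T2: index bit = 1, table bit = 15):

optimizer_index_cost_adj = 50:  (1 + 15) x 50/100 = 8    (slide 14: Cost=8)
optimizer_index_cost_adj = 25:  (1 + 15) x 25/100 = 4    (slide 14: Cost=4)
optimizer_index_caching = 100 (inner access to T1, index bit = 1, table bit = 1):
    1 x (1 - 100/100) + 1 = 1                            (slide 22: Cost=1)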
26. System Statistics (v9)
- dbms_stats.gather_system_stats('start')
- dbms_stats.gather_system_stats('stop')

SNAME          PNAME     PVAL1
SYSSTATS_MAIN  CPUSPEED    357    -- MHz
SYSSTATS_MAIN  SREADTIM  7.179    -- ms
SYSSTATS_MAIN  MREADTIM 18.559    -- ms
SYSSTATS_MAIN  MBRC          5
Single block reads are cheaper than multiblock
reads, and the MBRC is used in the tablescan cost.
In version 9 you need the 'fudge factors' less
(you could still use them as indicators of
caching) - instead, you let Oracle learn about
your hardware, as sketched below.
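A minimal sketch of the workload-statistics calls, and where the results land (sys.aux_stats$; suitable privileges assumed):

execute dbms_stats.gather_system_stats('start')
-- run a representative workload for a while, then:
execute dbms_stats.gather_system_stats('stop')

select sname, pname, pval1
from   sys.aux_stats$
where  sname = 'SYSSTATS_MAIN';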
27. Conclusions
- Understand your data
- Data distribution is important
- Think about your parameters
- Help Oracle with the truth
- Use system statistics in v9
28. Sort / Merge
select
    count(t1.v1)    ct_v1,
    count(t2.v2)    ct_v2
from big1 t1, big2 t2
where t2.n2 = t1.n1;

SELECT STATEMENT (choose) Cost (963)
  SORT (aggregate)
    MERGE JOIN Cost (963 = 174 + 789)
      SORT (join) Cost (174)
        TABLE ACCESS (analyzed) T1 (full) Cost (23)
      SORT (join) Cost (789)
        TABLE ACCESS (analyzed) T2 (full) Cost (115)
The cost of a sort-merge equi-join is typically
the cost of acquiring each of the two data sets,
plus the cost of making sure the two data sets
are sorted; the arithmetic is broken out below.
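Breaking the plan's figures apart - each SORT (join) cost includes the scan beneath it:

963 = 174 + 789
174 =  23 (scan T1) + 151 (sorting)
789 = 115 (scan T2) + 674 (sorting)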
29. Sort Merge
Once the two sets are in order, they can be
shuffled together. The shuffling can be quick -
the sorting may be the most expensive bit.
30. In-memory sort
In a merge join, even if the first sort completes
in memory, it will still dump the excess over
sort_area_retained_size to disc (and so will the
second sort). The relevant parameters are shown below.
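The pre-9 session parameters that control this, with purely illustrative values:

alter session set sort_area_size = 1048576;           -- memory for an active sort
alter session set sort_area_retained_size = 65536;    -- memory kept once the sort completes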
31. Big Sorts
A one-pass sort. The data has been read, sorted,
and dumped to disc in chunks, then re-read once
to be merged into order, and dumped again.
32. Huge Sorts
Multipass sort. After sorting the data in chunks,
Oracle was unable to re-read the top of every
chunk simultaneously, so we have multiple merge
passes.
33. Hash Join (1)
The first table is hashed in memory, the second
table is used to probe the hash (build) table for
matches. In simple cases the cost is easy to
calculate.
34. Hash Join (2)
If the smaller data set cannot be hashed in
memory, it is partitioned, mapped, and partly
dumped to disc. The larger data set is
partitioned in the same way.
35. Hash Join (3)
And if things go really wrong (bad statistics),
Oracle uses partitions which are too large - and
the probe (secondary) partitions are re-read many
times.
36. Version 9 approach
pga_aggregate_target = 500M
workarea_size_policy = auto

v$sysstat - sorts, hashes, bitmap creates (v9):
- workarea executions - optimal: the job completed in memory - perfect.
- workarea executions - onepass: the job required a dump to disc and a single re-read.
- workarea executions - multipass: data was dumped to disc and re-read more than once.
Under Oracle 9, you should set
workarea_size_policy to auto, and set
pga_aggregate_target to something big - the limit
per user is 5% of the target. You can check the
statistics as sketched below.
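A quick way to see how work areas have been completing:

select name, value
from   v$sysstat
where  name like 'workarea executions%';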
37. Conclusions 2
- Sort joins have catastrophe points
- Hash joins have catastrophe points
- Work to avoid multi-pass
- pga_aggregate_target helps (v9)