You - PowerPoint PPT Presentation

About This Presentation

Title:

You

Description:

Title: PowerPoint Presentation Last modified by: p1c1227 Created Date: 1/1/1601 12:00:00 AM Document presentation format: On-screen Show Other titles – PowerPoint PPT presentation

Number of Views:23

Avg rating:3.0/5.0

Slides: 54

Provided by: bobbydurr

Category:

more less

Transcript and Presenter's Notes

Title: You

1
Youre Smarter than a Database

Overcoming the optimizers bad cardinality
estimates

2
About me

Bobby Durrett
US Foodservice
bobbyandmarielle_at_yahoo.com
Scripts in http//www.geocities.com/bobbyandmariel
le/sqltuning.zip

3
What you know
4
What the database knows
5
Before SQL

Example - mainframe Datacom/DB COBOL
List index names
Write loops
read a from one index i1 where one.c10
while more table one rows exist get next row
read b from two index i2 where two.a one.a
while more table two rows exist get next row
print one.a,two.b
end while
end while

6
SQL

Tell what you want, not how to get it
select one.a,two.b
from
one,two
where
one.c10 and one.atwo.a

7
Pre-SQL versus SQL

Pre-SQL code very efficient runs in megabytes
VSE mainframe COBOL
Labor intensive
SQL can be inefficient runs in gigabytes (if
you are lucky!)
Much more productive do in minutes what took
hours before create tables

8
What the database doesnt know

Optimizer has a limited set of statistics that
describe the data
It can miscalculate the number of rows a query
will return, its cardinality
A cardinality error can lead optimizer to choose
a slow way to run the SQL

9
Example plan/Cardinality

-------------------------------------------------
Id Operation Name Rows Cost
-------------------------------------------------
0 SELECT STATEMENT 10 3
1 TABLE ACCESS FULL TEST1 10 3
-------------------------------------------------
Plan how Oracle will run your query
Rows how many rows optimizer thinks that step
will return
Cost estimate of time query will take, a
function of the number of rows

10
How to fix cardinality problems

Find out if it really is a cardinality issue
Determine the reason it occurred
Single column
Multiple columns
Choose a strategy
Give the optimizer more information
Override optimizer decision
Change the application

11
Four examples

Four examples of how the optimizer calculates
cardinality
Full scripts and their outputs on portal, pieces
on slides edited for simplicity

12
Step 1 Find out if it really is a cardinality
issue

Example 1
Data
select a,count() from test1 group by a
A COUNT()
---------- ----------
1 10
Query
select from test1 where a1

13
Step 1 Find out if it really is a cardinality
issue

Get estimated cardinality from plan
-------------------------------------------
Id Operation Name Rows
-------------------------------------------
0 SELECT STATEMENT 10
1 TABLE ACCESS FULL TEST1 10
-------------------------------------------
Do query for actual number of rows
select count() from test1 where a1

14
Step 1 Find out if it really is a cardinality
issue

Plan is a tree find cardinality and select
count() on part of query represented by that
part of plan.

15
Step 2 Understand the reason for the wrong
cardinality

Unequal distribution of data
Within a single column
Last name
Smith or Jones
Among multiple columns
Address
Zipcode and State

16
Step 2 Understand the reason for the wrong
cardinality

Example 2 - Unequal distribution of values in a
single column
1,000,000 rows with value 1
1 row with value 2
select a,count() from TEST2 group by a
A COUNT()
---------- ----------
1 1000000
2 1

17
Step 2 Understand the reason for the wrong
cardinality

SQL statement returns one row
select from TEST2 where a2

18
Step 2 Understand the reason for the wrong
cardinality

Plan with wrong number of rows 500,000
Full scan instead of range scan 100 times
slower
---------------------------------------------
Operation Name Rows
---------------------------------------------
SELECT STATEMENT 500K
INDEX FAST FULL SCAN TEST2INDEX 500K
---------------------------------------------

19
Step 2 Understand the reason for the wrong
cardinality

Column statistics two distinct values
LOW HIGH NUM_DISTINCT
---------- ---------- ------------
1 2 2
Table statistic total of rows 1,000,001
NUM_ROWS
----------
1000001

20
Step 2 Understand the reason for the wrong
cardinality

Rows in plan (rows in table)/ (distinct values
of column)
5000001000001/2
Optimizer knew that there were only two values
assumed they had equal number of rows

21
Step 2 Understand the reason for the wrong
cardinality

Example 3 - Combinations of column values not
equally distributed
1,000,000 rows with values 1,1
1,000,000 rows with values 2,2
1 row with value 1,2
Equal numbers of 1s and 2s in each column
A B COUNT()
---------- ---------- ----------
1 1 1000000
1 2 1
2 2 1000000

22
Step 2 Understand the reason for the wrong
cardinality

SQL statement retrieves one row
select sum(ab)
from TEST3
where
a1 and b2

23
Step 2 Understand the reason for the wrong
cardinality

Plan with wrong number of rows 500,000
Inefficient full scan
----------------------------------------------
Operation Name Rows
----------------------------------------------
SELECT STATEMENT 1
SORT AGGREGATE 1
INDEX FAST FULL SCAN TEST3INDEX 500K
----------------------------------------------

24
Step 2 Understand the reason for the wrong
cardinality

Column statistics
C LOW HIGH NUM_DISTINCT
- ---------- ---------- ------------
A 1 2 2
B 1 2 2
Table statistic total of rows 2,000,001
NUM_ROWS
----------
2000001

25
Step 2 Understand the reason for the wrong
cardinality

Rows in plan (rows in table)/ (distinct
values A distinct values B)
5000002000001/(2 2)
Optimizer assumes all four combinations
(1,1),(1,2),(2,1),(2,2) equally likely

26
Step 2 Understand the reason for the wrong
cardinality

How to tell which assumption is in play?
Select count() each column
select a,count() from TEST3 group by a
select b,count() from TEST3 group by b
count() each column combination
select a,b,count() from TEST3
group by a,b

27
Step 3 Choose the best strategy for fixing the
cardinality problem

Giving the optimizer more information
Histograms
SQL Profiles
Overriding optimizer decisions
Hints
Changing the application
Try to use optimizer as much as possible to
minimize development work

28
Step 3 Choose the best strategy for fixing the
cardinality problem

Giving the optimizer more information using
histograms
Works for unequal distribution within a single
column
A histogram records the distribution of values
within a column in up to 254 buckets
Works best on columns with fewer than 255
distinct values

29
Step 3 Choose the best strategy for fixing the
cardinality problem

Run gather_table_stats command to get histograms
on the column 254 is max number of buckets
method_optgt'FOR ALL COLUMNS SIZE 254'

30
Step 3 Choose the best strategy for fixing the
cardinality problem

Plan for Example 2 with correct number of rows
with histogram
Uses range scan
-----------------------------------------
Operation Name Rows
-----------------------------------------
SELECT STATEMENT 1
INDEX RANGE SCAN TEST2INDEX 1
-----------------------------------------

31
Step 3 Choose the best strategy for fixing the
cardinality problem

Column statistics two buckets
LOW HIGH NUM_DISTINCT
NUM_BUCKETS
---------- ---------- ------------ -----------
1 2 2 2
Table statistic unchanged
NUM_ROWS
----------
1000001

32
Step 3 Choose the best strategy for fixing the
cardinality problem

Time without histograms (1 second)
Elapsed 000001.00
Time with histograms(1/100th second)
Elapsed 000000.01

33
Step 3 Choose the best strategy for fixing the
cardinality problem

Giving the optimizer more information using SQL
Profiles
Works for unequal distribution among multiple
columns
Includes information about the relationship
between columns in the SQL correlated columns
or predicates

34
Step 3 Choose the best strategy for fixing the
cardinality problem

SQL Tuning Advisor gathers statistics on the
columns
...DBMS_SQLTUNE.CREATE_TUNING_TASK(...
...DBMS_SQLTUNE.EXECUTE_TUNING_TASK(...
Accept the SQL Profile it creates to use the new
statistics
...DBMS_SQLTUNE.ACCEPT_SQL_PROFILE (...

35
Step 3 Choose the best strategy for fixing the
cardinality problem

Example 3 plan with correct number of rows 1
using SQL profile
--------------------------------------------------
Operation Name Rows Bytes
--------------------------------------------------
SELECT STATEMENT 1 6
SORT AGGREGATE 1 6
INDEX RANGE SCAN TEST3INDEX 1 6
-------------------------------------------------

36
Step 3 Choose the best strategy for fixing the
cardinality problem

Time without a profile (1 second)
Elapsed 000001.09
Time with a profile(1/100th second)
Elapsed 000000.01

37
Step 3 Choose the best strategy for fixing the
cardinality problem

Overriding optimizer decisions using hints
Example 4 has unequal distribution of column
values across two tables histograms and SQL
Profiles dont work
Hint forces index range scan
Small amount of additional code not like Cobol
on mainframe

38
Step 3 Choose the best strategy for fixing the
cardinality problem

Example 4 - SMALL table
MANY relates to 1 there are many rows with
value 1
FEW relates to 2 there are few with value 2
insert into SMALL values ('MANY',1)
insert into SMALL values ('FEW',2)

39
Step 3 Choose the best strategy for fixing the
cardinality problem

Example 4 - LARGE table
1,000,000 rows with value 1
1 row with value 2
NUM COUNT()
---------- ----------
1 1000000
2 1

40
Step 3 Choose the best strategy for fixing the
cardinality problem

SQL statement returns one row
select B.NUM
from SMALL A,LARGE B
where
A.NUMB.NUM and
A.NAME'FEW'

41
Step 3 Choose the best strategy for fixing the
cardinality problem

Plan with wrong number of rows 125,000
----------------------------------------------
Operation Name Rows
----------------------------------------------
SELECT STATEMENT 125K
HASH JOIN 125K
TABLE ACCESS FULL SMALL 1
INDEX FAST FULL SCAN LARGEINDEX 1000K
----------------------------------------------

42
Step 3 Choose the best strategy for fixing the
cardinality problem

Column statistics two buckets on all columns
using histograms
LOW HIGH NUM_DISTINCT NUM_BUCKETS
---------- ---------- ------------ -----------
1 2 2 2
LOW HIGH NUM_DISTINCT NUM_BUCKETS
---- ---- ------------ -----------
FEW MANY 2 2

43
Step 3 Choose the best strategy for fixing the
cardinality problem

Table statistics SMALL has 2 rows, LARGE
1000001
NUM_ROWS
----------
2
NUM_ROWS
----------
1000001

44
Step 3 Choose the best strategy for fixing the
cardinality problem

1250001000001/8
Optimizer appears to assume all eight
combinations of the three columns values are
equally likely
Cant verify formula references dont include
formula with histograms
Even worse without histograms cardinality is
500000

45
Step 3 Choose the best strategy for fixing the
cardinality problem

No SQL profile from SQL Tuning Advisor
There are no recommendations to improve the
statement.
Neither histograms nor SQL profiles help example 4

46
Step 3 Choose the best strategy for fixing the
cardinality problem

Statement with hints
Use index
Dont do full scan
select / INDEX(B LARGEINDEX)
NO_INDEX_FFS(B LARGEINDEX) /
B.NUM
from SMALL A,LARGE B
where
a.NUMB.NUM and
A.NAME'FEW'

47
Step 3 Choose the best strategy for fixing the
cardinality problem

Time without a hint (1 second)
Elapsed 000001.03
Time with a hint (1/100th second)
Elapsed 000000.01

48
Step 3 Choose the best strategy for fixing the
cardinality problem

Changing the application
Change your tables so that the optimizer gets
your SQLs cardinality right
Requires more work designing tables, but keeps
productivity benefits of SQL

49
Step 3 Choose the best strategy for fixing the
cardinality problem

Example 4 moved NAME column to LARGE table and
split table in two
One million (MANY,1) rows in LARGEA
One (FEW,2) row in LARGEB
Query
select NUM
from (select from largea
union
select from largeb)
where
NAME'FEW'

50
Step 3 Choose the best strategy for fixing the
cardinality problem

Plan is just as efficient as with hint
Number of rows 2 (reality is 1)
Range Scan
--------------------------------------------------
------------
Id Operation Name
Rows
--------------------------------------------------
------------
0 SELECT STATEMENT
2
1 VIEW
2
2 SORT UNIQUE
2
3 UNION-ALL
4 TABLE ACCESS BY INDEX ROWID LARGEA
1
5 INDEX RANGE SCAN
LARGEAINDEX 1
6 TABLE ACCESS BY INDEX ROWID LARGEB
1
7 INDEX RANGE SCAN
LARGEBINDEX 1
--------------------------------------------------
------------

51
Step 3 Choose the best strategy for fixing the
cardinality problem

Time without table change (1 second)
Elapsed 000001.03
Time with table change (1/100th second)
Elapsed 000000.01

52
Conclusion

SQL improves productivity, optimizer has limits
Identify cases where cardinality is wrong
Understand why the database got it wrong
One column
Multiple columns
Choose best strategy to fix
Give optimizer more info
Override optimizers choices
Redesign tables

53
References

Cost Based Optimizer Fundamentals, Jonathan Lewis
Metalink Note212809.1, Limitations of the Oracle
Cost Based Optimizer
Metalink Note68992.1, Predicate Selectivity
Histograms Myths and Facts, Wolfgang Breitling
Select Journal,
Volume 13, Number 3

Write a Comment

User Comments (0)