You - PowerPoint PPT Presentation

About This Presentation
Title:

You

Description:

Title: PowerPoint Presentation Last modified by: p1c1227 Created Date: 1/1/1601 12:00:00 AM Document presentation format: On-screen Show Other titles – PowerPoint PPT presentation

Number of Views:23
Avg rating:3.0/5.0
Slides: 54
Provided by: bobbydurr
Category:

less

Transcript and Presenter's Notes

Title: You


1
Youre Smarter than a Database
  • Overcoming the optimizers bad cardinality
    estimates

2
About me
  • Bobby Durrett
  • US Foodservice
  • bobbyandmarielle_at_yahoo.com
  • Scripts in http//www.geocities.com/bobbyandmariel
    le/sqltuning.zip

3
What you know
4
What the database knows
5
Before SQL
  • Example - mainframe Datacom/DB COBOL
  • List index names
  • Write loops
  • read a from one index i1 where one.c10
  • while more table one rows exist get next row
  • read b from two index i2 where two.a one.a
  • while more table two rows exist get next row
  • print one.a,two.b
  • end while
  • end while

6
SQL
  • Tell what you want, not how to get it
  • select one.a,two.b
  • from
  • one,two
  • where
  • one.c10 and one.atwo.a

7
Pre-SQL versus SQL
  • Pre-SQL code very efficient runs in megabytes
    VSE mainframe COBOL
  • Labor intensive
  • SQL can be inefficient runs in gigabytes (if
    you are lucky!)
  • Much more productive do in minutes what took
    hours before create tables

8
What the database doesnt know
  • Optimizer has a limited set of statistics that
    describe the data
  • It can miscalculate the number of rows a query
    will return, its cardinality
  • A cardinality error can lead optimizer to choose
    a slow way to run the SQL

9
Example plan/Cardinality
  • -------------------------------------------------
  • Id Operation Name Rows Cost
  • -------------------------------------------------
  • 0 SELECT STATEMENT 10 3
  • 1 TABLE ACCESS FULL TEST1 10 3
  • -------------------------------------------------
  • Plan how Oracle will run your query
  • Rows how many rows optimizer thinks that step
    will return
  • Cost estimate of time query will take, a
    function of the number of rows

10
How to fix cardinality problems
  • Find out if it really is a cardinality issue
  • Determine the reason it occurred
  • Single column
  • Multiple columns
  • Choose a strategy
  • Give the optimizer more information
  • Override optimizer decision
  • Change the application

11
Four examples
  • Four examples of how the optimizer calculates
    cardinality
  • Full scripts and their outputs on portal, pieces
    on slides edited for simplicity

12
Step 1 Find out if it really is a cardinality
issue
  • Example 1
  • Data
  • select a,count() from test1 group by a
  • A COUNT()
  • ---------- ----------
  • 1 10
  • Query
  • select from test1 where a1

13
Step 1 Find out if it really is a cardinality
issue
  • Get estimated cardinality from plan
  • -------------------------------------------
  • Id Operation Name Rows
  • -------------------------------------------
  • 0 SELECT STATEMENT 10
  • 1 TABLE ACCESS FULL TEST1 10
  • -------------------------------------------
  • Do query for actual number of rows
  • select count() from test1 where a1

14
Step 1 Find out if it really is a cardinality
issue
  • Plan is a tree find cardinality and select
    count() on part of query represented by that
    part of plan.

15
Step 2 Understand the reason for the wrong
cardinality
  • Unequal distribution of data
  • Within a single column
  • Last name
  • Smith or Jones
  • Among multiple columns
  • Address
  • Zipcode and State

16
Step 2 Understand the reason for the wrong
cardinality
  • Example 2 - Unequal distribution of values in a
    single column
  • 1,000,000 rows with value 1
  • 1 row with value 2
  • select a,count() from TEST2 group by a
  • A COUNT()



















  • ---------- ----------



















  • 1 1000000



















  • 2 1

17
Step 2 Understand the reason for the wrong
cardinality
  • SQL statement returns one row
  • select from TEST2 where a2

18
Step 2 Understand the reason for the wrong
cardinality
  • Plan with wrong number of rows 500,000
  • Full scan instead of range scan 100 times
    slower
  • ---------------------------------------------
  • Operation Name Rows
  • ---------------------------------------------
  • SELECT STATEMENT 500K
  • INDEX FAST FULL SCAN TEST2INDEX 500K
  • ---------------------------------------------

19
Step 2 Understand the reason for the wrong
cardinality
  • Column statistics two distinct values
  • LOW HIGH NUM_DISTINCT
  • ---------- ---------- ------------
  • 1 2 2
  • Table statistic total of rows 1,000,001
  • NUM_ROWS
  • ----------
  • 1000001

20
Step 2 Understand the reason for the wrong
cardinality
  • Rows in plan (rows in table)/ (distinct values
    of column)
  • 5000001000001/2
  • Optimizer knew that there were only two values
    assumed they had equal number of rows

21
Step 2 Understand the reason for the wrong
cardinality
  • Example 3 - Combinations of column values not
    equally distributed
  • 1,000,000 rows with values 1,1
  • 1,000,000 rows with values 2,2
  • 1 row with value 1,2
  • Equal numbers of 1s and 2s in each column
  • A B COUNT()
  • ---------- ---------- ----------
  • 1 1 1000000
  • 1 2 1
  • 2 2 1000000

22
Step 2 Understand the reason for the wrong
cardinality
  • SQL statement retrieves one row
  • select sum(ab)
  • from TEST3
  • where
  • a1 and b2

23
Step 2 Understand the reason for the wrong
cardinality
  • Plan with wrong number of rows 500,000
  • Inefficient full scan
  • ----------------------------------------------
  • Operation Name Rows
  • ----------------------------------------------
  • SELECT STATEMENT 1
  • SORT AGGREGATE 1
  • INDEX FAST FULL SCAN TEST3INDEX 500K
  • ----------------------------------------------

24
Step 2 Understand the reason for the wrong
cardinality
  • Column statistics
  • C LOW HIGH NUM_DISTINCT
  • - ---------- ---------- ------------
  • A 1 2 2
  • B 1 2 2
  • Table statistic total of rows 2,000,001
  • NUM_ROWS
  • ----------
  • 2000001

25
Step 2 Understand the reason for the wrong
cardinality
  • Rows in plan (rows in table)/ (distinct
    values A distinct values B)
  • 5000002000001/(2 2)
  • Optimizer assumes all four combinations
    (1,1),(1,2),(2,1),(2,2) equally likely

26
Step 2 Understand the reason for the wrong
cardinality
  • How to tell which assumption is in play?
  • Select count() each column
  • select a,count() from TEST3 group by a
  • select b,count() from TEST3 group by b
  • count() each column combination
  • select a,b,count() from TEST3
  • group by a,b

27
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Giving the optimizer more information
  • Histograms
  • SQL Profiles
  • Overriding optimizer decisions
  • Hints
  • Changing the application
  • Try to use optimizer as much as possible to
    minimize development work

28
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Giving the optimizer more information using
    histograms
  • Works for unequal distribution within a single
    column
  • A histogram records the distribution of values
    within a column in up to 254 buckets
  • Works best on columns with fewer than 255
    distinct values

29
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Run gather_table_stats command to get histograms
    on the column 254 is max number of buckets
  • method_optgt'FOR ALL COLUMNS SIZE 254'

30
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Plan for Example 2 with correct number of rows
    with histogram
  • Uses range scan
  • -----------------------------------------
  • Operation Name Rows
  • -----------------------------------------
  • SELECT STATEMENT 1
  • INDEX RANGE SCAN TEST2INDEX 1
  • -----------------------------------------

31
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Column statistics two buckets
  • LOW HIGH NUM_DISTINCT
    NUM_BUCKETS
  • ---------- ---------- ------------ -----------
  • 1 2 2 2
  • Table statistic unchanged
  • NUM_ROWS
  • ----------
  • 1000001

32
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Time without histograms (1 second)
  • Elapsed 000001.00
  • Time with histograms(1/100th second)
  • Elapsed 000000.01

33
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Giving the optimizer more information using SQL
    Profiles
  • Works for unequal distribution among multiple
    columns
  • Includes information about the relationship
    between columns in the SQL correlated columns
    or predicates

34
Step 3 Choose the best strategy for fixing the
cardinality problem
  • SQL Tuning Advisor gathers statistics on the
    columns
  • ...DBMS_SQLTUNE.CREATE_TUNING_TASK(...
  • ...DBMS_SQLTUNE.EXECUTE_TUNING_TASK(...
  • Accept the SQL Profile it creates to use the new
    statistics
  • ...DBMS_SQLTUNE.ACCEPT_SQL_PROFILE (...

35
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Example 3 plan with correct number of rows 1
    using SQL profile
  • --------------------------------------------------
  • Operation Name Rows Bytes
  • --------------------------------------------------
  • SELECT STATEMENT 1 6
  • SORT AGGREGATE 1 6
  • INDEX RANGE SCAN TEST3INDEX 1 6
  • -------------------------------------------------

36
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Time without a profile (1 second)
  • Elapsed 000001.09
  • Time with a profile(1/100th second)
  • Elapsed 000000.01

37
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Overriding optimizer decisions using hints
  • Example 4 has unequal distribution of column
    values across two tables histograms and SQL
    Profiles dont work
  • Hint forces index range scan
  • Small amount of additional code not like Cobol
    on mainframe

38
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Example 4 - SMALL table
  • MANY relates to 1 there are many rows with
    value 1
  • FEW relates to 2 there are few with value 2
  • insert into SMALL values ('MANY',1)
  • insert into SMALL values ('FEW',2)

39
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Example 4 - LARGE table
  • 1,000,000 rows with value 1
  • 1 row with value 2
  • NUM COUNT()
  • ---------- ----------
  • 1 1000000
  • 2 1

40
Step 3 Choose the best strategy for fixing the
cardinality problem
  • SQL statement returns one row
  • select B.NUM
  • from SMALL A,LARGE B
  • where
  • A.NUMB.NUM and
  • A.NAME'FEW'

41
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Plan with wrong number of rows 125,000
  • ----------------------------------------------
  • Operation Name Rows
  • ----------------------------------------------
  • SELECT STATEMENT 125K
  • HASH JOIN 125K
  • TABLE ACCESS FULL SMALL 1
  • INDEX FAST FULL SCAN LARGEINDEX 1000K
  • ----------------------------------------------

42
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Column statistics two buckets on all columns
    using histograms
  • LOW HIGH NUM_DISTINCT NUM_BUCKETS
  • ---------- ---------- ------------ -----------
  • 1 2 2 2
  • LOW HIGH NUM_DISTINCT NUM_BUCKETS
  • ---- ---- ------------ -----------
  • FEW MANY 2 2

43
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Table statistics SMALL has 2 rows, LARGE
    1000001
  • NUM_ROWS
  • ----------
  • 2
  • NUM_ROWS
  • ----------
  • 1000001

44
Step 3 Choose the best strategy for fixing the
cardinality problem
  • 1250001000001/8
  • Optimizer appears to assume all eight
    combinations of the three columns values are
    equally likely
  • Cant verify formula references dont include
    formula with histograms
  • Even worse without histograms cardinality is
    500000

45
Step 3 Choose the best strategy for fixing the
cardinality problem
  • No SQL profile from SQL Tuning Advisor
  • There are no recommendations to improve the
    statement.
  • Neither histograms nor SQL profiles help example 4

46
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Statement with hints
  • Use index
  • Dont do full scan
  • select / INDEX(B LARGEINDEX)
  • NO_INDEX_FFS(B LARGEINDEX) /
  • B.NUM
  • from SMALL A,LARGE B
  • where
  • a.NUMB.NUM and
  • A.NAME'FEW'

47
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Time without a hint (1 second)
  • Elapsed 000001.03
  • Time with a hint (1/100th second)
  • Elapsed 000000.01

48
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Changing the application
  • Change your tables so that the optimizer gets
    your SQLs cardinality right
  • Requires more work designing tables, but keeps
    productivity benefits of SQL

49
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Example 4 moved NAME column to LARGE table and
    split table in two
  • One million (MANY,1) rows in LARGEA
  • One (FEW,2) row in LARGEB
  • Query
  • select NUM
  • from (select from largea
  • union
  • select from largeb)
  • where
  • NAME'FEW'

50
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Plan is just as efficient as with hint
  • Number of rows 2 (reality is 1)
  • Range Scan
  • --------------------------------------------------
    ------------
  • Id Operation Name
    Rows
  • --------------------------------------------------
    ------------
  • 0 SELECT STATEMENT
    2
  • 1 VIEW
    2
  • 2 SORT UNIQUE
    2
  • 3 UNION-ALL
  • 4 TABLE ACCESS BY INDEX ROWID LARGEA
    1
  • 5 INDEX RANGE SCAN
    LARGEAINDEX 1
  • 6 TABLE ACCESS BY INDEX ROWID LARGEB
    1
  • 7 INDEX RANGE SCAN
    LARGEBINDEX 1
  • --------------------------------------------------
    ------------

51
Step 3 Choose the best strategy for fixing the
cardinality problem
  • Time without table change (1 second)
  • Elapsed 000001.03
  • Time with table change (1/100th second)
  • Elapsed 000000.01

52
Conclusion
  • SQL improves productivity, optimizer has limits
  • Identify cases where cardinality is wrong
  • Understand why the database got it wrong
  • One column
  • Multiple columns
  • Choose best strategy to fix
  • Give optimizer more info
  • Override optimizers choices
  • Redesign tables

53
References
  • Cost Based Optimizer Fundamentals, Jonathan Lewis
  • Metalink Note212809.1, Limitations of the Oracle
    Cost Based Optimizer
  • Metalink Note68992.1, Predicate Selectivity
  • Histograms Myths and Facts, Wolfgang Breitling
    Select Journal,
    Volume 13, Number 3
Write a Comment
User Comments (0)
About PowerShow.com