Storage Operations Management Work Plan - PowerPoint PPT Presentation

1 / 133
About This Presentation
Title:

Storage Operations Management Work Plan

Description:

Add Intelligence. Kyle Hailey. Kyle.Hailey_at_oracle.com ... Possible group by Sessions, Modules, Actions, Client id, Service or Machine. Machine Statistics ... – PowerPoint PPT presentation

Number of Views:114
Avg rating:3.0/5.0
Slides: 134
Provided by: gajakrishn
Category:

less

Transcript and Presenter's Notes

Title: Storage Operations Management Work Plan


1
1
2
2
3
3
4
4
5
5
6
6
7
7
8
8
9
9
10
10
11
11
12
Reduce Data Sprawl
13
Add Intelligence
14
Using Graphics to Tune Oracle
  • Saturday, October 10, 2009
  • Kyle Hailey
  • Kyle.Hailey_at_oracle.com

15
Overview
16
Target Audience Production DBA
  • A Developer would have a different view

17
Performance Anxiety
  • When the manager is asking why is the
  • Response is slow
  • Database is hung
  • Batch job behind schedule
  • How do you respond?

18
Respond Quickly and Confidently
  • Like the Doctor
  • Test collect Data
  • Diagnosis - assessment
  • Cure fix problems

19
A New Approach to Oracle Tuning
  • Old OEM Design Approach
  • New OEM Design Approach
  • Macintosh Design Approach
  • Result of Good Design
  • Our Goal

20
Old Design Method
  • List All Tuning data
  • Create screens for all data
  • Verify that all data is displayed
  • gt Overwhelming and confusing

21
OEMs New Design Method
  • Start with Typical Problems
  • Identify Solutions
  • Design a Visual Representation
  • gt Put intelligence into the interface

22
Macintosh Method
  • What do people do
  • How do they do it
  • Design to support it
  • gt can be used without reading a manual

23
Without Reading a Manual
  • Handsprings site crashed Nov 25, 1999
  • Biggest Sales Day of the Year
  • Library cache latch contention
  • No DBAs
  • Downloaded Quests Spotlight
  • Installed and Identified problem with Minutes
  • Solution in code fix

24
Goal
  • Make Oracle Tuning easy for Anyone

25
Manageability
Obstacles and Approaches
26
What Data to Collect ?
  • On NT 9iR2
  • Oracle Tuning Data in 9iR2 NT
  • Stats 248
  • Waits 361
  • Latches 239
  • Parameters 257
  • Hidden Parameters 540
  • Total 1645
  • As well as
  • 259 V tables
  • 394 X tables
  • And machine data

27
Statspack Collects the Data
  • Standard for Data Collection
  • Selects from 44 x, v and dba tables and views
  • 1000-1500 lines output

28
But, How to Analyze the Statspack Data?
  • One page Summary
  • Waits are Emphasized to Guide Analysis
  • Despite that, Data can be confusing to Analyze
  • Oracle Press book is 600 pages
  • Users still get led astray
  • Recent mail
  • IO waits dominate
  • one SQL was 99 of the IO
  • still concerned about parsing.

29
Origin of Problem Tuning Rules Lists
  • Originated from Oracle Version 6
  • Calculate ratios to deduce bottlenecks
  • Often lead down the wrong path
  • Result tuning inconsequential aspects

30
Wait Interface The Guide
  • Introduced in Version 7
  • Pinpoints performance Bottlenecks
  • Used by Statspack
  • Needs a clearer presentation than Statspack

31
Silver Bullet Myth
  • _go_fastertrue
  • Coping with overwhelming amount of data
  • Oracle Expert type products
  • Not All Solutions are Known

32
Combine Analysis with Diagnostics
  • Known Solutions
  • Diagnostics Methods
  • Powerful Graphic Interface
  • Flexibility to Grow with Changes

33
Graphics
the Advantage
34
Why Use Graphics
  •  
  • You can't imagine how many times I was told that
    nobody wanted or would use graphics now
    everybody uses them.
  • -- Jef Raskin, the creator of the Macintosh

35
Macintosh Paradigm
  • Introduced Graphical Windows
  • Introduced the use of the mouse
  • Revolutionized the Interface to the computer
  • First computer I used without Reading the Manual
  • Manageable computer

36
Oracle and Manageability
  • Microsoft's perceived ease of use
  • Competition Brings Innovation
  • Oracle is now tackling manageability
  • BMC, Quest, Precise improve the manageability
    perception of Oracle

37
Why Use Graphics in Oracle Tuning
  • The human nervous system is exceptionally
    good at parsing visual information, especially
    when that information is coded by color and/or
    motion.
  • Knowledge representation in cognitive
    science. Westbury, C. Wilensky, U. (1998)

38
Why Use Graphics in Oracle Tuning
  • Increase comprehension of Data by
  • Correlating
  • Increasing data Density
  • Providing known Solutions
  • Guiding Investigation

39
Parsability of Graphics
  • Infocus (overhead projectors) sited a study
    that humans can parse graphical information
    400,000 times faster than textual data

40
Trends What are the Trends?
41
Trends With Graphics
I II
III IV
42
Trends Waits Over Time
  • date 1713 1713 1714 1714 1715 1715 1716 17
    16 1717 1717 1718 1718 1719 1719 1720 1720
    1721 1721 1722 1722 1723 1723 1724 1724 1
    725 1725 1726 1726 1727 1727 1728 1728 17
    29 1729 1730 1730 1731 1731 1732 1732 1733
    1733 1734 1734 1735 1735 1736 1736 1737 1
    737 1738 1738 1739 1739 1740 1740 1741 17
    41 1742
  • log_file_switch_completion_3 0 0 0 0 2982936 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 237642 0 0 0 0 0 0 0 0 0 0 0 0 210041
    9 0 0 0 0
  • log_buffer_space_3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 6478221 0 14945538 5219836
    0 0
  • QLNet_break/reset_to_client_3 8501199 11385539 70
    44359 8234690 6183237 2333297 7776659 6118198 6495
    648 7518489 11455326 7702865 11085762 7634300 1058
    2833 13260400 8544422 13172524 8502041 12010958 92
    08533 9384118 10208957 7399676 6747457 6231611 134
    41429 7344615 8761884 6642950 7256315 7073623 8442
    915 7060399 7485662 6601072 6349018 9929440 490223
    0 6079359 5751021 9455790 9087866 9341209 7664906
    16499162 10234535 10588829 8638014 8918795 4312962
    8056957 4944408 7195059 3332565 3154348 5530740 4
    965315 7011341 5248675 6846116 6740378 5817539 901
    2238 7464656 9055304 8781313 9115954 9858761
  • latch_free_3 4306495 1931327 3327222 2181023 17640
    23 3238736 1637877 1922399 1786672 2312507 2308379
    3 1091835 1862666 6949805 6453080 781313 2767384 1
    164596 2249654 440342 2076414 4062214 2408959 2886
    800 2177722 3332627 1237003 2098221 5986478 828048
    2570183 2104943 2686288 2259440 2388341 2044871 1
    462208 2306024 1635624 2132461 783102 2677123 4743
    636 1292606 1499421 2439544 2753717 1905407 301793
    3 1651176 3686294 1088209 922402 2319624 2645539 1
    260079 4316997 1311242 2087917
  • rdbms_ipc_reply_3 27920139 31888722 29909917 27903
    331 27939736 29891410 29901186 29953756 27895419 2
    9879828 28086221 29906822 27903067 29915873 298963
    07 31878918 28271932 29883910 29885870 29906113 27
    933209 27794807 29879997 28842040 30138759 1210295
    4 0 0 0 0 0 0 0 0 0 0 0 4681562 0 0 1018 22981 0 0
    0 383 0 0 0 0 0 0 0 0 0 0 0 3079 0 700 0 604 0 0
    42240 0 0 0 508 2702
  • log_file_sync_3 4897094 5984818 8773859 6732708 25
    28699 21640100 4667088 9349937 6102384 5648418 162
    38557 9202562 4214425 12911555 8677281 1982443 554
    9138 2978201 4361914 2924674 3969540 4608851 44990
    43 5563593 7480457 8736114 4087525 7916623 9531711
    4204114 4002465 4230277 4795701 6324440 4740469 5
    123660 2891044 8931372 4254477 6177218 5119836 470
    1245 9094181 6751040 6969569 8678671 10627609 8452
    116 13505152 9052070 9199762 6939591 39153732 1897
    4176 71305938 25005388 6781095 3927165 enqueue_3 3
    9067 87598 58429 93754 38807 270867 64623 76158 12
    5199 39092426 73909741 82618 78157 12192579 300841
    39 30154100 30426042 30034564 30058775 30062114 27
    062239 27585615 30040241 23741759 139008 75112 467
    78 78246 30794 93934 850847 83964 91959 572765 189
    38 214037 112043 93392 513358 80052 121984 148309
    1683718 73709 33330 73894 476811 2072430 34138 402
    73 88209 512731 460956 492492 2979781 50846 80892
    84755 248836
  • db_file_sequential_read_3 107039051 109566587 1270
    08942 97033615 74615747 70728654 109231682 9375567
    6 86966703 117638046 76065788 88137834 66759488 89
    709570 80277614 50762622 58207938 61815450 6561058
    0 51599402 74260828 78138965 49867005 21069074 176
    80159 11460495 12467891 33786092 33211896 40736046
    49212320 41162084 45786601 24122201 55651100 3431
    6957 11773008 62455231 48897870 67248119 64077462
    88899277 89622175 34483526 57402401 35319735 30619
    562 61707668 34514340 39504714 37681062 34165368 2
    4952296 30172023 25510144 40260331 83215876 784503
    52 56963813 33846491 28869234 52964974 38161443 43
    878778 31095295 26018747 58239638 19793917 7914975
  • itch_(checkpoint_incomplete)_3 0 0 0 0 285950959 2
    47915390 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0

43
Trends Waits over Time
44
Quantity Map Alone is 20,000 points
45
Quantity Extent Mapping
46
Quantity Extent Mapping with Mouse
47
Quantity Seeing Patterns Extent Anomalies
48
Quantity Zoom Out, Over 8000 data pts
49
Correlating Data
50
PerspectiveTufte
51
PerspectiveTufte
52
Perspective Oracle
53
Perspective Oracle
54
Clutter
55
Clear Concise
56
Tuning Lists are Like Full Table Scans
  • Examples of tuning lists

Rule 1
Rule 2
Rule 3
Rule 4
Rule 5
Rule 6
Rule 7
Rule 8
Rule 9
Rule 10
Rule 11

57
Use an Index on the Tuning Rules
58
Guiding User Shallow Index
A GUI can prune the decision by highlighting
and/or guiding the user (like a shallow index
tree)
59
Keeping Context - Schema
A GUI can keep context giving keeping making the
tuning process seem less like Labyrinth
60
Keeping Contexkt - Example
61
Extending Visual Metaphors
  • Graphics can increase the density and clarity

62
Buffer Cache Advisor Example
63
Ex Buffer Cache with Graph
64
Using Visual Metaphors
65
Larger Contention Area
66
Providing Interactivity Simulations
67
Attempts at Graphics
68
First Try 1994 Perot/Europcar
  • Goal no interaction needed
  • Visible across the room
  • Used statistics shown in dials and bar graphs
  • Used TCL/TK

69
Second try at a Graphical Monitor
  • TCL/TK front end that communicated over TCP/IP
    with daemon attached to the SGA and reading out
    Oracle statistics data
  • Latch Vlatch
  • Library Cache Vlibrary Cache
  • Statistics vsysstat
  • Waits vsystem_event
  • Rollback Segments vrollstat
  • Row Cache vrowcache

70
Data
  • Values are Cumulative
  • Necessitates snapshots
  • Deltas need to be calculated between snapshots
  • Normalizing Data needed to compare any interval

71
Monitor Everything Approach
72
Wait Events Became the Focus
73
Improvements
  • Scale Graphs Equally
  • Eliminate
  • Background waits
  • Idle Waits
  • Extraneous waits
  • Wait counts

74
With Improvements
75
160K Log Buffer
76
50M Log Files
77
2000 Block_buffers
78
Log Files on Raw
79
4000 Block Buffers
80
8000 Block Buffers
81
Wait vs CPU over tuning tests
http//oraperf.sourceforge.net/seminar/examples.ht
ml
82
Quest Paradigm stacking Waits
83
GUI Buffer Busy Waits
84
GUI Log File Switch
85
GUI Log Buffer Space
86
GUI Latch and Log Sync
87
GUI All Tests 8.1.7
88
What do CPU and Wait Mean ?
89
Service Time
90
Service Time over 5 sec Snapshots
91
Two Users Sharing the CPU
92
Two Users Combining the Service Times
93
Displaying CPU and Wait Time
94
Wait Bound System
95
CPU Bound System
96
Resulting Graphic Interface
97
TOP Triangle of Oracle Performance
HOST
System
ORACLE
System
SQL
Users
Sessions
SQL
98
Methodology
  • 1) Machine running correctly
  • Run Q and Paging
  • 2) Oracle instance correctly configure
  • Compare Waits to CPU
  • 3) Drill down to SQL and/or Sessions
  • Root of all problems
  • Possible group by Sessions, Modules, Actions,
    Client id, Service or Machine

99
Machine Statistics
  • Which Machine Statistics to use?
  • CPU
  • User
  • system
  • Memory
  • Size
  • Free
  • Swap
  • IO Wait
  • Network

100
1 Machine Statistics Core
  • Only need
  • Paging
  • Run Queue
  • Paging and Run Queue can cause latch free and
    unreliable CPU statistics in Oracle and are not
    directly detectible in Oracle statistics. Other
    Machine statistics such as Network and I/O
    problems can be seen directly in Oracle
    statistics.

101
2 Oracle System Level
  • Compare WAIT time to CPU
  • If CPU bound look at CPU consumers
  • IF wait bound drilldown to data relevant to the
    highest wait type

102
3 Oracle Drill Down SQL
  • SQL is the source of All problems
  • Aggregate SQL usage by Session
  • Possibly Aggregate by
  • Machine
  • Module
  • Action
  • Client ID
  • Service

103
Putting it Together Graphically
1
2
3
104
1) Host Information
  • Run Queue
  • top sql ( and top user etc)
  • Paging
  • PGA Advisor
  • Shared Pool Advisor
  • Buffer Cache Advisor
  • Else improve Hardware

105
2) Oracle System View
106
3) Oracle Correlation Data
107
Sampling
108
Snapshot Method has Limitations
  • Performance Page populated by Snapshoting
  • Snapshoting is prohibitively expensive for the
    drill down data

109
Snapshots and Drilldown Data
  • Top SQL by CPU
  • Expensive
  • Top SQL by Wait
  • Doesnt exist
  • Top Session by CPU
  • Feasible
  • Top Session by wait
  • Expensive
  • Top Object
  • Expensive (could be thousands)

110
Sampling Solution for Collecting Data
  • Based on VSESSION_WAIT
  • Allows easy Aggregation by
  • Session
  • SQL
  • Machine
  • Module
  • Action
  • Client Id
  • Identifies SQL, Users and Objects to track

111
Sample Data Example
112
Sampled Data Visually
113
Sampled Data for Top SQL
114
Developing a Drilldown
User I/O Example
115
Determine I/O Problems
  • Problem SQL
  • Hot File or Disk
  • Hot Objects due to Sparseness or Row Chaining
  • Undersized Buffer Cache

116
Determine Solutions
  • Tune SQL
  • Redistribute I/O load or change I/O subsystem
  • Rebuild indexes or Tables
  • Increase Buffer Cache size

117
Design Visual Interface
  • First, Orient the Viewer , give a context

118
Generic Drilldown for Waits
  • SQL

119
I/O Specific Drilldowns File Response
120
I/O Specific Drilldowns Hot Objects
121
I/O Specific Drilldowns Buffer Cache
122
Story Board
123
IO Wait Problem
124
IO Look at Wait Groups
125
IO Click on Problem Group
126
IO Drill Down Problem SQL
127
IO Drill Down Problem SQL FlyOvers
128
IO Drill Down Buffer Cache Size
129
IO Drill Down Problem Disks
130
CPU Bound
131
CPU Bound SQL?
132
CPU Bound Users
133
CPU Drill Down on User
Write a Comment
User Comments (0)
About PowerShow.com