Title: PerfMan for Tape Libraries Ned Diehl
1The power behind great IT decisions
2A Statistical Method to Extract Value From RMON Data
John G. DeRosa, Jr.
The Information Systems Manager, Inc.
john.derosa_at_perfman.com
www.perfman.com
610-865-0300
3Contents
- An Introduction to SNMP
- RMON
- RMON Etherstats, Host, and Matrix Data
- Intelligent RMON Data Limitation
- Etherstats and Network Utilization
- A VERY brief review of Statistical Data Variability
- Host Data Limitation
- Matrix Data Limitation
4The Network: Is There a Problem?
[Diagram: two Routers and four Switches joined by "A Lot More Network"]
5An Introduction to SNMP
- SNMP (Simple Network Management Protocol)
- A TCP/IP network management request/response protocol (a set of rules) that defines the communication of information between the SNMP Manager and SNMP Agent.
- Provides a structure for the definition of the managed network information
6An Introduction to SNMP
- SNMP Agent
- Software or firmware that runs on the network and is responsible for maintaining the managed information and delivering it to the SNMP Manager
- SNMP Manager
- Software that manages the agent and acts upon the data delivered from the agent.
7An Introduction to SNMP
[Diagram: the SNMP Manager sends a Request to, and receives a Response from, the Embedded SNMP Agent in a Network Switch and a Software SNMP Agent]
8An Introduction to SNMP
- MIB (Management Information Base)
- Describes how the SNMP Agent gathers and stores the network information
- Assigns a unique OID (Object Identifier) to each piece of data
- May be industry standard MIBs (from the IETF, the Internet Engineering Task Force) or proprietary MIBs
- Can be read as a document and used as input in SNMP software tools
9An Introduction to SNMP The OID Tree
ccitt(0)
iso(1)
  standard(0)
  registration-authority(1)
  member-body(2)
  identified-organization(3)
    dod(6)
      internet(1)
        directory(1)
        mgmt(2)
          mib2(1)
            system(1) interface(2) ip(4) icmp(5) tcp(6) udp(7) egp(8) transmission(10) snmp(11) RMON(16)
        experimental(3)
        private(4)
        security(5)
        snmpv2(6)
joint-iso-ccitt(2)
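As a rough illustration of how a dotted OID maps onto the tree above, the following Python sketch names the leading part of an OID. The `OID_NAMES` table and the `describe_oid` function are hypothetical names introduced here, and the table models only the branch from iso(1) down to RMON(16) as it appears on the slide.

```python
# Hypothetical, partial model of the OID tree: each key is a path of
# sub-identifiers from the root, each value is the name of that node.
OID_NAMES = {
    (1,): "iso",
    (1, 3): "identified-organization",
    (1, 3, 6): "dod",
    (1, 3, 6, 1): "internet",
    (1, 3, 6, 1, 2): "mgmt",
    (1, 3, 6, 1, 2, 1): "mib2",
    (1, 3, 6, 1, 2, 1, 16): "RMON",
}


def describe_oid(oid: str) -> str:
    """Translate the modeled leading part of a dotted OID into node names."""
    parts = tuple(int(p) for p in oid.strip(".").split("."))
    names = []
    for depth in range(1, len(parts) + 1):
        name = OID_NAMES.get(parts[:depth])
        if name is None:
            # Past the modeled part of the tree: keep the raw numbers.
            names.extend(str(p) for p in parts[depth - 1:])
            break
        names.append(name)
    return ".".join(names)


print(describe_oid("1.3.6.1.2.1.16.1.1.1.4"))
```

For the Etherstats Octets OID this prints the named path down to RMON followed by the unmodeled remainder `1.1.1.4`.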
10RMON
- RMON (Remote Monitoring)
- An industry standard MIB that describes the monitoring of network activity
- Agents usually found as embedded software in switches and routers
- Composed of nine groups of information
- A given RMON agent may contain all or a subset of the nine groups
11RMON
- RMON's Nine Groups
- 1. Statistics (Etherstats)
- 2. History
- 3. Alarm
- 4. Host
- 5. HostTopN
- 6. Matrix
- 7. Filter
- 8. Capture
- 9. Event
12RMON Etherstat Counters
- Octets (1.3.6.1.2.1.16.1.1.1.4)
- Packets
- Multicast Packets
- Broadcast Packets
- Drop Events
- Fragments
- Jabbers
- Oversize Packets
- CRC Alignment Errors
- Collisions
- Packet Size Distribution
13RMON Host Counters
- In Packets (1.3.6.1.2.1.16.4.2.1.4)
- Out Packets
- In Octets
- Out Octets
- Broadcast Packets
- Multicast Packets
- Errors
14RMON Matrix Counters
- Packets (1.3.6.1.2.1.16.6.2.1.4)
- Octets
- Errors
15RMON Data The Challenge
- Can RMON be used to answer the following questions?
- When is my network busy?
- Which hosts are causing my network to be busy?
- Who are those busy hosts talking to?
16RMON Data The Problem
- Host data collected for every host found on the monitored network
- Hundreds or thousands of hosts
- How to determine which hosts are causing the most traffic?
17RMON Data The Problem
- Matrix data collected for every host to host conversation found on the monitored network.
- Hundreds to tens of thousands of conversations
- How to determine which conversations are causing the most traffic?
18RMON Data The Problem
Etherstats Data
Host Data
Matrix Data
19The Solution Intelligent RMON Data Limitation
- Network Utilization Threshold
- Host Data Limitation
- Matrix Data Limitation
20Etherstats and Network Utilization
- An RMON agent gathers Etherstats for each interface (Port)
- Network Utilization for each interface can be calculated using the RMON Etherstats
- Network Utilization (%) = $\frac{100 \times ((\text{Packets} \times 160) + (\text{Octets} \times 8))}{\text{Port Speed} \times \text{Time Delta in Secs.}}$
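The utilization formula can be sketched in Python as follows. The function and parameter names are hypothetical. The sketch assumes the Packets and Octets values are the differences between two successive Etherstats readings, and that Port Speed is given in bits per second. The 160 bits added per packet are commonly explained as the 64-bit preamble plus the 96-bit inter-frame gap.

```python
def network_utilization(packets_delta, octets_delta, port_speed_bps, seconds):
    """Percent utilization of one port over one sampling interval.

    packets_delta and octets_delta are assumed to be counter differences
    between two polls taken `seconds` apart (an assumption of this sketch).
    """
    if port_speed_bps <= 0 or seconds <= 0:
        raise ValueError("port speed and time delta must be positive")
    # 160 bits of per-packet overhead plus 8 bits for every octet counted.
    bits_on_wire = packets_delta * 160 + octets_delta * 8
    return 100.0 * bits_on_wire / (port_speed_bps * seconds)


# Example: a 10 Mb/s port sampled over 60 seconds.
print(round(network_utilization(100_000, 50_000_000, 10_000_000, 60), 2))
```

The example values are made up for illustration; they are not measurements from the presentation.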
21A Network Example
[Diagram: example network built around Switch A (ports 1 to 24) and several Hubs. Hosts shown: W1 to W29, UnixWS1 to UnixWS3, U6, P1 to P3, FileServer1, FileServer2, MailSvr, DatabaseSvr, DatabaseSvr2, ProxySvr, MonSvr, and Bkup]
22RMON Network Activity
23High Utilization Ports Detected
[Diagram: the network from slide 21 (A Network Example), with the high utilization ports highlighted]
24Network Utilization Threshold
- Continuously gather RMON Etherstats and calculate the network utilization
- Allow for a user defined Network Utilization Threshold
- At times of low network utilization (below the threshold) do not collect RMON Host and Matrix data
- RMON Host and Matrix data is only collected at critical times
- Greatly reduces unnecessary data
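A minimal Python sketch of the threshold decision, assuming utilization has already been calculated for each sampling interval. The function name, the interval labels, and the sample values are hypothetical.

```python
def intervals_to_collect(utilization_by_interval, threshold_pct):
    """Return the labels of intervals whose utilization reaches the threshold.

    Host and Matrix data would be collected only for these intervals;
    intervals below the threshold are skipped.
    """
    return [
        label
        for label, utilization_pct in utilization_by_interval
        if utilization_pct >= threshold_pct
    ]


samples = [("09:00", 12.0), ("09:05", 71.5), ("09:10", 38.2), ("09:15", 64.0)]
print(intervals_to_collect(samples, threshold_pct=60.0))
```

With a 60% threshold, only the 09:05 and 09:15 intervals would trigger Host and Matrix collection.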
25A VERY brief review of Statistical Data Variability
26Measurement of the Variability of Data
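- Mean (Average): $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
- Interested in the Variability of the data, the Deviation from the Mean.
- Deviation: $d_i = x_i - \bar{x}$
- Want to calculate a sort of average deviation, but this calculation will always yield 0 because the positive and negative deviations cancel: $\sum_{i=1}^{n} d_i = 0$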
27Measurement of the Variability of Data
- Sum of the squares of the deviations of the individual values from the mean of all the values, or Sum of Squares: $SS = \sum_{i=1}^{n} (x_i - \bar{x})^2$
- The variation in a set of numerical data is defined as the Variance: $S^2 = \frac{SS}{n - 1}$
28Measurement of the Variability of Data
- The Standard Deviation is a sort of average of the individual deviations of the values from their mean, to which each deviation can be compared.
- $S = \sqrt{S^2}$
- Any values more than one Standard Deviation above or below the mean are treated as statistical outliers.
29Data Variability Example
[Chart: Value (1 to 12) plotted against Sample Number (1 to 10), with the Mean (4.5) and Mean + Standard Deviation (8) marked and each sample's deviation from the mean labeled]
Samples: 1, 2, 11, 4, 6, 10, 4, 1, 4, 2
Deviations: d1 = -3.5, d2 = -2.5, d3 = 6.5, d4 = -0.5, d5 = 1.5, d6 = 5.5, d7 = -0.5, d8 = -3.5, d9 = -0.5, d10 = -2.5
Mean: 4.5
Sum of Squares: 112.5
Variance: 12.5
Standard Deviation: 3.5
Mean + Standard Deviation: 8
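The figures in this example can be reproduced with a short Python sketch. The function name is hypothetical. The variance divides the Sum of Squares by the number of samples minus one, which is the divisor that matches the slide's own numbers (112.5 / 9 = 12.5).

```python
import math


def variability(samples):
    """Return mean, deviations, sum of squares, variance, standard deviation."""
    n = len(samples)
    mean = sum(samples) / n
    deviations = [x - mean for x in samples]
    sum_of_squares = sum(d * d for d in deviations)
    # Sample variance: divide by n - 1, as the slide's numbers imply.
    variance = sum_of_squares / (n - 1)
    return mean, deviations, sum_of_squares, variance, math.sqrt(variance)


samples = [1, 2, 11, 4, 6, 10, 4, 1, 4, 2]
mean, deviations, sum_of_squares, variance, std_dev = variability(samples)
print(mean, sum(deviations), sum_of_squares, variance, round(std_dev, 1))
```

Only samples 3 and 6 (values 11 and 10) lie above the mean plus one standard deviation, which is roughly 8.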
30Host Data Can we find the needle in the haystack?
31Host Data Limitation
- Host data stored in tables
- Size of tables equal to the number of hosts detected by the RMON agent
- Size of tables only limited by the resources of the agent
- May be hundreds or thousands of hosts
- Is there a method to identify the busiest hosts?
32Host Data Limitation
- Host Data Limitation Calculations
- Overall Host Activity for each host
- Host Activity Mean
- Host Activity Standard Deviation
- Host Activity Threshold
33Host Data Limitation
Overall Host Activity (for each host)
$\text{Overall Host Activity} = \text{In Octets} + \text{Out Octets}$
34Host Data Limitation
$\text{Host Activity Mean} = \frac{\text{Sum of all Overall Host Activity Calculations}}{\text{Total Number of Hosts}}$
35Host Data Limitation
$\text{Host Activity Standard Deviation} = \sqrt{\frac{\sum (S - M)^2}{N - 1}}$
Where S = Each Overall Host Activity Calculation, M = Host Activity Mean, and N = Total Number of Hosts
36Host Data Limitation
$\text{Host Activity Threshold} = \text{Host Activity Mean} + \text{Host Activity Standard Deviation}$
37Host Data Limitation
- Any Host whose Overall Host Activity is greater than the Host Activity Threshold is defined as a Busy Host
- The activity of all hosts below the threshold could be combined to represent a virtual host called All Other Hosts
- Answers the question, Which hosts are causing my network to be busy?
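The Host Data Limitation steps can be sketched in Python as follows. The function name, the host names, and the octet counts are hypothetical. The standard deviation divides by the number of hosts minus one, an assumption carried over from the variance example earlier in the deck.

```python
import math


def limit_host_data(hosts):
    """Split hosts into Busy Hosts and a combined All Other Hosts total.

    hosts maps a host name to (in_octets, out_octets) for one interval.
    """
    if len(hosts) < 2:
        raise ValueError("need at least two hosts to compute a deviation")
    # Overall Host Activity = In Octets + Out Octets, for each host.
    activity = {name: i + o for name, (i, o) in hosts.items()}
    n = len(activity)
    mean = sum(activity.values()) / n
    variance = sum((a - mean) ** 2 for a in activity.values()) / (n - 1)
    threshold = mean + math.sqrt(variance)
    busy = {name: a for name, a in activity.items() if a > threshold}
    all_other = sum(a for a in activity.values() if a <= threshold)
    return busy, all_other, threshold


hosts = {
    "H1": (1, 0), "H2": (1, 1), "H3": (6, 5), "H4": (2, 2), "H5": (3, 3),
    "H6": (4, 6), "H7": (2, 2), "H8": (0, 1), "H9": (3, 1), "H10": (1, 1),
}
busy, all_other, threshold = limit_host_data(hosts)
print(busy, all_other, round(threshold, 2))
```

The sample totals (1, 2, 11, 4, 6, 10, 4, 1, 4, 2) reuse the values from the data variability example, so H3 and H6 come out as Busy Hosts and the remaining eight hosts are folded into All Other Hosts.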
38Host Data Limitation
39Host Data Limitation
40Busy Hosts Detected
[Diagram: the network from slide 21 (A Network Example), with the busy hosts highlighted]
41Matrix Data Even more data to sift through!
42Matrix Data Limitation
- Matrix (host to host conversations) data stored in tables
- Size of tables equal to the number of host to host conversations detected by the RMON agent
- Size of tables only limited by the resources of the agent
- Can easily be tens of thousands of conversations
- Is there a method to identify the busiest conversations?
43Matrix Data Limitation
- Conversations are directional: there is an entry for each direction (i.e. H1 to H2 and H2 to H1)
- The Busy Hosts detected using Host Data Limitation will be used in Matrix Data Limitation
44Matrix Data Limitation
- Filter out all conversations between non-busy hosts
- The remaining conversations are separated into groups representing all the directional communication of a busy host and all the hosts it is communicating with
- These groups are defined as Directional Conversational Host Groups
- Each of these groups will be examined for statistically interesting communication levels
45Matrix Data Limitation
Example of a small Matrix Table. H1 and H4 were
determined to be Busy Hosts using Host Data
Limitation
46Matrix Data Limitation
Conversations between non-busy hosts removed
from Matrix Table.
47Matrix Data Limitation
The remaining conversations separated into
Directional Conversational Host Groups
48Matrix Data Limitation
- Matrix Data Limitation Calculations
- Directional Conversational Host Group Mean
- Directional Conversational Host Group Standard Deviation
- Matrix Group Threshold
49Matrix Data Limitation
$\text{Directional Conversational Host Group Mean} = \frac{\text{Sum of all Octets in the Group}}{\text{Total Number of Conversations in the Group}}$
50Matrix Data Limitation
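$\text{Directional Conversational Host Group Standard Deviation} = \sqrt{\frac{\sum (S - M)^2}{N - 1}}$
Where S = Each Octet Sample in the Group, M = Directional Conversational Host Group Mean, and N = Total Number of Conversations in the Group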
51Matrix Data Limitation
$\text{Matrix Group Threshold} = \text{Directional Conversational Host Group Mean} + \text{Directional Conversational Host Group Standard Deviation}$
52Matrix Data Limitation
- Any conversation in a group whose Octets are greater than the Matrix Group Threshold is defined as a Busy Conversation
- The activity of all conversations in a group that are below the threshold could be combined to represent a virtual conversation called All Other Conversations
- Answers the question, Who are my busy hosts talking to?
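A Python sketch of Matrix Data Limitation under stated assumptions: each Directional Conversational Host Group is modeled as one busy host plus one direction ("out" for conversations it sends, "in" for conversations it receives), which is one reading of the slides rather than a confirmed definition. The function name and the sample data are hypothetical, the standard deviation divides by the group size minus one, and a group with a single conversation is given a deviation of 0.

```python
import math
from collections import defaultdict


def limit_matrix_data(conversations, busy_hosts):
    """Find Busy Conversations per (busy host, direction) group.

    conversations maps (source, destination) to octets for one interval.
    """
    groups = defaultdict(dict)
    for (src, dst), octets in conversations.items():
        # Conversations with no busy endpoint join no group (filtered out).
        if src in busy_hosts:
            groups[(src, "out")][(src, dst)] = octets
        if dst in busy_hosts:
            groups[(dst, "in")][(src, dst)] = octets

    result = {}
    for key, group in groups.items():
        values = list(group.values())
        n = len(values)
        mean = sum(values) / n
        # Assumption: sample deviation (n - 1); 0 for a one-entry group.
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1)) if n > 1 else 0.0
        threshold = mean + sd
        result[key] = {
            "threshold": threshold,
            "busy": {c: v for c, v in group.items() if v > threshold},
            "all_other": sum(v for v in values if v <= threshold),
        }
    return result


conversations = {
    ("H1", "H2"): 100, ("H1", "H3"): 10, ("H1", "H5"): 10, ("H1", "H6"): 10,
    ("H2", "H1"): 5, ("H3", "H1"): 5, ("H2", "H3"): 999,
}
result = limit_matrix_data(conversations, busy_hosts={"H1"})
print(result[("H1", "out")])
```

In this made-up table, the H2 to H3 conversation is dropped because neither host is busy, and H1 to H2 is the only Busy Conversation in H1's outbound group.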
53Matrix Data Limitation
54Most Active Conversations Detected
[Diagram: the network from slide 21 (A Network Example), with the most active conversations highlighted]
55Further Investigation
56Problem Resolution
57Summary
- Sifting through the vast quantities of RMON Host and Matrix data to find interesting information can be a difficult problem
- Intelligent RMON Data Limitation is a programmable solution to statistically determine which hosts and conversations are causing a network to be busy
- This is a three-level algorithm composed of:
- Network Utilization Threshold
- Host Data Limitation
- Matrix Data Limitation
58Questions?
A Statistical Method to Extract Value From RMON
Data