Title: Network Monitoring and Management
1Network Monitoring and Management
2ICMP
- Internet Control Message Protocol
- RFC 792
- Transfer of (control) messages from routers and hosts to hosts
- Feedback about problems
- e.g. time to live expired
- Encapsulated in plain IP datagram
- Not reliable
3[Figure: TCP/IP protocol stack. Application; Transport (TCP, UDP); Network (IP, with ICMP and IGMP); Link (Ethernet Driver, incoming frame)]
4[Figure: demultiplexing of an incoming Ethernet frame]
- Ethernet frame type: x0800 IP, x0806 ARP
- IP header protocol type: 1 ICMP, 6 TCP, 17 UDP
- TCP dest port: 21 FTP server, 23 telnet server, 25 SMTP (port 7 is also shown)
(Ethernet frame types in hex, others in decimal)
6ICMP Types
7ICMP
- Uses IP but is a separate protocol in the network layer
- ICMP messages contain
- Type
- Code
- 1st 8 bytes of bad datagram
IP HEADER (PROTOCOL 1) | TYPE | CODE | CHECKSUM | REMAINDER OF ICMP MESSAGE (FORMAT IS TYPE SPECIFIC)
IP HEADER | IP DATA
8ICMP Message Formats
12Destination Unreachable
- TYPE CODE CHECKSUM
- UNUSED
- IP HEADER + 64 bits data from original DG
- TYPE 3
- CODE
- 0 Net unreachable
- 1 Host unreachable
- 2 Protocol unreachable
- 3 Port unreachable
- 4 Fragmentation needed but DF set
- 5 Source route failed
- 6 Dest network unknown
- 7 Dest host unknown
13Source Quench
- TYPE CODE CHECKSUM
- UNUSED
- IP HEADER + 64 bits data from original DG
- TYPE 4 CODE 0
- Flow control
- Indicates that a router has dropped the original DG or may indicate that a router is approaching its capacity limit.
- Correct behavior for source host is not defined.
15Time Exceeded
- TYPE CODE CHECKSUM
- UNUSED
- IP HEADER + 64 bits data from original DG
- TYPE 11
- CODE
- 0 Time to live exceeded in transit
- 1 Fragment reassembly time exceeded
16Redirect
- TYPE CODE CHECKSUM
- NEW ROUTER ADDRESS
- IP HEADER + 64 bits data from original DG
- TYPE 5
- CODE
- 0 Network redirect
- 1 Host redirect
- 2 Network redirect for specific TOS
- 3 Host redirect for specific TOS
17Redirection Concept
Internet
19QUERY Message Echo and Echo Reply
- TYPE CODE CHECKSUM
- IDENTIFIER SEQUENCE
- DATA ...
- TYPE 8 ECHO, 0 ECHO REPLY
- CODE 0
- IDENTIFIER An identifier to aid in matching echoes and replies
- SEQUENCE Same use as for IDENTIFIER
- UNIX ping uses echo/echo reply
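The echo layout (type, code, checksum, identifier, sequence, data) can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual 16-bit one's-complement Internet checksum; the function names are hypothetical and it only builds and parses the bytes, it does not send anything.

```python
import struct

ICMP_ECHO = 8        # TYPE 8: echo
ICMP_ECHO_REPLY = 0  # TYPE 0: echo reply


def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build a TYPE 8, CODE 0 message with a valid checksum."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", ICMP_ECHO, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO, 0, checksum, identifier, sequence)
    return header + payload


def parse_echo(message: bytes) -> dict:
    """Split an echo or echo reply message into its fields."""
    msg_type, code, checksum, identifier, sequence = struct.unpack("!BBHHH", message[:8])
    return {
        "type": msg_type,
        "code": code,
        "checksum": checksum,
        "identifier": identifier,
        "sequence": sequence,
        "data": message[8:],
    }
```

A receiver can verify a message by running `internet_checksum` over the whole message: a correct checksum field makes the result 0.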
20Replaced by Network Time Protocol (NTP)
21ICMP Timestamp Message
- Hosts on different networks that are trying to communicate using software that requires time synchronization can sometimes encounter problems.
- The ICMP timestamp request message allows a host to ask for the current time according to the remote host.
- The remote host uses an ICMP timestamp reply message to respond to the request.
- All ICMP timestamp reply messages contain the originate, receive and transmit timestamps.
- Using these three timestamps, the host can estimate transit time across the network by subtracting the originate time from the transmit time.
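The timestamp arithmetic can be sketched as follows. This is an illustrative sketch, assuming timestamps are milliseconds since midnight UT (as in RFC 792) and that the two clocks are reasonably synchronized; the `arrival` value (local time when the reply comes back) and all function names are assumptions added for the example.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # timestamps count milliseconds since midnight UT


def elapsed_ms(start: int, end: int) -> int:
    """Return end - start in milliseconds, allowing for a wrap at midnight."""
    return (end - start) % MS_PER_DAY


def transit_estimate(originate: int, transmit: int) -> int:
    """Estimate described on the slide: transmit time minus originate time."""
    return elapsed_ms(originate, transmit)


def round_trip(originate: int, receive: int, transmit: int, arrival: int) -> int:
    """Time spent on the network: total elapsed time minus remote processing time."""
    return elapsed_ms(originate, arrival) - elapsed_ms(receive, transmit)
```

The first estimate compares clocks on two different hosts, so it is only as good as their synchronization; the round-trip form subtracts two differences that are each taken on a single clock.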
22Using Ping
wirth 4:15pm -> ping www.uakron.edu
PING arwen.uakron.edu (130.101.81.50) 56(84) bytes of data.
64 bytes from arwen.uakron.edu (130.101.81.50): icmp_seq=0 ttl=62 time=0.512 ms
64 bytes from arwen.uakron.edu (130.101.81.50): icmp_seq=1 ttl=62 time=0.449 ms
64 bytes from arwen.uakron.edu (130.101.81.50): icmp_seq=2 ttl=62 time=1.38 ms
64 bytes from arwen.uakron.edu (130.101.81.50): icmp_seq=3 ttl=62 time=0.439 ms
64 bytes from arwen.uakron.edu (130.101.81.50): icmp_seq=4 ttl=62 time=0.448 ms
64 bytes from arwen.uakron.edu (130.101.81.50): icmp_seq=5 ttl=62 time=0.496 ms
64 bytes from arwen.uakron.edu (130.101.81.50): icmp_seq=6 ttl=62 time=0.449 ms
--- arwen.uakron.edu ping statistics ---
7 packets transmitted, 7 received, 0% packet loss, time 6001ms
rtt min/avg/max/mdev = 0.439/0.596/1.383/0.323 ms, pipe 2
wirth 4:16pm ->
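The summary line printed by ping can be reproduced from the individual round-trip times. The following is a small sketch; the function name `rtt_summary` and the returned dictionary layout are illustrative choices, and the sample values are the seven times shown in the session above.

```python
def rtt_summary(rtts_ms, transmitted):
    """Summarize a ping run from the RTTs of the replies that came back."""
    received = len(rtts_ms)
    loss_percent = 100.0 * (transmitted - received) / transmitted
    if received == 0:
        return {"received": 0, "loss_percent": loss_percent,
                "min": None, "avg": None, "max": None}
    return {
        "received": received,
        "loss_percent": loss_percent,
        "min": min(rtts_ms),
        "avg": sum(rtts_ms) / received,
        "max": max(rtts_ms),
    }


# Round-trip times (ms) from the seven replies in the session above.
sample = [0.512, 0.449, 1.38, 0.439, 0.448, 0.496, 0.449]
summary = rtt_summary(sample, transmitted=7)
print(f"{summary['min']:.3f}/{summary['avg']:.3f}/{summary['max']:.3f} ms, "
      f"{summary['loss_percent']:.0f}% packet loss")
```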
23Extended Ping
Used for path MTU discovery
- IP header options can be used along with ICMP
- route recording,
- timestamping,
- source routing
24Traceroute
- UNIX utility that displays the routers used to get to a specified Internet host
- Operation
- A router sends an ICMP Time Exceeded message to the source if TTL is decremented to 0
- If TTL starts at 5, the source host will receive a Time Exceeded message from the router that is 5 hops away
- Traceroute sends a series of probes with different TTL values and records the source address of the ICMP Time Exceeded message for each
- Probes are formatted so that the destination host will send an ICMP Port Unreachable message
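The TTL logic can be illustrated with a small simulation. This sketch does not send real packets: the `path` list (routers in order, destination last) and both function names are assumptions made for the example.

```python
def send_probe(path, ttl):
    """Simulate one probe; return (responder address, ICMP reply kind)."""
    for router in path[:-1]:
        ttl -= 1  # each router decrements TTL before forwarding
        if ttl == 0:
            return router, "time-exceeded"
    # The probe reached the destination, where the UDP port is unused.
    return path[-1], "port-unreachable"


def simulate_traceroute(path, max_hops=30):
    """Send probes with TTL 1, 2, 3, ... until the destination answers."""
    hops = []
    for ttl in range(1, max_hops + 1):
        responder, reply = send_probe(path, ttl)
        hops.append((ttl, responder, reply))
        if reply == "port-unreachable":
            break  # final destination reached
    return hops


# Hypothetical three-hop path using documentation addresses.
for hop in simulate_traceroute(["10.0.0.1", "10.0.1.1", "192.0.2.9"]):
    print(hop)
```

A real implementation would send each probe several times and time the replies, which is where the three millisecond values per hop come from.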
25Traceroute and ICMP (2)
- Trace the route of an IP packet
[Figure: timeline of probes and replies between Source, Router 1, Router 2 and Destination]
26Traceroute and ICMP (3)
- Trace the route of an IP packet
- Upon reaching destination,
- No Time Exceeded message is generated
- How do you know when the final destination is reached?
- Traceroute sends to an unused UDP port (>30000), generating an ICMP Destination Unreachable message
- With code Port Unreachable
27Traceroute
mymachine traceroute www.cis.ksu.edu
traceroute to polaris.cis.ksu.edu (129.130.10.93), 30 hops max, 40 byte packets
 1 wraith.facnet.mcs.kent.edu (131.123.46.1) 0.878 ms 0.620 ms 0.553 ms
 2 ghost.uis-mcs.mcs.kent.edu (131.123.40.1) 6.000 ms 3.366 ms 2.632 ms
 3 lib2-255x248-e37-lib.gate.kent.edu (131.123.255.254) 7.170 ms 3.552 ms 4.477 ms
 4 twcneo-cw.neo.rr.com (204.210.223.3) 9.515 ms 15.167 ms 18.687 ms
 5 bordercore4-hssi1-0.NorthRoyalton.cw.net (166.48.233.253) 17.864 ms 10.971 ms 14.652 ms
 6 core4.WillowSprings.cw.net (204.70.4.73) 23.438 ms 22.099 ms 17.397 ms
 7 wsp-sprint2-nap.WillowSprings.cw.net (206.157.77.94) 18.367 ms 22.854 ms 20.267 ms
 8 sl-bb11-chi-2-1.sprintlink.net (144.232.10.157) 23.518 ms 24.528 ms 18.757 ms
 9 sl-bb12-chi-5-1.sprintlink.net (144.232.10.6) 21.197 ms 31.452 ms 15.050 ms
10 sl-bb10-kc-7-1.sprintlink.net (144.232.9.117) 46.752 ms 40.125 ms
11 sl-gw5-kc-0-0-0.sprintlink.net (144.232.2.62) 38.360 ms 48.002 ms 44.795 ms
12 sl-uok-1-0-0.sprintlink.net (144.232.132.14) 93.256 ms 67.070 ms 61.727 ms
13 ks-1-ks-ksu.r.greatplains.net (164.113.232.193) 77.743 ms 64.566 ms 67.117 ms
14 164.113.212.250 (164.113.212.250) 59.988 ms 46.188 ms 55.616 ms
15 129.130.252.9 (129.130.252.9) 68.211 ms 67.881 ms 75.441 ms
16 polaris.cis.ksu.edu (129.130.10.93) 76.462 ms 54.838 ms
28PMTU-D
TCP path-MTU discovery
30SNMP
- Where did it come from ?
- Internet Engineering Task Force
- Network Management Area
- SNMP v1
- MIBv1, MIBv2
- SNMP v2 (?)
- SNMP v3 (?)
31SNMPv1 History
- RFC 1157, 1990
- A Simple Network Management Protocol (SNMP)
- RFC 1155, 1158 (1990), RFC 1213 (1991)
- Specification of the MIBv2
- Written in ASN.1
33Protocol context of SNMP
34SNMPv1 Protocol
- Five Simple Messages
- get-request
- get-next-request
- get-response
- set-request
- trap
35SNMP - SNMP Message Handling -
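[Figure: message exchange between SNMP Manager and SNMP Agent]
- Manager to Agent: GetRequest (What is the value of MIB?)
- Agent to Manager: GetResponse (The value is XXXX!)
- Manager to Agent: GetNextRequest (What is the next value of MIB Tree ?)
- Agent to Manager: GetResponse (The value is XXXX!)
- Manager to Agent: SetRequest (Modify the value of OID)
- Agent to Manager: GetResponse (The value is XXXX!)
- Agent to Manager: Trap (Problem happened!)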
36SNMPv1 UDP ports
[Figure: UDP ports used between Manager and Agent]
- get_request, get_next_request, set_request: Manager to Agent, port 161
- get_response: Agent to Manager
- trap: Agent to Manager, port 162
37SNMPv1 Packet Format
[Figure: SNMPv1 packet layout] UDP Header | Version | Community | PDU Type | Request ID | Error Status | Error Index | name | value | name | ...
- SNMP version (0 is for version 1)
- Community (read-only, read-write)
- Shared password between agent and manager
- PDU Specifies request type
- Request ID
- Error Status
- Error Index
38Community Names
- Community names are used to define where an SNMP message is destined for.
- Set up your agents to belong to certain communities.
- Set up your management applications to monitor and receive traps from certain community names.
39RFC 1065 (MIB Structure)
- Structure and Identification of Management Information for TCP/IP-based Internets (SMI)
- Uses Abstract Syntax Notation 1 (ASN.1)
- Types of information
- Network Address
- IP Address
- Counter (32 bit monotonically increasing)
- Gauge (32 bit variable)
- Timeticks (time in hundredths of a second)
- Opaque (arbitrary syntax for text data)
- Adopted as a full standard in RFC 1155 (basically
unchanged)
40MIB definitions
- RFC 1066 - MIB definitions using RFC 1065 (RFC 1155) (Rose & McCloghrie)
- First version of the MIB, now called MIB-I
- Adopted as a full standard in RFC 1156 (essentially unchanged from 1066)
- RFC 1158 - extends MIB-I and defines MIB-II
- Adopted as a full standard in RFC 1213
41Vendor extensions to MIB
- RFC 1156 (MIB-I) allowed for vendor specific extensions to be included in the MIB
- Allows for additional management information about devices not provided for in the standard MIB
- For example CPU utilisation
- Normal for devices to support all of MIB-II PLUS have their own vendor-specific extensions
42SNMP NAMES
43OSI Object Identifier Tree
44SNMP - MIB Tree -
- Objects are managed by the tree
- Expressed in a row of values divided by the period
root
  ccitt(0)
  iso(1)
    org(3)
      dod(6)
        internet(1)
          directory(1)
          mgmt(2)
            mib-2(1)        Standard MIBs
          experimental(3)
          private(4)
            enterprise(1)   Vendor-specific MIBs
  joint-iso-ccitt(2)
45SNMP Naming
- Question: how to name every possible standard object (protocol, data, more..) in every possible network standard?
- Answer: ISO Object Identifier tree
- hierarchical naming of all objects
- each branchpoint has name, number
1.3.6.1.2.1.7.1
ISO (1) . ISO-ident. Org. (3) . US DoD (6) . Internet (1) . management (2) . MIB2 (1) . UDP (7) . udpInDatagrams (1)
46SNMP - OID -
- OID Expression
- iso(1). org(3). dod(6). internet(1). mgmt(2). mib2(1) -> .1.3.6.1.2.1
- e.g. sysDescr .1.3.6.1.2.1.1.1 = mib-2.1.1 = system.1
Subtree Name OID Description
system 1.3.6.1.2.1.1 Defines a list of objects that pertain to system operation, such as the system uptime, system contact, and system name.
interfaces 1.3.6.1.2.1.2 Keeps track of the status of each interface on a managed entity. The interfaces group monitors which interfaces are up or down and tracks such things as octets sent and received, errors and discards, etc.
at 1.3.6.1.2.1.3 The address translation (at) group is deprecated and is provided only for backward compatibility. It will probably be dropped from MIB-III.
ip 1.3.6.1.2.1.4 Keeps track of many aspects of IP, including IP routing.
icmp 1.3.6.1.2.1.5 Tracks things such as ICMP errors, discards, etc.
tcp 1.3.6.1.2.1.6 Tracks, among other things, the state of the TCP connection (e.g., closed, listen, synSent, etc.).
udp 1.3.6.1.2.1.7 Tracks UDP statistics, datagrams in and out, etc.
egp 1.3.6.1.2.1.8 Tracks various statistics about EGP and keeps an EGP neighbor table.
transmission 1.3.6.1.2.1.10 There are currently no objects defined for this group, but other media-specific MIBs are defined using this subtree.
snmp 1.3.6.1.2.1.11 Measures the performance of the underlying SNMP implementation on the managed entity and tracks things such as the number of SNMP packets sent and received.
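The mapping from names in the tree to dotted OIDs can be sketched as a walk from a node up to the root. This is an illustrative sketch only: the `ARCS` table holds just the branch points named on these slides, and `oid_of` is a hypothetical helper, not part of any SNMP library.

```python
# name -> (parent name, number under that parent); only arcs shown on the slides
ARCS = {
    "iso": (None, 1),
    "org": ("iso", 3),
    "dod": ("org", 6),
    "internet": ("dod", 1),
    "mgmt": ("internet", 2),
    "private": ("internet", 4),
    "enterprise": ("private", 1),
    "mib-2": ("mgmt", 1),
    "system": ("mib-2", 1),
    "interfaces": ("mib-2", 2),
    "at": ("mib-2", 3),
    "ip": ("mib-2", 4),
    "icmp": ("mib-2", 5),
    "tcp": ("mib-2", 6),
    "udp": ("mib-2", 7),
    "egp": ("mib-2", 8),
    "transmission": ("mib-2", 10),
    "snmp": ("mib-2", 11),
    "udpInDatagrams": ("udp", 1),
}


def oid_of(name: str) -> str:
    """Return the dotted OID for a named node by walking up to the root."""
    numbers = []
    while name is not None:
        parent, number = ARCS[name]
        numbers.append(number)
        name = parent
    return ".".join(str(n) for n in reversed(numbers))


print(oid_of("udpInDatagrams"))
```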
47SNMP - MIB OID -
- SNMP Manager can acquire the management information defined by MIB (Management Information Base) from Agent
- Current version: MIBv2, RFC 1213
- MIB is the aggregate of objects (information) on the equipment which the SNMP Agent holds
- An identifier (OID) is defined for each object
- MIBs implemented by an Agent are roughly divided into
- MIBv2: standard, public, specified by IETF
- Enterprise MIB: private, specified by vendor company
48SNMP MIB
MIB module specified via SMI MODULE-IDENTITY (100
standardized MIBs, more vendor-specific)
OBJECT TYPE
OBJECT TYPE
OBJECT TYPE
objects specified via SMI OBJECT-TYPE construct
49SMI Object, module examples
ipMIB MODULE-IDENTITY
    LAST-UPDATED "941101000Z"
    ORGANIZATION "IETF SNMPv2 Working Group"
    CONTACT-INFO "Keith McCloghrie"
    DESCRIPTION "The MIB module for managing IP and ICMP implementations, but excluding their management of IP routes."
    REVISION "019331000Z"
    ::= { mib-2 48 }
ipInDelivers OBJECT-TYPE
    SYNTAX Counter32
    MAX-ACCESS read-only
    STATUS current
    DESCRIPTION "The total number of input datagrams successfully delivered to IP user-protocols (including ICMP)"
    ::= { ip 9 }
50MIB example UDP module
Object ID          Name              Type        Comments
1.3.6.1.2.1.7.1    udpInDatagrams    Counter32   total datagrams delivered at this node
1.3.6.1.2.1.7.2    udpNoPorts        Counter32   undeliverable datagrams, no app at port
1.3.6.1.2.1.7.3    udpInErrors       Counter32   undeliverable datagrams, all other reasons
1.3.6.1.2.1.7.4    udpOutDatagrams   Counter32   datagrams sent
1.3.6.1.2.1.7.5    udpTable          SEQUENCE    one entry for each port in use by app, gives port and IP address
51ASN.1 Abstract Syntax Notation 1
- ISO standard X.680
- used extensively in Internet
- like eating vegetables, knowing this is good for you!
- defined data types, object constructors
- like SMI
- BER: Basic Encoding Rules
- specify how ASN.1-defined data objects are to be transmitted
- each transmitted object has Type, Length, Value (TLV) encoding
52ASN.1 Syntax
- SYNTAX
- Data-type eg. Integer, Gauge, Counter, PhysAddress, ...
- ACCESS
- read-only, read-write, write-only, not-accessible
- STATUS
- mandatory, optional, obsolete
53Syntax
- uses ASN.1 (Abstract Syntax Notation)
- binary encoding
- 02 01 06 is a 1 byte integer, value 6
- Primitive Types
- INTEGER, OCTET STRING, OBJECT IDENTIFIER, NULL
- Constructor Types
- SEQUENCE <primitive-type> ... ie. a record
- SEQUENCE OF <primitive-type> ... ie. an array
54Syntax
- Defined Data Types
- IpAddress what you expect
- Counter non-negative integer that wraps
- Gauge non-negative integer that latches
- TimeTicks time in hundredths of seconds
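The difference between a Counter that wraps and a Gauge that latches can be shown with two small helpers. This is a sketch assuming 32-bit values; the function names are illustrative.

```python
COUNTER32_MODULUS = 2 ** 32
GAUGE32_MAX = 2 ** 32 - 1


def counter_delta(previous: int, current: int, modulus: int = COUNTER32_MODULUS) -> int:
    """Increase between two polls of a Counter, allowing for a single wrap."""
    return (current - previous) % modulus


def gauge_add(value: int, change: int, maximum: int = GAUGE32_MAX) -> int:
    """Apply a change to a Gauge; it sticks at the limits instead of wrapping."""
    return max(0, min(maximum, value + change))


# A counter polled just before and just after it wrapped.
print(counter_delta(4294967290, 5))
```

The modular subtraction is only correct if the counter wrapped at most once between polls, which is why the polling interval matters on fast links.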
55TLV Encoding
- Idea: transmitted data is self-identifying
- T: data type, one of ASN.1-defined types
- L: length of data in bytes
- V: value of data, encoded according to ASN.1 standard
Tag Value   Type
1           Boolean
2           Integer
3           Bitstring
4           Octet string
5           Null
6           Object Identifier
9           Real
56TLV encoding example
- Integer: Type=2 (integer), Length 2 bytes, Value 259
- Octet string: Type=4 (octet string), Length 5 bytes, Value 5 octets (chars)
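A TLV encoding of this kind can be sketched in Python. This is a minimal sketch that only handles non-negative integers, octet strings, and lengths up to 127 bytes (the short length form); the function names and the sample string `hello` are assumptions for illustration.

```python
TAG_INTEGER = 2
TAG_OCTET_STRING = 4


def encode_tlv(tag: int, value: bytes) -> bytes:
    """Encode one object as Type, Length, Value (short-form length only)."""
    if len(value) > 127:
        raise ValueError("this sketch only handles lengths of 0-127 bytes")
    return bytes([tag, len(value)]) + value


def encode_integer(number: int) -> bytes:
    """Encode a non-negative integer in the fewest bytes that keep the sign bit clear."""
    if number < 0:
        raise ValueError("this sketch only handles non-negative integers")
    length = number.bit_length() // 8 + 1
    return encode_tlv(TAG_INTEGER, number.to_bytes(length, "big"))


def encode_octet_string(data: bytes) -> bytes:
    return encode_tlv(TAG_OCTET_STRING, data)


def decode_tlv(data: bytes):
    """Return (tag, value, remaining bytes) for the first object in data."""
    tag, length = data[0], data[1]
    return tag, data[2:2 + length], data[2 + length:]


print(encode_integer(259).hex(" "))
print(encode_octet_string(b"hello").hex(" "))
```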
57SNMP - SNMP Message Handling 2 -
GetRequest
inetapan@tools> snmpget -v2c -c xxxx tpr2.jp.apan.net .1.3.6.1.2.1.2.2.1.4.136
IF-MIB::ifMtu.136 = INTEGER: 9192
GetNextRequest
inetapan@tools> snmpget -v2c -c xxxx tpr2.jp.apan.net system
SNMPv2-MIB::system = No Such Object available on this agent at this OID
inetapan@tools> snmpwalk -v2c -c xxxx tpr2.jp.apan.net system
SNMPv2-MIB::sysDescr.0 = STRING: m20 internet router, kernel 6.2R3.10
SNMPv2-MIB::sysObjectID.0 = OID: SNMPv2-SMI::enterprises.2636.1.1.1.2.2
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (423280751) 48 days, 23:46:47.51
SNMPv2-MIB::sysContact.0 = STRING:
SNMPv2-MIB::sysName.0 = STRING: tpr2
SNMPv2-MIB::sysLocation.0 = STRING:
SNMPv2-MIB::sysServices.0 = INTEGER: 4
SetRequest
inetapan@tools> snmpset -v2c -c xxxx tppr.jp.apan.net system.sysLocation.0
system.sysLocation.0 = ""
inetapan@tools> snmpset -v2c -c yyyy tppr.jp.apan.net system.sysLocation.0 s "Tokyo, JP"
system.sysLocation.0 = "Tokyo, JP"
inetapan@tools> snmpset -v2c -c xxxx tppr.jp.apan.net system.sysLocation.0
system.sysLocation.0 = "Tokyo, JP"
58SNMP - Trap Message -
- The way for Agent to inform Manager about an undesirable event
- Trap originates from Agent and is sent to the trap destination, as configured within the Agent itself
- When Manager receives a trap, it needs to know how to interpret it
- PDU
- Enterprise
- vendor identification (OID) for the agent
- AgentAddress
- The IP address of the node where the trap was generated.
- Trap Type
- Generic / Specific (not used)
- Timestamp
- The length of time between the last re-initialization of the agent that issued a trap and the moment at which the trap was issued
59SNMP
- SNMP Traps
- unsolicited notification of events
- can include variable list
- ColdStart, WarmStart
- LinkUp, LinkDown
- Authentication Failure
- EGP Neighbour Loss
- Enterprise Specific
60Traps
- Forwarded automatically from agent to station(s) in response to an event with the device
- Traps defined in MIB-II
- Cold-start of system
- Warm-start of system
- Link down
- Link up
- Failure of authentication
- Exterior Gateway Protocol (EGP) neighbour loss
- Enterprise specific
61SNMPv2 History
- RFC 1441, 1993: Introduction to version 2 of the Internet-standard Network Management Framework
- RFC 1446, 1993: Security Protocols for version 2 of the Simple Network Management Protocol
- Written to address security and feature deficiencies in SNMPv1
62SNMPv2 Protocol
- Extension to SNMPv1
- Provided security model
- 2 new commands
- get-bulk-request
- inform-request
63SNMPv2 Protocol continued...
[Figure: SNMPv2 message formats]
- General Format: privDst | authInfo | dstParty | srcParty | context | PDU
- Nonsecure Message: privDst | 0-length OCTET STRING | dstParty | srcParty | context | PDU
- Authenticated, not encrypted: privDst | digest | dstTime | srcTime | dstParty | srcParty | context | PDU
- Private, not authenticated: privDst | 0-length OCTET STRING | dstParty | srcParty | context | PDU
- Private and authenticated: privDst | digest | dstTime | srcTime | dstParty | srcParty | context | PDU
64Format of SNMPv1 messages
[Figure: formats of SNMPv1 messages]
- Get-Request, Get-Next-Request, Set-Request
- Get-Response
- Trap: Version | Community String | PDU type | Enterprise | Agent Addr | Generic trap | Specific trap | Time | Name X | Value X
65Coexistence by Means of Proxy Agent
66SNMPv2C Protocol
- SNMPv2 additional PDU types
- SNMPv1 Community based authentication
- UDP transport
- All the features of SNMPv2 with the security of
SNMPv1
67SNMPv1 and SNMPv2
- SNMPv1 is a subset of SNMPv2
- Managers usually can send requests in either format depending on the capability of the agents
- Requires an update of the agent and manager software to migrate from SNMPv1 to SNMPv2
- Many manufacturers are resisting SNMPv2 for a variety of reasons, leading to an SNMPv3 specification
- Almost all manufacturers currently support SNMPv1
68Network Monitoring Tools
69Ways of Monitoring
- Classified into three ways of monitoring
- In Internal Network (mostly)
- Via External Network
- Non-network (Emergency case)
[Figure: Monitoring Machine, Internal network, External network]
- 1, Monitoring in internal Network (mostly)
- 2, Monitoring via External Network: via Peering Network, via the Internet
- 3, Independent access (Emergency case): ISDN, PSTN
70Network Management Software
- SNMP Agents
- provided by all router vendors
- many expanded (enterprise) MIBs
- bridges, wiring concentrators, toasters
71Network Management Software
- Public Domain
- Application Programming Interfaces available from CMU and MIT
- include variety of applications
72Network Management Software
- Commercially
- many offerings, UNIX and PC based
- HP OpenView
- SunNet Manager
- Cabletron Spectrum
- MANY others
73Commercial SNMP Applications
- http://www.hp.com/go/openview/ HP OpenView
- http://www.tivoli.com/ IBM NetView
- http://www.novell.com/products/managewise/ Novell ManageWise
- http://www.sun.com/solstice/ Sun MicroSystems Solstice
- http://www.microsoft.com/smsmgmt/ Microsoft SMS Server
- http://www.compaq.com/products/servers/management/ Compaq Insight Manager
- http://www.redpt.com/ SnmpQL - ODBC Compliant
- http://www.empiretech.com/ Empire Technologies
- ftp://ftp.cinco.com/users/cinco/demo/ Cinco Networks NetXray
- http://www.netinst.com/html/snmp.html SNMP Collector (Win9X/NT)
- http://www.netinst.com/html/Observer.html Observer
- http://www.gordian.com/products_technologies/snmp.html Gordian's SNMP Agent
- http://www.castlerock.com/ Castle Rock Computing
- http://www.adventnet.com/ Advent Network Management
- http://www.smplsft.com/ SimpleAgent, SimpleTester
74Monitoring Targets
- Targets suitable for checking normality of network service
- Router
- Dead or Alive?
- Status?
- Performance? Routing?
- Server
- Dead or Alive?
- Status?
- Daemon? Service Port?
- Traffic, etc.
- Increase or decrease?
- DoS Attack? Performance? Environment?
75Monitoring Method
- How to monitor the target
- Active monitor or Passive monitor
- Polling: the monitoring machine sends messages to the watched target
- Useful for checking the current status
- ICMP/SNMP polling
- Receive trap message from target
- Useful for detecting the status change
- SNMP trap, syslog
- Statistics data
- Useful for grasping the trend and transition
- Select the Monitoring Tool
- Ping (ICMP), SNMP, Monitoring Tool, Original Tool, etc.
- Check the monitoring Route to Target
- Internal or External network
76- ICMP/Ping Polling 1 -
- Check IP reachability by ICMP echo/reply
- Additional information
- RTT (Round Trip Time)
- Packet Loss
- TTL (Time to Live)
- Most standard way of checking node activity
- Time series RTT/Packet loss data becomes important information when measuring link performance
[Figure: ICMP echo sent to the target and ICMP echo reply returned, giving RTT xx msec, Packet Loss xx %, TTL xx]
77UDP/TCP polling
- Effective in monitoring service ports of server
- Using client for service
- DNS - nslookup
- Using telnet
- WWW,SMTP,POP
- Using tool
- Radius - radping
[Figure: Telnet with service port, and the reply]
bash-2.05 telnet ns.jp.apan.net 80
Trying 203.181.248.3...
Connected to ns.jp.apan.net.
Escape character is '^]'.
get
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>501 Method Not Implemented</title>
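The telnet check of a service port can be automated with a plain TCP connection attempt. This is a minimal sketch using Python's standard `socket` module; the function name `tcp_port_open` and the default timeout are illustrative choices, and a successful connect only shows that something is listening, not that the service answers correctly.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A poller could call, for example, `tcp_port_open("ns.jp.apan.net", 80)` at a fixed interval and raise an alarm when the result changes.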
78Monitoring Software - HP OpenView -
- HP OpenView Network Node Manager
- http://www.openview.hp.com/products/nnm/index.html
- Overview
- Auto discovery and mapping
- Drill-down views (Hierarchy Map)
- Fault monitoring: ICMP / SNMP polling
- Event monitoring: Trap receiving / Event configuration
- SNMP tools: Status polling
- MIB Browser
- Web-based reports
- Extended software is enhanced
- Platform Windows 2000/XP, Solaris 8/9, HP-UX
79Monitoring Software - HP OpenView Sample 1-
Event log
Network map
ICMP polling for connectivity check
Router map
Network sub-map
80Monitoring Software - HP OpenView Sample 2-
Event configuration
Snmp configuration for polling - parameters -
community
Data collection Thresholds for SNMP
81Monitoring Software - Nagios Overview-
- Nagios
- Freely available from http://www.nagios.org
- Overview
- A host and service monitor designed to inform you of network and end system problems
- Provides simple ping availability of resources on the network
- Works with a set of plugins to provide local and remote host service status
- Custom plugins are relatively easy to develop
- Web-based monitoring system
- Platform: Linux, UNIX
82Monitoring Software - Nagios Sample 1-
Service Overview For All Host Groups
Service Status Details For All Hosts
83Monitoring Software - Nagios Sample 2-
Network Map For All Hosts
Event log
84MRTG (Multi-Router Traffic Grapher)
- Overview
- Monitors the load of network equipment using SNMP, mainly used for creation of traffic graphs
- Excellent graphing tool developed by Tobias Oetiker
- Plots a graph of any two variables against time; graphs are rendered in PNG format on an HTML page
- Able to create scripts to feed data into MRTG
- Implements data collection, image, web-page collection
- Very widely deployed in large networks and still being actively developed
- Platform: UNIX system / Windows NT
- Supports SNMPv2, able to read 64bit counters
- http://people.ee.ethz.ch/~oetiker/webtools/mrtg/
85MRTG - Workflow -
- Green area typically represents incoming maximum bits per second
- Blue line typically represents outgoing maximum bits per second
- Workflow
- Read configuration file
- Collect graphing data from network equipment, based on configuration
- Update database file and generate graph
- If required, generate HTML file
- MRTG performs the above workflow, then exits
- Since MRTG collects data of the past 5 minutes
(default value of source code), it is desirable
to set crontab for every 5 minutes
86MRTG - Data Storage -
- Data Storage
- Keeps 5 minute data only for 2.5 days. The data is thrown away afterward.
- There is no referring to historical data with high resolution
- Keeps 1-day data for approx. 2 years
Interval     Num of records   Storage period   Graph
5 minutes    600              2.5 days         Daily
30 minutes   600              12.5 days        Weekly
2 hours      600              50 days          Monthly
1 day        731              2 years          Yearly
[Figure: Daily graph/5min, Weekly graph/30min, Monthly graph/2hours, Yearly graph/1day, with rougher resolution toward the yearly graph]
87MRTG - Configuration 1 -
- MRTG Configuration
- cfgmaker
- Helps to create the configuration file
- Example
cfgmaker --global "WorkDir: /home/httpd/html/mrtg" \
  --global "Options[_]: bits,growright" \
  --output /home/httpd/html/mrtg/cfg/mrtg.cfg \
  community@router.domain.name
- Graph log data: /home/httpd/html/mrtg
- Configuration file: /home/httpd/html/mrtg/cfg/mrtg.cfg
- Option: unit is bits (bps), horizontal axis grows to the right
- Detailed information
- http://people.ee.ethz.ch/~oetiker/webtools/mrtg/cfgmaker.html
88MRTG - Configuration 2 -
- Target Configuration
- Target Expression
- Target[<target name>]: <target kind>:<community>@<address>
- <target name> Identify equipment
- <target kind> Measurement item
- <community> SNMP community string
- <address> Hostname or IP address of equipment
- SNMP data collection specification method
- Basic / Port (ifIndex)
- Target[myrouter]: 2:public@wellfleet-fddi.ethz.ch
- Explicit OIDs / MIB Variables
- Target[myrouter]: 1.3.6.1.2.1.2.2.1.14.1&1.3.6.1.2.1.2.2.1.20.1:public@myrouter
- Target[myrouter]: ifInErrors.1&ifOutErrors.1:public@myrouter
- You can use cfgmaker to generate references with the option --ifref=?
- --ifref=ip Interface by IP
- --ifref=descr Interface by Description
- --ifref=name Interface by Name
- --ifref=eth Interface by Ethernet Address
89MRTG - Configuration 3 -
Target[la]: ifHCInOctets\so-2/0/0&ifHCOutOctets\so-2/0/0:xxxxxxx@tpr2.jp.apan.net:::::2
MaxBytes[la]: 300000000
Title[la]: Traffic Analysis of TransPAC LA Link
PageTop[la]: <H1>Traffic Analysis of TransPAC LA link</H1>
WithPeak[la]: ymw
Directory[la]: tpr2
Options[la]: bits, growright
Target[la-err]: ifInErrors\so-2/0/0&ifOutErrors\so-2/0/0:xxxxxxx@tpr2.jp.apan.net
MaxBytes[la-err]: 300000000
Title[la-err]: Packet Error for TransPAC LA link
PageTop[la-err]: <H1>Packet Error for TransPAC LA link</H1>
Directory[la-err]: tpr2
Options[la-err]: growright, integer, nopercent
YLegend[la-err]: Number of Error Packets
ShortLegend[la-err]: n
Legend1[la-err]: Number of Error Packets for Incoming Traffic
Legend2[la-err]: Number of Error Packets for Outgoing Traffic
Legend3[la-err]: Peak of Number of Error Packets for Incoming Traffic
Legend4[la-err]: Peak of Number of Error Packets for Outgoing Traffic
LegendI[la-err]: &nbsp;In:
LegendO[la-err]: &nbsp;Out:
WithPeak[la-err]: w
90MRTG - Comments -
- Comments / Disadvantages
- If you are to monitor a lot of devices (1000s), it is better to have a fast disk
- If using external monitoring scripts, a fast processor and a lot of memory is necessary
- Not particularly fast when compared to other data retrieval and storage schemes (Flat text files can slow down processing.)
- MRTG can't customize graphing periods
- Flat text files are difficult to process when scripting against the data
- Use 64bit counters with SNMPv2 for OC3-OC192 speed interfaces and GbE; traffic of about 115Mbps can wrap a 32bit counter in 5 minutes
- MRTG can't modify collected data which is summarized
- Only two variables are available in processing a graph
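The figure of roughly 115Mbps for wrapping a 32bit counter in 5 minutes can be checked with a short calculation. This sketch assumes the counter counts octets (bytes), as the interface octet counters do; the function names are illustrative.

```python
def wrap_rate_bps(counter_bits: int, interval_seconds: float) -> float:
    """Sustained rate (bits per second) that wraps an octet counter once per interval."""
    return (2 ** counter_bits) * 8 / interval_seconds


def seconds_to_wrap(counter_bits: int, rate_bps: float) -> float:
    """Time for an octet counter to wrap at a given sustained rate."""
    return (2 ** counter_bits) * 8 / rate_bps


print(f"32-bit counter, 5 minute polling: {wrap_rate_bps(32, 300) / 1e6:.1f} Mbps")
print(f"32-bit counter at 1 Gbps: wraps in {seconds_to_wrap(32, 1e9):.1f} s")
```

At gigabit rates a 32bit counter wraps well inside one polling interval, so the poller cannot tell how many times it wrapped; a 64bit counter removes the problem in practice.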
91RRDtool (Round Robin Database Tool)
- Overview
- Successor to MRTG
- Developed by the same developer as MRTG, Tobias Oetiker
- The RRD tool group can flexibly define data items, time interval, data amount, graph depiction, etc.
- Binary file format that can store data at any interval for any length of time
- File does not grow in size over time
- Ability to make custom graphs across user-defined intervals
- Ability to graph multiple variables on a single graph
- Additional scripts are necessary in creating graphs and web-page
- 25-30 percent faster than MRTG
- Does not have the function to collect data
- http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/
92RRDtool - Architecture -
- Comparison of architecture between MRTG and RRD
[Figure: MRTG side: router, SNMP engine, log (text), Graph, Index. RRD side: router, server, Frontend Program, RRD, Frontend Program, Graph, Index]
93RRDtool - Basic Usage -
- Basic usage of RRD tools
- (1) Set up new Round Robin Database (RRD)
- Define RRD used as vessel of data
- Command: rrdtool create filename
- (2) Store new set of values into RRD periodically
- Write the data collected by frontend program in RRD
- Command: rrdtool update filename
- (3) Generate Graph
- Create graph from data stored in one or several RRDs
- Command: rrdtool graph filename (specify the graph name to generate)
[Figure: data is stored into the RRD in steps (1) and (2); the Graph is produced from the RRD in step (3)]
94RRDtool - Practice -
- Example
- Object
- Gigabit Ethernet Switch
- Definition
- Definition of RRD record
- Ability to describe peak graph from data of 1-day to 10-years
Interval Num of records Storage Period Graph
1 minute 360 6 hours 4 hours
5 minutes 576 2 days Daily
2 hours 600 50 days Monthly
1 day 731 2 years Yearly
4 days 915 10 years 10 years
95RRDtool - Create -
- Set up a new Round Robin Database (RRD)
- DS: Define the data item
- COUNTER: continuously increasing counters
- 60: if no new data is supplied for more than 60 sec, it is considered as unknown
- 0: minimum acceptable value (byte)
- 125000000: maximum acceptable value (byte)
- RRA (Round Robin Archive): Define the data consolidations
- AVERAGE/MAX: average / maximum of consolidated data
- 0.5: the part of a consolidation interval that may be made up from UNKNOWN data while the consolidated value is still regarded as known
- Average 50%, MAX 20% or 10%
- 1: how many primary data points are used to build a consolidated data point, which then goes into the archive
- 360: how many generations of data values are kept in RRA
Command Example
/usr/local/rrdtool-1.0.46/bin/rrdtool create \
/home/httpd/html/traffic/traffic_vlan.rrd \
--step 60 \
DS:vlan2in:COUNTER:60:0:125000000 \
DS:vlan2out:COUNTER:60:0:125000000 \
DS:vlan7in:COUNTER:60:0:125000000 \
DS:vlan7out:COUNTER:60:0:125000000 \
RRA:AVERAGE:0.5:1:360 \
RRA:AVERAGE:0.5:5:576 \
RRA:AVERAGE:0.5:120:600 \
RRA:AVERAGE:0.5:1440:731 \
RRA:AVERAGE:0.5:5760:915 \
RRA:MAX:0.2:5:576 \
RRA:MAX:0.1:120:600 \
RRA:MAX:0.1:1440:731 \
RRA:MAX:0.1:5760:915
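How long each archive keeps data follows from the step and the last two RRA fields. The sketch below assumes rrdtool's colon-separated form `RRA:CF:xff:steps:rows` and a step of 60 seconds; `rra_retention` is a hypothetical helper written for this example, not an rrdtool command.

```python
def rra_retention(step_seconds: int, rra: str):
    """Return (CF, resolution in seconds, retention in seconds) for one RRA definition."""
    kind, cf, _xff, steps, rows = rra.split(":")
    if kind != "RRA":
        raise ValueError("expected a definition of the form RRA:CF:xff:steps:rows")
    resolution = step_seconds * int(steps)  # seconds covered by one stored value
    return cf, resolution, resolution * int(rows)


for definition in ("RRA:AVERAGE:0.5:1:360", "RRA:AVERAGE:0.5:5:576",
                   "RRA:AVERAGE:0.5:120:600", "RRA:AVERAGE:0.5:1440:731",
                   "RRA:AVERAGE:0.5:5760:915"):
    cf, resolution, retention = rra_retention(60, definition)
    print(f"{definition}: one value per {resolution} s, kept for {retention / 86400:.2f} days")
```

With a 60 second step these five archives give 6 hours, 2 days, 50 days, 731 days and 3660 days, which matches the storage periods in the table of the previous slide.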
96RRDtool - Update -
- Stores a new set of values into RRD periodically
- Data collection
- Collect the data from targets using frontend program
- Original tool
- Cricket - http://cricket.sourceforge.net/
- Orca - http://www.orcaware.com/orca/
- SNAPP - http://sourceforge.net/projects/snapp/
- Updating an RRD
- Feed collected data into a RRD database using
following commands
Command Example
rrdtool update /home/httpd/html/traffic/traffic_vlan.rrd \
--template inout N112221
- The name of the RRD you want to update.
- N: Update time is set to be the current time
- DS1, DS2: the data sources as defined in the RRD
97RRDtool - Graph 1 -
Command Example
rrdtool graph /home/httpd/html/traffic/traffic.png -s -4h -w 800 -h 800 -a PNG \
-t "VLAN Traffic" -v "bit/s" \
DEF:vlan2in_ave=/home/httpd/html/traffic/traffic_vlan.rrd:vlan2in:AVERAGE \
DEF:vlan2out_ave=/home/httpd/html/traffic/traffic_vlan.rrd:vlan2out:AVERAGE \
DEF:vlan7in_ave=/home/httpd/html/traffic/traffic_vlan.rrd:vlan7in:AVERAGE \
DEF:vlan7out_ave=/home/httpd/html/traffic/traffic_vlan.rrd:vlan7out:AVERAGE \
CDEF:vlan2in_ave_bit=vlan2in_ave,8,* \
CDEF:vlan7in_ave_bit=vlan7in_ave,8,* \
CDEF:vlan2out_ave_bit=vlan2out_ave,-8,* \
CDEF:vlan7out_ave_bit=vlan7out_ave,-8,* \
AREA:vlan2in_ave_bit#ff5e5e:VLAN2-in \
STACK:vlan7in_ave_bit#5eff5e:VLAN7-in \
AREA:vlan2out_ave_bit#aa0101:VLAN2-out \
STACK:vlan7out_ave_bit#0101aa:VLAN7-out
Options
- -s start time (default: seconds)
- -e end time (default: seconds)
- -w, -h width and height in pixels
- -a image format (GIF|PNG)
- -t graph title
- -v vertical-label text
98RRDtool - Graph 2 -
- Generating a Graph -2-
- DEF
- Define virtual name for data source
- DEF:<vname>=<RRDfilename>:<DS-name>:<CF>
- CF: consolidation function
- select AVERAGE, MAX, MIN, LAST (newest data)
- CDEF
- Create new virtual data source by evaluating mathematical expression
- CDEF:<vname>=<rpn-expression> (Reverse Polish Notation)
- Graph depiction parameter
- <Style>:<vname>#<color>:<legend>
- LINE: Plot for the requested data, using the color specified
- AREA: Area between 0 line and the graph line will be filled with the color specified
- STACK: Graph gets stacked on top of the previous LINE, AREA, or STACK graph
- By updating graph generation periodically using crontab, you can see updated graphs on the Web
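Reverse Polish Notation places the operator after its operands, so a CDEF such as `vlan2in_ave,8,*` multiplies the data source by 8 (bytes to bits). The following is a small evaluator sketch covering only the four arithmetic operators on comma-separated tokens; it is an illustration of the notation, not rrdtool's implementation, and `eval_rpn` is a hypothetical name.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(expression: str, variables: dict) -> float:
    """Evaluate a comma-separated RPN expression using a stack."""
    stack = []
    for token in expression.split(","):
        if token in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        elif token in variables:
            stack.append(float(variables[token]))  # a virtual name such as a DEF
        else:
            stack.append(float(token))  # a numeric constant, e.g. 8 or -8
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack[0]


print(eval_rpn("vlan2in_ave,8,*", {"vlan2in_ave": 1000}))
```

Multiplying the outgoing series by -8 instead of 8 is what draws outgoing traffic below the zero line in the graph.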
99RRDtool - Sample -
http://mrtg.jp.apan.net/cricket/router-interfaces/
100Netflow - Overview -
- Overview
- Enables IP traffic flow analysis without probes
- Invented and patented by Cisco
- Juniper (called cflowd), Foundry and many other vendors support it
- Flow cache data on routers is exported to a flow tool, so that traffic flow can be analyzed
- flow Definition
- Source IP address
- Destination IP address
- Source port
- Destination port
- Layer 3 protocol type
- TOS byte (DSCP)
- Input logical interface (ifIndex)
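The flow definition amounts to grouping packets by these seven fields and counting packets and bytes per group. The sketch below shows that grouping in Python; the packet dictionaries, field names and the `aggregate_flows` function are assumptions made for illustration and do not reflect any router's internal flow cache.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The seven fields that identify a flow."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int
    tos: int
    input_ifindex: int


def aggregate_flows(packets):
    """Group packets by flow key and count packets and bytes per flow."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for packet in packets:
        key = FlowKey(*(packet[field] for field in FlowKey._fields))
        flows[key]["packets"] += 1
        flows[key]["bytes"] += packet["length"]
    return dict(flows)
```

Two packets belong to the same flow only if all seven fields match, so a reply in the opposite direction is counted as a separate flow.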
[Figure: Traffic crossing the Core Network with NetFlow enabled; UDP NetFlow Export Packets sent to a Collector (Solaris, HP-UX, or Linux) with an Application GUI]
101Netflow - Flow Data -
- Flow data export
- Enable NetFlow on the router
- There is a difference in architecture between Cisco and Juniper routers
- Take care that the load of the router does not become high!
- Check CPU, memory, bandwidth, sampling rate
- Flow data collection & Analysis
- Prepare the software for receiving flow-export data
- flow-tools http://www.splintered.net/sw/flow-tools/
- cflowd http://www.caida.org/tools/measurement/cflowd/
- Cisco NetflowCollector
- Analyze traffic from raw data with software
- flow-scan http://net.doit.wisc.edu/~plonka/FlowScan/
- (If you want to graph the analysis data, I recommend using RRDtool)
- Cisco CiscoWorks
- Source and destination IP address
- Source and destination TCP/UDP ports
- Packet and byte counts
- Routing information (next-hop address, source
autonomous system (AS) number, destination AS
number, source prefix mask, destination prefix
mask)
102Netflow - Example -