Title: Challenges in High Performance Network Monitoring
1Challenges in High Performance Network Monitoring
- How to monitor networks that become faster and faster
- Fulvio Risso (fulvio.risso_at_polito.it)
- http://netgroup.polito.it/fulvio.risso/
2Outline
- Introduction
- What is Network Monitoring
- Why you need Network Monitoring
- What to monitor
- Technologies
- How to get data
- Active Network Monitoring
- Ping, traceroute, pathchar, RIPE TT
- Passive Network Monitoring
- Polling, event reporting
- Sniffing, SNMP, RMON, Flow-based technologies
- Challenges in High Speed Networks
- Speed
- Information overload (e.g. storage)
3What is Network Monitoring
- Network monitoring relates to the
- observation and the analysis of the
- status and behaviour
- of the following managed objects
- network devices
- end systems
- network links
- network traffic
- network applications
4Why Network Monitoring?
- Network statistics (for optimization and planning)
- Network mapping/inventory
- Security
- Troubleshooting
- Accounting
5Why you need Network Monitoring (1)
- Network statistics (for optimization and planning)
- Network monitoring
- Traffic statistics (bandwidth usage, service usage, traffic distribution, e.g. local vs. remote)
- Network optimization and hardening (to achieve responsiveness to change and growth)
- Bottlenecks
- Throughput
- Network mapping/inventory
- Identification of routers and servers (DNS, ...)
- Mapping client characteristics (opened ports, ...)
6Why you need Network Monitoring (2)
- Security
- Identifying unofficial services or servers
- Detection of network security violations
- Intrusion Detection
- Compromised Hosts
- Protecting your network from the world
- Troubleshooting
- Faulty Hardware
- (No) Connectivity
- Resource and service availability
- Accounting
- Keep logs of users' activities
7What to monitor?
- Traffic
- Measurements
- When you already know what to measure
- E.g. get the amount of IP traffic
- Generic monitors
- When you do not know exactly what to measure
- E.g. get the distribution of the network-layer protocols
- Traffic characterization
- When you want to create a model (mathematical, maybe?) of the traffic
- E.g. extract some valuable data from the current traffic
- Probes
- When you want to probe your network
- Availability (links, network resources, services, etc.)
- Events and Alerts (e.g. traffic thresholds)
8Example: ntop
- Ntop is a simple, open source (GPL), portable
traffic measurement and monitoring tool, which
supports various management activities, including
network optimization and planning and detection
of security violations
9What ntop does (1)
- Traffic Measurement
- Data sent/received: volume and packets, classified according to network/IP protocol
- Multicast Traffic
- TCP Session History
- Bandwidth Measurement and Analysis
- Traffic Characterisation and Monitoring
- Network Flows
- Protocol utilisation (req, peaks/storms, positive/negative repl.) and distribution
- Network Traffic Matrix
- ARP, ICMP Monitoring
10What ntop does (2)
- Network Optimisation and Planning
- Passive network mapping/inventory: identification of Routers and Internet Servers (DNS, Proxy)
- Traffic Distribution (Local vs. Remote)
- Service Mapping: service usage (DNS, Routing)
- Anomaly Detection through some common traffic parameters
- ICMP ECHO request/response ratio
- ICMP Destination/Port Unreachable
- SYN Pkts vs. Active TCP Connections
- Suspicious packets (e.g. out of sequence)
- Fragments percentage
- Traffic from/to diagnostic ports
- TCP connections with no data exchanged
11What ntop does (3)
- TCP/IP Stack Verification
- Network mapping: improper TCP three-way handshaking (e.g. queso/nmap OS Detection)
- Portscan: stealth scanning, unexpected packets (e.g. SYN/FIN)
- DoS: synflood, invalid packets (ping of death, WinNuke), smurfing
- IDS/Firewall elusion: overlapping fragments, unexpected SYN/ACK (sequence guessing)
- Intruders: peak of RST packets
- Intrusion Detection
- Trojan Horses (e.g. traffic at known ports)
- Spoofing: Local (more MAC addresses match the same IP address) and Remote (TTL !)
- Network discovery (via ICMP, ARP)
- Viruses: host contacts in the last 5 minutes (warning: in this respect P2P apps behave as viruses/trojans!)
12Possible approaches to NM
- Active
- The system being monitored is probed periodically with some external signal
- Passive
- A probe (silently) collects data and infers some properties from it
13Active Network Monitoring
- Often based on specific traffic / packet patterns, generated specifically for monitoring purposes
- Usually ICMP packets
- Sometimes other probes (e.g. TCP connections)
- Used for
- Delay measurement
- One way, End-to-end
- Remote devices availability
- Services
- Examples
- RIPE Test Traffic Measurement Service
- PingER (Ping End-to-end Reporting) at Stanford University
- nmap
14Passive Network Monitoring
- The most widely used approach
- Preferred for its lack of intrusiveness
- Used for
- Traffic measurement, monitoring, characterization
- E.g. network traffic is examined to generate alerts or statistics
- E.g. full packet decoding (e.g. for troubleshooting)
- Status and parameters of network links, network devices, ...
- E.g. traffic load on interface, link-layer signals
- Available technologies
- Packet-based approach: Packet Sniffing
- Generic statistics and network status: SNMP
- Aggregate statistics approach: RMON
- Flow-based approach: NetFlow, sFlow, IPFIX
15Passive NM: packet-based approach
- Sniffing
- We want to capture exactly the frames that are being transferred on a wire or on some specific network segment
- Very detailed view (e.g. for debugging)
- May have limited knowledge of link-layer issues (e.g. Ethernet collisions, ...)
- Very large amount of data to be processed
- Privacy concerns
16Sniffing: architectural choices
Hardware-based Systems
- Fast
- Expensive (niche market)
- Difficult to move / duplicate
- ASIC cannot be reprogrammed / updated (FPGA can, but it is not very simple)
(Diagram axes: Performance, Versatility)
17Sniffing: where to capture traffic (1)
Old Ethernet
- Captures everything, even physical signals
- Precise timestamping
- Practical issues (you need an old Ethernet)
18Sniffing: where to capture traffic (2)
Switched Network
Mirror port (per port, per port group, per VLAN, ...)
- Captures all the traffic, even from several ports, even from remote locations (such as Cisco RSPAN)
- Requires a dedicated port on the switch
- May need faster interfaces (at least 2x for tx and rx)
- Timestamps not precise
- There may be problems in correlating traffic (which port originates this packet?)
- Unable to detect link-layer problems
Network device-based
- Captures all the traffic, even from several ports
- Precise timestamps
- Traffic correlation easier
- Requires a dedicated port on the device
- May need faster interfaces (at least 2x for tx and rx)
- Unable to detect link-layer problems
- Technology in the early stage, not widely supported
- Cisco Catalyst 9000 and some other proprietary examples
- RMON is hardly usable
- PSAMP is still ongoing
19What about sniffing in network devices?
- Difficult to get exactly the wanted packet trace
- SNMP does not allow packet capture
- RMON allows packet capture, but only within some standard templates
- E.g. poor filtering options
- Cisco NetFlow does not allow packet capture
- sFlow allows packet capture, but it cannot be customized and is not widely supported
- A new header contains the packet; however, key information is often missing (e.g. originating interface, ...)
- IETF PSAMP should be helpful
- Standardization rather slow (began in 2000)
- Requires ad-hoc hardware, otherwise resources are stolen from the router's main objective (forwarding and routing)
20How not to go to jail with Sniffing
- Ascertain compliance with regulatory procedures
- Check the regulation in your country
- You can use sniffing for
- National security
- To prevent or detect crime
- To prevent or detect unauthorised use
- To ensure effective systems operation
- You have to make sure that
- The identity of the sender/receiver cannot be inferred from the captured data
- Addresses masquerading
- Aggregate data
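One possible way to masquerade addresses is to replace each IP address with a stable pseudonym before the trace is stored. The sketch below is only an illustration, not a method described in the slides: it assumes IPv4 addresses and a site-specific secret key (`SECRET` is a hypothetical value), and it does not preserve network prefixes.

```python
import hashlib
import hmac
import ipaddress


def mask_ipv4(address: str, secret: bytes) -> str:
    """Map an IPv4 address to a stable pseudonym using a keyed hash."""
    packed = ipaddress.IPv4Address(address).packed
    digest = hmac.new(secret, packed, hashlib.sha256).digest()
    # Keep the first 4 bytes of the digest as the pseudonymous address.
    return str(ipaddress.IPv4Address(digest[:4]))


SECRET = b"hypothetical-site-key"  # assumption: kept private by the operator

masked_a = mask_ipv4("192.0.2.10", SECRET)
masked_a_again = mask_ipv4("192.0.2.10", SECRET)
masked_b = mask_ipv4("192.0.2.11", SECRET)
```

The same input always yields the same pseudonym, so per-host statistics can still be aggregated, while the real address cannot be read directly from the stored data.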
21Passive NM: the SNMP approach
- Allows retrieving generic statistics, network status, ...
- Not widely used for network configuration (although supported)
- Defines a mechanism for remote management of network devices (routers, bridges, etc.)
- Fundamental principle: all device management is done by simple variable value manipulation
- Approach
- standard means for specifying quantities recognized by devices
- protocol for requesting, returning, notifying of changes of values
22Architecture of SNMP components
- An SNMP network consists of three main components
- Managed Devices
- Agents
- Network Management Systems (NMS)
- The managed device is a node in the SNMP network and it contains the SNMP agent
- The NMS makes a virtual connection to the SNMP agent
- The agent serves the information regarding the network status to the NMS
23Components of the SNMP world
- Protocol for exchanging data between Agents and Management Entity (ME)
- SNMP
- Definition of the objects that can be read / modified
- Must be known on both sides (Agents and ME)
- MIB
- Syntax used to specify the Management Information Base
- SMIv2
24Structure of Management Information (SMIv2)
- SMIv2 defines the rules for creating MIBs and it is based on simple typed variables
- SMIv2 is based on an extended subset of ASN.1 (1988)
- Characteristics of the variables defined by SMI
- Each variable has an ASN.1 datatype
- INTEGER, OCTET STRING, OBJECT IDENTIFIER, NULL, ...
- It does not implement complex data structures and operations on the variables
- Variables are either scalars (exactly one instance) or columns in a conceptual two-dimensional table (zero or several instances)
25Management Information Base (1)
- "The set of managed objects within a system,
together with their attributes, constitutes that
system's management information base." (ISO
7498-4)
- MIB is created using the SMIv2 syntax
- MIB is controlled by the SNMP agent
- The information in the MIB is organized hierarchically
- MIB consists of managed objects
- Managed objects are identified by two names
- Object Name
- Object Identifier
26Management Information Base (2)
- Variables recognized by the device are supplied in the MIB (Management Information Base)
- text file giving variables and data structures, defined using ASN.1
- standard variable sets often provided as RFCs
- device-specific sets provided by vendors
- Management stations parse MIBs to determine variables available for management
- obtain both data structure and management information
- Example
    -- the Interfaces group
    ifNumber OBJECT-TYPE
        SYNTAX  INTEGER
        ACCESS  read-only
        STATUS  mandatory
        DESCRIPTION
            "The number of network interfaces present on this system."
        ::= { interfaces 1 }
27ASN.1 Object Identifiers
- Variables are identified by globally unique strings of digits
- Example
- 1.3.6.1.4.1.3.5.1.1
- name space is hierarchical
- in the above, 1 stands for iso, 3 stands for org, 6 stands for dod, 1 stands for internet, 4 stands for private, etc.
- Variable names are aliases for digit strings (within the MIB)
- Example
- ifNumber ::= { interfaces 1 }
- interfaces was previously defined in the MIB as 1.3.6.1.2.1.2, so
- ifNumber = 1.3.6.1.2.1.2.1
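The aliasing between names and digit strings can be illustrated with a small lookup table. This is only a sketch: `KNOWN_NODES` and `resolve` are hypothetical names, and a real management station would build the table by parsing MIB files rather than hard-coding it.

```python
# Numeric arcs for a few well-known nodes of the OID tree.
KNOWN_NODES = {
    "iso": (1,),
    "org": (1, 3),
    "dod": (1, 3, 6),
    "internet": (1, 3, 6, 1),
    "mgmt": (1, 3, 6, 1, 2),
    "mib-2": (1, 3, 6, 1, 2, 1),
    "interfaces": (1, 3, 6, 1, 2, 1, 2),
}


def resolve(parent: str, sub_id: int) -> str:
    """Expand a definition such as { interfaces 1 } into a dotted OID."""
    arcs = KNOWN_NODES[parent] + (sub_id,)
    return ".".join(str(arc) for arc in arcs)


if_number_oid = resolve("interfaces", 1)  # ifNumber ::= { interfaces 1 }
```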
28SNMP Message Encoding
- Encode message as byte stream using ASN.1 BER (Abstract Syntax Notation 1, Basic Encoding Rules)
- Quantities encoded as Type, Length, Value triples
- Types
- Subset of basic ASN.1 types used in SNMP: integer, octet string, object identifier (variable name), sequence
- SNMP-defined types: gauge, counter, IP address, etc.
- Values
- weirdly encoded!! (see ASN.1 specs)
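To make the Type, Length, Value idea concrete, here is a minimal sketch of BER encoding for two of the basic types, INTEGER (tag 0x02) and OBJECT IDENTIFIER (tag 0x06). It assumes values short enough for the single-byte "short form" length (under 128 bytes); the function names are illustrative, not taken from any SNMP library.

```python
def ber_encode_integer(value: int) -> bytes:
    """Encode an INTEGER as tag 0x02, length, minimal two's-complement value."""
    length = 1
    while True:
        try:
            body = value.to_bytes(length, "big", signed=True)
            break
        except OverflowError:
            length += 1  # value does not fit yet, try one more byte
    return bytes([0x02, len(body)]) + body


def ber_encode_oid(oid: str) -> bytes:
    """Encode a dotted OID as tag 0x06, length, packed arcs."""
    arcs = [int(arc) for arc in oid.split(".")]
    # The first two arcs share one byte: 40 * first + second.
    body = bytearray([40 * arcs[0] + arcs[1]])
    for arc in arcs[2:]:
        # Each further arc is written base-128, high bit set on all but the last byte.
        chunk = [arc & 0x7F]
        arc >>= 7
        while arc:
            chunk.append((arc & 0x7F) | 0x80)
            arc >>= 7
        body.extend(reversed(chunk))
    return bytes([0x06, len(body)]) + bytes(body)


encoded_int = ber_encode_integer(128)
encoded_oid = ber_encode_oid("1.3.6.1.2.1.2.1")
```

The packing of the first two arcs and the base-128 continuation bytes are examples of why the values look "weirdly encoded" on the wire.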
29SNMPv1 Protocol
(Diagram: the Manager sends a Get to the Agent; the Agent replies with a Response)
- Get can be used for reading one or more variables
30SNMP and Network Monitoring
- The information gathered using SNMP can be used for network monitoring: data values can be captured and derived from properly targeted and formatted traps
- E.g. packet arrival and departure rates, packet drop rates, packet error rates, system load, modem availability, etc.
- Examples of network monitoring tools
- MRTG
- HP OpenView (not only monitoring)
- MRTG uses the data collected from SNMP agents to generate graphical representations of it in almost real time
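Rates such as the ones listed above are typically derived from two successive readings of an SNMP counter. The sketch below assumes a 32-bit counter (for instance an octet counter such as ifInOctets) that wraps at most once between two polls; the function names and sample values are illustrative.

```python
COUNTER32_MODULUS = 2 ** 32


def counter_rate(previous: int, current: int, interval_s: float) -> float:
    """Average increments per second between two polls of a 32-bit counter."""
    # The modulo handles a single wrap-around of the counter between polls.
    delta = (current - previous) % COUNTER32_MODULUS
    return delta / interval_s


def link_utilisation(prev_octets: int, curr_octets: int,
                     interval_s: float, link_bps: float) -> float:
    """Fraction of the link capacity used during the polling interval."""
    bits_per_second = counter_rate(prev_octets, curr_octets, interval_s) * 8
    return bits_per_second / link_bps


# Two polls 300 s apart on a 10 Mbps link.
utilisation = link_utilisation(0, 37_500_000, 300, 10_000_000)
```

On fast links a 32-bit octet counter can wrap more than once within a polling interval, in which case this calculation silently underestimates the rate.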
31Some SNMP Issues
- Often, the most valuable data is exported only through proprietary MIBs
- Often, units are different (Kbps for one vendor, bps for another, ...)
- Difficult to manage a multivendor network
- Cannot add a new MIB within an agent
- Cannot customize the variables which are needed to monitor the network
- The opposite (adding a new MIB in the Management Station) is pretty simple
32Passive NM: RMON
- Defines a remote network monitoring MIB
- Is an addition to the basic set of SNMP standards
- Why RMON?
- With MIB-II the network manager can obtain information that is purely local to the individual devices
- What about information pertaining to traffic on the LAN as a whole?
- Collision domain concept
- Features
- Is used to passively monitor data transmitted over LAN segments
- Provides interoperability between SNMP-based management consoles and remote monitors
33RMON Goals
- Off-line operation
- RMON MIB allows a probe to be configured to perform diagnostics even in the absence of communication with the management station
- Proactive monitoring
- A monitor can continuously run diagnostics and log network performance. In the event of a failure, the monitor can supply this information to the management station
- Problem detection and reporting
- The monitor can be configured to recognize error conditions, continuously check for them and notify the management station in the event of one
- Value added data
- A remote monitoring device can add value to the data it collects by highlighting those hosts that generate the most traffic or errors
- Multiple Managers
- An organization can have multiple management stations for different units. The monitor can be configured to deal with more than one management station concurrently
- Not all implementations fulfill all these goals
34RMON-1 MIB (RFC 1757, RFC 1513) (1)
- Statistics (1)
- Contains utilisation and error statistics for the Ethernet and Token Ring network segments. It shows packets, collisions, octets, broadcasts, multicasts, errors, and keeps track of packet size distribution (< 64, 64-1518, > 1518 octets)
- History (2)
- Periodically copies the values from the Statistics group into a circular buffer
- Alarm (3)
- Monitors MIB instances against threshold values, based on the ASN.1 datatype INTEGER. An alarm (SNMP Trap) is produced when a threshold is exceeded
- Host (4)
- Maintains the association of IP, MAC addresses, bytes sent/received (and more) for the observed traffic
35RMON-1 MIB (RFC 1757, RFC 1513) (2)
- hostTopN (5)
- Analyzes (i.e. sorts) the data entered in the Host group
- Matrix (6)
- Contains data about communication relations, which are defined by pairs of MAC addresses. Useful for what-if analysis, and for detecting intruders
- Filter (7)
- Used to select individual packets. A filter expression (bit patterns only) assigns packets to a channel. The channel determines whether the packet is only counted or whether an event is produced on packet receipt
- Capture (8)
- Provides a scratchpad memory that stores all the packets received by a channel
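The relationship between the Host, hostTopN and Matrix groups can be sketched with plain counters. This is a conceptual illustration only, not the RMON MIB layout: the packet tuples, the shortened MAC addresses and the function name `host_top_n` are all made up for the example.

```python
from collections import Counter

# (source MAC, destination MAC, length in octets) for a few observed frames.
observed_frames = [
    ("00:aa", "00:bb", 1500),
    ("00:aa", "00:cc", 64),
    ("00:bb", "00:aa", 512),
    ("00:aa", "00:bb", 1500),
]

host_out_octets = Counter()   # Host group: per-host counters
matrix_octets = Counter()     # Matrix group: per (source, destination) pair

for src, dst, length in observed_frames:
    host_out_octets[src] += length
    matrix_octets[(src, dst)] += length


def host_top_n(host_table: Counter, n: int):
    """hostTopN: the n hosts with the highest counter value, sorted."""
    return host_table.most_common(n)


top_talker = host_top_n(host_out_octets, 1)
```

This also shows why hostTopN depends on the Host group: it only sorts data that the Host group has already collected.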
36RMON-1 MIB (RFC 1757, RFC 1513) (3)
- Event (9)
- The Event group regulates the handling of internal events: it defines the various events that cause SNMPv1 traps to be sent to management applications or entries to be stored in a log
- tokenRing (10)
- Historical
- All the groups in the RMON MIB are optional
- There are some dependencies
- The Alarm group requires the implementation of the Event group
- The HostTopN group requires the implementation of the Host group
- The packet Capture group requires the implementation of the Filter group
37RMONv1 vs. RMONv2
- RMONv1 has been designed for low-level protocols, below IP
- RMONv2 has been designed to monitor higher-layer protocols
- RMONv2 extends RMONv1 by adding nine new groups
38RMON-2 MIB (RFC 2021, RFC 2074) (1)
- Protocol directory group
- Describes the protocols detected by the probe, including the protocol parameters (e.g. UDP port numbers). All protocols above the network layer are supported (e.g. http, ftp)
- Protocol distribution group
- Produces basic statistics for selected protocols (number of bytes, number of packets)
- Address mapping group
- Provides a mapping of the MAC addresses (seen by the probe) to network addresses
- Network layer host group
- Provides statistics for the network layer, classified according to network addresses
- Network layer matrix group
- Supplies statistics for communication relations (host communications matrix) at the network level
39RMON-2 MIB (RFC 2021, RFC 2074) (2)
- Application layer host group
- Provides statistics for an application layer protocol, classified according to network addresses
- Application layer matrix group
- Is similar to the Network layer matrix group, except that in this case statistics are calculated for an application layer protocol
- User history group
- Permits an automatic generation of statistics, stored into so-called buckets. The number of available buckets is configurable
- Probe configuration group
- Enables the configuration of the probe and covers, among other things
- Configuration of serial access (modems)
- IP network configuration
- Configuration of serial connections (SLIP) for Trap delivery
- Configuration of parameters for Trap delivery
40RMONv2 Time Filter
- A table can contain a very large number of values
- E.g. traffic from each host to any other host on the network
- Retrieving the whole table can be expensive
- The TimeFilter allows getting only the values that changed after time T (specified in the GET operation)
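The idea behind the TimeFilter can be sketched as a table whose rows remember when they last changed. This is a conceptual illustration, not the RMON-2 indexing scheme: the table contents and the function name `time_filtered` are hypothetical, and treating a change exactly at time T as "changed" is an assumption.

```python
def time_filtered(table: dict, time_mark: int) -> dict:
    """Return only the rows whose value changed at or after time_mark."""
    return {
        key: value
        for key, (value, last_changed) in table.items()
        if last_changed >= time_mark
    }


# (source, destination) -> (octets, time of last change)
traffic_matrix = {
    ("hostA", "hostB"): (1200, 100),
    ("hostA", "hostC"): (300, 250),
    ("hostB", "hostC"): (9000, 400),
}

# The manager polled at time 200, so it asks only for later changes.
recent_rows = time_filtered(traffic_matrix, 200)
```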
41Some RMON Issues
- Implementation of RMON agents and management stations is very complex
- RMON is usually done through ad-hoc blades in high-end network devices
- Customizability
- Cannot add new features to the existing MIBs
- Often, users need just some simple functions, but they are forced to buy expensive equipment to get them, although most of the features are useless in their view
- Not widely used
42Passive NM: Flow-based approaches
- Most of the data transfer in a data network involves some transport-layer protocol
- TCP, UDP
- The flow-based approach analyzes transport-layer sessions, and uses this data as the basis for the network monitor
- Flow information
- IP source, destination
- Transport protocol
- Port source, destination
- Additional fields, not strictly related to the session
- E.g. IP flags, ...
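A flow table keyed on the fields listed above can be sketched as a dictionary. The packet tuples and the names `FlowKey` and `build_flow_table` are illustrative assumptions; real exporters keep more fields per flow (timestamps, flags, and so on).

```python
from collections import namedtuple

FlowKey = namedtuple("FlowKey", "src_ip dst_ip protocol src_port dst_port")


def build_flow_table(packets):
    """Aggregate packets into per-flow packet and byte counters."""
    table = {}
    for src_ip, dst_ip, protocol, src_port, dst_port, length in packets:
        key = FlowKey(src_ip, dst_ip, protocol, src_port, dst_port)
        stats = table.setdefault(key, {"packets": 0, "bytes": 0})
        stats["packets"] += 1
        stats["bytes"] += length
    return table


packets = [
    ("10.0.0.1", "10.0.0.2", "TCP", 40000, 80, 60),
    ("10.0.0.1", "10.0.0.2", "TCP", 40000, 80, 1500),
    ("10.0.0.2", "10.0.0.1", "TCP", 80, 40000, 1500),
    ("10.0.0.3", "10.0.0.2", "UDP", 5353, 53, 80),
]
flow_table = build_flow_table(packets)
```

With this key the two directions of a TCP connection become two separate unidirectional flows.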
43Most common architecture
- Exporter
- Captures packets, processes them and creates a flow table internally
- The flow table is (partially) exported to the collector periodically
- Exporting modes depend on the technology involved
- Very high requirements in terms of CPU and memory
- Collector
- Minimal processing requirements
- Problems may arise if the flow table must be saved for future reference (e.g. in a database)
(Diagram: the Exporter sends its Flow Table to the Collector)
44Flow-based NM characteristics
- Advantages
- Reduces the amount of information to process (flow information is smaller than packet information)
- More scalable
- Problems
- Cannot deal with some of the aspects related to the packet level
- E.g. ICMP probes, routing protocols, ...
- Most important technologies
- Cisco NetFlow
- Uses data (partially) available for CEF (Cisco Express Forwarding)
- IETF IPFIX
- sFlow
45Cisco NetFlow
- Open standard for network traffic measurement defined by Cisco Systems
- By far, the most used technology
- Very small interaction between collector and exporter
- SNMP may be used to configure the probe and (occasionally) to get data back
- Data is exported by means of a UDP stream, with proper headers
- Packet sampling in order to decrease the processing load
46Exporting Flows
- Flows are exported to collector when
- the flow ends (e.g. a TCP packet with the FIN or RST bits)
- the flow has been inactive for a certain period of time, i.e. no packets belonging to it have been observed for a given timeout (usually 15 sec)
- the flow is still active, but a given timeout (usually 30 min) has expired; this is useful for exporting long-lasting flows on a regular basis
- the probe experiences internal constraints (e.g. counters wrapping or low memory); in this case, a flow may be forced to expire prematurely
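The four export conditions can be written down as a single decision function. This is a sketch under stated assumptions, not the logic of any specific exporter: the timeouts are the "usual" values quoted on the slide, and `FlowRecord`, `export_reason` and the order in which the conditions are checked are illustrative choices.

```python
from dataclasses import dataclass

TCP_FIN = 0x01
TCP_RST = 0x04
INACTIVE_TIMEOUT_S = 15       # "usually 15 sec"
ACTIVE_TIMEOUT_S = 30 * 60    # "usually 30 min"


@dataclass
class FlowRecord:
    first_seen: float   # timestamp of the first packet, in seconds
    last_seen: float    # timestamp of the most recent packet, in seconds
    tcp_flags: int = 0  # OR of all the TCP flags seen so far


def export_reason(flow: FlowRecord, now: float, low_memory: bool = False):
    """Return why the flow should be exported now, or None to keep it."""
    if flow.tcp_flags & (TCP_FIN | TCP_RST):
        return "ended"
    if now - flow.last_seen >= INACTIVE_TIMEOUT_S:
        return "inactive"
    if now - flow.first_seen >= ACTIVE_TIMEOUT_S:
        return "active-timeout"
    if low_memory:
        return "forced"
    return None


reason = export_reason(FlowRecord(first_seen=0, last_seen=5), now=25)
```

Because a long-lived flow can be exported several times, the collector has to merge the partial records belonging to the same flow.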
47NetFlow problems
- Different methods for exporting a flow
- Makes processing harder
- Flow records span several bins
- The concept of bins is not well defined in NetFlow (at least, bins are 30 min)
- The collector cannot know, at time T, which flows have been seen, because some active flows may not have been exported (yet)
- Targeted for TCP/IP networks only
- No support for link-layer headers
- Impossible to add new information (e.g. protocol fields) to the exported flow record
- Packet Sampling
- Unsuitable for some kinds of applications
48IETF IPFIX
- IP Flow Information Export
- Basically, NetFlow with the IETF stamp
- Limited differences
- Transport protocol (SCTP; optionally TCP or UDP)
- Limited customizability of the fields that are exported within each flow record (e.g. MPLS label, BGP Autonomous System, ...)
49Realtime Traffic Flow Measurement
- IETF Working Group (RTFM)
- Proposal is more advanced than NetFlow
- Simple Ruleset Language
- Provides a way to customize
- flow definition (which can be a generic group of packets with some common characteristics, e.g. the packets from source A to destination B)
- action (byte count, and more)
- Flows are bidirectional
- makes it easier to check the two directions of a connection
- Interaction between probe and collector is done through SNMP queries
- Probe must store flow records in memory until the collector asks for them
- Not supported in commercial devices
- Only the public-domain NeTraMet tool
50sFlow
- Packet Sampling (like Cisco NetFlow)
- Can export either
- Sampled packets (although limited to the first few hundred bytes)
- Flow information
- Excellent technology, but not supported by Cisco
51Scalability of the proposed approaches
- SNMP and RMON show excellent scalability properties
- But they usually work on traffic aggregates
- RMON may need to compute more precise statistics (e.g. traffic sent by each host, or traffic matrix)
- Flow-based and Packet-based are the most critical technologies from this point of view
- So, let's investigate how to mitigate the problems of flow-based and packet-based technologies
(Diagram: scalability axis with SNMP, RMON, Flow-based, Packet-based)
52Challenges in High Speed Networks
- Speed
- How to capture all the data flowing in a multi-gigabit pipe
- Information overload
- How to deal with the tremendous amount of information coming out of a multi-gigabit pipe
- These issues are orthogonal
53How to support ever increasing speeds
- The path to High Speed Network Analysis includes two steps
- Improving raw performance
- Improving components for packet processing
- We need to
- Increase performance in packet capture
- In this way, we are able to deliver all the data to the processing engine
- In general, applications that need to process information contained in network packets
- Create smarter processing engines
- In this way, we can deal with the tremendous amount of processing required for extracting useful information from data
54Reference Model for Packet Capture
55Hardware Interface
- Objective: transfer data from network to RAM
- Bottlenecks
- Interrupt
- Access to the hardware (e.g. setting values in the NIC registers)
- Solutions
- Interrupt Mitigation
- Hardware-based
- Interrupt Batching
- Software-based
- Device Polling
- E.g. FreeBSD (Rizzo)
- Hybrid models: Interrupt-Polling
56Livelock
57Capture Driver
- Goals
- Timestamp the packets
- Deliver packets to the application
- Bottlenecks
- Context Switch (10^4 clock cycles in Windows)
- Packet copies
- Solutions
- Packet filtering, snapshot capture (not always possible)
- Bulk copies
- Large buffers (may be useful if shared with the application)
58The path for raw speed (1)
- Step ONE: Optimize as much as you can
- Step TWO: Move intelligence into the kernel
- Decrease overhead when moving data around
- Not suitable for some applications like packet capture
- Remaining issues: interrupts and some kernel-related overheads
(Diagram, top to bottom. User Level: Processing, User code, user-buffer, Packet Capture Library. Kernel Level: Kernel Buffer, Other protocol stacks, packet filter, Network Tap, NIC Driver. Network: Packets)
59Let's give some numbers...
Source: Luca Deri, ntop.org
These numbers may be surprising
60 also about different Operating Systems
Source: Luca Deri, ntop.org
61Let's talk about performance...
- To increase performance we should
- Use hardware-based timestamps
- Avoid NIC driver and OS-related costs
- Avoid unnecessary copies (e.g. shared buffer)
- Current WinPcap 3.0 overhead: 3164 clock cycles
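To put the 3164 clock cycles into perspective, the per-packet cycle budget on a saturated Gigabit Ethernet link can be worked out. The sketch below assumes that the overhead figure is per captured packet, and assumes a 2.4 GHz CPU (`CPU_HZ` is a hypothetical value, not from the slides); the Ethernet figures are the standard 64-byte minimum frame plus 8 bytes of preamble and 12 bytes of inter-frame gap.

```python
GIGABIT_BPS = 1_000_000_000
MIN_FRAME_BYTES = 64
PREAMBLE_AND_GAP_BYTES = 8 + 12
CAPTURE_COST_CYCLES = 3164        # overhead quoted on the slide
CPU_HZ = 2_400_000_000            # assumption: 2.4 GHz CPU

# Worst case: the link is full of minimum-size frames.
max_wire_pps = GIGABIT_BPS / ((MIN_FRAME_BYTES + PREAMBLE_AND_GAP_BYTES) * 8)

# Cycles the CPU can spend on each packet to keep up with the wire.
cycle_budget = CPU_HZ / max_wire_pps

# Packets per second the capture path can sustain at the quoted cost.
sustainable_pps = CPU_HZ / CAPTURE_COST_CYCLES
```

Under these assumptions the budget is roughly 1600 cycles per packet, about half of the quoted capture overhead, so such a system could keep up with only about half of the worst-case packet rate.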
62The path for raw speed (2)
- Step THREE: Decouple capture stack from other network stacks
- Custom NIC driver
- No longer able to support other protocol stacks (e.g. TCP/IP on the interface)
- (Possibly) intrusive modification of the operating system
- Astonishing performance (for the pure SW solution)
- Capture on a Gigabit Ethernet
- PCI bottleneck
- Timestamp precision (if done in software)
- Processing: either in Kernel or User space (better)
(Diagram. User Level: User code, Packet Capture Library. Kernel Level: Custom NIC Driver (or) Smart NIC. Network: Packets)
63The path for raw speed (3)
- Step FOUR (a): Create smarter NICs
- Hardware processing
- Avoid PCI bus bottleneck (not applicable to capture-all applications)
- Timestamp precision
- Need advanced mechanism for customizable processing
- Step FOUR (b): Increase parallelism in user space
- PCI bottleneck
- Easy to customize processing (general purpose CPUs)
(Diagram: User code, Buffering, Packet Capture Library, Custom NIC, Processing, Packets)
64The two directions for further speed (1)
- Parallelizing tasks at user-level
- Simple to implement (it's just software)
- Easy to get some very powerful multiprocessor machine
- May have synchronization issues
- Especially for applications that require the results of some previous processing
- Bus limitations
- PCI 1.0 (32 bit, 33 MHz) -> 1 Gbps
- PCI 2.2 (64 bit, 66 MHz) -> 4.2 Gbps
- PCI-X (64 bit, 133 MHz) -> 8.5 Gbps
- PCI-X 2.0 (64 bit, 266 MHz) -> 17 Gbps
- PCI-Express (16x) -> 32 Gbps
- Complete lack of interest
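The bus figures above follow from simple arithmetic: a parallel bus moves one word of its width per clock cycle. The PCI-Express value is reproduced here under the assumption of first-generation links (2.5 GT/s per lane with 8b/10b encoding, per direction); the function names are illustrative.

```python
def parallel_bus_gbps(width_bits: int, clock_mhz: float) -> float:
    """Peak throughput of a parallel bus: width times clock rate."""
    return width_bits * clock_mhz * 1e6 / 1e9


def pcie_gen1_gbps(lanes: int) -> float:
    """Peak throughput per direction, assuming 2.5 GT/s lanes and 8b/10b coding."""
    return lanes * 2.5 * 8 / 10


peak_gbps = {
    "PCI 1.0": parallel_bus_gbps(32, 33),      # about 1.06
    "PCI 2.2": parallel_bus_gbps(64, 66),      # about 4.22
    "PCI-X": parallel_bus_gbps(64, 133),       # about 8.51
    "PCI-X 2.0": parallel_bus_gbps(64, 266),   # about 17.02
    "PCI-Express 16x": pcie_gen1_gbps(16),     # 32.0
}
```

These are theoretical peaks; arbitration and protocol overhead lower the throughput available to a capture card in practice.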
65The two directions for further speed (2)
Source: Loris Degioanni, PhD Thesis
66The two directions for further speed (3)
Implement some processing engine in hardware i.e.
in the lowest component possible (e.g. NIC)
67Possible solution for programmable hw
User code
NetVM
68High speed: other options?
- Sampling
- This is one of the most widely used techniques
- Not really a solution; it's more a trick
- Several studies demonstrate that the results obtained from sampled packets are nearly equivalent to the ones obtained from complete traces
- This is valid only for some applications (e.g. traffic statistics); others (e.g. intrusion detection) cannot rely on this technique
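The way traffic statistics are recovered from sampled packets can be sketched with deterministic 1-in-N sampling: each sampled packet is taken to stand for N packets. The trace below is synthetic and deliberately regular, so the estimate happens to be exact; on real traffic it is only an approximation, and deterministic sampling can be biased by periodic traffic patterns.

```python
def sample_every_nth(packet_sizes, n):
    """Keep one packet out of every n (deterministic sampling)."""
    return packet_sizes[::n]


def estimate_total_bytes(sampled_sizes, n):
    """Scale the sampled volume back up by the sampling rate."""
    return sum(sampled_sizes) * n


SAMPLING_RATE = 10
packet_sizes = [64, 576, 1500] * 400          # synthetic trace, 1200 packets

sampled = sample_every_nth(packet_sizes, SAMPLING_RATE)
estimated_bytes = estimate_total_bytes(sampled, SAMPLING_RATE)
actual_bytes = sum(packet_sizes)
```

An intrusion detection system gains nothing from this scaling step: a single attack packet that falls outside the sample is simply never seen.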
69Second problem: Information overload
- 10 Gbps (full duplex) pipe -> 216 TB of storage every day
- Assuming a 200 GB HD (Serial ATA) at 120, with 2.5 cm height
- -> 1080 hard disks/day, 130K, a wall of 27 meters (height)
- Problems in information overload
- Storage (the less important)
- At least for those who have money and space (government, military)
- Difficult to locate the wanted information
- This is a problem for rich and poor alike
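The storage figures can be reproduced with a few lines of arithmetic. The sketch assumes that the link is fully loaded in both directions for the whole day and that every byte is stored; the disk price is used as a plain number, in whatever currency the slide intended.

```python
LINK_BPS = 10e9            # 10 Gbps
DIRECTIONS = 2             # full duplex
SECONDS_PER_DAY = 86_400
DISK_BYTES = 200e9         # 200 GB disk
DISK_PRICE = 120           # price per disk, currency as on the slide
DISK_HEIGHT_CM = 2.5

bytes_per_day = LINK_BPS * DIRECTIONS / 8 * SECONDS_PER_DAY   # 216 TB
disks_per_day = bytes_per_day / DISK_BYTES                    # 1080 disks
cost_per_day = disks_per_day * DISK_PRICE                     # 129,600, i.e. about 130K
wall_height_m = disks_per_day * DISK_HEIGHT_CM / 100          # 27 m
```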
70Proposed solutions
- On-line methods
- Sampling
- Ineffective with application-layer data (e.g. determining the web pages seen by a given user)
- Flow-based techniques
- Aggregate statistics
- Processing of some predetermined indexes (e.g. number of sessions per second)
- Difficult to determine which indexes are really useful, and which ones must be computed
- Off-line methods
- Nothing new, but the type of processing can be decided at a later time
- There is no risk of losing data just because you realized too late that you need to extract some new information out of the packet dump
71Offline methods
- May be very useful for complex analysis (e.g. TCP flow reassembly)
(Diagram. On-line monitoring and analysis: Network card, Capture, On-line Processing, Dump results. Off-line analysis: Off-line Processing, Dump results)
72Data Mining techniques
- May be useful to extract relevant data and to discover useful relationships
- Still in a very early stage
73... and do not forget that...
... monitoring is a cost!
74Conclusions
- Although networks become faster and faster, not much interest can be seen in network monitoring
- At best, people seem to be happy with some simple solutions
- Sampling
- Aggregate statistics
- (Sampled) NetFlow records
- However, this means losing control of your network
- Among the most important problems to solve
- Speed: there are interesting results
- Endace (NZ), NetGroup at Politecnico di Torino (IT)
- Information overload
- This is a topic that requires more effort
75Questions?