Title: Diagnosing Anomalies with Network-Wide Analysis
1. Diagnosing Anomalies with Network-Wide Analysis
- Anukool Lakhina, Mark Crovella, Christophe Diot
2. Network Anomaly Diagnosis
- Am I being attacked?
- Is someone scanning my network?
- Are there worms spreading?
- A sudden traffic shift?
- An equipment outage?
- Something never seen before?
A general, unsupervised method for reliably detecting and classifying network anomalies is needed.
3. My Talk in One Slide
- A general system to detect and classify anomalies at ISPs and enterprises
- Central message: network-wide analysis of network data can expose many anomalies
- Analyze readily available SNMP and NetFlow data
- Expose both operational and malicious incidents
4. Motivating Problems
Distributed DOS Attack
Large Traffic Shifts (Operational Event)
Worm scan observable in network-wide traffic
5. Example: Problem of Distributed Attacks
[Figure: network map with PoPs NYC, LA, ATLA and the victim network]
- Continue to become more prevalent [CERT04]
- Financial incentives for attackers, e.g., extortion
- Increasing in sophistication: worm-compromised hosts and botnets are massively distributed
6. Today: Detect at Edge
[Figure: network map with PoPs NYC, LA, ATLA, HSTN and the victim network]
- Detection easy
- Anomaly stands out visibly
- Mitigation hard
- Exhausted bandwidth
- Need upstream providers' cooperation
- Spoofed sources
7. Power of Network-Wide Analysis: Detect at Core
Peak rate: 300 Mbps. Attack rate: 19 Mbps/flow.
[Figure: network map; labeled PoP: IPLS]
Distributed attacks are easier to detect at the ingress
8. A Need for Network-Wide Management
- Effective diagnosis of attacks requires a whole-network approach
- Simultaneously inspecting traffic on all flows
- Useful in many contexts
- Managing traffic in enterprise networks
- Worm propagation, insider misuse, operational problems
9. Talk Outline
- Measuring Network-Wide Traffic
- Detecting Network-Wide Anomalies
- Beyond Volume Detection: Traffic Features
- Automatic Classification of Anomalies
- Summary
10. Origin-Destination Traffic Flows
- Traffic entering the network at the origin and leaving the network at the destination (i.e., the traffic matrix)
- Use routing (IGP, BGP) data to aggregate NetFlow traffic into OD flows
- Massive reduction in data collection
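The aggregation step on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the lookup tables `INGRESS_POP` and `EGRESS_POP`, the record field names, and the router and prefix values are all hypothetical, and the sketch assumes each flow record's destination has already been resolved to a routing prefix.

```python
from collections import defaultdict

# Hypothetical lookup tables; in practice they would be derived from IGP/BGP data.
INGRESS_POP = {"rtr-nyc-1": "NYC", "rtr-la-1": "LA"}          # ingress router -> origin PoP
EGRESS_POP = {"10.1.0.0/16": "ATLA", "10.2.0.0/16": "HSTN"}   # destination prefix -> egress PoP


def aggregate_od_flows(records, bin_seconds=300):
    """Sum bytes per (time bin, origin PoP, destination PoP)."""
    od_bytes = defaultdict(int)
    for rec in records:
        origin = INGRESS_POP.get(rec["ingress_router"])
        dest = EGRESS_POP.get(rec["dst_prefix"])
        if origin is None or dest is None:
            continue  # record cannot be mapped to an OD pair; skip it
        time_bin = rec["timestamp"] // bin_seconds
        od_bytes[(time_bin, origin, dest)] += rec["bytes"]
    return dict(od_bytes)


# Made-up flow records, for illustration only.
records = [
    {"timestamp": 10, "ingress_router": "rtr-nyc-1", "dst_prefix": "10.1.0.0/16", "bytes": 1500},
    {"timestamp": 200, "ingress_router": "rtr-nyc-1", "dst_prefix": "10.1.0.0/16", "bytes": 500},
    {"timestamp": 400, "ingress_router": "rtr-la-1", "dst_prefix": "10.2.0.0/16", "bytes": 800},
    {"timestamp": 20, "ingress_router": "unknown", "dst_prefix": "10.1.0.0/16", "bytes": 999},
]
od_flows = aggregate_od_flows(records)
```

Each key of `od_flows` is one cell of the traffic matrix for one time bin, which is where the reduction in data volume comes from: many flow records collapse into one number per OD pair per bin.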
11. Networks Evaluated
- Abilene research network (Internet2)
- 11 PoPs, 121 OD flows, anonymized, 1/100 sampling, 5 min bins
- Géant Europe research network
- 22 PoPs, 484 OD flows, not anonymized, 1/1000 sampling, 10 min bins
- Sprint European commercial network
- 13 PoPs, 169 OD flows, not anonymized, aggregated, 1/250 sampling, 10 min bins
12. But, This is Difficult!
How do we extract anomalies and normal behavior
from noisy, high-dimensional data in a
systematic manner?
13. Turning High Dimensionality into a Strength
- Traditional traffic anomaly diagnosis builds normality in time
- Methods exploit temporal correlation
- Whole-network view is an attempt to examine normality in space
- Make use of spatial correlation
- Useful for anomaly diagnosis
- Strong trends exhibited throughout network are likely to be normal
- Anomalies break relationships between traffic measures
14. The Subspace Method [LCD, SIGCOMM 04]
- An approach to separate normal and anomalous network-wide traffic
- Designate temporal patterns most common to all the OD flows as the normal subspace
- Remaining temporal patterns form the anomalous subspace
- Then, decompose traffic in all OD flows by projecting onto the two subspaces to obtain:
$\mathbf{x} = \hat{\mathbf{x}} + \tilde{\mathbf{x}}$
where $\mathbf{x}$ is the traffic vector of all OD flows at a particular point in time, $\hat{\mathbf{x}}$ is the normal traffic vector, and $\tilde{\mathbf{x}}$ is the residual traffic vector
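A minimal sketch of this decomposition, using PCA via NumPy's SVD, might look like the following. It is an illustration under assumptions, not the authors' implementation: the function name `subspace_decompose`, the choice of `r = 1` normal components, and the synthetic traffic matrix (a shared daily pattern plus noise, with one injected spike) are all made up for the example.

```python
import numpy as np


def subspace_decompose(X, r):
    """Split each row of X (time bins x OD flows) into normal and residual parts.

    r is the number of top principal components taken as the normal subspace
    (how to choose r is left open here; it is an assumption of this sketch).
    """
    Xc = X - X.mean(axis=0)                       # center each OD flow
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:r].T                                  # (flows x r) basis of the normal subspace
    normal = Xc @ P @ P.T                         # projection onto the normal subspace
    residual = Xc - normal                        # projection onto the anomalous subspace
    spe = np.sum(residual ** 2, axis=1)           # squared size of the residual per time bin
    return normal, residual, spe


# Synthetic data: 288 time bins, 8 OD flows sharing one daily pattern.
rng = np.random.default_rng(0)
t = np.arange(288)
diurnal = np.sin(2 * np.pi * t / 288)
weights = rng.uniform(50.0, 100.0, size=8)
X = 200.0 + np.outer(diurnal, weights) + rng.normal(0.0, 1.0, size=(288, 8))
X[100, 3] += 40.0                                 # injected volume anomaly in one flow

normal, residual, spe = subspace_decompose(X, r=1)
print("time bin with largest residual:", int(np.argmax(spe)))
```

The shared daily trend lands in the normal subspace because it is the dominant pattern across all flows, while the spike in a single flow breaks that pattern and shows up in the residual.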
15. The Subspace Method, Geometrically
In general, anomalous traffic results in a large size of the residual traffic vector. For higher dimensions, use Principal Component Analysis [LPC, SIGMETRICS 04].
[Figure: 2-D scatter plot; axes: Traffic on Flow 1, Traffic on Flow 2]
16. Subspace Method Detection
- Error Bounds on Squared Prediction Error
- Assuming Normal Errors
- Jackson and Mudholkar, 1979
- Full details in our paper [LCD, SIGCOMM 04]
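For readers who want to see the bound concretely, here is a hedged sketch of the Jackson and Mudholkar threshold on the squared prediction error (the Q-statistic). The function name `q_threshold` and the default confidence level are assumptions of this sketch; the input is the list of covariance eigenvalues belonging to the residual subspace (the variances along the principal axes that were not placed in the normal subspace).

```python
from statistics import NormalDist


def q_threshold(residual_eigenvalues, confidence=0.999):
    """Jackson-Mudholkar threshold for the squared prediction error.

    residual_eigenvalues: variances along the residual principal axes.
    confidence: 1 - alpha; the default here is an arbitrary choice for the sketch.
    """
    phi1 = sum(lam for lam in residual_eigenvalues)
    phi2 = sum(lam ** 2 for lam in residual_eigenvalues)
    phi3 = sum(lam ** 3 for lam in residual_eigenvalues)
    h0 = 1.0 - (2.0 * phi1 * phi3) / (3.0 * phi2 ** 2)
    c_alpha = NormalDist().inv_cdf(confidence)    # standard normal percentile
    term = (
        c_alpha * (2.0 * phi2 * h0 ** 2) ** 0.5 / phi1
        + 1.0
        + phi2 * h0 * (h0 - 1.0) / phi1 ** 2
    )
    return phi1 * term ** (1.0 / h0)


# Three residual axes with unit variance: the SPE is then chi-squared with
# 3 degrees of freedom, so the result should sit near that distribution's
# 95th percentile.
threshold_95 = q_threshold([1.0, 1.0, 1.0], confidence=0.95)
```

A time bin would be flagged as anomalous when its squared prediction error exceeds this threshold. If the eigenvalues come from an SVD of a centered matrix with t rows, each eigenvalue is the squared singular value divided by t - 1.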
17. An example malicious anomaly
No dominant source IP. Dominant dest. IP: 80% of P and 92% of F traffic. Cause: DOS attack
18. An Operational Anomaly
19. Summary of Anomaly Types Found [LCD, IMC04]
[Figure: breakdown of anomalies by type: False Alarms, Unknown, Traffic Shift, Outage, Worm, Point-Multipoint, Alpha, Overloads, DOS, Scans]
20. Automatically Classifying Anomalies [LCD, SIGCOMM05]
- Goal: Classify anomalies without restricting yourself to a predefined set of anomalies
- Approach: Leverage 4-tuple header fields
- SrcIP, SrcPort, DstIP, DstPort
- In particular, measure dispersion in features
- Then, apply off-the-shelf clustering methods
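One common way to measure dispersion in a feature is sample entropy, which the later slides denote H(DstIP), H(DstPort), and so on. The sketch below is an illustration only: the function name `sample_entropy` and the toy port-scan packets are assumptions, not data from the talk.

```python
import math
from collections import Counter


def sample_entropy(values):
    """Sample entropy (in bits) of the empirical distribution of values.

    Near 0 when traffic is concentrated on one value; log2(N) when it is
    spread evenly over N distinct values.
    """
    counts = Counter(values)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy port scan: one destination host probed on 1024 distinct ports.
dst_ports = list(range(1, 1025))
dst_ips = ["10.0.0.5"] * 1024

h_dst_port = sample_entropy(dst_ports)   # dispersed feature -> high entropy
h_dst_ip = sample_entropy(dst_ips)       # concentrated feature -> zero entropy
```

Computing this per time bin for each of the four header fields gives a small vector per OD flow that describes how concentrated or dispersed the traffic is, independent of its volume.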
21. Traffic Feature Distributions
22. Feature Entropy for Classification
[Figure: time series of Bytes, Packets, H(DstIP), and H(DstPort)]
Port scan dwarfed in volume metrics
But stands out in feature entropy, which reveals structure
23. Clustering Known Anomalies (2-D view)
[Figure: two scatter plots, "Known Labels" and "Cluster Results"; axes (SrcIP) and (DstIP), each running from concentrated to dispersed]
Legend: Code Red scanning, single-source DOS attack, multi-source DOS attack
Summary: Correctly classified 292 of 296 injected anomalies
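To illustrate how anomalies can separate in this 2-D view, here is a small k-means sketch in plain Python. It is not the clustering setup used in the papers: the points are hypothetical, normalized (SrcIP entropy, DstIP entropy) pairs invented for the example, and the farthest-point initialization is an assumption chosen to keep the result repeatable.

```python
import math


def assign(points, centers):
    """Index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j])) for p in points]


def kmeans(points, k, iters=20):
    # Farthest-point initialization: deterministic, so the sketch is repeatable.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = assign(points, centers)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return assign(points, centers), centers


# Hypothetical (SrcIP entropy, DstIP entropy) pairs, scaled to [0, 1].
points = [
    (0.10, 0.90), (0.15, 0.85), (0.12, 0.95),   # scanning: few sources, many destinations
    (0.10, 0.10), (0.05, 0.15), (0.12, 0.08),   # single-source DOS: both concentrated
    (0.90, 0.10), (0.85, 0.12), (0.95, 0.05),   # multi-source DOS: many sources, one victim
]
labels, centers = kmeans(points, k=3)
```

In this toy data each anomaly type occupies its own corner of the entropy plane, so the three clusters recover the three types without any labels being supplied.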
24. Example of Anomaly Clusters
[Figure: scatter plot of anomaly clusters; axes (SrcIP) and (DstIP), each running from concentrated to dispersed]
Legend: Code Red scanning, single-source DOS attack, multi-source DOS attack
Summary: Correctly classified 292 of 296 injected anomalies
25. Summary
- Network-Wide Detection
- Broad range of anomalies with low false alarms
- In papers: highly sensitive detection, even when anomaly is 1% of background traffic
- Anomaly Classification
- Feature clusters automatically classify anomalies
- In papers: clusters expose new anomalies
- Network-wide data and feature analysis are promising tools for general anomaly diagnosis
26. Overview of System We Are Building
- Top level functionality
- Data Collection and Processing
- Anomaly Diagnosis
- Data Inspection
- Query Builder
- Web-based interface (AJAX-driven)
- Multi-user system
27. More information
- For more information, see papers and slides at
- http://cs-people.bu.edu/anukool/pubs.html
- Ongoing work: implementing algorithms in a prototype system
- Your feedback and cooperation appreciated!
- Comments, data, deployment
28. Screenshot Slides coming soon
29. Backup slides
30. Previous Work on Anomaly Detection
- Largely focused on
- Point solutions
- not a general approach
- Rule-based classification
- not unsupervised
- Data from single links
- not network-wide
31. Automatic Diagnosis of a DOS Attack
Anomaly Detection: Anomaly detected in packet traffic of se1 to fr1 OD flow
Anomaly Classification: DDOS attack. Flooding attack across dispersed destination ports, and concentrated on single victim IP 193.54.168.72 (univ-paris8.fr)
32. Automatic Diagnosis of a Network Scan
Anomaly Detection: Anomaly detected in entropy traffic of at1 to le1 OD flow; not visible in bytes or packets
Anomaly Classification: Network scan across dispersed destinations for single TCP port 6129 (used by the Dameware remote administration software, known to be vulnerable and often used by viruses).
33. Abilene Clusters Reveal New Anomalies
Insights: 3 and 4: different types of scanning. 7: NAT box?