Title: Network-Wide Security Analysis
1. Network-Wide Security Analysis
- Anukool Lakhina, with Mark Crovella and Christophe Diot
ZISC NetSec 2005, Nov 1, 2005
2. Motivating Problems
[Figure: network diagram with nodes IPLS, SNVA, and LA, annotated with three attack scenarios]
- Insider attack (worm scan)
- Coordinated, stealth attack
- Distributed attack
- Working hypothesis: diagnosis and situational awareness require a network-wide approach
- Simultaneous analysis of traffic on all links
3. The Problem of Distributed Attacks
[Figure: network diagram with nodes NYC, IPLS, LA, ATLA, and HSTN, and a victim network]
- Detection easy at egress
  - Attack stands out visibly
- Mitigation hard
  - Exhausted bandwidth
  - Need upstream providers' cooperation
  - Spoofed sources
4. Power of Network-Wide Analysis
Peak rate: 300 Mbps. Attack rate: 19 Mbps/flow.
[Figure label: IPLS]
DDOS detected at the ingress → containment possible
5. Data for Network-Wide Analysis
Collect sampled NetFlow data from all routers of:
1. Abilene (Internet2): 11 PoPs, 1/100 sampling, 5 min
2. Géant (Dante): 22 PoPs, 1/1000 sampling, 10 min
3. Sprint-Europe: 13 PoPs, 1/250 sampling, 10 min
Data view we study: fuse routing and traffic data to construct network-wide point-to-point demands (i.e., the traffic matrices); other views are possible.
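As a rough illustration of how sampled flow records could be aggregated into a traffic matrix, here is a minimal Python sketch. It assumes the routing step has already been done, i.e. each record already carries its ingress and egress PoP; the record layout `(timestamp, ingress, egress, sampled_bytes)`, the function name `build_traffic_matrix`, and the parameters are hypothetical and not taken from the slides.

```python
import numpy as np

def build_traffic_matrix(records, pops, bin_seconds=300, inverse_sampling_rate=100):
    """Aggregate sampled flow records into a (time bins x OD pairs) byte matrix.

    records: iterable of (timestamp_sec, ingress_pop, egress_pop, sampled_bytes).
             Assumption: egress PoP was already resolved from routing data.
    pops:    list of PoP names; every ordered pair becomes one column.
    """
    records = list(records)
    # One column per ordered (ingress PoP, egress PoP) pair.
    od_index = {}
    for origin in pops:
        for dest in pops:
            od_index[(origin, dest)] = len(od_index)

    start = min(r[0] for r in records)
    end = max(r[0] for r in records)
    n_bins = int((end - start) // bin_seconds) + 1

    X = np.zeros((n_bins, len(od_index)))
    for ts, ingress, egress, sampled_bytes in records:
        row = int((ts - start) // bin_seconds)
        # Scale sampled bytes back up by the inverse of the sampling rate.
        X[row, od_index[(ingress, egress)]] += sampled_bytes * inverse_sampling_rate
    return X, od_index

# Toy usage with made-up records (1/100 sampling, 5-minute bins).
toy_records = [(0, "IPLS", "LA", 10), (100, "IPLS", "LA", 5), (400, "NYC", "LA", 7)]
X, od = build_traffic_matrix(toy_records, ["IPLS", "LA", "NYC"])
```

Each row of `X` is then the traffic vector of all point-to-point flows in one time bin, which is the input the later slides operate on.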
6. But, This is Difficult!
How do we extract anomalies and normal behavior
from noisy, high-dimensional data in a
systematic manner?
7. The Subspace Method [LCD SIGCOMM04]
- An approach to separate normal and anomalous network-wide traffic
- Designate temporal patterns most common to all traffic flows as the normal subspace
- Remaining temporal patterns form the anomalous subspace
- Then, decompose traffic in all flows by projecting onto the two subspaces to obtain:
  $\mathbf{x} = \hat{\mathbf{x}} + \tilde{\mathbf{x}}$
  where $\mathbf{x}$ is the traffic vector of all flows at a particular point in time, $\hat{\mathbf{x}}$ is the normal traffic vector, and $\tilde{\mathbf{x}}$ is the residual traffic vector
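The decomposition described on this slide can be sketched with a few lines of NumPy. This is only an illustration under assumptions: the normal subspace is taken to be the span of the top `r` principal components of the mean-centered traffic matrix, `r` is chosen by hand, and the synthetic data generator is made up for the demo. The function names are hypothetical.

```python
import numpy as np

def subspace_split(X, r):
    """Split each row of X (time bins x flows) into normal and residual parts.

    Assumption: the normal subspace is spanned by the top-r principal
    directions of the mean-centered data.
    """
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:r].T                      # flows x r basis of the normal subspace
    normal = Xc @ P @ P.T             # projection onto the normal subspace
    residual = Xc - normal            # what is left lies in the anomalous subspace
    return normal, residual

def squared_prediction_error(residual):
    """Squared length of the residual vector in every time bin."""
    return np.sum(residual ** 2, axis=1)

def make_synthetic_traffic(t=200, m=8, spike_bin=120, spike_flow=3, seed=0):
    """Made-up data: one shared periodic pattern, small noise, one injected spike."""
    rng = np.random.default_rng(seed)
    pattern = np.sin(2 * np.pi * np.arange(t) / 50.0)
    weights = rng.uniform(1.0, 2.0, size=m)
    X = 10.0 * np.outer(pattern, weights) + rng.normal(0.0, 0.1, size=(t, m))
    X[spike_bin, spike_flow] += 5.0
    return X

X = make_synthetic_traffic()
normal, residual = subspace_split(X, r=1)
spe = squared_prediction_error(residual)
print("time bin with largest residual:", int(np.argmax(spe)))
```

In this toy setup the injected spike is small compared with the shared periodic pattern, yet it dominates the residual because the periodic pattern is absorbed by the normal subspace.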
8. A Geometric Illustration
In general, anomalous traffic results in a large size of the residual vector $\tilde{\mathbf{x}}$.
For higher dimensions, use Principal Component Analysis [LPC SIGMETRICS04].
[Figure: traffic on Flow 1 versus traffic on Flow 2]
9. Subspace Detection Thresholds
- Capture the size of the residual vector using the squared prediction error (SPE)
- Assuming Gaussian data, we can find bounds that the SPE should exceed only $1-\alpha$ of the time
- Result due to Jackson and Mudholkar, 1979
[Figure: traffic on Flow 1 versus traffic on Flow 2]
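For readers who want to see how such a bound might be computed, below is a hedged Python sketch of the Jackson and Mudholkar approximation (often called the Q-statistic). It assumes the inputs are the covariance eigenvalues of the principal components left out of the normal subspace, and that the data are Gaussian as stated on the slide. The function name and the `confidence` parameter are illustrative choices, not names from the slides.

```python
from statistics import NormalDist
import numpy as np

def q_statistic_threshold(residual_eigenvalues, confidence=0.999):
    """Approximate SPE threshold following Jackson and Mudholkar (1979).

    residual_eigenvalues: covariance eigenvalues of the components that span
                          the anomalous (residual) subspace.
    confidence:           probability that the SPE of normal traffic stays
                          below the returned threshold.
    """
    lam = np.asarray(residual_eigenvalues, dtype=float)
    phi1 = np.sum(lam)
    phi2 = np.sum(lam ** 2)
    phi3 = np.sum(lam ** 3)
    h0 = 1.0 - (2.0 * phi1 * phi3) / (3.0 * phi2 ** 2)
    # Standard normal quantile for the requested confidence level.
    c = NormalDist().inv_cdf(confidence)
    term = (c * np.sqrt(2.0 * phi2 * h0 ** 2) / phi1
            + 1.0
            + phi2 * h0 * (h0 - 1.0) / phi1 ** 2)
    return float(phi1 * term ** (1.0 / h0))

# Made-up eigenvalues for illustration only.
threshold = q_statistic_threshold([4.0, 2.0, 1.0], confidence=0.99)
```

A time bin would then be flagged when its SPE exceeds `threshold`. Raising `confidence` raises the threshold and lowers the expected false alarm rate.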
10. An example malicious anomaly
No dominant source IP. Dominant destination IP: 80% of P and 92% of F traffic. Cause: DOS attack.
11. An example operational anomaly
12. Summary of Anomaly Types Found [LCD IMC04]
[Figure: breakdown of anomalies found by type: False Alarms, Unknown, Traffic Shift, Outage, Worm, Point-Multipoint, Alpha, Overloads, DOS, Scans]
13. Automatically Classifying Anomalies [LCD SIGCOMM05]
- Goal: classify security and operational anomalies without being restricted to a predefined set of anomalies
- Approach: leverage 4-tuple header fields
- SrcIP, SrcPort, DstIP, DstPort
- In particular, measure dispersion in features
- Then, apply off-the-shelf clustering methods
14. Traffic Feature Distributions
15. Feature Entropy for Classification
Port scan dwarfed in volume metrics.
But stands out in feature entropy, which reveals structure.
[Figure: plot panels labeled Bytes, Packets, H(Dst IP), and H(DstPort)]
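To make the entropy idea concrete, here is a small Python sketch that computes the empirical (sample) entropy of one header field. The toy port scan is made up: a single destination IP probed on 64 distinct ports. The helper name `sample_entropy` is an illustrative choice.

```python
import math
from collections import Counter

def sample_entropy(values):
    """Empirical entropy, in bits, of the observed values of one feature."""
    counts = Counter(values)
    total = sum(counts.values())
    # Sum of p * log2(1/p) over the distinct observed values.
    return sum((n / total) * math.log2(total / n) for n in counts.values())

# Toy port scan: one destination host, 64 distinct destination ports.
dst_ips = ["10.0.0.1"] * 64
dst_ports = list(range(1, 65))

h_dst_ip = sample_entropy(dst_ips)      # concentrated feature, low entropy
h_dst_port = sample_entropy(dst_ports)  # dispersed feature, high entropy
```

The scan contributes few bytes and packets, but it drives the destination port entropy up and the destination IP entropy down, which is the kind of structure the slide refers to.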
16. Example of Clustering Attacks
[Figure: scatter plot with axes labeled (SrcIP) and (DstIP), each running between Concentrated and Dispersed]
Legend: Code Red scan; single-source bandwidth attack; multi-source coordinated attack
Summary: correctly classified 292 of 296 injected anomalies
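A clustering step of this kind could look like the sketch below. It is not the authors' implementation: it uses a plain k-means with a deterministic farthest-point initialization as a stand-in for an off-the-shelf method, and the three synthetic groups in a two-dimensional (SrcIP entropy, DstIP entropy) space are invented for illustration. Where each attack type actually falls in that space is an assumption here.

```python
import numpy as np

def kmeans(points, k, iters=50):
    """Plain k-means with farthest-point initialization (illustrative stand-in)."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen center.
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers, dtype=float)

    for _ in range(iters):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic anomalies in normalized (SrcIP entropy, DstIP entropy) space.
rng = np.random.default_rng(0)
group_a = rng.normal([0.1, 0.9], 0.03, size=(20, 2))  # concentrated source, dispersed destination
group_b = rng.normal([0.1, 0.1], 0.03, size=(20, 2))  # concentrated source and destination
group_c = rng.normal([0.9, 0.1], 0.03, size=(20, 2))  # dispersed source, concentrated destination
points = np.vstack([group_a, group_b, group_c])

labels, centers = kmeans(points, k=3)
```

Because the groups are well separated in entropy space, each synthetic group ends up in its own cluster without any per-attack rule.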
17. Anomaly Clusters in Abilene data
Insights: clusters 3 and 4: different types of scanning; cluster 7: NAT box?
18. Our Work in Context
- Previous work largely focused on:
  - Point solutions (not a general approach)
  - Rule-based classification (not unsupervised)
  - Data from single links (not network-wide)
19. Summary
- Network-Wide Detection
  - Broad range of anomalies with low false alarms
  - In papers: highly sensitive detection, even when anomaly is 1% of background traffic
- Anomaly Classification
  - Feature clusters automatically classify anomalies
  - In papers: clusters expose new anomalies
- Network-wide data and feature analysis are promising tools for general anomaly diagnosis
20. More Information
- For more, please see our papers and slides at:
  - http://cs-people.bu.edu/anukool/pubs.html
- Ongoing work: implementing algorithms in a prototype system
- Feedback and cooperation welcome!
  - Suggestions, data, deployment, ...
21. Thanks!
- Data from the Abilene Observatory
- Rick Summerhill, Mark Fullmer, Matthew Davy
- Help with Géant data
- Richard Gass and Gianluca Iannaccone
- Injected Anomaly Data
- Alefiya Hussain for DOS traces
- Dave Andersen and Jaeyeon Jung for worm traces