Title: Diagnosing Network-Wide Traffic Anomalies
1 Diagnosing Network-Wide Traffic Anomalies
- Anukool Lakhina, Mark Crovella, Christophe Diot
2 Anomaly Diagnosis
- Is my network experiencing unusual conditions?
- e.g., DoS attacks, outages, misconfigurations
- Detection: Is there an unusual event?
- Identification: What is the best explanation?
- Quantification: How serious is the problem?
A general framework for Anomaly Diagnosis
3 Problem Statement
- A volume anomaly is a sudden change in an Origin-Destination (OD) flow (i.e., point-to-point traffic)
- Given link traffic measurements, diagnose the volume anomalies
4 An Illustration
- Detect the time of the anomaly
- Identify the source and destination
- Quantify the size of the anomaly
5 Data Collected
Abilene
Sprint-Europe
6 Typical Link Data
Abilene
Sprint-Europe
Finding common patterns (e.g. volume anomalies)
from such high-dimensional, noisy data is very
difficult
7 Low Intrinsic Dimensionality of Link Traffic
- Studied via Principal Component Analysis
- Key result
- Normal traffic is well approximated as occupying a low-dimensional subspace
- Reasons
- Links share OD flows
- Set of OD flows also low dimensional
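The low-dimensionality claim can be checked on toy data. The sketch below is not the authors' code: the routing table, the two synthetic OD flows, and the noise level are all made-up assumptions, and PCA is done by plain power iteration with deflation so that only the standard library is needed. It reports how much of the total link variance the first two principal components capture.

```python
import math
import random

def covariance(rows):
    """Sample covariance matrix of link traffic (rows = time points, columns = links)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(m)]
            for a in range(m)]

def top_eigenvalues(cov, k, iters=300):
    """Largest k eigenvalues of a symmetric matrix via power iteration with deflation."""
    m = len(cov)
    cov = [row[:] for row in cov]  # work on a copy
    eigenvalues = []
    for _component in range(k):
        v = [1.0 / (i + 1) for i in range(m)]  # deterministic start vector
        for _step in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(m)) for a in range(m)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[a] * cov[a][b] * v[b] for a in range(m) for b in range(m))
        eigenvalues.append(lam)
        # Deflate: remove the component just found before searching for the next one.
        for a in range(m):
            for b in range(m):
                cov[a][b] -= lam * v[a] * v[b]
    return eigenvalues

# Hypothetical network: 5 links carrying 2 OD flows (1 = flow crosses link).
random.seed(0)
routing = [[1, 0], [1, 1], [0, 1], [1, 1], [1, 0]]
rows = []
for t in range(200):
    od = [100 + 20 * math.sin(2 * math.pi * t / 24),
          50 + 10 * math.cos(2 * math.pi * t / 12)]
    rows.append([sum(a * x for a, x in zip(link, od)) + random.gauss(0, 0.5)
                 for link in routing])

cov = covariance(rows)
total_variance = sum(cov[i][i] for i in range(len(cov)))
captured = sum(top_eigenvalues(cov, k=2)) / total_variance
print(f"fraction of variance in first 2 components: {captured:.3f}")
```

Because every link's traffic is a sum of the same two OD flows plus small noise, almost all of the variance falls in two dimensions even though there are five links.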
8 The Subspace Method
- An approach to separate normal from anomalous traffic
- Normal Subspace $\mathcal{S}$: space spanned by the first k principal components
- Anomalous Subspace $\tilde{\mathcal{S}}$: space spanned by the remaining principal components
- Then, decompose traffic on all links by projecting onto $\mathcal{S}$ and $\tilde{\mathcal{S}}$ to obtain $\mathbf{y} = \hat{\mathbf{y}} + \tilde{\mathbf{y}}$
- $\mathbf{y}$: traffic vector of all links at a particular point in time
- $\hat{\mathbf{y}}$: normal traffic vector
- $\tilde{\mathbf{y}}$: residual traffic vector
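The decomposition on this slide can be sketched in a few lines. This is an illustration only: it assumes the principal components spanning the normal subspace have already been computed and are orthonormal, that the traffic vector is mean-centered, and the 4-link numbers are invented.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def split_traffic(y, normal_basis):
    """Split a mean-centered link traffic vector into (normal, residual) parts.

    normal_basis: orthonormal vectors spanning the normal subspace
    (assumed here to be the first k principal components).
    """
    normal = [0.0] * len(y)
    for p in normal_basis:
        coeff = dot(p, y)  # coordinate of y along this principal component
        normal = [n + coeff * pi for n, pi in zip(normal, p)]
    residual = [yi - ni for yi, ni in zip(y, normal)]
    return normal, residual

# Hypothetical 4-link network whose normal subspace is one direction (k = 1).
basis = [[0.5, 0.5, 0.5, 0.5]]
y = [2.0, 7.0, 7.0, 2.0]  # links 2 and 3 carry extra traffic
normal, residual = split_traffic(y, basis)
print(normal, residual)
```

The two parts always add back up to the original vector, and the residual is orthogonal to every vector in the normal basis.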
9 A Geometric Illustration
In general, anomalous traffic results in a large value of the residual traffic vector $\tilde{\mathbf{y}}$
[Figure: axes are Traffic on Link 1 and Traffic on Link 2]
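10 Detection
- Capture size of the residual vector $\tilde{\mathbf{y}}$ using squared prediction error, $\mathrm{SPE} = \|\tilde{\mathbf{y}}\|^2$
- Assuming Gaussian data, we can find a bound $\delta_\alpha^2$ that SPE should stay below at the $1-\alpha$ confidence level
- Result due to Jackson and Mudholkar, 1979
[Figure: axes are Traffic on Link 1 and Traffic on Link 2]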
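As a rough sketch of the detection test, the code below computes SPE and a threshold using the Jackson and Mudholkar approximation (often called the Q-statistic) as it is commonly stated: with $\phi_i$ the sum of the i-th powers of the eigenvalues belonging to the residual subspace, $h_0 = 1 - 2\phi_1\phi_3 / (3\phi_2^2)$, and $c_\alpha$ the $1-\alpha$ quantile of the standard normal. The function names, the eigenvalues, and the default `alpha` are illustrative assumptions, not values from the talk.

```python
from statistics import NormalDist

def spe(residual):
    """Squared prediction error: squared length of the residual traffic vector."""
    return sum(r * r for r in residual)

def q_threshold(residual_eigenvalues, alpha=0.001):
    """SPE threshold at the 1 - alpha confidence level (Gaussian assumption).

    residual_eigenvalues: covariance eigenvalues of the components that were
    NOT kept in the normal subspace.
    """
    phi1 = sum(residual_eigenvalues)
    phi2 = sum(l ** 2 for l in residual_eigenvalues)
    phi3 = sum(l ** 3 for l in residual_eigenvalues)
    h0 = 1 - 2 * phi1 * phi3 / (3 * phi2 ** 2)
    c_alpha = NormalDist().inv_cdf(1 - alpha)
    bracket = (c_alpha * (2 * phi2 * h0 ** 2) ** 0.5 / phi1
               + 1
               + phi2 * h0 * (h0 - 1) / phi1 ** 2)
    return phi1 * bracket ** (1 / h0)

# Hypothetical residual eigenvalues and residual vector.
threshold = q_threshold([4.0, 2.0, 1.0], alpha=0.001)
is_anomalous = spe([-2.5, 2.5, 2.5, -2.5]) > threshold
print(threshold, is_anomalous)
```

With a single residual eigenvalue of 1 the formula reduces to a cube-root style approximation of a chi-square quantile with one degree of freedom, which gives a quick sanity check on the implementation.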
11 Detection Illustration
[Figure: value of $\|\tilde{\mathbf{y}}\|^2$ (SPE) over time]
SPE values at anomaly time points clearly stand out
12 Identification
- An anomaly causes a displacement of the link traffic vector away from the normal subspace $\mathcal{S}$
- The direction of the displacement gives information about the nature of the anomaly
- Intuition: find the hypothesis that best describes the detected anomaly
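13 Hypothesis-Based Identification
- Denote set of all anomalies by $\{F_i,\ i = 1, \ldots, I\}$
- Each $F_i$ adds link traffic specified by $\theta_i$
- In the presence of $F_i$: $\mathbf{y} = \mathbf{y}^* + \theta_i f_i$, where $\mathbf{y}^*$ is the (unknown) normal traffic and $f_i$ is the anomaly magnitude
- $\mathbf{y}^*_i$, the estimate of $\mathbf{y}^*$ under $F_i$, is found by minimizing the distance to the normal subspace $\mathcal{S}$ in the direction of the anomaly
14 Selecting the Best Hypothesis
- 1. For each hypothesized anomaly $F_i$, $i = 1, \ldots, I$, compute $\mathbf{y}^*_i$
- 2. Select anomaly $F_j$ as $j = \arg\min_i \|\tilde{\mathbf{C}} \mathbf{y}^*_i\|$, where $\tilde{\mathbf{C}}$ projects onto the anomalous subspace
The best hypothesis (OD flow) accounts for maximum residual traffic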
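The two selection steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes single-flow hypotheses, each described by a unit-norm link direction, an orthonormal basis for the normal subspace, and a mean-centered traffic vector. The function names, flow names, and the 4-link numbers are hypothetical. For each hypothesis it fits the anomaly magnitude by least squares in the residual space and keeps the hypothesis that leaves the least residual unexplained.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residual_part(v, normal_basis):
    """Remove the component of v that lies in the normal subspace."""
    out = list(v)
    for p in normal_basis:
        coeff = dot(p, v)
        out = [o - coeff * pi for o, pi in zip(out, p)]
    return out

def identify(y, normal_basis, hypotheses):
    """Return (name, magnitude, leftover_spe) of the best-fitting hypothesis.

    hypotheses: dict mapping a name to its unit-norm link direction.
    """
    y_res = residual_part(y, normal_basis)
    best = None
    for name, theta in hypotheses.items():
        theta_res = residual_part(theta, normal_basis)
        denom = dot(theta_res, theta_res)
        if denom < 1e-12:
            continue  # direction lies in the normal subspace, cannot be tested
        f_hat = dot(theta_res, y_res) / denom  # least-squares magnitude
        leftover = [r - f_hat * t for r, t in zip(y_res, theta_res)]
        score = dot(leftover, leftover)  # residual left after removing the anomaly
        if best is None or score < best[2]:
            best = (name, f_hat, score)
    return best

# Hypothetical 4-link network with three candidate OD flows.
s = 1 / math.sqrt(2)
hypotheses = {
    "flow_1": [s, s, 0.0, 0.0],    # crosses links 1 and 2
    "flow_2": [0.0, s, s, 0.0],    # crosses links 2 and 3
    "flow_3": [0.0, 0.0, 0.0, 1.0],
}
basis = [[0.5, 0.5, 0.5, 0.5]]
y = [2.0, 7.0, 7.0, 2.0]  # 5 extra units on links 2 and 3
name, f_hat, score = identify(y, basis, hypotheses)
print(name, f_hat, score)
```

Minimizing the leftover residual is the same as choosing the hypothesis that accounts for the most residual traffic, which is how the slide phrases it.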
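15 Quantification
- Given hypothesized anomaly $F_i$,
- Estimated per-link anomaly traffic is $\mathbf{y}' = \mathbf{y} - \mathbf{y}^*_i$, where $\mathbf{y}^*_i$ is the normal traffic estimated under $F_i$
- And the size of the anomaly is $\bar{\mathbf{A}}_i^T \mathbf{y}'$, where $\bar{\mathbf{A}}_i$ is column $i$ of the routing matrix normalized to sum to one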
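A small sketch of the quantification step, under stated assumptions: the anomaly is a single OD flow whose link direction is its routing-matrix column scaled to unit norm, the normal subspace basis is orthonormal, and the size is read off by averaging the per-link anomaly traffic over the links the flow crosses. The function name `quantify` and the numbers are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residual_part(v, normal_basis):
    """Remove the component of v that lies in the normal subspace."""
    out = list(v)
    for p in normal_basis:
        coeff = dot(p, v)
        out = [o - coeff * pi for o, pi in zip(out, p)]
    return out

def quantify(y, normal_basis, route_column):
    """Return (per-link anomaly traffic, estimated anomaly size) for one OD flow.

    route_column: 0/1 entries marking the links the flow traverses.
    """
    norm = math.sqrt(dot(route_column, route_column))
    theta = [a / norm for a in route_column]  # unit-norm direction (assumption)
    theta_res = residual_part(theta, normal_basis)
    y_res = residual_part(y, normal_basis)
    f_hat = dot(theta_res, y_res) / dot(theta_res, theta_res)
    per_link = [f_hat * t for t in theta]  # traffic attributed to the anomaly
    total = sum(route_column)
    weights = [a / total for a in route_column]  # column normalized to sum to one
    return per_link, dot(weights, per_link)

# Hypothetical example: 5 extra units injected on a flow crossing links 2 and 3.
basis = [[0.5, 0.5, 0.5, 0.5]]
y = [2.0, 7.0, 7.0, 2.0]
per_link, size = quantify(y, basis, [0, 1, 1, 0])
print(per_link, size)
```

In this toy case the estimate recovers the injected amount because the anomaly direction is not confounded with the normal subspace; on real data the estimate is only as good as the normal-subspace model.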
16 Validation: Identifying True Volume Anomalies
- Measure OD flows for Sprint-Europe and Abilene networks
- To identify true volume anomalies,
- Look for significant deviations in each OD flow
- Use two different approaches: EWMA filtering and Fourier frequency-domain filtering
- Note time point, OD flow, and amount of deviation
- These are our true volume anomalies (approximately)
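To make the EWMA step concrete, here is a minimal sketch of flagging deviations in one OD flow. The slides do not give the filter settings, so the smoothing weight `alpha`, the function name, and the synthetic series are assumptions made purely for illustration.

```python
def ewma_deviations(series, alpha=0.2):
    """Absolute difference between each sample and its EWMA forecast.

    The forecast for time t uses only samples before t, so a sudden spike
    shows up as a large deviation at the moment it occurs.
    """
    forecast = series[0]
    deviations = [0.0]  # no forecast error is defined for the first sample
    for x in series[1:]:
        deviations.append(abs(x - forecast))
        forecast = alpha * x + (1 - alpha) * forecast  # update after scoring
    return deviations

# Synthetic OD flow with one spike at index 4.
flow = [10.0, 10.0, 10.0, 10.0, 50.0, 10.0, 10.0]
deviations = ewma_deviations(flow)
spike_index = max(range(len(deviations)), key=deviations.__getitem__)
print(spike_index, deviations[spike_index])
```

Recording the index, the flow, and the deviation for the largest entries gives the kind of (time point, OD flow, amount) triples that the slide describes as approximate ground truth.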
17 Results on True Volume Anomalies: Sprint
40 largest deviations in OD flows via Fourier
[Figure: True Anomaly Size vs. Anomaly (rank order), with Detection, Identification, and Quantification results]
18 Summary: True Volume Anomalies
- Summary
- High Detection Rate
- Low False Alarm Rate
- Accurate Identification
- Accurate Quantification
19 Summary: Synthetic Volume Anomalies
- Summary
- Detection rates about 90%
- Identification better on Sprint than Abilene
- Quantification good in both networks
20 Power of the Subspace Method
- Subspace Method exploits correlation among links to define normal traffic behavior
- Effective diagnosis of volume anomalies requires a whole-network approach
- Previous work has concentrated on measurements from individual links
21 Conclusions
- Diagnosing Volume Anomalies
- Proposed a general diagnosis framework
- Subspace method yields high detection rates, low false alarm rate
- Accurate hypothesis-based identification and quantification
- Subspace method also useful in other contexts
- Poster in the Network Troubleshooting Workshop
- Paper in the IMC 04