Title: Sensitivity of PCA for Traffic Anomaly Detection
1. Sensitivity of PCA for Traffic Anomaly Detection
- Evaluating the robustness of current best practices
Haakon Ringberg (1), Augustin Soule (2), Jennifer Rexford (1), Christophe Diot (2)
(1) Princeton University, (2) Thomson Research
2. Outline
- Background and motivation
  - Traffic anomaly detection
  - PCA and subspace approach
- Problems with methodology
- Conclusion and future directions
3. A network in the Internet
4. Network anomalies
We want to be able to detect these anomalies!
5. Network anomaly detectors
- Monitor health of network
- Real-time reporting of anomalies
6. Principal Components Analysis (PCA): Benefits
- Finds correlations across multiple links
- Network-wide analysis [Lakhina SIGCOMM04]
- Demonstrated ability to detect wide variety of anomalies [Lakhina IMC04]
- Subspace methodology
- We use same software
7. Principal Components Analysis (PCA)
- PCA transforms data into new coordinate system
- Principal components (new bases) ordered by captured variance
- The first k tend to capture periodic trends
  - normal subspace
  - vs. anomalous subspace
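The split into normal and anomalous subspaces can be sketched in formulas. The notation is an assumption made for illustration, not taken from the slides: $\mathbf{v}$ is the traffic state vector at one timestep (mean-centered), $\mathbf{p}_i$ is the $i$-th principal component, and $\mathbf{P}$ collects the first $k$ of them.

```latex
\mathbf{P} = [\mathbf{p}_1, \dots, \mathbf{p}_k], \qquad
\mathbf{v} = \hat{\mathbf{v}} + \tilde{\mathbf{v}}, \qquad
\hat{\mathbf{v}} = \mathbf{P}\mathbf{P}^{\top}\mathbf{v}, \qquad
\tilde{\mathbf{v}} = (\mathbf{I} - \mathbf{P}\mathbf{P}^{\top})\,\mathbf{v}
```

Here $\hat{\mathbf{v}}$ is the part of the traffic that lies in the normal subspace and $\tilde{\mathbf{v}}$ is the residual in the anomalous subspace.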
8. Pictorial overview of subspace methodology
- Training: separate normal and anomalous traffic patterns
- Detection: find spikes
- Identification: find original spatial location that caused spike (e.g. router, flow)
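The training and detection steps above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the software used in the study. The function name `detect_spikes`, the mean-plus-3-standard-deviations threshold, and the injected anomaly are all assumptions made for the example.

```python
import numpy as np

def detect_spikes(X, k, n_sigma=3.0):
    """Return timesteps whose residual energy exceeds mean + n_sigma * std.

    X is a (timesteps, links) traffic matrix; the top k PCs are treated as normal.
    """
    Xc = X - X.mean(axis=0)                       # training: center each link
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:k].T                                  # top-k PCs span the normal subspace
    residual = Xc - Xc @ P @ P.T                  # traffic outside the normal subspace
    spe = np.sum(residual ** 2, axis=1)           # squared prediction error per timestep
    threshold = spe.mean() + n_sigma * spe.std()  # illustrative threshold (assumption)
    return np.flatnonzero(spe > threshold), spe

# Synthetic traffic: one shared periodic trend on 6 links plus noise (hypothetical data)
rng = np.random.default_rng(0)
t = np.arange(288)
X = np.outer(np.sin(2 * np.pi * t / 96), rng.uniform(1.0, 3.0, size=6))
X += 0.1 * rng.normal(size=X.shape)
X[200, 2] += 3.0                                  # injected anomaly on link 2 at t=200

flagged, spe = detect_spikes(X, k=1)
print(flagged)
```

Identification is deliberately left out here: the flagged timestep says when the spike happened, not which link or flow caused it.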
9. Pictorial overview of problems with subspace methodology
- Defining normalcy can be challenging
  - Tunable knobs
  - Contamination
- PCA's coordinate remapping makes it difficult to identify the original location of an anomaly
10. Data used
- Géant and Abilene networks
- IP flow traces
- 21/11 through 28/11 2005
- Anomalies were manually verified
11. Outline
- Background and motivation
- Problems with approach
  - Sensitivity to its parameters
  - Contamination of normalcy
  - Identifying the location of detected anomalies
- Conclusion and future directions
12. Sensitivity to topk
- PCA separates normal from anomalous traffic patterns
- Works because top PCs tend to capture periodic trends
- And large fraction of variance
13. Sensitivity to topk
- Where is the line drawn between normal and anomalous?
- What is too anomalous?
14. Sensitivity to topk
Very sensitive to number of principal components included!
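One way to see this kind of sensitivity is to sweep topk and count how many timesteps exceed a fixed residual-energy threshold. The sketch below uses synthetic data; the function name `alarms_per_k`, the threshold value, and the traffic model are assumptions, and the counts say nothing about the Géant or Abilene results.

```python
import numpy as np

def alarms_per_k(X, threshold):
    """Count timesteps flagged as anomalous for every choice of topk."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt.T                     # each timestep expressed in the PC basis
    counts = {}
    for k in range(1, Vt.shape[0] + 1):
        # Energy left outside the top k PCs (empty slice gives 0 when k is maximal)
        spe = np.sum(scores[:, k:] ** 2, axis=1)
        counts[k] = int(np.sum(spe > threshold))
    return counts

# Hypothetical traffic: two periodic trends mixed onto 6 links, plus noise
rng = np.random.default_rng(0)
t = np.arange(288)
trends = np.column_stack([np.sin(2 * np.pi * t / 96), np.cos(2 * np.pi * t / 48)])
X = trends @ rng.uniform(0.5, 2.0, size=(2, 6)) + 0.2 * rng.normal(size=(288, 6))

counts = alarms_per_k(X, threshold=0.2)
print(counts)
```

With a fixed threshold the alarm count can only shrink as topk grows, so the choice of topk directly decides how many timesteps are reported.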
15. Sensitivity to topk
- Sensitivity wouldn't be an issue if we could tune topk parameter
- We've tried many different methods
  - 3σ deviation heuristic
  - Cattell's Scree Test
  - Humphrey-Ilgen
  - Kaiser's Criterion
- None are reliable
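Two of the listed selection rules are simple enough to sketch. The code below is one plausible reading of each, not the implementation used in the study: Kaiser's criterion is taken as "keep PCs of the correlation matrix with eigenvalue above 1", and the 3σ heuristic as "the normal subspace ends at the first PC whose projection deviates more than 3 standard deviations from its mean". Function names and the synthetic data are assumptions. The Scree test (a visual rule) and Humphrey-Ilgen are omitted.

```python
import numpy as np

def kaiser_k(X):
    """Kaiser's criterion: number of correlation-matrix eigenvalues above 1."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)
    return int(np.sum(eigvals > 1.0))

def three_sigma_k(X):
    """3-sigma heuristic: count leading PCs whose projection stays within 3 sigma."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt.T                     # projection of the data on each PC
    for i in range(scores.shape[1]):
        u = scores[:, i]
        if np.any(np.abs(u - u.mean()) > 3 * u.std()):
            return i                       # PC i and all later PCs are anomalous
    return scores.shape[1]

# Hypothetical traffic: one periodic trend on 6 links plus noise
rng = np.random.default_rng(0)
t = np.arange(288)
X = np.outer(np.sin(2 * np.pi * t / 96), rng.uniform(1.0, 3.0, size=6))
X += 0.1 * rng.normal(size=X.shape)

k_kaiser = kaiser_k(X)
k_sigma = three_sigma_k(X)
print(k_kaiser, k_sigma)
```

The two rules need not agree on the same data, which is one concrete form the tuning problem takes.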
16. Contamination of normalcy
- What happens to large anomalies?
- They capture a large fraction of variance
- Therefore they are included among top PCs
- Invalidates assumption that top PCs need to be periodic
- Pollutes definition of normal
- In our study, the outage to the left affected 75/77 links
- Only detected on a handful!
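The contamination effect can be reproduced on toy data. In the sketch below a large, sustained anomaly is injected into otherwise periodic traffic; the first PC then aligns with the anomaly instead of the periodic trend, and the anomaly leaves almost no energy in the residual. The four-link setup, the anomaly direction `d`, and its size are hypothetical and unrelated to the outage in the study.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_links = 288, 4
t = np.arange(T)
w = np.ones(n_links)                         # periodic trend shared by all links
d = np.array([1.0, -1.0, 1.0, -1.0])         # hypothetical anomaly direction
clean = np.outer(np.sin(2 * np.pi * t / 96), w) + 0.05 * rng.normal(size=(T, n_links))

contaminated = clean.copy()
window = slice(100, 112)
contaminated[window] += 5.0 * d              # large anomaly lasting 12 timesteps

def top_pcs(X, k):
    """Return the centered data and its top k principal components (as rows)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc, Vt[:k]

def cosine(a, b):
    """Absolute cosine similarity between two vectors."""
    return abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

_, pcs_clean = top_pcs(clean, 1)
Xc, pcs_cont = top_pcs(contaminated, 2)
align_clean = cosine(pcs_clean[0], w)        # clean data: first PC follows the trend
align_cont = cosine(pcs_cont[0], d)          # contaminated: first PC follows the anomaly

# Residual energy during the anomaly when the top 2 PCs are treated as normal
residual = Xc - Xc @ pcs_cont.T @ pcs_cont
spe_in_window = np.sum(residual[window] ** 2, axis=1)
print(align_clean, align_cont, spe_in_window.max())
```

Because the anomaly now sits inside the "normal" subspace, a detector that looks only at the residual has little left to flag.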
17. Identifying anomaly locations
- Spikes when state vector projected on anomaly subspace
- But network operators don't care about this
- They want to know where it happened!
- How do we find the original location of the anomaly?
18. Identifying anomaly locations
- Previous work used a simple heuristic
- Associate detected spike with k flows with the largest contribution to the state vector v
- No clear a priori reason for this association
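A minimal sketch of such a heuristic follows, assuming "largest contribution" means the largest absolute entries of the state vector at the time of the spike. The function name, flow labels, and values are hypothetical.

```python
def top_contributors(state_vector, k):
    """Return the k flow labels with the largest absolute value in the state vector."""
    ranked = sorted(state_vector, key=lambda flow: abs(state_vector[flow]), reverse=True)
    return ranked[:k]

# Hypothetical state vector at the time of a detected spike (flow label -> volume)
v = {"A->B": 120.0, "A->C": 15.0, "B->C": 310.0, "C->A": 42.0}
suspects = top_contributors(v, k=2)
print(suspects)  # ['B->C', 'A->B']
```

The sketch makes the weakness visible: the ranking depends only on raw magnitude, so a flow that is always large is blamed whether or not it caused the spike.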
19. Outline
- Background and motivation
- Problems with approach
- Conclusion and future directions
  - Defining normalcy
  - Identifying the location of an anomaly
20. Defining normalcy
- Large anomalies can cause a spike in first few PCs
- Diminishes effectiveness
- But we can presumably smooth these out (WMA)
- But first PCs aren't always periodic
- whichk instead of topk?
- Initial results suggest this might be challenging also
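Assuming WMA stands for weighted moving average, the smoothing idea can be sketched as below. The linear weights and the window size are illustrative choices, not parameters from the study.

```python
def weighted_moving_average(series, window):
    """Smooth a series with linearly increasing weights (newest sample weighs most)."""
    weights = list(range(1, window + 1))     # oldest sample in the window gets weight 1
    total = sum(weights)
    smoothed = []
    for end in range(window, len(series) + 1):
        chunk = series[end - window:end]
        smoothed.append(sum(w * x for w, x in zip(weights, chunk)) / total)
    return smoothed

# A single spike of height 6 in an otherwise flat PC projection (hypothetical values)
projection = [0, 0, 6, 0, 0]
print(weighted_moving_average(projection, window=3))  # [3.0, 2.0, 1.0]
```

The spike is attenuated rather than removed, and the output is shorter than the input by `window - 1` samples.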
21. Fundamental disconnect between objective functions
- PCA is optimal at finding orthogonal vectors ordered by captured variance
- But variance need not correspond to normalcy (i.e. periodicity)
- When do they coincide?
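The objective PCA optimizes can be written out to make the disconnect explicit. The notation is assumed for illustration: $\mathbf{X}$ is the mean-centered traffic matrix (one row per timestep) and $\mathbf{p}_i$ is the $i$-th principal component.

```latex
\mathbf{p}_1 = \arg\max_{\|\mathbf{p}\| = 1} \|\mathbf{X}\mathbf{p}\|^2, \qquad
\mathbf{p}_i = \arg\max_{\substack{\|\mathbf{p}\| = 1 \\ \mathbf{p} \perp \mathbf{p}_1, \dots, \mathbf{p}_{i-1}}} \|\mathbf{X}\mathbf{p}\|^2
```

Each component maximizes captured variance subject to orthogonality. No term rewards periodicity, so a periodic trend ranks high only when it also happens to carry a lot of variance.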
22. Identifying anomaly locations
- PCA is very effective at finding correlations
- But is accomplished by remapping all data to new coordinate system
- Strength in detection becomes weakness in identification
- Inherent limitation
23. Conclusion
- PCA is sensitive to its parameters
- More robust methodology required
  - Training: defining normalcy (topk, whichk)
  - Detection: tuning threshold
  - Identification: better heuristic
- Disconnect between objective functions
  - PCA finds variance
  - We seek periodicity
- PCA's strengths can be weaknesses
  - Transformation good at detecting correlations
  - Causes difficulty in identifying anomaly location
24. Thanks! Questions?
- Haakon Ringberg
- Princeton University Computer Science
- http://www.cs.princeton.edu/hlarsen/
25. Outline
- Background and motivation
- Problems with approach
- Future directions
- Conclusion
  - Addressable problems, versus
  - Fundamental problems
26. Conclusion: addressable
- PCA is sensitive to its parameters
- More robust methodology required
  - Training: defining normalcy (topk, whichk)
  - Detection: tuning threshold
  - Identification: better heuristic
- Previous work used same data and optimized parameter settings as Lakhina et al.
- But these concerns might be addressable
27. Conclusion: fundamental
- We don't know what normal is
- Disconnect between objective functions
  - PCA finds variance
  - We seek periodicity
- PCA's strengths can be weaknesses
  - Transformation good at detecting correlations
  - Causes difficulty in identifying anomaly location
- Are other methods more appropriate?
- We require a standardized evaluation framework