Title: Mining Anomalies Using Traffic Feature Distributions
1Mining Anomalies Using Traffic Feature
Distributions
- Anukool Lakhina, Mark Crovella, Christophe Diot
- ACM SIGCOMM, August 2005
- Presented by Abhinay Kampasi
- Referred to the presentation on the authors' website
2Motivation for Anomaly Detection
- Is my customer being attacked?
- Is someone probing my network?
- Are there worms spreading?
- A sudden traffic surge?
- An equipment outage?
- Something never seen before?
Anomalies present in network traffic are buried
like needles in a haystack!
3Previous Work
- Volume based anomaly detection
- Largely focused on
- Point solutions
- not a general approach
- Rule-based classification
- not unsupervised
- Data from single links
- not network-wide
A general, unsupervised method for reliably
detecting and classifying network anomalies is
needed
4Feature Distributions
- Anomalies can be detected and distinguished by
inspecting traffic features: SrcIP, SrcPort,
DstIP, DstPort
5Feature Distribution Changes induced by Port Scan
Anomaly
6Entropy
- Metric that captures the degree of dispersal or
concentration of a distribution:
$H(X) = -\sum_{i=1}^{N} \frac{n_i}{S} \log_2 \frac{n_i}{S}$
- where symbol i occurs $n_i$ times in the sample
- $S = \sum_{i=1}^{N} n_i$ is the total number of observations
- Value lies between 0 and $\log_2 N$
- 0 when the distribution is maximally concentrated
- All observations are the same
- $\log_2 N$ when the distribution is maximally dispersed
- All observations are distinct
- Entropy value is normalized
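The entropy definition above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `sample_entropy` and the two port lists are made up for the example.

```python
import math
from collections import Counter

def sample_entropy(observations):
    """Sample entropy: sum over symbols of (n_i/S) * log2(S/n_i)."""
    counts = Counter(observations)      # n_i for each distinct symbol i
    total = sum(counts.values())        # S, the total number of observations
    if total == 0:
        return 0.0
    return sum((n / total) * math.log2(total / n) for n in counts.values())

# Hypothetical destination ports seen in one time bin.
normal_bin = [80, 80, 443, 80, 53, 443, 80, 25]
scan_bin = list(range(1000, 1008))      # a port scan touches many distinct ports

print(sample_entropy(normal_bin))       # concentrated: lower entropy
print(sample_entropy(scan_bin))         # dispersed: entropy reaches log2(8)
```

A port scan disperses the destination-port distribution, so its entropy rises toward the maximum, which is the effect the following slides rely on.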
7Applying Entropy to Port Scan Data
8Methodology
- Detect
- Use multiway subspace method
- Augments volume metrics, highly sensitive
- Identify anomalies on multiple features and flows
- Classify
- Use clustering on anomaly features
- Can do unsupervised classification
9Network-Wide Traffic Data Collected
- Collected 3 weeks of sampled NetFlow data at
5-minute bins from two backbone networks
- Compute entropy on packet histograms for 4
traffic features: SrcIP, SrcPort, DstIP, DstPort
- Two sources of bias: sampling and anonymization
of IP addresses
10Multiway Subspace Method
- Based on the subspace method and principal component
analysis
- Every measurement vector decomposes into a normal and
a residual component
- H(t,p,k) denotes the entropy value at time t for
flow p, of traffic feature k
- Unwrap the multiway matrix into one matrix
- Apply the subspace method to the merged matrix
- Detect anomalies by monitoring the size of the
residual vector for unusually high values
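The steps above can be sketched with NumPy. This is only an illustration under stated assumptions: the data is synthetic, the number of normal components (`n_normal`) is picked by hand, and the "mean plus three standard deviations" threshold is a stand-in chosen for the example, not the detection threshold used in the paper.

```python
import numpy as np

def unwrap(H):
    """Unwrap H[t, p, k] (time x flow x feature) into a t x (p*k) matrix
    by placing the per-feature t x p blocks side by side."""
    return np.concatenate([H[:, :, j] for j in range(H.shape[2])], axis=1)

def residual_spe(X, n_normal):
    """Squared size of the residual component of every row of X.
    The normal subspace is spanned by the top n_normal principal axes."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal axes, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_normal].T                  # basis of the normal subspace
    residual = Xc - Xc @ P @ P.T         # part not explained by normal subspace
    return np.sum(residual ** 2, axis=1)

# Synthetic entropy tensor: one shared periodic trend plus small noise.
rng = np.random.default_rng(0)
t, p, k = 200, 5, 4
trend = np.sin(np.linspace(0, 8 * np.pi, t))
loadings = rng.normal(size=(p, k))
H = trend[:, None, None] * loadings[None, :, :] + 0.05 * rng.normal(size=(t, p, k))

# Inject a hypothetical anomaly at time bin 120 in flow 2 (two features shift).
H[120, 2, 1] += 2.0
H[120, 2, 3] -= 2.0

spe = residual_spe(unwrap(H), n_normal=1)
threshold = spe.mean() + 3 * spe.std()   # illustrative threshold only
print(np.flatnonzero(spe > threshold))   # time bins flagged as anomalous
```

Because all four feature blocks share one merged matrix, a shift that is small in each feature individually can still produce a large residual.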
11How does entropy compare with volume-based
detection?
- Does entropy allow detection of a larger set of
anomalies?
- Are anomalies detected by entropy fundamentally
different from those detected by volume-based methods?
- How precise is the entropy-based detection?
12Comparison
Points that lie to the right of the vertical line
are volume-detected anomalies and points that lie
above the horizontal line are detected in entropy.
13Manual Inspection
14Detection Rate by Injecting Real Anomalies
- Evaluation Methodology
- Superimpose known anomaly traces onto OD flows
- Test sensitivity at varying anomaly intensities
by thinning the trace
- Results are averaged over a sequence of experiments
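One simple reading of the injection procedure above is sketched here. It assumes thinning means keeping one packet out of every `factor` packets of the anomaly trace; the helper names and the packet tuples are hypothetical.

```python
def thin(trace, factor):
    """Keep one packet out of every `factor` packets (deterministic thinning)."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return trace[::factor]

def inject(background_bin, anomaly_trace, factor):
    """Superimpose a thinned anomaly trace onto one time bin of background packets."""
    return background_bin + thin(anomaly_trace, factor)

# Hypothetical packets as (src_ip, src_port, dst_ip, dst_port) tuples.
background = [("10.0.0.1", 51000, "192.0.2.8", 80)] * 50
attack = [("198.51.100.7", 40000 + i, "192.0.2.8", 80) for i in range(100)]

for factor in (1, 10, 100):
    mixed = inject(background, attack, factor)
    print(factor, len(mixed) - len(background))  # anomaly packets that remain
```

Raising the thinning factor lowers the anomaly's share of the bin, which is how sensitivity can be measured at different intensities.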
15Classifying Anomalies by Clustering
- Use unsupervised classification
- Each anomaly is a point in 4-D space
- $[\tilde{H}(SrcIP), \tilde{H}(SrcPort), \tilde{H}(DstIP), \tilde{H}(DstPort)]$,
the residual entropy of each feature
- Use a hierarchical agglomerative algorithm for
determining clusters
- Minimizes intra-cluster variation and maximizes
inter-cluster variation
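A naive agglomerative clustering of 4-D points can be sketched as follows. The slide does not state the linkage rule, so single linkage is used here purely for brevity. The six points are invented values, not data from the paper.

```python
import math

def agglomerative(points, n_clusters):
    """Naive single-linkage agglomerative clustering.
    Returns a list of clusters, each a list of indices into `points`."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]                  # j > i, so index i stays valid
    return clusters

# Hypothetical anomalies as (SrcIP, SrcPort, DstIP, DstPort) entropy vectors.
points = [
    (-0.1, 0.0, -0.8, 0.9), (-0.2, 0.1, -0.7, 0.8),
    (0.9, 0.8, -0.9, -0.8), (0.8, 0.9, -0.8, -0.9),
    (0.0, 0.0, 0.9, -0.7), (0.1, -0.1, 0.8, -0.8),
]
print(agglomerative(points, 3))
```

Anomalies that shift the same features in the same direction end up in the same cluster, without any labels being supplied.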
16Clustering Known Anomalies (2-D view)
Legend: Code Red scanning, single-source DoS
attack, multi-source DoS attack
17 3-D view of Abilene anomaly clusters
- Used 2 different clustering algorithms
- Results consistent
- Heuristics identify about 10 clusters in dataset
18Anomaly Clusters in Abilene data
19Summary
- Feature distributions, as summarized by entropy,
are promising for general anomaly diagnosis
- Network-Wide Detection
- Entropy significantly augments volume metrics
- Highly sensitive: detection rates of 90%
possible, even when the anomaly is 1% of background
traffic
- Anomaly Classification
- Clusters are meaningful, and reveal new anomalies
20Points to Ponder
- The paper only discusses anomaly detection on
offline data. Can it be enhanced for online
anomaly detection?
- We still need volume-based detection because
feature distributions do not identify all
anomalies.
- Can other fields in the packet header be used for
anomaly detection?
21Profiling Internet Backbone Traffic Behavior
Models and Applications
- Kuai Xu, Zhi-Li Zhang, Supratik Bhattacharyya
- ACM SIGCOMM, August 2005
- Presented by Abhinay Kampasi
- Referred to the presentation on the authors' website
22Why profile traffic?
- Changes in Internet traffic dynamics
- increase in unwanted traffic
- emergence of disruptive applications
- new services on traditional ports
- traditional service on non-standard ports
- Existing tools
- rely on ports for identifying or classifying
traffic
- report volume-based heavy hitters
- look for specific or known patterns
- Need better techniques to discover behavior
patterns
- help network operators secure and manage networks
23Communication patterns
- Underlying communication patterns of end hosts
- Who are they talking to? How are ports used?
- How many packets or bytes transferred?
- Can communication patterns reveal interesting
behavior?
24Methodology
- Data pre-processing
- aggregate packet streams into 5-tuple flows
- group flows into clusters
- Extract significant clusters
- data reduction step using entropy
- Classify cluster behavior based on
similarity/dissimilarity of communication
patterns
- characterize using information theory
- clusters classified into behavior classes
- Interpret behavior classes
- structural modeling for dominant activities
25Data Preprocessing
- Aggregate packet streams into 5-tuple flows
- Group flows associated with same end hosts/ports
into clusters
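The two preprocessing steps above can be sketched as follows. The packet record layout, the function names, and the sample addresses are assumptions made for the example.

```python
from collections import defaultdict

def aggregate_flows(packets):
    """Aggregate packets into flows keyed by the 5-tuple
    (src_ip, dst_ip, src_port, dst_port, protocol)."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src_ip, dst_ip, src_port, dst_port, proto, size in packets:
        key = (src_ip, dst_ip, src_port, dst_port, proto)
        flows[key]["packets"] += 1
        flows[key]["bytes"] += size
    return dict(flows)

# Position of each clustering dimension inside the 5-tuple key.
DIMENSIONS = {"src_ip": 0, "dst_ip": 1, "src_port": 2, "dst_port": 3}

def group_into_clusters(flows, dimension):
    """Group flows that share the same value along one dimension."""
    idx = DIMENSIONS[dimension]
    clusters = defaultdict(list)
    for key in flows:
        clusters[key[idx]].append(key)
    return dict(clusters)

# Hypothetical packets: (src_ip, dst_ip, src_port, dst_port, protocol, bytes)
packets = [
    ("10.0.0.1", "192.0.2.8", 51000, 80, "tcp", 1500),
    ("10.0.0.1", "192.0.2.8", 51000, 80, "tcp", 400),
    ("10.0.0.1", "192.0.2.9", 51001, 443, "tcp", 600),
    ("10.0.0.2", "192.0.2.8", 52000, 80, "tcp", 900),
]
flows = aggregate_flows(packets)
by_src = group_into_clusters(flows, "src_ip")
print({ip: len(keys) for ip, keys in by_src.items()})
```

The same flow table can be regrouped along each of the four dimensions, giving one set of clusters per dimension.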
26Extract Significant Clusters
- Focus on significant clusters
- Sufficiently large number of flows
- Represent behavior of significant interest
- Adaptive thresholding using entropy
- A cluster is significant if it stands out from
the rest
- Use entropy to quantify whether the rest looks
random
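The thresholding idea above can be sketched as a loop that keeps lowering the threshold until what remains looks random. This is a sketch under assumptions, not the authors' exact algorithm. The starting threshold `alpha0`, the randomness cutoff `beta`, and the halving schedule are illustrative choices. Randomness is measured as entropy divided by the log of the number of remaining distinct values.

```python
import math

def relative_uncertainty(counts):
    """Entropy of the (renormalized) counts divided by its maximum possible value."""
    total = sum(counts.values())
    if len(counts) < 2:
        return 0.0
    h = sum((n / total) * math.log2(total / n) for n in counts.values())
    return h / math.log2(len(counts))

def extract_significant(counts, alpha0=0.02, beta=0.9):
    """Return the feature values whose clusters stand out from the rest."""
    total = sum(counts.values())
    significant = set()
    remaining = dict(counts)
    alpha = alpha0
    # Stop once the remaining values look close to uniformly random.
    while len(remaining) > 1 and relative_uncertainty(remaining) <= beta:
        newly = {v for v, n in remaining.items() if n / total >= alpha}
        significant |= newly
        for v in newly:
            del remaining[v]
        alpha /= 2                       # lower the bar and test again
    return significant

# Hypothetical flow counts per source IP: two heavy hosts, 200 one-flow hosts.
counts = {"a": 500, "b": 300}
counts.update({f"host{i}": 1 for i in range(200)})
print(extract_significant(counts))
```

The threshold adapts to the data: a fixed cutoff is not needed, because extraction stops as soon as the leftover distribution is judged random.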
27Entropy based adaptive thresholding
28Sample Results
Though the total number of distinct values along
a given dimension may not fluctuate very much,
the number of significant feature values
(clusters) may vary dramatically, due to changes
in the underlying feature value distributions.
29Relative Uncertainty
- Entropy: $H(X) = -\sum_i p(x_i) \log p(x_i)$
- Maximum entropy: $H_{max}(X) = \log \min(m, N)$
- Relative uncertainty of variable X:
$RU(X) = H(X) / H_{max}(X)$, $RU(X) \in [0, 1]$
- RU(X) = 0: X is deterministic
- RU(X) = 1: X is randomly distributed
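A small worked example may help fix these definitions. It assumes m is the number of observations and N the number of possible values of X (the slide does not define them) and uses base-2 logarithms. The sample is hypothetical: m = 4 destination ports, three equal and one different, with N = 2^16.

```latex
\begin{align*}
H(X) &= -\tfrac{3}{4}\log_2\tfrac{3}{4} - \tfrac{1}{4}\log_2\tfrac{1}{4} \approx 0.811 \\
H_{max}(X) &= \log_2 \min(4,\, 2^{16}) = \log_2 4 = 2 \\
RU(X) &= H(X) / H_{max}(X) \approx 0.811 / 2 \approx 0.41
\end{align*}
```

With so few observations, the sample size rather than the size of the port space caps the maximum entropy, which is why the minimum appears in the definition.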
30Behavior Characterization
31Behavior Classes
Summarize the three feature distributions into 27
classes, [0, 0, 0] through [2, 2, 2], referred to
for convenience as BC0 to BC26
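One way to map three relative-uncertainty values to a class label is sketched below. The cutoff `epsilon = 0.2` and the base-3 numbering of BC0 to BC26 are assumptions made for this illustration, since the slide states neither.

```python
def ru_label(ru, epsilon=0.2):
    """Label a relative uncertainty as 0 (low), 1 (medium) or 2 (high)."""
    if ru <= epsilon:
        return 0
    if ru >= 1 - epsilon:
        return 2
    return 1

def behavior_class(ru_vector, epsilon=0.2):
    """Map three RU values to a label vector and a class id in 0..26."""
    a, b, c = (ru_label(r, epsilon) for r in ru_vector)
    return (a, b, c), 9 * a + 3 * b + c  # base-3 numbering (assumed)

# Hypothetical cluster: first feature fixed, second mixed, third random.
labels, bc_id = behavior_class((0.05, 0.5, 0.97))
print(labels, f"BC{bc_id}")
```

Three labels with three levels each give 3^3 = 27 combinations, matching the number of classes on the slide.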
32Summary of behavior classes
- Behavior classes classify clusters based on
communication patterns
- Behavior classes have distinct temporal
properties
- Popularity
- Average Size
- Membership Volatility
- Clusters within the same behavior class have
similar structural models
- Clusters have stable behavior over time
33Dominant State Analysis
- Each cluster has hundreds or thousands of flows.
- An exhaustive approach is not practical
- Need a compact summary
- Dominant state analysis
- Identify the dependency among the free dimensions
of a cluster
- Dominant states of a cluster are subsets of
values that approximate the original data
34General procedure for Dominant State Analysis
35Applications of Profiles
36Anomalous Behaviors
- Clusters in rare behavior classes
- Identified a web server under DDoS attack
- Behavioral changes for clusters
- Yahoo web server example
- Unusual profiles for popular service ports
37Conclusion
- Developed a systematic methodology to
automatically discover and interpret
communication patterns
- Used information-theoretic techniques to build
behavior models of end hosts and applications
- Applied dominant state analysis to explain
traffic behavior
- Identified typical behavior profiles as well as
rare and deviant behaviors
38Future Work
- Correlating behavior profiles across multiple
links
- Validate behavior profiles using additional
features, e.g., packet payload
- Integrate traffic profiling framework with a
real-time monitoring system
39Thank You
Questions / Comments?