Is Sampled Data Sufficient for Anomaly Detection

Slides: 54
Provided by: cseCu

1
Is Sampled Data Sufficient for Anomaly Detection
  • Ip Wing Chung Peter (05133660)
  • Ngan Sze Chung (05928650)

2
Abstract
  • Traffic measurement in networks is important
  • Network management
  • Anomaly detection for security analysis
  • Capture the full packet trace?
  • The most accurate
  • Consumes network resources
  • Affects normal traffic

3
Abstract
  • Sampling Technique
  • Conserve network resources
  • How many samples?
  • Sampling techniques vs. anomaly detection
    algorithms

4
Outline
  • Introduction
  • Background and Methods
  • Impact of Sampling on Volume Anomaly Detection
  • Impact of Sampling on Portscan Detection
  • Conclusion and Future Work

5
Introduction
  • Aim
  • To study the impact of sampling on anomaly
    detection
  • Objectives
  • To study 4 existing sampling techniques
  • To study 3 common anomaly detection algorithms
  • To simulate detection by feeding the sampled
    data to the anomaly detection algorithms
  • To evaluate the impact of sampling on the anomaly
    detection algorithms

6
Background and Methods
  • Sampling
  • Volume Anomaly Detection
  • Portscan Detection
  • Trace Data
  • Methodology

7
Sampling
  • Random packet sampling
  • Sample a packet with a small probability r < 1
  • Classify sampled packets into flows based on
    source/destination, IP/port, protocol
  • Flow terminated by timeout (1 min), or explicit
    TCP semantics (FIN)
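
The steps on this slide can be sketched in Python as follows. This is a minimal illustration, not the implementation used in the study: the packet dictionary fields (src, dst, sport, dport, proto, ts, size, fin) and the function name are assumptions, and packets are assumed to arrive sorted by timestamp.

```python
import random

FLOW_TIMEOUT = 60.0  # seconds; the 1-minute timeout mentioned on the slide


def sample_packets_into_flows(packets, r, seed=0):
    """Keep each packet with probability r, then group kept packets into flows.

    A flow ends on a FIN packet or after FLOW_TIMEOUT seconds without a
    sampled packet. Field names are hypothetical.
    """
    rng = random.Random(seed)
    active = {}    # 5-tuple -> flow record still open
    finished = []  # closed flow records
    for pkt in packets:
        if rng.random() >= r:
            continue  # packet not sampled
        key = (pkt["src"], pkt["dst"], pkt["sport"], pkt["dport"], pkt["proto"])
        flow = active.get(key)
        if flow is not None and pkt["ts"] - flow["last"] > FLOW_TIMEOUT:
            finished.append(active.pop(key))  # idle too long: close old flow
            flow = None
        if flow is None:
            flow = active[key] = {"key": key, "packets": 0, "bytes": 0, "last": pkt["ts"]}
        flow["packets"] += 1
        flow["bytes"] += pkt["size"]
        flow["last"] = pkt["ts"]
        if pkt.get("fin"):
            finished.append(active.pop(key))  # explicit TCP termination
    finished.extend(active.values())
    return finished
```

Only sampled packets ever reach the flow table, which is why the scheme is cheap but gives inaccurate flow statistics.
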

8
Sampling
  • Random packet sampling
  • Simple to implement
  • Low CPU power and memory requirement
  • Inaccurate for flow statistics

9
Sampling
  • Random flow sampling
  • Sample a flow with a small probability p < 1
  • Improves accuracy of flow statistics
  • Classifies packets into flows first
  • Prohibitive memory and CPU requirements
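
A minimal Python sketch of random flow sampling, under the assumption that each packet already carries a hypothetical "key" field identifying its flow. It shows why the scheme is costly: every packet has to be classified before any flow can be dropped.

```python
import random
from collections import Counter


def classify_into_flows(packets):
    """Count packets per flow; needs state for every flow seen on the link."""
    return Counter(pkt["key"] for pkt in packets)


def random_flow_sampling(packets, p, seed=0):
    """Classify all packets first, then keep each flow with probability p."""
    rng = random.Random(seed)
    flows = classify_into_flows(packets)
    # The coin flip ignores flow size, so the sample is unbiased by size.
    return {key: count for key, count in flows.items() if rng.random() < p}
```

A kept flow retains its full packet count, which is what makes the resulting flow statistics accurate.
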

10
Sampling
  • Smart sampling
  • Sample a flow of size x with a probability p(x) = min(1, x/z)
  • z is a threshold that trades off accuracy (e.g. z = 40000)
  • Bias towards large flows
  • Example with z = 40000:
    Flow 1, 40 bytes: sampled with 0.1% probability
    Flow 2, 15580 bytes
    Flow 3, 8196 bytes
    Flow 4, 5350789 bytes: sampled with 100% probability
    Flow 5, 532 bytes
    Flow 6, 4000 bytes: sampled with 10% probability
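
The sampling probabilities in this example can be checked with a short Python sketch. It assumes the usual smart-sampling rule p(x) = min(1, x/z); the formula itself did not survive in the transcript, but with z = 40000 it reproduces the three percentages shown for the 40-byte, 5350789-byte and 4000-byte flows. The function name is an assumption.

```python
def smart_sampling_probability(size_bytes, z):
    """Assumed rule: flows of at least z bytes are always kept,
    smaller flows are kept with probability proportional to size."""
    return min(1.0, size_bytes / z)


# Flow sizes in bytes, as listed on the slide.
flow_sizes = {1: 40, 2: 15580, 3: 8196, 4: 5350789, 5: 532, 6: 4000}
z = 40000

probs = {flow: smart_sampling_probability(size, z) for flow, size in flow_sizes.items()}
for flow, prob in probs.items():
    print(f"Flow {flow}: {flow_sizes[flow]} bytes, sampled with probability {prob:.3%}")
```

Because the probability grows with flow size, small flows are almost never kept, which is the bias towards large flows noted above.
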
11
Sampling
  • Sample-and-hold (SH)

12
Sampling
  • Sample-and-hold (SH)
  • Flow table lookup for each packet
  • If found, the flow entry is updated by every
    subsequent packet once it has been created in the
    SH table
  • If not found, a flow entry is created with
    probability p
  • (e.g. p = 1/3 in the previous example)
  • Sampling is biased toward elephant flows
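
The lookup logic can be sketched in a few lines of Python. This is an illustration only: the "key" field and both function names are assumptions, and the per-packet creation probability p follows the description on this slide.

```python
import random


def sample_and_hold(packets, p, seed=0):
    """Return {flow key: packets counted since the entry was created}."""
    rng = random.Random(seed)
    table = {}
    for pkt in packets:
        key = pkt["key"]
        if key in table:
            table[key] += 1      # held: every later packet is counted
        elif rng.random() < p:
            table[key] = 1       # entry created with probability p
    return table


def entry_probability(n_packets, p):
    """Chance that a flow of n_packets ever gets a table entry."""
    return 1 - (1 - p) ** n_packets
```

entry_probability makes the elephant bias explicit: with p = 1/3 a single-packet flow is held one time in three, while a 100-packet flow is almost certain to be held.
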

13
Volume Anomaly Detection
  • Detects network traffic anomalies (e.g. DoS
    attacks)
  • Abrupt changes in packet or flow count
    measurements
  • These appear as volume anomalies
  • Discrete wavelet transform (DWT) based detection
  • Proven effective at detecting volume
    anomalies

14
DWT-Based Detection
  • Applies wavelet decomposition on packet or flow
    time series
  • Detects volume changes at various time scales
  • 3 steps
  • Decomposition
  • Re-synthesis
  • Detection

15
DWT-Based Detection
  • Decomposition
  • Decompose original signal to identify changes
  • DWT calculates wavelet coefficients

[Figure: the original signal is passed through a low-pass filter and a high-pass filter]
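
One decomposition level can be sketched in Python as below. The slides do not name the wavelet, so the Haar wavelet is an assumption made here for simplicity; the function name is also hypothetical.

```python
import math


def haar_step(signal):
    """One level of a Haar DWT.

    Returns (approximation, detail): the low-pass output carries the
    slow-varying part, the high-pass output carries sudden changes.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail
```

Applying haar_step again to the approximation output gives the next, coarser time scale.
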
16
DWT-Based Detection
  • Re-synthesis
  • Coefficients are aggregated into high, mid and low bands
  • Low-band signal → slow-varying trends
  • High-band signal → highlights sudden variations
  • Mid-band → sum of the rest

17
DWT-Based Detection
  • Detection
  • Compute variance of high and mid-band signals
    over a time interval
  • Deviation score = local variance / global variance
  • Time points whose deviation score is higher than a
    predefined threshold are marked as volume anomalies
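
A minimal Python sketch of the detection step, assuming the deviation score is the ratio of the variance inside a sliding window (local) to the variance of the whole band signal (global). The window length, threshold and function names are illustrative assumptions.

```python
from statistics import pvariance


def deviation_scores(band_signal, window):
    """Local variance of each sliding window divided by the global variance."""
    global_var = pvariance(band_signal)
    scores = []
    for start in range(len(band_signal) - window + 1):
        local_var = pvariance(band_signal[start:start + window])
        scores.append(local_var / global_var if global_var > 0 else 0.0)
    return scores


def volume_anomalies(band_signal, window, threshold):
    """Start indices of windows whose deviation score exceeds the threshold."""
    scores = deviation_scores(band_signal, window)
    return [i for i, score in enumerate(scores) if score > threshold]
```

A flat signal with one short burst yields high scores only for the windows that contain the burst.
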
18
Portscan Detection
  • 2 online portscan detection techniques
  • Threshold Random Walk (TRW)
  • Time Access Pattern Scheme (TAPS)

19
Threshold Random Walk (TRW)
  • 2 hypotheses
  • H0: a source is a normal host
  • H1: a source is a scanner
  • Rationale
  • A normal host is far more likely to have a
    successful connection than a scanner, which
    randomly probes the address space.

20
Threshold Random Walk (TRW)
  • Hypothesis testing on a sequence of events
  • To determine which hypothesis is more likely
  • Let Y = (Y1, Y2, ..., Yi) represent the random
    vector of connections observed from a source,
  • where Yi = 0 if the i-th connection is successful
    and Yi = 1 otherwise

21
Threshold Random Walk (TRW)
  • Likelihood Ratio
  • When the Likelihood Ratio crosses either one of
    two predefined thresholds, the corresponding
    hypothesis is selected as the most likely.
  • requires 6 observed events to detect scanners
    successfully
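
The sequential test can be sketched in Python as follows. The likelihood ratio is taken to be the running product of Pr[Yi | H1] / Pr[Yi | H0]. The parameter values (theta0, theta1, eta0, eta1) are illustrative assumptions, not the ones used in the slides, so the number of events needed for a decision will differ from the figure quoted above.

```python
def trw_decide(outcomes, theta0=0.8, theta1=0.2, eta0=0.01, eta1=100.0):
    """Threshold random walk over connection outcomes from one source.

    outcomes: iterable of Yi, 0 = successful connection, 1 = failed.
    theta0 / theta1: assumed success probability for a normal host / scanner.
    eta0 / eta1: assumed lower / upper decision thresholds.
    Returns (decision, number of events consumed).
    """
    ratio = 1.0
    n = 0
    for n, y in enumerate(outcomes, start=1):
        if y == 0:
            ratio *= theta1 / theta0              # success: evidence for H0
        else:
            ratio *= (1 - theta1) / (1 - theta0)  # failure: evidence for H1
        if ratio >= eta1:
            return "scanner", n
        if ratio <= eta0:
            return "normal", n
    return "pending", n
```

A source whose successes and failures alternate keeps the ratio near 1 and stays undecided, which is the intended behaviour of a random walk between two thresholds.
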

22
Threshold Random Walk (TRW)
  • TRWSYN - backbone adaptation of TRW
  • Backbone traffic is usually uni-directional
  • Difficult to tell failed from succeeded
    connections
  • TRWSYN oracle
  • Marks single SYN-packet flows as failed
    connections
  • Detects TCP portscans only

23
Time Access Pattern Scheme (TAPS)
  • Access Pattern
  • Observation: a scanner initiates connections to a
    larger spread of
  • destination IP addresses (horizontal scan)
  • port numbers (vertical scan)
  • That means the ratio between distinct IP addresses
    and port numbers is larger for a scanner.

24
Time Access Pattern Scheme (TAPS)
  • Hypothesis test, similar to TRW.
  • Single-packet flow → failed connection
  • In each time bin i, for each source, compute the
    ratio and compare it with a predefined threshold k.
  • Event variable Yi = 0 if ratio < k
  • Yi = 1 if ratio > k
  • Update the likelihood ratio
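
A rough Python sketch of how the per-bin event variables might be produced. Several details are assumptions: the flow record fields (src, dst_ip, dst_port, ts), the function name, and the use of the larger of the two ratios (IPs per port, ports per IP) so that both horizontal and vertical scans produce a large value. The slide only states that the ratio is compared with a threshold k.

```python
from collections import defaultdict


def taps_events(flows, bin_seconds, k):
    """Return {source: [Yi for each time bin in which the source was active]}."""
    per_bin = defaultdict(lambda: (set(), set()))
    for flow in flows:
        bin_index = int(flow["ts"] // bin_seconds)
        ips, ports = per_bin[(flow["src"], bin_index)]
        ips.add(flow["dst_ip"])
        ports.add(flow["dst_port"])

    events = defaultdict(list)
    for (src, _bin_index), (ips, ports) in sorted(per_bin.items(), key=lambda kv: kv[0]):
        # Assumed ratio: spread of destinations relative to ports, or vice versa.
        ratio = max(len(ips) / len(ports), len(ports) / len(ips))
        events[src].append(1 if ratio > k else 0)
    return dict(events)
```

The resulting Yi sequence for each source can then drive a likelihood-ratio update of the same kind as in TRW.
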

25
Trace Data
  • 2 links in a Tier-1 ISP's backbone network
  • 2 OC-48 links between backbone routers on the West
    Coast and East Coast
  • BB-West: large percentage of scanning traffic
  • BB-East: large volume
  • Collected by IPMON

26
Methodology
  • 4 sampling schemes use different parameters
  • Require a common metric for fair comparison
  • We choose: percentage of sampled flows
  • Schemes differ in
  • Memory requirement
  • CPU utilization
27
Methodology
  • Note
  • Although the percentage of sampled flows is fixed,
  • Smart sampling & Sample-and-Hold are still biased towards
    large flows

28
Impact of Sampling on Volume Anomaly Detection
  • Volume Anomaly Detection Result
  • Feature Variation Due to Sampling

29
Detection from the original trace
30
  • 21 abrupt changes in total detected from the original trace
  • Number of detections decreases as the sampling interval increases
  • Random flow sampling performs the best
  • Smart sampling & Sample-and-hold drop much
    faster
  • No false positives in detection

31
Feature Variation Due to Sampling
  • Difference in detection performance
  • Most volume spikes are caused by a sudden increase in
    small packet flows
  • Random flow sampling is unbiased by flow size
  • The others are biased towards large flows
  • Smart sampling and Sample-and-hold are designed to
    track heavy hitters
  • They perform poorly compared to packet sampling

32
Feature Variation Due to Sampling
  • No false positives
  • Simply put, a spike in the samples must have existed in the
    original trace
  • Not an artifact of sampling
  • Sampling only reduces the number of detections and does not cause
    any false detections

33
Feature Variation Due to Sampling
  • Number of detections decreases as the sampling interval increases,
    even in random flow sampling
  • Technique is based on the number of sampled events and local
    variance
  • Hypothesis: sampling introduces distortion in
    variance

[Figure labels: Success, Fail]
34
Feature Variation Due to Sampling
  • Sampling introduces distortion in variance
  • Sampling scales down the original time series
    by a fraction p
  • Assume a variance and an average rate for the original series
  • New scaled-down variance
  • Sampling also involves removal of discrete points
  • i.e. the original point process is sampled
    binomially (binomial random variable)
  • Total variance
35
Feature Variation Due to Sampling
  • Total variance
  • [Equation labels: removal of discrete points; scaled-down variance]
  • > 70% when N = 500
  • Affects detection!
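
The two labelled terms can be illustrated with a short Python check. The equations themselves are missing from the transcript, so this is a reconstruction under stated assumptions: the per-bin count of the original series has mean lam and variance var (symbols chosen here), each event is kept independently with probability p, and the law of total variance then gives p^2 * var (scaled-down variance) plus p * (1 - p) * lam (removal of discrete points). All function names are hypothetical.

```python
from math import comb


def thinned_variance_exact(dist, p):
    """Variance of the sampled count, by direct enumeration.

    dist: {count: probability} for the original per-bin count X;
    the sampled count Y given X is Binomial(X, p).
    """
    pmf = {}
    for x, px in dist.items():
        for y in range(x + 1):
            pmf[y] = pmf.get(y, 0.0) + px * comb(x, y) * p**y * (1 - p) ** (x - y)
    mean = sum(y * q for y, q in pmf.items())
    return sum((y - mean) ** 2 * q for y, q in pmf.items())


def thinned_variance_formula(lam, var, p):
    """Scaled-down variance plus the binomial (removal) term."""
    return p * p * var + p * (1 - p) * lam


def removal_share(lam, var, p):
    """Fraction of the total variance contributed by the removal term."""
    return p * (1 - p) * lam / thinned_variance_formula(lam, var, p)
```

As p shrinks (larger sampling interval N), the removal term shrinks like p while the scaled-down term shrinks like p squared, so the sampling noise takes up a growing share of the variance that the detector relies on.
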
36
Impact of Sampling on Portscan Detection
  • Metrics
  • Desirable to have high Rs and low Rf
  • Focus on the success ratio Rs and the false positive ratio Rf
    (because Rs + Rf− = 1, where Rf− is the false negative ratio)
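
The metric definitions did not survive in the transcript. The Python sketch below assumes one common choice, in which all three ratios are normalised by the number of true scanners; under that assumption the success ratio and the false negative ratio always sum to 1, which is why only the success and false positive ratios need to be reported. The function name is hypothetical.

```python
def portscan_ratios(detected, true_scanners):
    """Return (success ratio, false positive ratio, false negative ratio).

    Assumed definitions, each divided by the number of true scanners:
      success        = true scanners that were detected
      false positive = detected sources that are not true scanners
      false negative = true scanners that were missed
    """
    detected, true_scanners = set(detected), set(true_scanners)
    n_true = len(true_scanners)
    rs = len(detected & true_scanners) / n_true
    rf_pos = len(detected - true_scanners) / n_true
    rf_neg = len(true_scanners - detected) / n_true
    return rs, rf_pos, rf_neg
```

With this normalisation the false positive ratio can exceed 1 when a detector flags many more sources than there are true scanners.
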

37
Impact of Sampling on Portscan Detection
  • Challenge: determining the true scanners
  • Final list of scanners manually generated by
    Sridharan (in Impact of Packet Sampling on
    Portscan Detection) as the ground truth
  • Less interested in absolute accuracy
  • Relative performance as a function of sampling
    scheme and sampling rate

38
TRWSYN under Sampling
  • Rs and Rf ratios for the BB-West trace as
    functions of effective sampling interval for all
    four sampling schemes

39
TRWSYN under Sampling
  • Random Packet Sampling
  • As base case for comparison
  • Success Ratio Rs
  • Initially increases slightly for small N
  • (seems advantageous)
  • Drops off for large N
40
TRWSYN under Sampling
  • Random Packet Sampling
  • As base case for comparison
  • False Positive Ratio Rf
  • Follows similar behaviour to Rs
  • but on a larger scale
  • Increases threefold as N goes from 1 to 10

41
TRWSYN under Sampling
  • 2 key effects of packet sampling
  • Flow-reduction
  • Number of flows observed reduced
  • Flow-shortening
  • Multi-packet flows reduced to single packet flows
  • Recall
  • TRWSYN algorithm
  • Single SYN packet flow → connection failure
  • → potential scanner
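
Both effects can be quantified for a single flow with a small Python sketch, assuming each packet is sampled independently with probability r and that the SYN is the first packet of the flow. The function name and dictionary keys are assumptions.

```python
def packet_sampling_effects(n_packets, r):
    """Probabilities for one flow of n_packets under packet sampling at rate r."""
    others_dropped = (1 - r) ** (n_packets - 1)
    return {
        # Flow-reduction: no packet of the flow is sampled at all.
        "missed": (1 - r) ** n_packets,
        # Flow-shortening: exactly one packet of the flow is sampled.
        "single": n_packets * r * others_dropped,
        # Only the SYN is sampled: the case TRWSYN reads as a failed connection.
        "syn_only": r * others_dropped,
    }
```

For a 10-packet flow at r = 0.1, the chance of being reduced to a single packet is comparable to the chance of being missed entirely, which is why flow-shortening matters at small sampling intervals.
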

42
TRWSYN under Sampling
  • Small sampling interval
  • Flow-reduction → slight impact → high Rs
  • Flow-shortening → substantial impact
  • → more single-packet flows
  • Impact
  • Scanners' multi-packet flows, initially missed
  • → shortened → detected → Rs increases
  • Regular multi-packet flows
  • → shortened → detected → Rf increases

43
TRWSYN under Sampling
  • Large sampling interval
  • Flow-reduction dominates
  • Fewer decisions (detections)
  • Rs and Rf decrease

44
TRWSYN under Sampling
  • 3 Flow sampling schemes
  • Decision based on entire flow
  • No Flow-shortening
  • Flow-reduction dominates the impact
  • Exception
  • Sample-and-Hold
  • Mid-Flow-Shortening
  • Decision only made on SYN packet flows
  • Introduce NO False Positive

45
TRWSYN under Sampling
  • Both Rs and Rf decrease almost monotonically as
    N increases
  • Rf lower than packet sampling

46
TRWSYN under Sampling
  • In terms of Rf
  • Flow sampling >> Packet sampling
  • In terms of Rs,
  • Random Flow Sampling > Random Packet Sampling >
    Smart Sampling > Sample-and-Hold
  • Cause
  • Bias towards Large Flows
  • Suffer more from Flow-reduction

47
TAPS under Sampling
  • Critical parameter: time bin
  • For each sampling scheme,
  • each sampling rate,
  • Use the optimal time bin
  • (the one that maximizes Rs)
  • Optimal time bin is an increasing function of the sampling interval
  • True for both Packet sampling and Flow sampling
    schemes

48
TAPS under Sampling
  • Results of portscan detection with TAPS for Trace
    BB-West

49
TAPS under Sampling
  • Rs decreases as sampling interval increases
  • Random Flow Sampling performs the best
  • Random Packet Sampling performs as well as the
    remaining 2 Flow sampling schemes
  • Cause
  • Bias towards Large Flows
  • Tend to miss small (critical) flows

50
TAPS under Sampling
  • Random Packet Sampling
  • Rf initially increases
  • due to Flow-shortening
  • Then drop off at large sampling interval
  • due to Flow-reduction
  • Flow Sampling schemes
  • No/Minor Flow-shortening
  • Low Rf
  • Monotonically decreases with sampling interval

51
TAPS under Sampling
  • TAPS uses address range distribution for
    detection
  • Insensitive to the 4 schemes
  • No distortion introduced
  • Low Rf
  • e.g. Random Packet Sampling yields 1/10 of the Rf
    of TRWSYN

52
Conclusion
  • Random Flow Sampling
  • Performs the best
  • Prohibitive resource requirement
  • Random Packet Sampling
  • Suffers from Flow-shortening
  • Smart Sampling & Sample-and-Hold
  • Biased towards large flows
  • Perform worse than Random Packet Sampling in
    volume anomaly detection

53
Conclusion
  • All 4 sampling schemes
  • Degrade all 3 anomaly detection algorithms
  • In terms of Rs and Rf
  • Is sampled data sufficient for anomaly detection?
  • → Remains an open question