Statistical Based Anomaly Detection - PowerPoint PPT Presentation

1
Statistical Based Anomaly Detection
  • Based on the joint work of AT&T, Rutgers and NISS
  • Presented by Jinghua Hu

2
Outline
  • Introduction
  • Statistics Based Anomaly Detection
  • Data Sets
  • Models
  • Summary
  • References

3
Introduction
  • Intrusion Detection
  • Misuse Detection
  • Anomaly Detection
  • Modeling Methodology
  • Rule based
  • Specification based
  • Profile based

4
Introduction
  • Profile Based Anomaly Detection
  • Pattern-based profiles
  • Fixed-length patterns (U. of New Mexico)
  • Variable-length patterns (IBM Zurich)
  • Sequence-Match (Purdue)
  • Statistics-based Profiles
  • SRI
  • AT&T, Rutgers and NISS

5
SRI NIDES
  • Data are collected from various sources on the
    system
  • Build frequency-based statistical profiles based
    on long-term distributions for each feature
  • Compare short-term distributions to the profiles
    and get individual scores
  • Integrate individual scores into an overall
    abnormality score

6
AT&T, Rutgers and NISS
  • Data Source
  • Unix command sequences for 50 users, without
    arguments or timestamps
  • Cut into fixed-length blocks
  • Blocks 1-50: clean data, used for training
  • Blocks 51-150: partly contaminated with other
    users' data blocks
  • Unit of analysis: command blocks
  • Goal: detecting masqueraders

7
AT&T, Rutgers and NISS
  • Static Statistical Models
  • Frequency based model
  • Uniqueness model
  • Dynamic Statistical Models
  • Markov Chain
  • Hybrid High-order Markov
  • Other Methods
  • Data Compression
  • IPAM
  • Sequence-Match (from Purdue)

8
Static Statistical Models
  • Frequency-based model
  • Build a probability distribution over commands
    for each user
  • Hypothesis
  • Commands are generated from the given user's
    probability distribution
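
As a rough illustration of the frequency-based idea above, the sketch below estimates one user's command distribution and scores a test block by its average log-probability. The smoothing constant `alpha`, the function names, and the toy command lists are assumptions made for this example; the slides do not specify how the distribution is estimated or tested.

```python
import math
from collections import Counter

def fit_command_distribution(training_commands, vocabulary, alpha=1.0):
    """Estimate P(command) for one user with additive smoothing (assumed)."""
    counts = Counter(training_commands)
    total = len(training_commands) + alpha * len(vocabulary)
    return {cmd: (counts[cmd] + alpha) / total for cmd in vocabulary}

def average_log_probability(block, distribution):
    """Average log-probability of a block; assumes every command is in the vocabulary."""
    return sum(math.log(distribution[cmd]) for cmd in block) / len(block)

# Toy data, purely hypothetical.
vocabulary = ["ls", "cd", "vi", "gcc", "mail"]
profile = fit_command_distribution(["ls", "cd", "ls", "vi", "ls", "cd"], vocabulary)
typical = average_log_probability(["ls", "cd", "ls"], profile)
unusual = average_log_probability(["gcc", "mail", "gcc"], profile)
```

A block that looks like the user's training data gets a higher average log-probability than a block of commands the user rarely types, so a low value would be the suspicious direction in this sketch.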

9
Uniqueness Model
  • Motivation
  • Commands not previously seen in the user's training
    data may indicate a masquerader
  • Rarely used commands may be more discriminative
    than frequently used ones
  • Popularity of commands
  • Uniquely used: used only by a single user
  • Unpopular: used by few users

10
Uniqueness Model
  • Uniqueness Scores
  • Need command usage information of the whole group
    of users for training
  • Assign a uniqueness ID to each command based
    on the number of users in the group who have used
    the command before

11
Uniqueness Model
  • Test Score
  • Check testing data command-by-command to compute
    a test score
  • Incorporates multiple factors, including:
  • Uniqueness index of commands
  • Weights based on whether the command was seen
    before in the given user's training data
  • Command usage relative to other users
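
One way to combine the three factors listed above is sketched below. It loosely follows the general form of the uniqueness score described in the Schonlau et al. reference, but the exact weighting here is an assumption, and the names `build_group_stats` and `uniqueness_score` are hypothetical. The uniqueness index of a command is taken as 1 minus the fraction of users who used it in training.

```python
from collections import Counter

def build_group_stats(training):
    """training: dict mapping user -> list of training commands."""
    rel_freq = {}              # user -> {command: relative frequency in training}
    users_per_cmd = Counter()  # command -> number of users who used it
    for user, cmds in training.items():
        counts = Counter(cmds)
        rel_freq[user] = {c: n / len(cmds) for c, n in counts.items()}
        users_per_cmd.update(counts.keys())
    return len(training), rel_freq, users_per_cmd

def uniqueness_score(user, block, num_users, rel_freq, users_per_cmd):
    """Higher scores are more suspicious in this sketch."""
    score = 0.0
    for cmd in block:
        # Uniqueness index: 1 for a command nobody used, 0 for one everybody used.
        uniqueness = 1.0 - users_per_cmd.get(cmd, 0) / num_users
        if cmd in rel_freq[user]:
            # Seen before by this user: negative weight, scaled by the user's
            # usage relative to the whole group (assumed form).
            group_freq = sum(f.get(cmd, 0.0) for f in rel_freq.values())
            weight = -rel_freq[user][cmd] / group_freq
        else:
            # Never seen in this user's training data: full positive weight.
            weight = 1.0
        score += weight * uniqueness
    return score / len(block)

# Toy data, purely hypothetical.
training = {
    "alice": ["ls", "cd", "vi", "ls"],
    "bob": ["ls", "gcc", "make", "gcc"],
    "carol": ["ls", "mail", "cd", "mail"],
}
stats = build_group_stats(training)
own = uniqueness_score("alice", ["vi", "ls", "cd"], *stats)
foreign = uniqueness_score("alice", ["gcc", "make", "gcc"], *stats)
```

Commands used by every user contribute nothing, while a rare command the user has never typed pushes the score up, which matches the motivation on the earlier Uniqueness Model slide.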

12
Markov Chain
  • Motivation
  • Dynamic models capture more information than
    static ones: the order of commands, not only
    their frequencies
  • Model the transition probabilities between commands
  • Hypothesis
  • The one-step transition probabilities of the
    commands conform to the state transition
    probability matrix of the given user
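
The hypothesis above could be checked roughly as follows: estimate a one-step transition matrix from a user's training commands, then compute the log-likelihood of a test block under that matrix. The smoothing constant `alpha` and the function names are assumptions for this sketch, and every command is assumed to belong to a fixed vocabulary.

```python
import math
from collections import defaultdict

def fit_transition_matrix(commands, vocabulary, alpha=0.5):
    """Estimate P(next | previous) with additive smoothing (assumed)."""
    counts = defaultdict(lambda: defaultdict(float))
    for prev, cur in zip(commands, commands[1:]):
        counts[prev][cur] += 1
    matrix = {}
    for prev in vocabulary:
        row_total = sum(counts[prev].values()) + alpha * len(vocabulary)
        matrix[prev] = {
            cur: (counts[prev][cur] + alpha) / row_total for cur in vocabulary
        }
    return matrix

def transition_log_likelihood(block, matrix):
    """Log-likelihood of the block's transitions under the given matrix."""
    return sum(math.log(matrix[p][c]) for p, c in zip(block, block[1:]))

# Toy data, purely hypothetical.
vocabulary = ["ls", "cd", "vi"]
matrix = fit_transition_matrix(
    ["ls", "cd", "ls", "cd", "vi", "ls", "cd"], vocabulary
)
familiar = transition_log_likelihood(["ls", "cd", "ls", "cd"], matrix)
strange = transition_log_likelihood(["vi", "vi", "cd", "cd"], matrix)
```

With K distinct commands the matrix has K x K entries, which hints at why a large parameter space with too little supporting data becomes a concern.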

13
Markov Chain
  • Test Statistic
  • Log-Likelihood Ratio
  • Problems with the standard model
  • Large dimension of parameter space without enough
    supporting data
  • Large computational effort in training
  • Alternative Hypothesis
  • Reduced dimension

14
Markov Chain
  • Techniques for dimension reduction
  • Principal Component Regression
  • Choose linear combinations of user deviations
    from the state transition matrix that have
    maximum variance and are uncorrelated
  • Alternative Test Statistic
  • Fisher score statistic
  • Bayes factor

15
Hybrid High-order Markov Chain
  • Motivation
  • Multi-step transitions may be more realistic in
    modeling users' behavior
  • High-order Markov chains provide richer
    information than 1-step Markov chains
  • Test Statistics
  • Log-Likelihood Ratio

16
Hybrid High-order Markov Chain
  • Techniques for dimension reduction
  • Restrict attention to a subset of the most used
    commands
  • MTD (mixture transition distribution): represent
    high-order state transition probabilities as
    linear combinations of one-step transition
    probabilities
  • For rarely used commands, use statistical-independence
    models instead of high-order Markov chains
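
The linear-combination idea can be sketched as below, assuming a single shared one-step matrix and one non-negative weight per lag, with the weights summing to 1. The function name, the two-command matrix, and the weights are hypothetical values chosen for illustration.

```python
def mtd_probability(history, next_cmd, one_step, lag_weights):
    """P(next_cmd | history) as a weighted sum of one-step probabilities.

    history: past commands, most recent last.
    lag_weights[g - 1] is the weight given to the command g steps back.
    """
    return sum(
        weight * one_step[history[-lag]][next_cmd]
        for lag, weight in enumerate(lag_weights, start=1)
    )

# Toy two-command example, purely hypothetical.
one_step = {
    "a": {"a": 0.9, "b": 0.1},
    "b": {"a": 0.2, "b": 0.8},
}
lag_weights = [0.7, 0.3]  # lag 1 counts more than lag 2
p_a = mtd_probability(["b", "a"], "a", one_step, lag_weights)
p_b = mtd_probability(["b", "a"], "b", one_step, lag_weights)
```

A full order-2 chain over K commands would need on the order of K^3 probabilities, whereas this form needs only the K x K matrix plus one weight per lag, which is the dimension reduction the slide refers to.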

17
Data Compression
  • Motivation
  • Users tend to repeat parts of their own past
    activity
  • Test data appended to the historical training
    data should compress more readily when the test
    data indeed come from the same user rather than
    from a masquerader

18
Data Compression
  • Test Score
  • Calculate the number of additional bytes needed
    to compress the test data when appended to the
    training data
  • Uses the Lempel-Ziv algorithm for data compression
  • An alarm is raised when the score is above a
    threshold
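
A minimal sketch of the score described above, using Python's `zlib` as a stand-in compressor. `zlib` implements DEFLATE, which is built on a Lempel-Ziv (LZ77) scheme; the slides do not say which Lempel-Ziv variant or tool was used, so this choice and the toy data are assumptions.

```python
import zlib

def compressed_size(commands):
    """Size in bytes of the compressed, newline-joined command list."""
    data = "\n".join(commands).encode("utf-8")
    return len(zlib.compress(data, 9))

def compression_score(training, block):
    """Additional bytes needed to compress the block appended to the training data."""
    return compressed_size(training + block) - compressed_size(training)

# Toy data, purely hypothetical.
training = ["ls", "cd", "vi", "make", "gcc"] * 200
own_block = ["ls", "cd", "vi", "make", "gcc"] * 20
foreign_block = [f"tool{i * 37 % 101}" for i in range(100)]

own_score = compression_score(training, own_block)
foreign_score = compression_score(training, foreign_block)
```

The block that repeats the training history adds far fewer bytes than the block of unfamiliar commands, so comparing the score against a threshold gives the alarm rule stated on the slide.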

19
IPAM
  • Incremental Probabilistic Action Modeling
  • Based on 1-step command transition probabilities
    estimated from the training data
  • Use an exponential updating scheme to update the
    transition probabilities
  • Prediction
  • Predict the next command by choosing the one
    corresponding to the highest transition
    probability

20
IPAM
  • Test Score
  • A prediction is labeled as good if the next
    command turns out to be among the top-4 predicted
    commands
  • The fraction of good predictions of the test data
    forms the score
  • An alarm is raised when the score is below the
    threshold
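
The exponential update and the top-4 test score could look roughly like this. The update factor `alpha=0.8`, the class layout, and the handling of unseen commands (an empty prediction list) are assumptions for this sketch and may differ from the method in the Davison and Hirsh reference; the model is also not updated during scoring here.

```python
from collections import defaultdict

class IPAM:
    """Simplified incremental model of one-step command transitions."""

    def __init__(self, alpha=0.8):
        self.alpha = alpha
        self.rows = defaultdict(dict)  # previous command -> {next command: weight}

    def update(self, prev, cur):
        """Decay all entries in the row for prev, then boost the observed one."""
        row = self.rows[prev]
        for cmd in row:
            row[cmd] *= self.alpha
        row[cur] = row.get(cur, 0.0) + (1.0 - self.alpha)

    def train(self, commands):
        for prev, cur in zip(commands, commands[1:]):
            self.update(prev, cur)

    def top_predictions(self, prev, k=4):
        """The k most likely next commands after prev (empty if prev is unseen)."""
        row = self.rows.get(prev, {})
        return sorted(row, key=row.get, reverse=True)[:k]

def ipam_score(model, block, k=4):
    """Fraction of transitions whose next command is among the top-k predictions."""
    pairs = list(zip(block, block[1:]))
    good = sum(cur in model.top_predictions(prev, k) for prev, cur in pairs)
    return good / len(pairs)

# Toy data, purely hypothetical.
model = IPAM()
model.train(["ls", "cd", "ls", "cd", "vi", "ls", "cd"])
own = ipam_score(model, ["ls", "cd", "ls", "cd"])
foreign = ipam_score(model, ["gcc", "make", "gcc"])
```

Because recent transitions are weighted more heavily than old ones, the model drifts with the user's habits; a low fraction of good predictions is the suspicious direction.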

21
Sequence-Match
  • Based on pattern matching
  • Profiles represented by sequences of length N
  • Using a window sliding along testing data,
    compute a similarity measure between the most
    recent N commands and the profiles
  • The measure is the count of matches where
    adjacent matches get higher weights
  • The maximum of all similarity values forms the
    test score
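
The similarity measure and sliding window described above might be sketched as follows. The specific weighting (each match in a run of adjacent matches counts one more than the previous one) is an assumption in the spirit of the Lane and Brodley reference, and the function names are hypothetical.

```python
def similarity(seq_a, seq_b):
    """Position-wise match count where runs of adjacent matches earn growing weight."""
    total, run = 0, 0
    for a, b in zip(seq_a, seq_b):
        run = run + 1 if a == b else 0
        total += run
    return total

def windows(commands, n):
    """All contiguous subsequences of length n."""
    return [tuple(commands[i:i + n]) for i in range(len(commands) - n + 1)]

def sequence_match_scores(profile_commands, test_commands, n):
    """For each test window, the maximum similarity to any profile sequence."""
    profile = set(windows(profile_commands, n))
    return [
        max(similarity(window, stored) for stored in profile)
        for window in windows(test_commands, n)
    ]

# Toy data, purely hypothetical.
scores = sequence_match_scores(
    ["ls", "cd", "vi", "ls", "cd"], ["ls", "cd", "vi", "gcc"], n=3
)
```

A window that reproduces a stored sequence exactly gets the highest possible score for its length, while a window ending in an unfamiliar command scores lower; low scores are the suspicious direction.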

22
Summary
  • Test Results
  • Masqueraders fall into two major groups
  • Either easy to detect by all methods
  • Or hard to detect and thus missed by all
  • False Alarm
  • Different methods are prone to raise false alarms
    for different users
  • False alarms often appear in consecutive blocks

25
Summary
  • Performance Comparison
  • Dynamic models work better than static models,
    but not significantly
  • Uniqueness method dominates when a false alarm
    rate in the 1-5% range is desired
  • Compression method seems to be inferior
  • Dynamic models are similar to pattern-based
    profiles, with the additional information of
    probabilities associated with transitions

27
Summary
  • Correlations between methods
  • Highly correlated: Uniqueness and Hybrid
    High-order Markov
  • Correlated: IPAM and Sequence-Match
  • 1-step Markov Chain can be associated with
    either group
  • Data Compression stands alone

28
Correlations between Methods
[Figure not transcribed; labels: Uniqueness, IPAM, Markov Chain, Hybrid High-order Markov, Sequence Match, Data Compression]
29
Summary
  • Profile Updating
  • Profile updating helps the models to adapt to
    profile shift
  • Uniqueness method and Hybrid High-order Markov
    method benefit from updating
  • Detection methods can become vulnerable when
    profiles are updated during testing
  • Compression method works better without updating

30
References
  • D. Anderson, T. F. Lunt, H. Javitz, A. Tamaru, A.
    Valdes. Detecting Unusual Program Behavior Using
    the Statistical Component of the Next-generation
    Intrusion Detection Expert System (NIDES).
    Computer Science Laboratory, SRI-CSL-95-06, May
    1995.
  • M. Theus, M. Schonlau. Intrusion Detection
    Based on Structural Zeroes. Statistical Computing
    & Graphics Newsletter, Vol. 9, No. 1, pp. 12-17,
    1998.
  • W. DuMouchel, M. Schonlau. A Fast Computer
    Intrusion Detection Algorithm Based on Hypothesis
    Testing of Command Transition Probabilities.
    KDD98, New York, pp. 189-193, 1998.
  • M. Schonlau, W. DuMouchel, W. Ju, A. Karr,
    M. Theus, Y. Vardi. Computer Intrusion:
    Detecting Masquerades. Statistical Science,
    February 2001.

31
References
  • W. DuMouchel, M. Schonlau. A Comparison of Test
    Statistics for Computer Intrusion Detection Based
    on Principal Components Regression of Transition
    Probabilities. Proceedings of the 30th Symposium
    on the Interface: Computing Science and
    Statistics, 30, pp. 404-413, 1999.
  • B. D. Davison, H. Hirsh. Predicting Sequences
    of User Actions. In Predicting the Future: AI
    Approaches to Time Series Problems, pp. 5-12.
    AAAI Press, 1998.
  • W.-H. Ju, Y. Vardi. Profiling UNIX
    Users and Processes Based on Rarity of Occurrence
    Statistics with Applications to Computer
    Intrusion Detection. RAID 2001.
  • T. Lane, C. E. Brodley. Temporal Sequence
    Learning and Data Reduction for Anomaly
    Detection. In Proceedings of the Fifth ACM
    Conference on Computer and Communications
    Security, pp. 150-158, 1998.