Intrusion Detection Modeling Technique and Experiment Design - PowerPoint PPT Presentation

1
Intrusion Detection Modeling Technique and
Experiment Design

  • Yuhong Dong
  • ydong_at_cse.fau.edu
  • March 19, 2004

2
Table of Contents
  • Review of IDS systems (anomaly detection and misuse
    detection)
  • IDS Modeling Algorithms
  • -- Classification Modeling
  • -- Association Rule
  • -- Frequent Episode
  • Feature Construction
  • Experiments
  • Conclusion

3
Overview of IDS Systems
  • Anomaly Detection System
  • IDES flags observed activities that deviate
    significantly from the established normal usage
    profiles
  • Misuse Detection System
  • IDIOT and STAT use patterns of well-known
    attacks or weak spots of the system to match and
    identify known intrusion patterns or signatures

4
Building an IDS Is Hard Work
  • System builders rely on their intuition and
    experience to select the statistical measures for
    anomaly detection.
  • Experts first analyze and categorize attack
    scenarios and system vulnerabilities, and then
    hand-code the corresponding rules and patterns
    for misuse detection.

5
Algorithms
  • Classification
  • maps a data item into one of several
    predefined categories (normal and abnormal)
  • -- decision trees or rules
  • Link analysis
  • determines relations between fields in the
    database records, that is, correlations of system
    features in audit data.
  • -- A programmer, for example, may have
    emacs highly associated with C files
  • Sequence analysis
  • models sequential patterns. These algorithms
    can discover which time-based sequences of audit
    events frequently occur together.
  • -- patterns from audit data containing
    network-based denial-of-service (DOS) attacks
    suggest that several per-host and per-service
    measures should be included.

6
Classification Modeling: normal / intrusion,
example of telnet records

[Table 1: example telnet records; not reproduced in this transcript]
  • hot: count of accesses to system directories
  • compromised: count of file/path "not found"
    errors and "Jump to" instructions
7
Classification Modeling: Example Ripper Rules
from Telnet Records
Ripper selects the most distinguishing feature values for
identifying the intrusions. These rules can
first be inspected and edited by security experts
and then be incorporated into a misuse detection
system. The accuracy of a classification model
depends directly on the set of features provided
in the training data. For example, if the
features hot, compromised and root_shell were
removed from the records in Table 1, Ripper
would not be able to produce accurate rules to
identify buffer overflow connections.
8
Association Rules
The goal of mining association rules is to derive
multi-feature correlations from a database
table. support(X) is defined as the percentage of
records that contain item set X. An association
rule is an expression $X \rightarrow Y,\ [c, s]$, where
$s = \mathrm{support}(X \cup Y)$ is the support of the rule and
$c = \mathrm{support}(X \cup Y) / \mathrm{support}(X)$
is the confidence.
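
The support and confidence definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the mining algorithm itself: the record contents and the helper names `support` and `rule_stats` are assumptions made for the example, and values are returned as fractions rather than percentages.

```python
from typing import FrozenSet, List, Tuple

Record = FrozenSet[str]


def support(itemset: Record, records: List[Record]) -> float:
    """Fraction of records that contain every item in `itemset`."""
    if not records:
        return 0.0
    hits = sum(1 for record in records if itemset <= record)
    return hits / len(records)


def rule_stats(x: Record, y: Record, records: List[Record]) -> Tuple[float, float]:
    """Return (confidence, support) for the rule X -> Y."""
    s = support(x | y, records)          # s = support(X u Y)
    support_x = support(x, records)
    c = s / support_x if support_x > 0 else 0.0  # c = s / support(X)
    return c, s


# Hypothetical audit records (illustrative data only).
records = [
    frozenset({"cmd=emacs", "ext=c"}),
    frozenset({"cmd=emacs", "ext=c"}),
    frozenset({"cmd=emacs", "ext=tex"}),
    frozenset({"cmd=ls"}),
]

c, s = rule_stats(frozenset({"cmd=emacs"}), frozenset({"ext=c"}), records)
print(f"confidence={c:.2f} support={s:.2f}")
```

Here two of the four records contain both items, and three contain `cmd=emacs`, so the rule has support 2/4 and confidence 2/3.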
9
Frequent Episodes
  • Given a set of time-stamped event records, where
    each record is a set of items, an interval $[t_1, t_2]$
    is the sequence of event records that starts at
    timestamp $t_1$ and ends at $t_2$.
  • support(X) is the ratio between the number of
    minimal occurrences that contain X and the total
    number of event records.
  • A frequent episode rule is the expression
    $X, Y \rightarrow Z,\ [c, s, w]$, where
    $s = \mathrm{support}(X \cup Y \cup Z)$ is the
    support of the rule,
    $c = \mathrm{support}(X \cup Y \cup Z) / \mathrm{support}(X \cup Y)$
    is the confidence, and $w = t_2 - t_1$ is the interval width.
10
Feature Construction
  • Conditions
  • -- Network Intrusion Detection System
  • -- Algorithm: frequent episodes
  • -- Pre-processing: tcpdump data
  • Experiment
  • -- apply the frequent episodes program to
    both normal connection data and intrusion data,
    and compare the resulting patterns to find the
    intrusion-only patterns.
  • -- Then apply the algorithm to construct features
    from the syn flood pattern. The result is a count
    of connections to the same dst_host in the past 2
    seconds and, among these connections, the
    percentage that have the same service
    and the percentage that have the S0 flag.
  • Open problems
  • -- how to decide the right time window
    value w.
  • -- how to select the appropriate features to
    detect an intrusion
  • -- how to select the right axis and
    reference features to generate the most
    distinguishing and useful intrusion patterns
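
The three syn flood features described above (connection count to the same dst_host in the past 2 seconds, share with the same service, share with the S0 flag) might be computed as in this sketch. The `Conn` record layout, the feature names in the returned dictionary, and the sample connections are assumptions for illustration; the current connection is counted as part of its own window.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Conn:
    time: float      # connection start time in seconds
    dst_host: str
    service: str
    flag: str        # connection status, e.g. "S0" or "SF"


def traffic_features(conns: List[Conn], idx: int, window: float = 2.0) -> Dict[str, float]:
    """Time-window features for conns[idx]; `conns` must be sorted by time."""
    cur = conns[idx]
    # Connections to the same destination host within the past `window` seconds,
    # including the current connection itself.
    recent = [c for c in conns[:idx + 1]
              if c.dst_host == cur.dst_host and cur.time - c.time <= window]
    count = len(recent)
    same_service = sum(1 for c in recent if c.service == cur.service)
    s0 = sum(1 for c in recent if c.flag == "S0")
    return {
        "count": count,
        "same_service_rate": same_service / count,
        "s0_rate": s0 / count,
    }


# Hypothetical connection records (illustrative data only).
conns = [
    Conn(0.0, "victim", "http", "S0"),
    Conn(0.5, "victim", "http", "S0"),
    Conn(1.0, "other", "smtp", "SF"),
    Conn(1.5, "victim", "http", "S0"),
    Conn(1.8, "victim", "ftp", "SF"),
    Conn(5.0, "victim", "http", "SF"),
]

print(traffic_features(conns, 3))
print(traffic_features(conns, 5))
```

A burst of half-open (S0) connections to one host and service yields a high count with both rates near 1, which is the signature the syn flood pattern suggests.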

11
Experiments
  • Data source: DARPA data
  • -- Data Pre-processing
  • Misuse Detection
  • -- Manual and Automatic Feature Construction
  • -- Detection Models
  • -- Results
  • User Anomaly Detection
  • Conclusion and Future Directions

12
Experiment
  • Objective of the Experiment
  • -- survey and evaluate the state of the art
    in intrusion detection research.
  • Procedure
  • -- Each participating site was required to
    build intrusion detection models using the
    training data, and send the results on the test
    data back to DARPA for performance
    evaluation.
  • The DARPA data
  • -- 4 gigabytes of compressed tcpdump data from
    7 weeks of network traffic.
  • -- This data can be processed into about 5
    million connection records of about 100 bytes
    each.
  • -- The data contains the content of every packet
    transmitted between hosts inside and outside a
    simulated military base.

13
Experiment
  • DARPA data (continued)
  • Four main categories of attacks were
    simulated
  • -- DOS, denial-of-service, for example, syn
    flood
  • -- R2L, unauthorized access from a remote
    machine, for example, password guessing
  • -- U2R, unauthorized access to local superuser
    privileges by a local unprivileged user,
    for example, buffer overflow attacks
  • -- Probing, surveillance and probing, for
    example, port-scan, ping-sweep
  • Data pre-processing: each record includes a set of
    intrinsic features (the feature list is not
    reproduced in this transcript)
  • Misuse detection: feature construction and
    detection models

14
Experiment: Feature Construction and Detection
Models
  • Detection Models
  • -- traffic model: DOS and Probing attacks
  • -- host-based traffic model: slow Probing
    attacks
  • -- content model: R2L and U2R attacks
  • Results

[Figure: detection rate (y-axis) versus false alarm rate (x-axis)]
The false alarm rate is calculated as
the percentage of normal connections classified
as intrusions.
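
As a small worked example of the two axes, the sketch below computes both rates from labeled connections. The false alarm rate follows the definition given on the slide; the detection rate is assumed here to be the fraction of intrusion connections that are flagged, which the slide does not spell out. The labels and the function name `evaluate` are hypothetical.

```python
from typing import List, Tuple


def evaluate(actual: List[str], predicted: List[str]) -> Tuple[float, float]:
    """Return (false_alarm_rate, detection_rate) as fractions.

    Any label other than "normal" is treated as an intrusion.
    """
    pairs = list(zip(actual, predicted))
    normal_preds = [p for a, p in pairs if a == "normal"]
    attack_preds = [p for a, p in pairs if a != "normal"]
    # Normal connections wrongly classified as intrusions.
    false_alarms = sum(1 for p in normal_preds if p != "normal")
    # Intrusion connections that were flagged (assumed definition).
    detected = sum(1 for p in attack_preds if p != "normal")
    false_alarm_rate = false_alarms / len(normal_preds) if normal_preds else 0.0
    detection_rate = detected / len(attack_preds) if attack_preds else 0.0
    return false_alarm_rate, detection_rate


# Hypothetical labels (illustrative data only).
actual = ["normal", "normal", "normal", "normal", "dos", "dos", "probe", "u2r"]
predicted = ["normal", "dos", "normal", "normal", "dos", "dos", "probe", "normal"]

far, dr = evaluate(actual, predicted)
print(f"false alarm rate={far:.2f} detection rate={dr:.2f}")
```

One of four normal connections is flagged and three of four intrusions are caught, giving a false alarm rate of 0.25 and a detection rate of 0.75.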
15
Experiment: Performance
  • This is a misuse detection system, so it performs
    better on known attacks than on unknown
    attacks. For all intrusions, an overall detection
    rate below 70% is hardly satisfactory in a
    mission-critical environment.

16
Experiment: User Anomaly Detection
  • The initial exploratory approach is to mine the
    frequent patterns from user command data, and
    merge or add the patterns into an aggregate set
    to form the normal usage profile of a user.
  • A new pattern can be merged with an old pattern
    if they have the same left-hand sides and
    right-hand sides, their support values are within
    5% of each other, and their confidence values
    are also within 5% of each other.
  • To analyze a user login session, we mine the
    frequent patterns from the sequence of commands
    during this session. This new pattern set is
    compared with the profile pattern set and a
    similarity score is assigned. Assume that the new
    set has n patterns and, among them, m
    patterns have matches in the profile
    pattern set. The similarity score is then simply
    m/n. A higher similarity score means a higher
    likelihood that the user's behavior agrees with
    his or her historical profile.
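
The m/n similarity score can be sketched as below. Two assumptions are made for the example: "within 5%" is read as an absolute difference of at most 0.05 on support and confidence expressed as fractions, and a session pattern "matches" a profile pattern under the same test used for merging. The `Pattern` type, the function names, and the sample patterns are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple


class Pattern(NamedTuple):
    lhs: FrozenSet[str]
    rhs: FrozenSet[str]
    support: float      # fraction in [0, 1]
    confidence: float   # fraction in [0, 1]


def matches(a: Pattern, b: Pattern, tol: float = 0.05) -> bool:
    """Same left- and right-hand sides, support and confidence within `tol`."""
    return (a.lhs == b.lhs and a.rhs == b.rhs
            and abs(a.support - b.support) <= tol
            and abs(a.confidence - b.confidence) <= tol)


def similarity(session: List[Pattern], profile: List[Pattern]) -> float:
    """m/n: share of the n session patterns that have a match in the profile."""
    if not session:
        return 0.0
    m = sum(1 for p in session if any(matches(p, q) for q in profile))
    return m / len(session)


def pat(lhs: str, rhs: str, support: float, confidence: float) -> Pattern:
    return Pattern(frozenset({lhs}), frozenset({rhs}), support, confidence)


# Hypothetical command patterns (illustrative data only).
profile = [pat("vi", "gcc", 0.40, 0.80), pat("ls", "cd", 0.30, 0.60)]
session = [
    pat("vi", "gcc", 0.42, 0.78),  # matches the first profile pattern
    pat("ls", "cd", 0.10, 0.60),   # support differs too much
    pat("su", "rm", 0.20, 0.90),   # not in the profile at all
    pat("ls", "cd", 0.31, 0.62),   # matches the second profile pattern
]

score = similarity(session, profile)
print(score)
```

Two of the four session patterns find a match, so the score is 0.5; a session dominated by unseen patterns such as the `su`/`rm` one would score lower.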

17
Conclusion and Future Directions
  • Data generated from network traffic monitoring
    tends to have very high volume, dimensionality
    and heterogeneity, and there is a need for high
    performance modeling algorithms that will scale
    to very large network traffic data sets.
  • Network data is temporal (streaming) in nature,
    and development of algorithms for mining data
    streams is necessary for building real-time
    intrusion detection systems.
  • Low frequency of computer attacks requires
    modification of standard data mining algorithms
    for their detection.
  • Cyber attacks may be launched from several
    different locations and targeted to many
    different destinations, thus creating a need to
    analyze network data from several network
    locations in order to detect these distributed
    attacks.

18
Reference
  • A Data Mining Framework for Building Intrusion
    Detection Models