SSCI - PowerPoint PPT Presentation

About This Presentation
Title:

SSCI

Slides: 24
Provided by: vikramma
Transcript and Presenter's Notes

Title: SSCI


1


SSCI 1301
DARPA OASIS PI Meeting, Santa Fe, NM, Jul 24-27, 2001
Intelligent Active Profiling for Detection and Intent Inference of Insider
Threat in Information Systems
Joao B. D. Cabrera and Raman K. Mehra, Scientific Systems Company, Inc.
Lundy Lewis, Aprisma Inc.
Wenke Lee, North Carolina State Univ.


SBIR
Phase I
Topic No. SB002-039
Contract No. DAAH01-01-C-R027
2
Objective: Classifying and Responding to Insider
Threats
  • Objectives: Design and evaluate IDSs capable of
    classifying and responding to Insider Threats;
    investigate the use of Network Management Systems
    (NMSs) as a vehicle.
  • Misuse/Intrusion Tolerance is achieved by
    having an adequate and timely response.
  • Technology: Statistical Pattern Recognition and
    AI for the design of detectors and classifiers;
    NMSs for data collection and response
    coordination.
  • Approach: Utilize the Benchmark Problem for
    proof-of-concept studies; examine the
    applicability of NMSs and peripherals for
    monitoring and response.

3
Towards Adequate and Timely Response
  • Adequate:
  • High Accuracy: few False Alarms, many
    Detections.
  • Distinguish among attacks: different attacks
    elicit different types of response.
  • Distinguish faults from attacks.

  • Timely:
  • Detect the Attack before it is too late to
    respond.

4
Question 1: What threats/attacks is your project
considering?
Insider Attacks: password stealing,
unauthorized database access, email snooping,
etc. For proof-of-concept purposes, we
investigated the Benchmark Problem of System
Calls made by Unix's sendmail. However,
the technologies and tools we are developing are
applicable to any situation in which the
observables are sequences of possibly correlated
categorical variables, such as Audit Records by BSM in
Unix or Object Access Auditing in Windows NT.
5
Question 2: What assumptions does your project
make?
1. Data sets corresponding to normal, malicious
and faulty behavior are available for the
construction and testing of detection schemes
(Training Stage and Testing Stage).
2. The observables for normal, malicious and faulty
behavior are sequences of categorical variables.
3. Patterns capable of differentiating between
different types of malicious activity and faults
exist, and are learnable by special purpose
algorithms (verified in the effort).
4. If 3 holds, there is time to take preventive
action when malicious activity is detected.
6
Question 3: What policies can your project
enforce?
If the detection system indicates the
presence of malicious activity, a response will
be triggered. For the specific case of the
Benchmark Problem, typical responses would be to
kill the process, or delay its execution until it
times out. Intent Inference gives the
capability of specializing the response.
=> The project aims to develop a capability,
Intent Inference, which can be used as a
component of Intrusion Tolerant Architectures.
7
Benchmark Problem: Detect malicious activity by
monitoring System Calls made by Privileged
Processes in Unix
Originally suggested by C. Ko, G. Fink, and K.
Levitt (1994). Extensively studied by the UNM
Group (S. Forrest and others), starting with "A
Sense of Self for Unix Processes" (1996).
Programs: sendmail, lpr, ls, ftp, finger.
Well Investigated Problem: our results could be
compared with previous efforts. We concentrated
on sendmail: data sets for six types of
anomalies (five attacks and one fault) are
available.
8
Benchmark Problem (cont.)
UNM Finding: A relatively small dictionary of
short sequences (901 sequences of length 6 for
sendmail) provides a very good characterization
of normality for several Unix processes. The
dictionary is constructed using a Training Set of
Normal behavior. Sequences not belonging to
this dictionary are called abnormal sequences.
Intrusions are detected if a process contains
too many abnormal sequences. Processes are
labeled as normal or intrusions; all intrusions
receive the same label.
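The dictionary-and-count scheme described on this slide can be sketched as follows. This is a minimal illustration, not the UNM implementation: the function names, the toy traces, and the choice to count every abnormal window occurrence (rather than distinct abnormal sequences) are assumptions. The slide's window length of 6 is the default; the toy example uses 3 to stay small.

```python
from typing import Iterable, List, Set, Tuple

Window = Tuple[str, ...]


def windows(trace: List[str], k: int = 6) -> List[Window]:
    """Return all contiguous length-k windows of a system-call trace."""
    return [tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)]


def build_normal_dictionary(normal_traces: Iterable[List[str]], k: int = 6) -> Set[Window]:
    """Collect every window seen in the Training Set of Normal behavior."""
    dictionary: Set[Window] = set()
    for trace in normal_traces:
        dictionary.update(windows(trace, k))
    return dictionary


def anomaly_count(trace: List[str], normal_dict: Set[Window], k: int = 6) -> int:
    """Count windows of the trace that do not belong to the Normal Dictionary."""
    return sum(1 for w in windows(trace, k) if w not in normal_dict)


def is_intrusion(trace: List[str], normal_dict: Set[Window], threshold: int, k: int = 6) -> bool:
    """Flag a process whose abnormal-sequence count exceeds the threshold."""
    return anomaly_count(trace, normal_dict, k) > threshold


# Toy data (hypothetical call names, window length 3).
normal_traces = [
    ["open", "read", "write", "close"],
    ["open", "read", "read", "close"],
]
normal_dict = build_normal_dictionary(normal_traces, k=3)
suspect = ["open", "read", "exec", "close"]
suspect_count = anomaly_count(suspect, normal_dict, k=3)
```

Note that the detector returns a single yes/no label, which is why all intrusions look alike to it.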
9
Privileged Programs and the space of OS calls
10
Anomaly Count Detector (UNM)
  • Determining the
    Threshold:
  • Anomalous Traces not available: Anomaly
    Detection Problem.
  • Anomalous Traces available: Classification
    Problem.
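The slide does not say how the threshold is actually chosen in either case. The sketch below only illustrates the distinction it draws, under stated assumptions: with normal traces only, the threshold is placed just above the largest normal count; with labeled anomalous traces, it is picked to minimize false alarms plus misses. Both rules and all names are hypothetical.

```python
from typing import List


def threshold_from_normal(normal_counts: List[int], margin: int = 0) -> int:
    """Anomaly-detection setting: only normal traces are available.

    Assumed rule: use the largest anomaly count seen on normal traces,
    plus an optional safety margin.
    """
    return max(normal_counts) + margin


def threshold_from_labeled(normal_counts: List[int], anomalous_counts: List[int]) -> int:
    """Classification setting: anomalous traces are available too.

    Assumed rule: pick the candidate threshold with the fewest total
    errors, where a trace is flagged when its count exceeds the threshold.
    """
    candidates = sorted(set(normal_counts) | set(anomalous_counts))
    best_t, best_err = candidates[0], None
    for t in candidates:
        false_alarms = sum(c > t for c in normal_counts)
        misses = sum(c <= t for c in anomalous_counts)
        err = false_alarms + misses
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


# Toy anomaly counts (hypothetical values).
normal_counts = [0, 1, 2, 1]
anomalous_counts = [9, 12, 7]
t_normal_only = threshold_from_normal(normal_counts)
t_labeled = threshold_from_labeled(normal_counts, anomalous_counts)
```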

11
Anomaly Count Detector - Statistics
  • Typical
    Results:
  • A2, A3, A4, A5: detectable (anomaly counts well
    above normal).
  • A1 (decode intrusion): not detectable.

12
This Project: Specific Objectives and
Accomplishments
  • 1. Intent Inference
  • Demonstrated the feasibility of performing Intent
    Inference based on sequences of OS calls for
    sendmail.
  • The classification results were quantified and
    compared with the detection results by UNM.
  • 2. Fusion of Detection Systems
  • Demonstrated the improvement of detection rates
    gained by combining the proposed scheme for
    Intent Inference with the UNM scheme for
    detection based on Anomaly Counts.


13
Intent Inference
We pose the problem of Intent Inference as
distinguishing between types of attacks and
faults using the sequences of OS calls.
From the statistical point of view, this is a
classification problem. The main issue is to
find features that cluster the different types of
attacks and faults.

14
Looking for Features: Returning to the space of
OS Calls
Good features balance small within-class scatter
(elements in each class as clustered as possible)
and large between-class scatter (classes as
separated as possible). The Abnormal Sequences
corresponding to each Anomaly can also be viewed
as Features. Do they have any Discriminating
Power?

15
Discriminating Power of Anomalous
Sequences (Anomalies for which Multiple Traces
are available)
It was observed that the Anomalous Sequences
are distinct for each Anomaly Type (large
between-class scatter), and appear consistently
in all traces of a given Anomaly (small
within-class scatter).

=> The Anomalous Sequences are good discriminators.
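One way to check these two properties on sets of Anomalous Sequences is sketched below. The slides do not state which measures were used, so the two scores here are assumptions: within-class consistency is the share of a type's sequences that appear in all of its traces, and between-class overlap is the Jaccard index of two types' sequence sets. The toy sequences are hypothetical.

```python
from typing import List, Set, Tuple

Window = Tuple[str, ...]


def within_class_consistency(trace_sets: List[Set[Window]]) -> float:
    """Fraction of a type's anomalous sequences that occur in every trace.

    1.0 means every trace of the anomaly shows the same sequences
    (small within-class scatter).
    """
    union = set().union(*trace_sets)
    common = set.intersection(*trace_sets)
    return len(common) / len(union) if union else 1.0


def between_class_overlap(a: Set[Window], b: Set[Window]) -> float:
    """Jaccard overlap of two anomaly types' sequence sets.

    0.0 means the types share no anomalous sequence
    (large between-class scatter).
    """
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy anomalous sequences per trace, for two hypothetical anomaly types.
type_a = [{("x", "y"), ("y", "z")}, {("x", "y"), ("y", "z")}]
type_b = [{("p", "q")}, {("p", "q"), ("q", "r")}]

consistency_a = within_class_consistency(type_a)
consistency_b = within_class_consistency(type_b)
overlap_ab = between_class_overlap(set().union(*type_a), set().union(*type_b))
```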
16
Why is this so?
  • Anomalous Processes are the superposition of
    large sections of Normal Actions reflecting the
    Normal Behavior of the Program (typically 90%)
    and a small, concentrated sequence of very
    specific actions associated with the Anomaly.
  • Different anomalies are related to different
    actions, and it is reasonable to expect that
    these distinctions would be apparent.
  • It is remarkable, however, that this separation
    could be observed at the level of OS Calls.
  • The Anomalous Sequences serve as signatures for
    the Anomalies. These are statistical
    signatures, extracted by an automatic procedure,
    not by domain knowledge.


17
Constructing a Classifier based on Anomalous
Sequences
  • Extract the Normal Dictionary.
  • For each Anomaly Type, record the corresponding
    Anomalous Sequences; call the set of these
    sequences the Anomaly Dictionary for that
    Anomaly. After Training, there will be N Anomaly
    Dictionaries.
  • Incoming Processes are labeled according to
    matches with the Anomaly Dictionaries; the
    Anomaly with the most matches is selected.
  • Processes for which no match is found are labeled
    as Normal.
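The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the function names and toy traces are hypothetical, matches are counted as distinct shared sequences, and ties go to the first Anomaly Dictionary encountered.

```python
from typing import Dict, List, Set, Tuple

Window = Tuple[str, ...]


def windows(trace: List[str], k: int) -> Set[Window]:
    """Distinct contiguous length-k windows of a system-call trace."""
    return {tuple(trace[i:i + k]) for i in range(len(trace) - k + 1)}


def train(normal_traces: List[List[str]],
          anomaly_traces: Dict[str, List[List[str]]],
          k: int) -> Tuple[Set[Window], Dict[str, Set[Window]]]:
    """Build the Normal Dictionary and one Anomaly Dictionary per type."""
    normal_dict: Set[Window] = set()
    for trace in normal_traces:
        normal_dict |= windows(trace, k)
    anomaly_dicts: Dict[str, Set[Window]] = {}
    for label, traces in anomaly_traces.items():
        seqs: Set[Window] = set()
        for trace in traces:
            # Anomalous sequences are the windows absent from the Normal Dictionary.
            seqs |= windows(trace, k) - normal_dict
        anomaly_dicts[label] = seqs
    return normal_dict, anomaly_dicts


def classify(trace: List[str], anomaly_dicts: Dict[str, Set[Window]], k: int) -> str:
    """Label a process with the Anomaly whose dictionary it matches most."""
    seqs = windows(trace, k)
    best_label, best_matches = "Normal", 0
    for label, dictionary in anomaly_dicts.items():
        matches = len(seqs & dictionary)
        if matches > best_matches:
            best_label, best_matches = label, matches
    return best_label


# Toy data (hypothetical call names, window length 2).
normal_dict, anomaly_dicts = train(
    normal_traces=[["a", "b", "c", "d"]],
    anomaly_traces={"A1": [["a", "b", "x", "c", "d"]],
                    "A2": [["a", "y", "y", "d"]]},
    k=2,
)
label_attack = classify(["a", "b", "x", "c", "d"], anomaly_dicts, k=2)
label_normal = classify(["a", "b", "c", "d"], anomaly_dicts, k=2)
```

Only the Anomaly Dictionaries are consulted at classification time, which is consistent with a smaller storage footprint than a detector that keeps the full Normal Dictionary.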

18
String Matching Classifier

The operation is as simple as the Anomaly
Count Detector, but the Memory Storage
Requirements are typically 70% less.
19
Performance Evaluation (Testing Set: average of
4,000 combinations)
  • 100% performance for A1 and A2 for k > 5. A1 is
    detected, which is not possible using Anomaly
    Counts.
  • No False Alarms for k < 8.

20
Performance Evaluation (cont.)
  • Poor Performance for Unknown Anomalies: they are
    mislabeled as one of the Known Anomalies.
  • 20% of the Fault Anomalies are missed.

21
Improving the Performance of the String Matching
Classifier
  • => The Performance of the Classifier can be
    improved by combining it with the Anomaly Count
    Detector:
  • Processes with Anomaly Counts above the
    Detection Threshold are labeled as Anomalous,
    regardless of matches with the Anomaly
    Dictionaries; following this procedure, the 20%
    of Faults previously missed are labeled as
    Unknown Anomalies.
  • Anomalies with matches with more than one
    Anomaly Dictionary are labeled as Unknown
    Anomalies; following this procedure, the Unknown
    Anomalies A4 and A5 are correctly labeled.
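The two fusion rules can be sketched as a single decision function. This is an illustrative reading of the slide, not the project's implementation: the function name, the label strings, and the order in which the rules are applied (dictionary matches first, then the count threshold) are assumptions.

```python
from typing import Dict


def fuse(anomaly_count: int, threshold: int, matches: Dict[str, int]) -> str:
    """Combine the Anomaly Count Detector with the String Matching Classifier.

    matches maps each known Anomaly label to its number of dictionary matches.
    """
    matched = [label for label, n in matches.items() if n > 0]
    if len(matched) == 1:
        # Exactly one Anomaly Dictionary matches: a Known Anomaly.
        return matched[0]
    if len(matched) > 1:
        # Matches with more than one dictionary: treated as Unknown.
        return "Unknown Anomaly"
    if anomaly_count > threshold:
        # No dictionary match, but the count detector fires.
        return "Unknown Anomaly"
    return "Normal"


# Hypothetical inputs covering the four outcomes.
known = fuse(anomaly_count=2, threshold=5, matches={"A1": 3, "A2": 0})
ambiguous = fuse(anomaly_count=9, threshold=5, matches={"A1": 2, "A2": 1})
unmatched = fuse(anomaly_count=9, threshold=5, matches={"A1": 0, "A2": 0})
clean = fuse(anomaly_count=0, threshold=5, matches={"A1": 0, "A2": 0})
```

Under this reading, a process can still receive a Known Anomaly label when its count is below the threshold, which matches the earlier observation that A1 is found by string matching but not by Anomaly Counts.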


22
Summary (Phase I)
Demonstrated the feasibility of using sequences
of OS calls for the classification of Anomalies
effected by Privileged Programs in Unix: the String
Matching Classifier. Correct classification of
Anomalies allows a more specific response, an
important capability for Intrusion Tolerance.
Sequences of system calls were shown to be
Statistical Signatures for the Anomalies.
Combining the String Matching Classifier with the
Anomaly Count Detector: the Anomaly Count
Detector detects Unknown Attacks, while the
String Matching Classifier allows accurate
characterization of Known Attacks.

23
Further Work (Phase II): Towards a Host-Based
System for Classification of Intrusions
  • Verify whether the Paradigm of Statistical
    Signatures holds for other scenarios: Audit
    Trails in Unix and Windows NT.
  • Combination of data-based schemes with Domain
    Knowledge: using Automated Rules to construct
    more complete Normal Dictionaries at the level of
    OS Calls.
  • Integration with NMS modules:
  • At the System and Application Management Level:
    using available COTS peripherals to construct a
    Host-Based IDS and the attendant response
    infrastructure.
  • At the Network Management Level: using the COTS
    systems to integrate the outputs of the IDS with
    other elements of the Infrastructure.
