Situational Awareness Analysis Tool for Aiding Discovery of Security Events and Patterns

Provided by: varunch

1
Situational Awareness Analysis Tool for Aiding
Discovery of Security Events and Patterns
  • PI: Vipin Kumar; Co-PIs: Jaideep Srivastava,
    Zhi-Li Zhang, Yongdae Kim, University of
    Minnesota

2
Presentation Outline
  • Executive Summary
  • Key Accomplishments Since June 2004
  • A Novel Approach to Level II Analysis
  • Analysis Framework
  • Analysis Methodology
  • Evaluation of the Approach
  • Case Study I: Experience with the Skaion Data
  • Case Study II: Experience at the University of
    Minnesota
  • Applicability of Approach to IC Scenarios
  • IC Scenario
  • Assumptions and Limitations
  • Relationship to other ARDA funded projects
  • Project Future Plans
  • Tasks, timeline, and deliverables

3
Executive Summary
  • Objective: Help IC network defenders identify
    and analyze distributed, stealthy, multi-step,
    novel attacks
  • Innovative Claim
  • a novel Level-II analysis framework/process and
    associated techniques for identifying
    distributed, stealthy, multi-step attacks
  • provide attack context and sequencing of events
    to aid IC defenders in timely attack recognition
    and situation assessment
  • transform large amount of sensor data into a
    small set of labeled event sequences analyzable
    by human security analysts
  • significantly reduce false alarms, and uncover
    correlated attacks
  • Novel Ideas
  • shallow analysis of voluminous network-wide
    sensor data to identify anchor points for
    in-depth follow-on analysis in a focused context
  • spatial/temporal chaining analysis and event
    sequencing for attack context extraction and
    characterization
  • both employ behavior-based host profiling and
    flow anomaly analysis

4
Situational Awareness Analysis Framework
[Diagram: Level I: signature-based IDS, anomaly detector, scan detector. Level II: anchor point identification, attack context extraction, attack characterization, situation assessment. Behavior profiling: host/service profiling, flow anomaly analysis, attack profiling.]

5
Key Accomplishments
  • Developed a novel level I and level II analysis
    framework and algorithms
  • Behavior anomaly detection for identifying hard
    to detect malicious activity in the IC networks
  • Profiling of network traffic along multiple
    dimensions to characterize normal/abnormal
    behavior, enabling improved level I and level II
    analysis
  • Intelligent fusion of multiple sensor data for
    high-confidence attack recognition (e.g.
    signature based IDS, scan detection, anomaly
    detection)
  • Spatio-temporal chaining analysis in the
    communication graph to extract larger context of
    a suspicious activity
  • Event sequencing and labeling for attack
    characterization
  • Demonstrated success in detecting multi-step
    attack scenarios in Skaion's dataset, specially
    generated for ARDA's P2INGS program
  • Skaion attack scenarios are detected with a low
    false alarm rate
  • Demonstrated success on real world network data
    at the University of Minnesota and at the ARL -
    Center for Intrusion Monitoring and Protection,
    where data is analyzed from multiple DoD sites

6
Research Objective and Key Assumptions
  • Objective: Help IC network defenders identify
    and analyze distributed, stealthy, multi-step,
    novel attacks
  • Key assumptions
  • Attacks on IC networks
  • Unlike common Internet attacks such as worms and
    (distributed) denial-of-service attacks, which
    generate large volumes of data and take place in
    a short time
  • Likely to occur in multiple stages spread out in
    time, involving several outside hosts (and
    perhaps compromised inside hosts), and generating
    low-intensity traffic
  • Aim to break into protected hosts for access to
    sensitive data
  • Attack events exhibit anomalous behaviors
    deviating from normal host/service profiles
  • Attackers/victims connected by suspicious
    communication activities

7
Level II Analysis Methodology
  • Anchor Point Identification
  • Identifying starting points for attack analysis
    via data fusion and correlation of output from
    Level I analysis
  • Context Extraction
  • Identifying relevant events and entities (hosts,
    flows, ...), starting from an anchor point
  • Attack Characterization
  • Refinement of context to characterize attacks
    (presently manual)
  • Situation Assessment
  • Evaluation of attack characterizations (out of
    scope)

Motivated by challenges faced while working on
several cases with Angelo Bencivenga and Tim
Dunn at the ARL CIMP
8
Level-II Analysis Process Diagram
[Diagram: IDS sensor data -> Anchor Point Identification -> list of anchor points -> Context Extraction -> event activity graph -> Attack Characterization -> labeled attack sequences -> Situational Assessment.]
  • Control: configuration/selection of analysis
    strategies; search size, depth, time frame;
    labeling/scoring rules
  • Algorithms/Techniques: behavior anomaly analysis;
    correlation/fusion of multiple sensor data;
    watchlist/blacklist; profile-based chaining
    analysis; temporal sequencing analysis;
    domain-specific guided search; knowledge-based
    event labeling; attack pattern matching

9
Behavior Profiling and Anomaly Analysis
  • Network-Wide
  • Historical behavior profiling: service profiling
    (service type: web, dns, ...; protocol: TCP, UDP,
    ICMP, ...; connection patterns; flow statistics)
  • Current-time anomaly analysis: flow anomaly
    analysis (MINDS flow anomaly ranking,
    signature-based alerts, TCP flag analysis, ...)
  • Scoring: flow anomaly scores
  • Host-Specific
  • Historical behavior profiling: host profiling
    (host types: servers, clients, etc.; port/service
    profiles; traffic statistics; communication
    patterns)
  • Current-time anomaly analysis: flow anomaly
    analysis (deviation from normal host behaviors)
  • Scoring: host anomaly scores
  • Techniques: statistical profiling, clustering,
    outlier analysis, association rules,
    signature-based rule matching, statistical
    deviation analysis, link analysis
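
As a rough illustration of the "statistical deviation analysis" and "deviation from normal host behaviors" items on this slide, the sketch below scores one host by how far its current activity lies from its historical profile. The function name, the per-hour flow counts, and the z-score formulation are all assumptions made for illustration; the slides do not say how MINDS computes host anomaly scores.

```python
from statistics import mean, pstdev

def host_anomaly_score(history, current):
    """Return how many standard deviations `current` lies from the historical mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Flat history: no change scores 0, any change is treated as unbounded.
        return 0.0 if current == mu else float("inf")
    return abs(current - mu) / sigma

# Hypothetical hourly outbound-flow counts for a single host.
history = [10, 12, 11, 9, 10, 12, 11, 9]
normal_score = host_anomaly_score(history, 11)   # close to the profile
spike_score = host_anomaly_score(history, 60)    # far outside the profile
```

Hosts could then be ranked by score, which is one plausible reading of the "Anomaly Rank" annotations that appear in the later scenario slides.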


10
Anchor Point Identification Techniques
  • Watch list maintained by analyst
  • Hosts that engage in suspicious activity as
    identified by one or more of the following
  • Standard IDS signature (snort alarms)
  • Behavior Anomalies
  • Hosts that send/receive traffic that is anomalous
    w.r.t. historical profile
  • Behavior Signatures
  • A host that communicates with a known compromised
    machine
  • Hosts that perform scans
  • Port knocking
  • Services (e.g., ftp, ssh) running on non-standard
    ports
  • Any other identifiable behavior of a known
    compromised machine
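
The techniques listed above can be sketched as a small fusion step, shown here under stated assumptions: Level I output is reduced to a set of hosts with Snort alerts, a host-to-anomaly-rank mapping, an analyst watch list, and a list of (source, destination) flows. The function name, the `top_n` cutoff, and the example addresses (taken from documentation address ranges) are hypothetical and not part of the original tool.

```python
def identify_anchor_points(snort_alert_hosts, anomaly_ranks, watchlist, flows, top_n=50):
    """Return {host: [reasons]} for hosts worth starting Level II analysis from."""
    reasons = {}

    def add(host, why):
        reasons.setdefault(host, []).append(why)

    # Hosts on the analyst's watch list are anchor points directly.
    for host in watchlist:
        add(host, "watchlist")

    # Fuse two Level I sensors: a Snort alert AND a high anomaly rank.
    for host in snort_alert_hosts:
        rank = anomaly_ranks.get(host)
        if rank is not None and rank <= top_n:
            add(host, f"snort alert + anomaly rank {rank}")

    # Behavior signature: communicating with a watch-listed host.
    for src, dst in flows:
        for host, peer in ((src, dst), (dst, src)):
            if peer in watchlist and host not in watchlist:
                add(host, f"communicates with watchlisted host {peer}")
    return reasons

anchors = identify_anchor_points(
    snort_alert_hosts={"192.0.2.10", "192.0.2.20"},
    anomaly_ranks={"192.0.2.10": 12, "192.0.2.20": 400},
    watchlist={"198.51.100.7"},
    flows=[("192.0.2.30", "198.51.100.7"), ("192.0.2.10", "192.0.2.20")],
)
```

Keeping the reasons alongside each host lets an analyst see why a host was selected before spending time on it.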

[Diagram labels: anomalous flow; communication to a host on the watchlist; watchlist host as anchor point; Snort alert on a host with anomalous behavior.]
11
Attack Context Extraction
  • Starting from an anchor point, recursively
    examine activity to other hosts that
  • deviates from the norm
  • host's profile
  • service/port profile
  • is similar to known suspicious traffic
  • attack signatures
  • replies to scans
  • activities from compromised hosts
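
The chaining described above might look like the following sketch. It replaces the recursion with an equivalent breadth-first walk bounded by a depth limit, and it assumes each flow is a dictionary with `src` and `dst` keys plus a caller-supplied `is_suspicious` predicate standing in for the profile and signature checks. All names and the sample flows are illustrative.

```python
from collections import deque

def extract_context(anchor, flows, is_suspicious, max_depth=2):
    """Breadth-first chaining from an anchor point over suspicious flows only."""
    neighbors = {}
    for flow in flows:
        if is_suspicious(flow):
            src, dst = flow["src"], flow["dst"]
            neighbors.setdefault(src, []).append((dst, flow))
            neighbors.setdefault(dst, []).append((src, flow))

    hosts = {anchor}
    events = []
    queue = deque([(anchor, 0)])
    while queue:
        host, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the configured search depth
        for peer, flow in neighbors.get(host, []):
            if flow not in events:
                events.append(flow)
            if peer not in hosts:
                hosts.add(peer)
                queue.append((peer, depth + 1))
    return hosts, events

flows = [
    {"src": "E2", "dst": "I1", "note": "non-standard port", "suspicious": True},
    {"src": "I1", "dst": "E3", "note": "outbound ftp", "suspicious": True},
    {"src": "I1", "dst": "I5", "note": "normal web", "suspicious": False},
    {"src": "E3", "dst": "E9", "note": "reply to scan", "suspicious": True},
]
hosts_d1, events_d1 = extract_context("I1", flows, lambda f: f["suspicious"], max_depth=1)
hosts_d2, events_d2 = extract_context("I1", flows, lambda f: f["suspicious"], max_depth=2)
```

The depth limit keeps the extracted context small enough for a human analyst to review.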

[Diagram labels: anchor point; remote login attempt; outbound FTP; reply to a scan; web server.]
ruleset terminal_services
    ignore srcport < 1024
    ignore packets < 4
    ignore dstport ! 3389
    ignore protocol ! tcp
    profile client_services server dstport 3389 protocol tcp
    profile servers client dstport 3389 protocol tcp
12
Attack Characterization
A context graph
  • Determine likely relationships (e.g. sequencing)
    between retained events and hosts
  • Evaluate and rank hosts and activities in the
    attack context, retaining those with a high
    degree of suspicion and pruning those with a low
    degree of suspicion

[Diagram: a context graph over external hosts E1-E4 and internal hosts I1-I4.]
  • Sample Rules
  • If a host is scanning, label it as attacker
    with a low score
  • If a host is scanned and it replies, label it
    as victim with a medium score
  • If an internal host is scanning, label it as
    hacked with a high score
  • If a hacked internal host makes a subsequent file
    transfer to the outside, increase the score of
    the hacked label and label the target host as
    attacker with a high score
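
The sample rules above can be written as a small rule engine. The sketch below is one possible reading: numeric values for low/medium/high, the event schema (`time`, `src`, `dst`, `kind`), and the choice to let the internal-scanner rule take precedence over the generic scanner rule are assumptions, since the slide leaves them unspecified.

```python
SCORES = {"low": 1, "medium": 2, "high": 3}

def label_hosts(events, internal_hosts):
    """Apply the sample rules to time-ordered events; return {host: (label, score)}."""
    labels = {}

    def assign(host, label, score):
        # Keep only the highest-scoring label seen so far for each host.
        if host not in labels or score > labels[host][1]:
            labels[host] = (label, score)

    for ev in sorted(events, key=lambda e: e["time"]):
        src, dst, kind = ev["src"], ev["dst"], ev["kind"]
        if kind == "scan":
            if src in internal_hosts:
                assign(src, "hacked", SCORES["high"])
            else:
                assign(src, "attacker", SCORES["low"])
        elif kind == "scan_reply":
            # `src` is the scanned host that answered the scan.
            assign(src, "victim", SCORES["medium"])
        elif kind == "file_transfer":
            hacked = labels.get(src, ("", 0))[0] == "hacked"
            if hacked and dst not in internal_hosts:
                labels[src] = ("hacked", labels[src][1] + 1)
                assign(dst, "attacker", SCORES["high"])
    return labels

events = [
    {"time": 1, "src": "E1", "dst": "I1", "kind": "scan"},
    {"time": 2, "src": "I1", "dst": "E1", "kind": "scan_reply"},
    {"time": 3, "src": "I1", "dst": "I2", "kind": "scan"},
    {"time": 4, "src": "I1", "dst": "E4", "kind": "file_transfer"},
]
result = label_hosts(events, internal_hosts={"I1", "I2"})
```

Sorting by time before applying the rules matters because the last rule depends on a host already carrying the hacked label.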

  • Attack Characterization: Event Sequencing and
    Labeling (along a time axis)
  • E1 -> I1: Scan with replies
  • E2 -> I4: Initializing connection on
    non-standard port - successful
  • I4 -> E4: Initializing FTP connection
    with external host

13
Accomplishments since Nov. 17, 2004 site visit
  • Refined Level II Analysis Process
  • Investigated and improved anchor point analysis
  • Spatio-temporal chaining analysis for context
    extraction
  • Event sequencing and labeling for attack
    characterization
  • Evaluation using Skaion Dataset II and real
    network data from the University of Minnesota

14
Skaion Dataset
  • Two sets of synthetic data, generated by the
    Skaion Corporation to simulate IC network traffic
    and attacks
  • Traffic generated to statistically match data
    captured at AFRL
  • Traffic contains background (normal) traffic, as
    well as various scans and failed attacks
  • Background traffic is combined with multi-step/
    multi-stage attacks to produce each scenario
  • Data Set I: 4 scenarios
  • A. Naïve Attacker; B. Five-by-five;
    C. Ten-by-ten; D. Simple-ten
  • Data Set II: 3 categories
  • Single-Stage Attacks: 8 scenarios
  • Bankshot Multi-stage Attacks: 5 scenarios
  • Misdirection Multi-stage Attacks: 3 scenarios
  • Each scenario includes tcpdump data of all
    network traffic as well as Snort alerts, HTTP
    access logs, FTP transfer logs, and Windows logs

15
Scenario II.C.1 (S29): Misdirection Multistage Attack
  • Anchor Point Identification
  • Snort alerts involving anomalous IPs
  • Statistics
  • Trunk: 103,791 packets; 10,859 flows;
    451 Snort alerts
  • BPRD: 73,595 packets; 6,987 flows
  • Colo: 98,858 packets; 6,002 flows

[Diagram: zones EXTERNAL, REMOTE OSIS USERS, BPRD; hosts 18.2.175.153, 40.159.214.124, 53.82.21.112, 100.10.20.4; node labels: web-server, Scanner; annotation: "Anomaly Rank 47: Failed connection on port 22".]
16
Scenario II.C.1 (S29): Misdirection Multistage Attack
  • Context Extraction
  • Activity that deviates from a host's normal
    profile
  • Scans that get replies

[Diagram: zones EXTERNAL, REMOTE OSIS USERS, BPRD; hosts 18.2.175.153, 116.45.223.116, 40.159.214.124, 74.205.114.175, 40.219.61.25, 53.82.21.112, 100.1.21.134, 100.10.20.10, 100.10.20.4; node labels: web-server (twice); annotations: "Attempts remote login"; "Scans and gets a reply"; "Anomaly Rank 47" (twice); "Web-server initiating connection on port 8080 (Anomaly Ranks 1, 2, 4, 11)"; "Web-server initiating FTP connections"; "Remote login on the web-server; this follows an undetected iis50_nsiislog attack".]
17
Scenario II.C.1 (S29): Misdirection Multistage Attack
  • Attack Characterization: Event Sequencing and
    Labeling
  • E4, E5, E6 -> I2: Bad HTTP traffic
  • E2 -> I2: Scanning with a reply
  • E3, E6 -> I2: Remote login - failed
  • D1 -> I1: Remote login - successful
  • I1 -> E1: Anomalous FTP
  • I1 -> E2: Anomalous transfer on port 8080

Dial-up host D1 hacks into web server I1 via
remote login and initiates anomalous file
transfers from I1 to two outside hosts, E1 and E2,
where E2 had earlier performed scanning.

[Diagram: time axis; external hosts E1-E6; dial-up host D1 in REMOTE OSIS USERS; I1 and I2 in BPRD; node labels: web-server (twice), Scanner; two connections marked "X".]
18
Scenario II.B.1 (S1): Bankshot Multistage Attack
  • Anchor Point Identification: Snort alerts
    involving anomalous IPs
  • Context Extraction
  • Activity that deviates from a host's normal
    profile
  • Scans that get replies
  • Statistics
  • Trunk: 986,494 packets; 44,994 flows;
    10,896 Snort alerts
  • BPRD: 305,598 packets; 19,111 flows
  • Colo: 960,676 packets; 27,045 flows

[Diagram: zones EXTERNAL, SHIELD ENCLAVE, BPRD; hosts 51.91.57.157, 112.50.254.117, 100.20.200.15 / 100.20.1.3, 100.10.20.4, 100.10.20.3; node labels: web-server, mail server, Scanner; annotations: "Anomaly Rank 194: FTP connection to the web-server"; "Successful remote login from the external host"; "Failed access to web-server on port 111"; "Initializing connection with mail server on port 5617 (Anomaly Ranks 1, 2, 5)".]
19
Scenario II.B.1 (S1): Bankshot Multistage Attack
  • Attack Characterization: Event Sequencing and
    Labeling
  • E2 -> I2: Scanning with a reply
  • E2 -> I2: Failed FTP attempt to web-server
  • E1 -> S1: Scanning with a reply
  • E1 -> S1: Remote login - successful
  • S1 -> I1, I2: Scanning with replies
  • S1 -> I2: Failed connection to web-server on
    non-standard port
  • S1 -> I1: Successful connection to mail server
    on port 5617

External host E1 scans and hacks internal host S1,
which scans the BPRD network and hacks mail
server I1.

[Diagram: time axis; external hosts E1 and E2 in EXTERNAL; S1 in SHIELD ENCLAVE; I1 and I2 in BPRD; node labels: web-server, mail server, Scanner; two connections marked "X".]
20
Scenario I.C: Ten-by-Ten
  • Statistics
  • 292,272 total packets
  • 16,663 total flows
  • 98.7% TCP
  • 54 Snort alerts

[Diagram: zones EXTERNAL, BPRD, INTERNAL; node label: Scanner.]
21
Scenario I.C: Ten-by-Ten
Anchor Point Identification: Snort alerts
involving anomalous IPs

[Diagram: zones EXTERNAL, BPRD, INTERNAL; hosts 192.168.222.2, 199.227.249.246, 100.10.20.10, 100.10.20.6, 100.10.20.5, 100.10.20.4; node labels: web-server (twice); edge labels: "1 B/O", "2 B/O"; annotations: "Anomaly rank 42"; "Anomaly rank 65: Anomalous file transfer"; "Anomaly rank 64"; "Anomaly rank 12: Non-standard port access".]
22
Scenario I.C: Ten-by-Ten, 1st set of anchor points
  • Context Extraction
  • Activity that deviates from a host's normal
    profile
  • Scans that get replies

[Diagram: zones EXTERNAL, INTERNAL, BPRD; hosts 220.237.152.116, 40.219.61.25, 199.227.249.246, 100.10.20.10, 100.10.20.4; annotations: "Unsuccessful non-standard port access"; "Anomalous file transfer"; "Anomaly rank 65: Anomalous file transfer initiated by web-server"; "Anomaly rank 12: Non-standard port access".]
23
Scenario I.C: Ten-by-Ten, 1st set of anchor points
  • Attack Characterization: Event Sequencing and
    Labeling
  • E1 -> I2: Initializing connection on
    non-standard port - failed
  • E2 -> I2: Initializing connection on
    non-standard port - failed
  • E2 -> I1: Initializing connection on
    non-standard port - successful
  • I1 -> E3: Initializing FTP connection with
    external host

External host E2 hacks internal host I1, which
subsequently does a file transfer with external
host E3. E2 also attempts an unsuccessful attack
on I2.

[Diagram: time axis; external hosts E1, E2, E3 in EXTERNAL; I1 and I2 in BPRD; node labels: web-server (twice); two connections marked "X".]
24
Scenario I.C: Ten-by-Ten, 2nd set of anchor points
  • Context Extraction
  • Activity that deviates from a host's normal
    profile
  • Scans that get replies

Failed attempts by external hosts to connect to
internal machines on non-standard or closed ports.

[Diagram: zones EXTERNAL, BPRD, INTERNAL; external hosts 206.131.61.250, 95.116.204.23, 208.241.45.204, 221.23.248.251, 192.168.222.2, 210.20.5.160, 161.122.144.247; internal hosts 100.10.20.6, 100.0.1.2, 100.10.20.5, 100.20.10.2; annotations: "Anomaly rank 42"; "Anomaly rank 64".]
25
Scenario I.C: Ten-by-Ten, 2nd set of anchor points
  • Attack Characterization: Event Sequencing and
    Labeling
  • E1-E7 -> I1-I4: Initializing connection on
    non-standard/closed port - failed

[Diagram: time axis; external hosts E1-E7 in EXTERNAL; I1-I4 in BPRD and INTERNAL; node label: Scanner; connections marked "X".]
26
Case Study II: Experience with Minnesota Data
  • Approach: Starting with a good set of anchor
    points of known bad computers, analyze their
    communication patterns and the communication
    patterns of those they talk to, to identify other
    compromised computers
  • Anchor Points: A blacklist of 370 master (C&C)
    machines, constructed by security analysts around
    the world, was used as the starting point

27
University of Minnesota
U of MN Network
An internal IP was found to be talking to 2 of the
newly found masters and 9 new external IPs on
port 6667
One internal computer was talking to 3 blacklisted
IPs (17 flows)
An internal IP was found to be talking to 35
external IPs on port 6667
List of 370 Blacklisted computers
kissing-sadam.allxtremenet.net
deleted.important.us-govt.info
not.really.a.whiteangel.info
whats.up.buttface.net
dont.i.know.y-ou.com
irc.acidillusion.net
28
More Intelligent Approach
  • The 1st attempt was good, growing the blacklist
    by 12, but can we do better?
  • Removed the requirement of only looking for
    communication on TCP port 6667
  • Added simple historic profiling to keep good
    IPs from being blacklisted
  • Identified 54 new command and control machines
    with no false alarms
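
A minimal sketch of one round of this blacklist-growing idea follows. It assumes flows are (internal source, destination) pairs and that "simple historic profiling" yields a set of external addresses considered benign; both are illustrative simplifications, and the addresses come from documentation ranges rather than the case study.

```python
def grow_blacklist(flows, blacklist, internal_hosts, historically_good):
    """One round of blacklist expansion from known-bad anchor points."""
    # Internal hosts that talked to any blacklisted machine become suspects.
    suspects = {
        src for src, dst in flows
        if src in internal_hosts and dst in blacklist
    }
    # Every other external peer of a suspect is a candidate master.
    candidates = {
        dst for src, dst in flows
        if src in suspects and dst not in internal_hosts and dst not in blacklist
    }
    # Historic profiling removes peers known to be benign.
    return candidates - historically_good

new_entries = grow_blacklist(
    flows=[
        ("10.0.0.5", "203.0.113.1"),    # suspect talks to a blacklisted master
        ("10.0.0.5", "203.0.113.9"),    # unknown peer of the same suspect
        ("10.0.0.5", "198.51.100.80"),  # popular, historically benign service
        ("10.0.0.6", "198.51.100.80"),
    ],
    blacklist={"203.0.113.1"},
    internal_hosts={"10.0.0.5", "10.0.0.6"},
    historically_good={"198.51.100.80"},
)
```

Note that the candidate step does not filter on port 6667, mirroring the relaxed requirement described on this slide.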

29
A little manual digging into the 54 new command
and control machines
  • Upon further inspection, 30 of the 54 C&C
    machines had 2000-5000 machines throughout the
    world connected to them at the time of
    investigation
  • Some of the more interesting computer names found
  • 66.90.85.148  phear.my.penix.info
  • 66.90.124.134  dont.i.know.y-ou.com
  • 66.90.124.141  irc.acidillusion.net
  • 67.111.204.243  whats.up.buttface.net
  • 69.64.51.192  192.electricstorm.co.uk
  • 208.51.90.83  not.really.a.whiteangel.info
  • 208.179.57.115  deleted.important.us-govt.info
  • 208.179.62.246  kissing-sadam.allxtremenet.net

30
Summary: Lessons from Case Studies
  • When a compromise does occur, quick understanding
    of the scope of the problem is crucial for IC
    network defenders
  • Our analysis methodology is effective at quickly
    identifying what computers are compromised on
    synthetic, university and military networks
  • shown good promise on the Skaion data
  • helped security analysts identify compromised
    machines in public networks (UMN)
  • proved effective on real military networks (ARL
    CIMP)
  • Behavior anomaly detection is an effective way to
    detect novel sophisticated attacks

31
Applicability of Approach to IC Scenarios
  • Threat Model: multi-step, stealthy attacks
    generating suspicious/anomalous activities
  • Rationale: our analysis methodology is likely to
    perform better on IC networks than in a general
    Internet environment
  • traffic is relatively cleaner and more regulated
  • number of (outside) hosts an IC computer talks to
    is likely to be far fewer than a typical host in
    a university setting
  • easier to build reliable behavior profiles and
    communication patterns

32
Limitations/Vulnerabilities and Mitigations
  • Limitations/Vulnerabilities
  • Must be able to find an anchor point
  • either from anomalies, signatures, scan
    detection, host based IDS, etc.
  • Some steps or aspects of malicious activities
    must deviate from normal behavior
  • Mitigations
  • include more diverse sensor data
  • develop more intelligent rules for anchor point
    identification
  • develop more sophisticated behavior profile
    techniques
  • develop more efficient context extraction and
    attack characterization that can explore a larger
    search space

33
Relationship to Other ARDA Projects (Based on
June 2004 PI meeting)

[Diagram: MINDS linked to other ARDA projects: Veridean, CMU, Lockheed Martin; Secure Decision; Dartmouth; UTAH; Alions Buffalo; Nong Ye, Arizona SU; D-Force IET; GDAIS Dartmouth; Valdes SRI. Relationship notes shown on the links:]
  • MINDS output can be input to CAPS
  • Hidden Markov Model could help attack
    characterization
  • Game theory could help anchor point
    identification
  • MINDS Level I and II analysis can be more
    effective with visualization
  • MINDS output can be input to the ECCARS
    correlator
  • MINDS Level II analysis can simplify attack graph
    extraction
  • Attack profiling can be used to guide MINDS
    Level II analysis
  • And/Or analysis might help anchor point
    identification
  • Correlation analysis might help anchor point
    identification
  • MINDS Level II analysis can simplify attack
    scenario extraction (shown on two links)
  • MINDS anomalies (alarms) can be correlated with
    other alarms
  • SRI's correlation analysis can be used for anchor
    point identification
  • Bayesian analysis can be used for MINDS Level II
    analysis
34
Future Plans
  • Long-Term Goal: Integrated Situational Awareness
    Framework and Tools to aid IC defenders in
    effective decision making
  • Where we are in November 2004
  • Developed a novel SA analysis framework and its
    key components and algorithms
  • Where do we expect to be by March 2005
  • Tasks and deliverables
  • Where do we want to go beyond March 2005
  • Future capabilities

35
Near-Term Action Plan (March 2005)
  • Tasks
  • Implementation and refinement of Level II SA
    analysis methodology and algorithms
  • Implementation and refinement of network
    behavior profiling
  • Deliverables
  • prototype system incorporating key components of
    Level I and II analysis with anchor point and
    context extraction steps
  • documentation of design and implementation
  • documentation of testing, evaluation and case
    studies

36
Plan Beyond March 2005
  • Improvement and extensions of Level II SA
    analysis methodology and algorithms
  • anchor point identification using more diverse
    sensor data
  • context refinement using link analysis and
    association rules
  • attack characterization using advanced models
    from other projects
  • semi-automated situation assessment techniques
  • Continual and real-time profiling and profile
    databases
  • multi-dimensional, information-theoretical
    structural models for normal/suspicious network
    (host, flow, service, etc.) behaviors
  • attack and attacker profiling (worm/scanning
    activities, moles/drones/masters, etc.)
  • query-able profile databases
  • Integration of various pieces of proposed SA
    analysis framework
  • in particular, interoperability with other P2INGS
    systems
  • Multi-tiered, cooperative, global situational
    awareness analysis framework

37
Future Plans
  • Eric's suggestions: tasks for the next 12
    months.
  • Create a query language an analyst can use in the
    course of doing 2nd-level analysis to look for
    high-level patterns. Some example patterns are
    like the ones for finding terminal services.
    Although the algorithms used in this may not be
    very novel or groundbreaking, such a tool does
    not exist at this time, and it would make an
    analyst many times more effective. Right now, all
    pattern matching is effectively done in a
    person's head, and chugging through the data
    takes hours to look for a simple pattern in only
    a few hours' worth of data. Having a fast and
    flexible query language will allow an analyst to
    look for many patterns quickly and efficiently.
  • Developing better techniques for profiling of
    behaviors will be an ongoing task. Having a
    better profile will allow for fewer false alarms
    and less data for a human to look at. This will
    also allow a "lower" threshold or "looser"
    rules/patterns to be employed, since there is
    greater confidence in the profiles and thus in
    the initial anchor points.
  • Developing a scoring mechanism for how important
    an anchor point is will allow an analyst to look
    at more interesting subgraphs first.
  • Also, always use the top 0.5% of anomalies
    intersected with Snort alerts for anchor points,
    or the first 5 intersects, whichever is more.
    This will ensure the analyst always has some
    anchor points to explore.
  • A weighting mechanism should also have a
    feedback of sorts. If an alert is presented to an
    analyst and it is found useful, the tools used
    to create it should be given a higher weight.
    If Snort combined with MINDS finds interesting
    things 25% of the time, while Snort combined with
    JIDS only finds things 5% of the time, the latter
    combination should be given a lower weight.
  • Payload information should be considered. One way
    would be to use a histogram of the payload to
    profile whether it is HTTP, FTP, email, binary
    data, or encrypted. This can be used both to
    determine if plain text is going over an ssh
    port, and if the traffic is different. If
    suddenly the traffic over ssh looks binary and
    not encrypted, this is interesting (hypothesis:
    binary data, although using all 256 byte values,
    might not be uniform, whereas encrypted data
    should be uniform).
  • Profiles should be made between frequent talkers
    to determine if their communication varies. Right
    now, if two computers talk a lot, their
    conversations are assumed to be safe. Although
    generally fine, if someone compromises one or
    both of the computers, they could hide their
    communication on the same service. If 2 mail
    servers that typically talk are compromised,
    either the IPs or the IPs on port 25 are assumed
    to be safe. However, one could have replaced
    sendmail on one of them to also allow it to
    provide a login. This type of traffic would look
    different than typical mail traffic.
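
The payload-histogram suggestion above can be made concrete with a byte-level entropy measure: a payload whose byte histogram is uniform over all 256 values reaches 8 bits per byte, while plain text scores much lower. The sketch below is only an illustration of that hypothesis; the function name and sample payloads are assumptions, and any threshold separating "encrypted-looking" from other traffic would need to be calibrated on real data.

```python
import math
from collections import Counter

def byte_entropy(payload: bytes) -> float:
    """Shannon entropy of the payload's byte histogram, in bits per byte (0 to 8)."""
    if not payload:
        return 0.0
    counts = Counter(payload)
    total = len(payload)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical payloads: plain-text HTTP versus a perfectly uniform histogram.
text_payload = b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n\r\n"
uniform_payload = bytes(range(256)) * 4

text_entropy = byte_entropy(text_payload)
uniform_entropy = byte_entropy(uniform_payload)
```

Under the slide's hypothesis, traffic on an ssh port whose entropy falls well below that of the uniform case would be worth flagging for the analyst.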

38
Relationship to Other ARDA Funded Projects (Based
on June 2004 PI meeting)

[Diagram: MINDS in relation to other ARDA projects, with group labels Correlation and Visualization:]
  • GDAIS: Corr, Assess, History
  • SRI: IDS/FW Corr, entity centric
  • Secure Decision: Visual Display, human factor
  • IET (D-Force): IDS Corr, BN, ...
  • Veridean: Prediction
  • Utah: Visual Display, human factor
  • Skaion: Traffic Generation
  • Endeavor: Automatic response, addr/port map
  • Northrop: Network Data Mining
  • ASU (Nong Ye): Cyber signal analysis
  • Arbor Networks: Macro/micro sensors,
    track/forensics, active honeypot