Situational Awareness Analysis Tool for Aiding Discovery of Security Events and Patterns

Slides: 39

Provided by: varunch

Transcript and Presenter's Notes

1
Situational Awareness Analysis Tool for Aiding
Discovery of Security Events and Patterns

PI: Vipin Kumar
Co-PIs: Jaideep Srivastava, Zhi-Li Zhang, Yongdae Kim
University of Minnesota

2
Presentation Outline

Executive Summary
Key Accomplishments Since June 2004
A Novel Approach to Level II Analysis
Analysis Framework
Analysis Methodology
Evaluation of the Approach
Case Study I: Experience with the Skaion Data
Case Study II: Experience at the University of Minnesota
Applicability of Approach to IC Scenarios
IC Scenario
Assumptions and Limitations
Relationship to other ARDA funded projects
Project Future Plans
Tasks, timeline, and deliverables

3
Executive Summary

Objective: Help IC network defenders identify and analyze distributed, stealthy, multi-step, novel attacks
Innovative Claim
a novel Level-II analysis framework/process and associated techniques for identifying distributed, stealthy, multi-step attacks
provide attack context and sequencing of events to aid IC defenders in timely attack recognition and situation assessment
transform a large amount of sensor data into a small set of labeled event sequences analyzable by human security analysts
significantly reduce false alarms, and uncover correlated attacks
Novel Ideas
shallow analysis of voluminous network-wide sensor data to identify anchor points for in-depth follow-on analysis in a focused context
spatial/temporal chaining analysis and event sequencing for attack context extraction and characterization
both employ behavior-based host profiling and flow anomaly analysis

4
Situational Awareness Analysis Framework
[Figure: framework diagram. Level I: signature-based IDS, anomaly detector, scan detector. Level II: anchor point identification, attack context extraction, attack characterization, situation assessment. Behavior profiling: host/service profiling, flow anomaly analysis, attack profiling.]

5
Key Accomplishments

Developed a novel level I and level II analysis
framework and algorithms
Behavior anomaly detection for identifying hard
to detect malicious activity in the IC networks
Profiling of network traffic along multiple
dimensions to characterize normal/abnormal
behavior, enabling improved level I and level II
analysis
Intelligent fusion of multiple sensor data for
high-confidence attack recognition (e.g.
signature based IDS, scan detection, anomaly
detection)
Spatio-temporal chaining analysis in the
communication graph to extract larger context of
a suspicious activity
Event sequencing and labeling for attack
characterization
Demonstrated success in detecting multi-step
attack scenarios in Skaion's dataset, specially
generated for ARDA's P2INGS program
Skaion attack scenarios are detected with a low
false alarm rate
Demonstrated success on real world network data
at the University of Minnesota and at the ARL -
Center for Intrusion Monitoring and Protection,
where data is analyzed from multiple DoD sites

6
Research Objective and Key Assumptions

Objective: Help IC network defenders identify and analyze distributed, stealthy, multi-step, novel attacks
Key assumptions
Attacks on IC networks
Differ from common Internet attacks such as worms and (distributed) denial-of-service attacks, which generate a large volume of data and take place in a short time
Are likely to occur in multiple stages spread out in time, involving several outside hosts (and perhaps compromised inside hosts), and generating low-intensity traffic
Aim to break into protected hosts for access to sensitive data
Attack events exhibit anomalous behaviors that deviate from normal host/service profiles
Attackers and victims are connected by suspicious communication activities

7
Level II Analysis Methodology

Anchor Point Identification
Identifying the starting point for attack analysis
via data fusion and correlation of output from
Level I analysis
Context Extraction
Identifying relevant events and entities (hosts,
flows, ...), starting from an anchor point
Attack Characterization
Refinement of context to characterize attacks
(presently manual)
Situation Assessment
Evaluation of attack characterizations (out of
scope)

[Figure: pipeline. Anchor Point Identification -> Context Extraction -> Attack Characterization -> Situation Assessment]
Motivated by challenges faced while working on
several cases with Angelo Bencivenga and Tim
Dunn at the ARL CIMP
8
Level-II Analysis Process Diagram

[Figure: Level-II analysis process. IDS sensor data passes through Anchor Point Identification, Context Extraction, and Attack Characterization to Situational Assessment. Intermediate products: list of anchor points, event activity graph, labeled attack sequences.]
Control: configuration/selection of analysis strategies; search size, depth, time frame; labeling/scoring rules
Algorithms/Techniques: behavior anomaly analysis; correlation/fusion of multiple sensor data; watchlist/blacklist; profile-based chaining analysis; temporal sequencing analysis; domain-specific guided search; knowledge-based event labeling; attack pattern matching

9
Behavior Profiling and Anomaly Analysis

[Figure: historical behavior profiling feeds current-time anomaly analysis, which produces scores.]
Historical Behavior Profiling
Service Profiling: service type (web, dns, ...), protocol (TCP, UDP, ICMP, ...), connection patterns, flow statistics
Host Profiling: host types (servers, clients, etc.), port/service profiles, traffic statistics, communication patterns
Current-Time Anomaly Analysis
Network-Wide Flow Anomaly Analysis: MINDS flow anomaly ranking, signature-based alerts, TCP flag analysis, ...
Host-Specific Flow Anomaly Analysis: deviation from normal host behaviors
Scoring: flow anomaly scores, host anomaly scores
Techniques: statistical profiling, statistical deviation analysis, clustering, outlier analysis, association rules, link analysis, signature-based rule matching

10
Anchor Point Identification Techniques

Watch list maintained by analyst
Hosts that engage in suspicious activity as
identified by one or more of the following
Standard IDS signatures (Snort alarms)
Behavior Anomalies
Hosts that send/receive traffic that is anomalous
w.r.t. historical profile
Behavior Signatures
A host that communicates with a known compromised
machine
Hosts that perform scans
Port knocking
Services (e.g., ftp, ssh) running on non-standard
ports
Any other identifiable behavior of a known
compromised machine
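
The indicators listed above can be combined mechanically. The following is a minimal sketch, not the project's implementation: the HostEvidence record, the field names, the max_rank cutoff, and the "two independent indicators" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class HostEvidence:
    """Level I evidence about one host (hypothetical record layout)."""
    ip: str
    snort_alert: bool = False            # a signature-based IDS alarm names this host
    anomaly_rank: Optional[int] = None   # rank from a flow anomaly detector, 1 = most anomalous
    scanned: bool = False                # a scan detector flagged this host as a scanner
    peers: Set[str] = field(default_factory=set)  # hosts it communicated with


def find_anchor_points(hosts, watchlist, max_rank=100):
    """Return {ip: [reasons]} for hosts worth a follow-on look.

    Assumed policy: a host is an anchor point if it is on the watch list,
    talks to a watch-list host, or shows at least two indicators.
    """
    anchors = {}
    for h in hosts:
        reasons = []
        if h.snort_alert:
            reasons.append("snort alert")
        if h.anomaly_rank is not None and h.anomaly_rank <= max_rank:
            reasons.append(f"anomaly rank {h.anomaly_rank}")
        if h.scanned:
            reasons.append("scanner")
        on_watchlist = h.ip in watchlist
        talks_to_watchlist = bool(h.peers & watchlist)
        if on_watchlist:
            reasons.append("on watch list")
        if talks_to_watchlist:
            reasons.append("talks to watch-list host")
        if on_watchlist or talks_to_watchlist or len(reasons) >= 2:
            anchors[h.ip] = reasons
    return anchors
```

Requiring two indicators mirrors the "SNORT alert on host with anomalous behavior" case: a lone Snort alert is ignored, while the same alert on an anomalous host becomes an anchor point.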

[Figure: example anchor points: an anomalous flow; communication to a host on the watchlist; a watchlist host; a Snort alert on a host with anomalous behavior]
11
Attack Context Extraction

Starting from an anchor point, recursively examine
activity to other hosts that
deviates from the norm
host's profile
service/port profile
is similar to known suspicious traffic
attack signatures
replies to scans
activities from compromised hosts
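
The chaining step above can be sketched as a bounded graph search. This is an illustrative sketch only: it uses an iterative breadth-first search with a depth limit in place of recursion, and the flow tuple layout and the is_suspicious callback (standing in for the profile and signature checks) are assumptions.

```python
from collections import deque


def extract_context(anchor, flows, is_suspicious, max_depth=2):
    """Collect suspicious flows reachable from an anchor host.

    flows: iterable of (src, dst, info) tuples (assumed layout)
    is_suspicious: callable(src, dst, info) -> bool, assumed to encode
        "deviates from profile" or "similar to known suspicious traffic"
    max_depth: number of hops to follow away from the anchor
    """
    flows = list(flows)
    # Index suspicious flows by each endpoint so chaining can go both ways.
    adjacent = {}
    for i, (src, dst, info) in enumerate(flows):
        if is_suspicious(src, dst, info):
            adjacent.setdefault(src, []).append(i)
            adjacent.setdefault(dst, []).append(i)

    visited = {anchor}
    kept, kept_set = [], set()   # flow indices in discovery order
    queue = deque([(anchor, 0)])
    while queue:
        host, depth = queue.popleft()
        if depth == max_depth:
            continue             # do not expand beyond the depth limit
        for i in adjacent.get(host, []):
            if i not in kept_set:
                kept_set.add(i)
                kept.append(i)
            src, dst, _ = flows[i]
            other = dst if src == host else src
            if other not in visited:
                visited.add(other)
                queue.append((other, depth + 1))
    return [flows[i] for i in kept]
```

The depth limit plays the role of the "search size, depth, time frame" controls: a small max_depth keeps the context graph small enough for an analyst to read.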

[Figure: example context around an anchor point: remote login attempt, outbound FTP, reply to a scan, web server]

Example ruleset:
ruleset terminal_services
ignore srcport < 1024
ignore packets < 4
ignore dstport ! 3389
ignore protocol ! tcp
profile client_services server dstport 3389 protocol tcp
profile servers client dstport 3389 protocol tcp

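
One possible reading of the four "ignore" clauses in the terminal_services ruleset is a plain flow filter. The sketch below is an interpretation, not the tool's rule engine: the dictionary keys are assumed field names, "!" is read as "not equal", and the two "profile" clauses are left out because the slide does not define their semantics.

```python
def is_terminal_services_candidate(flow):
    """Return True if a flow survives all four 'ignore' clauses.

    flow: dict with keys srcport, dstport, packets, protocol
          (assumed field names for a flow record)
    """
    if flow["srcport"] < 1024:       # ignore srcport < 1024
        return False
    if flow["packets"] < 4:          # ignore packets < 4
        return False
    if flow["dstport"] != 3389:      # ignore dstport ! 3389
        return False
    if flow["protocol"] != "tcp":    # ignore protocol ! tcp
        return False
    return True
```

Under this reading, what remains are TCP flows of at least four packets from an unprivileged source port to port 3389, which is the traffic the profile clauses then attribute to clients and servers.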
12
Attack Characterization
A context graph

Determine likely relationships (e.g. sequencing)
between retained events and hosts
Evaluate and rank hosts and activities in the
attack context to
Retain those with high degree of suspicion and
prune those with low degree of suspicion

[Figure: context graph with external hosts E1-E4 and internal hosts I1-I4]

Sample Rules
If a host is scanning, label it as attacker with a low score
If a host is scanned and it replies, label it as victim and give it a medium score
If an internal host is scanning, label it hacked with a high score
If a hacked internal host makes a subsequent file transfer to the outside, increase the score of the hacked label and label the target host as attacker with a high score
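
The four sample rules can be written as a small rule pass over time-ordered events. This is a minimal sketch under stated assumptions: the numeric LOW/MEDIUM/HIGH scale, the event tuple layout, the event kind names, and the choice to bump the hacked score by one are all hypothetical.

```python
LOW, MEDIUM, HIGH = 1, 2, 3   # assumed numeric scale for low/medium/high


def label_hosts(events, is_internal):
    """Apply the sample labeling rules to events in time order.

    events: iterable of (time, src, dst, kind) with kind in
            {"scan", "scan_reply", "file_transfer"} (assumed encoding;
            for "scan_reply", src is the scanned host that replied)
    is_internal: callable(host) -> bool
    Returns {host: {label: score}}.
    """
    labels = {}

    def raise_to(host, label, score):
        current = labels.setdefault(host, {})
        current[label] = max(current.get(label, 0), score)

    for _time, src, dst, kind in sorted(events):
        if kind == "scan":
            raise_to(src, "attacker", LOW)        # rule 1
            if is_internal(src):
                raise_to(src, "hacked", HIGH)     # rule 3
        elif kind == "scan_reply":
            raise_to(src, "victim", MEDIUM)       # rule 2
        elif kind == "file_transfer":
            hacked = "hacked" in labels.get(src, {})
            if hacked and is_internal(src) and not is_internal(dst):
                labels[src]["hacked"] += 1        # rule 4: raise the hacked score
                raise_to(dst, "attacker", HIGH)   # rule 4: label the target
    return labels
```

Processing events in time order matters for the last rule, since it only fires for a transfer that follows the event that earned the hacked label.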

Attack Characterization: Event Sequencing and Labeling (along the time axis)
E1 -> I1: Scan with replies
E2 -> I4: Initializing connection on non-standard port - Successful
I4 -> E4: Initializing FTP connection with external host

13
Accomplishments since Nov. 17, 2004 site visit

Refined Level II Analysis Process
Investigated and improved anchor point analysis
Spatio-temporal chaining analysis for context
extraction
Event sequencing and labeling for attack
characterization
Evaluation using Skaion Dataset II and real
network data from the University of Minnesota

14
Skaion Dataset

Two sets of synthetic data, generated by the
Skaion Corporation to simulate IC network traffic
and attacks
Traffic generated to statistically match data
captured at AFRL
Traffic contains background (normal) traffic, as
well as various scans and failed attacks
Background traffic is combined with multi-step/
multi-stage attacks to produce each scenario
Data Set I: 4 scenarios
A. Naïve Attacker, B. Five-by-five, C. Ten-by-ten, D. Simple-ten
Data Set II: 3 categories
Single-Stage Attacks: 8 scenarios
Bankshot Multi-stage Attacks: 5 scenarios
Misdirection Multi-stage Attacks: 3 scenarios
Each scenario includes tcpdump data of all
network traffic as well as Snort alerts, HTTP
access logs, FTP transfer logs, and Windows logs

15
Scenario II.C.1 S29: Misdirection Multistage Attack

Anchor Point Identification
SNORT alerts involving anomalous IPs

Statistics
Trunk: total packets 103,791; total flows 10,859
Snort alerts: 451
Bprd: total packets 73,595; total flows 6,987
Colo: total packets 98,858; total flows 6,002

[Figure: network diagram. Hosts shown: 18.2.175.153, 40.159.214.124, 53.82.21.112, 100.10.20.4. Labels: external, scanner, remote OSIS users, BPRD, web-server. Annotation: Anomaly Rank 47, failed connection on port 22.]

16
Scenario II.C.1 S29: Misdirection Multistage Attack

Context Extraction
Activity that deviates from host's normal profile
Scans that get replies

[Figure: network diagram of the extracted context. Hosts shown: 18.2.175.153, 116.45.223.116, 40.159.214.124, 74.205.114.175, 40.219.61.25, 53.82.21.112, 100.1.21.134, 100.10.20.10, 100.10.20.4. Labels: external, remote OSIS users, BPRD, web-server. Annotations: attempts remote login; scans and gets a reply; web-server initiating connection on port 8080 (Anomaly Ranks 1, 2, 4, 11); web-server initiating FTP connections; remote login on the web-server, following an undetected iis50_nsiislog attack; Anomaly Rank 47.]

17
Scenario II.C.1 S29: Misdirection Multistage Attack

Attack Characterization: Event Sequencing and Labeling
E4, E5 and E6 -> I2: Bad HTTP traffic
E2 -> I2: Scanning with a reply
E3 and E6 -> I2: Remote login - failed
D1 -> I1: Remote login - successful
I1 -> E1: Anomalous FTP
I1 -> E2: Anomalous transfer on port 8080

Dial-up host D1 hacks into web server I1 via remote login, and initiates anomalous file transfers from I1 to two outside hosts, E1 and E2, where E2 earlier performed scanning

[Figure: event graph along a time axis with external hosts E1-E6, scanner, dial-up host D1 (remote OSIS users), and BPRD web-servers I1 and I2; failed attempts are marked X.]

18
Scenario II.B.1 S1: Bankshot Multistage Attack

Anchor Point Identification: SNORT alerts involving anomalous IPs

Context Extraction
Activity that deviates from host's normal profile
Scans that get replies

Statistics
Trunk: total packets 986,494; total flows 44,994
Snort alerts: 10,896
Bprd: total packets 305,598; total flows 19,111
Colo: total packets 960,676; total flows 27,045

[Figure: network diagram. Hosts shown: 51.91.57.157, 112.50.254.117, 100.20.200.15 / 100.20.1.3, 100.10.20.4, 100.10.20.3. Labels: external, scanner, SHIELD enclave, BPRD, web-server, mail server. Annotations: Anomaly Rank 194, FTP connection to the web-server; successful remote login from the external host; failed access to web-server on port 111; initializing connection with mail server on port 5617 (Anomaly Ranks 1, 2, 5).]

19
Scenario II.B.1 S1: Bankshot Multistage Attack

Attack Characterization: Event Sequencing and Labeling
E2 -> I2: Scanning with a reply
E2 -> I2: Failed FTP attempt to web-server
E1 -> S1: Scanning with a reply
E1 -> S1: Remote login - successful
S1 -> I1 and I2: Scanning with replies
S1 -> I2: Failed connection to web-server on non-standard port
S1 -> I1: Successful connection to mail server on port 5617

External host E1 scans and hacks internal host S1, which scans the BPRD network and hacks mail server I1

[Figure: event graph along a time axis with external hosts E1 and E2, scanner, SHIELD enclave host S1, BPRD web-server I2, and mail server I1; failed attempts are marked X.]

20
Scenario I.C: Ten by Ten

Statistics
292,272 total packets
16,663 total flows
98.7% TCP
54 Snort alerts

[Figure: network diagram with external zone, scanner, and BPRD internal network.]

21
Scenario I.C: Ten by Ten

Anchor Point Identification: SNORT alerts involving anomalous IPs

[Figure: network diagram. Hosts shown: 192.168.222.2, 199.227.249.246, 100.10.20.10, 100.10.20.6, 100.10.20.5, 100.10.20.4. Labels: external, internal, BPRD, web-server. Annotations: markers "1 B/O" and "2 B/O"; anomaly rank 42; anomaly rank 65, anomalous file transfer; anomaly rank 64; anomaly rank 12, non-standard port access.]

22
Scenario I.C: Ten by Ten, 1st set of anchor points

Context Extraction
Activity that deviates from host's normal profile
Scans that get replies

[Figure: network diagram. Hosts shown: 220.237.152.116, 40.219.61.25, 199.227.249.246, 100.10.20.10, 100.10.20.4. Labels: external, internal, BPRD. Annotations: unsuccessful non-standard port access; anomalous file transfer; anomaly rank 65, anomalous file transfer initiated by web-server; anomaly rank 12, non-standard port access.]

23
Scenario I.C: Ten by Ten, 1st set of anchor points

Attack Characterization: Event Sequencing and Labeling
E1 -> I2: Initializing connection on non-standard port - Failed
E2 -> I2: Initializing connection on non-standard port - Failed
E2 -> I1: Initializing connection on non-standard port - Successful
I1 -> E3: Initializing FTP connection with external host

External host E2 hacks internal host I1, which subsequently does a file transfer with external host E3. E2 also attempts an unsuccessful attack on I2.

[Figure: event graph along a time axis with external hosts E1-E3 and BPRD web-servers I1 and I2; failed attempts are marked X.]

24
Scenario I.C: Ten by Ten, 2nd set of anchor points

Context Extraction
Activity that deviates from host's normal profile
Scans that get replies

[Figure: network diagram. Hosts shown: 206.131.61.250, 95.116.204.23, 208.241.45.204, 221.23.248.251, 192.168.222.2, 210.20.5.160, 161.122.144.247, 100.10.20.6, 100.0.1.2, 100.10.20.5, 100.20.10.2. Labels: external, internal, BPRD. Annotations: failed attempts by external hosts to connect to internal machines on non-standard or closed ports; anomaly rank 42; anomaly rank 64.]

25
Scenario I.C: Ten by Ten, 2nd set of anchor points

Attack Characterization: Event Sequencing and Labeling
E1-E7 -> I1-I4: Initializing connection on non-standard/closed port - Failed

[Figure: event graph along a time axis with external hosts E1-E7, scanner, and internal hosts I1-I4 (BPRD); failed attempts are marked X.]

26
Case Study II: Experience with Minnesota Data

Approach: Starting with a good set of anchor points of known bad computers, analyze their communication patterns and the communication patterns of those they talk to, in order to identify other compromised computers
Anchor Points: A blacklist of 370 master (C&C) machines, constructed by security analysts around the world, was used as the starting point

27
University of Minnesota

[Figure: U of MN network alongside the list of 370 blacklisted computers. Host names shown: kissing-sadam.allxtremenet.net, deleted.important.us-govt.info, not.really.a.whiteangel.info, whats.up.buttface.net, dont.i.know.y-ou.com, irc.acidillusion.net]
An internal IP was found to be talking to 2 of the newly found masters and 9 new external IPs on port 6667
One internal computer was talking to 3 blacklisted IPs (17 flows)
An internal IP was found to be talking to 35 external IPs on port 6667

28
More Intelligent Approach

The 1st attempt was good, growing the black list
by 12, but can we do better?
Removed the requirement of only looking for
communication on port 6667 TCP
Added simple historic profiling to remove good
IPs from being blacklisted
Identified 54 new command and control machines
with no false alarms
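
The blacklist-growing procedure can be sketched in three steps. This is a hedged illustration rather than the analysis actually run at the University of Minnesota: the flow layout, the is_internal predicate, and the historically_good set (standing in for the "simple historic profiling") are assumptions.

```python
def expand_blacklist(flows, blacklist, is_internal, historically_good):
    """Propose new command-and-control candidates from a seed blacklist.

    flows: iterable of (src, dst) host pairs (assumed layout)
    blacklist: set of known bad external machines
    is_internal: callable(host) -> bool
    historically_good: external hosts seen in the network's normal
        historical profile (assumed to be supplied by profiling)
    Returns (suspect_internal_hosts, new_candidates).
    """
    flows = list(flows)
    blacklist = set(blacklist)

    # Step 1: internal hosts that talk to a blacklisted machine.
    suspects = set()
    for src, dst in flows:
        if is_internal(src) and dst in blacklist:
            suspects.add(src)
        elif is_internal(dst) and src in blacklist:
            suspects.add(dst)

    # Step 2: every external machine those suspects talk to, on any port.
    candidates = set()
    for src, dst in flows:
        if src in suspects and not is_internal(dst):
            candidates.add(dst)
        elif dst in suspects and not is_internal(src):
            candidates.add(src)

    # Step 3: drop known entries and historically benign machines.
    return suspects, candidates - blacklist - set(historically_good)
```

Step 2 deliberately has no port filter, matching the removal of the port 6667 requirement, and step 3 is where historic profiling keeps good IPs off the list.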

29
A little manual digging into the 54 new Command
and Control machines

Upon further inspection, 30 of the 54 C&C machines
had 2000-5000 machines throughout the world
connected to them at the time of investigation
Some of the more interesting computer names found
66.90.85.148 phear.my.penix.info
66.90.124.134 dont.i.know.y-ou.com
66.90.124.141 irc.acidillusion.net
67.111.204.243 whats.up.buttface.net
69.64.51.192 192.electricstorm.co.uk
208.51.90.83 not.really.a.whiteangel.info
208.179.57.115 deleted.important.us-govt.info
208.179.62.246 kissing-sadam.allxtremenet.net

30
Summary: Lessons from Case Studies

When a compromise does occur, quick understanding
of the scope of the problem is crucial for IC
network defenders
Our analysis methodology is effective at quickly
identifying what computers are compromised on
synthetic, university and military networks
shown good promise on the Skaion data
helped security analysts identify compromised
machines in public networks (UMN)
proved effective on real military networks (ARL
CIMP)
Behavior anomaly detection is an effective way to
detect novel sophisticated attacks

31
Applicability of Approach to IC Scenarios

Threat Model: multi-step, stealthy attacks
generating suspicious/anomalous activities
Rationale: our analysis methodology is likely to
perform better on IC networks than in a general
Internet environment
traffic is relatively cleaner and more regulated
number of (outside) hosts an IC computer talks to
is likely to be far fewer than a typical host in
a university setting
easier to build reliable behavior profiles and
communication patterns

32
Limitations/Vulnerabilities and Mitigations

Limitations/Vulnerabilities
Must be able to find an anchor point
either from anomalies, signatures, scan
detection, host based IDS, etc.
Some steps or aspects of malicious activities
must deviate from normal behavior
Mitigations
include more diverse sensor data
develop more intelligent rules for anchor point
identification
develop more sophisticated behavior profile
techniques
develop more efficient context extraction and
attack characterization that can explore a larger
search space

33
Relationship to Other ARDA Projects (Based on June 2004 PI meeting)

[Figure: MINDS at the center, linked to other ARDA projects: Veridean, CMU, Lockheed Martin; Secure Decision; Dartmouth; Utah; Alions Buffalo; Nong Ye, Arizona SU; D-Force IET; GDAIS Dartmouth; Valdes SRI.]
Relationships noted on the links:
MINDS output can be input to CAPS
Hidden Markov Model could help attack characterization
Game Theory could help anchor point identification
MINDS Level I and II analysis can be more effective with visualization
MINDS output can be input to ECCARS correlator; MINDS Level II analysis can simplify attack graph extraction
Attack profiling can be used to guide MINDS Level II analysis
And/Or analysis might help anchor point identification
Correlation analysis might help anchor point identification
MINDS Level II analysis can simplify attack scenario extraction (noted on two links)
MINDS anomalies (alarms) can be correlated with other alarms
SRI's correlation analysis can be used for anchor point identification
Bayesian analysis can be used for MINDS Level II analysis

34
Future Plans

Long-Term Goal: Integrated Situational Awareness Framework and Tools to aid IC defenders in effective decision making
Where we are in November 2004
- Developed a novel SA analysis framework and its key components and algorithms
Where do we expect to be by March 2005
Tasks and deliverables
Where do we want to go beyond March 2005
Future capabilities

35
Near-Term Action Plan (March 2005)

Tasks
Implementation and refinement of Level II SA
analysis methodology and algorithms
Implementation and refinement of network behavior
profiling
Deliverables
prototype system incorporating key components of
Level I and II analysis with anchor point and
context extraction steps
documentation of design and implementation
documentation of testing, evaluation and case
studies

36
Plan Beyond March 2005

Improvement and extensions of Level II SA analysis
methodology and algorithms
anchor point identification using more diverse
sensor data
context refinement using link analysis and
association rules
attack characterization using advanced models
from other projects
semi-automated situation assessment techniques
Continual and real-time profiling and profile
databases
multi-dimensional, information-theoretical
structural models for normal/suspicious network
(host, flow, service, etc.) behaviors
attack and attacker profiling (worm/scanning
activities, moles/drones/masters, etc.)
query-able profile databases
Integration of various pieces of proposed SA
analysis framework
in particular, interoperability with other P2INGS
systems
Multi-tiered, cooperative, global situational
awareness analysis framework

37
Future Plans

Eric's suggestions - Tasks for the next 12 months.
Create a query language an analyst can use in the course of doing 2nd-level analysis to look for high-level patterns. Some example patterns are like the ones for finding terminal services. Although the algorithms used in this may not be very novel or groundbreaking, such a tool does not exist at this time, and it would make an analyst many times more effective. Right now, all pattern matching is effectively done in a person's head, and chugging through the data takes hours to look for a simple pattern in only a few hours' worth of data. Having a fast and flexible query language will allow an analyst to look for many patterns quickly and efficiently.
Developing better techniques for profiling of behaviors will be an ongoing task. Having a better profile will allow for fewer false alarms, and less data for a human to look at. This will also allow a "lower" threshold or "looser" rules/patterns to be employed, since there is greater confidence in the profiles and thus in the initial anchor points.
Developing a scoring mechanism for how important
an anchor point is will allow an analyst to look
at more interesting subgraphs first.
Also, always use the top 0.5% of anomalies intersected with Snort alerts for anchor points, or the first 5 intersections, whichever is more. This will ensure the analyst always has some anchor points to explore.
A weighting mechanism should also have a feedback of sorts. If an alert is presented to an analyst and it is found useful, the tools used to create it should be given a higher weight. If Snort combined with MINDS finds interesting things 25% of the time, while Snort combined with JIDS only finds things 5% of the time, the latter combination should be given a lower weight.
Payload information should be considered. One way would be to use a histogram of the payload to profile whether it is HTTP, FTP, email, binary data, or encrypted. This can be used both to determine if plain text is going over an SSH port, and if the traffic is different. If suddenly the traffic over SSH looks binary and not encrypted, this is interesting (hypothesis: binary data, although using all 256 byte values, might not be uniformly distributed, whereas encrypted data should be uniform).
Profiles should be made between frequent talkers to determine if their communication varies. Right now, if two computers talk a lot, their conversations are assumed to be safe. Although this is generally fine, if someone compromises one or both of the computers they could hide their communication on the same service. If 2 mail servers that typically talk are compromised, either the IPs or the IPs on port 25 are assumed to be safe. However, someone could have replaced sendmail on one of them to also allow it to provide a login. This type of traffic would look different than typical mail traffic.
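
The payload-histogram suggestion in the list above can be illustrated with a byte-entropy measure. This is a sketch of one way to test the stated hypothesis, not a technique the project reports having built: the use of Shannon entropy and the 7.5 bits-per-byte threshold are assumptions.

```python
import math
from collections import Counter


def byte_entropy(payload: bytes) -> float:
    """Shannon entropy of the payload's byte histogram, in bits per byte.

    0.0 means a single repeated byte value; 8.0 means all 256 byte
    values occur equally often (the uniform case).
    """
    if not payload:
        return 0.0
    counts = Counter(payload)   # histogram over byte values 0..255
    total = len(payload)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_encrypted(payload: bytes, threshold: float = 7.5) -> bool:
    """Guess whether a payload is near-uniform, as encrypted data should be.

    threshold is an assumed cutoff and would need tuning on real traffic.
    """
    return byte_entropy(payload) >= threshold
```

A payload shorter than 256 bytes cannot reach 8 bits per byte, so in practice the histogram would be accumulated over many packets of a flow before comparing it with the profile for that port.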

38
Relationship to Other ARDA Funded Projects (Based on June 2004 PI meeting)

[Figure: projects arranged around MINDS under the headings Correlation and Visualization.]
GDAIS: Corr, Assess, History
SRI: IDS/FW Corr, entity centric
Secure Decision: Visual Display, human factor
IET (D-Force): IDS Corr, BN
Veridean: Prediction
Utah: Visual Display, human factor
Skaion: Traffic Generation
Endeavor: Automatic response, addr/port map
Northrop: Network Data Mining
ASU (Nong Ye): Cyber signal analysis
Arbor Networks: Macro/micro sensors, track/forensics, active honeypot
