ANOMALY DETECTION AND CHARACTERIZATION: LEARNING AND EXPERIENCE

Slides: 23

Provided by: aaron101

Learn more at: https://users.cs.northwestern.edu

1
ANOMALY DETECTION AND CHARACTERIZATION: LEARNING AND EXPERIENCE

YAN CHEN, MATT MODAFF, AARON BEACH

2
NETWORK TRAFFIC: WHAT DOES IT LOOK LIKE?
Where are the anomalies?
3
Overview

Anomaly detection using a prediction algorithm: Holt-Winters
Basic: one-dimensional detection (value prediction)
Intermediate: multi-dimensional detection (vector prediction)
Advanced: characterization by correlating many multi-dimensional detections in parallel (2nd power vector prediction)
Automatic characterization updates using a maliciousness rating system

4
Holt-Winters

Prediction algorithm
Exponential Smoothing
Sum of three components
Baseline (intercept)
Linear Trend (slope)
Seasonal Trend

5
Holt-Winters continued

Constants alpha, beta, and gamma are predetermined (each between 0 and 1)
We used 0.1 for all three; the value sets how heavily new values are weighted against old values
Choose a seasonal size
We chose 1 minute since we only had 1 day of data
Or two hours for ICMP detection
Values are measured against a threshold of deviation (delta)
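
The three components and the deviation threshold can be sketched in code. The version below is a minimal sketch of the standard additive Holt-Winters update with a per-season deviation band, not necessarily the exact variant used for these slides. The function name, the naive first-season initialisation, and the default delta of 2.0 are assumptions made for illustration; only the 0.1 smoothing constants come from the slide.

```python
def holt_winters_aberrations(series, period, alpha=0.1, beta=0.1,
                             gamma=0.1, delta=2.0):
    """Flag observations outside forecast +/- delta * seasonal deviation.

    Assumed additive form: forecast = baseline + trend + seasonal term.
    Returns one boolean per observation (True = aberration).
    """
    if len(series) < 2 * period:
        raise ValueError("need at least two full seasons of data")

    # Naive initialisation from the first season (an assumption of this sketch).
    level = sum(series[:period]) / period
    trend = 0.0
    season = [v - level for v in series[:period]]
    dev = [0.0] * period
    flags = [False] * period  # the first season only initialises the model

    for t in range(period, len(series)):
        i = t % period
        y = series[t]
        forecast = level + trend + season[i]
        band = delta * dev[i]
        # The deviation is still being learned during the second season,
        # so flagging starts with the third.
        flags.append(t >= 2 * period and abs(y - forecast) > band)

        # Exponential smoothing updates for the three components.
        prev_level = level
        level = alpha * (y - season[i]) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[i] = gamma * (y - level) + (1 - gamma) * season[i]
        # Smoothed absolute error, kept per position in the season.
        dev[i] = gamma * abs(y - forecast) + (1 - gamma) * dev[i]
    return flags
```

The seasonal size is expressed here as a number of time steps (period), so a two-hour season sampled once a minute would be period=120.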

6
Detecting Aberrations / Alarms

Set a window size and the number of aberrations considered alarming
If the number of aberrations within the time window reaches the limit, then alarm
We used aberration/window settings of 10-15/30 and 1/1, depending on the time step and the characteristic nature of the variable combination being detected
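
The window rule can be sketched as a small counter over the most recent time steps. The class name is hypothetical. The sketch alarms when the count reaches the limit rather than exceeds it, which is an assumption: it is the only reading under which the 1/1 setting can ever fire.

```python
from collections import deque


class AberrationWindow:
    """Raise an alarm when enough recent time steps were aberrations."""

    def __init__(self, window_size, limit):
        self.limit = limit
        # Only the last `window_size` flags are kept; older ones drop off.
        self.recent = deque(maxlen=window_size)

    def update(self, is_aberration):
        """Record one time step and return True if the alarm should fire."""
        self.recent.append(bool(is_aberration))
        return sum(self.recent) >= self.limit


# 10 aberrations within a 30-step window, as in the 10/30 setting.
window = AberrationWindow(window_size=30, limit=10)
alarms = [window.update(True) for _ in range(10)]
```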

7
Network Traffic Data

Network traffic data has many variables
We look at
Source and Destination IP addresses
Source and Destination port numbers
Protocol type
Bytes and packets in a traffic flow
Unique flow defined by source and destination
port/IP tuples
Protocol flags (TCP flags)
Over time these many variables
form a dynamic vector of data
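
One way to picture the dynamic vector is to aggregate the flow records seen in one time step into named counters. The record layout, field names, and choice of counters below are all hypothetical, chosen only to mirror the variables listed above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One flow record (hypothetical layout)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    n_bytes: int
    n_packets: int
    tcp_flags: str = ""


def traffic_vector(flows):
    """Collapse the flows of one time step into a vector of counters."""
    vec = Counter()
    for f in flows:
        vec[("flows", "dst_port", f.dst_port)] += 1
        vec[("bytes", "protocol", f.protocol)] += f.n_bytes
        vec[("packets", "protocol", f.protocol)] += f.n_packets
        if "S" in f.tcp_flags:  # "S" marks a SYN in this sketch
            vec[("flows", "tcp_flag", "SYN")] += 1
    return vec
```

Each key of the resulting vector is one dimension that a predictor can track from time step to time step.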

8
What is Anomaly Detection?

We predict normal vector space using the
Holt-Winters Forecasting Method
We define vector space beyond normal as
aberrant
If the network traffic vector travels into
aberrant space it is considered an anomaly

Now let's look at a few examples of basic direct
anomaly detection and alarm triggering

9
Detection using port dimension

A clear port scan on port 21 (FTP) at 12:46-47 AM
from one address outside the network

10
Detection using Protocol ICMP

ICMP spikes every 2 hours
Without seasonal values all of these may show up
as malicious anomalies

11
Port activity: malicious or normal?

Port 17300 is used by nothing except the Kuang2 Trojan/virus,
but port 10000 is used by both the NDMP server backup service
and Dumaru.Y

12
Detection using three variables: flow bytes/packets and TCP flag

SYN attack early in the morning?
What about the little spikes: are they SYN attacks?

13
Explaining detected anomalies

Three variables are enough for detection but don't tell us what the anomaly is; we need other variables for characterization
Huge scan to port 4128: why just 4128? Is it really just a DoS?
All computers that respond to the SYNs on 4128 receive requests on port 137 (NetBIOS, a protocol used to support file and printer sharing)
This data matches a method many viruses use to find exploitable systems. It is called an NBTSTAT -A type scan: it locates systems with open shares (port 4128), then tries to execute the infection via a connection to the file share (port 137)
The attack is on port 137, yet there is no large scan on port 137, only a scan on the relatively harmless port 4128. This indirect scanning could have avoided detection
Possible suspects are Nimda, Bugbear, Msinit, Opaserv, Qaz

14
More Advanced Detection

For the previous detection example we could
define a vector of malicious conditions
The vector space would have had 10 variables:
2 sets of (dst IP, dst port, bytes, packets, protocol)
Each variable can have a condition or
range that is malicious
This combination of 2 sets of 5 ranges or
conditions for different variables forms a unique
malicious vector space!
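
A vector of malicious conditions can be sketched as a mapping from variable name to a test on that variable. Everything concrete below is hypothetical: the helper names, the "a_"/"b_" prefixes for the two sets, and the byte and packet thresholds are illustrative values, not figures from the captured traffic.

```python
def equals(expected):
    """Condition: the variable has exactly this value."""
    return lambda value: value == expected


def between(low, high):
    """Condition: the variable lies in the inclusive range [low, high]."""
    return lambda value: low <= value <= high


def anything():
    """No restriction on this variable."""
    return lambda value: True


# Two sets of five variables: set "a" is the scan, set "b" the follow-up.
scan_then_netbios = {
    "a_dst_ip": anything(),
    "a_dst_port": equals(4128),
    "a_bytes": between(40, 80),       # illustrative threshold
    "a_packets": between(1, 3),       # illustrative threshold
    "a_protocol": equals("TCP"),
    "b_dst_ip": anything(),
    "b_dst_port": equals(137),
    "b_bytes": between(1, 2000),      # illustrative threshold
    "b_packets": between(1, 20),      # illustrative threshold
    "b_protocol": equals("UDP"),
}


def in_malicious_space(observed, conditions):
    """True only if every variable satisfies its condition."""
    return all(test(observed[name]) for name, test in conditions.items())
```

Traffic falls inside this malicious vector space only when all ten conditions hold at once, which is what makes the combination distinctive.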

Now let's look at an example of using three
detection vectors in parallel to distinguish
normal space from malicious space

15
Comparing 3 Detections in parallel

The network seems to update SMTP servers every few hours; this should be taken into account
Spikes in DNS traffic may be attributed to seasonal updates
Because of the authentication protocol of some older SMTP servers, port 113 traffic mirrors SMTP traffic on a smaller scale. Taken together, both spike at the same relative ratio. This can help distinguish normal vector space from malicious and help define the conditions of malicious characterization vectors

16
Detecting a Malicious Vector

A degree of maliciousness at any one moment can be calculated as the percentage by which the current traffic is closer to the malicious conditions than the normal/predicted values are
So any current network traffic vector (point) has a degree of maliciousness for each unique vector of malicious conditions

0: completely normal/predicted
>100: completely within malicious space
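
The slide does not give a formula, so the following is one possible reading, sketched under stated assumptions: each variable is scored by how far the current value has moved from the predicted value toward the edge of the malicious range, and a vector takes the score of its least malicious variable, since every condition must hold to be inside malicious space. The function names and the use of the minimum are assumptions.

```python
def maliciousness(current, predicted, boundary):
    """Percent of the way from the predicted value to the malicious boundary.

    0 means the value is exactly as predicted; values above 100 mean the
    boundary has been crossed. Movement away from the boundary scores 0.
    """
    span = boundary - predicted
    if span == 0:
        raise ValueError("predicted value already sits on the boundary")
    return max(0.0, 100.0 * (current - predicted) / span)


def vector_maliciousness(current, predicted, boundaries):
    """Score a whole vector by its least malicious variable (an assumption)."""
    return min(
        maliciousness(current[name], predicted[name], boundary)
        for name, boundary in boundaries.items()
    )
```

For example, with 100 SYN flows predicted and a malicious boundary at 200, observing 150 scores 50 and observing 250 scores 150.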

17
Anomalous but not Malicious

What if data falls outside the threshold of deviation (out of normal space) but does not fall into malicious space? This is undefined space
Any action taken in these cases would be uninformed, not based on previous knowledge, so no action should be taken. Instead, a warning alarm should go off, and a careful analysis and report of the data should be stored so that it can be studied later
If this anomaly leads into malicious space, the malicious space may need to be expanded to include this newly detected anomaly
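
The three regions (normal, malicious, undefined) can be told apart with two tests per time step. This is a minimal sketch: the function name, the dictionary layout, the inclusive range format for malicious conditions, and the default delta are assumptions made for illustration.

```python
def classify(observed, predicted, deviation, malicious_ranges, delta=2.0):
    """Return "normal", "malicious", or "undefined" for one traffic vector."""
    # Normal space: every variable within delta deviations of its prediction.
    in_normal = all(
        abs(observed[k] - predicted[k]) <= delta * deviation[k]
        for k in predicted
    )
    if in_normal:
        return "normal"
    # Malicious space: every conditioned variable inside its (low, high) range.
    in_malicious = all(
        low <= observed[k] <= high
        for k, (low, high) in malicious_ranges.items()
    )
    return "malicious" if in_malicious else "undefined"


review_queue = []  # undefined events are only stored for later study


def handle(observed, label):
    """Take no action on undefined events beyond recording them."""
    if label == "undefined":
        review_queue.append(observed)
```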

18
Anomalous but not Malicious continued

Each non-malicious anomalous event should be stored and given a manual maliciousness rating later
This rating can then be incorporated into all related malicious variable conditions
The detection conditions would then be continually updated by new anomalous data, simply by the administrator rating how malicious a specific event was to their network and in which way it was malicious (DoS, virus, etc.). This makes updating very easy, without relying on outside sources

19
Future Work / Implementation

3 levels of detection
Basic: checking the maliciousness rating of one variable
Intermediate: checking the maliciousness of vectors of variables
Advanced: checking vectors of maliciousness ratings of multiple detection vectors in parallel
This can continue to be scaled to whatever level of complexity is necessary
Each detection vector need only be checked once every time step (seconds, minutes, etc.), depending on how well the server can perform
Detection precision increases with smaller time steps
Only one time step of data and vectors need be stored in memory

20
Future Work / Implementation

Computation per time step is equal to the average computation for one vector multiplied by the number of detection vectors
The memory requirement is equal to the traffic data for one time step plus the average vector size multiplied by the number of vectors
Based on processor speed, memory space, and the number of characterizations being detected, an optimal time step could be computed
Future work could involve testing the plausibility of this system in high-speed, large-traffic-volume situations
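
The two cost estimates above are simple products and sums, so the search for a time step can be sketched directly. The function names, the units, and the rule of picking the smallest candidate step that fits both budgets are assumptions; the slides only say that an optimal step could be computed.

```python
def computations_per_step(avg_cost_per_vector, n_vectors):
    """Average computation for one vector times the number of vectors."""
    return avg_cost_per_vector * n_vectors


def memory_per_step(traffic_bytes, avg_vector_bytes, n_vectors):
    """Traffic data for one time step plus storage for all vectors."""
    return traffic_bytes + avg_vector_bytes * n_vectors


def smallest_feasible_step(candidate_steps, ops_per_second,
                           avg_cost_per_vector, n_vectors,
                           traffic_bytes_per_second, avg_vector_bytes,
                           memory_limit):
    """Return the smallest step (in seconds) that fits both budgets, or None.

    Smaller steps give better precision, so candidates are tried in
    increasing order.
    """
    needed = computations_per_step(avg_cost_per_vector, n_vectors)
    for step in sorted(candidate_steps):
        fits_cpu = needed <= ops_per_second * step
        memory = memory_per_step(traffic_bytes_per_second * step,
                                 avg_vector_bytes, n_vectors)
        if fits_cpu and memory <= memory_limit:
            return step
    return None
```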