Data Mining Approaches for Network Intrusion Detection - PowerPoint PPT Presentation

1
Data Mining Approaches for Network Intrusion
Detection

Karla Bracamonte
Jeffrey Gawlinski
Jordan Harstad
Omar Rodriguez
Michael Wright

2
Intrusion Detection: Current

Detection is best if it occurs during the
scanning step
Real-time intrusion detection
Pros: Scans network traffic on the fly, looking
for well-known scan patterns
Cons: Tuned specifically to detect known
service-level network attacks
Intrusion detection should follow a proactive
approach
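
The idea of watching traffic for a well-known scan pattern can be sketched as follows. This is a minimal illustration, not a production detector: the function name `find_port_scanners`, the record layout `(source, destination, destination_port)`, and the `max_ports` threshold are all assumptions made for the example.

```python
from collections import defaultdict


def find_port_scanners(connections, max_ports=10):
    """Return sources that contacted more than max_ports distinct ports on one host.

    connections: iterable of (source, destination, destination_port) tuples.
    max_ports is a hypothetical threshold chosen only for this sketch.
    """
    ports_seen = defaultdict(set)
    for src, dst, dst_port in connections:
        ports_seen[(src, dst)].add(dst_port)
    suspects = {src for (src, _dst), ports in ports_seen.items()
                if len(ports) > max_ports}
    return sorted(suspects)


# Hypothetical traffic: one host probes 20 ports, another opens two web connections.
traffic = [("10.0.0.5", "10.0.0.1", port) for port in range(20, 40)]
traffic += [("10.0.0.7", "10.0.0.1", 80), ("10.0.0.7", "10.0.0.1", 443)]
print(find_port_scanners(traffic))
```

A fixed threshold like this only catches the scan pattern it was tuned for, which is the limitation listed under Cons.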

3
Visualization: Presenting a Graphical Summary of
the Data

Communication: presenting a graphical summary of
the data
Pros: Possible to communicate the most important
aspects of the collected data
Cons: Not all information can be communicated
visually; it is limited by the complexity that
the human eye can appreciate.

4
Visualization: Presenting a Graphical Summary of
the Data

Visual Techniques
Scatterplots
Projection Matrices
Coplots
Parallel Coordinates
Etc

5
Visualization: Presenting a Graphical Summary of
the Data

Distortion Methods: narrow the view to a portion
of the data so that a subset can be studied in
detail without losing the overall perspective.
Interactive Methods: view output dynamically
through a user interface that can project, zoom,
and manipulate the data on demand.

6
Visualization: Presenting a Graphical Summary of
the Data

Audio Data Mining: applies visual techniques by
converting signal pitches into graphs in order to
recognize unique patterns.
Example: helps find patterns of early warning
signs of human anger in telephone communication.

7
Data Summarization

Data summarization is an important data analysis
task in data warehousing and online analytical
processing. Another term used for data
summarization is summary statistics.

8
Data Summarization: Offline Data Mining and
Importance of Statistics

For example, networks with high traffic are faced
with a larger amount of data to analyze.
Nevertheless, with the use of data summarization,
data may be analyzed pattern by pattern,
detecting abnormal behavior and/or results

9
Data Summarization: Offline Data Mining and
Importance of Statistics

Summary statistics are quantities, such as the
mean and standard deviation, that capture various
characteristics of a potentially large set of
values with a single number or a small set of
numbers.
Indeed, for many people, summary statistics are
the most visible manifestation of statistics.

10
Data Summarization: Offline Data Mining and
Importance of Statistics

Frequencies and Mode
Given a set of unordered categorical values,
there is not much that can be done to further
characterize the values except to compute the
frequency with which each value occurs for a
particular set of data
Percentiles
For ordered data, it is more useful to consider
the percentiles of a set of values
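
The frequency, mode, and percentile summaries described on this slide can be computed with Python's standard library. The sample data below (protocol names and packet sizes) is invented purely for illustration.

```python
from collections import Counter
import statistics

# Unordered categorical values: only frequencies and the mode are meaningful.
protocols = ["tcp", "udp", "tcp", "icmp", "tcp", "udp"]
frequencies = Counter(protocols)
relative = {value: count / len(protocols) for value, count in frequencies.items()}
mode = frequencies.most_common(1)[0][0]

# Ordered values: percentiles describe where the data falls.
packet_sizes = [40, 52, 60, 64, 128, 512, 576, 1500]
quartiles = statistics.quantiles(packet_sizes, n=4)  # 25th, 50th, 75th percentiles

print(frequencies, mode)
print(quartiles)
```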

11
Data Summarization: Offline Data Mining and
Importance of Statistics

Measures of Location: Mean and Median
For continuous data, two of the most widely used
summary statistics are the mean and median, which
are measures of the location of a set of values.
Measures of Spread: Range and Variance
Another set of commonly used summary statistics
for continuous data are those that measure the
dispersion or spread of a set of values. Such
measures indicate whether the attribute values
are widely spread out or relatively concentrated
around a single point such as the mean.
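
As a small worked example of these location and spread measures, the sketch below summarizes a hypothetical list of connection durations. The helper name `summarize` and the data are assumptions for illustration only.

```python
import statistics


def summarize(values):
    """Return basic location and spread measures for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "variance": statistics.variance(values),  # sample variance (divides by n - 1)
    }


# Hypothetical connection durations in seconds; one value is an outlier.
durations = [2, 3, 3, 4, 48]
summary = summarize(durations)
print(summary)
```

Here the mean is 12 while the median is 3: a single outlier pulls the mean away from the bulk of the values, and the large range and variance reflect the same outlier.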

12
Data Summarization: Offline Data Mining and
Importance of Statistics

Multivariate Summary Statistics
Measures of location for data that consists of
several attributes (multivariate data) can be
obtained by computing the mean or median
separately for each attribute.
Other Ways to Summarize Data
Skewness
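
A minimal sketch of per-attribute location measures and of skewness follows. The function names and sample records are hypothetical, and the skewness shown is the population moment form (third central moment divided by the cubed standard deviation), which is one of several conventions in use.

```python
import statistics


def column_summary(records):
    """Return (mean, median) for each attribute of a list of equal-length records."""
    columns = list(zip(*records))
    return [(statistics.mean(col), statistics.median(col)) for col in columns]


def skewness(values):
    """Population skewness; assumes the values are not all identical."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


# Hypothetical records: (packets, duration in seconds)
records = [(1, 10.0), (2, 20.0), (3, 60.0)]
print(column_summary(records))
print(skewness([1, 2, 3]), skewness([2, 3, 3, 4, 48]))
```

A symmetric set of values has skewness 0, while a long right tail gives a positive value.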

13
Data Summarization: Offline Data Mining and
Importance of Statistics

Off-line processing is a reasonable solution.
Off-line processing provides the techniques for
broader analysis of network traffic.

14
Network Intelligence Gathering

Footprinting
Administrative, technical, and billing contacts,
which include employee names, email addresses,
and phone and fax numbers
IP address range
DNS servers
Mail servers
15
Network Intelligence Gathering

Enumeration
The process of extracting valid accounts or
exported resource names from systems
Scanning
The art of detecting which systems are alive and
reachable via the Internet, and what services
they offer, using techniques such as ping sweeps,
port scans, and operating system identification

16
Network Based Attacks

Attack on availability
Making a network unavailable or unusable to a
user or a group of users
Attack on confidentiality
Many attacks target personal data. Whether it is
a name, address, email, social security number,
or credit card number, many network-based attacks
exist solely to gather confidential and/or
personal information on an individual, group of
individuals, company, or object.

17
Network Based Attacks

Attack on integrity
It is possible for the data to be intercepted
altogether and thus never reach the intended
recipient.
Attack on authenticity
Modifies an original data cluster and then passes
it on as unmodified.

18
Network Based Attacks

Attack on access control
This method attacks a legitimate machine within a
secure network in the hope of accessing network
and server resources.
Attack on privacy
An attack on privacy is mainly used to record
data in one way or another. Whether it is
tracking specific website usage, online video
game play, or email addresses, this method is
used by attackers to exploit an individual's
activity on a computer.

19
Network Based Attacks

Prevention
Firewalls
Virus Scanners
Common Sense

20
Known Flags: Data Mining for Security

Suspicious red flags are not conclusive proof
that fraud has been committed.
They are simply one tool of many for preventative
measures.
Data mining offers no single catch-all rule and
should not be relied upon exclusively.
A consistent pattern is a must for identifying
possible fraud.

21
Known Flags: Data Mining for Security

Example: telecommunications fraud
Nodes represent different countries
Lines represent international phone calls
Unusually bright areas represent strange
activity identified as fraud

22
Known Flags: Data Mining for Security

Example: compromised credit card accounts
A distinct pattern usually involves a lost or
stolen card being swiped at a gas station. No
gas is purchased; the swipe is used only to check
whether the account is still active.
Large jewelry and electronics purchases follow
shortly afterward.
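
The gas-station pattern above could be expressed as a simple red-flag rule. This is only a sketch under stated assumptions: the transaction layout, the merchant category names, and the thresholds `probe_limit`, `large_amount`, and `window_hours` are all hypothetical, and a match is a flag for review rather than proof of fraud.

```python
def flag_card_testing(transactions, probe_limit=1.00, large_amount=500.00,
                      window_hours=24):
    """Flag a tiny gas-station charge followed soon after by large purchases.

    transactions: list of (hours_since_start, merchant_category, amount),
    sorted by time. All thresholds are illustrative assumptions.
    """
    flagged = []
    for i, (t0, category, amount) in enumerate(transactions):
        if category != "gas_station" or amount > probe_limit:
            continue  # not a status-check style swipe
        followups = [
            tx for tx in transactions[i + 1:]
            if tx[0] - t0 <= window_hours
            and tx[1] in {"jewelry", "electronics"}
            and tx[2] >= large_amount
        ]
        if followups:
            flagged.append((transactions[i], followups))
    return flagged


# Hypothetical account history
history = [
    (0, "gas_station", 0.00),
    (2, "jewelry", 1800.00),
    (3, "electronics", 950.00),
    (30, "grocery", 45.00),
]
print(flag_card_testing(history))
```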

23
Known Flags: Data Mining for Security

Example: terrorist activity has not been
countered as a result of data mining.
It has no distinct pattern; a terrorist's profile
has no clear definition.
Large government (NSA) programs have attempted to
use data mining for preventive measures without
success.
Total Information Awareness generated thousands
of tips every month for over a year without a
single lead into terrorist organizations.
24
Classification: Predicting the Category to Which
a Particular Record Belongs

A major part of the classification process is the
initial information gathering task.
The idea behind this collection of data is that
normal and abnormal patterns of occurrences can
be differentiated from one another, and
algorithms can then be created to detect such
patterns.
Once created, these algorithms can flag
suspicious events as abnormal in real time and
alert the appropriate person(s) to the potential
intrusion(s).

25
Classification: Predicting the Category to Which
a Particular Record Belongs

There are several ways these algorithms can
operate. Commonly they are built on decision
trees or simply on a set of predefined rules that
the system data must satisfy.

26
Classification: Predicting the Category to Which
a Particular Record Belongs

There are many different options available that
employ decision trees. Some of these options
include:
Classification and Regression Trees (CART)
Chi-Square Automatic Interaction Detection
(CHAID)
CART works by inducing two-way splits in a
dataset, causing it to become segmented, whereas
CHAID uses chi-square tests to create splits of
variable size (multi-way splits), also causing
the data to become segmented.
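
To make the two splitting criteria concrete, the sketch below shows a two-way split chosen by Gini impurity, a criterion commonly used with CART, and the Pearson chi-square statistic that CHAID-style tests are based on. It is a simplified illustration, not either algorithm in full; the function names and sample data are assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(values, labels):
    """Return (threshold, weighted_gini) of the best two-way split, or None."""
    best = None
    for threshold in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows).

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical attribute (failed logins) with class labels
failed_logins = [1, 2, 3, 10, 11, 12]
classes = ["normal"] * 3 + ["attack"] * 3
print(best_binary_split(failed_logins, classes))
print(chi_square([[20, 0], [0, 20]]))
```

A larger chi-square value indicates a stronger association between the candidate attribute and the class, which is what makes it usable as a splitting test.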

27
Classification: Predicting the Category to Which
a Particular Record Belongs

Lee and Stolfo conducted several experiments
pertaining to classification methods in their
paper "Data Mining Approaches for Intrusion
Detection". The first of these experiments was
on a set of sendmail system call data. This data
consisted of sendmail traces, with the trace data
consisting of two columns of integers.
The traces contained within the data were
classified as either normal or abnormal. The
normal data consisted of a trace of the sendmail
daemon and a concatenation of several invocations
of the sendmail program, and the abnormal data
was composed of the following attacks:
Three traces of sunsendmailcp (sscp)
Two traces of syslog-remote
Two traces of syslog-local
Two traces of decode
One trace of sm5x
One trace of sm565a

28
Classification: Predicting the Category to Which
a Particular Record Belongs

After this data was obtained, system call
sequences had to be derived and labeled as normal
or abnormal so that they could then be supplied
to RIPPER, the rule learning program that was
used to generate rules that predict whether a
sequence is normal or abnormal. The intrusion
detection system then followed a post-processing
scheme to decide whether the current trace was an
intrusion, using the RIPPER predictions.
The logic here is that when there is an intrusion
on the system, most of the adjacent system call
sequences will be abnormal.
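
The two steps described here, deriving system call sequences with a sliding window and post-processing the per-sequence predictions, might be sketched as follows. The window length, region size, and threshold are illustrative assumptions rather than values taken from this presentation, and the list of predictions simply stands in for the output of a rule learner such as RIPPER.

```python
def sliding_windows(trace, length=7):
    """Derive overlapping system call sequences of a fixed length from a trace."""
    return [tuple(trace[i:i + length]) for i in range(len(trace) - length + 1)]


def abnormal_regions(predictions, region=7):
    """Mark each region abnormal when most of its sequence predictions are abnormal."""
    flags = []
    for i in range(len(predictions) - region + 1):
        window = predictions[i:i + region]
        flags.append(sum(p == "abnormal" for p in window) > region // 2)
    return flags


def is_intrusion(predictions, region=7, threshold=0.02):
    """Call the trace an intrusion if the share of abnormal regions exceeds threshold."""
    flags = abnormal_regions(predictions, region)
    if not flags:
        return False
    return sum(flags) / len(flags) > threshold


# Hypothetical per-sequence predictions for two traces
clustered = ["normal"] * 20 + ["abnormal"] * 7 + ["normal"] * 20
isolated = ["normal"] * 10 + ["abnormal"] + ["normal"] * 10
print(is_intrusion(clustered), is_intrusion(isolated))
```

Requiring a majority within each region filters out isolated mispredictions, so only clusters of adjacent abnormal sequences count toward the decision.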

29
Classification: Predicting the Category to Which
a Particular Record Belongs

From the results it is important to notice that,
generally speaking, intrusion traces create
much larger abnormal regions than normal traces.
Also note that the rules that were generated can
be applied to intrusion traces not included in
the training dataset.
This means that the rules for normal patterns
(experiments A and B) can be used to detect
anomalies.
The rules from experiments C and D, on the other
hand, represent the abnormal sequence patterns.
These rules work very well for detecting
intrusion types seen in the training data, but
perform worse than those from A and B at
detecting intrusions in traces that were not seen
in the training data.
The implication is that the rule set for abnormal
patterns performs well on predictable intrusions,
such as misuse or other repeatable events where
good baseline data is available to generate the
rules, but is unreliable at flagging new types of
intrusions that may occur in the future.

30
Classification: Predicting the Category to Which
a Particular Record Belongs

The next approach that Lee and Stolfo attempted
involved creating an anomaly detection routine
using only normal traces for training data.
Experiments were carried out to determine the
normal correlation between system calls, i.e.,
how well the nth or the middle system call can be
predicted in normal sequences of length n.
Lee and Stolfo stated that improvement in
accuracy can come from adding more features,
rather than just system calls, to the models of
program execution. Items such as the directories
and names of touched files could be used to
generate stronger rules.
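
One way to picture this setup is as a set of training examples in which one call of each length-n sequence is held out as the value to predict. The sketch below builds such examples; the function name, the `target` parameter, and the sample call numbers are assumptions for illustration.

```python
def prediction_examples(trace, n=7, target="last"):
    """Split each length-n window into (context calls, call to predict).

    target="last" holds out the nth call; target="middle" holds out the middle one.
    """
    examples = []
    for i in range(len(trace) - n + 1):
        window = list(trace[i:i + n])
        index = n - 1 if target == "last" else n // 2
        label = window.pop(index)
        examples.append((tuple(window), label))
    return examples


# Hypothetical system call numbers from a normal trace
normal_trace = [4, 2, 66, 66, 4]
print(prediction_examples(normal_trace, n=3, target="last"))
print(prediction_examples(normal_trace, n=3, target="middle"))
```

A model trained on such examples from normal traces can then treat a wrongly predicted call as a deviation from normal behavior.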

31
Classification: Predicting the Category to Which
a Particular Record Belongs

Lee and Stolfo further examined network intrusion
detection by monitoring network traffic directly,
using a packet capturing program, tcpdump, to
collect data.
In conclusion, when the data is not designed
specifically for security purposes (as in this
case), it cannot be used to build a detection
model without a certain amount of modification
(or pre-processing). Because of all the changes
that must be made, one needs substantial
knowledge of the domain being tested, and as such
the process is not easily automated. On the
other hand, it is important to note again that by
adding extra measures, the accuracy of the
classification model can be improved.
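
As a rough illustration of the kind of pre-processing meant here, the sketch below groups already-parsed packet records into connection-level records with a few derived measures. The packet tuple layout and the chosen measures are assumptions; parsing real tcpdump output and handling both directions of a connection are deliberately left out.

```python
def to_connection_records(packets):
    """Summarize packets into one record per (src, sport, dst, dport) key.

    packets: iterable of (timestamp, src, sport, dst, dport, size) tuples.
    This simplified sketch treats each direction as a separate connection.
    """
    records = {}
    for ts, src, sport, dst, dport, size in packets:
        key = (src, sport, dst, dport)
        rec = records.setdefault(
            key, {"start": ts, "end": ts, "packets": 0, "bytes": 0})
        rec["start"] = min(rec["start"], ts)
        rec["end"] = max(rec["end"], ts)
        rec["packets"] += 1
        rec["bytes"] += size
    return [
        {"connection": key, "duration": rec["end"] - rec["start"],
         "packets": rec["packets"], "bytes": rec["bytes"]}
        for key, rec in records.items()
    ]


# Hypothetical parsed packets
packets = [
    (0.0, "10.0.0.5", 40000, "10.0.0.1", 25, 60),
    (0.5, "10.0.0.5", 40000, "10.0.0.1", 25, 500),
    (1.5, "10.0.0.5", 40000, "10.0.0.1", 25, 40),
    (2.0, "10.0.0.9", 40001, "10.0.0.1", 80, 60),
]
for record in to_connection_records(packets):
    print(record)
```

Deciding which measures to derive per connection is where the domain knowledge mentioned above comes in.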