Title: An Enhanced Support Vector Machine Model for Intrusion Detection
1. An Enhanced Support Vector Machine Model for Intrusion Detection
- J. T. Yao, S. L. Zhao, L. Fan
- Department of Computer Science
- University of Regina
- jtyao_at_cs.uregina.ca
2. Intrusion Detection Systems
- Intrusion detection: the art of detecting inappropriate, incorrect, or anomalous activity.
- Intrusion detection systems: a set of processes, procedures, tools, software, hardware and databases with intrusion detection technologies that fit together.
- A misuse attack originates from the internal network.
- An intrusion attack comes from outside.
3. IDS Functional Components
- Information Source
- Provides a stream of event records.
- Analysis Engine
- Analyzes event records and detects intrusion.
- Decision Maker
- Decides the reactions for intrusions.
4. Detection Methods
- Misuse detection
  - Detects intrusions by matching known intrusion signatures.
- Anomaly detection
  - Mines normal event patterns from event records, then uses these patterns to classify events as normal or intrusive.
5. Candidate AI Techniques
- Expert Systems
- Hidden Markov Model
- Fuzzy logic
- Classification
- Support Vector Machines (SVM)
6. Support Vector Machines
- A machine learning method based on statistical learning theory.
- Classifies data by a set of support vectors that represent data patterns.
- Finds a discriminant function that classifies new data.
7. Benefits of Using SVM
- Good generalization ability
- Capability of handling a large number of features
8. Problems of SVM for IDS
- All features are treated equally.
- Noisy features (some features cause noise during classification).
- Redundant features.
- A large number of features hurts the performance of
  - the training process
  - the detection process
9. Thoughts on a Solution
- Reduce the number of features while keeping the useful information.
- Calculate the importance of features.
- Treat features differently based on their importance.
10. An Enhanced SVM Model
- Using Rough Set to calculate reducts
- Calculate feature weights from reducts
- Remove redundant features based on weights
- Apply weights to SVM kernel
11. Calculate Weights from Reducts
- The principles of the calculation are:
  - If a feature is not in any reduct, its weight is 0.
  - The more times a feature appears in reducts, the more important the feature is.
  - The fewer features a reduct contains, the more important those features are.
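The slide states the principles but not an exact formula, so the scheme below is an assumption: credit each feature 1/|R| for every reduct R that contains it, then normalize. All three principles are satisfied, since absent features stay at 0, more appearances raise the weight, and smaller reducts contribute more per feature.

```python
# A minimal sketch of a weighting scheme consistent with the three
# principles above; the authors' exact formula may differ.
def reduct_weights(reducts, n_features):
    """Weight feature j by the sum of 1/|R| over all reducts R containing j."""
    raw = [0.0] * n_features
    for reduct in reducts:
        for j in reduct:
            raw[j] += 1.0 / len(reduct)   # smaller reducts count for more
    total = sum(raw)
    # Normalize to sum to 1; features absent from every reduct stay at 0.
    return [v / total for v in raw] if total else raw

# Hypothetical reducts over 5 features (feature 1 is in no reduct):
weights = reduct_weights([{0, 2}, {0, 3, 4}], 5)
```

Here feature 0 appears in both reducts and ends up heaviest, feature 2 (in one small reduct) outweighs features 3 and 4 (in one larger reduct), and feature 1 gets weight 0.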
12. Apply Weights to the Kernel Function
- The decision function obtained from SVM training is
  f(x) = sgn( Σ_{i=1}^{l} α_i y_i K(x_i, x) + b )
  where l is the number of training records, the α_i are the Lagrange multipliers (0 ≤ α_i ≤ C for a constant C), y_i is the label associated with training datum x_i, K is the kernel function, the x_i with α_i > 0 are called the set of Support Vectors, and b is a bias term.
- The weight matrix W is a diagonal matrix; the enhanced kernel evaluates K(W x_i, W x).
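As a concrete sketch (illustrative values, not the authors' code), applying a diagonal weight matrix W to an RBF kernel amounts to evaluating the ordinary kernel on W-scaled inputs:

```python
import numpy as np

# Weighted RBF kernel K(Wx, Wz) = exp(-gamma * ||W(x - z)||^2), with the
# diagonal matrix W represented by the weight vector w. gamma is illustrative.
def weighted_rbf(x, z, w, gamma=0.5):
    d = w * (x - z)                    # elementwise product, i.e. W x - W z
    return np.exp(-gamma * np.dot(d, d))

x = np.array([1.0, 5.0, 2.0])
z = np.array([1.0, -3.0, 2.0])
w = np.array([0.7, 0.0, 0.3])          # weight 0 removes feature 1 entirely
k = weighted_rbf(x, z, w)              # x and z differ only in feature 1
print(k)                               # -> 1.0
```

A zero weight makes the corresponding feature invisible to the kernel: x and z differ only in the zero-weighted feature, so the kernel treats them as identical.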
13. Weights Are Independent of the Kernel Function
- The weights can be applied to any known kernel function.
- Restricting the weights to W > 0 makes sure the enhanced kernel function satisfies Mercer's condition.
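Mercer's condition can be spot-checked numerically: on any finite sample, the Gram matrix of a valid kernel must be symmetric positive semi-definite. A minimal sketch for a weighted RBF kernel, with illustrative weights and γ:

```python
import numpy as np

# Numerical spot-check of Mercer's condition: the Gram matrix of a valid
# kernel on any finite point set is symmetric positive semi-definite.
# Weighted RBF with nonnegative weights; all values are illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))             # 20 sample points, 4 features
w = np.array([0.5, 0.0, 1.5, 0.25])      # nonnegative feature weights
gamma = 0.5

Xw = X * w                                # scale each feature by its weight
sq_dist = ((Xw[:, None, :] - Xw[None, :, :]) ** 2).sum(axis=-1)
gram = np.exp(-gamma * sq_dist)           # pairwise weighted RBF values

min_eig = np.linalg.eigvalsh(gram).min()  # symmetric -> real spectrum
print(min_eig >= -1e-10)                  # -> True (PSD up to rounding)
```

This is only a sanity check on one sample, not a proof; the general argument is that scaling inputs by a fixed W keeps the RBF kernel a valid Mercer kernel.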
14. Experiment Procedures
15. Experiment Data Set
- KDD (Knowledge Discovery in Databases) Cup 1999 data set.
- Feature-value format.
- 41 features per record.
- The original data set contains 744 MB of data with 4,940,000 connection records.
16. Experiment Data Set 2
- UNM (University of New Mexico) data set.
- Sequence-based.
- A trace is generated each time a user accesses a certain UNIX process.
17. KDD training results of the conventional SVM with different values of γ (Table 1)

Training Result     Exp1     Exp2     Exp3
Training records    50,000   50,000   50,000
Features            41       41       41
Kernel type         RBF      RBF      RBF
Value of γ
Generated SVs       6,948    1,868    1,057
18. KDD test results of the conventional SVM with different values of γ (Table 2)

Test Result           Exp1     Exp2     Exp3
Test records          10,000   10,000   10,000
Features              41       41       41
Value of γ
No. misclassified     44       63       211
Accuracy (%)          99.56    99.37    97.89
False positives       37       52       176
False negatives       7        11       35
CPU seconds           49.53    11.34    8.32
19. Comparison of the experimental results on the KDD data set (Table 3)

Test set 1          Records   Features  Accuracy (%)  False negative  CPU seconds
Conventional SVM    10,000    41        99.82         7.69            222.28
Enhanced SVM        10,000    16        99.86         6.39            77.63
Improvement (%)               60.0      0.4           16.9            66.0

Test set 2          Records   Features  Accuracy (%)  False negative  CPU seconds
Conventional SVM    10,000    41        99.80         8.25            227.03
Enhanced SVM        10,000    16        99.85         6.91            78.93
Improvement (%)               60.0      0.5           16.2            65.0

Test set 3          Records   Features  Accuracy (%)  False negative  CPU seconds
Conventional SVM    10,000    41        99.88         7.45            230.27
Enhanced SVM        10,000    16        99.91         5.49            77.85
Improvement (%)               60.0      0.3           26.3            66.0
20. Comparison of the experimental results on the UNM lpr data set (Table 4)

Test set 1          Records  Features  Accuracy (%)  False negative  CPU seconds
Conventional SVM    2,000    467       100           0               1.62
Enhanced SVM        2,000    9         100           0               0.28
Improvement (%)              98                                      83

Test set 2          Records  Features  Accuracy (%)  False negative  CPU seconds
Conventional SVM    2,000    467       100           0               1.71
Enhanced SVM        2,000    9         100           0               0.29
Improvement (%)              98                                      83

Test set 3          Records  Features  Accuracy (%)  False negative  CPU seconds
Conventional SVM    2,000    467       100           0               1.59
Enhanced SVM        2,000    9         100           0               0.25
Improvement (%)              98                                      84
21. Experiment Results
- A larger value of γ results in a larger number of generated Support Vectors.
- A larger number of SVs results in higher detection accuracy but also higher computation cost.
- The improvement of the enhanced SVM is consistent across all six test sets.
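The first two observations can be reproduced in miniature with scikit-learn on synthetic 2-D data (not the KDD set; all values here are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Reproduce the qualitative trend above on synthetic data: a larger RBF
# gamma makes the kernel more local, so the trained SVM keeps more
# support vectors (and costs more to evaluate at detection time).
X, y = make_classification(n_samples=600, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

sv_counts = {}
for gamma in (0.01, 1.0, 100.0):
    clf = SVC(kernel="rbf", gamma=gamma).fit(X, y)
    sv_counts[gamma] = int(clf.n_support_.sum())

print(sv_counts)   # SV count rises sharply at the largest gamma
```

Since every support vector contributes one kernel evaluation per test record, a larger SV set directly raises detection-time cost, matching the CPU-seconds column of Table 2.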
22. Experiment Results 2
- The enhanced SVM outperforms the conventional SVM in precision, false negative rate and CPU time on the KDD data set.
- The enhanced SVM is more than 80% faster on the lpr data set.
23. Experiment Results 3
- Although generated from a small training set, the decision boundary is consistent with the whole data set.
- The test results show little difference between the small and full-size training sets, which demonstrates the good generalization ability of SVM.
24. Conclusion
- An enhanced SVM model is introduced.
- Features are reduced and weighted.
- It has good generalization ability.
- It has better performance in two experiments.