Title: A Statistical Anomaly Detection Technique based on Three Different Network Features
1 A Statistical Anomaly Detection Technique Based on Three Different Network Features
Yuji Waizumi, Tohoku Univ.
2 Background
- The Internet has entered the business world
- Need to protect information and systems from hackers and attacks
- Network security has become an important issue
- Many intrusion/attack detection methods have been proposed
3 Intrusion Detection System
- Two major detection principles
  - Signature Detection
    - Attempts to flag behavior that is close to some previously defined pattern signature of a known intrusion
  - Anomaly Detection
    - Attempts to quantify the usual or acceptable behavior and flags other irregular behavior as potentially intrusive
4 Motivation
- Anomaly detection system
  - Pro: can detect unknown attacks
  - Con: many false positives
- Improve the performance of anomaly detection systems
  - Analyze the characteristics of attacks
  - Propose a method to construct features as numerical values from network traffic
  - Construct a detection system using the features
5 Classification of Attacks
- DARPA Intrusion Detection Evaluation
  - DoS: Denial of Service
  - Probe: Surveillance of Targets
  - Remote to Local (R2L), User to Root (U2R)
    - Unauthorized Access to a Host or Super User
6 Re-classification of Attacks
- Classification by Traffic Characteristics
  - DoS, Probe
    - Traffic Quantity
    - Access Range
  - Probe
    - Structure of Communication Flows
  - DoS, R2L, U2R
    - Contents of Communications
To detect attacks with the above characteristics, it is necessary to construct features corresponding to those classes.
7 Network Traffic Feature
- Numerical values (vectors) expressing the state of traffic
- We propose three different network feature sets
  - Based on the re-classification of attacks
  - Analyzed independently
8 Time Slot Feature (34 dimensions)
- Count various packets, flags, transmitted and received bytes, and port variety per unit time
- Estimate the scale and range of attacks
- Target
  - Probe (Scan)
  - DoS
- Each slot is expressed as a vector
  - Ex) (TCP, ICMP, SYN, FIN, RST, UDP, DNS, ...)
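The per-slot counting can be sketched as follows. This is only a minimal illustration: the packet dictionaries with keys `time`, `proto`, `flags`, `length`, and `dst_port` are hypothetical, the 60-second slot width is an assumption (the slides do not state the unit time), and only a handful of the 34 elements are shown.

```python
from collections import defaultdict

SLOT_SECONDS = 60  # assumed slot width; not given in the slides


def time_slot_vectors(packets, slot_seconds=SLOT_SECONDS):
    """Aggregate packets into one count vector per time slot."""
    slots = defaultdict(lambda: {"tcp": 0, "udp": 0, "icmp": 0,
                                 "syn": 0, "fin": 0, "rst": 0,
                                 "bytes": 0, "ports": set()})
    for pkt in packets:
        slot = slots[int(pkt["time"] // slot_seconds)]
        if pkt["proto"] in ("tcp", "udp", "icmp"):
            slot[pkt["proto"]] += 1
        for flag in pkt.get("flags", ()):
            name = flag.lower()
            if name in ("syn", "fin", "rst"):
                slot[name] += 1
        slot["bytes"] += pkt["length"]
        if "dst_port" in pkt:
            slot["ports"].add(pkt["dst_port"])
    order = ("tcp", "udp", "icmp", "syn", "fin", "rst", "bytes")
    # The last element is the port variety: distinct destination ports seen.
    return {idx: [s[name] for name in order] + [len(s["ports"])]
            for idx, s in sorted(slots.items())}


packets = [
    {"time": 0.5, "proto": "tcp", "flags": ["SYN"], "length": 60, "dst_port": 80},
    {"time": 1.2, "proto": "tcp", "flags": ["SYN"], "length": 60, "dst_port": 81},
    {"time": 61.0, "proto": "udp", "length": 120, "dst_port": 53},
]
vectors = time_slot_vectors(packets)
```

A scan that touches many ports within one slot would show up as a large value in the last element, which is the kind of scale and range information this feature is meant to capture.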
9 Examples (Time Slot Feature)
[Figure: element value versus vector element]
Values are normalized to mean 0 and variance 1.0 (normal traffic only)
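A sketch of the mean 0, variance 1.0 scaling, under the assumption that the mean and variance of each element are estimated from normal traffic and then applied to every slot. The function names and the sample numbers are illustrative, not taken from the slides.

```python
from statistics import mean, pstdev


def fit_scaler(normal_vectors):
    """Return per-element (mean, std) estimated from normal traffic."""
    columns = list(zip(*normal_vectors))
    # A constant element has zero deviation; fall back to 1.0 to avoid
    # dividing by zero.
    return [(mean(col), pstdev(col) or 1.0) for col in columns]


def standardize(vector, scaler):
    """Scale one feature vector with the statistics of normal traffic."""
    return [(x - mu) / sigma for x, (mu, sigma) in zip(vector, scaler)]


normal = [[10, 2], [12, 2], [14, 2]]
scaler = fit_scaler(normal)
scaled_normal = [standardize(v, scaler) for v in normal]
scaled_attack = standardize([40, 2], scaler)
```

After scaling, elements of normal traffic stay near zero, so an element far from zero stands out regardless of its original unit.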
10 Flow Counting Feature
- A flow is specified by
  - (srcIP, dstIP, srcPort, dstPort, protocol)
- Count packets, flags, transmitted and received bytes in a flow
- Target
  - Scan with illegal flags
  - Ports used as backdoors
- TCP: 19 dimensions, UDP: 7 dimensions
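Grouping packets into flows by the 5-tuple might look like the sketch below. It assumes that the sender of the first packet defines the transmit direction and that replies (the reversed tuple) belong to the same flow; the packet keys and the small set of counters are hypothetical and far fewer than the 19 TCP dimensions.

```python
def flow_counts(packets):
    """Count packets, bytes per direction, and TCP flags for each flow."""
    flows = {}
    for pkt in packets:
        fwd = (pkt["src_ip"], pkt["dst_ip"],
               pkt["src_port"], pkt["dst_port"], pkt["proto"])
        rev = (pkt["dst_ip"], pkt["src_ip"],
               pkt["dst_port"], pkt["src_port"], pkt["proto"])
        # A packet whose reversed tuple is already known is a reply.
        if rev in flows:
            key, direction = rev, "rx"
        else:
            key, direction = fwd, "tx"
        stats = flows.setdefault(key, {"packets": 0, "tx_bytes": 0,
                                       "rx_bytes": 0, "syn": 0,
                                       "fin": 0, "rst": 0})
        stats["packets"] += 1
        stats[direction + "_bytes"] += pkt["length"]
        for flag in pkt.get("flags", ()):
            name = flag.lower()
            if name in ("syn", "fin", "rst"):
                stats[name] += 1
    return flows


packets = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 40000,
     "dst_port": 80, "proto": "tcp", "flags": ["SYN"], "length": 60},
    {"src_ip": "10.0.0.2", "dst_ip": "10.0.0.1", "src_port": 80,
     "dst_port": 40000, "proto": "tcp", "flags": ["SYN", "ACK"], "length": 60},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.2", "src_port": 40000,
     "dst_port": 80, "proto": "tcp", "flags": ["FIN"], "length": 40},
]
flows = flow_counts(packets)
```

A scan with illegal flags would produce flows whose flag counters differ sharply from those of an ordinary connection.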
11 Examples (Flow Counting Feature)
Counts of specific packets in attacks are extremely high or low.
[Figure: element value versus vector element, compared with normal traffic]
12 Flow Payload Feature
- Represent the content of communication
- Histogram of character codes of a flow
  - Count in 8-bit units (256 classes)
  - Transmission and reception are counted independently (512 classes in total)
- Target
  - Buffer overflow
  - Malicious code
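The 512-class histogram can be sketched as below. The layout (transmitted bytes in the first 256 bins, received bytes in the last 256) and the sample payloads are assumptions chosen for illustration.

```python
def payload_histogram(sent, received):
    """Return 512 counts: 256 byte values for each direction of a flow."""
    hist = [0] * 512
    for byte in sent:          # iterating over bytes yields integers 0..255
        hist[byte] += 1
    for byte in received:
        hist[256 + byte] += 1  # received bytes use the second half
    return hist


hist = payload_histogram(b"GET / HTTP/1.0\r\n", b"\x90" * 8 + b"OK")
```

A long run of a single byte value, such as padding in a buffer overflow, concentrates the count in one bin.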
13 Examples (Flow Payload Feature)
Counts of specific characters in attacks are extremely high or low.
[Figure: payload histograms of normal traffic and an imap attack]
14 Modeling Normal Behavior
- Each packet appears according to its protocol
- This produces correlations between elements of the feature vectors
- A profile based on these correlations can represent the normal behavior of network traffic
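As a small illustration of such a correlation, the sketch below computes the Pearson correlation between two elements over several slots. The SYN and FIN counts are made-up numbers, chosen on the assumption that ordinary connections open and close in roughly equal numbers.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


syn_counts = [10, 20, 30, 40]  # illustrative values, not measured data
fin_counts = [9, 21, 29, 41]
r = pearson(syn_counts, fin_counts)
```

A slot with many SYN packets but almost no FIN packets would break this relationship even if each count alone looked unremarkable.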
15 Principal Component Analysis (PCA)
- Extract the correlation among samples as a principal component
- The principal component lies along the sample distribution
[Figure: non-correlated data and the principal component]
16 Discriminant Function
- Distant samples
  - Unusual traffic
  - Break the correlation
[Figure: detection criterion]
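The slides do not give the exact discriminant function. One common reading is sketched here as an assumption: fit the first principal axis to normal samples, then score a sample by its distance from that axis, so that samples breaking the correlation get a large score. The power iteration, the function names, and the threshold rule are all illustrative choices.

```python
import math


def fit_principal_axis(samples, iterations=200):
    """Return (mean, unit vector of the first principal component)."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    centered = [[s[j] - mean[j] for j in range(d)] for s in samples]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    # Power iteration; it assumes the start vector is not orthogonal
    # to the first principal component.
    axis = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * axis[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        axis = [x / norm for x in nxt]
    return mean, axis


def residual_distance(sample, mean, axis):
    """Distance from a sample to the principal axis through the mean."""
    centered = [x - m for x, m in zip(sample, mean)]
    proj = sum(c * a for c, a in zip(centered, axis))
    residual = [c - proj * a for c, a in zip(centered, axis)]
    return math.sqrt(sum(r * r for r in residual))


normal = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
mean, axis = fit_principal_axis(normal)
# Assumed rule: anything farther than the worst normal sample is anomalous.
threshold = max(residual_distance(s, mean, axis) for s in normal)
is_anomalous = residual_distance([4.0, 2.0], mean, axis) > threshold
```

In this toy data the second element is about twice the first, so the sample (4.0, 2.0) lies far from the axis although both of its values are within the normal range.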
17 Detection Algorithm
- Independent Detection
  - The three features are used for PCA independently
  - "Logical OR" operation for detection alerts by each feature
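[Figure: Network Traffic is converted into three features (Time Slot, Flow Counting, Flow Payload); each feature is analyzed by its own PCA and raises its own alert; the three alerts are combined by OR into the final alert]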
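The OR combination of the three detectors can be sketched as below. The score and threshold values, and the dictionary keys, are hypothetical; each score stands for the output of the PCA-based detector of one feature.

```python
def combined_alert(scores, thresholds):
    """Alert if any single feature exceeds its own threshold (logical OR)."""
    alerts = {name: scores[name] > thresholds[name] for name in thresholds}
    return any(alerts.values()), alerts


thresholds = {"time_slot": 1.0, "flow_counting": 1.0, "flow_payload": 1.0}
scores = {"time_slot": 0.4, "flow_counting": 0.2, "flow_payload": 3.1}
alert, per_feature = combined_alert(scores, thresholds)
```

Keeping the per-feature results alongside the combined alert makes it possible to see which feature detected a given attack.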
18 Performance Evaluation
- Two Evaluation Scenarios
  - Scenario 1
    - Learn: Weeks 1 and 3
    - Test: Weeks 4 and 5
  - Scenario 2
    - Learn: Weeks 4 and 5
    - Test: Weeks 4 and 5
    - More practical situation
      - Real network traffic may include attack traffic
- Criterion for Evaluation
  - Detection rate when the number of false detections (false positives) per day is 10
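The criterion can be sketched as choosing the lowest threshold that keeps false positives within the daily budget and then measuring the detection rate. This assumes one anomaly score per normal event and per attack instance, which simplifies how the DARPA evaluation counts detections; the function name and sample scores are illustrative.

```python
def detection_rate_at_fp_budget(normal_scores, attack_scores, days, fp_per_day=10):
    """Return (threshold, detection rate) under a false positive budget."""
    budget = int(fp_per_day * days)
    ranked = sorted(normal_scores, reverse=True)
    # Only scores strictly above the threshold raise alerts, so using the
    # (budget + 1)-th largest normal score allows at most `budget` false
    # positives.
    threshold = ranked[budget] if budget < len(ranked) else float("-inf")
    detected = sum(score > threshold for score in attack_scores)
    return threshold, detected / len(attack_scores)


normal_scores = [0.1, 0.2, 0.3, 0.9, 1.5]
attack_scores = [0.5, 2.0, 1.0, 0.25]
threshold, rate = detection_rate_at_fp_budget(
    normal_scores, attack_scores, days=1, fp_per_day=2)
```

Fixing the false positive budget first lets methods with different score scales be compared on detection rate alone.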
19 Data Set
- Data Set
  - 1999 DARPA off-line intrusion detection evaluation test set
  - Contains 5 weeks of data (Monday to Friday)
    - Weeks 1, 3: Normal traffic only
    - Week 2: Including attacks (for learning)
    - Weeks 4, 5: Including attacks (for testing)
20 Scenario 1 Result
Method            # of detections   # of targets   Detection rate (%)
Proposed Method   104               171            60.8
NETAD             132               185            71.4
Forensics         15                27             55.6
Expert1           85                169            50.3
Expert2           81                173            46.8
Dmine             41                102            40.2
Year labels on the slide: 2003, 2000
21 Scenario 2 Result
Method            # of detections   # of targets   Detection rate (%)
Proposed Method   100               171            58.5
NETAD             70                185            37.8
- NETAD
  - Uses IP addresses as a white list
  - Overfits the learning data
- Proposed Method
  - Independent of IP addresses
  - Evaluates only the anomaly of traffic
22 Detection Results for Each Feature
- Low detection overlap
- Each feature detects attacks with different characteristics
[Figure: Venn diagrams of the number of detections by Time Slot Feature (TS), Flow Counting Feature (FC), and Flow Payload Feature (FP) for Scenario 1 and Scenario 2, with regions for detections by one feature only (e.g. FP only), by two features (e.g. both TS and FP), and by all three features]
Scenario 1 region counts as listed: 5, 9, 22, 6, 13, 44, 5
Scenario 2 region counts as listed: 2, 7, 37, 3, 8, 40, 3
23 Conclusion
- For network security
  - Classify attacks into three types
  - Construct three features corresponding to the three attack characteristics
- Detection method with PCA
  - Learning the three features independently
  - Higher detection accuracy
    - With samples including attacks