Title: Learning Program Behavior for Intrusion Detection
1. Learning Program Behavior for Intrusion Detection
- Yihua Liao
- Advisor: Prof. Rao Vemuri
- Feb. 20, 2002
2. Outline
- Machine learning and Computer Security
- Related work on modeling program behavior
- K-Nearest Neighbor classifier and text categorization
- Experiments with DARPA BSM data
- Neural Networks
- Conclusions
3. Machine Learning and Security
- To improve automatically with experience
- Good at learning user, system, or network behavior, extracting usage patterns and rules, and classifying new instances.
- Security-related applications: access control, malicious code, misuse and anomaly detection, etc.
4. Model user and program behavior
- User behavior
- Insider threats
- UNIX shell commands, login events, etc.
- Concept drift, privacy issues
- Program behavior
- Intrusions often occur when programs are misused.
- Program profiles provide concise and stable tracks.
5. Modeling program behavior
- Program policy specification (Ko et al. 1994)
- - Determine intended behavior; write security specifications for monitored programs.
- Static analysis (Wagner et al. 2001)
- - Use NDPDA to find possible system call sequences from source code, and check for compliance at runtime.
- - Challenges: dynamic linking, threads, large overhead
6. Modeling program behavior (cont.)
- Learn program behavior profiles from previous executions (Forrest, Lee, Ghosh, etc.)
- Short sequences of system calls
- Profiles for individual programs
- Time-consuming training and testing process
7. Short sequences of system calls
- open read mmap mmap open close
- Unique sequences for window size 3
- open read mmap
- read mmap mmap
- mmap mmap open
- mmap open close
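As a quick illustration, a minimal Python sketch (the helper name and data layout are assumptions, not part of the original work) of extracting the unique length-3 sequences with a sliding window:

```python
def unique_sequences(calls, window=3):
    """Slide a fixed-size window over a system-call trace and collect
    the distinct subsequences in order of first appearance."""
    seen, result = set(), []
    for i in range(len(calls) - window + 1):
        seq = tuple(calls[i:i + window])
        if seq not in seen:
            seen.add(seq)
            result.append(seq)
    return result

trace = ["open", "read", "mmap", "mmap", "open", "close"]
for seq in unique_sequences(trace):
    print(" ".join(seq))
# open read mmap
# read mmap mmap
# mmap mmap open
# mmap open close
```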
8. Analogy
- word ↔ system call
- text document ↔ list of system calls issued by a program
- different categories ↔ normal / intrusive
9. Text categorization
- Transform each document into a vector
- open read mmap mmap open close
- open: 2, read: 1, mmap: 2, close: 1
- Frequency of word i in document k: f_ik
- Word-by-document matrix A = (a_ik)
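A small sketch of how the frequency vectors and the word-by-document matrix A could be built (hypothetical helper, assuming each process is already reduced to a list of system-call names):

```python
from collections import Counter

def frequency_matrix(processes, vocabulary):
    """Build the word-by-document matrix A, where entry (i, k) is the
    number of times system call i occurs in process (document) k."""
    counts = [Counter(p) for p in processes]
    return [[c[word] for c in counts] for word in vocabulary]

processes = [["open", "read", "mmap", "mmap", "open", "close"]]
vocab = ["open", "read", "mmap", "close"]
print(frequency_matrix(processes, vocab))   # [[2], [1], [2], [1]]
```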
10. Weighting techniques
- Frequency weighting: a_ik = f_ik
- Term frequency / inverse document frequency weighting (tf-idf): a_ik = f_ik × log(N / N_i)
- N_i: number of documents for which the word occurs at least once (N: total number of documents)
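A sketch of tf-idf weighting, assuming the standard form a_ik = f_ik × log(N / N_i) is what the slide intends (the helper name is hypothetical, building on the frequency-vector idea above):

```python
import math
from collections import Counter

def tfidf_matrix(processes, vocabulary):
    """Weight each raw frequency f_ik by log(N / N_i): N is the number
    of documents, N_i the number of documents containing word i."""
    n_docs = len(processes)
    counts = [Counter(p) for p in processes]
    matrix = []
    for word in vocabulary:
        n_i = sum(1 for c in counts if c[word] > 0)
        idf = math.log(n_docs / n_i) if n_i else 0.0
        matrix.append([c[word] * idf for c in counts])
    return matrix
```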
11. K-Nearest Neighbor classifier
- Use the class labels of the k most similar neighbors to predict the class of a new document.
- A cutoff threshold is needed.
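A minimal sketch of the decision rule, assuming cosine similarity between process vectors and a cutoff on the average similarity to the k nearest normal training vectors; the exact similarity measure and thresholding scheme are assumptions here, not taken from the slides:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def is_anomalous(new_vec, normal_vecs, k=5, threshold=0.8):
    """Flag a new process if its average similarity to the k most
    similar normal training vectors falls below the cutoff."""
    sims = sorted((cosine(new_vec, v) for v in normal_vecs), reverse=True)
    return sum(sims[:k]) / k < threshold
```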
12. Advantages
- Limited system-call vocabulary; no dimension-reduction techniques needed.
- Simple binary categorization problem
- kNN doesn't rely on prior knowledge and is computationally efficient.
13. Experiments
- Data set: 1998 DARPA BSM data
- Provides a large sample of network-based attacks embedded in normal background traffic.
- TCPDUMP and BSM audit data collected on a simulated network.
14. (Figure slide; no transcript)
15. BSM events
- header,118,2,open(2) - read,,Mon Jun 01 08:12:17 1998, 925767180 msec
- path,/usr/lib/libdl.so.1
- attribute,100755,bin,bin,8388614,96882,0
- subject,2104,root,100,2104,100,501,431,24 1 135.8.60.182
- return,success,4
- trailer,118
16. More on data set
- DARPA data was labeled with sessions; 400-500 sessions per day.
- Individual sessions can be extracted from the logs.
- Each session consists of one or more processes.
- Generate a list of system calls for every process; 50 distinct system calls (see the sketch below).
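A rough sketch of turning a praudit-style text log (as on the BSM events slide) into one system-call list per process; the field positions and helper name are assumptions for illustration only:

```python
from collections import defaultdict

def syscalls_by_process(lines):
    """Group system-call names by process id, one BSM record at a time.
    Assumes the event name is the 4th field of the header token and the
    pid is one of the subject-token fields (position assumed)."""
    processes = defaultdict(list)
    current_call = None
    for line in lines:
        fields = line.strip().split(",")
        if fields[0] == "header":
            current_call = fields[3].split("(")[0]   # "open(2) - read" -> "open"
        elif fields[0] == "subject" and current_call:
            pid = fields[6]                          # assumed pid position
            processes[pid].append(current_call)
        elif fields[0] == "trailer":
            current_call = None
    return processes
```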
17. Process id 994: close execve open mmap open mmap mmap munmap mmap mmap close open mmap close open mmap mmap munmap mmap close close munmap open ioctl access chown ioctl access chmod close close close close close exit
18. (Figure slide; no transcript)
19. Training and testing data
- Training data
- - 606 distinct process vectors from 4 simulation days; 50 distinct system calls.
- Testing data
- - 35 attack sessions, including U-2-R, R-2-U, probe, and several intrusion scenarios.
- - 5285 normal processes from one simulation day.
20. Result (figure; no transcript)
21. Result (figure; no transcript)
22. Result (figure; no transcript)
23. Remarks on kNN
- Suitable for dynamic environments
- Instance-based learning (lazy learning): computation deferred to query time
- All attributes are used
- Efficient memory indexing
24. Neural networks
- Feed-forward multi-layer network with the backpropagation algorithm.
- Randomly generated data are used to train the network; all data are anomalous by default.
- Normal data cause the network to recognize a particular area of the input space as normal (a rough sketch follows below).
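A rough sketch of this training scheme using a small scikit-learn MLP; the layer sizes, amount of random data, and sampling range are illustrative assumptions rather than the original configuration:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_anomaly_net(normal_vectors, n_random=2000, seed=0):
    """Feed-forward network trained with backpropagation: randomly
    generated vectors are labeled anomalous (1), observed normal
    vectors are labeled normal (0)."""
    rng = np.random.default_rng(seed)
    normal = np.asarray(normal_vectors, dtype=float)
    random_data = rng.uniform(0.0, normal.max(), size=(n_random, normal.shape[1]))
    X = np.vstack([normal, random_data])
    y = np.concatenate([np.zeros(len(normal)), np.ones(n_random)])
    net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=seed)
    net.fit(X, y)
    return net
```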
25. Summary
- Frequencies of system calls are used to characterize program behavior.
- The K-Nearest Neighbor classifier can effectively detect intrusive program behavior; no individual program profiles are needed, and a low false positive rate can be achieved.
- kNN is suitable for dynamic environments and real-time intrusion detection.
26. Reference
- http://wwwcsif.cs.ucdavis.edu/liaoy/knn_ss02.pdf (old data analysis)