A Survey of Host Based Intrusion Detection Systems (HIDS) presentation

1
A Survey of Host Based Intrusion Detection
Systems (HIDS)

Emre Can Sezer
Dept. of Comp. Science
North Carolina State University

2
Outline

Introduction and Motivation
Model Creation Techniques
Sampling of Models
N-gram model
Callgraph, Abstract Stack models
Impossible Path Exploit
Brief overview of VtPath, Dyck and VPStatic
Data attacks
Conclusion

3
Introduction

Terminology
IDS: Intrusion detection system
IPS: Intrusion prevention system
HIDS/NIDS: Host/Network Based IDS
Anomaly vs. Intrusion Detection
Anomaly detection also captures misuse
There is no intrusion; however, due to bad
programming or administration, the process behaves
differently than normal (e.g. a bug in the code)
Intrusions are also anomalies
Difference between IDS and IPS
Detection happens after the attack is conducted
(e.g. the memory is already corrupted due to a
buffer overflow attack)
Prevention stops the attack before it reaches the
system (e.g. shield does packet filtering)

4
Introduction (Cont.)

Idea behind HIDS
Define normal behavior for a process
Create a model that captures the behavior of a
program during normal execution.
Monitor the process
Raise a flag if the program behaves abnormally

5
Why System Calls? (Motivation)

The program is a layer between user inputs and
the operating system
A compromised program cannot cause significant
damage to the underlying system without using
system calls
e.g. creating a new process, accessing a file, etc.

6
Model Creation Techniques

Models are created using two different methods
Training: The program's behavior is captured
during a training period in which there are
assumed to be no attacks. Another way is to craft
synthetic inputs to simulate normal operation.
Static analysis: The information required by the
model is extracted either from source code or
binary code by means of static analysis.
Training is easy; however, the model may miss
some of the normal behavior and therefore produce
false positives.
Static analysis based models produce no false
positives, yet dynamic libraries and source code
availability pose problems.

7
Definitions for Model Analysis

If a model is training based, it is possible that
not every normal sequence is in the database.
This results in some normal sequences being
flagged as intrusions. This is called a false
positive.
If a model fails to flag an intrusion, this is
called a false negative.
Accuracy: An accurate model has few or no false
positives.
Completeness: A complete model has no false
negatives.
Convergence Rate: The amount of training required
for the model to reach a certain accuracy

8
A Visual Description of False Positives and
Completeness
[Figure: regions labeled "Normal Behavior" and "Model".]

9
A Visual Description of False Positives and
Completeness
[Figure: regions labeled "Normal Behavior", "False Positives" and "Model".]

10
A Visual Description of False Positives and
Completeness
[Figure: regions labeled "Normal Behavior", "Model" and "False Negatives".]

11
N-Gram

Pioneering work in the field.
Forrest et al., A Sense of Self for Unix
Processes, 1996.
Tries to define a normal behavior for a process
by using sequences of system calls.
As the title of their paper implies, they show
that short, fixed-length sequences of system calls
distinguish one application from another.
For every application a model is constructed and
at runtime the process is monitored for
compliance with the model.
Definition: The list of system calls issued by a
program for the duration of its execution is
called a system call trace.

12
N-Gram: Building the Model by Training

Slide a window of length N over a given system
call trace and extract unique sequences of system
calls.

[Figure: example showing a system call trace, the unique sequences extracted from it, and the resulting database.]
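
The sliding-window step can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function name build_ngram_database and the sample trace are hypothetical.

```python
def build_ngram_database(trace, n):
    """Slide a window of length n over the trace and keep each unique window."""
    return {tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)}


# Hypothetical training trace, chosen only for illustration.
trace = ["open", "read", "mmap", "mmap", "open", "read", "close"]
database = build_ngram_database(trace, 3)

for sequence in sorted(database):
    print(sequence)
```

A set is used so that repeated windows are stored once, which matches the "unique sequences" wording above.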
13
N-Gram Monitoring

Monitoring
A window is slid across the system calls as the
program issues them, and each sequence is
looked up in the database.
If the sequence is in the database then the
issued system call is valid.
If not, then the system call sequence is either
an intrusion or a normal operation that was not
observed during training (a false positive).
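
A minimal sketch of the monitoring loop, assuming the database is a set of tuples of length N. The function name monitor, the sample database and both traces are hypothetical and only illustrate the lookup described above.

```python
from collections import deque


def monitor(stream, database, n):
    """Return (position, window) for every length-n window missing from the database."""
    window = deque(maxlen=n)  # keeps only the last n system calls
    anomalies = []
    for position, call in enumerate(stream):
        window.append(call)
        if len(window) == n and tuple(window) not in database:
            anomalies.append((position, tuple(window)))
    return anomalies


# Hypothetical database and traces.
database = {("open", "read", "mmap"), ("read", "mmap", "close")}
normal = ["open", "read", "mmap", "close"]
suspicious = ["open", "read", "execve", "close"]

print(monitor(normal, database, 3))      # no mismatches
print(monitor(suspicious, database, 3))  # two windows contain the unseen execve
```

A flagged window only says the sequence was not seen in training; whether it is an intrusion or a false positive cannot be decided from the database alone.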

14
Experimental Results for N-Gram

Databases for different processes with different
window sizes are constructed
A normal sendmail system call trace obtained from
a user session is tested against all processes'
databases.
The table shows that sendmail's sequences are
unique to sendmail and are considered
anomalous by other models.

The table shows the number of mismatched
sequences and their percentage with respect to the
total number of subsequences in the user session
15
Problems with Sequence Based Approaches

The minimal foreign sequence problem

Database includes (S0,S3,S4) and (S3,S4,S2)
An attack sequence S0,S3,S4,S2 cannot be detected:
with a window of length 3, every window it
contains is already in the database
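
The example can be checked mechanically. The sketch below assumes a window size of 3, which the slide does not state explicitly, and uses a hypothetical helper named windows.

```python
def windows(trace, n):
    """All length-n windows of the trace, in order."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


database = {("S0", "S3", "S4"), ("S3", "S4", "S2")}
attack = ["S0", "S3", "S4", "S2"]

foreign = [w for w in windows(attack, 3) if w not in database]
print(foreign)  # -> []
```

Both windows of the attack trace are known, so nothing is flagged even though the full four-call sequence is not itself a database entry.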
16
Problems with Sequence Based Approaches (Cont.)

Code insertion
As long as the order in which an attacker issues
system calls is accepted as normal, he can
insert and run his code on the system (e.g. via a
buffer overflow)

17
FSA Model

Sekar et al., A Fast Automaton-Based Method for
Detecting Anomalous Program Behaviors, 2001.
Builds a non-deterministic finite state automaton
(FSA) by training.
Uses program counter (PC) information to address
code insertion problems.
Once PC is coupled with system calls, every
system call site in the code becomes unique.
Instead of using fixed-length sequences, they use
a finite state automaton to express every
possible sequence.
The first piece of research to use PC information
and automata.
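
A simplified sketch of training an automaton from (PC, system call) pairs. It assumes each call site's PC acts as a state and each observed system call labels a transition; the exact edge labelling in Sekar et al. may differ, and the names START, train_fsa and accepts, as well as the addresses, are hypothetical.

```python
from collections import defaultdict

START = "start"  # hypothetical initial state


def train_fsa(traces):
    """Learn transitions (state, syscall) -> set of next states from training traces."""
    transitions = defaultdict(set)
    for trace in traces:
        state = START
        for pc, syscall in trace:
            transitions[(state, syscall)].add(pc)
            state = pc
    return transitions


def accepts(transitions, trace):
    """Check that every (pc, syscall) step follows a learned transition."""
    state = START
    for pc, syscall in trace:
        if pc not in transitions.get((state, syscall), set()):
            return False
        state = pc
    return True


# Hypothetical training trace: a read loop at call site 0x20.
training = [[(0x10, "open"), (0x20, "read"), (0x20, "read"), (0x30, "close")]]
fsa = train_fsa(training)

longer_loop = [(0x10, "open")] + [(0x20, "read")] * 5 + [(0x30, "close")]
injected = [(0x10, "open"), (0xBFFF0000, "read"), (0x30, "close")]

print(accepts(fsa, longer_loop))  # True: loops are not limited by a window length
print(accepts(fsa, injected))     # False: the call comes from an unknown site
```

The second trace issues the same system calls in the same order as a normal run, yet it is rejected because the call site is unknown, which is the point of coupling the PC with system calls.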

18
FSA Example
An example code and the corresponding FSA built
from it
Note the non-determinism in states 1,3,6 and 8.
S0,S3,S4,S2 is captured. No length limitation.
19
Convergence Comparison

Experiment is run on ftpd.
FSA model converges faster than N-gram.

20
Callgraph and Abstract Stack Models

Wagner et al., Intrusion Detection via Static
Analysis, 2001.
Uses finite state automata to model the process
behavior.
It is based on static analysis of source code.
They introduce three methods
Callgraph (NFA)
Abstract Stack (PDA)
Digraph (a static version of N-gram with window
size of 2, not mentioned here)

21
Callgraph Model

A control flow graph (CFG) is extracted from the
source code by static analysis.
Every procedure f has an entry(f) and exit(f)
state. At this point the graph is disconnected.
It is assumed that there is exactly one system
call between nodes in the automaton and these
system calls are represented by edges.
Every function call site v, calling f, is split
into two nodes v and v'. Epsilon edges are added
from v to entry(f) and from exit(f) to v'.
The result is a single, connected, big graph.

22
Callgraph Example
[Figure: callgraph example, annotated with the entry point, the epsilon edges, and a function call site split into two nodes.]
23
Monitoring Callgraph

The IDS is given system call information alone,
and no PC information.
When a system call is received, the automaton is
simulated to transition between states. If such a
transition does not exist in the model, the IDS
raises a flag.
Due to non-determinism, there might be more than
one possible state at a given time. In this case
every possible state in the program is simulated
against the system call and the ones that do not
have a transition on the given system call are
dropped.
Non-determinism usually incurs too much
computational overhead in large programs.
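
The set-of-states simulation described above can be sketched as follows. The tiny graph is hypothetical (a main that calls f twice, where f issues a single write); node names follow the entry(f)/exit(f) and v/v' convention, and None marks an epsilon edge.

```python
EPSILON = None

# Hypothetical callgraph: node -> list of (label, target) edges.
EDGES = {
    "entry(main)": [(EPSILON, "v1")],
    "v1": [(EPSILON, "entry(f)")],
    "entry(f)": [("write", "exit(f)")],
    "exit(f)": [(EPSILON, "v1'"), (EPSILON, "v2'")],
    "v1'": [("getuid", "v2")],
    "v2": [(EPSILON, "entry(f)")],
    "v2'": [("exit", "exit(main)")],
}


def epsilon_closure(states):
    """Add every node reachable through epsilon edges only."""
    todo, closure = list(states), set(states)
    while todo:
        node = todo.pop()
        for label, target in EDGES.get(node, []):
            if label is EPSILON and target not in closure:
                closure.add(target)
                todo.append(target)
    return closure


def monitor(trace, start="entry(main)"):
    """Return the index of the first rejected system call, or None if all are accepted."""
    current = epsilon_closure({start})
    for index, syscall in enumerate(trace):
        moved = {target
                 for node in current
                 for label, target in EDGES.get(node, [])
                 if label == syscall}
        current = epsilon_closure(moved)  # states without a transition are dropped
        if not current:
            return index
    return None


print(monitor(["write", "getuid", "write", "exit"]))  # None
print(monitor(["write", "read"]))                     # 1
```

The cost per system call grows with the number of states tracked at once, which is where the overhead mentioned above comes from.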

24
Imprecision in Callgraph
The return address in f can be overwritten.
Valid Path
Impossible Path: the model will not be able
to detect it since all transitions are valid.
25
Abstract Stack Model

The more information an IDS has, the more
accurately it can model the behavior of a
program.
Abstract Stack model makes use of the call stack.
In order to incorporate this information into
their model, they use a push-down automaton (PDA).
The idea is to have an abstract copy of the call
stack in the PDA stack.
At any given state, the PDA's stack contains the
list of return addresses in the call stack.

26
Push-down automata

Like FSAs, PDAs have a set of states and a
transition function.
They differ from FSAs by also having a stack. They
accept context-free languages.
At every transition, a symbol can be pushed onto
or popped from the stack.
They can accept either by state or by stack (if
stack is empty), which are equivalent in terms of
computational power.
A PDA is stronger than an FSA. It can accept
regular languages and also some non-regular ones
such as 0^n 1^n.

[Figure: PDA with states Start and End and a stack; reading a 0 pushes a 0, reading a 1 pops a 0.]
Once you see a 1, switch to the End state. The
stack contains as many 0s as have been seen in the
input. If the stack is empty at the end of the
input, accept.
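
The automaton for 0^n 1^n can be written directly as a small Python function. This is an illustrative sketch that accepts by empty stack; the function name accepts_0n1n is hypothetical.

```python
def accepts_0n1n(text):
    """Accept strings of n zeros followed by n ones, using an explicit stack."""
    stack = []
    state = "Start"
    for symbol in text:
        if state == "Start" and symbol == "0":
            stack.append("0")   # push one 0 per 0 read
        elif symbol == "1" and stack:
            state = "End"       # first 1 switches to End; no more 0s allowed
            stack.pop()         # pop one 0 per 1 read
        else:
            return False        # 0 after a 1, more 1s than 0s, or unknown symbol
    return not stack            # accept only if every 0 was matched


print(accepts_0n1n("000111"))  # True
print(accepts_0n1n("0011"))    # True
print(accepts_0n1n("0101"))    # False
print(accepts_0n1n("001"))     # False
```

No finite state automaton can do this for unbounded n, because it would have to count the 0s without a stack.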
27
Detecting the IPE Attack

Consider the previous example of an impossible
path.

The Abstract Stack model will detect the attack
since it stores stack information. When returning
from state Exit(f), the stack will have the
return address v'.
State v' does not have a transition on system
call exit(), hence the attack will be detected.
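
The difference between the two models can be sketched by simulating the same graph with and without a stack. The graph below is a hypothetical stand-in for the slide's example: main calls f at two sites, and the attack returns from the first call to the second return site v2' and then issues exit(). Names such as EDGES, apply_action and accepts are illustrative, and the closure loop assumes no recursion.

```python
# Edges: (source, label, target, stack_action); label None marks an epsilon edge.
EDGES = [
    ("entry(main)", None, "v1", None),
    ("v1", None, "entry(f)", ("push", "v1'")),
    ("entry(f)", "write", "exit(f)", None),
    ("exit(f)", None, "v1'", ("pop", "v1'")),
    ("exit(f)", None, "v2'", ("pop", "v2'")),
    ("v1'", "getuid", "v2", None),
    ("v2", None, "entry(f)", ("push", "v2'")),
    ("v2'", "exit", "exit(main)", None),
]


def apply_action(stack, action, use_stack):
    """Return the new stack, or None if a pop does not match the top of the stack."""
    if not use_stack or action is None:
        return stack
    kind, symbol = action
    if kind == "push":
        return stack + (symbol,)
    if stack and stack[-1] == symbol:
        return stack[:-1]
    return None


def closure(configs, use_stack):
    """Follow epsilon edges from every (node, stack) configuration."""
    todo, seen = list(configs), set(configs)
    while todo:
        node, stack = todo.pop()
        for source, label, target, action in EDGES:
            if source != node or label is not None:
                continue
            new_stack = apply_action(stack, action, use_stack)
            if new_stack is not None and (target, new_stack) not in seen:
                seen.add((target, new_stack))
                todo.append((target, new_stack))
    return seen


def accepts(trace, use_stack):
    configs = closure({("entry(main)", ())}, use_stack)
    for syscall in trace:
        # System call edges carry no stack action in this sketch.
        moved = {(target, stack)
                 for node, stack in configs
                 for source, label, target, _ in EDGES
                 if source == node and label == syscall}
        configs = closure(moved, use_stack)
        if not configs:
            return False
    return True


valid = ["write", "getuid", "write", "exit"]
impossible = ["write", "exit"]

print(accepts(valid, use_stack=True))        # True
print(accepts(impossible, use_stack=False))  # True: Callgraph misses the attack
print(accepts(impossible, use_stack=True))   # False: Abstract Stack detects it
```

With the stack enabled, the return edge to v2' requires v2' on top of the stack, but the first call pushed v1', so the only reachable return site has no transition on exit().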

28
Performance Issues

Both models, Callgraph and Abstract Stack,
have very high operational costs. The reason for
this is non-determinism.
Non-determinism manifests itself in two ways
State non-determinism: The automaton can be in a
number of different states. When a system call is
received, all these states need to be checked for
valid transitions.
Stack non-determinism: Only applies to the Abstract
Stack model. There can be a number of different
ways a state can be reached, resulting in more
than one stack configuration.

29
State and Stack Exposure Techniques

Exposing state can greatly reduce the
non-determinism in the model. The state of the
program can be exposed by using PC information.
The stack can be exposed in two ways
Indirectly, as in Abstract Stack, where the PDA
has transitions that simulate the call stack.
Directly, by stack walk, simply obtaining the list
of return addresses from the call stack.

30
VtPath

Feng et al., Anomaly Detection Using Call Stack
Information.
Inspired by Abstract Stack, they use call stack
information in their model.
It is training based and has a better convergence
rate than the FSA model, with comparable false
positive rates.
Uses virtual stack lists (VSLs) to create virtual
paths between two consecutive system calls and
keeps a database of these virtual paths.
It uses PC information and stack walk to get the
VSLs.
The model is a collection of virtual paths.
More resistant to IPEs. It can capture the IPE
presented in Wagner et al.'s paper.
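
A simplified sketch of the virtual path idea, assuming a virtual stack list is a tuple of return addresses ordered from the outermost frame to the innermost. The paper's exact definition may differ in detail; the function names and the "main:1"-style addresses are hypothetical.

```python
def virtual_path(prev_stack, cur_stack):
    """Frames exited, then frames entered, between two consecutive system calls."""
    common = 0
    while (common < len(prev_stack) and common < len(cur_stack)
           and prev_stack[common] == cur_stack[common]):
        common += 1
    exits = tuple(reversed(prev_stack[common:]))  # returns, innermost first
    entries = tuple(cur_stack[common:])           # calls, outermost first
    return exits, entries


def train(stack_traces):
    """Collect the virtual paths seen between consecutive system calls."""
    paths = set()
    for trace in stack_traces:
        for prev, cur in zip(trace, trace[1:]):
            paths.add(virtual_path(prev, cur))
    return paths


# Hypothetical normal run: f is called from main:1, then from main:2,
# and a final system call is issued directly from main:3.
normal = [("main:1", "f:5"), ("main:2", "f:5"), ("main:3",)]
model = train([normal])

# Impossible path: the first call to f "returns" past the second call site.
attack_step = virtual_path(("main:1", "f:5"), ("main:3",))

print(virtual_path(("main:1", "f:5"), ("main:2", "f:5")) in model)  # True
print(attack_step in model)                                         # False
```

Because each virtual path records which frames were left and entered, a return to the wrong call site produces a path that was never seen in training.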

31
Dyck

Giffin et al., Efficient Context-Sensitive
Intrusion Detection, 2004.
Uses static analysis of binary code.
Exposes the stack by using binary rewriting to
insert null-calls before and after function call
sites, which keeps track of function calls.
With null-calls, the stack becomes deterministic
and the performance improves greatly compared to
a non-deterministic PDA.
This model is called a stack-deterministic PDA
(SDPDA) in a later paper by the same authors.
Feng et al., Formalizing Sensitivity in Static
Analysis for Intrusion Detection, 2004.

32
Dyck Model Example
[Figure: C source code example and its Dyck instrumentation.]
33
Dyck Model Example (Cont.)
[Figure: Dyck model without squelching, and the Callgraph model.]
34
VPStatic

Feng et al., Formalizing Sensitivity in Static
Analysis for Intrusion Detection, 2004.
Static analysis version of VtPath.
Instead of using sequences, they define
transitions on a PDA.
The state of the program is exposed by using PC
information.
The stack is exposed by using stack walk and
VSLs.
The goal is to create a deterministic PDA by
exposing stack and state information.
The model is fully deterministic.
Operating the deterministic PDA is less
expensive; however, the bottleneck in VPStatic is
the stack walk operation.

35
Overview of the Models

The trend has been towards more complicated
automata and static analysis.
Models using state exposure are immune to code
insertion, e.g. FSA, VtPath, VPStatic.
Models using stack exposure are immune to
control-flow hijacking, e.g. Abstract Stack, Dyck,
VPStatic.
Still, if an attack does not issue system calls,
these models might fail.

36
Data Flow Attack

A variation of the IPE.
The control flow is altered but not hijacked.
Instead of overwriting return addresses to change
the control flow, data used as a predicate in a
branch is overwritten.
The Data Flow Attack does not traverse any
function boundaries, evading even PDA based
models.
The models need to be flow sensitive in order to
capture such an attack.

37
Data Flow Attack Example

The system call sequences <sys_1, sys_5, sys_3>
and
<sys_2, sys_5, sys_4> are normal sequences.
Any of the aforementioned models will also
accept
<sys_1, sys_5, sys_4> and
<sys_2, sys_5, sys_3>
There is no way the model can relate the first
loop to the second.
Execution path history needs to be known to be
able to detect such an attack.
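
The effect can be reproduced with a model that only tracks consecutive system calls. This is a minimal sketch: the pair-based model stands in for the automata in the absence of function calls, and the names pairs and accepted are hypothetical.

```python
def pairs(trace):
    """Set of consecutive system call pairs in a trace."""
    return set(zip(trace, trace[1:]))


normal = [("sys_1", "sys_5", "sys_3"), ("sys_2", "sys_5", "sys_4")]
model = set().union(*(pairs(trace) for trace in normal))


def accepted(trace):
    """A trace is accepted if every consecutive pair was seen in a normal trace."""
    return pairs(trace) <= model


print(accepted(("sys_1", "sys_5", "sys_4")))  # True
print(accepted(("sys_2", "sys_5", "sys_3")))  # True
```

Both mixed sequences pass because every step through sys_5 is individually valid; only knowledge of which branch was taken before sys_5 would tell them apart.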

38
User ID Hijacking

Example attack on WU-FTPD.
When a user issues a get or a put command, the
effective user id (EUID) is temporarily escalated
to root in order to perform setsockopt().
Using a format string vulnerability, pw->pw_uid
can be set to 0 (root), giving root privileges to
the user.

    FILE *getdatasock( ... ) {
        ...
        seteuid(0);
        setsockopt( ... );
        ...
        seteuid(pw->pw_uid);
        ...
    }
39
Decision-Making Data Hijacking

The following example code is taken from a SSH
implementation.
The function detect_attack() has an integer
overflow vulnerability.
Using the vulnerability, the authenticated flag
can be set to non-zero, granting a user root
privileges without ever supplying a password.

    void do_authentication(char *user, ...) {
        int authenticated = 0;
        ...
        while (!authenticated) {
            /* Get a packet from the client */
            type = packet_read();  /* calls detect_attack() internally */
            switch (type) {
            ...
            case SSH_CMSG_AUTH_PASSWORD:
                if (auth_password(user, password))
                    authenticated = 1;
            case ...
            }
            if (authenticated) break;
        }
        /* Perform session preparation. */
        do_authenticated(pw);
    }
40
Why Automata Can't Capture Data Flow Attack

With the call to the system call in between the
branches (sys_5), the model loses all execution
path information.
None of the models mentioned is either flow or
path sensitive.

Normal path
Abnormal path

There are no function calls, so stack exposure is
ineffective against this attack.
In the absence of function calls, all the models
keep track of consecutive system calls. In other
words, they are only as powerful as N-gram with a
window size of 2.

41
A Different Look at System-Call Based IDSs

The problem is recording every possible system
call trace an application can produce.
In doing so, other security issues such as code
injection and mimicry attacks must be considered.
The models we have seen are compact
approximations for these infinite sets.

42
Back To Training Based Models

The data attack can be detected in two ways
Finer grained methods: Live analysis of
variables, or checking predicates at branches.
Using training: Normal user sessions will not
produce sequences seen with the data attack.
Using finer grained models defeats the purpose of
having system-call based IDSs.

43
Execution Path History

Given a node v in the graph, the execution path
will go through a number of branch instructions
and loops before reaching this state.
If we were able to keep track of the execution
path that was taken up to node v, we could append
that information to the node using training.
In the data attack example, node Sys_3 would know
that only an execution path that has been through
node Sys_1 is valid.

Start, Sys_1, Sys_5
Start, Sys_2, Sys_5
44
Obtaining Execution Path History

One major tool in accomplishing this task could
be null-call insertion.
It is used in the Dyck model to keep track of
function call sites. The idea can be applied to
every branch that issues a system call.
A sequence of previously issued system calls can
be appended to every node.
During monitoring, the execution path history
will be matched against the possible histories at
every node.
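
A naive sketch of the idea, assuming the history of a node is simply the full sequence of system calls issued before it. The names train_histories and check are hypothetical, and this representation is deliberately uncompressed.

```python
from collections import defaultdict


def train_histories(traces):
    """For every node, record each execution path history seen in training."""
    histories = defaultdict(set)
    for trace in traces:
        for index, node in enumerate(trace):
            histories[node].add(tuple(trace[:index]))
    return histories


def check(trace, histories):
    """Return the first node whose history was never seen, or None if all match."""
    for index, node in enumerate(trace):
        if tuple(trace[:index]) not in histories.get(node, set()):
            return node
    return None


normal = [
    ("Start", "Sys_1", "Sys_5", "Sys_3"),
    ("Start", "Sys_2", "Sys_5", "Sys_4"),
]
histories = train_histories(normal)

print(sorted(histories["Sys_5"]))
print(check(("Start", "Sys_1", "Sys_5", "Sys_4"), histories))  # Sys_4
```

Storing whole prefixes like this grows quickly with the number of paths, so a practical model would need a more compact representation.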

45
Performance Considerations

Every node will have a great number of possible
histories that need to be kept track of.
Considering the size of today's applications,
recording a list of these paths for every node is
clearly not possible.
Compact representations must be developed. For
example, a node that has only a single incoming
edge need not keep an entire record of histories
as its history is a single system call appended
to its predecessor's execution path history.
Also not every node on the path is critical. If
most of the execution paths have common
substrings, there might be a way to extract the
important information from the sequence.

46
Conclusion

When it comes to real time intrusion detection,
false positives are unacceptable. This has led
researchers towards static analysis based,
complicated models such as Dyck and VPStatic.
Yet, data attacks show that even these
models are not complete.
Still, these models should not be underrated,
since they can capture code insertion, stack
corruption and impossible path attacks with no
false positives.
