Post-Attack Analysis of Unknown Vulnerabilities

Title: MemSherlock: An Automated Debugger for Unknown Memory Corruption Vulnerabilities
Author: Can Sezer
Last modified by: Peng Ning

Learn more at: http://s2.ist.psu.edu

1
Post-Attack Analysis of Unknown Vulnerabilities
  • Peng Ning
  • With Emre C. Sezer, Chongkyung Kil, and Jun Xu

2
Motivation
  • Vulnerability analysis
    • Essential for
      • Patching
      • Vulnerability-based signature generation
    • Painstakingly slow
      • Depends on human effort
  • Existing approaches
    • Static analysis (e.g., Chen et al. 04, Feng et al. 04, Larochelle & Evans 01)
      • False positives
    • Dynamic analysis (e.g., Minos [Crandall et al. 04], TaintCheck [Newsome & Song 05], DIRA [Smirnov & Chiueh 05])
      • Used for detection; inadequate vulnerability information
    • Symbolic execution (e.g., EXE [Cadar et al. 06], DACODA [Crandall et al. 05])
      • Scalability issues
    • Recovery (e.g., STEM [Sidiroglou et al. 05], SEAD [Locasto et al. 07])
      • Change of application semantics

3
MemSherlock
  • MemSherlock is an automated debugger
  • Automated analysis of unknown memory corruption
    vulnerabilities
  • Appeared in ACM CCS 07
  • MemSherlock provides
  • Statement that causes the memory corruption
  • Dynamic program slice leading to the corruption
  • Program variables involved in the vulnerability
  • All presented at programming language level
  • Implications
  • Generating vulnerability conditions
  • Improves signature or patch generation speed

4
General Framework: Web Application Example
[Figure not reproduced in the transcript; surviving label: Traffic]
5
MemSherlock: Overview
  • Goal is to provide vulnerability information
  • Intuitive, easy to understand for the programmer
  • Not only the corruption point
  • Slice of program involved in the vulnerability
  • Effects of user inputs
  • Program variables involved
  • Variable relationships (e.g., pointer aliasing)
  • Type of vulnerability (e.g., stack buffer
    overflow)
  • MemSherlock performs two important tasks
  • Finding the corruption point
  • Tracking program state

6
MemSherlock: Finding the Corruption Point
  • Observation: a memory object is modified by a small set of statements (inspired by AccMon)
  • For a memory object m, the write set of m, WS(m), is the set of statements that legitimately modify m
  • Security condition: memory object m should only be updated by statements in WS(m)
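
The security condition can be illustrated with a minimal sketch. It assumes a statement is identified by a (file, line) pair; the variable names, the file name demo.c, and the function name are hypothetical and not taken from MemSherlock itself.

```python
# Hypothetical write sets: memory object -> set of (file, line) statements
# that are allowed to modify it. All names here are illustrative.
WRITE_SETS = {
    "buf": {("demo.c", 10), ("demo.c", 14)},
    "count": {("demo.c", 12)},
}


def satisfies_security_condition(memory_object, statement, write_sets=WRITE_SETS):
    """Return True if `statement` is in WS(memory_object)."""
    return statement in write_sets.get(memory_object, set())


if __name__ == "__main__":
    # demo.c:14 may write buf, but a write to count from demo.c:14 violates the condition.
    print(satisfies_security_condition("buf", ("demo.c", 14)))
    print(satisfies_security_condition("count", ("demo.c", 14)))
```

A write that fails this membership test is what the later slides call a memory corruption.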

7
MemSherlock: Assembly Line
  • Pre-Debugging Phase
  • Instruments the program for debugging phase
  • Extracts program information via static analysis
  • Needs to be performed once
  • Debugging Phase
  • Tracks program state
  • Monitors memory writes and checks for violation
    of security condition
  • Tracks tainted data and its propagation

8
MemSherlock Architecture
9
Pre-debugging: Generating Write Sets
  • MemSherlock analyzes source code to determine write sets
  • For a program variable v, WS(v) includes
    • Assignment statements (i.e., v = expr)
    • Library function calls where v is passed as an argument that can be modified (e.g., memcpy(v, src, n))
  • MemSherlock treats DLLs as black boxes
    • Assumption: a DLL is internally secure, but externally insecure
      • e.g., no stack overflows in the library functions
      • Sound for common, well-tested libraries (e.g., libc)
    • Requires library specifications
      • For each DLL, a list of functions and the arguments they might modify
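
One way to picture a library specification and its use is the sketch below. The statement encoding (tuples of location, kind, payload) and the function name build_write_sets are assumptions made for illustration; the real tool extracts this information with static analysis. The sketch also treats v as an array variable and ignores the pointer/referent distinction covered on the next slide.

```python
from collections import defaultdict

# Hypothetical library specification: function name -> indices of the
# arguments the function may modify (per the standard C signatures).
LIBRARY_SPEC = {
    "memcpy": {0},  # memcpy(dest, src, n) writes through dest
    "strcpy": {0},  # strcpy(dest, src) writes through dest
    "memset": {0},  # memset(s, c, n) writes through s
}


def build_write_sets(statements, library_spec=LIBRARY_SPEC):
    """Build variable -> set of locations from simplified statements.

    Each statement is (location, kind, payload), where kind is
    "assign" (payload: assigned variable name) or
    "call" (payload: (function name, list of argument variable names)).
    """
    write_sets = defaultdict(set)
    for location, kind, payload in statements:
        if kind == "assign":
            write_sets[payload].add(location)
        elif kind == "call":
            function, args = payload
            # Functions missing from the specification add nothing.
            for index in library_spec.get(function, ()):
                if index < len(args):
                    write_sets[args[index]].add(location)
    return dict(write_sets)


if __name__ == "__main__":
    program = [
        (("demo.c", 5), "assign", "v"),
        (("demo.c", 7), "call", ("memcpy", ["v", "src", "n"])),
        (("demo.c", 9), "call", ("strlen", ["v"])),
    ]
    print(build_write_sets(program))
```

A function that is absent from the specification contributes no write-set entries, which is one way an incomplete specification can later surface as a false positive.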

10
Dealing with Pointers
  • For a pointer variable p, two write sets are kept
    • WS(p): statements that modify p
    • WS(ref(p)): statements that modify the referent (e.g., *p = 5)
  • ref(p) is resolved at run time (during debugging)
  • The same analysis is performed for pointer-type function arguments at function calls
    • Removes the requirement for inter-procedural static analysis
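
The two write sets can be sketched as follows. The class and function names are hypothetical; the only behavior taken from the slides is that the referent is resolved at run time and that its write set is then updated with the statements in WS(ref(p)).

```python
class PointerWriteSets:
    """Write sets kept for one pointer variable p (illustrative model)."""

    def __init__(self, ws_pointer, ws_referent):
        self.ws_pointer = set(ws_pointer)    # WS(p): statements that modify p
        self.ws_referent = set(ws_referent)  # WS(ref(p)): statements that modify *p


def on_pointer_update(pointer_info, target_variable, write_sets):
    """Called when p is observed pointing at target_variable at run time.

    Statements in WS(ref(p)) become legitimate writers of the target.
    """
    write_sets.setdefault(target_variable, set()).update(pointer_info.ws_referent)


if __name__ == "__main__":
    p = PointerWriteSets(ws_pointer={("demo.c", 3)}, ws_referent={("demo.c", 4)})
    write_sets = {"x": {("demo.c", 1)}}
    on_pointer_update(p, "x", write_sets)  # models p = &x at demo.c:3
    print(write_sets["x"])
```

Because the merge happens when the pointer value is observed, no static alias analysis across functions is needed in this model.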

11
Chained Dereferences
Original:
    int z;
    int *y = &z;
    int **x = &y;
    **x = 10;
Rewritten:
    int z;
    int *y = &z;
    int **x = &y;
    int *temp = *x;
    *temp = 10;
  • Earlier technique can only handle simple
    dereferences
  • Source code rewriting is used to convert all
    chained dereferences to simple dereferences
  • Any other dereference that is not simple is
    converted in the same manner

12
Output of Pre-debugging Phase
  • Simplified program
  • Simplified pointer dereferences
  • Compiled with debugging options
  • Input file for the debugger
  • Program variables and their write sets
  • Addresses of global symbols
  • Frame pointer offsets of local variables
  • Other flags that help the debugger

13
MemSherlock Architecture: Debugging
14
Debugging: Dynamic Monitoring
  • Runtime monitoring
    • State maintenance
      • Incorporates taint analysis from TaintCheck
      • Produces a dynamic slice of the program leading to the vulnerability
    • Write checking
      • Monitors and validates memory writes
  • Write sets are pairs of file name and line number, <f, l>
    • The instruction pointer IP is translated into <f, l>
  • Write sets are associated with program variables
    • A destination address is translated into a program variable
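
Translating an instruction pointer into a file name and line number pair can be sketched as a lookup in a sorted line table. The table contents, addresses, and the function name ip_to_file_line are hypothetical; a real monitor would read this mapping from the debugging information of the compiled program.

```python
import bisect

# Hypothetical line table, sorted by starting address.
# Each entry: (first instruction address, file name, line number).
LINE_TABLE = [
    (0x1000, "demo.c", 10),
    (0x1008, "demo.c", 11),
    (0x1020, "demo.c", 14),
]
_STARTS = [entry[0] for entry in LINE_TABLE]


def ip_to_file_line(ip):
    """Return the (file, line) entry that starts at or before ip, or None.

    Simplification: the last entry has no upper bound in this sketch.
    """
    index = bisect.bisect_right(_STARTS, ip) - 1
    if index < 0:
        return None
    _, file_name, line = LINE_TABLE[index]
    return (file_name, line)


if __name__ == "__main__":
    print(ip_to_file_line(0x1010))  # falls inside the entry starting at 0x1008
```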

15
Keeping Program State
[Figure: the virtual address space in Program State 1 and Program State 2. Both states show the stack base followed by frames for main and fnc A; the remaining frame labels are fnc B and fnc C. The same memory write to 0xABABABAB is shown in both states.]
  • A given memory region may correspond to different
    program variables depending on program state
  • Dynamic monitor keeps track of memory mapping
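
The dependence on program state can be sketched with a small stack model in which local variables are located by frame-pointer offsets. The frame layouts, addresses, and class name are hypothetical and chosen only to show one address resolving to different variables.

```python
# Hypothetical frame layouts: function -> {variable: (frame-pointer offset, size)}.
FRAME_LAYOUTS = {
    "fnc_B": {"buf_b": (-16, 16)},
    "fnc_C": {"count_c": (-16, 4), "flag_c": (-12, 4)},
}


class StackMonitor:
    """Tracks active frames so an address can be mapped to a local variable."""

    def __init__(self):
        self.frames = []  # list of (function name, frame pointer)

    def on_call(self, function, frame_pointer):
        self.frames.append((function, frame_pointer))

    def on_return(self):
        self.frames.pop()

    def resolve(self, address):
        """Return (function, variable) owning address in the current state, or None."""
        for function, fp in reversed(self.frames):
            for name, (offset, size) in FRAME_LAYOUTS.get(function, {}).items():
                start = fp + offset
                if start <= address < start + size:
                    return (function, name)
        return None


if __name__ == "__main__":
    monitor = StackMonitor()
    monitor.on_call("fnc_B", 0x7000)
    print(monitor.resolve(0x6FF4))  # inside buf_b while fnc_B is active
    monitor.on_return()
    monitor.on_call("fnc_C", 0x7000)
    print(monitor.resolve(0x6FF4))  # same address now belongs to flag_c
```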

16
Debugging: Key Data Structures
  • Keeps two lists of memory regions
  • ActiveMemoryRegions
  • Memory corresponding to program variables or
    their referent memory regions
  • NonWritableRegions
  • Saved registers, return addresses, metadata
    encapsulating dynamically allocated memory
    regions

17
Debugging: State Maintenance
  • Function calls/returns (memory)
  • Local variable addresses are calculated and added
    to ActiveMemoryRegions
  • Location of return address and saved registers
    are added to NonWritableRegions list
  • Heap memory (memory)
  • malloc/free calls are intercepted
  • Allocated memory is added to ActiveMemoryRegions
  • The metadata encapsulating the buffer is added to
    NonWritableRegions
  • Pointer value updates (write sets)
  • Searches ActiveMemoryRegions to find the referent
    and updates its WS
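
The heap part of state maintenance can be sketched as two region lists updated by intercepted malloc and free calls. The class name, method names, and the 8-byte metadata size are assumptions for illustration; real allocators place and size their metadata differently.

```python
class RegionLists:
    """Illustrative model of ActiveMemoryRegions and NonWritableRegions for the heap."""

    def __init__(self):
        self.active = {}        # start address -> (size, label)
        self.non_writable = {}  # start address -> (size, label)

    def on_malloc(self, address, size, metadata_size=8):
        """Record a new heap buffer and the allocator metadata assumed to precede it."""
        self.active[address] = (size, "heap buffer")
        self.non_writable[address - metadata_size] = (metadata_size, "heap metadata")

    def on_free(self, address, metadata_size=8):
        """Forget the buffer and its metadata once the memory is released."""
        self.active.pop(address, None)
        self.non_writable.pop(address - metadata_size, None)

    @staticmethod
    def _contains(regions, address):
        return any(start <= address < start + size
                   for start, (size, _label) in regions.items())

    def is_active(self, address):
        return self._contains(self.active, address)

    def is_non_writable(self, address):
        return self._contains(self.non_writable, address)


if __name__ == "__main__":
    regions = RegionLists()
    regions.on_malloc(0x5000, 32)
    print(regions.is_active(0x5010), regions.is_non_writable(0x4FFC))
```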

18
Debugging: Write Checking
  • When the instruction at IP modifies memory m
    • If m is in ActiveMemoryRegions
      • Determines the variable v it belongs to
      • Converts IP into <f, l>
      • Checks whether <f, l> is in WS(v)
    • If the memory write check fails or m is in NonWritableRegions
      • Marks the operation as a memory corruption
      • Displays the vulnerability information
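
The check described above can be sketched as a single function. The data layout (lists of start, size, label tuples) and the return values are assumptions; in particular, the slide does not say what happens when m is in neither list, so the "untracked" result is a guess made for this sketch.

```python
def check_write(ip, address, active_regions, non_writable_regions,
                write_sets, ip_to_file_line):
    """Classify one memory write as "ok", "corruption", or "untracked".

    active_regions / non_writable_regions: lists of (start, size, label).
    write_sets: variable name -> set of (file, line) pairs.
    ip_to_file_line: callable mapping an instruction pointer to (file, line).
    """
    def find(regions):
        for start, size, label in regions:
            if start <= address < start + size:
                return label
        return None

    # Any write into a non-writable region is a corruption.
    if find(non_writable_regions) is not None:
        return "corruption"
    variable = find(active_regions)
    if variable is None:
        return "untracked"  # assumption: not covered by the slide
    # The write is legitimate only if its statement is in WS(variable).
    if ip_to_file_line(ip) in write_sets.get(variable, set()):
        return "ok"
    return "corruption"


if __name__ == "__main__":
    active = [(0x5000, 32, "buf")]
    protected = [(0x5020, 4, "return address")]
    write_sets = {"buf": {("demo.c", 10)}}
    line_table = {0x100: ("demo.c", 10), 0x200: ("demo.c", 20)}
    print(check_write(0x200, 0x5004, active, protected, write_sets, line_table.get))
```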

19
Generating Vulnerability Information
  • The slice of program contributing to the
    vulnerability
  • Statements that have propagated tainted values
  • Statements that have modified related memory
    regions
  • Dependency between memory objects involved in the
    vulnerability
  • Points-to analysis shows memory regions and how they were accessed
  • Program state
  • Call stack information
  • Write set information
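
How a list of taint-propagating statements might be accumulated can be sketched as below. The class and method names are hypothetical, and the value names are illustrative; the statement locations in the usage example are the ones that appear in the Null HTTP error report on the next slide.

```python
class TaintTracker:
    """Records, per tainted value, the statements that propagated the taint."""

    def __init__(self):
        self.history = {}  # value name -> list of (file, line), oldest first

    def taint_source(self, value, statement):
        """Mark value as tainted by a statement that reads external input."""
        self.history[value] = [statement]

    def propagate(self, source, destination, statement):
        """Record that statement derives destination from a tainted source."""
        if source in self.history:
            self.history[destination] = self.history[source] + [statement]

    def report(self, value):
        """Return the propagating statements, most recent first."""
        return list(reversed(self.history.get(value, [])))


if __name__ == "__main__":
    tracker = TaintTracker()
    tracker.taint_source("line", ("server.c", 211))
    tracker.propagate("line", "in_ContentLength", ("http.c", 153))
    tracker.propagate("in_ContentLength", "calloc size", ("http.c", 100))
    for file_name, line in tracker.report("calloc size"):
        print(f"{file_name}:{line}")
```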

20
Example Test Case: Null HTTP
  • http.c
       91  void ReadPOSTData(int sid)
      100  conn[sid].PostData = calloc(conn[sid].dat->in_ContentLength + 1024, sizeof(char));
      101  if (conn[sid].PostData == NULL) ...
      107  do
      108  rc = recv(conn[sid].socket, pPostData, 1024, 0);
      109
  • Error Report
      --20361-- Error type Heap Buffer Overflow
      --20361-- Dest Addr 3AB3E360
      --20361-- IP 0x804E5C7 ReadPOSTData (http.c:108)
      --20361-- Dest address resolved to
      --20361-- Global variable "heap var" @ 3AB3E280 (size 224)
      --20361--
      --20361-- Memory allocated by 0x804E531 ReadPOSTData (http.c:100)
      --20361-- TAINTED destination 3AB3E360
      --20361-- Fully tainted from
      --20361-- 0x804E5C7 ReadPOSTData (http.c:108)
      --20361--
      --20361-- TAINTED size used during allocation
      --20361-- Tainted from
      --20361-- 0x804E456 ReadPOSTData (http.c:100)
      --20361-- 0x804FBB5 read_header (http.c:153)
      --20361-- 0x805121B sgets (server.c:211)
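
The numbers in the report can be checked with a little arithmetic. The helper name overflow_offset is hypothetical. The last step assumes that the reported size equals the size requested from calloc at http.c:100, which the report does not state explicitly.

```python
def overflow_offset(dest_addr, base_addr, size):
    """Return how far dest_addr lies past the end of the buffer.

    0 means the first byte after the buffer; None means the address is not
    past the end of the buffer.
    """
    end = base_addr + size
    if dest_addr < end:
        return None
    return dest_addr - end


DEST = 0x3AB3E360  # "Dest Addr" from the report
BASE = 0x3AB3E280  # start of the heap variable
SIZE = 224         # reported size in bytes

if __name__ == "__main__":
    # 0x3AB3E280 + 224 (0xE0) = 0x3AB3E360, so the offending write lands on
    # the first byte after the allocated buffer.
    print(overflow_offset(DEST, BASE, SIZE))  # prints 0
    # Under the stated assumption, in_ContentLength + 1024 == 224.
    print(SIZE - 1024)  # prints -800
```

Under that assumption the tainted Content-Length would have been negative, which still passes the comparison against MAX_POSTSIZE at http.c:169 and yields a buffer smaller than the 1024 bytes that recv is allowed to write.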

21
Vulnerability Analysis Example
http.c
       91  void ReadPOSTData(int sid)
       92  char *pPostData;
      ...
      100  conn[sid].PostData = calloc(conn[sid].dat->in_ContentLength + 1024, sizeof(char));
      ...
      107  do
      108  rc = recv(conn[sid].socket, pPostData, 1024, 0);
      ...
[Diagram label: "Create Heap Object"]
22
Vulnerability Analysis Example
http.c
      119  int read_header(int sid)
      121  char line[2048];
      ...
      127  do
      128  memset(line, 0, sizeof(line));
      129  sgets(line, sizeof(line)-1, conn[sid].socket);
      ...
      153  conn[sid].dat->in_ContentLength = atoi((char *)&line[16]);
      ...
      169  if (conn[sid].dat->in_ContentLength < MAX_POSTSIZE)
      170  ReadPOSTData(sid);
[Diagram labels: "Object", "Taint"]
http.c
      (ReadPOSTData excerpt, lines 91-108, repeated from the previous slide)
[Diagram labels: "Object", "Use"]
23
Vulnerability Analysis Example
http.c
      (read_header excerpt, lines 119-170, repeated from the previous slide)
[Diagram label: "Create"]
server.c
      202  int sgets(char *buffer, int max, int fd)
      203  ...
      209  conn[sid].atime = time((time_t *)0);
      210  while (n < max)
      211  if ((rc = recv(conn[sid].socket, buffer, 1, 0)) < 0) ...
[Diagram labels: "Taint", "Object", "Taint", "Object"]
24
Implementation
  • Source code is rewritten using CIL (C
    Intermediate Language)
  • CodeSurfer was used to extract program variables
    and their write sets
  • A commercial static analysis tool
  • objdump and dwarfdump were used to extract global
    symbol information
  • Dynamic Monitoring is implemented in Valgrind
  • An open source emulator

25
Evaluation
  • Tested 11 real-world applications with known
    memory corruption vulnerabilities
  • Test cases included
  • Stack/Heap buffer overflow, Format string
  • Both control flow and non-control data attacks
  • Testing methodology
  • Programs were run under MemSherlock
  • Exploit programs were used to attack the
    applications
  • Log and replay was not used

26
Evaluation Results
Application Name  Vuln. Type  Description                                Captured?  FP
GHTTP             S           A small HTTP server                        Yes        7
Icecast           S           An mp3 broadcast server                    Yes        0
Sumus             S           A game server for mus                      Yes        0
Monit             S           Multi-purpose anomaly detector             Yes        0
Newspost          S           Automatic news posting                     Yes        2
Prozilla          S           A download accelerator for Linux           No         0
NullHTTP          H           An HTTP server                             Yes        0
Xtelnet           H           A telnet server                            Yes        4
Wsmp3             H           Web server with mp3 broadcasting           Yes        0
OpenVMPS          F           Open source VLAN management policy server  Yes        2
Power             F           UPS monitoring utility                     Yes        10
  • Type abbreviations: (S)tack overflow, (H)eap overflow, and (F)ormat string

27
False Negatives
  • Prozilla
  • memcpy uses a kernel function to manipulate page
    tables when copying entire pages
  • Valgrind cannot trace into kernel
  • Can be prevented by function wrappers
  • Other false negatives are theoretically possible
  • structs within unions or arrays
  • Current implementation does not support unions
  • Currently do not differentiate between elements
    of an array
  • Memory corruption errors inside DLLs

28
False Positives
  • Embedded assembly
  • Incomplete library specification
  • library functions keeping internal state (e.g., strtok(NULL, delim))
  • library functions that modify global variables as
    side effects (e.g., optarg, errno)
  • pointers that point to hidden global structures
    (e.g., getdatetime() in time.h)
  • struct pointers
  • void pointers that are type-cast to modify struct
    variables
  • since the pointer is not of type struct,
    MemSherlock fails to update accordingly

29
Conclusion
  • Fully automated vulnerability analysis
  • The analysis output is intuitive and human
    readable
  • Future Challenges
  • Automated, long-term fix of vulnerabilities
  • Semantic consistency is a great challenge
  • Automated, temporary fix of vulnerabilities
  • Generating vulnerability condition
  • Improving signature generation

30
Thank You