Title: Diversity Algorithms for Worrisome Software and Networks DAWSON
1Diversity Algorithms for Worrisome Software and Networks (DAWSON)
- James Just, Mark Cornwell, Jason Minto, Art Torrey
- Global InfoTek, Inc.
- Karl Levitt, Jeff Rowe, Tufan Demir
- UC Davis
- R. Sekar
- Consultant (SUNY Stony Brook)
- 27 January 2005
2Overview
- Project overview
- Integration framework
- Diversity to break exploits
- Diversity to break payloads
- Framework for analyzing effectiveness
- Binary rewriting review
- Next steps
3Problem Space (1): Excessive Homogeneity => Systemic Vulnerability
How can we prevent exponentially cascading failures?
- Attacks exploit the dense environment with ease to spread fast and/or far
- Foreseeable cyber-risks are dominated by a static, durable monoculture of executables
4Problem Space (2): Common Mode Failures Impede Intrusion Tolerant Systems
- Intrusion tolerant systems are
- Expensive
- Have large hw/sw footprints
- Assume a priori knowledge of attack modalities
- Success depends on availability of spare components
- Assumption of independent intrusions/faults is flawed
- Availability of diverse commercial spares limits effectiveness even if the intrusion tolerance system is affordable
- Rapid learning of attack signatures for blocking is hard
- Custom N-version programming is costly
5DAWSON Approach
- Randomized transforms of Windows executables at runtime
- Preserve functionality of executable modules (e.g., dll)
- Transform binary code, machine addresses, names, etc.
- Use annotations to facilitate
- Pseudo-random numbers produce unique transformations on each application restart
- Network protocol diversity effort replaced by breaking payload execution
- Goal: Beat program metric by 10X for a large fraction of the exploit space if transforms are focused
- 100 functional equivalents with no more than 3 susceptible to the same exploit as baseline code for most exploits
- Low overhead transforms (runtime performance)
6Attack Space of Interest Memory Error Exploits
Memory corruption attacks
- Corrupt target of existing pointer
- Compromise security critical data
- File names opened for write or execute
- Security credentials -- has the user
authenticated himself?
Corrupt a pointer value
Includes common buffer overflows, strncpy(),
off-by-one, cast screw-up, format strings,
double-free, return to libc, other heap structure
exploits
- Corrupt code pointer
- Return address
- Function pointer
- Dynamic linkage tables (GOT, IAT)
- Corrupt data pointer
- Frame pointer
- Local variables, parameters
- Pointer used to copy input
Pointer to injected data
- Pointer to existing data
- Example: corrupt string arguments to functions so that they point to attacker-desired data already in memory, e.g., /bin/sh, /etc/passwd
Pointer to injected code
Pointer to existing code
7Evaluation
- Identify assumptions and ROE for possible Red Teaming
- Internal testing with
- Fabricated applications with known vulnerabilities and exploits
- Real applications with known vulnerabilities and exploits
- Possible use of Emulab or DETER network emulation testbeds
8Some Assumptions Red Team ROEs
- Attacks are remote, automated and non-directed
- Attacker cannot observe the execution of valid programs without using system calls
- Processes cannot transition from user mode to kernel mode without using system calls
- Attacker cannot automate non-trivial static analysis of memory contents
- Modification is limited to binary (or memory) editing; source code is unavailable
9Status
- Interim products
- Native Windows (MFC, .Net) PE File Editor
- Transforms
- Automated permutation of the Import Address Table in PE files
- Automated replacement of DLL names and functions with random strings in PE files
- Local variable location modification not quite automated yet
- In progress
- Reordering of binary code blocks and insertion of dead code blocks
- Asymmetric transformation of function parameters using dummy functions
- New insights on requirements
- Address obfuscation (to defeat trivial static analysis of memory)
- Fail-crash detection mechanism (to defeat brute force trial and error)
- Non-bypassability of transform mechanisms: Balzer wrapper mechanism; UC Davis investigating option from another project
- Because of perceived higher value, shifted some effort to developing diversity to break payloads rather than diversity for network protocols
10Transition and Future Work
- Interim use of intermediate products by other SRS contractors
- Integrate with follow-on projects/products
- If successful
- Package for military users
- Possible GITI commercial product
- Possible open source toolkit approach
- Transition to Microsoft or other software vendors
- Some expressions of VC interest
- Standard research publications
11DAWSON Project Schedule Milestones
- Baseline Tasks
- 1. Requirement Refinement
- 2. Exploit Diversification
- 3. Payload Diversification
- 4. Integration
- 5. T&E
- Program Mgt.
- Prototypes (milestones 1, 2, 3)
[Gantt chart: quarterly timeline across FY04, FY05 and FY06]
12GITI
13Current Attack Problem
Software Specification
Source Code
Machine Code Specification
Executable Code
Known V-Spec
Vulnerability
Loader
Machine-level Code
Exploit Payload
14Vulnerability Specification?
- All aspects of a program execution which can be exploited by malicious code to gain control of the Program Counter, e.g.,
- Memory Topology
- Stack Specification
- System APIs
- Application APIs
- Libraries
- Exception Handling
- Etc
15Breaking the V-Spec
[Diagram labels: Software Specification; Source Code; Machine Code Specification; Executable Code; Known V-Spec; Transforming Loader; Transform Specifications; Vulnerability / Vulnerability??; Exploit Payload (Doesn't Match, X); many Machine-level Code variants]
V-Spec Unknown Until Load-Time
16Transform Techniques in Literature
- Obfuscation
- Layout obfuscation (scramble identifiers, remove comments, change formats)
- Control flow obfuscations (statement grouping, ordering, computation, opaque constructs)
- Data obfuscation (storage, encoding, grouping, ordering)
- Preventative transformations (prevent decompilers from operating by exploiting weaknesses)
- Inherent (aliases, variable or bogus dependencies, opaqueness side effect difficulty)
- Targeted
- Source code
- N-version programming
- Functional-behavior preserving diversity in components used (e.g., different encryption algorithms, different scales for data such as Celsius or Fahrenheit)
- Semantics preserving source code transformations
- Place sensitive data (such as function and data pointers) below the starting address of any buffer
- Variable ordering
- Equivalent instructions
- Variable compilation: variable internal names, padding and addresses, linking orders
- Insertion of opaque constructs or other dead code to change memory layout
- Binary code
- Address transformations (relative and absolute) on binary code
- Randomize base address of memory regions (stack, heap, DLL, routines/static data in executable)
References shown on later slides
17Multi-Layer Defense Strategy
Prevent Remote Exploit of Memory Errors
GITI
Prevent Injected Code from Properly Executing
Prevent Access to Windows DLLs
UC Davis
Prevent Use of Windows DLLs
GITI UC Davis
Prevent the Bypass of DLLs
18Diversity System Functional Architecture: Normal
Normal user inputs are translated
Modified loader transforms original stored
program and generates wrapper that retranslates
external calls
untranslated so they work
User Inputs
Other System Resources
Original Program
Modified Loader
Transformed In-memory program
Annotation File
PRN
19Diversity System Functional Architecture: Initial Exploit
Modified loader transforms original stored
program and generates wrapper that retranslates
external calls
Attacker
Other System Resources
Original Program
Modified Loader
Transformed In-memory program
Some attacks fail because assumed vulnerability
is gone
Annotation File
PRN
20Diversity System Functional Architecture: Payload Execution
Modified loader transforms original stored
program and generates wrapper that retranslates
external calls
Attacker
Other System Resources
Original Program
Modified Loader
Transformed In-memory program
Other attacks fail because injected commands are
wrong
Annotation File
PSN
21DAWSON Implementation Concept (I)
Approach eases integration of various transform
techniques
DAWSON Randomizer
PE File Editor
Original Binary Program on Disk
PE File Macro Randomizer
Modified Binary Program in Memory or on disk
API
PE File Component Randomizers
Program Annotation File
API
API
API
API
Transform Technique 1
Object 1
Transform Technique 2
Object 2
Transform Technique 3
Object 3
o o o
o o o
PRNG
Transform Technique N
Object K
22DAWSON Implementation Concept (2)
Manual Approach How best to automate?
Original Binary Code
Disassembled Code
(Optional) Decompiled Code
Structure (Pattern) Analyzer
Structure (Pattern) Modifier
(Optional) Recompiled Code
Reassembled Code
Modified Binary Code
23DAWSON Implementation Concept (2)
Automated Runtime Randomization
Binary Rewriting Randomizer
Modified Binary Code
Original Binary Code
Disassembled Code
(Optional) Decompiled Code
Structure (Pattern) Analyzer
Structure (Pattern) Modifier
Annotation File
24Automated Defensive Transformations
25Automated Layout Randomizations
26Exploit Breaking Transforms
- Randomize base of stack by a large number (preferably, by 100MB or more for single-threaded programs)
- Randomize locations of installed DLLs
- Manage all of the DLLs installed on a system
- Ensure that they get mapped to non-overlapping locations
- Change the mappings periodically
- Need simple management tools to make all of this happen
- Randomize location of functions in the executable
- Randomize base of the heap and the distance between two successively allocated heap blocks
- Randomize location of static variables in the executable
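The heap item above (random base plus random distance between successive blocks) can be illustrated with a toy bump allocator. This is only a sketch under assumptions that are not in the slides: the class name `RandomGapHeap`, the base range, the 8-byte alignment and the 256-byte maximum gap are all hypothetical.

```python
import random

ALIGN = 8  # hypothetical allocation alignment, not taken from the slides


def align_up(value: int, align: int = ALIGN) -> int:
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


class RandomGapHeap:
    """Toy bump allocator with a random base and random inter-block gaps."""

    def __init__(self, rng, base_low=0x00100000, base_high=0x10000000, max_gap=256):
        # base_low, base_high and max_gap are illustrative values only.
        self.rng = rng
        self.max_gap = max_gap
        self.next_free = align_up(rng.randrange(base_low, base_high))

    def alloc(self, size: int) -> int:
        """Return the address of a new block placed after a random gap."""
        gap = self.rng.randrange(0, self.max_gap + 1, ALIGN)
        address = self.next_free + gap
        self.next_free = address + align_up(size)
        return address
```

A seeded `random.Random` keeps the sketch reproducible; a deployed randomizer would want an unpredictable source such as `secrets.SystemRandom`.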
27Transform Characterization
- Type of transformation?
- What object is transformed?
- When can transformation occur?
- Pre-load, load, post-load?
- Are annotations or compiler support required?
- What types of exploits, payloads are impacted?
- How difficult is implementation of transform?
28Random Stack Rebasing
- Linear randomization over the range
- 4 KB granularity
- Effective address domain of approximately 1.0 GB, limited by the demands of other process segments
- Approximately 256K distinct bases are possible
- Two approaches examined
- One approach requires modification of the loader to implement in the NT/2000/XP environments
- Second approach increases stack reserve space in the PE file and decrements ESP; it does not require loader modification
Note that stack rebasing can be implemented directly in the PE file for DOS applications
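The 256K figure follows from the two numbers on this slide: a domain of about 1.0 GB divided into 4 KB steps. A minimal sketch of that arithmetic and of picking a page-aligned offset, assuming the domain is exactly 2^30 bytes (the function names are illustrative, not from the project):

```python
import random

PAGE_SIZE = 4 * 1024          # 4 KB granularity (from the slide)
DOMAIN_SIZE = 1 * 1024 ** 3   # ~1.0 GB domain, assumed here to be exactly 2**30


def count_stack_bases(domain_size: int = DOMAIN_SIZE, page_size: int = PAGE_SIZE) -> int:
    """Number of distinct page-aligned stack bases in the domain."""
    return domain_size // page_size


def pick_stack_offset(rng, domain_size: int = DOMAIN_SIZE, page_size: int = PAGE_SIZE) -> int:
    """Pick a page-aligned offset uniformly over the domain."""
    return rng.randrange(count_stack_bases(domain_size, page_size)) * page_size
```

With these values `count_stack_bases()` is 262144, i.e. 256K, so a single blind guess at the stack base succeeds with probability 1/262144.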
29Memory Topology 2 GByte User Space (Win32)
typical
Randomization Domain
0x00041000
0x03600000
0x00100000
Stack Base
30Modifying Static Variable Locations (1)
- Preamble & Postamble: code generated by compiler
- Code Block: built by developer
- Padding: inserted by compiler
Preamble
Code Block
Postamble
Padding
31Modifying Static Variable Locations (2)
- Preamble modified to increase size for local variables
- Code Block modified to use new offsets for local variables
- Postamble stays unchanged
Preamble
Code Block
Postamble
Padding
32Example Original Assembly Code
Preamble
Code Block
Postamble
Padding
33Modified Assembly Code
Preamble
Code Block
Postamble
Padding
34Diversification of the Windows Vulnerability
Environment
Karl Levitt, Hao Chen, Matt Bishop, Zhendong Su, Jeff Rowe, Ivan Balepin, Ebrima Ceesay, Tufan Demir, Bhume Bhumiratana, Lynn Nguyen, Daisuke Nojiri
UC Davis Computer Security Lab
35The Problem
- Microsoft Windows provides the ideal conditions for epidemic cyber-attacks
- Plenty of software vulnerabilities (root level buffer overflows)
- Widespread installation of identical software
- Attack prevention in MS Windows is difficult
- No protection via compiler modification
- No source code for the OS or applications
- A single scripted exploit works against
- Any machine
- All machines
36Diversification of the Windows Vulnerability
Environment
- Windows executables typically call API functions for any significant task
- All API functions are provided in DLLs
- Load address of API functions is not known until the program loads
- Load address of API functions varies from host to host
- Major goal of Windows exploits is to locate the addresses of critical DLL functions
37Multi-Layer Defense Strategy
Prevent Remote Exploit of Memory Errors
Prevent Injected Code from Properly Executing
Prevent Access to Windows DLLs
Prevent Use of Windows DLLs
Prevent the Bypass of DLLs
38Outline
- How Code Red and Slammer work
- Permute IAT and change DLL name strings: defeat known attacks
- Hypothesized attacks that will succeed
- Parameter modification Padding, transformation
- Preventing direct system calls from injected code
- Towards quantitative analysis of our approaches
- Techniques for binary rewriting
- Demonstration
39How does DLL system work?
80000000
stack
kernel32.dll
20000000
LoadLibraryA()
77E9D961
IAT
010031A0
77E9D961
LoadLibraryA
.text
77E80000
Call 010031A0
01001000
heap
00070000
65D60000
00000000
40Code Red Worm
stack
kernel32.dll
Injected code
LoadLibraryA()
77E9D961
20000000
.text
EAT
LoadLibraryA 77E9D961
01001000
heap
KERNEL32
77E80000
00070000
00000000
41SQL Slammer/Sapphire
stack
Injected code
kernel32.dll
77E9D961
LoadLibraryA()
20000000
.text
sqlsort.dll
01001000
IAT
77E9D961
heap
77E80000
00070000
00000000
42Preventing DLL Access
- Add Synthetic Diversity to Windows PE Format
- Permutation of the Import Address Table
- Random String replacement of DLL names and
functions
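A toy model may make the two PE-level transforms concrete. It treats the Import Address Table as a map from slot address to function address, shuffles which slot holds which entry, and patches the call sites to match; names are replaced by random strings. The function names and data layout here are assumptions for illustration only; the real PE File Editor operates on the binary import structures. The two sample slots and targets are the ones shown on the "Permute IAT" slide.

```python
import random
import string


def permute_iat(iat, call_sites, rng):
    """Shuffle IAT slots and patch call sites so legitimate calls still resolve.

    iat: dict mapping slot address -> imported function address.
    call_sites: list of slot addresses referenced by 'call [slot]' instructions.
    """
    slots = sorted(iat)
    shuffled = slots[:]
    rng.shuffle(shuffled)
    remap = dict(zip(slots, shuffled))            # old slot -> new slot
    new_iat = {remap[slot]: target for slot, target in iat.items()}
    new_call_sites = [remap[slot] for slot in call_sites]
    return new_iat, new_call_sites


def randomize_names(names, rng, length=12):
    """Map each DLL or function name to a unique random alphanumeric alias."""
    alphabet = string.ascii_letters + string.digits
    mapping, used = {}, set()
    for name in names:
        alias = "".join(rng.choice(alphabet) for _ in range(length))
        while alias in used:
            alias = "".join(rng.choice(alphabet) for _ in range(length))
        used.add(alias)
        mapping[name] = alias
    return mapping


# Slot and target addresses as they appear on the "Permute IAT" slide.
SAMPLE_IAT = {0x010031A0: 0x77E9D961, 0x0100308C: 0x77E90332}
```

An exploit that hard-codes `call 010031A0` expecting `LoadLibraryA` may now land on a different import, while the patched call sites in the legitimate program still reach their original targets.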
43Randomize Plain Text Strings
PEB
stack
Injected code
KERNEL32 77E80000
a7Ly4SZq19 77E80000
20000000
kernel32.dll
.text
LoadLibraryA()
77E9D961
IAT
LoadLibraryA
4Cu74xIpI9q2
EAT
Call 010031A0
LoadLibraryA 77E9D961
4Cu74xIpI9q2 77E9D961
01001000
heap
KERNEL32
a7Ly4SZq19
77E80000
00070000
00000000
44Permute IAT
80000000
stack
Call 010031A0
kernel32.dll
20000000
77E9D961
LoadLibraryA()
IAT
LoadLibraryA
010031A0
77E90332
.text
77E9D961
GetProcAddress
0100308C
77E90332
GetProcAddress()
77E80000
Call 010031A0
Call 0100308C
01001000
heap
00070000
65D60000
00000000
45Preventing DLL Access
- Add Synthetic Diversity to Windows PE Format
- Permutation of the Import Address Table
- Random String replacement of DLL names and
functions
46Some Assumptions
- Attacks are remote, automated and non-directed
- Attacker cannot observe the execution of valid programs without using system calls
- Processes cannot transition from user mode to kernel mode without using system calls
- Attacker cannot automate static analysis of memory contents
- Modification is limited to binary (or memory) editing; source code is unavailable
47Preventing DLL Access
- Add Synthetic Diversity to Windows PE Format
- Permutation of the Import Address Table
- Random String replacement of DLL names and functions
- Add Diversity to Binary Code
- Randomize Base Addresses
- Reorder code blocks
- Interleave nonfunctional code block
48Hypothesis Operand hijacking
80000000
PEB
stack
Injected code
20000000
kernel32.dll
LoadLibraryA()
77E9D961
IAT
.text
0100308C
77E9D961
77E80000
Call 0100308C
01001000
heap
00070000
65D60000
00000000
49Binary Transformation
80000000
stack
kernel32.dll
20000000
77E9D961
IAT
010031A0
.text
1
3
1
77E80000
2
3
2
65D60000
50Binary Transformation
80000000
stack
kernel32.dll
20000000
77E9D961
IAT
010031A0
.text
3
77E80000
2
1
2
65D60000
51Challenges in Binary Rewriting
80000000
stack
kernel32.dll
20000000
77E9D961
Indirect Jumps
IAT
010031A0
.text
1
3
JMP EAX Call EBX
X
1
77E80000
jmp EAX call EBX
2
Function Pointers
3
2
65D60000
52Our Binary Rewriting Approach
80000000
stack
kernel32.dll
20000000
77E9D961
LoadLibraryA
IAT
010031A0
77E9D961
.text
jmp 697FA0D6
Call 010031A0
77E80000
cmp eax, ebx jde 6687EF03 call 77E9D961 jmp
65D833AE call 77E804AA
53Preventing DLL Access
- What about brute force searches for DLL addresses?
- Insert code and table entries that point to invalid addresses -> page fault, start over
- Attempted execution of inserted dead code blocks -> dereference null pointer, start over
- Landmines: insert code and table entries pointing to similar code segments that actually generate alarms
- Other issues
- Insertion of redundant DLLs
- Runtime diversification
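The value of invalid entries and landmines against a blind search can be estimated with a simple model. Assume (this is a simplification, not a result from the project) that one real table entry is hidden uniformly at random among n decoys, that the attacker cannot tell entries apart, and that touching any decoy crashes the process and forces a restart with a fresh layout. Then each attempt succeeds with probability 1/(n+1), and the expected number of attempts is n+1.

```python
import random
from fractions import Fraction


def success_probability(decoys: int) -> Fraction:
    """Chance that a single blind probe hits the one real entry."""
    return Fraction(1, decoys + 1)


def expected_attempts(decoys: int) -> int:
    """Mean attempts until success when the layout is re-randomized after each crash."""
    return decoys + 1


def simulate(decoys: int, trials: int, rng) -> float:
    """Monte Carlo estimate of the single-probe success rate."""
    hits = 0
    for _ in range(trials):
        table = ["real"] + ["landmine"] * decoys
        rng.shuffle(table)
        if table[0] == "real":   # the attacker's first probe
            hits += 1
    return hits / trials
```

Under this model, nine decoys per real entry cut the per-attempt success rate to one in ten, and every failed attempt is a crash or alarm that the defender can observe.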
54Preventing DLL Access
- Key points
- No performance impact upon running programs
- Access prevention is policy free
- Challenges
- Safety properties of address transformation
- Impact of increased program size in memory
55Preventing DLL Use
- For attacks that locate the proper DLLs, diverse transformations prevent their use
- Parameters passed to DLL functions are transformed per machine
- Asymmetric parameter value transformation
- Additional parameter padding
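The idea behind both transforms can be sketched as follows: legitimate call sites are rewritten to apply a per-machine transformation to each parameter, a stub in front of the DLL function applies the reverse, and injected code that passes raw values ends up calling the function with garbage. The specific transformation below (rotate, then add a key, modulo 2^32) and all function names are assumptions chosen for illustration; the slides do not specify the transformation used.

```python
import random

MASK32 = 0xFFFFFFFF


def rotl32(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & MASK32


def rotr32(x: int, r: int) -> int:
    return ((x >> r) | (x << (32 - r))) & MASK32


def transform(value: int, key: int, rot: int) -> int:
    """Applied at rewritten call sites (rot must be in 1..31)."""
    return (rotl32(value & MASK32, rot) + key) & MASK32


def reverse_transform(value: int, key: int, rot: int) -> int:
    """Applied by the stub in front of the real DLL function."""
    return rotr32((value - key) & MASK32, rot)


def pad_params(params, pad_positions, rng):
    """Insert dummy 32-bit values at the given positions of the final argument list."""
    positions = set(pad_positions)
    real = iter(params)
    total = len(params) + len(positions)
    return [rng.getrandbits(32) if i in positions else next(real) for i in range(total)]


def strip_padding(padded, pad_positions):
    """Drop the dummy values again before the real function sees the arguments."""
    positions = set(pad_positions)
    return [v for i, v in enumerate(padded) if i not in positions]
```

The forward and reverse operations differ, so an attacker who finds only the reverse stub in memory still has to work out the forward transformation and the padding positions for that machine.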
56Function Parameters in Assembly
push EDI push EBX
80000000
call 77E9D961
stack
kernel32.dll
20000000
77E9D961
GetProcAddress()
mov ECX, [EBP+0xc] mov EDX, [EBP+0x8]
IAT
77E9D961
010031A0
.text
push EAX push EBX
77E80000
01001000
call 010031A0
65D60000
00000000
57Parameter Padding
push EDI push EBX
80000000
call 77E9D961
stack
kernel32.dll
20000000
77E9D961
GetProcAddress()
IAT
mov ECX, [EBP+0x14] mov ECX, [EBP+0x10]
77E9D961
010031A0
.text
push EAX push EBX
mov ECX, [EBP+0xc] mov EDX, [EBP+0x8]
push ECX push EDX
77E80000
01001000
call 010031A0
65D60000
00000000
58Asymmetric Parameter Transformation
push EDI push EBX
80000000
call 77E9D961
stack
kernel32.dll
20000000
77E9D961
GetProcAddress()
mov ECX, [EBP+0xc] mov EDX, [EBP+0x8]
IAT
77E9D961
010031A0
.text
jmp 00045234 cmp EDI, EBX jde 00897EF1
Reverse Transformation
77E80000
push EAX push EBX
01001000
call 010031A0
Transformation
00045234
65D60000
00000000
59Preventing DLL Bypass
- Problem: Attacker can provide assembly components that implement some DLL functions, making direct low level (undocumented) Windows system calls
- Trap system interrupts for runtime checking
60Signing System Calls by Location
- Post-load, pre-execute binary instrumentation
- What is instrumented
- System call ID
- In Linux, the system call ID is stored in eax and an interrupt is issued
- We substitute the original syscall_id with signed_id (stored in eax prior to the interrupt)
- Advantage
- Preserves system consistency: programs are modified only after they're loaded
foo.exe
Foo in memory
Normal load and link
Instrument in memory
execute
- Address is the 32-bit address of the location where the system call is made
E_k: fast trapdoor permutation with secret key k
F: {0,1}^32 -> {0,1}^24
token = F(Address)
eax = signed_id = E_k(token || syscall_id)
- syscall_id is only 8 bits (only about 200 syscalls exist in Linux)
61Authentication
- Assume
- Non-bypassability: every time a program makes a system call, we always intercept it before the kernel
- Memory trace inspection: we need to inspect the stack of the program
- Method
- Decrypt the signed_id
- (token || syscall_id) = D_k(eax), where eax contains the signed_id
- Inspect the program stack for the return address and compute the token of that address
- check_token = F(Memory[esp])
- If check_token = token, then
- set eax = syscall_id
- Forward the system call to the kernel
- Otherwise fail
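The signing and authentication steps can be sketched end to end, reading the scheme as a 24-bit token derived from the call-site address, concatenated with the 8-bit syscall ID and passed through a keyed 32-bit permutation. Two substitutions are made here and are assumptions, not the project's choices: the "fast trapdoor permutation" E_k is replaced by a toy 4-round Feistel network keyed with HMAC-SHA256, and F is a truncated SHA-256 hash. The sketch also treats the signing address and the return address found on the stack as the same value.

```python
import hashlib
import hmac


def F(address: int) -> int:
    """Map a 32-bit address to a 24-bit token (stand-in for the slide's F)."""
    digest = hashlib.sha256(address.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:3], "big")


def _round(key: bytes, rnd: int, half: int) -> int:
    """Keyed 16-bit round function for the toy Feistel network."""
    msg = bytes([rnd]) + half.to_bytes(2, "big")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:2], "big")


def E(key: bytes, value: int) -> int:
    """Toy keyed permutation on 32 bits (stand-in for E_k)."""
    left, right = value >> 16, value & 0xFFFF
    for rnd in range(4):
        left, right = right, left ^ _round(key, rnd, right)
    return (left << 16) | right


def D(key: bytes, value: int) -> int:
    """Inverse of E."""
    left, right = value >> 16, value & 0xFFFF
    for rnd in reversed(range(4)):
        left, right = right ^ _round(key, rnd, left), left
    return (left << 16) | right


def sign(key: bytes, address: int, syscall_id: int) -> int:
    """signed_id = E_k(token || syscall_id), with an 8-bit syscall_id."""
    return E(key, (F(address) << 8) | (syscall_id & 0xFF))


def authenticate(key: bytes, eax: int, return_address: int):
    """Return the real syscall_id if the token matches the caller, else None."""
    plain = D(key, eax)
    token, syscall_id = plain >> 8, plain & 0xFF
    return syscall_id if F(return_address) == token else None
```

A signed ID replayed from a different address, or a raw unsigned ID placed in eax by injected code, decrypts to a token that (except with probability about 2^-24) does not match the caller, so the call is rejected.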
62Limitations
- Only authenticates whether the system call is made from the original source or not. The attacker can still use a library function to make the system call.
- Possible solution is to inspect further up the stack.
63Cross Layer Commonalities
- Address Obfuscation (to defeat trivial static analysis of memory)
- Insertion of non-functional blocks
- Basic block permutation
- Permutation and insertion of dummy function parameters
- Run-time obfuscation
- Fail-crash Detection Mechanism (to defeat brute force trial and error)
- Insert invalid virtual address (page fault)
- Execution of dead code (deref. null pointer error)
- Deliberate landmines
64Evaluation
- Use the DETER Testbed
- Based upon the Emulab technology with extra security controls
- On demand large scale testbed for the testing and evaluation of security tools
- Deploy hundreds (thousands) of hosts with diverse configurations
- Try single attacks against all machines
- Red Team could launch automated worm attacks against all machines
65320 Virtual Node Experiment
- Each node has its own OS, filesystem, processes, network interfaces
- 32 gateways
- 10 hosts per gateway
- Emulate 320 nodes in total
- Colocate factor 10
- Use 44 physical nodes
66Status
- We have implemented and demonstrated diversification of Windows PE format for attack prevention.
- We are implementing the reordering of binary code blocks and insertion of dead code blocks.
- We are implementing the asymmetric transformation of function parameters using dummy functions.
- Investigating potential non-bypassability mechanisms
67Probability of Successful Attacks
- Pr(A) = Pr(V) / (DE(A) × PEE(A))
- Success probability of attack A exploiting vulnerability V
- DE denotes derandomization effort
- Range of randomization of addresses involved in A
- Requires randomization to change after each successful derandomization
- If rerandomization happens after k attempts, multiply Pr(A) by k
- No rerandomization => effect of DE and PEE is additive
- PEE denotes payload execution effort
- Attempts to successfully execute attack payload
- Note: System susceptibility to attacks can be reduced without addressing every vulnerability
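A small calculator shows how the terms combine, assuming the slide's formula reads Pr(A) = Pr(V) / (DE(A) × PEE(A)) and that rerandomizing only every k attempts multiplies the result by k. The function name and the sample numbers are illustrative; only the 256K stack-base count comes from elsewhere in this deck.

```python
from fractions import Fraction


def attack_success_probability(pr_v, de, pee, attempts_per_rerandomization=1):
    """Pr(A) under the assumed reading of the slide, capped at 1.

    pr_v: probability that vulnerability V is present and reachable.
    de:   derandomization effort (size of the randomized range).
    pee:  payload execution effort (attempts needed to run the payload).
    """
    p = Fraction(pr_v) * attempts_per_rerandomization / (Fraction(de) * Fraction(pee))
    return min(p, Fraction(1))
```

For example, with Pr(V) = 1, DE = 262144 (the stack-base count from the rebasing slide) and a hypothetical PEE of 100, a single attempt succeeds with probability 1/26214400; allowing ten attempts between rerandomizations raises that tenfold.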
68What can the injected code do?
- DLL access
- Walk PEB to learn base addresses
- O(dlls loaded)
- DLLs (except a few) will be renamed, preventing search by name
- Intercept dynamically loaded ones to rename
- Can increase the number of loaded dlls
69Cont'd
- Access to functions in DLLs
- Walk IAT of the application
- O(entries in IAT)
- Permute the IAT
- Can increase the size of IAT
- Add invalid addresses
- Scan the code section for call imm_addr
- Replace imm_addr with computed goto
- Force static analysis
70State-of-Art in Binary Analysis Transformation
71Motivation
- No source code needed
- Language-neutral (C, C++ or other)
- Can be largely independent of OS
- Ideally, would provide instruction-set independent abstractions
- This ideal is far from today's reality
- Applications in
- Instrumenting long-running programs
- Legacy code migration
- Program optimizations
- Security
- Program obfuscation, security-enhancing
transformations
72Approaches
- Static analysis/transformation
- Binary files are analyzed/transformed
- Benefits
- No runtime performance impact
- No need for runtime infrastructure
- Weakness
- Prone to error, problem with checksums/signed code
- Dynamic analysis/transformation
- Code analyzed/transformed at runtime
- Benefit more robust/accurate
- Weakness
- Some runtime overhead
- Runtime infrastructure needed
73Previous Works (Static)
- OM/ATOM (DEC WRL)
- Proprietary and probably outdated
- EEL (Jim Larus et al, PLDI 95)
- The precursor of most modern rewriters
- Targets RISC (SPARC)
- Provides processor independent abstractions
- Follow up works
- UQBT (for RISC)
- LEEL (for Linux/i386)
74Previous Works (Static)
- WISA (U. Wisconsin)
- Uses EEL for SPARC
- Uses IDAPro+CodeSurfer for x86
- Etch (U. of Washington): x86/Windows
- Application in performance optimization
- Does not seem to be active any more
- PLTO/SOLAR (U. Arizona)
- Linux/x86, but has limitations (e.g. static linking)
- Brew (Stony Brook)
- Disassembly+RAD implementation
- Various tools for Java
- BCEL seems most advanced
75Previous Works (Dynamic)
- DynInst (U. Maryland, U. Wisconsin)
- Instrumentation of running programs
- Provides OS/architecture independent abstractions for instrumentation
- LibVerify (Bell Labs/RST Corp)
- Runtime rewriting for StackGuard
- DynamoRIO (HP Labs/MIT), Strata (UVA)
- Disassembles basic blocks at runtime
- Provides API to hook into this process and transform executable
- Used in Program Shepherding (USENIX Sec '02)
76Most Active Research Groups
- WISA project (Wisconsin)
- Somesh Jha, Tom Reps
- DynInst project (Wisconsin/Maryland)
- Barton Miller
- SOLAR project (U. Arizona)
- Saumya Debray
- DynamoRIO (HP/MIT)
- Tool available (binary form), Linux/Win32
- Strata (UVA)
- Tool claimed to be available in source form
77Most Promising Tools
- BREW Stony Brook
- DynamoRIO
- IDAPro/CodeSurfer (Commercial)
- DynInst is robust, but capabilities limited for our purpose
- Strata may be good, and is supposedly available in source code, but may not be as mature as DynamoRIO
78Phases in Static Analysis of Binaries
- Disassembly
- Instruction decoding/understanding
- Insertion of new code
79Questions You Might Ask
- How many variants does defense require?
- 100 by contract, but the more the better
- Why design in depth?
- Belt and suspenders, but get multiplicative advantage from multiple randomization -- hopefully
- How do we assure defense achieves multiplicative effect from multiple stages of randomization?
- From different phenomena: fail-crash (attacker has to retry attack), independence of stages
- How do we achieve multiplicative effect within a stage, e.g., IAT randomization?
- E.g., prevent attacker from doing DE for a DLL at a time, e.g., Kernel32
80Questions You Might Ask (2)
- How often does defense re-randomize?
- Depends on cost of re-randomization (down-time), DE, number of variants needed
- What is the cost of randomization?
- Low for IAT randomization, low for parameter padding, unknown but determinable for other kinds, e.g., parameter value transformation, return address authentication to prevent system calls from injected code
- How does defense do control flow randomization for subtle optimized code?
- Potentially difficult because static analysis of binary code is hard, but will accept only sound transformations
- Can attacker do de-randomization in payload?
- Very unlikely
81Questions You Might Ask (3)
- What if attacker obtains defense's randomization algorithm and sample randomized code -- known plaintext and ciphertext attack?
- Through static analysis he might generate all variants, but cannot use them in a payload to compromise more than a fraction of the hosts; this depends on the fail-crash assumption
- Can attacker bypass randomization stages?
- Hopefully this is achieved only if kernel is not secure
- How do we verify this assertion?
- Careful analysis and lots of testing
- Is all this new?
- Builds on existing obfuscation work, but much is new: defense in depth, parameter transformation, random space analysis
82Questions You Might Ask (4)
- Do we have the staff to investigate the numerous issues posed?
- ???
- Do we have a plan for all this?
- Yes!!!
83DAWSON Next Steps
- Continue developing automated transforms to break exploit specifications
- Implement five key transforms and evaluate cost and effectiveness
- Evaluate alternative annotation approaches and implement
- Look specifically at non-buffer overflow attacks
- Continue developing automated transforms to break payload specifications
- Prevent brute force searches
- Obfuscating code and landmines
- Evaluate integration approaches
- PE Editor style v. Brew style (v. DynamoRIO style)
- Integrate transforms and test
- Pre-loader preferred
- Loader hooks, if absolutely required, or DynamoRIO style
84Demonstrations Tonight
Thank You!
85Backup
86Collberg, Thomborson, Low
- First systematic studies of Java code obfuscation
- Produced taxonomy (layout, control flow, data, and preventative transforms)
- Low-cost, stealthy opaque constructs
- Techniques for obscuring data structures and abstractions
- Measured effectiveness using software complexity metrics
87Wang
- Studied malicious host problem to protect trusted probe communicating with trusted host
- Key threats: impersonation, intelligent tampering, input spoofing; not DOS or random tampering
- Input spoofing is, in general, unsolvable, but
- If spoofing input x requires solving the algorithm-secrecy or execution-integrity problem, then techniques to ensure the latter can be used to counteract input spoofing. However, there are applications where this is not possible.
- Pervasive aliasing enabled proof: precise analysis of transformed program (e.g., CFG) is NP-hard
- Replacing 50% of branches =>
- Execution time 4X
- Size 2X
- Wroblewski extended ideas and implemented purely sequential, controllable approach that worked on binary code
88Linn and Debray
- Rewrote binaries (IA-86) to disrupt major static disassembly approaches (linear sweep and recursive traversal)
- Best commercial tools failed on 65% of instructions and 85% of functions
- Execution times 1.13 X
- Executable size 1.15-1.20 X
89Digital Rights Management
- Malicious host is key problem in DRM
- White box cryptography approach
- Chow et al.
- Notwithstanding Barak, can provide useful commercial levels of security
- Obscured DES and AES algorithms
- Jacobs et al.
- Broke obscured DES but showed general problem of retrieving data from circuits is NP-hard
- Admitted that, in practice, usually easy
- Link and Neumann improved on Chow
90Barak et al.
- Seminal proof showed
- Impossibility of completely obscuring code
- No general obfuscator possible
- Badger et al. began to extend Wang's work
- Unable to prove minimum resistance time to reverse engineering effort
- Redirected to review obfuscation work (tour de force report)
91Mitigating Vulnerabilities in Code
- Forrest et al. randomized stack resident data addresses via modified gcc compiler
- Chew and Song randomized stack base address, system call numbers, and library entry points via modifying Linux loader and kernel system call table and binary rewriting
- Xu et al. modified Linux kernel to randomize base addresses of program regions
- Approaches still vulnerable to relative address attacks
92Forrest et al.
- Scrambled executable (prn), then unscrambled through modified code emulator (x86)
- Speed 1.05 X
- Memory usage 3 X
- Discussed danger of generating valid instruction during scrambling but did not see experimentally
- Kc produced similar results
93Bhatkar et al.
- Key difference between program obfuscation and address obfuscation is that program obfuscation is oriented towards preventing most static analyses of a program, while address obfuscation has a more limited goal of making it impossible to predict the relative or absolute addresses of program code and data. Other analyses, including reverse compilation, extraction of flow graphs, etc., are generally not affected by address obfuscation
- Focused on memory error exploits
- Randomized absolute/relative addresses in Linux binary code
- Approach offered protection against classic attacks
- Stack smashing, existing code exploits, format string, data modification, heap overflow, double-free, integer overflows
- Data modification attacks still possible but Etoh and Yoda approach could help
94Performance of Bhatkar Transforms
Combination 1: link time static relocation of stack, heap and code regions with random gaps in stack frames
Combination 2: load time dynamic relocation of above
95Vulnerabilities and Exploits
- Aleph One, "Smashing The Stack For Fun And Profit," Phrack 49, Volume Seven, Issue Forty-Nine, File 14 of 16, 11/8/1995
- David Litchfield, "Defeating the Stack-Based Overflow Prevention Mechanism of Microsoft Windows 2003 Server," NGS Research Whitepaper, August 9, 2003, http://www.nextgenss.com/papers.htm
- Mudge, "How To write buffer overflows," http://www.insecure.org/stf/mudge_buffer_overflow_tutorial.html, 10/20/1995
- w00w00, "Heap Overflow," http://www.w00w00.org/files/articles/heaptut.txt, 1/1999
- Ryan Permeh, Marc Maiffret, "Code Red Disassembly Analysis," eEye Digital Security, http://www.eeye.com/html/advisories/codered.zip
- Stuart Staniford, Nicholas Weaver, Vern Paxson, "Flash Worms: Is there any Hope?" Silicon Defense, retrieved 27 March 2003, http://silicondefense
- Stuart Staniford, Vern Paxson, Nicholas Weaver, "How to Own the Internet in Your Spare Time," Proceedings of the 11th USENIX Security Symposium, August 2002, retrieved 27 March 2003, http://www-dirt.cs.unc.edu/netlunch/fall02/SPW02-worms.htm
96Software Fault Tolerance: N-version Programming
- A. Avizienis, Fault Tolerance and Fault Intolerance: Complementary Approaches to Reliable Computing, Proc. 1975 Int. Conf. Reliable Software, Los Angeles, CA, Apr 21-27, 1975, pp. 458-464
- A. Avizienis, The N-Version Approach to Fault-Tolerant Software, IEEE Transactions on Software Engineering, Vol. SE-11, No. 12, Dec 1985, pp. 1491-1501
- V. Bharathi, N-Version Programming Method of Software Fault Tolerance: A Critical Review, Indian Institute of Technology, Kharagpur 721302, December 28-30, 2003
- L. Chen and A. Avizienis, N-version programming: A fault-tolerance approach to reliability of software operation, IEEE 8th FTCS, pp. 3-9, 1978
- J.C. Knight and N.G. Leveson, A Large Scale Experiment In N-Version Programming, Digest of Papers FTCS-15: Fifteenth International Symposium on Fault-Tolerant Computing, June 1985, Ann Arbor, MI, pp. 135-139
- J.C. Knight and N.G. Leveson, An Experimental Evaluation of the Assumption of Independence in Multi-version Programming, IEEE Transactions on Software Engineering, Vol. SE-12, No. 1 (January 1986), pp. 96-109
- M.R. Lyu, J.-H. Chen, and A. Avizienis, Software diversity metrics and measurements, Proc. Sixteenth Annual Int. Computer Software and Applications Conf., 1992, pp. 69-78
97Obfuscation -- Java Code
- C. Collberg, C. Thomborson, and D. Low, A Taxonomy of Obfuscating Transformations, Technical Report 148, Department of Computer Science, University of Auckland, July 1997
- C. Collberg, C. Thomborson, and D. Low, Manufacturing Cheap, Resilient, and Stealthy Opaque Constructs, Department of Computer Science, University of Auckland, ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL'98), January 1998
- C. Collberg, C. Thomborson, and D. Low, Breaking Abstractions and Unstructuring Data Structures, Proceedings of the 1998 International Conference on Computer Languages, pages 28-38, IEEE Computer Society Press, May 1998
- Larry D'Anna, Brian Matt, Andrew Reisse, Tom Van Vleck, Steve Schwab, Patrick LeBlanc, Self-Protecting Mobile Agents Obfuscation Report, Final report, Network Associates Laboratories, Report 03-015, June 30, 2003
- Lee Badger, Larry D'Anna, Doug Kilpatrick, Brian Matt, Andrew Reisse, Tom Van Vleck, Self-Protecting Mobile Agents Obfuscation Techniques Evaluation Report, Network Associates Laboratories, Report 01-036, Nov 30, 2001, updated March 22, 2002
- Douglas Low, Java Control Flow Obfuscation, MS Thesis, Univ. Auckland, 3 June 1998
98Obfuscation -- Protecting Software
- Boaz Barak, Oded Goldreich, Russell Impagliazzo, Steven Rudich, Amit Sahai, Salil Vadhan, and Ke Yang, On the (im)possibility of obfuscating programs, in J. Kilian, editor, Advances in Cryptology: CRYPTO '01, Lecture Notes in Computer Science, Springer-Verlag
- Stanley Chow, Philip A. Eisen, Harold Johnson, Paul C. van Oorschot, A White-Box DES Implementation for DRM Applications, Digital Rights Management Workshop 2002, pp. 1-15
- S. Chow, P. Eisen, H. Johnson and P.C. van Oorschot, White-Box Cryptography and an AES Implementation, Proceedings of the Ninth Workshop on Selected Areas in Cryptography (SAC 2002)
- Matthias Jacob, Dan Boneh, and Edward Felten, Attacking an obfuscated cipher by injecting faults, 2002 ACM Workshop on Digital Rights Management, Washington, D.C., 2002
- Hamilton E. Link and William D. Neumann, Clarifying Obfuscation: Improving the Security of White-Box Encoding, Sandia National Laboratories, Albuquerque, NM, downloaded from eprint.iacr.org/2004/025.pdf
- Chenxi Wang, A Security Architecture for Survivability Mechanisms, PhD thesis, University of Virginia, October 2000
- Chenxi Wang, Protection of software-based survivability schemes, Proceedings of 2001 Dependable Systems and Networks, Gothenburg, Sweden, July 2001
- Gregory Wroblewski, General Method of Program Code Obfuscation, PhD Dissertation, Wroclaw University of Technology, Institute of Engineering Cybernetics, 2002
- Gregory Wroblewski, General Method of Program Code Obfuscation, 2002 International Conference on Software Engineering Research and Practice (SERP'02), June 24-27, 2002, Monte Carlo Resort, Las Vegas, Nevada, USA
- Cullen Linn, Saumya Debray, Obfuscation of Executable Code to Improve Resistance to Static Disassembly, ACM Conference on Computer and Communications Security, Washington DC, October 27-31, 2003
99Source Code Transforms to Mitigate Vulnerabilities
- M. Chew, D. Song, Mitigating Buffer Overflows by Operating System Randomization, Technical Report CMU-CS-02-197
- Hiroaki Etoh and Kunikazu Yoda, Protecting from stack smashing attacks, published on World-Wide Web at URL http://www.trl.ibm.com/projects/security/ssp/main.html, June 2000
- Stephanie Forrest, Anil Somayaji, and David H. Ackley, Building diverse computer systems, in 6th Workshop on Hot Topics in Operating Systems, pages 67-72, Los Alamitos, CA, 1997, IEEE Computer Society Press
- Selvin George, David Evans, Steven Marchette, A Biological Programming Model for Self-Healing, First ACM Workshop on Survivable and Self-Regenerative Systems (in association with 10th ACM Conference on Computer and Communications Security), October 31, 2003, George W. Johnson Center, George Mason University, Fairfax, VA
- PaX, published on World-Wide Web at URL http://pageexec.virtualave.net, 2001
- Jun Xu, Z. Kalbarczyk and R. K. Iyer, Transparent Runtime Randomization for Security, Proc. of 22nd Symposium on Reliable and Distributed Systems (SRDS), Florence, Italy, October 6-8, 2003
- StackGuard, Libverify, RAD, PointGuard, MS C compiler
- Peter Silberman and Richard Johnson, A Comparison of Buffer Overflow Prevention Implementations and Weaknesses, I-Defense, 1875 Campus Commons Dr. Suite 210, Reston, VA 20191, http://www.blackhat.com/presentations/bh-usa-04/bh-us-04-silberman/bh-us-04-silberman-paper.pdf
100Run-time Transforms to Mitigate Vulnerabilities
- Elena Gabriela Barrantes, David H. Ackley, Stephanie Forrest, Trek S. Palmer, Darko Stefanovic and Dino Dai Zovi, Randomized instruction set emulation to disrupt binary code injection attacks, 10th ACM Conference on Computer and Communications Security, Washington DC, October 27-31, 2003
- Sandeep Bhatkar, Daniel C. DuVarney, and R. Sekar, Address Obfuscation: An Efficient Approach to Combat a Broad Range of Memory Error Exploits, 12th USENIX Security Symposium, August 2003
- Gaurav S. Kc, Angelos D. Keromytis, Vassilis Prevelakis, Countering Code-Injection Attacks with Instruction-Set Randomization, 10th ACM Conference on Computer and Communications Security, Washington DC, October 27-31, 2003
101CMU Ballista Study
- Most production-quality operating system and core library code exhibits large numbers of flaws in validating input, call order, etc.
- Specification-driven testing verifies this result.
102SANS Top 10 Vulnerabilities to Windows Systems
- W1 Web Servers & Services
- W2 Workstation Service
- W3 Windows Remote Access Services
- W4 Microsoft SQL Server (MSSQL)
- W5 Windows Authentication
- W6 Web Browsers
- W7 File-Sharing Applications
- W8 LSAS Exposures
- W9 Mail Client
- W10 Instant Messaging
103SANS Top Vulnerabilities to UNIX Systems
- U1 BIND Domain Name System
- U2 Web Server
- U3 Authentication
- U4 Version Control Systems
- U5 Mail Transport Service
- U6 Simple Network Management Protocol (SNMP)
- U7 Open Secure Sockets Layer (SSL)
- U8 Misconfiguration of Enterprise Services NIS/NFS
- U9 Databases
- U10 Kernel