Diversity Algorithms for Worrisome Software and Networks DAWSON

Slides: 104

Provided by: markco54

1
Diversity Algorithms for Worrisome Software and Networks (DAWSON)

James Just, Mark Cornwell, Jason Minto, Art Torrey
Global InfoTek, Inc.
Karl Levitt, Jeff Rowe, Tufan Demir
UC Davis
R. Sekar
Consultant (SUNY Stony Brook)
27 January 2005

2
Overview

Project overview
Integration framework
Diversity to break exploits
Diversity to break payloads
Framework for analyzing effectiveness
Binary rewriting review
Next steps

3
Problem Space (1): Excessive Homogeneity => Systemic Vulnerability
How do we prevent exponentially cascading failures?

Attacks exploit the dense environment with ease to spread fast and/or far
Foreseeable cyber-risks are dominated by a static, durable monoculture of executables

4
Problem Space (2): Common Mode Failures Impede Intrusion Tolerant Systems

Intrusion tolerant systems
Are expensive
Have large hw/sw footprints
Assume a priori knowledge of attack modalities
Success depends on availability of spare components
Assumption of independent intrusions/faults is flawed
Availability of diverse commercial spares limits effectiveness even if the intrusion tolerant system is affordable
Rapid learning of attack signatures for blocking is hard
Custom N-version programming is costly

5
DAWSON Approach

Randomized transforms of Windows executables at runtime
Preserve functionality of executable modules (e.g., DLLs)
Transform binary code, machine addresses, names, etc.
Use annotations to facilitate
Pseudo-random numbers produce unique transformations on each application restart
Network protocol diversity effort replaced by breaking payload execution
Goal: beat the program metric by 10X for a large fraction of the exploit space if transforms are focused
100 functional equivalents with no more than 3 susceptible to the same exploit as the baseline code for most exploits
Low-overhead transforms (runtime performance)

6
Attack Space of Interest: Memory Error Exploits
Memory corruption attacks

Corrupt target of existing pointer
Compromise security critical data
File names opened for write or execute
Security credentials -- has the user
authenticated himself?

Corrupt a pointer value
Includes common buffer overflows, strncpy(),
off-by-one, cast screw-up, format strings,
double-free, return to libc, other heap structure
exploits

Corrupt code pointer
Return address
Function pointer
Dynamic linkage tables (GOT, IAT)

Corrupt data pointer
Frame pointer
Local variables, parameters
Pointer used to copy input

Pointer to injected data

Pointer to existing data
Example: corrupt string arguments to functions so that they point to attacker-desired data already in memory, e.g., "/bin/sh", "/etc/passwd"

Pointer to injected code
Pointer to existing code
7
Evaluation

Identify assumptions and ROE for possible Red
Teaming
Internal testing with
Fabricated applications with known
vulnerabilities and exploits
Real applications with known vulnerabilities and
exploits
Possible use of Emulab or Deter network emulation
testbeds

8
Some Assumptions & Red Team ROEs

Attacks are remote, automated and non-directed
Attacker cannot observe the execution of valid
programs without using system calls
Processes cannot transition from user mode to
kernel mode without using system calls
Attacker cannot automate non-trivial static
analysis of memory contents
Modification is limited to binary (or memory) editing; source code is unavailable

9
Status

Interim products
Native Windows (MFC, .Net) PE File Editor
Transforms
Automated permutation of the Import Address Table
in PE files
Automated replacement of DLL names and functions
with random strings in PE files
Local variable location modification not quite
automated yet
In-process
Reordering of binary code blocks and insertion of
dead code blocks
Asymmetric transformation of function parameters
using dummy functions.
New insights on requirements
Address obfuscation (to defeat trivial static
analysis of memory)
Fail-crash detection mechanism (to defeat brute
force trial and error)
Non-by-passability of transform mechanisms
Balzer wrapper mechanism; UC Davis is investigating an option from another project
Because of perceived higher value, shifted some
effort to developing diversity to break payloads
rather than diversity for network protocols

10
Transition and Future Work

Interim: use of intermediate products by other SRS contractors
Integrate with follow-on projects/products
If successful
Package for military users
Possible GITI commercial product
Possible open source toolkit approach
Transition to Microsoft or other software vendors
Some expressions of VC interest
Standard research publications

11
DAWSON Project Schedule & Milestones

[Schedule chart: baseline tasks (1. Requirement Refinement, 2. Exploit Diversification, 3. Payload Diversification, 4. Integration, 5. T&E, Program Mgt., Prototypes) plotted by quarter across FY04, FY05 and FY06, with milestones numbered 1, 2 and 3.]
12
GITI
13
Current Attack Problem
[Diagram labels: Software Specification, Source Code, Machine Code Specification, Executable Code, Loader, Machine-level Code, Known V-Spec, Vulnerability, Exploit Payload]
14
Vulnerability Specification?

All aspects of a program execution which can be
exploited by malicious code to gain control of
the Program Counter, e.g.,
Memory Topology
Stack Specification
System APIs
Application APIs
Libraries
Exception Handling
Etc

15
Breaking the V-Spec
[Diagram labels: Software Specification, Source Code, Machine Code Specification, Executable Code, Known V-Spec, Transforming Loader, Transform Specifications, many Machine-level Code variants, Vulnerability, Vulnerability??, Exploit Payload, "Doesn't Match" (marked X)]
V-Spec Unknown Until Load-Time
16
Transform Techniques in Literature

Obfuscation
Layout obfuscation (scramble identifiers, remove
comments, change formats)
Control flow obfuscations (Statement grouping,
ordering, computation, opaque constructs)
Data obfuscation (Storage, encoding, grouping,
ordering)
Preventative transformations (prevent decompilers
from operating by exploiting weaknesses)
Inherent (aliases, variable or bogus
dependencies, opaqueness side effect
difficulty)
Targeted
Source code
N-version programming
Functional-behavior preserving diversity in
components used (e.g., different encryption
algorithms, different scales for data such as
Celsius or Fahrenheit)
Semantics preserving source code transformations
Place sensitive data (such as function and data
pointer) below the starting address of any buffer
Variable ordering
Equivalent instructions
Variable compilation --Variable internal names,
padding and addresses, linking orders
Insertion of opaque constructs or other dead code
to change memory layout
Binary code
Address transformations (relative and absolute)
on binary code
Randomize base address of memory regions (Stack,
Heap, DLL, routines/static data in executable)

References shown on later slides
17
Multi-Layer Defense Strategy
Prevent Remote Exploit of Memory Errors
GITI
Prevent Injected Code from Properly Executing
Prevent Access to Windows DLLs
UC Davis
Prevent Use of Windows DLLs
GITI UC Davis
Prevent the Bypass of DLLs
18
Diversity System Functional Architecture: Normal
Normal user inputs are translated
Modified loader transforms original stored
program and generates wrapper that retranslates
external calls
untranslated so they work
[Diagram labels: User Inputs, Other System Resources, Original Program, Modified Loader, Transformed In-memory Program, Annotation File, PRN]
19
Diversity System Functional Architecture: Initial Exploit
Modified loader transforms original stored program and generates wrapper that retranslates external calls
Some attacks fail because the assumed vulnerability is gone
[Diagram labels: Attacker, Other System Resources, Original Program, Modified Loader, Transformed In-memory Program, Annotation File, PRN]
20
Diversity System Functional Architecture: Payload Execution
Modified loader transforms original stored program and generates wrapper that retranslates external calls
Other attacks fail because injected commands are wrong
[Diagram labels: Attacker, Other System Resources, Original Program, Modified Loader, Transformed In-memory Program, Annotation File, PRN]
21
DAWSON Implementation Concept (I)
Approach eases integration of various transform techniques
[Diagram labels: DAWSON Randomizer (PE File Editor); Original Binary Program on Disk; Program Annotation File; PE File Macro Randomizer; PE File Component Randomizers; APIs; Transform Techniques 1 to N applied to Objects 1 to K; PRNG; Modified Binary Program in Memory or on Disk]
22
DAWSON Implementation Concept (2)
Manual Approach How best to automate?
Original Binary Code
Disassembled Code
(Optional) Decompiled Code
Structure (Pattern) Analyzer
Structure (Pattern) Modifier
(Optional) Recompiled Code
Reassembled Code
Modified Binary Code
23
DAWSON Implementation Concept (2)
Automated Runtime Randomization
Binary Rewriting Randomizer
Modified Binary Code
Original Binary Code
Disassembled Code
(Optional) Decompiled Code
Structure (Pattern) Analyzer
Structure (Pattern) Modifier
Annotation File
24
Automated Defensive Transformations
25
Automated Layout Randomizations
26
Exploit Breaking Transforms

Randomize base of stack by a large number
(preferably, by 100MB or more for single-threaded
programs)
Randomize locations of installed DLLs
Manage all of the DLLs installed on a system
Ensure that they get mapped to non-overlapping
locations
Change the mappings periodically
Need simple management tools to make all of this
happen
Randomize location of functions in the
executable.
Randomize base of the heap and the distance
between two successively allocated heap blocks
Randomize location of static variables in the
executable

27
Transform Characterization

Type of transformation?
What object is transformed?
When can transformation occur?
Pre-load, load, post-load?
Are annotations or compiler support required?
What types of exploits, payloads are impacted?
How difficult is implementation of transform?

28
Random Stack Rebasing

Linear Randomization over the Range
4K Byte Granularity
Effective address domain is approximately 1.0 GB, limited by the demands of other process segments
Approximately 256K distinct bases are possible (1.0 GB / 4 KB)
Two approaches examined
One approach requires modification of the loader to implement in the NT/2000/XP environments
Second approach increases the stack reserve space in the PE file and decrements ESP; it does not require loader modification

Note that stack rebasing can be implemented
directly in the PE file for DOS applications
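
The arithmetic on this slide can be checked with a short sketch. It is only a model of the second approach (enlarged stack reserve plus an ESP decrement). The constant and function names are illustrative assumptions, not part of the DAWSON tooling, and no real PE file or register is touched.

```python
import secrets

PAGE_SIZE = 4 * 1024                  # 4 KB granularity, from the slide
DOMAIN_SIZE = 1 << 30                 # about 1.0 GB of usable range, from the slide
NUM_BASES = DOMAIN_SIZE // PAGE_SIZE  # 262,144, i.e. 256K distinct bases


def random_stack_offset() -> int:
    """Pick a page-aligned offset uniformly from the randomization domain."""
    return secrets.randbelow(NUM_BASES) * PAGE_SIZE


def rebased_esp(original_esp: int, offset: int) -> int:
    """Model of the second approach: move ESP down by the chosen offset.

    Assumes (hypothetically) that the stack reserve in the PE header was
    already enlarged so the lower addresses are valid stack memory.
    """
    return original_esp - offset
```

Each process start would draw a fresh offset, so an exploit that hard-codes a stack address has roughly a 1 in 256K chance of guessing the base under these assumptions.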
29
Memory Topology 2 GByte User Space (Win32)
typical
Randomization Domain
0x00041000
0x03600000
0x00100000
Stack Base
30
Modifying Static Variable Locations (1)

Preamble & Postamble: code generated by compiler
Code Block: built by developer
Padding: inserted by compiler

Preamble
Code Block
Postamble
Padding
31
Modifying Static Variable Locations (2)

Preamble: modified to increase size for local variables
Code Block: modified to use new offsets for local variables
Postamble: stays unchanged

Preamble
Code Block
Postamble
Padding
32
Example Original Assembly Code
Preamble
Code Block
Postamble
Padding
33
Modified Assembly Code
Preamble
Code Block
Postamble
Padding
34
Diversification of the Windows Vulnerability
Environment
Karl Levitt, Hao Chen, Matt Bishop, Zhendong Su, Jeff Rowe, Ivan Balepin, Ebrima Ceesay, Tufan Demir, Bhume Bhumiratana, Lynn Nguyen, Daisuke Nojiri
UC Davis Computer Security Lab
35
The Problem

Microsoft Windows provides the ideal conditions
for epidemic cyber-attacks
Plenty of software vulnerabilities (root level
buffer overflows).
Widespread installation of identical software
Attack prevention in MS Windows is difficult
No protection via compiler modification
No source code for the OS or applications
A single scripted exploit works against
Any machine
All machines

36
Diversification of the Windows Vulnerability
Environment

Windows executables typically call API functions
for any significant task
All API functions are provided in DLLs.
Load address of API functions is not known until
the program loads
Load address of API functions varies from host to
host
Major goal of Windows exploits is to locate the
addresses of critical DLL functions

37
Multi-Layer Defense Strategy
Prevent Remote Exploit of Memory Errors
Prevent Injected Code from Properly Executing
Prevent Access to Windows DLLs
Prevent Use of Windows DLLs
Prevent the Bypass of DLLs
38
Outline

How Code Red and Slammer work
Permute IAT and change DLL name strings: defeat known attacks
Hypothesized attacks that will succeed
Parameter modification: padding, transformation
Preventing direct system calls from injected code
Towards quantitative analysis of our approaches
Techniques for binary rewriting
Demonstration

39
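How does the DLL system work?
[Memory-map diagram, hex addresses from 80000000 down to 00000000. Labels: stack; kernel32.dll (77E80000) containing LoadLibraryA() at 77E9D961; IAT slot 010031A0 holding 77E9D961 (LoadLibraryA); .text (01001000) containing "Call 010031A0"; heap (00070000); other marked addresses 65D60000 and 20000000]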
40
Code Red Worm
[Memory-map diagram. Labels: stack holding injected code; KERNEL32 (77E80000); kernel32.dll EAT entry "LoadLibraryA 77E9D961"; LoadLibraryA() at 77E9D961; .text (01001000); heap (00070000); marked addresses 20000000 and 00000000]
41
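SQL Slammer/Sapphire
[Memory-map diagram. Labels: stack holding injected code; kernel32.dll (77E80000) containing LoadLibraryA() at 77E9D961; sqlsort.dll with an IAT entry holding 77E9D961; .text (01001000); heap (00070000); marked addresses 20000000 and 00000000]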
42
Preventing DLL Access

Add Synthetic Diversity to Windows PE Format
Permutation of the Import Address Table
Random String replacement of DLL names and
functions
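
A minimal sketch of the two PE-format transforms named above, using plain Python dictionaries in place of a real PE file. The slot addresses 010031A0 and 0100308C and the two import names are taken from the diagrams in this deck; everything else (function names, alias length, the use of `random.Random`) is an illustrative assumption and not the DAWSON implementation.

```python
import random
import string


def permute_iat(iat, call_sites, rng):
    """Shuffle which slot holds which import and patch the call sites.

    iat        -- dict: slot address -> imported function name
    call_sites -- slot addresses referenced by 'call [slot]' instructions
    rng        -- random.Random instance standing in for the PRN source
    """
    slots = sorted(iat)
    shuffled = slots[:]
    rng.shuffle(shuffled)
    remap = dict(zip(slots, shuffled))  # old slot -> new slot
    new_iat = {remap[slot]: name for slot, name in iat.items()}
    new_call_sites = [remap[slot] for slot in call_sites]
    return new_iat, new_call_sites


def random_aliases(names, rng, length=12):
    """Map each DLL or function name to a unique random string."""
    alphabet = string.ascii_letters + string.digits
    aliases, used = {}, set()
    for name in names:
        alias = "".join(rng.choice(alphabet) for _ in range(length))
        while alias in used:  # retry on the (unlikely) collision
            alias = "".join(rng.choice(alphabet) for _ in range(length))
        used.add(alias)
        aliases[name] = alias
    return aliases


rng = random.Random()  # a real tool would want a cryptographically strong source
iat = {0x0100308C: "GetProcAddress", 0x010031A0: "LoadLibraryA"}
new_iat, new_calls = permute_iat(iat, [0x010031A0], rng)
# The legitimate program still reaches the import it asked for.
assert new_iat[new_calls[0]] == "LoadLibraryA"
```

The point of the sketch is the invariant in the last line: legitimate call sites are patched together with the table, while an exploit that hard-codes the old slot or searches for the string "LoadLibraryA" no longer finds what it expects.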

43
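Randomize Plain Text Strings
[Memory-map diagram. Labels: stack holding injected code; PEB entry "KERNEL32 77E80000" replaced by "a7Ly4SZq19 77E80000"; IAT name LoadLibraryA replaced by 4Cu74xIpI9q2; kernel32.dll EAT entry "LoadLibraryA 77E9D961" replaced by "4Cu74xIpI9q2 77E9D961"; string KERNEL32 replaced by a7Ly4SZq19; LoadLibraryA() at 77E9D961; .text (01001000) containing "Call 010031A0"; heap (00070000); marked addresses 20000000 and 00000000]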
44
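Permute IAT
[Memory-map diagram, hex addresses from 80000000 down to 00000000. Labels: stack; kernel32.dll (77E80000) containing LoadLibraryA() at 77E9D961 and GetProcAddress() at 77E90332; IAT slots 010031A0 and 0100308C with the entries for LoadLibraryA (77E9D961) and GetProcAddress (77E90332) exchanged; .text (01001000) call sites "Call 010031A0" and "Call 0100308C" patched to match; heap (00070000); other marked addresses 65D60000 and 20000000]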
45
Preventing DLL Access

Add Synthetic Diversity to Windows PE Format
Permutation of the Import Address Table
Random String replacement of DLL names and
functions

46
Some Assumptions

Attacks are remote, automated and non-directed
Attacker cannot observe the execution of valid
programs without using system calls
Processes cannot transition from user mode to
kernel mode without using system calls
Attacker cannot automate static analysis of
memory contents
Modification is limited to binary (or memory) editing; source code is unavailable

47
Preventing DLL Access

Add Synthetic Diversity to Windows PE Format
Permutation of the Import Address Table
Random String replacement of DLL names and
functions
Add Diversity to Binary Code
Randomize Base Addresses
Reorder code blocks
Interleave nonfunctional code block

48
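Hypothesis: Operand hijacking
[Memory-map diagram, hex addresses from 80000000 down to 00000000. Labels: PEB; stack holding injected code; kernel32.dll (77E80000) containing LoadLibraryA() at 77E9D961; IAT slot 0100308C holding 77E9D961; .text (01001000) containing "Call 0100308C"; heap (00070000); other marked addresses 65D60000 and 20000000]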
49
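Binary Transformation
[Memory-map diagram. Labels: stack (below 80000000); kernel32.dll (77E80000) with entry 77E9D961; IAT slot 010031A0; .text drawn as numbered code blocks 1, 2 and 3; marked addresses 20000000 and 65D60000]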
50
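Binary Transformation
[Memory-map diagram. Labels: stack (below 80000000); kernel32.dll (77E80000) with entry 77E9D961; IAT slot 010031A0; .text with the numbered code blocks 1, 2 and 3 reordered; marked addresses 20000000 and 65D60000]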
51
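Challenges in Binary Rewriting
Indirect jumps (jmp EAX, call EBX)
Function pointers
[Memory-map diagram. Labels: stack (below 80000000); kernel32.dll (77E80000) with entry 77E9D961; IAT slot 010031A0; .text with numbered code blocks 1, 2 and 3, one transfer marked X; marked addresses 20000000 and 65D60000]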
52
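Our Binary Rewriting Approach
[Memory-map diagram. Labels: stack (below 80000000); kernel32.dll (77E80000) containing LoadLibraryA at 77E9D961; IAT slot 010031A0 holding 77E9D961; .text containing "jmp 697FA0D6" and "Call 010031A0"; rewritten code fragments as printed on the slide: "cmp eax, ebx", "jde 6687EF03", "call 77E9D961", "jmp 65D833AE", "call 77E804AA"; marked address 20000000]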
53
Preventing DLL Access

What About Brute Force Searches for DLL
Addresses?
Insert code and table entries that point to invalid addresses -> page fault, start over
Attempted execution of inserted dead code blocks -> dereference null pointer, start over
Landmines: insert code and table entries pointing to similar code segments that actually generate alarms
Other Issues
Insertion of redundant DLLs
Runtime Diversification
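
To make the brute-force argument concrete, here is a small model under stated assumptions that are not in the slides: the attacker scans a randomly ordered table that contains one entry it needs, some harmless valid entries, and some landmine entries. Touching a landmine ends the attempt (fail-crash). Under that model the target must precede every landmine, which happens with probability 1/(landmines + 1) regardless of how many harmless entries exist. All names are hypothetical.

```python
import random
from fractions import Fraction


def survival_probability(landmines: int) -> Fraction:
    """Chance that the target entry comes before every landmine."""
    return Fraction(1, landmines + 1)


def simulate_scan(valid_others: int, landmines: int, trials: int, rng) -> float:
    """Monte Carlo estimate of the same quantity."""
    wins = 0
    for _ in range(trials):
        table = ["target"] + ["other"] * valid_others + ["landmine"] * landmines
        rng.shuffle(table)
        for entry in table:
            if entry == "target":
                wins += 1
                break
            if entry == "landmine":  # fail-crash: process dies, attacker starts over
                break
    return wins / trials
```

With nine landmines a blind scan succeeds about one time in ten per attempt, and each failed attempt costs the attacker a crash that can be detected.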

54
Preventing DLL Access

Key points
No performance impact upon running programs
Access prevention is policy free
Challenges
Safety properties of address transformation
Impact of increased program size in memory

55
Preventing DLL Use

For attacks that locate the proper DLLs, diverse
transformations prevent their use
Parameters passed to DLL functions are
transformed per machine
Asymmetric parameter value transformation
Additional parameter padding
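
The two ideas above can be sketched as a pair of functions: one models what a rewritten call site pushes, the other models what a DLL-side wrapper undoes before the real function runs. The slides do not say which value transform is used, so modular addition of a per-machine key is an assumption chosen only for illustration, as are all names.

```python
import secrets

MASK32 = 0xFFFFFFFF


def new_machine_secret():
    """Per-machine values chosen at transform time (both hypothetical)."""
    return {"key": secrets.randbits(32) | 1, "pad": secrets.randbelow(4) + 1}


def encode_call(args, secret):
    """Rewritten call site: push dummy words, then the encoded arguments."""
    dummies = [secrets.randbits(32) for _ in range(secret["pad"])]
    return dummies + [(a + secret["key"]) & MASK32 for a in args]


def decode_call(pushed, secret):
    """DLL-side wrapper: drop the padding and reverse the encoding."""
    return [(w - secret["key"]) & MASK32 for w in pushed[secret["pad"]:]]
```

Injected code that found the right function address but pushes untransformed arguments would have them shifted and mis-decoded by the wrapper, which is the failure mode this layer relies on.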

56
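Function Parameters in Assembly
[Memory-map diagram, hex addresses from 80000000 down to 00000000. Labels: stack; code fragment "push EDI; push EBX; call 77E9D961"; kernel32.dll (77E80000) containing GetProcAddress() at 77E9D961, which reads its parameters with "mov ECX, [EBP+0xc]; mov EDX, [EBP+0x8]"; IAT slot 010031A0 holding 77E9D961; .text (01001000) containing "push EAX; push EBX; call 010031A0"; marked addresses 20000000 and 65D60000]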
57
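Parameter Padding
[Memory-map diagram, hex addresses from 80000000 down to 00000000. Labels: stack; code fragment "push EDI; push EBX; call 77E9D961"; kernel32.dll (77E80000) containing GetProcAddress() at 77E9D961, now reading parameters with "mov ECX, [EBP+0x14]; mov ECX, [EBP+0x10]"; IAT slot 010031A0 holding 77E9D961; .text (01001000) containing "push EAX; push EBX", "mov ECX, [EBP+0xc]; mov EDX, [EBP+0x8]", "push ECX; push EDX" and "call 010031A0"; marked addresses 20000000 and 65D60000]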
58
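Asymmetric Parameter Transformation
[Memory-map diagram, hex addresses from 80000000 down to 00000000. Labels: stack; code fragment "push EDI; push EBX; call 77E9D961"; kernel32.dll (77E80000) containing GetProcAddress() at 77E9D961 with "mov ECX, [EBP+0xc]; mov EDX, [EBP+0x8]" and a Reverse Transformation stub ("jmp 00045234", "cmp EDI, EBX", "jde 00897EF1"); IAT slot 010031A0 holding 77E9D961; .text (01001000) containing "push EAX; push EBX; call 010031A0"; Transformation code at 00045234; marked addresses 20000000 and 65D60000]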
59
Preventing DLL Bypass

Problem: attacker can provide assembly components that implement some DLL functions by making direct low-level (undocumented) Windows system calls
Trap system interrupts for runtime checking

60
Signing System Calls by Location

Post-load, pre-execute binary instrumentation
What is instrumented
System call ID
In Linux, the system call ID is stored in eax and an interrupt is issued
We substitute the original syscall_id with signed_id (stored in eax prior to the interrupt)
Advantage
Preserves system consistency: programs are modified only after they're loaded

[Diagram: foo.exe -> normal load and link -> Foo in memory -> instrument in memory -> execute]

Address is the 32-bit address of the location where the system call is made

$E_k$: fast trapdoor permutation with secret key $k$
$F : \{0,1\}^{32} \to \{0,1\}^{24}$
$\text{token} = F(\text{Address})$
$\text{eax} = \text{signed\_id} = E_k(\text{token} \,\|\, \text{syscall\_id})$

syscall_id is only 8 bits (only about 200 syscalls exist in Linux)

61
Authentication

Assume
Non-bypassability: every time a program makes a system call, we always intercept it before the kernel.
Memory trace inspection: we need to inspect the stack of the program.
Method
Decrypt the signed_id.
$(\text{token} \,\|\, \text{syscall\_id}) = D_k(\text{eax})$, where eax contains the signed_id
Inspect the program stack for the return address.
Compute the token of the address
$\text{check\_token} = F(\text{Memory}[\text{esp}])$
If check_token = token, then
set eax = syscall_id
Forward the system call to kernel
Otherwise fail
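
The signing and checking steps can be sketched as follows. The slides only say that E_k is a fast trapdoor permutation and that F maps 32 bits to 24 bits. As stand-ins, this sketch uses a small Feistel network keyed with HMAC-SHA256 for E_k (a keyed permutation on 32-bit values, not a true trapdoor permutation) and a truncated HMAC for F. Both choices, and every name below, are assumptions for illustration only.

```python
import hashlib
import hmac


def _prf(key: bytes, data: bytes, nbytes: int) -> int:
    """Keyed pseudo-random function returning an nbytes-wide integer."""
    digest = hmac.new(key, data, hashlib.sha256).digest()
    return int.from_bytes(digest[:nbytes], "big")


def token_of(address: int, key: bytes) -> int:
    """F: 32-bit call-site address -> 24-bit token."""
    return _prf(key, b"F" + address.to_bytes(4, "big"), 3)


def _round(key: bytes, i: int, half: int) -> int:
    return _prf(key, bytes([i]) + half.to_bytes(2, "big"), 2)


def encrypt32(x: int, key: bytes, rounds: int = 4) -> int:
    """E_k stand-in: Feistel permutation on 32-bit values."""
    left, right = x >> 16, x & 0xFFFF
    for i in range(rounds):
        left, right = right, left ^ _round(key, i, right)
    return (left << 16) | right


def decrypt32(y: int, key: bytes, rounds: int = 4) -> int:
    """D_k: inverse of encrypt32."""
    left, right = y >> 16, y & 0xFFFF
    for i in reversed(range(rounds)):
        left, right = right ^ _round(key, i, left), left
    return (left << 16) | right


def sign_syscall(call_site: int, syscall_id: int, key: bytes) -> int:
    """Value placed in eax at instrumentation time: E_k(token || syscall_id)."""
    if not 0 <= syscall_id < 256:
        raise ValueError("syscall_id must fit in 8 bits")
    return encrypt32((token_of(call_site, key) << 8) | syscall_id, key)


def authenticate(eax: int, call_site: int, key: bytes):
    """Interceptor check: return the real syscall_id, or None on failure."""
    plain = decrypt32(eax, key)
    token, syscall_id = plain >> 8, plain & 0xFF
    if token != token_of(call_site, key):
        return None
    return syscall_id
```

A system call issued from injected code would either carry an unsigned ID or a signed ID copied from another location, and in both cases the token recomputed from the actual call site is expected not to match.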

62
Limitations

Only authenticates whether a system call is made from its original location. An attacker can still use a library function to make the system call.
A possible solution is to inspect further up the stack.

63
Cross Layer Commonalities

Address Obfuscation (to defeat trivial static
analysis of memory)
Insertion of non-functional blocks
Basic block permutation
Permutation and Insertion of dummy function
parameters
Run-time obfuscation
Fail-crash Detection Mechanism (to defeat brute
force trial and error)
Insert invalid virtual address (page fault)
Execution of dead code (deref. null pointer
error)
Deliberate Landmines

64
Evaluation

Use the DETER Testbed
Based upon the Emulab technology with extra
security controls
On demand large scale testbed for the testing and
evaluation of security tools
Deploy hundreds (thousands) of hosts with diverse
configurations.
Try single attacks against all machines
Red Team could launch automated worm attacks
against all machines

65
320 Virtual Node Experiment

Each node has its own OS, filesystem, processes,
network interfaces.
32 gateways
10 hosts per gateway
Emulate 320 nodes in total
Colocate factor 10
Use 44 physical nodes

66
Status

We have implemented and demonstrated diversification of the Windows PE format for attack prevention.
We are implementing the reordering of binary code
blocks and insertion of dead code blocks.
We are implementing the asymmetric transformation
of function parameters using dummy functions.
Investigating potential non-bypassability
mechanisms

67
Probability of Successful Attacks

$\Pr(A) = \Pr(V) / (DE(A) \cdot PEE(A))$
Success probability of attack A exploiting vulnerability V
DE denotes derandomization effort
Range of randomization of addresses involved in A
Requires randomization to change after each successful derandomization
If rerandomization happens after k attempts, multiply Pr(A) by k
No rerandomization => effect of DE and PEE is additive
PEE denotes payload execution effort
Attempts to successfully execute attack payload
Note System susceptibility to attacks can be
reduced without addressing every vulnerability
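
A worked example of the estimate, reading the slide's formula as Pr(A) = Pr(V) / (DE(A) x PEE(A)) and applying the stated factor of k when rerandomization happens only after k attempts. The function name, the cap at 1.0 and the sample numbers are assumptions for illustration, not measured values.

```python
def attack_success_probability(pr_v: float, de: float, pee: float,
                               attempts: int = 1) -> float:
    """Pr(A) = attempts * Pr(V) / (DE * PEE), capped at 1.0."""
    if de <= 0 or pee <= 0:
        raise ValueError("DE and PEE must be positive")
    return min(1.0, attempts * pr_v / (de * pee))


# Hypothetical numbers: 256 equally likely address layouts (DE),
# 4 equally likely payload variants (PEE), vulnerability always present.
single_try = attack_success_probability(1.0, de=256, pee=4)
eight_tries = attack_success_probability(1.0, de=256, pee=4, attempts=8)
```

Under these made-up numbers a single attempt succeeds with probability 1/1024, and allowing eight attempts before rerandomization raises that to 8/1024, which shows why the rerandomization interval matters.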

68
What can the injected code do?

DLL access
Walk PEB to learn base addresses
O(dlls loaded)
DLLs (except a few) will be renamed, preventing search by name
Intercept dynamically loaded ones to rename
Can increase the number of loaded dlls

69
Cont'd

Access to functions in DLLs
Walk IAT of the application
O(entries in IAT)
Permute the IAT
Can increase the size of IAT
Add invalid addresses
Scan the code section for call imm_addr
Replace imm_addr with computed goto
Force static analysis

70
State of the Art in Binary Analysis & Transformation
71
Motivation

No source code needed
Language-neutral (C, C++ or other)
Can be largely independent of OS
Ideally, would provide instruction-set
independent abstractions
This ideal is far from today's reality
Applications in
Instrumenting long-running programs
Legacy code migration
Program optimizations
Security
Program obfuscation, security-enhancing
transformations

72
Approaches

Static analysis/transformation
Binary files are analyzed/transformed
Benefits
No runtime performance impact
No need for runtime infrastructure
Weakness
Prone to error, problem with checksums/signed
code
Dynamic analysis/transformation
Code analyzed/transformed at runtime
Benefit more robust/accurate
Weakness
Some runtime overhead
Runtime infrastructure needed

73
Previous Works (Static)

OM/ATOM (DEC WRL)
Proprietary and probably outdated
EEL (Jim Larus et al, PLDI 95)
The precursor of most modern rewriters
Targets RISC (SPARC)
Provides processor independent abstractions
Follow up works
UQBT (for RISC)
LEEL (for Linux/i386)

74
Previous Works (Static)

WISA (U. Wisconsin)
Uses EEL for SPARC
Uses IDAPro/CodeSurfer for x86
Etch (U. of Washington) x86/Windows
Application in performance optimization
Does not seem to be active any more
PLTO/SOLAR (U. Arizona)
Linux/x86, but has limitations (e.g. static
linking)
Brew (Stony Brook)
Disassembly + RAD implementation
Various tools for Java
BCEL seems most advanced

75
Previous Works (Dynamic)

DynInst (U. Maryland, U. Wisconsin)
Instrumentation of running programs
Provides OS/architecture independent abstractions
for instrumentation
LibVerify (Bell Labs/RST Corp)
Runtime rewriting for StackGuard
DynamoRIO (HP Labs/MIT), Strata (UVA)
Disassembles basic blocks at runtime
Provides API to hook into this process and
transform executable
Used in Program Shepherding USENIX Sec '02

76
Most Active Research Groups

WISA project (Wisconsin)
Somesh Jha, Tom Reps
DynInst project (Wisconsin/Maryland)
Barton Miller
SOLAR project (U. Arizona)
Saumya Debray
DynamoRIO (HP/MIT)
Tool available (binary form), Linux/Win32
Strata (UVA)
Tool claimed to be available in source form

77
Most Promising Tools

BREW Stony Brook
DynamoRIO
IDAPro/CodeSurfer (Commercial)
DynInst is robust, but capabilities limited for
our purpose
Strata may be good, and is supposedly available
in source code, but may not be as mature as
DynamoRIO

78
Phases in Static Analysis of Binaries

Disassembly
Instruction decoding/understanding
Insertion of new code

79
Questions You Might Ask

How many variants does defense require?
100 by contract, but the more the better
Why design in depth?
Belt and suspenders, but get multiplicative
advantage from multiple randomization --
hopefully
How do we assure defense achieves multiplicative
effect from multiple stages of randomization?
From different phenomena: fail-crash (attacker has to retry attack), independence of stages
How do we achieve multiplicative effect within a stage, e.g., IAT randomization?
E.g., prevent attacker from doing DE for one DLL at a time, e.g., Kernel32

80
Questions You Might Ask (2)

How often does defense re-randomize?
Depends on cost of re-randomization (down-time),
DE, number of variants needed
What is the cost of randomization?
Low for IAT randomization, low for parameter
padding, unknown but determinable for other
kinds, e.g., parameter value transformation,
return address authentication to prevent system
calls from injected code
How does defense do control flow randomization
for subtle optimized code?
Potentially difficult because static analysis of
binary code is hard, but will accept only sound
transformations
Can attacker do de-randomization in payload?
Very unlikely

81
Questions You Might Ask (3)

What if the attacker obtains the defense's randomization algorithm and sample randomized code -- a known plaintext and ciphertext attack?
Through static analysis he might generate all variants, but cannot use them in a payload to compromise more than a fraction of the hosts; this depends on the fail-crash assumption
Can attacker bypass randomization stages?
Hopefully this is achievable only if the kernel is not secure
How do we verify this assertion?
Careful analysis and lots of testing
Is all this new?
Builds on existing obfuscation work, but much is
new defense in depth, parameter transformation,
random space analysis

82
Questions You Might Ask (4)

Do we have the staff to investigate the numerous
issues posed?
???
Do we have a plan for all this?
Yes!!!

83
DAWSON Next Steps

Continue developing automated transforms to break exploit specifications
Implement five key transforms and evaluate cost
and effectiveness
Evaluate alternative annotation approaches and
implement
Look specifically at non-buffer overflow attacks
Continue developing automated transforms to break
payload specifications
Prevent brute force searches
Obfuscating code and landmines
Evaluate integration approaches
PE Editor style vs. Brew style (vs. DynamoRIO style)
Integrate transforms and test
Pre-loader preferred
Loader hooks, if absolutely required, or
DynamoRIO style

84
Demonstrations Tonight
Thank You!
85
Backup
86
Collberg, Thomborson, Low

First systematic studies of Java code obfuscation
Produced taxonomy (layout, control flow, data,
and preventative transforms)
Low-cost, stealthy opaque constructs
Techniques for obscuring data structures and
abstractions
Measured effectiveness using software complexity
metrics

87
Wang

Studied malicious host problem to protect trusted
probe communicating with trusted host
Key threats: impersonation, intelligent tampering, input spoofing; not DOS or random tampering
Input spoofing is, in general, unsolvable, but
If spoofing input x requires solving the algorithm-secrecy or execution-integrity problem, then techniques to ensure the latter can be used to counteract input spoofing. However, there are applications where this is not possible.
Pervasive aliasing enabled proof: precise analysis of transformed program (e.g., CFG) is NP-hard
Replacing 50% of branches =>
Execution time 4X
Size 2X
Wroblewski extended ideas and implemented purely
sequential, controllable approach that worked on
binary code

88
Linn and Debray

Rewrote binaries (IA-86) to disrupt major static
disassembly approaches (linear sweep and
recursive traversal)
Best commercial tools failed on 65% of instructions and 85% of functions
Execution time: 1.13X
Executable size: 1.15-1.20X

89
Digital Rights Management

Malicious host is key problem in DRM
White box cryptography approach
Chow et al.
Notwithstanding Barak, can provide useful
commercial levels of security
Obscured DES and AES algorithms
Jacobs et al.
Broke obscured DES but showed general problem of
retrieving data from circuits is NP hard
Admitted that, in practice, usually easy
Link and Neumann improved on Chow

90
Barak et al.

Seminal proof showed
Impossibility of completely obscuring code
No general obfuscator possible
Badger et al. began to extend Wang's work
Unable to prove minimum resistance time to
reverse engineering effort
Redirected to review obfuscation work (tour de
force report)

91
Mitigating Vulnerabilities in Code

Forrest et al. randomized stack resident data
addresses via modified gcc compiler
Chew and Song randomized stack base address, system call numbers and library entry points by modifying the Linux loader and kernel system call table and by binary rewriting
Xu et al. modified Linux kernel to randomize base
addresses of program regions
Approaches still vulnerable to relative address
attacks

92
Forrest et al.

Scrambled executable (prn), then unscrambled
through modified code emulator (x86)
Speed 1.05 X
Memory usage 3 X
Discussed danger of generating valid instruction
during scrambling but did not see experimentally
Kc et al. produced similar results

93
Bhatkar et al.

Key difference between program obfuscation and
address obfuscation is that program obfuscation
is oriented towards preventing most static
analyses of a program, while address obfuscation
has a more limited goal of making it impossible
to predict the relative or absolute addresses of
program code and data. Other analyses, including
reverse compilation, extraction of flow graphs,
etc., are generally not affected by address
obfuscation
Focused on memory error exploits
Randomized absolute/relative addresses in Linux
binary code
Approach offered protection against classic
attacks
Stack smashing, existing code exploits, format
string, data modification, heap overflow,
double-free, integer overflows
Data modification attacks still possible but Etoh
and Yoda approach could help

94
Performance of Bhatkar Transforms
Combination 1: link-time static relocation of stack, heap and code regions with random gaps in stack frames
Combination 2: load-time dynamic relocation of the above
95
Vulnerabilities and Exploits

Aleph One, "Smashing The Stack For Fun And Profit," Phrack 49, Volume Seven, Issue Forty-Nine, File 14 of 16, 11/8/1995
David Litchfield, "Defeating the Stack-Based Overflow Prevention Mechanism of Microsoft Windows 2003 Server," NGS Research Whitepaper, August 9, 2003, http://www.nextgenss.com/papers.htm
Mudge, "How To Write Buffer Overflows," http://www.insecure.org/stf/mudge_buffer_overflow_tutorial.html, 10/20/1995
w00w00, "Heap Overflow," http://www.w00w00.org/files/articles/heaptut.txt, 1/1999
Ryan Permeh, Marc Maiffret, "Code Red Disassembly Analysis," eEye Digital Security, http://www.eeye.com/html/advisories/codered.zip
Stuart Staniford, Nicholas Weaver, Vern Paxson, "Flash Worms: Is there any Hope?" Silicon Defense, retrieved 27 March 2003, <http://silicondefense
Stuart Staniford, Vern Paxson, Nicholas Weaver, "How to Own the Internet in Your Spare Time," Proceedings of the 11th USENIX Security Symposium, August 2002, retrieved 27 March 2003, <http://www-dirt.cs.unc.edu/netlunch/fall02/SPW02-worms.htm>

96
Software Fault Tolerance N-version Programming

A. Avizienis, Fault Tolerance and Fault
Intolerance: Complementary Approaches to Reliable
Computing, Proc. 1975 Int. Conf. Reliable
Software, Los Angeles, CA, Apr 21-27, 1975, pp.
458-464
A. Avizienis, The N-Version Approach to
Fault-Tolerant Software, IEEE Transactions on
Software Engineering, vol. SE-11, no. 12, Dec
1985, pp. 1491-1501
V. Bharathi, N-Version Programming Method of
Software Fault Tolerance: A Critical Review,
Indian Institute of Technology, Kharagpur 721302,
December 28-30, 2003
L. Chen and A. Avizienis, "N-version programming:
A fault-tolerance approach to reliability of
software operation," IEEE 8th FTCS, pp. 3-9, 1978
J.C. Knight and N.G. Leveson, A Large Scale
Experiment In N-Version Programming, Digest of
Papers FTCS-15 Fifteenth International Symposium
on Fault-Tolerant Computing, June 1985, Ann
Arbor, MI. pp. 135-139.
J.C. Knight and N.G. Leveson, An Experimental
Evaluation of the Assumption of Independence in
Multi-version Programming, IEEE Transactions on
Software Engineering, Vol. SE-12, No. 1 (January
1986), pp. 96-109.
M.R. Lyu, J.-H. Chen, and A. Avizienis, "Software
diversity metrics and measurements," In Proc. The
Sixteenth Annual Int. Computer Software and
Applications Conf., 1992, pp. 69-78.

97
Obfuscation -- Java Code

C. Collberg, C. Thomborson, and D. Low. A
Taxonomy of Obfuscating Transformations.
Technical Report 148, Department of Computer
Science, University of Auckland, July 1997.
C. Collberg, C. Thomborson, and D. Low.
Manufacturing Cheap, Resilient, and Stealthy
Opaque Constructs. Department of Computer
Science, University of Auckland. ACM
SIGPLAN-SIGACT Symposium on Principles of
Programming Languages (POPL'98). January 1998.
C. Collberg, C. Thomborson, D. Low. Breaking
Abstractions and Unstructuring Data Structures,
Proceedings of the 1998 International Conference
on Computer Languages, pages 28-38. IEEE Computer
Society Press. May 1998.
Larry D'Anna, Brian Matt, Andrew Reisse, Tom Van
Vleck, Steve Schwab, Patrick LeBlanc,
Self-Protecting Mobile Agents Obfuscation Report
- Final report, Network Associates Laboratories,
Report 03-015, June 30, 2003
Lee Badger, Larry D'Anna, Doug Kilpatrick, Brian
Matt, Andrew Reisse, Tom Van Vleck.
Self-Protecting Mobile Agents Obfuscation
Techniques Evaluation Report, Network Associates
Laboratories, Report 01-036, Nov 30, 2001,
updated March 22, 2002.
Douglas Low, Java Control Flow Obfuscation, MS
Thesis, Univ. Auckland, 3 June 1998

98
Obfuscation -- Protecting Software

Boaz Barak, Oded Goldreich, Russell Impagliazzo,
Steven Rudich, Amit Sahai, Salil Vadhan, and Ke
Yang. On the (im)possibility of obfuscating
programs. In J. Kilian, editor, Advances in
Cryptology - CRYPTO '01, Lecture Notes in Computer
Science. Springer-Verlag.
Stanley Chow, Philip A. Eisen, Harold Johnson,
Paul C. van Oorschot. A White-Box DES
Implementation for DRM Applications. Digital
Rights Management Workshop 2002, pp. 1-15
S. Chow, P. Eisen, H. Johnson and P.C. van
Oorschot, White-Box Cryptography and an AES
Implementation, Proceedings of the Ninth
Workshop on Selected Areas in Cryptography (SAC
2002)
Matthias Jacob, Dan Boneh, and Edward Felten.
Attacking an obfuscated cipher by injecting
faults, 2002 ACM Workshop on Digital Rights
Management. Washington, D.C., 2002
Hamilton E. Link and William D. Neumann,
Clarifying Obfuscation: Improving the Security
of White-Box Encoding, Sandia National
Laboratories, Albuquerque, NM, downloaded from
eprint.iacr.org/2004/025.pdf
Chenxi Wang, A Security Architecture for
Survivability Mechanisms. PhD thesis, University
of Virginia, October 2000.
Chenxi Wang, "Protection of software-based
survivability schemes", in the proceedings of
2001 Dependable Systems and Networks. Göteborg,
Sweden. July 2001.
Gregory Wroblewski, General Method of Program
Code Obfuscation, PhD Dissertation, Wroclaw
University of Technology, Institute of
Engineering Cybernetics, 2002.
Gregory Wroblewski, General Method of Program
Code Obfuscation, 2002 International Conference
on Software Engineering Research and Practice
(SERP'02), June 24-27, 2002, Monte Carlo
Resort, Las Vegas, Nevada, USA
Cullen Linn, Saumya Debray, Obfuscation of
Executable Code to Improve Resistance to Static
Disassembly, ACM Conference on Computer and
Communications Security, Washington DC, October
27-31, 2003.

99
Source Code Transforms to Mitigate Vulnerabilities

M. Chew, D. Song. Mitigating Buffer Overflows by
Operating System Randomization, Technical Report
CMU-CS-02-197.
Hiroaki Etoh and Kunikazu Yoda. Protecting from
stack smashing attacks. Published on
World-Wide Web at URL
http://www.trl.ibm.com/projects/security/ssp/main.html,
June 2000.
Stephanie Forrest, Anil Somayaji, and David H.
Ackley. Building diverse computer systems. In
6th Workshop on Hot Topics in Operating Systems,
pages 67-72, Los Alamitos, CA, 1997. IEEE
Computer Society Press.
Selvin George, David Evans, Steven Marchette. A
Biological Programming Model for Self-Healing,
First ACM Workshop on Survivable and
Self-Regenerative Systems (in association with
10th ACM Conference on Computer and
Communications Security), October 31, 2003, George
W. Johnson Center, George Mason University,
Fairfax, VA
PaX. Published on World-Wide Web at URL
http://pageexec.virtualave.net, 2001.
Jun Xu, Z. Kalbarczyk and R. K. Iyer.
Transparent Runtime Randomization for Security.
Proc. of 22nd Symposium on Reliable and
Distributed Systems (SRDS), Florence, Italy,
October 6-8, 2003
StackGuard, Libverify, RAD, PointGuard, MS C
compiler
Peter Silberman and Richard Johnson, A Comparison
of Buffer Overflow Prevention Implementations and
Weaknesses, I-Defense, 1875 Campus Commons Dr.
Suite 210 Reston, VA 20191,
http://www.blackhat.com/presentations/bh-usa-04/bh-us-04-silberman/bh-us-04-silberman-paper.pdf

100
Run-time Transforms to Mitigate Vulnerabilities

Elena Gabriela Barrantes, David H. Ackley,
Stephanie Forrest, Trek S. Palmer, Darko
Stefanovic and Dino Dai Zovi, Randomized
instruction set emulation to disrupt binary code
injection attacks, 10th ACM Conference on
Computer and Communications Security, Washington
DC, October 27-31, 2003.
Sandeep Bhatkar, Daniel C. DuVarney, and R.
Sekar, Address Obfuscation: An Efficient
Approach to Combat a Broad Range of Memory Error
Exploits, 12th USENIX Security Symposium, August
2003.
Gaurav S. Kc, Angelos D. Keromytis, Vassilis
Prevelakis, Countering Code-Injection Attacks
with Instruction-Set Randomization, 10th ACM
Conference on Computer and Communications
Security, Washington DC, October 27-31, 2003.

101
CMU Ballista Study

Most production-quality operating system and core
library code exhibits large numbers of flaws in
validating input, call order, etc.
Specification-driven testing verifies this result.

102
SANS Top 10 Top Vulnerabilities to Windows Systems