Title: Compilers and Software Security
1. Compilers and Software Security
- Gaurav S. Kc, gskc_at_cs.columbia.edu
- http://www.cs.columbia.edu/gskc
- Programming Systems Lab
Tuesday, 22nd April 2003
2. Outline
- Security
- Runtime Management of Processes
- Vulnerabilities and Attack Techniques
- Compilers 4115
- Security Research
- Conclusion
3. Security
- What does security mean?
- Focus: Security of resources
- No unauthorised access (using Authentication)
- Availability for authorised users (no DoS)
- Also: Security of data during transit
- Protection from eavesdropping
- Protection from malformation
- Solutions: PKI for encryption, digital signatures for non-repudiation
4. Security Models and Threats
- Social aspects of security failure
- 3Bs: Burglary, Bribery, Brutality
- Social Engineering
- Threats to Security During Transit
- Man-in-the-middle attack
- Identity spoofing / Masquerading
- Packet sniffing
- Communication replay
5. Threats to Application Security
- Trojan Horses: Malicious security-breaking program disguised as something benign like a screen saver or game program
- Keystroke loggers, powerful remote-control utility like Back Orifice
- Abnormal system behaviour, e.g. open server socket, CTRL-ALT-DEL signal handler
- Zombie nodes, awaiting instructions for conducting DDoS
- Computer Viruses: Executable code that, when run by someone, infects or attaches itself to other executable code in a computer in an effort to reproduce itself
- Can be malicious, erase files, lock up systems
- Boot Sector, File, Macro, Multipartite, Polymorphic, Stealth
- Anti-virus: search for known signature in suspect files
6. Threats to Application Security 2
- Internet Worms: A worm is a self-replicating program that does not alter files, but resides in active memory and duplicates itself by means of computer networks
- Morris Worm (RTM): exploited fingerd, sendmail, weak passwords
- Code Red: exploited a (publicised) vulnerability in Microsoft IIS
- Code Red II: had a Trojan payload
- Nimda: Swiss Army knife of worms (worm, virus, trojan!). Spread via its own e-mail engine, IIS servers that it scanned, and shared disks on corporate networks.
- Common Trait: Well-crafted input data can let you take control of a computer
- WinNuke: for rebooting a remote Win95 machine
8. Process Runtime
- x86
- 32-bit von Neumann machine
- 2^32 = 4 GB memory locations
- Breakdown of process space
- stack
- < 0xbfffffff, grows downwards
- Environment variables, program parameters
- Automatically allocated stack variables
- Activation records
- heap
- Dynamic allocation
- Explicitly through malloc, free
Process address space (highest address first):
0xffffffff
kernel space
0xbfffffff
env
argv
char **env, char **argv, int argc
runtime stack
runtime heap
.bss
.data
.text
0x08048000
0x00000000
Example program:
int main(int argc, char **argv, char **env) { return 0; }
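The layout above can be observed from inside a running process. The following is a minimal sketch, assuming a hosted C environment; the names `struct layout`, `sample_layout` and `print_layout` are hypothetical, and the exact addresses (and even their relative order) depend on the operating system, the compiler and address-space randomisation.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int initialised_global = 42;   /* initialised: expected in .data  */
int uninitialised_global;      /* uninitialised: expected in .bss */

/* Hypothetical record holding one sample address per region. */
struct layout {
    uintptr_t text, data, bss, heap, stack;
};

/* Fill `out` with one address from each region; returns 0 on success. */
int sample_layout(struct layout *out)
{
    int local = 0;               /* automatic variable: runtime stack */
    void *block = malloc(16);    /* dynamic allocation: runtime heap  */

    assert(out != NULL);
    if (block == NULL)
        return -1;

    /* Function-pointer-to-integer conversion is implementation-defined,
     * but GCC and Clang support it on common platforms. */
    out->text  = (uintptr_t)&sample_layout;
    out->data  = (uintptr_t)&initialised_global;
    out->bss   = (uintptr_t)&uninitialised_global;
    out->heap  = (uintptr_t)block;
    out->stack = (uintptr_t)&local;
    free(block);
    return 0;
}

/* Print the samples in the order used on the slide. */
void print_layout(const struct layout *l)
{
    printf("stack 0x%" PRIxPTR "\n", l->stack);
    printf("heap  0x%" PRIxPTR "\n", l->heap);
    printf(".bss  0x%" PRIxPTR "\n", l->bss);
    printf(".data 0x%" PRIxPTR "\n", l->data);
    printf(".text 0x%" PRIxPTR "\n", l->text);
}
```

Calling `sample_layout(&l)` followed by `print_layout(&l)` from `main` lists one address per region. On the 32-bit Linux layout shown on the slide, the stack sample would be expected just below 0xbfffffff and the .text sample just above 0x08048000.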
9. Process Runtime 2
- .bss: Block Started by Symbol; static and global uninitialised data
- assembler directive from the IBM 704 assembler
- runtime allocation of space
- RWX
- .data: Data Section; static and global initialised data
- compile-time space allocation and initialisation values
- RWX
- .text: Text Section; executable machine code
- program code
- runtime DLLs
- RO, X
- .rodata
- RO, X
- constants
- const int x = 4;
- "hello, world"
Address markers shown: 0xffffffff, kernel space, 0x08048000, 0x00000000
10. Activation Records
- Subroutines
- functions and procedures
- abstraction of computation
- structured programming concept
- Stack frame, Function frame, Activation frame
- Block of stack space reserved for duration of function
- Logical stack frames are crucial for implementing subroutines
- Each frame contains information related to the context of the given function. Grows downwards for each nested invocation.
- Reserved registers
- eip (next instruction), esp, ebp (fixed offsets)
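The claim that the stack grows downwards for each nested invocation can be checked with a small experiment. This is a sketch under assumptions: `record_frames` and `stack_grows_down` are hypothetical names, `__attribute__((noinline))` is a GCC/Clang extension used to keep every call in its own frame, and comparing the addresses as integers is implementation-defined rather than portable C.

```c
#include <assert.h>
#include <stdint.h>

#define DEPTH 4

/* Record the address of one local variable per nested activation record. */
__attribute__((noinline))
static void record_frames(int level, uintptr_t addrs[DEPTH])
{
    volatile char marker = (char)level;   /* lives in this call's frame */

    assert(level >= 0 && level < DEPTH);
    addrs[level] = (uintptr_t)(volatile void *)&marker;
    if (level + 1 < DEPTH)
        record_frames(level + 1, addrs);
    marker = 0;   /* work after the call, so this is not a tail call */
}

/* Returns 1 if each nested call's local sits below its caller's local. */
int stack_grows_down(void)
{
    uintptr_t addrs[DEPTH];

    record_frames(0, addrs);
    for (int i = 1; i < DEPTH; i++)
        if (addrs[i] >= addrs[i - 1])
            return 0;
    return 1;
}
```

On x86 (and most other mainstream targets) `stack_grows_down()` is expected to return 1, matching the slide.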
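11. Activation Records 2
- Source function
- Visualisation of the runtime stack frame
Source:
#include <string.h>
#define SIZE 9
void function(char *s, float y, int x) {
    int a;
    int b;
    char buffer[SIZE];
    int c;
    strcpy(buffer, s);
    return;
}
int main(void) {
    function("yep", 2.f, 93);
    return 0;
}
Stack frame of function (highest address first):
16(%ebp)   int x
12(%ebp)   float y
8(%ebp)    char *s
4(%ebp)    return address
0(%ebp)    old frame pointer
-12(%ebp)  int a
-16(%ebp)  int b
-40(%ebp)  char buffer[SIZE]
-44(%ebp)  int c
Registers shown: PC, SP, FP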
12Activation Records 3
function pushl ebp movl esp, ebp
subl 56, esp subl 8, esp pushl
8(ebp) leal -40(ebp), eax pushl eax
call strcpy addl 16, esp leave
ret .LC0 .string yep main ... pushl
93 pushl 0x40000000 pushl .LC0
call function ...
- Source function
- Assembly equivalent
- Building the stack frame
void function(char s, float y, int x) int
a int b char bufferSIZE int c
strcpy(buffer, s) return define SIZE
9 int main(void) function(yep, 2.f, 93)
return 0
14. Vulnerabilities
- C: low-level, high-level systems language
- Efficient execution, usable for real-time solutions
- Pointers and Arrays
- Pointer to (null-terminated?) block of memory
- Lack of bounds checking
- Buffer overflow causes havoc
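Since C performs no bounds checking itself, the check has to be written by hand. The sketch below shows one way to do it; `bounded_copy` is a hypothetical helper, not a standard library function, and it assumes the caller knows the real size of the destination buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst only if it fits, including the terminating NUL.
 * Returns 0 on success, -1 if src would overflow dst (dst is left empty). */
int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL && dst_size > 0);
    len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';          /* refuse the copy instead of overflowing */
        return -1;
    }
    memcpy(dst, src, len + 1);  /* len + 1 also copies the NUL byte */
    return 0;
}
```

With a 9-byte buffer, `bounded_copy(buf, sizeof buf, "yep")` succeeds, while a 9-character string is rejected because its terminating NUL would land one byte past the end.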
15. Attack Techniques
- Criteria for successful attack
- Locate a buffer that has an unsafe operation applied to it
- Well-crafted input data to trigger the overflow
- Buffer overrun vulnerabilities
- Stack-based: Stack-smashing attack
- Heap-based: Function pointers, C++ virtual pointers, Exception handlers (CodeRed)
- Format-string exploits
- %n format converter for printf family of functions
- writes the number of bytes output so far to the %n argument (int *)
- printf("\x70\xf7\xff\xbf%n") // writes 4 to address 0xbffff770
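The behaviour of `%n` can be seen in a harmless setting where the format string is a literal and the pointer argument is supplied explicitly. This is a minimal sketch; `bytes_before_percent_n` is a hypothetical name, and some C libraries restrict or disable `%n`, so it assumes a libc that still honours it for constant format strings.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the value stored by %n: the number of bytes produced before it. */
int bytes_before_percent_n(void)
{
    char out[16];
    int written = -1;

    /* Here the format string is a trusted literal and the int * argument is
     * ours. In an exploit, the attacker controls the format string and the
     * pointer is whatever happens to be found where printf looks for it. */
    int total = snprintf(out, sizeof out, "ABCD%n", &written);

    assert(total == 4);
    return written;
}
```

Four bytes precede `%n`, so the function returns 4, the same count as the four address bytes in the slide's example.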
16. Smashing the Stack
- To overflow an (automatic) stack buffer, one would need
- Shellcode, i.e. characters representing machine code (obtain from gdb, as)
- Memory location of injected shellcode (typically buffer address)
- Can approximate to make up for lack of precise information
- nop instructions at the beginning of the shellcode
- overwrite locations around 0(%ebp) with shellcode address
- suid-installed programs. Shellcode: shell, export xterm display
Source:
void function(char *s, float y, int x) {
    int a;
    int b;
    char buffer[SIZE];
    int c;
    ...
    strcpy(buffer, s);
    ...
}
Stack frame (highest address first):
int x
float y
char *s
ret. addr (0x0abcdef0)
old fp (0x4fedcba8)
int a
int b
char buffer[SIZE]
int c
Register shown: PC
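The effect of the overflow on the neighbouring control data can be modelled without touching a real stack. The sketch below is only a simulation: `struct fake_frame`, `unchecked_copy` and `craft_input` are hypothetical, the struct merely imitates the ordering on the slide (buffer lowest, then old fp, then return address), and no shellcode is involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIZE 9

/* Hypothetical model of the frame, lowest address first. */
struct fake_frame {
    char     buffer[SIZE];
    uint32_t old_fp;
    uint32_t ret_addr;
};

/* Copy without checking against SIZE, as strcpy would. The assert keeps the
 * write inside the struct so that the simulation itself is well defined. */
void unchecked_copy(struct fake_frame *f, const unsigned char *input, size_t len)
{
    assert(offsetof(struct fake_frame, buffer) + len <= sizeof *f);
    memcpy((unsigned char *)f + offsetof(struct fake_frame, buffer), input, len);
}

/* Build an input that pads up to the return-address slot and then plants
 * `target` there. `input` must hold sizeof(struct fake_frame) bytes.
 * Returns the number of bytes written to `input`. */
size_t craft_input(unsigned char *input, uint32_t target)
{
    size_t ret_off = offsetof(struct fake_frame, ret_addr);

    memset(input, 0x90, ret_off);                    /* 0x90 is the x86 nop */
    memcpy(input + ret_off, &target, sizeof target); /* new "return address" */
    return ret_off + sizeof target;
}
```

Starting from a frame whose `ret_addr` is 0x0abcdef0, feeding the output of `craft_input(input, 0xbffff770)` to `unchecked_copy` leaves `ret_addr` equal to 0xbffff770 and `old_fp` filled with nop bytes, which is the situation the slide describes for locations around 0(%ebp).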
17Heap-Based Attacks
class ABC char buffer10 virtual void
print() cout ltlt buffer void set(char
s) strcpy(buffer, s) int main(int
argc, char argv) static char buffer10
static int (f)(void) exit // gets(buffer)
strcpy(buffer, argv1) (f)() ABC abc
new ABC() abc-gtset(argv1)
abc-gtprint()
- Function pointer
- Higher address function pointer
- Lower address buffer
- C Pointer to vtable
- Higher address virtual pointer
- Lower address buffer
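The function-pointer case can be simulated in plain C by placing a buffer and a pointer in one struct, buffer first. This is a sketch, not the slide's program: `struct victim`, `overrun`, `intended`, `injected` and `call_after_overrun` are hypothetical names, and the overrun is confined to the struct so that the demonstration stays well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*handler_fn)(void);

static int intended(void) { return 0; }   /* what the program meant to call */
static int injected(void) { return 1; }   /* stands in for attacker code    */

/* Hypothetical model: buffer at the lower address, pointer above it. */
struct victim {
    char       buffer[10];
    handler_fn f;
};

/* Write `len` bytes starting at the buffer, ignoring the buffer's size. */
static void overrun(struct victim *v, const unsigned char *input, size_t len)
{
    assert(len <= sizeof *v);
    memcpy((unsigned char *)v, input, len);
}

/* Returns the result of calling f after the overrun: 0 intended, 1 injected. */
int call_after_overrun(void)
{
    struct victim v;
    unsigned char input[sizeof v];
    handler_fn target = injected;
    size_t f_off = offsetof(struct victim, f);

    memset(v.buffer, 0, sizeof v.buffer);
    v.f = intended;

    memset(input, 'A', f_off);                      /* buffer plus padding */
    memcpy(input + f_off, &target, sizeof target);  /* new pointer value   */
    overrun(&v, input, f_off + sizeof target);
    return v.f();
}
```

`call_after_overrun()` returns 1: the call through `f` reaches `injected` even though the program only ever assigned `intended`.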
19. Compilers 4115
- GCC: GNU Compiler Collection
- Just a wrapper for different phases
- cpp: C preprocessor
- program.c -> program.i
- cc1: C compiler proper
- program.i -> program.s
- as: Assembler (a.out, ELF relocatable files)
- program.s -> program.o
- ld: Link editor (ELF executables)
- program.o -> program
20. GCC
- Command line options
- gcc -save-temps (-pipe) -Wall -O0 -dr -v -static -I$HOME/include -L$HOME/lib -lsocket -lm -lpthread
- Standard libraries
- /lib/libc.so.6, /lib/ld-linux.so.2
- Standard library header files
- /usr/include
21. Other tools
- GNU Debugger: gdb
- GNU Binutils
- objcopy: add/remove ELF sections
- readelf, objdump: print ELF information
- Miscellaneous
- ldd: list dynamic dependencies (DLLs)
- strace: trace syscall invocations
23. Security Research
- Know thy enemy
- Monitor the attacker's behaviour and tactics
- In a constrained resource environment
- Honeypots
- Illusion of an easy target to lure attackers
- Jail
- Sandboxed environment using chroot
- All necessary files are available locally
- Virtual machines
- Sandboxes with limited syscalls
24. Automatic Defence Mechanisms
- Face thy enemy
- Applications fortified with runtime checks
- StackGuard, MemGuard, .NET cl.exe /GS
- canary word to detect stack-smashing
- READONLY stack frame
- .NET C/C++ compiler protects 0(%ebp), 4(%ebp)
- Libsafe, Libverify
- safe implementation of standard libraries
- runtime backup/checking of return address
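The canary idea can be illustrated with a simulated frame in which a known word sits between the buffer and the control data. This is a sketch of the general technique, not StackGuard's actual implementation: `struct guarded_frame`, `frame_enter`, `frame_write` and `frame_leave_ok` are hypothetical, and a real scheme would choose the secret at random at program start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIZE 9

/* Hypothetical frame model, lowest address first: an overflow of `buffer`
 * must cross the canary before it can reach `ret_addr`. */
struct guarded_frame {
    char     buffer[SIZE];
    uint32_t canary;
    uint32_t ret_addr;
};

/* "Prologue": store the canary next to the control data. */
void frame_enter(struct guarded_frame *f, uint32_t secret, uint32_t ret_addr)
{
    assert(f != NULL);
    memset(f->buffer, 0, sizeof f->buffer);
    f->canary = secret;
    f->ret_addr = ret_addr;
}

/* Unchecked write starting at the buffer; bounded only by the struct size
 * so that the simulation stays well defined. */
void frame_write(struct guarded_frame *f, const unsigned char *input, size_t len)
{
    assert(len <= sizeof *f);
    memcpy((unsigned char *)f, input, len);
}

/* "Epilogue": returns 1 if the canary is intact, 0 if it was overwritten. */
int frame_leave_ok(const struct guarded_frame *f, uint32_t secret)
{
    return f->canary == secret;
}
```

A write that stays inside the buffer leaves `frame_leave_ok` returning 1. A write that runs through to `ret_addr` necessarily changes the canary first, so the check returns 0 and the program can abort instead of returning to the overwritten address.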
25. Defence through Diversity
- Code Diversity
- Code randomisation for diversity
- Security through obscurity even for open-source software
- No more "breach once, breach everywhere"
- Compiler-based Protection
- Secure the stack data
- Potentially vulnerable heap data
26. Casper
- Paper: Casper: Compiler-assisted securing of programs at runtime
- Via added runtime checks as part of function invocations
- Add protection code
- Protect what? Control data in stack frames
- What from? Most stack-smashing attacks
- Available as patches
- Compiler: gcc-2.95
- Debugger: gdb-5.2.1
27. Casper in Action
- Similar in nature to StackGuard, but with much smaller overhead
- XOR property: self-inverse, i.e. applying the same mask twice restores the original value. Simplest form of encryption / obfuscation of data
- Casper protection
- Mask original return address value when entering function
- Unmask and restore the original return address value when returning from function
- An overwritten value will be unmasked to an invalid code address
Stack frame (highest address first), with the return address slot passed through a 32-bit XOR:
int x
float y
char *s
ret. addr (0x0abcdef0)
old fp (0x4fedcba8)
int a
int b
char buffer[SIZE]
int c
Register shown: PC
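The mask and unmask steps can be sketched with plain integer arithmetic. The code below is an illustration of the XOR idea only: `mask_addr`, `unmask_addr` and `simulate_call` are hypothetical names, the key value and how Casper actually chooses or stores it are assumptions, and the real protection is emitted by the patched compiler in the function prologue and epilogue.

```c
#include <assert.h>
#include <stdint.h>

/* XOR with the same key is its own inverse: (a ^ k) ^ k == a. */
uint32_t mask_addr(uint32_t addr, uint32_t key)   { return addr ^ key; }
uint32_t unmask_addr(uint32_t slot, uint32_t key) { return slot ^ key; }

/* Simulate one call: mask on entry, optionally let an attacker overwrite the
 * slot, unmask on exit. Returns the address the function would return to. */
uint32_t simulate_call(uint32_t ret_addr, uint32_t key,
                       int attacked, uint32_t attacker_value)
{
    uint32_t slot;

    assert(key != 0);                    /* a zero key would mask nothing */
    slot = mask_addr(ret_addr, key);     /* function entry                */
    if (attacked)
        slot = attacker_value;           /* stack smash hits the slot     */
    return unmask_addr(slot, key);       /* function exit                 */
}
```

Without an attack the original address comes back unchanged. With an attack, the value the attacker planted is XORed with a key the attacker does not know, so control goes to `attacker_value ^ key` rather than to the injected code.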
28. Get the Processor Involved
- Paper: Countering Code-Injection Attacks With Instruction-Set Randomization
- Machine instruction translation unique per process
- Reversible mapping
- machine instruction <-> garbage bit sequence
- Post-compilation stage
- Encode all executable sections with key
- Store codec key in file header
- Modified von Neumann cycle: fetch, decrypt, decode, execute
- decrypt: Processor restores each block of bytes to valid, original instruction
- Injected code gets probabilistically transformed to garbage bit-sequence that cannot be decoded
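The reversible mapping can be sketched as a byte-wise XOR over a code image with a 32-bit key. This is an assumption-laden illustration: `xor_transform` is a hypothetical helper, the byte order in which the key is applied is chosen arbitrarily here, and the paper's actual encoding and block size may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* XOR every byte of `code` with the matching byte of a 32-bit key.
 * Because XOR is self-inverse, the same routine encodes and decodes. */
void xor_transform(uint8_t *code, size_t len, uint32_t key)
{
    assert(code != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        code[i] ^= (uint8_t)(key >> (8 * (i % 4)));
}
```

Encoding a legitimate image once (the post-compilation stage) and transforming it again at fetch time (the decrypt step) yields the original instructions. Injected bytes were never encoded, so the decrypt step turns them into different bytes: with key 0x1badc0de, a run of 0x90 nops no longer decodes as nops.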
29. Binary Encryption and Execution
SOURCE CODE
30. Binary Encryption and Execution 2
- Bochs Pentium emulator is the modified machine
- Support for hidden register gav
- Interrupt routine handler saves gav to process structure
- Linux 2.2.14
- Kernel recognises new register
- Support for register in process structure
- as and objcopy for program encryption and codec storage
31. Future Work
- Randomised ISA on real machine
- Programmable Transmeta chips
- Dynamo: dynamic optimiser of native code
- Activation records
- automatically managed, randomised layout
- Heap smashing techniques
- break type-system
- corrupt malloc data
- Diversified research
- Languages, Compilers: C++, Sun CC, Visual C++
- Other architectures: Solaris, Alpha (DLX)
32. Conclusion
- Security
- Process Security
- Runtime Management of Processes
- Stack, Heap, Activation Records
- Vulnerabilities and Attack Techniques
- Buffer overrun. Stack smashing. Pointer overwriting.
- Compilers 4115
- GCC, GDB, Binutils
- Security Research
- Monitoring. Runtime protection
33. References
- The Bochs Pentium emulator. http://bochs.sourceforge.net/
- Aleph One. Smashing The Stack For Fun And Profit. http://www.phrack.org/show.php?p=49&a=14
- Arash Baratloo, N. Singh, T. Tsai. Transparent Run-Time Defense Against Stack Smashing Attacks
- Crispin Cowan, M. Barringer, et al. FormatGuard: Automatic Protection From printf Format String Vulnerabilities
- Crispin Cowan, Calton Pu, et al. StackGuard: Automatic Adaptive Detection and Prevention of Buffer-Overflow Attacks
- Gaurav S. Kc, Stephen A. Edwards, Gail E. Kaiser, Angelos Keromytis. Casper: Compiler-assisted securing of programs at runtime
- Gaurav S. Kc, Angelos D. Keromytis, Vassilis Prevelakis. Countering Code-Injection Attacks With Instruction-Set Randomization
34. Optimisation of Tail-Recursion
C source code:
int factorial(int n) {
    if (1 > n) return 1;
    return n * factorial(n-1);
}
int val = factorial(x);

int factorial(int n, int v) {
    if (1 > n) return v;
    return factorial(n-1, v*n);
}
int val = factorial(x, 1);
Assembly (sketch):
factorial: ... pushl n-1; call factorial ...
factorial: ... v = v*n; n = n-1; goto factorial
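The three forms on this slide can be placed side by side in compilable C. This is a sketch with hypothetical names (`factorial_rec`, `factorial_acc`, `factorial_loop`); the loop shows what the "update arguments, goto factorial" version amounts to, and whether a given compiler performs the transformation depends on its optimisation settings.

```c
#include <assert.h>

/* Plain recursion: the multiplication happens after the call returns,
 * so every invocation needs its own activation record. */
int factorial_rec(int n)
{
    if (1 > n) return 1;
    return n * factorial_rec(n - 1);
}

/* Tail-recursive form: the result is carried in the accumulator v,
 * and the recursive call is the last thing the function does. */
int factorial_acc(int n, int v)
{
    if (1 > n) return v;
    return factorial_acc(n - 1, v * n);
}

/* Equivalent of the optimised tail call: reuse the frame, update the
 * arguments in place (v before n), and jump back to the top. */
int factorial_loop(int n, int v)
{
    while (!(1 > n)) {
        v = v * n;
        n = n - 1;
    }
    return v;
}
```

All three agree for small inputs, e.g. 5 gives 120; the accumulator versions are started with `v = 1`, as in `factorial(x, 1)` on the slide.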
35. x86 Processor
- Dual integer pipeline
- Hidden register eip does not always fetch the next instruction
36. Binary Encryption Code: GNU as
if [ ! "$1" ]; then
    echo "usage: $0 <ELF_executable_image> [key]"
    exit
fi
if [ ! "$2" ]; then
    XOR_KEY="0x$RANDOM"
else
    XOR_KEY=$2
fi
# file names
NEW_FILE="$1.$XOR_KEY"
ORG_FILE=$1
INTERMEDIATE="$XOR_KEY.o"
# modified binary
OBJCOPY=/home/gskc/usr/binutils-2.13.2/bin/objcopy
# create an intermediate ELF object file with an .xor.stuff section
as -o $INTERMEDIATE <<EOF
.section .xor.stuff
.long $XOR_KEY
EOF
# merge the .xor.stuff section into the specified file
$OBJCOPY --encrypt-xor-key $XOR_KEY --add-section .xor.stuff=$INTERMEDIATE $ORG_FILE $NEW_FILE