Advanced Techniques in Reverse Engineering - PowerPoint PPT Presentation

Provided by: Raf16

1
Advanced Techniques in Reverse Engineering
  • Robert Parham

2
Language Questions
  • Eu falo pouco português (I speak only a little Portuguese)
  • English is not my native language, and probably
    not yours either
  • Feel free to stop me at any stage and ask for
    clarification (Julio can help with translations?)
  • Don't keep questions until the end; just raise your
    hand. I prefer it that way

3
Who am I?
  • M.Sc. in Computer Security from the Israeli
    Institute of Technology
  • Commander of the Israeli army's internal
    pen-testing team (yeah, they paid me to hack into
    their systems, cool, huh?)
  • Project manager of Microsoft's ISA Server 2008
    (Firewall, VPN)
  • Just arrived to travel in Brazil for 8 months

4
What is Reverse Engineering (RE)?
  • Understanding how a program works, given only the
    compiled program (assembly code)
    • No source code
    • Sometimes the code cannot even be executed
  • Basically, the inverse of compiling
  • Dynamic vs. static RE
    • Dynamic: debug.exe, WinDbg, SoftICE, OllyDbg
    • Static: plain source view, IDA

5
What will we talk about?
  • Understanding static RE using IDA
    • Functions
    • Execution blocks
    • Call graphs
  • Understanding the patching process
    • Microsoft's Patch Tuesday

6
What will we talk about?
  • Binary diffing
    • What is it?
    • What is it good for?
    • How does it work?
  • Combining static and dynamic RE (optional)
    • What is it good for?
    • How is it done?

7
Understanding IDA
  • IDA: the Interactive Disassembler
  • Displays assembly code, separated into functions
    and execution blocks
  • We will focus on C
    • OO and managed code follow the same idea, but are
      harder to explain
  • Function
    • A code section running from the address targeted
      by a CALL to a RET instruction
    • Usually small, with a well-defined purpose
  • Execution block
    • A set of instructions that are always executed
      together
    • Any JMP (GOTO) or branch (IF) ends a block
  • Every function is composed of one or more blocks
  • Every module is composed of one or more functions
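
The block rule above can be sketched in a few lines of Python. This is only an illustration: the instruction tuples, the mnemonic list, and the name split_into_blocks are made up for the example, and a real disassembler such as IDA handles many more cases.

```python
# Hypothetical toy instruction format: (address, mnemonic, branch_target or None).
BRANCHES = {"jmp", "jz", "jnz", "je", "jne"}
ENDS_BLOCK = BRANCHES | {"ret"}

def split_into_blocks(instructions):
    """Split a linear instruction list into execution blocks."""
    # A block starts at the first instruction, at every branch target,
    # and right after every instruction that ends a block.
    leaders = {instructions[0][0]}
    for i, (addr, mnem, target) in enumerate(instructions):
        if mnem in ENDS_BLOCK:
            if target is not None:
                leaders.add(target)
            if i + 1 < len(instructions):
                leaders.add(instructions[i + 1][0])
    blocks, current = [], []
    for ins in instructions:
        if ins[0] in leaders and current:
            blocks.append(current)
            current = []
        current.append(ins)
    if current:
        blocks.append(current)
    return blocks

# A tiny IF/ELSE: compare, branch, two alternatives, common exit.
code = [
    (0x10, "cmp", None),
    (0x12, "jz", 0x18),
    (0x14, "mov", None),
    (0x16, "jmp", 0x1A),
    (0x18, "xor", None),
    (0x1A, "ret", None),
]
blocks = split_into_blocks(code)
```

For this input the function yields four blocks, starting at 0x10, 0x14, 0x18 and 0x1A. A CALL does not end a block in this sketch, because execution normally returns to the next instruction.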

8
Example: RE for calc.exe
  • How does interactive disassembly work?
  • What is a function-call graph?
  • What is a function graph?

9
The Patch Problem
  • Software errors happen
  • Some of these errors have security implications
  • Patches are released to correct the error
  • However...
  • Sometimes the hacker community doesn't know why a
    specific patch was released
  • Sometimes a patch for problem A also patches a
    completely different problem B
  • New bug classes can be identified by analyzing
    released patches

10
The Patch Problem
  • Example: a Windows patch
    • Save a CRC for every file on a Windows system
    • Save a copy of all files (a mirror of the OS)
    • Apply the patch
    • Compute the CRCs again and find out which files
      changed
    • Discard false positives (log files, registry
      files, etc.)
    • Identify the changed binary files
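
The snapshot-and-compare steps could look roughly like this in Python. It is a minimal sketch, not the tooling used in the talk: the function names and the extension filter BINARY_EXTENSIONS are assumptions made for the example.

```python
import os
import zlib

def crc_snapshot(root):
    """Map each file under root (by relative path) to the CRC32 of its contents."""
    snapshot = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            crc = 0
            try:
                with open(path, "rb") as fh:
                    # Read in 1 MiB chunks so large files do not fill memory.
                    for chunk in iter(lambda: fh.read(1 << 20), b""):
                        crc = zlib.crc32(chunk, crc)
            except OSError:
                continue  # locked or unreadable file: skip it
            snapshot[os.path.relpath(path, root)] = crc
    return snapshot

# Hypothetical filter: extensions treated as binaries worth diffing.
# Everything else (logs, registry hives, ...) is discarded as a false positive.
BINARY_EXTENSIONS = (".dll", ".exe", ".sys")

def changed_binaries(before, after):
    """Return binary-looking paths that are new or whose CRC changed."""
    changed = [p for p, crc in after.items() if before.get(p) != crc]
    return sorted(p for p in changed if p.lower().endswith(BINARY_EXTENSIONS))
```

Usage would be to call crc_snapshot once before the patch and once after it, then pass both results to changed_binaries. Keeping the mirror copy of the old files is still needed, since the CRC only says that a file changed, not how.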

11
The Patch Problem: single binary file
  • We know a single file was changed (say,
    cryptdll.dll, the Windows cryptography manager)
  • We want to know WHAT changed in the file
    (usually, just the addition of an IF statement to
    check for some condition that wasn't checked
    before)
  • This can give us valuable insight into similar bugs,
    or help us mount an attack on unpatched servers

12
The Patch Problem: single binary file
  • Simple diffing (WinDiff) is not possible:
    recompilation changes EVERYTHING (compiler
    optimizations)
  • Hard way: fully RE the old and new versions, and
    compare the source code manually
  • Easy way: binary diffing

13
Binary Diffing
  • Goal: find the differences between two binary
    files that differ only slightly
  • Usually it is enough to find the function that
    changed
  • Do as little manual RE as possible
  • The method must also work when there were
    multiple changes

14
Binary diffing: the idea
  • Create call graphs for both modules
  • Assign specific properties to each function in
    each module
  • Fit the two call graphs (which, by definition,
    will be almost the same, but not identical)
  • Find the functions that don't fit

15
Assigning properties
  • 8 nodes (execution blocks)
  • 11 edges (jumps between blocks)
  • 3 function calls
  • We will call it (8,11,3) from now on
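
As a small illustration, the triple can be computed from a function's block graph like this. The dictionary layout, the block names, and the callee names are hypothetical; they only stand in for whatever the disassembler exports.

```python
def signature(blocks, calls):
    """Return (nodes, edges, calls) for one function.

    blocks: maps each execution block to the blocks it can jump to.
    calls:  the CALL instructions found inside the function.
    """
    nodes = len(blocks)
    edges = sum(len(successors) for successors in blocks.values())
    return (nodes, edges, len(calls))

# Hypothetical function with 8 blocks, 11 jumps between them and 3 calls.
example_blocks = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
    "E": ["F", "G"],
    "F": ["H"],
    "G": ["H", "C"],   # jump back to C forms a loop
    "H": [],
}
example_calls = ["malloc", "memcpy", "free"]
example_signature = signature(example_blocks, example_calls)
```

Here example_signature evaluates to (8, 11, 3), the same triple used on the slide.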

16
Why these properties?
  • Easy to calculate from the given disassembly
  • Don't change between compilations or between
    different compiler optimizations, because they
    depend on program logic
  • Change when a patch is applied

17
Matching graphs
  • Create a complete function-call graph for each
    module
  • Start by fitting the functions with high values
  • Fit the functions around them
  • Continue until all possible functions are fitted
    between the two modules
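
A much-simplified sketch of this fitting loop is given below. It is not the algorithm from the talk: it assumes each module is given as two hypothetical dictionaries (function name to signature triple, and function name to callee list), anchors on signatures that are unique in both modules, and then pairs up the single leftover callee of already matched functions.

```python
from collections import Counter

def match_functions(sig_a, calls_a, sig_b, calls_b):
    """Return a dict mapping function names in module A to names in module B."""
    count_a, count_b = Counter(sig_a.values()), Counter(sig_b.values())
    unique_b = {s: n for n, s in sig_b.items() if count_b[s] == 1}
    matched = {}
    # Step 1: anchor on signatures unique in both modules, largest first.
    for name, s in sorted(sig_a.items(), key=lambda kv: kv[1], reverse=True):
        if count_a[s] == 1 and s in unique_b:
            matched[name] = unique_b[s]
    # Step 2: fit the functions around the ones that are already matched.
    progress = True
    while progress:
        progress = False
        for fa, fb in list(matched.items()):
            used_b = set(matched.values())
            rest_a = sorted({c for c in calls_a.get(fa, []) if c not in matched})
            rest_b = sorted({c for c in calls_b.get(fb, []) if c not in used_b})
            # Exactly one unmatched callee on each side: they must correspond.
            if len(rest_a) == 1 and len(rest_b) == 1:
                matched[rest_a[0]] = rest_b[0]
                progress = True
    return matched

def changed_pairs(matched, sig_a, sig_b):
    """Matched pairs whose signatures differ: the candidates for manual RE."""
    return {a: b for a, b in matched.items() if sig_a[a] != sig_b[b]}

# Made-up modules: only the check function differs between the versions.
sig_old = {"old_main": (14, 24, 8), "old_check": (6, 9, 6),
           "old_log": (3, 7, 3), "old_io": (7, 13, 6)}
sig_new = {"new_main": (14, 24, 8), "new_check": (8, 12, 6),
           "new_log": (3, 7, 3), "new_io": (7, 13, 6)}
calls_old = {"old_main": ["old_check", "old_io"], "old_io": ["old_log"]}
calls_new = {"new_main": ["new_check", "new_io"], "new_io": ["new_log"]}

matched = match_functions(sig_old, calls_old, sig_new, calls_new)
suspects = changed_pairs(matched, sig_old, sig_new)
```

In this example three functions are matched by signature alone, and old_check is paired with new_check only because both are the one remaining callee of the matched main functions. That pair is the only entry in suspects.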

18
Example
  Before        After
  (5,12,1)      (5,12,1)
  (4,6,2)       (4,6,2)
  (6,9,6)       (8,12,6)
  (5,9,6)       (5,9,6)
  (14,24,8)     (14,24,8)
  (3,7,3)       (3,7,3)
  (7,13,6)      (7,13,6)

23
Comments
  • The split from one block into three is typical of
    the addition of an IF statement to the block
    that was split
  • Valuable information indeed...
  • Graph fitting is not always perfect, but the current
    algorithm solves more than 95% of cases
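
To see why an added IF shows up as two extra nodes and three extra edges, here is a small sketch that splits one block of a made-up block graph. The graph, the block names, and the "_then" / "_rest" suffixes are hypothetical.

```python
def add_if_to_block(blocks, name):
    """Return a copy of the block graph in which block `name` gains an IF."""
    successors = blocks[name]
    patched = dict(blocks)
    then_block, rest_block = name + "_then", name + "_rest"
    patched[name] = [then_block, rest_block]   # new conditional branch
    patched[then_block] = [rest_block]         # guarded code falls through
    patched[rest_block] = list(successors)     # original exits move here
    return patched

def nodes_and_edges(blocks):
    """Count blocks and the jumps between them."""
    return len(blocks), sum(len(s) for s in blocks.values())

# Made-up function graph with 6 blocks and 9 edges.
before = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["d", "e"],
    "d": ["e", "f"],
    "e": ["f"],
    "f": [],
}
after = add_if_to_block(before, "c")
size_before = nodes_and_edges(before)
size_after = nodes_and_edges(after)
```

The graph goes from 6 blocks and 9 edges to 8 blocks and 12 edges, the same jump as the one differing pair in the example, (6,9,6) to (8,12,6). The call count stays the same as long as the new check itself calls nothing.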

24
Combining Static and Dynamic RE
  • Two equally powerful techniques for RE
  • But they can be even stronger together
  • The main problem with static RE is not knowing what
    a typical code path is
  • Or sometimes we only want to RE the code path for
    a specific input (say we fuzzed the program,
    found a crashing input, and want to find the code
    path that leads to the crash)

25
Solution: graph coloring
  • Using IDA, collect all function addresses of the
    module
  • Run the code under a debugger (say, WinDbg)
    and set a breakpoint (interrupt) on every address
    from the previous step
  • The breakpoint handler just writes the address
    that triggered it to a file, and continues
    execution
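
One possible way to set this up, sketched under several assumptions: the function addresses have already been exported from IDA (inside IDA, idautils.Functions() is the usual way to enumerate them), the module is loaded at the same base address IDA used, and the "HIT" marker text and log file name are made up. The Python function only generates a WinDbg command script; it does not drive the debugger.

```python
def windbg_trace_script(function_addresses, log_path):
    """Build WinDbg commands that log every function entry and keep running."""
    lines = [".logopen {}".format(log_path)]       # send debugger output to a file
    for addr in sorted(set(function_addresses)):
        # bp <address> "<commands>": print a marker line, then 'gc' resumes.
        lines.append('bp 0x{0:08x} ".echo HIT 0x{0:08x}; gc"'.format(addr))
    lines.append("g")                              # start (or continue) execution
    return "\n".join(lines)

# Hypothetical addresses; the duplicate is dropped by the set().
script = windbg_trace_script([0x401000, 0x401230, 0x401000], "hits.log")
```

The resulting text can be pasted into the debugger or saved as a script file. With thousands of functions the breakpoints slow the program down noticeably, so removing a breakpoint after its first hit would be a natural refinement.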

26
Graph coloring
  • So we run the program with the typical or
    specific input, and get a list of the functions
    that were triggered
    • Usually, around 5-20% of the functions
  • We can now automatically color these functions in
    the IDA graph of the module, and add a comment to
    each function
  • This immensely reduces the code we actually need
    to RE
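
Turning the trace into a list of triggered functions might look like the sketch below. The log format (one "HIT 0x<address>" line per triggered function) is an assumption for this example, as are the function names and addresses.

```python
import re

# Assumed log format: lines such as "HIT 0x00401000"; other lines are ignored.
HIT_LINE = re.compile(r"^HIT 0x([0-9a-fA-F]+)$")

def functions_hit(log_text):
    """Return the set of function addresses recorded in a trace log."""
    hits = set()
    for line in log_text.splitlines():
        match = HIT_LINE.match(line.strip())
        if match:
            hits.add(int(match.group(1), 16))
    return hits

def coverage(all_functions, hits):
    """Fraction of the module's functions that were executed."""
    all_functions = set(all_functions)
    return len(all_functions & hits) / len(all_functions)

# Hypothetical module with ten functions and a short trace log.
module_functions = [0x401000, 0x401230, 0x401400, 0x401500, 0x401600,
                    0x401700, 0x401800, 0x401900, 0x401A00, 0x401B00]
log = "Opened log file 'hits.log'\nHIT 0x00401000\nHIT 0x00401230\nHIT 0x00401000\n"
hits = functions_hit(log)
ratio = coverage(module_functions, hits)
```

Here two of the ten functions were triggered, so ratio is 0.2. Inside IDA, the coloring and commenting step would then loop over hits; IDAPython's idc.set_color (with idc.CIC_FUNC) and idc.set_func_cmt are the calls one would expect to use for that, but they are not shown here because they only run inside IDA.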