Cthulhu - PowerPoint PPT Presentation

About This Presentation
Title:

Cthulhu

Slides: 43
Provided by: hickOrgm
Learn more at: https://hick.org

Transcript and Presenter's Notes

Title: Cthulhu


1
Cthulhu
  • A software analysis framework built on Phoenix

2
Who am I?
  • Matt Miller
  • Leviathan Security Group
  • Metasploit Framework
  • Uninformed Journal
  • Not a static analysis expert

3
What's this talk about?
  • Cthulhu software analysis framework
  • Very high-level architectural overview
  • Interesting features
  • Case study

4
Phoenix Overview
  • Software optimization and analysis
  • Basis for future Microsoft compilers and tools
  • Robust and extensible architecture
  • Plugins
  • Phases
  • Check out Richard Johnson's talk to learn more

5
Why extend Phoenix?
  • RDK/SDK not yet completely solidified
  • Encapsulation can help here
  • API is feature rich but verbose
  • No simplified wrapper
  • No solution for large-scale analysis
  • LTCG is not enough

6
Cthulhu Overview
  • Software analysis framework
  • Hobby project started in June, 2006
  • Written in C#
  • Currently around 28KLOC

7
Cthulhu Goals
  • Simplified Programming Interface
  • Simple and extensible API
  • Fundamental independence
  • Large-scale analysis
  • Modeling behavior of large systems
  • Pie in the sky: Windows Vista
  • Research Sandbox
  • A playground for experimentation
  • Phoenix can also be used directly for this purpose

8
Cthulhu Architecture
  • Diagram labels (layout not recoverable): Fundamentals, DB, IDA,
    Phoenix, Data Flow, Control Flow, Analysis Engine, Peons,
    Analysis, Rendering, Tools
10
Analysis Engine Process
  • Uses a fundamental to load assemblies
  • Runs phases
  • Import
  • Analyze
  • Render
  • Peons register to be notified on certain events
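
The process above can be sketched as a small event dispatcher. This is only an illustration of the idea, not Cthulhu's actual API: the class name `AnalysisEngine`, the `register` and `run` methods, and the lowercase phase keys are all hypothetical.

```python
from collections import defaultdict

# Phase names follow the slide; the lowercase keys are an assumption.
PHASES = ("import", "analyze", "render")


class AnalysisEngine:
    """Hypothetical engine: peons subscribe to phase events."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def register(self, phase, handler):
        """Register a peon callback for one phase."""
        if phase not in PHASES:
            raise ValueError(f"unknown phase: {phase}")
        self._handlers[phase].append(handler)

    def run(self, assembly):
        """Run the phases in fixed order and notify the registered peons."""
        log = []
        for phase in PHASES:
            for handler in self._handlers[phase]:
                log.append(handler(assembly))
        return log


engine = AnalysisEngine()
# Registration order does not matter; phase order does.
engine.register("render", lambda asm: f"render {asm}")
engine.register("import", lambda asm: f"import {asm}")
result = engine.run("WebService.dll")
```

Here `result` lists the import notification before the render notification, even though the render peon registered first.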

11
Import Phase
  • Components: Analysis Engine, Phoenix Fundamental, DB
  • Importing Peons: Basic Types, Control Flow, Data Flow
  • Steps, in order:
    1. Load Assembly
    2. Assembly Loaded
    3. Import Event
    4. Normalize Information
    5. Import Event
12
Analysis Phase
  • Components: Analysis Engine, Database Fundamental, DB
  • Analyzing Peons: Path Discovery, Leak Check
  • Steps, in order:
    1. Load Assembly
    2. Denormalize Assembly Information
    3. Assembly Loaded
    4. Analysis Event
    5. Normalize and Denormalize Information
    6. Analysis Event
13
Render Phase
  • Components: Analysis Engine, DB, Rendering Peons, Output Store,
    Console, GUI
  • Steps, in order:
    1. Render
    2. Denormalize
    3. Display
14
Database Implications
  • Extensible and flexible way to represent binary
    information
  • May be used to support large-scale analysis
  • Hundreds of modules
  • More work needs to be done
  • Performance overhead is non-trivial
  • Processing time can be high
  • Volatile memory usage can be kept low

15
A few cool features
  • Simplified API
  • Version-independent modeling
  • Conceptual modeling

16
Simplified API
  • Abstract classes provide fundamental independence
  • Abstract classes: Assembly, Module, Data Type, Method
  • Concrete implementations of each class: DB, Phoenix
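
A minimal sketch of what fundamental independence could look like: analysis code is written against an abstract class, and each fundamental supplies its own concrete implementation. The class and method names below (`DbAssembly`, `LoadedAssembly`, `method_names`, `summarize`) are hypothetical stand-ins, not the Cthulhu or Phoenix API.

```python
from abc import ABC, abstractmethod


class Assembly(ABC):
    """Abstract view of an assembly; analysis code depends only on this."""

    @abstractmethod
    def name(self) -> str: ...

    @abstractmethod
    def method_names(self) -> list: ...


class DbAssembly(Assembly):
    """Stand-in for a database-backed implementation (rows are plain dicts)."""

    def __init__(self, name, rows):
        self._name, self._rows = name, rows

    def name(self):
        return self._name

    def method_names(self):
        return [row["method"] for row in self._rows]


class LoadedAssembly(Assembly):
    """Stand-in for an implementation backed by a freshly loaded binary."""

    def __init__(self, name, methods):
        self._name, self._methods = name, methods

    def name(self):
        return self._name

    def method_names(self):
        return list(self._methods)


def summarize(assembly: Assembly) -> str:
    # Works with any fundamental because it only touches the abstract API.
    return f"{assembly.name()}: {', '.join(sorted(assembly.method_names()))}"


from_db = DbAssembly("WebService.dll", [{"method": "ExecuteCommand"}])
from_binary = LoadedAssembly("WebService.dll", ["ExecuteCommand"])
```

Both objects produce the same summary, so a peon written against `Assembly` does not need to know which fundamental produced the data.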
17
Version-independent Modeling
  • Modeling version-independent relationships between software
    elements in the database
  • Appropriate versions can be selected at analysis time
  • Example: void CallExitProcess() { ExitProcess(0); }
  • CallExitProcess (version 1) calls the version-independent
    kernel32!ExitProcess
  • Distinct versions of kernel32!ExitProcess: 1, 2, 3, 4
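
One way to picture this model is a call edge that points at a version-independent element, with the concrete version chosen only when the analysis runs. The `Element` class, the `calls` table, and the version payload strings are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A version-independent software element (hypothetical model)."""
    name: str
    versions: dict = field(default_factory=dict)  # version number -> payload

    def resolve(self, version):
        """Pick one concrete version at analysis time."""
        return self.versions[version]


exit_process = Element(
    "kernel32!ExitProcess",
    {n: f"ExitProcess {n}" for n in (1, 2, 3, 4)},
)

# The call edge targets the version-independent element, not a version.
calls = {("CallExitProcess", 1): exit_process}


def resolve_call(caller, caller_version, target_version):
    """Follow a call edge, then select the callee version to analyze."""
    return calls[(caller, caller_version)].resolve(target_version)


selected = resolve_call("CallExitProcess", 1, 3)
```

The same stored relationship can therefore be analyzed against any of the four versions without re-importing the caller.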
18
Conceptual Modeling
Universe
VPN Client
VPN Server
Device Driver
Daemon
vpn.sys
daemon.exe
User Interface
vpngui.exe
dialogs.dll
19
Case Study: Web Services
  • Finding inter-component data flow paths

20
Overview
  • Web Services is a simple remoting interface
  • Clients invoke methods hosted on a web server
  • Server handles requests and provides responses
  • Problematic for static analysis
  • Clients pass data to the server indirectly
    (network)
  • Limits the scope at which analysis can be
    performed
  • Let's walk through an example

21
Example Web Service
[WebService] public class WebService {
    [WebMethod] public void ExecuteCommand(string command) {
        Process.Start(command);
    }
}
Simple web service that invokes a process using
the supplied command string
22
Example Web Service Client
[WebServiceBinding] public class WebClient : SoapHttpClientProtocol {
    [SoapDocumentMethod] public void ExecuteCommand(string command) {
        Invoke("ExecuteCommand", new object[] { command });
    }
}
Simple web client that wraps the invocation of
the web service method
23
Bridging the gap
  • To illustrate a relationship, the client
    invocation and server method must be bridged
  • Bridging can take a few different forms
  • Automatic detection of relationships
  • Manual description of relationships
  • Bridging is an abstract concept though
  • How do we make it concrete?

24
Bridging the gap
  • A concrete relationship can be shown by linking
    formal parameters

  • WebClient fin(ExecuteCommand, 0) is linked to
    WebService fin(ExecuteCommand, 0)
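
A manual bridge of this kind can be sketched as one extra def-use edge between two formal-parameter descriptors. The `Formal` tuple and the `bridge` helper are hypothetical; the slides do not show how Cthulhu stores the link.

```python
from collections import namedtuple

# Hypothetical descriptor for a formal input parameter, fin(method, index).
Formal = namedtuple("Formal", "assembly type method index")


def bridge(edges, caller_formal, callee_formal):
    """Return a new edge set with a manually described def-use edge added."""
    return edges | {(caller_formal, callee_formal)}


client_in = Formal("WebClient.dll", "WebClient", "ExecuteCommand", 0)
service_in = Formal("WebService.dll", "WebService", "ExecuteCommand", 0)

# Data passed to the client wrapper is treated as defining the data
# received by the service method, even though it travels over the network.
edges = bridge(frozenset(), client_in, service_in)
```

After bridging, any data flow analysis that walks `edges` can cross from the client assembly into the service assembly.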
25
Benefits of bridging
Web Application
Web Client
Web Service
WebClient.dll
WebService.dll
WebClient
WebService
ExecuteCommand
ExecuteCommand
Enter Block
Enter Block
fin(0)
fin(0)
26
What's the point?
  • Describing indirect relationships improves the
    quality of analysis information
  • Widens the scope for control flow and data flow
    analysis
  • The Path Discovery peon can help illustrate this

27
Path Discovery Overview
  • Designed to find reachable flow paths
  • From a set of sources
  • To a set of sinks
  • Within a set of target assemblies
  • Current restrictions
  • Requires the database fundamental
  • Only operates on data flow information
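
As a rough sketch of the overview above, path discovery can be treated as a breadth-first search over def-use edges, limited to the target assemblies. The function name `find_path` and the `(assembly, member)` node encoding are assumptions; the edge list mirrors the web service case study.

```python
from collections import deque


def find_path(edges, sources, sinks, targets):
    """Breadth-first search over def-use edges inside the target assemblies.

    Nodes are (assembly, member) pairs; returns one path or None.
    """
    graph = {}
    for u, v in edges:
        if u[0] in targets and v[0] in targets:
            graph.setdefault(u, []).append(v)
    queue = deque((s, [s]) for s in sources if s[0] in targets)
    seen = {s for s, _ in queue}
    while queue:
        node, path = queue.popleft()
        if node in sinks:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None


q = ("System.Web", "HttpRequest.get_QueryString")
c = ("WebClient.dll", "WebClient.ExecuteCommand")
s = ("WebService.dll", "WebService.ExecuteCommand")
p = ("System", "Process.Start")
edges = [(q, c), (c, s), (s, p)]  # (c, s) is the bridged edge
targets = {"System.Web", "WebClient.dll", "WebService.dll", "System"}

full = find_path(edges, {q}, {p}, targets)
without_service = find_path(edges, {q}, {p}, targets - {"WebService.dll"})
```

With all four assemblies targeted the search returns the full path; dropping `WebService.dll` from the targets leaves the sink unreachable.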

28
Path Discovery Scenario
  • Command Injection represents one type of security
    flaw found in managed applications
  • This can happen when user-controlled data is used
    in conjunction with launching a process
  • For example, data passing
  • From HttpRequest.get_QueryString
  • To Process.Start
  • This should be easy to detect, right?

29
Path Discovery Problem
  • Finding data flow paths from get_QueryString to
    Start can be problematic
  • Lowest level data flow information is conveyed
    with respect to instructions
  • What if hundreds of assemblies are being
    analyzed?
  • Not enough physical memory!

30
Path Discovery Solution
  • Path Discovery makes use of generalized data flow
    relationships
  • Block-tier, method-tier, type-tier, etc
  • Reachable paths are identified using a simple
    algorithm
  • Progressive Qualified Elaboration (PQE)
  • PQE is designed to reduce the amount of analysis
    information that must be considered

31
Progressive Qualified Elaboration
Reachable paths are progressively found between
source and sink flow descriptors within a set of
target assemblies
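
The slides do not spell out the PQE algorithm, so the following is only one possible reading of "progressive" elaboration: start at the coarsest tier, keep the elements that lie on some source-to-sink path, and look at the next finer tier only inside those elements. The names `pqe`, `on_path`, and `reach`, and the encoding of an element as a tuple of tier names, are assumptions. In this toy the coarse edges are derived from the fine ones, whereas a real implementation would presumably query each tier from the database.

```python
def reach(graph, start):
    """Return every node reachable from start (including start)."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def on_path(edges, src, dst):
    """Nodes that lie on at least one src -> dst path."""
    fwd, bwd = {}, {}
    for u, v in edges:
        fwd.setdefault(u, set()).add(v)
        bwd.setdefault(v, set()).add(u)
    return reach(fwd, src) & reach(bwd, dst)


def pqe(fine_edges, source, sink):
    """Elaborate tier by tier, keeping only elements still on a path."""
    keep = None  # prefixes that survived the previous (coarser) tier
    for depth in range(1, len(source) + 1):
        edges = set()
        for u, v in fine_edges:
            parents_ok = keep is None or (
                u[:depth - 1] in keep and v[:depth - 1] in keep)
            if parents_ok:
                edges.add((u[:depth], v[:depth]))
        keep = on_path(edges, source[:depth], sink[:depth])
        if not keep:
            return None  # unreachable at this tier; skip finer tiers
    return keep


# Elements as (assembly, data type, method) tuples.
q = ("System.Web", "HttpRequest", "get_QueryString")
c = ("WebClient.dll", "WebClient", "ExecuteCommand")
s = ("WebService.dll", "WebService", "ExecuteCommand")
p = ("System", "Process", "Start")
noise = ("Other.dll", "Logger", "Write")
bridged = {(q, c), (c, s), (s, p), (q, noise)}

found = pqe(bridged, q, p)
missed = pqe(bridged - {(c, s)}, q, p)
```

In this example `Other.dll` is discarded at the assembly tier and never examined at finer tiers, and removing the bridged edge makes the search stop at the first tier.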
32
Flow descriptors for this scenario
Source flow descriptor
Tier         Information
Component    fout(Undefined)
Assembly     fout(System.Web)
Data Type    fout(System.Web.HttpRequest)
Method       fout(get_QueryString, 0)
Basic Block  fout(get_QueryString, 0)
Instruction  fout(get_QueryString, 0)

Sink flow descriptor
Tier         Information
Component    fin(Undefined)
Assembly     fin(System)
Data Type    fin(System.Diagnostics.Process)
Method       fin(Start, 0)
Basic Block  fin(Start, 0)
Instruction  fin(Start, 0)
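
A flow descriptor like the ones above can be modeled as one value per tier plus a direction. The `FlowDescriptor` class and its `at` method are hypothetical; only the tier names and values come from the slide.

```python
from dataclasses import dataclass

TIERS = ("Component", "Assembly", "Data Type",
         "Method", "Basic Block", "Instruction")


@dataclass(frozen=True)
class FlowDescriptor:
    """Hypothetical per-tier description of a source or sink."""
    direction: str  # "fout" for a source, "fin" for a sink
    values: tuple   # one entry per tier, None when undefined

    def at(self, tier):
        """Render the descriptor at one tier, as on the slide."""
        value = self.values[TIERS.index(tier)]
        shown = "Undefined" if value is None else value
        return f"{self.direction}({shown})"


source = FlowDescriptor("fout", (
    None, "System.Web", "System.Web.HttpRequest",
    "get_QueryString, 0", "get_QueryString, 0", "get_QueryString, 0"))
sink = FlowDescriptor("fin", (
    None, "System", "System.Diagnostics.Process",
    "Start, 0", "Start, 0", "Start, 0"))
```

Keeping every tier in one object lets an analysis ask for the source or sink at whatever granularity it is currently working at.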
33
Applying this to web services
  • Suppose there is some code in the web client that
    does the following
  • client.ExecuteCommand(request.QueryString["x"])
  • Bridging makes it possible to show a complete
    data flow path from get_QueryString to Start
  • Let's see how we get there using PQE
  • PQE starts from a macro-tier, such as the
    component tier

34
Reachability Component Tier
Data flow Def-Use relationships between
components
Interpretation: an edge from u to v means that, in at least
one situation, v uses data defined by u
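
One way to state this interpretation formally, assuming (the slide does not say so) that the edges at each tier t are derived from instruction-level Def-Use edges, and that a hypothetical map \pi_t sends an instruction to the element that contains it at tier t:

```latex
(u, v) \in E_t
\iff
\exists\, (a, b) \in E_{\mathrm{instr}} :\;
\pi_t(a) = u \;\wedge\; \pi_t(b) = v
```

Under this reading a coarse edge over-approximates the finer ones: its presence does not guarantee a path between any particular pair of instructions, but its absence rules out every finer path between u and v.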
35
Reachability Assembly Tier
Data flow Def-Use relationships between
assemblies
36
Reachability Data Type Tier
Data flow Def-Use relationships between data
types
37
Reachability Method Tier
Data flow Def-Use relationships between methods
38
Reachability Basic Block Tier
Data flow Def-Use relationships between blocks
39
Reachability Instruction Tier
Data flow Def-Use relationships between
instructions
40
The end-result
  • A complete data flow path is identified
  • Data flows across an indirect boundary
  • Without bridging, it would not be possible to
    seamlessly perform this analysis
  • This means the security issue would be missed
  • Note that the security issue exists in the web
    service independent of the web client
  • Example was meant to show simple indirect data
    flow

41
Future Work
  • Import and analyze large data sets
  • All PE modules from Windows Vista?
  • Improve database performance
  • Optimization work has not started yet
  • It is currently very slow
  • Implement additional peons
  • Leak Check
  • And the list goes on

42
Conclusion
  • Phoenix is an exciting project
  • Software analysis is fun and challenging
  • Hopefully the database stuff pans out
  • Questions?