Title: Deconstructing Dyninst: The SymtabAPI
1 Deconstructing Dyninst: The SymtabAPI
- Giridhar Ravipati
- giri_at_cs.wisc.edu
- Paradyn project
- University of Wisconsin
2 The Binary
- Type Information
- Program Headers
- Relocations
- Symbol Versions
- Exception Information
- Section Headers
- Local Variable Information
- Section Data
- Shared Object Dependencies
- Symbols
- Line Number Information
- Dynamic Segment Information
3 Motivation
- Binaries are increasingly complex
- Different formats
- Lots of information
- Lack of portability
- Need for a tool that provides a simple view of binaries on different platforms
4 SymtabAPI - Overview
- A multi-platform library for parsing object files
- Goals
  - Extensibility
    - User-extensible data structures
  - Generality
    - Parse ELF/XCOFF/PE object files
    - On-disk/in-memory parsing
  - Abstraction
    - Be file format independent
  - Interactivity
    - Update data incrementally
5 SymtabAPI Features
- Symtab 1.0
  - Parse and query the symbols in a binary
  - Update existing symbol information
  - Add new symbols
  - Export/emit symbols
- Symtab 2.0
  - Parse debug information
  - Ability to generate new binary files
  - Dynamic address mapping
6 Agenda
- Motivation
- Overview
- Features
- Different Scenarios
  - Look up a symbol
  - Update symbol information
  - Look up the type of a local variable
  - Look up line information
  - Get region information
  - Map absolute addresses to offsets
  - Emit a new binary
7 Scenario 1: Look up the Address of a Function
- Functionality: Ability to query symbols and symbol information
- Abstraction: Symbol

    Symbol   Address      Size
    func1    0x0804cc84   100
    var1     0x0804cd00   4
    func2    0x0804cd1d   0
    ...      ...          ...
8 Scenario 1: Operation
- Parse an object file
- Look up a symbol
- Get the address of the symbol

    std::vector<Symbol *> syms;
    bool err = Symtab::openFile("foo");
    obj->findSymbolByType(syms, "func2", Symbol::ST_FUNCTION);
    syms[0]->getAddr();
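The lookup steps above can be modeled without the library. The following is a minimal sketch, not the SymtabAPI itself: `Sym`, `SymKind`, `findByType`, `exampleTable`, and `addressOf` are hypothetical names introduced here, and the table contents are the rows shown on the Scenario 1 slide.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the SymtabAPI classes.
enum class SymKind { Function, Object };

struct Sym {
    std::string name;
    SymKind kind;
    std::uint64_t addr;
    std::uint64_t size;
};

// Collect pointers to every symbol matching both name and kind.
// Returns true if at least one match was found.
bool findByType(std::vector<Sym> &table, const std::string &name,
                SymKind kind, std::vector<Sym *> &out) {
    for (Sym &s : table) {
        if (s.name == name && s.kind == kind) out.push_back(&s);
    }
    return !out.empty();
}

// The example rows from the Scenario 1 table.
std::vector<Sym> exampleTable() {
    return {
        {"func1", SymKind::Function, 0x0804cc84, 100},
        {"var1", SymKind::Object, 0x0804cd00, 4},
        {"func2", SymKind::Function, 0x0804cd1d, 0},
    };
}

// Convenience wrapper: address of the first function with this name.
// The caller is expected to pass a name that exists in the table.
std::uint64_t addressOf(std::vector<Sym> &table, const std::string &name) {
    std::vector<Sym *> syms;
    bool found = findByType(table, name, SymKind::Function, syms);
    assert(found && "function symbol not present");
    (void)found;
    return syms[0]->addr;
}
```

Returning pointers into the table, rather than copies, is what makes the in-place updates of the next scenario possible in this model.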
9 Scenario 2: Update Symbols with Meaningful Information
- Functionality: Allow incremental updates of symbol objects

    Symbol   Address      Size
    func1    0x0804cc84   100
    var1     0x0804cd00   4
    func2    0x0804cd1d   0 -> 50
    ...      ...          ...
10 Scenario 2: Operation
- Parse an object file
- Look up a symbol
- Change attributes of the symbol

    std::vector<Symbol *> syms;
    obj->findSymbolByType(syms, "func2", Symbol::ST_FUNCTION);
    syms[0]->setSize(50);
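An incremental update of this kind amounts to finding a record and mutating it in place. The sketch below illustrates that idea only; `SymTable` and `SymInfo` are hypothetical types, not SymtabAPI classes, and the values mirror the func2 row of the table (size 0 updated to 50).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model of a mutable symbol record; not the SymtabAPI Symbol class.
struct SymInfo {
    std::uint64_t addr;
    std::uint64_t size;
};

class SymTable {
public:
    void add(const std::string &name, std::uint64_t addr, std::uint64_t size) {
        assert(!name.empty());
        syms_[name] = SymInfo{addr, size};
    }

    // Returns nullptr when the name is unknown.
    SymInfo *find(const std::string &name) {
        auto it = syms_.find(name);
        return it == syms_.end() ? nullptr : &it->second;
    }

    // Update the size in place; returns false if the symbol does not exist.
    bool setSize(const std::string &name, std::uint64_t newSize) {
        SymInfo *s = find(name);
        if (s == nullptr) return false;
        s->size = newSize;
        return true;
    }

private:
    std::map<std::string, SymInfo> syms_;
};
```

Only the size changes; the address and every other symbol are left untouched, which is the sense in which the update is incremental.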
11 Scenario 3: Look up the Data Type of a Local Variable
- Functionality: Ability to parse types and local variables, and to query the type of a local variable

    Symbol   Address      Size
    func1    0x0804cc84   100
    var1     0x0804cd00   4
    func2    0x0804cd1d   50
    ...      ...          ...

    Type     Local Var
    int      v1
    char     v2
12 Type Interface
- Lazy type parsing
- Functionality
  - Parse type information from the object-file debug information
  - Look up type information
  - Addition of new types
- Generic Type abstraction (name, size, etc.)
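Lazy parsing means that the debug information is not read until the first query needs it. The following is a minimal sketch of that pattern, assuming a parser callback that returns type names mapped to sizes; `LazyTypeTable` and its members are hypothetical and say nothing about how SymtabAPI implements it internally.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical sketch of lazy parsing: the (expensive) parse step runs only
// on the first query, and its result is cached for later lookups.
class LazyTypeTable {
public:
    using Parser = std::function<std::map<std::string, unsigned>()>;

    explicit LazyTypeTable(Parser parser) : parser_(std::move(parser)) {
        assert(parser_ != nullptr);
    }

    // Look up the size of a named type; returns false if it is unknown.
    bool sizeOf(const std::string &name, unsigned &size) {
        ensureParsed();
        auto it = sizes_.find(name);
        if (it == sizes_.end()) return false;
        size = it->second;
        return true;
    }

    // Add a new type on top of the parsed ones.
    void addType(const std::string &name, unsigned size) {
        ensureParsed();
        sizes_[name] = size;
    }

    bool parsed() const { return parsed_; }

private:
    // Run the parser at most once.
    void ensureParsed() {
        if (parsed_) return;
        sizes_ = parser_();
        parsed_ = true;
    }

    Parser parser_;
    std::map<std::string, unsigned> sizes_;
    bool parsed_ = false;
};
```

A client that only looks up symbols never pays for type parsing in this model, because nothing calls `ensureParsed` until a type query or addition arrives.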
13 Type Interface Example

    struct s1 {
        int f1;
        float f2[10];
        char *f3;
    };

- typeStruct (s1)
  - f1: typeScalar(int)
  - f2: typeArray of typeScalar(float)
  - f3: typePointer to typeScalar(char)
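The type tree for `s1` can be sketched as a small class hierarchy in which composite types hold references to their component types. This is an illustrative model only: `TypeNode`, `ScalarType`, `PointerType`, `ArrayType`, `StructType`, and `buildS1` are hypothetical names, chosen to echo the typeScalar/typePointer/typeArray/typeStruct labels on the slide rather than to reproduce the library's classes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type model for illustration; not the SymtabAPI type classes.
struct TypeNode {
    virtual ~TypeNode() = default;
    virtual std::string name() const = 0;
};
using TypePtr = std::shared_ptr<const TypeNode>;

struct ScalarType : TypeNode {
    explicit ScalarType(std::string n) : n_(std::move(n)) {}
    std::string name() const override { return n_; }
    std::string n_;
};

struct PointerType : TypeNode {
    explicit PointerType(TypePtr to) : to_(std::move(to)) {}
    std::string name() const override { return to_->name() + " *"; }
    TypePtr to_;
};

struct ArrayType : TypeNode {
    ArrayType(TypePtr elem, unsigned count)
        : elem_(std::move(elem)), count_(count) {}
    std::string name() const override {
        return elem_->name() + "[" + std::to_string(count_) + "]";
    }
    TypePtr elem_;
    unsigned count_;
};

struct StructType : TypeNode {
    explicit StructType(std::string n) : n_(std::move(n)) {}
    void addField(std::string field, TypePtr type) {
        assert(type != nullptr);
        fields_.emplace_back(std::move(field), std::move(type));
    }
    std::string name() const override { return "struct " + n_; }
    std::string n_;
    std::vector<std::pair<std::string, TypePtr>> fields_;
};

// Build the tree for: struct s1 { int f1; float f2[10]; char *f3; };
std::shared_ptr<StructType> buildS1() {
    auto s1 = std::make_shared<StructType>("s1");
    s1->addField("f1", std::make_shared<ScalarType>("int"));
    s1->addField("f2", std::make_shared<ArrayType>(
                           std::make_shared<ScalarType>("float"), 10u));
    s1->addField("f3", std::make_shared<PointerType>(
                           std::make_shared<ScalarType>("char")));
    return s1;
}
```

Every node answers the same `name()` query regardless of its kind, which is one way to read the "generic Type abstraction" mentioned on the previous slide.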
14 Local Variable Interface
- Lazy parsing along with the type parsing
- Functionality
  - Parse local variable and parameter information of functions
  - Look up variable information
  - Addition of new local variables/parameters
15 Scenario 3: Operation
- Parse types and local variables
- Retrieve the type of the local variable

    std::vector<Symbol *> syms;
    std::vector<localVar *> vars;
    obj->findSymbolByType(syms, "func1", Symbol::ST_FUNCTION);
    syms[0]->findLocalVar(vars, "v1");
    Type *vtype = vars[0]->getType();
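The two-step lookup (function first, then local variable, then its type) can be sketched with plain containers. `FuncSym`, `LocalVar`, and `localVarType` below are hypothetical names for illustration, and types are reduced to their names to keep the example short.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: each function symbol owns a list of local variables,
// and each local variable records the name of its type.
struct LocalVar {
    std::string name;
    std::string typeName;
};

struct FuncSym {
    std::string name;
    std::vector<LocalVar> locals;

    // Returns the matching local variable, or nullptr if there is none.
    const LocalVar *findLocalVar(const std::string &var) const {
        for (const LocalVar &lv : locals) {
            if (lv.name == var) return &lv;
        }
        return nullptr;
    }
};

// Resolve "function name + variable name" to a type name.
// Returns an empty string when either lookup fails.
std::string localVarType(const std::map<std::string, FuncSym> &funcs,
                         const std::string &func, const std::string &var) {
    auto it = funcs.find(func);
    if (it == funcs.end()) return "";
    const LocalVar *lv = it->second.findLocalVar(var);
    return lv ? lv->typeName : "";
}
```

Scoping the variable lookup to one function matters because two functions may each declare a local with the same name but a different type.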
16 Scenario 4: Look up Line Information
- Functionality: Ability to parse and query the line information

    Line Number Map
    Source File   Address Range
    foo.c:30      0x0804cc84 - 0x0804cc9f
    foo.c:31      0x0804cca0 - 0x0804cccf
    foo.c:32      0x0804ccd0 - 0x0804ccf0
17 Line Number Interface
- Lazy parsing
- Abstractions: lineInformation, LineNoTuple
- Functionality
  - Parse line number information from the object-file debug information
  - Look up line number information
  - Addition of new line information
18 Scenario 4: Operation
- Parse line number information
- Retrieve the source line corresponding to an address

    std::vector<LineNoTuple> lines;
    Address addr;
    obj->getSourceLines(lines, addr);
    cout << lines[0].first << lines[0].second << endl;
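A line number map of this shape is essentially a set of address ranges, each tagged with a source location. The sketch below shows one way to query it with an ordered map keyed by range start. `LineMap` and `SourceLine` are hypothetical names, the ranges are assumed not to overlap, and the pairing of lines 30 to 32 with the three address ranges is a reading of the slide's table rather than something the slide states outright.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical line map for illustration; not the SymtabAPI lineInformation class.
using SourceLine = std::pair<std::string, unsigned>;  // (file, line)

class LineMap {
public:
    // Record that addresses in [lo, hi] belong to file:line.
    // Ranges are assumed not to overlap.
    void addRange(std::uint64_t lo, std::uint64_t hi,
                  const std::string &file, unsigned line) {
        assert(lo <= hi);
        ranges_[lo] = Entry{hi, SourceLine(file, line)};
    }

    // Find the source line covering addr; returns false if none does.
    bool lookup(std::uint64_t addr, SourceLine &out) const {
        auto it = ranges_.upper_bound(addr);  // first range starting after addr
        if (it == ranges_.begin()) return false;
        --it;                                 // last range starting at or before addr
        if (addr > it->second.hi) return false;
        out = it->second.line;
        return true;
    }

private:
    struct Entry {
        std::uint64_t hi;
        SourceLine line;
    };
    std::map<std::uint64_t, Entry> ranges_;
};
```

Keying on the range start keeps each query logarithmic in the number of ranges, which is useful when a binary carries many thousands of line entries.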
19 Scenario 5: Identify if an Address Falls within a Code Region
- Functionality: Ability to identify all the regions (code/data) and query region information

[Figure: sections of a binary (.text, .data, .rodata, .bss, .dynamic, .dyninstInst) grouped into CodeRegions, DataRegions, and a Dynamic Region]
20 Region Interface
- Move away from the notion of one Code/Data Region to many Code/Data Regions
- Responsible for querying region information
  - Permissions
  - Type
  - Disk offset/size
  - Memory offset/size
- Handles addition of new Regions
  - Useful for new binary generation
21 Scenario 5: Operation
- Parse object file for regions
- Identify the types of regions
- Use the type to find if an address is within a code region

    std::vector<Region *> regions;
    Offset addr;
    obj->getCodeRegions(regions);
    obj->isCode(addr);
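With many regions instead of one, a code check becomes a scan over the regions of the right type. The sketch below illustrates that under simplified assumptions; `Region`, `RegionKind`, and this free-standing `isCode` are hypothetical models, not the library's definitions, and only the in-memory offset and size are represented.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical region model; names and fields are illustrative only.
enum class RegionKind { Code, Data, Dynamic };

struct Region {
    std::string name;
    RegionKind kind;
    std::uint64_t memOffset;  // start address in memory
    std::uint64_t memSize;    // size in bytes

    // True if addr lies in [memOffset, memOffset + memSize).
    bool contains(std::uint64_t addr) const {
        return addr >= memOffset && addr - memOffset < memSize;
    }
};

// True if addr lies inside any region of kind Code.
bool isCode(const std::vector<Region> &regions, std::uint64_t addr) {
    for (const Region &r : regions) {
        if (r.kind == RegionKind::Code && r.contains(addr)) return true;
    }
    return false;
}
```

Writing the bounds check as `addr - memOffset < memSize` avoids overflow when a region ends near the top of the address space.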
22 Scenario 6: Map Absolute Addresses to Offsets
- Functionality: Ability to translate absolute addresses to offsets
Static Symtab objects
23 Address Mapping Interface
- Class AddressLookup provides the mapping interface
- Associated with one process
- Examines a process and finds its dynamic libraries and executables, and each one's load address
24 Scenario 6: Operation
- Create an AddressLookup associated with the process
- Find the function name at that address

    Address addr;
    Symbol *sym;
    Symtab *obj;
    AddressLookup *alookup = AddressLookup::createAddressLookup(pid);
    alookup->getSymbol(addr, sym, obj);
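The translation itself is simple arithmetic once each object's load address is known: find the object whose mapping covers the absolute address, then subtract its load address. The sketch below assumes the load addresses have already been discovered; `LoadedObject`, `toOffset`, and `toAbsolute` are hypothetical names and do not describe how AddressLookup inspects a process.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of one object mapped into a process.
struct LoadedObject {
    std::string path;
    std::uint64_t loadAddr;  // where the object was mapped
    std::uint64_t size;      // extent of the mapping in bytes
};

// Translate an absolute address into (object, offset within that object).
// Returns false if no loaded object covers the address.
bool toOffset(const std::vector<LoadedObject> &objs, std::uint64_t absAddr,
              const LoadedObject *&obj, std::uint64_t &offset) {
    for (const LoadedObject &o : objs) {
        if (absAddr >= o.loadAddr && absAddr - o.loadAddr < o.size) {
            obj = &o;
            offset = absAddr - o.loadAddr;
            return true;
        }
    }
    return false;
}

// The inverse mapping: offset within an object back to an absolute address.
std::uint64_t toAbsolute(const LoadedObject &obj, std::uint64_t offset) {
    assert(offset < obj.size);
    return obj.loadAddr + offset;
}
```

The resulting offset is what a static symbol table can be queried with, since the symbols of a shared library are recorded relative to the library rather than to any one process.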
25 Scenario 7: Generate a New Binary
- Functionality: Ability to emit a new binary with all the changes made

[Figure: layout of the emitted binary, showing .text, .data, .dynamic, .dyninstInst, and .symtab, labeled as modified regions, unmodified regions, a new region, and the modified symbol table]
26 Scenario 7: Operation
- Parse an object file
- Make changes
  - Add new symbols
  - Add new code (instrumentation)
- Emit a new binary (permanent changes)

    Offset newoff = obj->getFreeOffset(dataSize);
    obj->addRegion(newoff, dataBuffer, dataSize, ".newtext", Region::RT_TEXT);
    obj->emit("new binary");
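The first two calls above follow a common pattern: ask for an unused offset, then place a new region there. The sketch below models only that bookkeeping, with writing the file left out. `Image`, `Section`, `freeOffset`, and `addSection` are hypothetical names, and the 16-byte default alignment is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory image model; not the SymtabAPI implementation.
struct Section {
    std::string name;
    std::uint64_t offset;
    std::vector<unsigned char> data;
};

class Image {
public:
    // First offset past every existing section, rounded up to `align`.
    std::uint64_t freeOffset(std::uint64_t align = 16) const {
        assert(align != 0);
        std::uint64_t end = 0;
        for (const Section &s : sections_) {
            std::uint64_t sEnd = s.offset + s.data.size();
            if (sEnd > end) end = sEnd;
        }
        return (end + align - 1) / align * align;
    }

    // Add a section; rejects one that overlaps an existing section.
    bool addSection(const std::string &name, std::uint64_t offset,
                    const std::vector<unsigned char> &data) {
        for (const Section &s : sections_) {
            bool disjoint = offset + data.size() <= s.offset ||
                            s.offset + s.data.size() <= offset;
            if (!disjoint) return false;
        }
        sections_.push_back(Section{name, offset, data});
        return true;
    }

    std::size_t sectionCount() const { return sections_.size(); }

private:
    std::vector<Section> sections_;
};
```

Placing new code past the end of all existing sections leaves the original layout untouched, which keeps the unmodified regions byte-for-byte identical in this model.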
27 Questions?
- Downloads
  - SymtabAPI
    - http://www.paradyn.org/html/downloads.html
  - SymtabAPI Programmer's Guide
    - http://www.paradyn.org/html/manuals.html