Deconstructing Dyninst: The SymtabAPI

1
Deconstructing Dyninst: The SymtabAPI
  • Giridhar Ravipati
  • giri_at_cs.wisc.edu
  • Paradyn project
  • University of Wisconsin

2
The Binary
  • Type Information
  • Program Headers
  • Relocations
  • Symbol Versions
  • Exception Information
  • Section Headers
  • Local Variable Information
  • Section Data
  • Shared Object Dependencies
  • Symbols
  • Line Number Information
  • Dynamic Segment Information

3
Motivation
  • Binaries are increasingly complex
  • Different formats
  • Lots of information
  • Lack of portability
  • Need for a tool that provides a simple view of
    binaries on different platforms.

4
SymtabAPI - Overview
  • A multi-platform library for parsing object files
  • Goals
      • Extensibility
          • User-extensible data structures
      • Generality
          • Parse ELF/XCOFF/PE object files
          • On-disk/in-memory parsing
      • Abstraction
          • Be file format independent
      • Interactivity
          • Update data incrementally

5
SymtabAPI Features
  • Symtab 1.0
      • Parse and query the symbols in a binary
      • Update existing symbol information
      • Add new symbols
      • Export/emit symbols
  • Symtab 2.0
      • Parse debug information
      • Ability to generate new binary files
      • Dynamic address mapping

6
Agenda
  • Motivation
  • Overview
  • Features
  • Different Scenarios
      • Look up a symbol
      • Update symbol information
      • Look up the type of a local variable
      • Look up line information
      • Get region information
      • Map absolute addresses to offsets
      • Emit a new binary

7
Scenario 1: Look Up the Address of a Function
  • Functionality: Ability to query symbols and symbol
    information
  • Abstraction: Symbol

Symbol   Address      Size
func1    0x0804cc84   100
var1     0x0804cd00   4
func2    0x0804cd1d   0
...      ...          ...
8
Scenario 1: Operation
  • Parse an object file
  • Look up a symbol
  • Get the address of the symbol

std::vector<Symbol *> syms;
bool err = Symtab::openFile("foo");
obj->findSymbolByType(syms, "func2", Symbol::ST_FUNCTION);
syms[0]->getAddr();
9
Scenario 2: Update Symbols with Meaningful Information
  • Functionality: Allow incremental updates of
    symbol objects

Symbol   Address      Size
func1    0x0804cc84   100
var1     0x0804cd00   4
func2    0x0804cd1d   0 -> 50
...      ...          ...
10
Scenario 2: Operation
  • Parse an object file
  • Look up a symbol
  • Change attributes of the symbol

std::vector<Symbol *> syms;
obj->findSymbolByType(syms, "func2", Symbol::ST_FUNCTION);
syms[0]->setSize(50);
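
The lookup and update steps of Scenarios 1 and 2 can be pictured with a small self-contained model. This is only a sketch and not the SymtabAPI: `ToySymbol`, `ToySymtab`, `findByType`, `setSize`, and `makeExampleTable` are hypothetical names, and the only data taken from the slides is the three-row symbol table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for the Symbol abstraction; these are NOT SymtabAPI types.
enum class ToySymbolType { Function, Object };

struct ToySymbol {
    std::string name;
    ToySymbolType type;
    std::uint64_t addr;
    std::uint64_t size;
};

class ToySymtab {
public:
    void add(const ToySymbol &s) { symbols_.push_back(s); }

    // Appends pointers to every symbol matching name and type.
    // Returns false when nothing matched. Pointers stay valid only until
    // the next add(), because the vector may reallocate.
    bool findByType(std::vector<ToySymbol *> &out, const std::string &name,
                    ToySymbolType type) {
        const std::size_t before = out.size();
        for (ToySymbol &s : symbols_) {
            if (s.name == name && s.type == type) out.push_back(&s);
        }
        return out.size() > before;
    }

private:
    std::vector<ToySymbol> symbols_;
};

// In-place update, mirroring the setSize(50) step on the slide.
void setSize(ToySymbol *sym, std::uint64_t newSize) {
    assert(sym != nullptr);
    sym->size = newSize;
}

// Builds the three-row table shown on the slides.
ToySymtab makeExampleTable() {
    ToySymtab t;
    t.add({"func1", ToySymbolType::Function, 0x0804cc84, 100});
    t.add({"var1", ToySymbolType::Object, 0x0804cd00, 4});
    t.add({"func2", ToySymbolType::Function, 0x0804cd1d, 0});
    return t;
}
```

Looking up "func2" as a function yields one match whose address is 0x0804cd1d; calling `setSize` on that match changes the stored entry, so a later lookup sees the new size.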
11
Scenario 3: Look Up the Data Type of a Local Variable
  • Functionality: Ability to parse types and local
    variables, and to query the type of a local variable

Symbol   Address      Size
func1    0x0804cc84   100
var1     0x0804cd00   4
func2    0x0804cd1d   50
...      ...          ...

Type     Local Var
int      v1
char     v2
12
Type Interface
  • Lazy type parsing
  • Functionality
      • Parse type information from the object-file
        debug information
      • Look up type information
      • Add new types
  • Generic Type abstraction (name, size, etc.)
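
Lazy parsing, as listed above, means the debug information is read only when the first query arrives. The sketch below illustrates that pattern in isolation. It is an assumption-laden toy, not how the SymtabAPI is implemented: `ParsedType`, `LazyTypeTable`, and the hard-coded type sizes are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical generic type record (name and size only).
struct ParsedType {
    std::string name;
    unsigned size;
};

class LazyTypeTable {
public:
    // Looks up a type by name; the pretend parse runs on first use only.
    const ParsedType *findType(const std::string &name) {
        ensureParsed();
        auto it = types_.find(name);
        return it == types_.end() ? nullptr : &it->second;
    }

    // Adds a new type on top of whatever was parsed.
    void addType(const ParsedType &t) {
        assert(!t.name.empty());
        ensureParsed();
        types_[t.name] = t;
    }

    // Number of times the expensive parse actually ran.
    int parseCount() const { return parseCount_; }

private:
    void ensureParsed() {
        if (parsed_) return;
        // Stand-in for reading debug information from the object file.
        // The sizes are illustrative values, not read from any binary.
        types_["int"] = ParsedType{"int", 4};
        types_["char"] = ParsedType{"char", 1};
        parsed_ = true;
        ++parseCount_;
    }

    bool parsed_ = false;
    int parseCount_ = 0;
    std::map<std::string, ParsedType> types_;
};
```

Constructing the table costs nothing; the first `findType` or `addType` call triggers the parse, and every later call reuses the cached result.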

13
Type Interface Example
struct s1 { int f1; float f2[10]; char *f3; };

typeStruct (s1)
  f1: typeScalar(int)
  f2: typeArray -> typeScalar(float)
  f3: typePointer -> typeScalar(char)
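
The decomposition of `s1` into scalar, array, pointer, and struct nodes can be modeled as a small tree. The sketch below is hypothetical: `TypeNode`, the `make...` helpers, and `describe` are illustrative names rather than SymtabAPI classes, and the pointer on `f3` is inferred from the typePointer node in the example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical generic type node; the kinds echo the slide's type names.
struct TypeNode {
    enum Kind { Scalar, Array, Pointer, Struct };
    Kind kind = Scalar;
    std::string name;                   // scalar or struct name
    std::shared_ptr<TypeNode> base;     // element type (Array) or pointee (Pointer)
    unsigned count = 0;                 // element count (Array only)
    std::vector<std::pair<std::string, std::shared_ptr<TypeNode>>> fields; // Struct only
};
using TypePtr = std::shared_ptr<TypeNode>;

TypePtr makeScalar(const std::string &name) {
    auto t = std::make_shared<TypeNode>();
    t->kind = TypeNode::Scalar;
    t->name = name;
    return t;
}

TypePtr makeArray(TypePtr elem, unsigned count) {
    auto t = std::make_shared<TypeNode>();
    t->kind = TypeNode::Array;
    t->base = std::move(elem);
    t->count = count;
    return t;
}

TypePtr makePointer(TypePtr pointee) {
    auto t = std::make_shared<TypeNode>();
    t->kind = TypeNode::Pointer;
    t->base = std::move(pointee);
    return t;
}

TypePtr makeStruct(const std::string &name,
                   std::vector<std::pair<std::string, TypePtr>> fields) {
    auto t = std::make_shared<TypeNode>();
    t->kind = TypeNode::Struct;
    t->name = name;
    t->fields = std::move(fields);
    return t;
}

// Renders a node as a short C-like description.
std::string describe(const TypePtr &t) {
    assert(t != nullptr);
    switch (t->kind) {
    case TypeNode::Scalar:  return t->name;
    case TypeNode::Array:   return describe(t->base) + "[" + std::to_string(t->count) + "]";
    case TypeNode::Pointer: return describe(t->base) + "*";
    case TypeNode::Struct:  return "struct " + t->name;
    }
    return "";
}

// Builds the tree for: struct s1 { int f1; float f2[10]; char *f3; };
TypePtr makeS1() {
    return makeStruct("s1", {{"f1", makeScalar("int")},
                             {"f2", makeArray(makeScalar("float"), 10)},
                             {"f3", makePointer(makeScalar("char"))}});
}
```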
14
Local Variable Interface
  • Lazy parsing along with the type parsing
  • Functionality
      • Parse local variable and parameter information of
        functions
      • Look up variable information
      • Add new local variables/parameters

15
Scenario 3: Operation
  • Parse types and local variables
  • Retrieve the type of the local variable

std::vector<Symbol *> syms;
std::vector<localVar *> vars;
obj->findSymbolByType(syms, "func1", Symbol::ST_FUNCTION);
syms[0]->findLocalVar(vars, "v1");
Type *vtype = vars[0]->getType();
16
Scenario 4: Look Up Line Information
  • Functionality: Ability to parse and query the
    line information

Line Number Map
Source File   Address Range
foo.c:30      0x0804cc84 - 0x0804cc9f
foo.c:31      0x0804cca0 - 0x0804cccf
foo.c:32      0x0804ccd0 - 0x0804ccf0
17
Line Number Interface
  • Lazy parsing
  • Abstractions: lineInformation, LineNoTuple
  • Functionality
      • Parse line number information from the
        object-file debug information
      • Look up line number information
      • Add new line information

18
Scenario 4: Operation
  • Parse line number information
  • Retrieve the source line corresponding to an
    address

std::vector<LineNoTuple> lines;
Address addr;
obj->getSourceLines(lines, addr);
cout << lines[0].first << lines[0].second << endl;
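
A line number map of this kind is essentially a set of non-overlapping address ranges, each tagged with a file and line. The sketch below shows one common way to query such a map, using an ordered map keyed by range start. It is not the SymtabAPI implementation: `ToyLineMap` and its methods are hypothetical, and pairing lines 30, 31, and 32 with the three ranges in ascending order is an assumption about the Scenario 4 figure.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical entry: the key of the map is the first address of the range.
struct ToyLineEntry {
    std::uint64_t last;   // last address of the range (inclusive)
    std::string file;
    unsigned line;
};

class ToyLineMap {
public:
    void addRange(std::uint64_t first, std::uint64_t last,
                  const std::string &file, unsigned line) {
        assert(first <= last);
        ranges_[first] = ToyLineEntry{last, file, line};
    }

    // Finds the (file, line) pair whose range covers addr.
    // Returns false when no range contains the address.
    bool getSourceLine(std::uint64_t addr,
                       std::pair<std::string, unsigned> &out) const {
        auto it = ranges_.upper_bound(addr);   // first range starting after addr
        if (it == ranges_.begin()) return false;
        --it;                                   // last range starting at or before addr
        if (addr > it->second.last) return false;
        out = std::make_pair(it->second.file, it->second.line);
        return true;
    }

private:
    std::map<std::uint64_t, ToyLineEntry> ranges_;
};

// Address ranges from the Scenario 4 slide; the line pairing is assumed.
ToyLineMap makeExampleLineMap() {
    ToyLineMap m;
    m.addRange(0x0804cc84, 0x0804cc9f, "foo.c", 30);
    m.addRange(0x0804cca0, 0x0804cccf, "foo.c", 31);
    m.addRange(0x0804ccd0, 0x0804ccf0, "foo.c", 32);
    return m;
}
```

Keying the map by range start keeps each query logarithmic in the number of ranges, which matters when a binary carries many line entries.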
19
Scenario 5: Identify Whether an Address Falls Within
a Code Region
  • Functionality: Ability to identify all the
    regions (code/data) and query region information

[Figure: sections .text, .data, .rodata, .bss, .dynamic, and .dyninstInst, grouped into code regions, data regions, and a dynamic region]
20
Region Interface
  • Moves away from the notion of a single code/data
    region to many code/data regions
  • Responsible for querying region information
      • Permissions
      • Type
      • Disk offset/size
      • Memory offset/size
  • Handles addition of new regions
      • Useful for new binary generation

21
Scenario 5: Operation
  • Parse object file for regions
  • Identify the types of regions
  • Use the type to find if an address is within a
    code region

std::vector<Region *> regions;
Offset addr;
obj->getCodeRegions(regions);
obj->isCode(addr);
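
The region check above comes down to a containment test over typed address ranges. The sketch below models that idea with hypothetical types (`ToyRegion`, `ToyObject`); it is not the SymtabAPI Region class, and the region offsets and sizes are made-up values chosen only so that the func1 address from the earlier slides falls inside .text.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical region kinds, loosely following the code/data/dynamic grouping.
enum class ToyRegionType { Text, Data, Dynamic };

struct ToyRegion {
    std::string name;
    ToyRegionType type;
    std::uint64_t memOffset;
    std::uint64_t memSize;

    // True when addr lies in [memOffset, memOffset + memSize).
    bool contains(std::uint64_t addr) const {
        return addr >= memOffset && addr - memOffset < memSize;
    }
};

class ToyObject {
public:
    void addRegion(const ToyRegion &r) {
        assert(r.memSize > 0);
        regions_.push_back(r);
    }

    // Collects every region typed as code.
    void getCodeRegions(std::vector<const ToyRegion *> &out) const {
        for (const ToyRegion &r : regions_) {
            if (r.type == ToyRegionType::Text) out.push_back(&r);
        }
    }

    // True when addr falls inside any code region.
    bool isCode(std::uint64_t addr) const {
        for (const ToyRegion &r : regions_) {
            if (r.type == ToyRegionType::Text && r.contains(addr)) return true;
        }
        return false;
    }

private:
    std::vector<ToyRegion> regions_;
};

// Illustrative layout; offsets and sizes are invented for the example.
ToyObject makeExampleObject() {
    ToyObject obj;
    obj.addRegion({".text", ToyRegionType::Text, 0x0804c000, 0x2000});
    obj.addRegion({".data", ToyRegionType::Data, 0x0804f000, 0x1000});
    obj.addRegion({".dyninstInst", ToyRegionType::Text, 0x08060000, 0x1000});
    return obj;
}
```

Because more than one region can be code (here .text and an added .dyninstInst), the check iterates over all code regions instead of assuming a single one.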
22
Scenario 6: Map Absolute Addresses to Offsets
  • Functionality: Ability to translate absolute
    addresses to offsets

[Figure: Static Symtab objects]
23
Address Mapping Interface
  • Class AddressLookup provides the mapping
    interface
  • Associated with one process
  • Examines a process and finds its dynamic
    libraries and executables, and each one's load
    address

24
Scenario 6: Operation
  • Create an AddressLookup associated with the process
  • Find the symbol at a given address

Address addr;
Symbol *sym;
Symtab *obj;
AddressLookup *alookup = AddressLookup::createAddressLookup(pid);
alookup->getSymbol(addr, sym, obj);
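
Translating an absolute address to an offset amounts to finding the loaded object whose address range contains it and subtracting that object's load address. The sketch below shows the arithmetic with hypothetical types (`ToyLoadedObject`, `ToyAddressLookup`); it does not inspect a real process, and the object names, load addresses, and sizes are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record of one executable or library mapped into a process.
struct ToyLoadedObject {
    std::string name;
    std::uint64_t loadAddr;
    std::uint64_t size;
};

class ToyAddressLookup {
public:
    void addObject(const ToyLoadedObject &o) {
        assert(o.size > 0);
        objects_.push_back(o);
    }

    // Absolute address -> (owning object, offset within that object).
    bool getOffset(std::uint64_t addr, std::string &objName,
                   std::uint64_t &offset) const {
        for (const ToyLoadedObject &o : objects_) {
            if (addr >= o.loadAddr && addr - o.loadAddr < o.size) {
                objName = o.name;
                offset = addr - o.loadAddr;
                return true;
            }
        }
        return false;
    }

    // (Object, offset) -> absolute address; the inverse mapping.
    bool getAddress(const std::string &objName, std::uint64_t offset,
                    std::uint64_t &addr) const {
        for (const ToyLoadedObject &o : objects_) {
            if (o.name == objName && offset < o.size) {
                addr = o.loadAddr + offset;
                return true;
            }
        }
        return false;
    }

private:
    std::vector<ToyLoadedObject> objects_;
};

// Invented process layout used only for the example.
ToyAddressLookup makeExampleLookup() {
    ToyAddressLookup lk;
    lk.addObject({"a.out", 0x08048000, 0x10000});
    lk.addObject({"libfoo.so", 0x40000000, 0x100000});
    return lk;
}
```

With this layout, absolute address 0x40001234 maps to offset 0x1234 in libfoo.so, and offset 0x4c84 in a.out maps back to 0x0804cc84.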
25
Scenario 7: Generate a New Binary
  • Functionality: Ability to emit a new binary with
    all the changes made

[Figure: layout of the emitted binary with sections .text, .data, .dynamic, .dyninstInst, and .symtab, labeled as modified regions, unmodified regions, a new region, and the modified symbol table]
26
Scenario 7: Operation
  • Parse an object file
  • Make changes
      • Add new symbols
      • Add new code (instrumentation)
  • Emit a new binary (permanent changes)

Offset newoff = obj->getFreeOffset(dataSize);
obj->addRegion(newoff, dataBuffer, dataSize, ".newtext", Region::RT_TEXT);
obj->emit("new binary");
27
Questions?
  • Downloads
      • SymtabAPI: http://www.paradyn.org/html/downloads.html
      • SymtabAPI Programmer's Guide: http://www.paradyn.org/html/manuals.html