Title: Dependence Graphs for Information Assurance
1. Dependence Graphs for Information Assurance
- Paul Anderson, paul_at_grammatech.com
- Tim Teitelbaum, tt_at_grammatech.com (Cornell)
- GrammaTech, Inc., Ithaca, NY, http://www.grammatech.com
2.
- Problem
  - Understanding information flows is important for solving security problems
  - But tool support for understanding information flows is poor
    - too abstract / not code based / research languages
    - too imprecise / don't scale
- Opportunity
  - Dependence analysis: a sound and tractable basis for understanding information flows
  - Theory of dependence graphs and program slicing: a mature theory for compilers and software-engineering tools
- Objective
  - Effective information-flow analysis tools based on dependence graphs
    - modest goal: less expressive than type-based approaches
    - ambitious goal: C/C++, interprocedural precision, scalable
  - Apply to security problems
3. Sample application 1: Covert Channel Analysis
[Figure: NRL Pump diagram. Labels: HI, LO, ACK, NRL PUMP, read_high, write_high, read_low, write_low]
The chop between read_high and write_low shows possible information flows from HI to LO.
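As a rough illustration of the chop query, the sketch below computes, for a single-procedure dependence graph, the nodes that lie on some path from a source to a target. In that restricted setting the result is the intersection of forward and backward reachability; interprocedurally precise chopping needs more than this. The graph and the node names `buffer` and `ack_delay` are hypothetical and are not taken from the actual NRL Pump code.

```python
from collections import deque

def reachable(graph, starts):
    """Return every node reachable from `starts` by following edges in `graph`."""
    seen = set(starts)
    work = deque(starts)
    while work:
        node = work.popleft()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                work.append(succ)
    return seen

def reverse(graph):
    """Return the graph with every edge flipped."""
    rev = {}
    for src, dsts in graph.items():
        for dst in dsts:
            rev.setdefault(dst, set()).add(src)
    return rev

def chop(graph, sources, targets):
    """Nodes on some dependence path from a source to a target (single procedure only)."""
    return reachable(graph, sources) & reachable(reverse(graph), targets)

# Hypothetical dependence edges: messages flow LO -> HI, acknowledgements flow HI -> LO.
deps = {
    "read_low": {"buffer"},
    "buffer": {"write_high"},
    "read_high": {"ack_delay"},
    "ack_delay": {"write_low"},
}

print(sorted(chop(deps, {"read_high"}, {"write_low"})))
print(sorted(chop(deps, {"read_low"}, {"write_low"})))
```

A non-empty chop from `read_high` to `write_low` is what an analyst would inspect as a candidate HI-to-LO flow; an empty chop means no dependence path exists in the graph.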
4. Covert Channel Analysis, continued
5. Sample application 2: Buffer Overrun Analysis
[Figure labels: External strings; Internal strings; foo; bar; internal string; strcpy, strcpy (Unbounded string copies); strncpy (Bounded string copies)]
The chop between external strings and unbounded copies shows possible exploitable overruns.
6. Sample Application 3: Analysis for Efficient IRM Insertion
- Sample policy
  [Figure: security automaton with states 1 and 2 and transitions labeled read, read, send]
- Sends
  - never preceded by a read => necessarily in state 1: omit code to check state
  - always preceded by a read => necessarily in state 2: replace code with error
- Reads
  - always preceded by a read => necessarily in state 2: omit code to change state
  - never followed by a send => no subsequent violation possible: omit code to change state
- Simple implementation (Schneider)
  - inline the security automaton everywhere
  - partial evaluation
- Efficient implementation
  - exploit dependence information
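The bullets above imply a two-state automaton in which a send becomes an error once a read has happened. The sketch below shows what a naively inlined monitor for such a policy might look like, plus a lookup that mirrors the slide's rules for dropping or replacing the send check. The class name, function names, and the three-valued `read_before` argument are hypothetical; the slides do not give an implementation.

```python
class NoSendAfterRead:
    """Two-state security automaton: state 1 = no read yet, state 2 = a read has happened."""

    def __init__(self):
        self.state = 1

    def read(self):
        self.state = 2  # the "change state" code that reads normally carry

    def send(self):
        if self.state == 2:  # the "check state" code that sends normally carry
            raise PermissionError("send after read violates the policy")

def run(trace):
    """Replay a list of 'read'/'send' events; return True if the policy is never violated."""
    monitor = NoSendAfterRead()
    try:
        for event in trace:
            if event == "read":
                monitor.read()
            elif event == "send":
                monitor.send()
            else:
                raise ValueError(f"unknown event: {event}")
    except PermissionError:
        return False
    return True

def send_instrumentation(read_before):
    """Choose the code to emit at a send, given what static analysis says about
    earlier reads: 'never', 'always', or 'maybe' (assumed classification)."""
    return {
        "never": "omit check",      # necessarily in state 1
        "always": "replace with error",  # necessarily in state 2
        "maybe": "keep check",      # analysis cannot decide
    }[read_before]

print(run(["send", "send", "read"]))
print(run(["read", "send"]))
print(send_instrumentation("never"))
```

The point of the dependence-based approach is that the `never` and `always` cases can be decided statically, so the run-time check survives only where the analysis is inconclusive.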
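7. Control Flow Graphs
    #include <stdio.h>

    static int add(int a, int b);   /* prototype so main can call add */

    int main(void) {
        int sum, i;
        sum = 0;
        i = 1;
        while (i < 11) {
            sum = add(sum, i);
            i = add(i, 1);
        }
        printf("sum=%d\n", sum);
        printf("i=%d\n", i);
    }

    static int add(int a, int b) {
        return (a + b);
    }
[Figure: control flow graph. Nodes: entry main; sum = 0; i = 1; while (i < 11); call add (a_in = sum, b_in = i, sum = ret); call add (a_in = i, b_in = 1, i = ret); print sum; print i; entry add; a = a_in; b = b_in; ret = result]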
8. Dependence Graphs
[Figure: dependence graph for the same example program as on the Control Flow Graphs slide. Nodes: entry main; sum = 0; i = 1; while (i < 11); call add (a_in = sum, b_in = i, sum = ret); call add (a_in = i, b_in = 1, i = ret); print sum; print i; entry add; a = a_in; b = b_in; ret = result]
9. Dependence Graphs and Slicing
[Figure: dependence graph for the same example program as on the Control Flow Graphs slide, with a slice highlighted. Legend: control, data. Nodes: entry main; sum = 0; i = 1; while (i < 11); call add (a_in = sum, b_in = i, sum = ret); call add (a_in = i, b_in = 1, i = ret); print sum; print i; entry add; a = a_in; b = b_in; result = a + b; ret = result]
The backward slice from a statement shows all influences on that statement.
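A backward slice can be sketched as reverse reachability over control and data dependence edges. The example below encodes a simplified dependence graph for the deck's sum/i program, under the assumption that each call to `add` is collapsed into a single assignment node; the real graph on the slide models parameter passing explicitly, and precise interprocedural slicing is more involved than this.

```python
# Edges run from a statement to the statements that depend on it.
# Assumption: each call to add() is collapsed into one node.
CONTROL = {
    "entry main": ["sum = 0", "i = 1", "while (i < 11)", "print sum", "print i"],
    "while (i < 11)": ["sum = add(sum, i)", "i = add(i, 1)"],
}
DATA = {
    "sum = 0": ["sum = add(sum, i)", "print sum"],
    "sum = add(sum, i)": ["sum = add(sum, i)", "print sum"],
    "i = 1": ["while (i < 11)", "sum = add(sum, i)", "i = add(i, 1)", "print i"],
    "i = add(i, 1)": ["while (i < 11)", "sum = add(sum, i)", "i = add(i, 1)", "print i"],
}

def backward_slice(criterion):
    """Return all statements that may influence `criterion`."""
    # Build a predecessor map covering both kinds of dependence.
    preds = {}
    for edges in (CONTROL, DATA):
        for src, dsts in edges.items():
            for dst in dsts:
                preds.setdefault(dst, set()).add(src)
    in_slice = {criterion}
    work = [criterion]
    while work:
        node = work.pop()
        for p in preds.get(node, ()):
            if p not in in_slice:
                in_slice.add(p)
                work.append(p)
    return in_slice

for stmt in sorted(backward_slice("print i")):
    print(stmt)
```

Slicing from `print i` leaves out every statement that only affects `sum`, which is the behaviour the slide's caption describes.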
10. Dependence Graphs and Slicing
[Figure: the example program and its dependence graph, as on the preceding slide. Legend: control, data.]
The backward slice from a statement shows all influences on that statement.
11. Wide-Spectrum Program Representation
[Figure: architecture diagram. Labels: front ends FE1 ... FEm; Pre-IR; Builder; IR; API; Analysis operations; Synthesis operations; C client code; Scheme scripts; GUI]
- ASTs
- symbol table
- local def, use, conditional kill, pointer ref-deref info
- CFGs
- source positions
- front ends
  - EDG
    - ANSI C
    - other C
    - C++
    - Java
  - assembler / binaries
  - UML (Rose/RT)
  - Verilog
  - VHDL
  - Jovial
Support for make, libraries and archives, loader
Code: done; current; designed; prototyped; by others; IASET / OASIS
12. Pointer Analysis
- Flow insensitive, context insensitive
  - Andersen
  - Steensgaard / Das
- Improvements
  - Structure fields
  - Context sensitive
- PCC-like application for IRM
13. Andersen Pointer Analysis
- normalized statements of program (base facts)
  - p = &q
  - p = q
  - p = *q
  - *p = q
- program-independent rules
- iterate to a fixpoint
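The scheme on this slide (normalized statements as base facts, fixed rules, iteration to a fixpoint) can be sketched as a small inclusion-based solver. This is a naive round-robin version written for clarity, not the GrammaTech implementation; the statement encoding `(kind, p, q)` and the kind names `addr`, `copy`, `load`, `store` are assumptions made for the example.

```python
from collections import defaultdict

def andersen(statements):
    """Inclusion-based points-to analysis over normalized statements.

    Each statement is (kind, p, q) with kind one of:
      'addr'  : p = &q
      'copy'  : p = q
      'load'  : p = *q
      'store' : *p = q
    Returns a dict mapping each variable to its non-empty points-to set.
    """
    pts = defaultdict(set)

    def add(var, targets):
        """Add targets to pts[var]; report whether the set grew."""
        before = len(pts[var])
        pts[var] |= targets
        return len(pts[var]) != before

    changed = True
    while changed:  # iterate the rules until nothing new is derived
        changed = False
        for kind, p, q in statements:
            if kind == "addr":
                changed |= add(p, {q})
            elif kind == "copy":
                changed |= add(p, set(pts[q]))
            elif kind == "load":
                for r in list(pts[q]):
                    changed |= add(p, set(pts[r]))
            elif kind == "store":
                for r in list(pts[p]):
                    changed |= add(r, set(pts[q]))
            else:
                raise ValueError(f"unknown statement kind: {kind}")
    return {var: targets for var, targets in pts.items() if targets}

program = [
    ("addr", "p", "a"),   # p = &a
    ("addr", "q", "b"),   # q = &b
    ("copy", "p", "q"),   # p = q
    ("addr", "r", "p"),   # r = &p
    ("load", "s", "r"),   # s = *r
]
result = andersen(program)
for var in sorted(result):
    print(var, sorted(result[var]))
```

Note that `p = q` only flows facts from `q` into `p`: `q` keeps pointing to `b` alone. That directionality is what distinguishes this style from unification-based analysis.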
14. Steensgaard / Das
- Andersen
  - time: cubic in variables
- Steensgaard
  - time: almost linear in variables
  - keep at most one out edge; form unions on p = q
  - precision << Andersen
- Das
  - time ≈ Steensgaard
  - precision (size of points-to sets) ≈ Andersen
- GrammaTech implementation of Steensgaard / Das
  - tentative finding
    - performance gain often substantial
    - precision loss often unacceptable
  - current plan: continue improving Andersen
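To make the "at most one out edge, form unions" idea concrete, here is a simplified unification-based sketch built on union-find. It handles only the four normalized pointer statements and leaves out function types and other parts of Steensgaard's full algorithm, and it does not attempt Das's refinement. The class name, method names, and statement encoding are hypothetical.

```python
import itertools

class Steensgaard:
    """Simplified unification-based points-to analysis."""

    def __init__(self):
        self.parent = {}    # union-find parent links
        self.pointee = {}   # class representative -> the single class it points to
        self.fresh = itertools.count()

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def target(self, x):
        """Class pointed to by x's class, creating a placeholder if none exists yet."""
        rep = self.find(x)
        if rep not in self.pointee:
            self.pointee[rep] = ("placeholder", next(self.fresh))
        return self.find(self.pointee[rep])

    def join(self, a, b):
        """Unify two classes and, recursively, whatever they point to."""
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        pa = self.pointee.pop(a, None)
        pb = self.pointee.pop(b, None)
        self.parent[b] = a
        if pa is not None and pb is not None:
            self.pointee[a] = pa
            self.join(pa, pb)  # keep a single out edge by merging both targets
        elif pa is not None or pb is not None:
            self.pointee[a] = pa if pa is not None else pb

    def add(self, kind, p, q):
        if kind == "addr":      # p = &q
            self.join(self.target(p), q)
        elif kind == "copy":    # p = q
            self.join(self.target(p), self.target(q))
        elif kind == "load":    # p = *q
            self.join(self.target(p), self.target(self.target(q)))
        elif kind == "store":   # *p = q
            self.join(self.target(self.target(p)), self.target(q))
        else:
            raise ValueError(f"unknown statement kind: {kind}")

    def points_to(self, var, universe):
        rep = self.find(var)
        if rep not in self.pointee:
            return set()
        tgt = self.find(self.pointee[rep])
        return {v for v in universe if self.find(v) == tgt}

solver = Steensgaard()
for stmt in [("addr", "p", "a"), ("addr", "q", "b"), ("copy", "p", "q"),
             ("addr", "r", "p"), ("load", "s", "r")]:
    solver.add(*stmt)
names = {"a", "b", "p", "q", "r", "s"}
for var in ["p", "q", "r", "s"]:
    print(var, sorted(solver.points_to(var, names)))
```

On this input the unification merges `a` and `b` into one class, so `q` is reported as possibly pointing to `a` as well as `b`. An inclusion-based analysis would keep `q` pointing to `b` only, which illustrates the precision loss the slide refers to.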
15. Need for Discrimination by Structure Field
- Current release: all fields participate in every operation on any field
- Must discriminate among fields
- Must consider unions and casts
- Offsets
  - Cannot use for portable analysis
  - Should use for precise platform-dependent analysis
[Figure: code example annotated "backward slice from here"]
Which assignments through p should be in the slice?
16. Need for Context Sensitive Pointer Analysis
[Figure: code example annotated "points-to(p) = {c, d}", "shouldn't be in slice", and "backward slice here"; labels: buf c, buf d]
17. PCC-like Pointer-Analysis for IRM
- Generate pointer analysis for object code
- Ship B and FP with code
- Receiver verifies
  - B corresponds to the program
  - B ⊆ FP
  - FP is a fixpoint
- Receiver avoids the iteration
[Figure: subset lattice of points-to info. Labels: object code; base facts B; iteration; fixpoint solution FP]
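The receiver's side of this scheme can be sketched as a single pass over the rules: rather than iterating, it checks that the shipped solution already contains the base facts and is closed under every rule. The sketch assumes an inclusion-based analysis over four normalized statement kinds (`addr`, `copy`, `load`, `store`, names chosen for the example) and does not model the separate check that B corresponds to the program.

```python
def verify_fixpoint(statements, fp):
    """Check a claimed points-to solution without iterating.

    statements: list of (kind, p, q) normalized statements (assumed encoding):
      'addr' p = &q, 'copy' p = q, 'load' p = *q, 'store' *p = q
    fp: dict mapping variables to claimed points-to sets.
    Returns True if fp contains the base facts and no rule can add anything.
    """
    def pts(var):
        return fp.get(var, set())

    for kind, p, q in statements:
        if kind == "addr":       # base fact: q must be in pts(p)
            required = [(p, {q})]
        elif kind == "copy":     # pts(q) must be included in pts(p)
            required = [(p, pts(q))]
        elif kind == "load":     # for each r in pts(q): pts(r) included in pts(p)
            required = [(p, pts(r)) for r in pts(q)]
        elif kind == "store":    # for each r in pts(p): pts(q) included in pts(r)
            required = [(r, pts(q)) for r in pts(p)]
        else:
            raise ValueError(f"unknown statement kind: {kind}")
        if any(not need <= pts(var) for var, need in required):
            return False
    return True

program = [
    ("addr", "p", "a"),
    ("addr", "q", "b"),
    ("copy", "p", "q"),
    ("addr", "r", "p"),
    ("load", "s", "r"),
]
claimed = {"p": {"a", "b"}, "q": {"b"}, "r": {"p"}, "s": {"a", "b"}}
tampered = {"p": {"a", "b"}, "q": {"b"}, "r": {"p"}, "s": {"a"}}

print(verify_fixpoint(program, claimed))
print(verify_fixpoint(program, tampered))
```

This check accepts any closed superset of the least solution, not only the least one. That is still safe, since a larger solution over-approximates, but it means the receiver is verifying soundness rather than precision.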
18. PCC-like Pointer-Analysis for IRM, continued
[Figure: as on the preceding slide, with source code added. Labels: source code; object code; base facts B; iteration; fixpoint solution FP]
19. Need for Variable-Based Queries
- Security analysts need information flows per variable
- Point-based backward slice (implicitly w.r.t. p, a, and b)
  - a = 1;
  - b = 2;
  - if (...) p = &a; else p = &b;
  - x = *p;
- Point-and-variable backward slice w.r.t. a
  - a = 1;
  - b = 2;
  - if (...) p = &a; else p = &b;
  - x = *p;
20. Non-Structured Control Constructs (Horwitz)
- Unstructured jumps are a source of imprecision in slicing based on dependence graphs
- Importance
  - in C / C++: switch, break, continue, goto
  - in assembler and binaries
- Solution
  - distinguish between
    - transitive closure of direct control dependence
    - generalized control dependence
21. Model Checking
- Boolean combinations of primitive queries are not sufficient
  - e.g., chop(p, q) ≠ forward-slice(p) ∩ backward-slice(q)
- Need a language for posing generalized path queries
  - Example: find tell-tale signs of a Trojan horse in a login shell
- Approach: use model checking on CFGs and dependence graphs
  - Model checker for CTL (interprocedurally imprecise) prototyped
  - Model checker for Modal Mu Calculus (interprocedurally precise)
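One kind of path query that slices and chops cannot express is "can q be reached while avoiding r?". In CTL this is E[¬r U q], which can be evaluated on a single-procedure CFG by a backward fixpoint. The sketch below is a minimal explicit-state version and is not the CTL or mu-calculus checker mentioned on the slide; the login-shell CFG and its node names are hypothetical.

```python
def exists_until(succ, hold, goal):
    """States satisfying E[hold U goal]: some path stays in `hold` until it reaches `goal`."""
    pred = {}
    for src, dsts in succ.items():
        for dst in dsts:
            pred.setdefault(dst, set()).add(src)
    sat = set(goal)
    work = list(goal)
    while work:  # grow backwards from the goal through states where `hold` is true
        node = work.pop()
        for p in pred.get(node, ()):
            if p in hold and p not in sat:
                sat.add(p)
                work.append(p)
    return sat

def can_bypass(cfg, start, check, goal):
    """True if `goal` is reachable from `start` on a path that never visits `check`."""
    hold = set(cfg) - {check}
    return start in exists_until(cfg, hold, {goal})

# Hypothetical login CFG with a backdoor edge from compare_magic straight to grant_shell.
backdoored = {
    "entry": {"read_password"},
    "read_password": {"compare_magic"},
    "compare_magic": {"grant_shell", "check_password"},
    "check_password": {"grant_shell", "deny"},
    "grant_shell": set(),
    "deny": set(),
}
clean = dict(backdoored, compare_magic={"check_password"})

print(can_bypass(backdoored, "entry", "check_password", "grant_shell"))
print(can_bypass(clean, "entry", "check_password", "grant_shell"))
```

In both graphs `grant_shell` is in the forward slice of `entry`, so a plain reachability or slicing query cannot tell them apart; the constrained path query can.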
22. Summary
- Extensive static-analysis infrastructure
  - constructing dependence graphs
  - performing precise interprocedural information-flow queries
  - inspecting flows
- Applicable today to real C programs (demo)
- Limitations being addressed now
  - precision
    - pointer analysis
    - non-structured control constructs
    - variable-based queries
  - performance
  - query expressiveness
- Upcoming inquiry
  - verifiable static analysis for efficient IRM insertion