Program Representations - PowerPoint PPT Presentation

Slides: 43
Provided by: srirama5

1
Program Representations
Xiangyu Zhang
2
Program Representations
  • Static program representations
  • Abstract syntax tree
  • Control flow graph
  • Program dependence graph
  • Call graph
  • Points-to relations.
  • Dynamic program representations
  • Control flow trace, address trace and value
    trace
  • Dynamic dependence graph
  • Whole execution trace

3
(1) Abstract syntax tree
  • An abstract syntax tree (AST) is a finite,
    labeled, directed tree, where the internal nodes
    are labeled by operators, and the leaf nodes
    represent the operands of the operators.

Program chipping.
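
As a small illustration of the AST definition above, the sketch below uses Python's standard `ast` module (not something the slides mention) to parse an arithmetic expression and show that operators sit at internal nodes while operands sit at the leaves. The helper name `describe` and the expression are made up for this example.

```python
import ast

def describe(node):
    """Return (operator, left, right) for internal nodes, the operand for leaves."""
    if isinstance(node, ast.BinOp):
        return (type(node.op).__name__, describe(node.left), describe(node.right))
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return node.value
    raise ValueError(f"unsupported node: {type(node).__name__}")

# Hypothetical input expression; multiplication binds tighter than addition.
tree = ast.parse("a + b * 2", mode="eval")
shape = describe(tree.body)
print(shape)  # ('Add', 'a', ('Mult', 'b', 2))
```
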
4
(2) Control Flow Graph (CFG)
  • Consists of basic blocks and edges
  • Basic block (BB): a maximal sequence of consecutive
    instructions such that inside the basic block an
    execution can only proceed from one instruction
    to the next (SESE: single entry, single exit).
  • Edges represent potential flow of control between
    BBs.
  • Program path.

  • CFG = (V, E, Entry, Exit)
  • V = vertices, nodes (BBs)
  • E = edges, potential flow of control, E ⊆ V × V
  • Entry, Exit ∈ V, unique entry and exit

[Figure: example CFG with basic blocks B1, B2, B3, B4]
5
(2) An Example of CFG
  • BB: a maximal sequence of consecutive
    instructions such that inside the basic block an
    execution can only proceed from one instruction
    to the next (SESE).

1: sum = 0
2: i = 1
3: while (i ...
4:   i = i + 1
5:   sum = sum + i
   endwhile
6: print(sum)

[Figure: CFG of the program with basic blocks {1, 2}, {3}, {4, 5}, {6}]
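
The grouping of statements into basic blocks can be sketched mechanically. The code below is a minimal illustration, not the method from the lecture: it starts from statement-level successor lists for the six-statement example (the loop condition itself is not needed, only the fact that statement 3 branches to 4 or 6) and merges straight-line runs. The function name `basic_blocks` is hypothetical.

```python
# Statement-level control flow of the example: 3 is the loop test, 5 jumps back to 3.
succ = {1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [3], 6: []}

def basic_blocks(succ, entry):
    pred = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            pred[t].append(n)

    def is_leader(n):
        # A statement starts a block if it is the entry, a join point,
        # or a successor of a branching statement.
        if n == entry or len(pred[n]) != 1:
            return True
        return len(succ[pred[n][0]]) != 1

    blocks = []
    for leader in sorted(n for n in succ if is_leader(n)):
        block = [leader]
        # Extend the block while control falls through into a non-leader.
        while len(succ[block[-1]]) == 1 and not is_leader(succ[block[-1]][0]):
            block.append(succ[block[-1]][0])
        blocks.append(block)
    return blocks

blocks = basic_blocks(succ, entry=1)
block_of = {s: i for i, b in enumerate(blocks) for s in b}
# Block-level edges: from the last statement of each block to its successors.
edges = sorted({(block_of[b[-1]], block_of[t]) for b in blocks for t in succ[b[-1]]})
print(blocks)  # [[1, 2], [3], [4, 5], [6]]
print(edges)   # [(0, 1), (1, 2), (1, 3), (2, 1)]
```
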
6
(3) Program Dependence Graph (PDG) - Data Dependence
  • S data depends on T if there exists a control flow
    path from T to S and a variable is defined at T
    and then used at S, with no redefinition of that
    variable along the path.

[Figure: example graph with nodes numbered 1 to 10]
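
One way to compute such data dependences is a reaching-definitions fixed point over the CFG. The sketch below applies it to the six-statement sum/i example used on the neighbouring slides, not to the ten-node figure of this slide. The `defs` and `uses` tables are written by hand and assume the loop test at statement 3 reads `i`; all names are illustrative.

```python
succ = {1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [3], 6: []}
defs = {1: "sum", 2: "i", 4: "i", 5: "sum"}             # statement -> variable it defines
uses = {3: {"i"}, 4: {"i"}, 5: {"sum", "i"}, 6: {"sum"}}  # statement -> variables it reads

def data_dependences(succ, defs, uses):
    pred = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            pred[t].append(n)
    # out[n]: (variable, defining statement) pairs that reach the end of n.
    out = {n: set() for n in succ}
    changed = True
    while changed:
        changed = False
        for n in succ:
            reaching = set().union(*(out[p] for p in pred[n]))
            new = {(v, t) for (v, t) in reaching if v != defs.get(n)}  # kill
            if n in defs:
                new.add((defs[n], n))                                 # gen
            if new != out[n]:
                out[n], changed = new, True
    deps = set()
    for n in succ:
        reaching = set().union(*(out[p] for p in pred[n]))
        deps |= {(n, t) for (v, t) in reaching if v in uses.get(n, ())}
    return deps  # pairs (S, T): S data depends on T

deps = data_dependences(succ, defs, uses)
print(sorted(deps))
```

Each pair `(S, T)` reads "S data depends on T"; for instance `(6, 5)` and `(6, 1)` say that the printed `sum` may come from either definition.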
7
(3) PDG - Control Dependence
  • X dominates Y if every possible program path from
    the entry to Y has to pass X.
  • Strict dominance, dominator, immediate dominator.

1: sum = 0
2: i = 1
3: while (i ...
4:   i = i + 1
5:   sum = sum + i
   endwhile
6: print(sum)

[Figure: CFG of the program with basic blocks {1, 2}, {3}, {4, 5}, {6}]

DOM(6) = {1, 2, 3, 6}, IDOM(6) = 3
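
The DOM and IDOM values above can be reproduced with the classic iterative dominator computation. This is a minimal sketch at statement granularity, assuming every node is reachable from the entry; the function names are illustrative.

```python
succ = {1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [3], 6: []}

def dominators(succ, entry):
    nodes = set(succ)
    pred = {n: [] for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].append(n)
    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = {n} | set.intersection(*(dom[p] for p in pred[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

def immediate_dominator(dom, n):
    # Strict dominators form a chain; the closest one has the largest DOM set.
    return max(dom[n] - {n}, key=lambda d: len(dom[d]))

dom = dominators(succ, entry=1)
idom6 = immediate_dominator(dom, 6)
print(sorted(dom[6]), idom6)  # [1, 2, 3, 6] 3
```
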
8
(3) PDG - Control Dependence
  • X post-dominates Y if every possible program path
    from Y to EXIT has to pass X.
  • Strict post-dominance, post-dominator, immediate
    post-dominator.

1: sum = 0
2: i = 1
3: while (i ...
4:   i = i + 1
5:   sum = sum + i
   endwhile
6: print(sum)

[Figure: CFG of the program with basic blocks {1, 2}, {3}, {4, 5}, {6}]

PDOM(5) = {3, 5, 6}, IPDOM(5) = 3
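
Post-dominators can be computed with the same fixed-point idea, walking successors instead of predecessors. The sketch below is illustrative only and assumes every node can reach the exit; it reproduces the PDOM and IPDOM values for statement 5.

```python
succ = {1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [3], 6: []}

def post_dominators(succ, exit_node):
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            # X post-dominates n if it post-dominates every successor of n.
            new = {n} | set.intersection(*(pdom[s] for s in succ[n]))
            if new != pdom[n]:
                pdom[n], changed = new, True
    return pdom

def immediate_post_dominator(pdom, n):
    # The closest strict post-dominator has the largest PDOM set.
    return max(pdom[n] - {n}, key=lambda d: len(pdom[d]))

pdom = post_dominators(succ, exit_node=6)
ipdom5 = immediate_post_dominator(pdom, 5)
print(sorted(pdom[5]), ipdom5)  # [3, 5, 6] 3
```
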
9
(3) PDG - Control Dependence
  • Intuitively, Y is control-dependent on X iff X
    directly determines whether Y executes
    (statements inside one branch of a predicate are
    usually control dependent on the predicate)
  • there exists a path from X to Y s.t. every node
    in the path other than X and Y is post-dominated
    by Y
  • X is not strictly post-dominated by Y

[Figure: node X with a path to node Y; credit: Sorin Lerner]
10
(3) PDG - Control Dependence
  • A node (basic block) Y is
    control-dependent on another X iff X directly
    determines whether Y executes
  • there exists a path from X to Y s.t. every node
    in the path other than X and Y is post-dominated
    by Y
  • X is not strictly post-dominated by Y

1: sum = 0
2: i = 1
3: while (i ...
4:   i = i + 1
5:   sum = sum + i
   endwhile
6: print(sum)

[Figure: CFG of the program with basic blocks {1, 2}, {3}, {4, 5}, {6}]

CD(5) = 3
CD(3) = 3, tricky!
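
The two CD results above, including the tricky self-dependence of the loop predicate, can be checked with a short sketch. It uses the common edge-based formulation (Y is control dependent on X if Y post-dominates some successor of X but does not strictly post-dominate X), which is assumed here to agree with the path-based definition on the slide. Names are illustrative.

```python
succ = {1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [3], 6: []}

def post_dominators(succ, exit_node):
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            new = {n} | set.intersection(*(pdom[s] for s in succ[n]))
            if new != pdom[n]:
                pdom[n], changed = new, True
    return pdom

def control_dependence(succ, exit_node):
    pdom = post_dominators(succ, exit_node)
    cd = {}
    for x, targets in succ.items():
        if len(targets) < 2:
            continue  # only branching nodes can decide whether something runs
        for s in targets:
            for y in pdom[s]:
                if y not in pdom[x] - {x}:  # y does not strictly post-dominate x
                    cd.setdefault(y, set()).add(x)
    return cd

cd = control_dependence(succ, exit_node=6)
print(cd[5], cd[3])  # {3} {3}
```

Statement 3 ends up control dependent on itself because taking the loop branch leads back to 3, while 3 does not strictly post-dominate itself.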
11
(3) PDG - Control Dependence is not Syntactically Explicit
  • A node (basic block) Y is
    control-dependent on another X iff X directly
    determines whether Y executes
  • there exists a path from X to Y s.t. every node
    in the path other than X and Y is post-dominated
    by Y
  • X is not strictly post-dominated by Y

1: sum = 0
2: i = 1
3: while (i ...
4:   i = i + 1
5:   if (i % 2 == 0)
6:     continue
7:   sum = sum + i
   endwhile
8: print(sum)

[Figure: CFG of the program]
12
(3) PDG - Control Dependence is Tricky!
  • A node (basic block) Y is
    control-dependent on another X iff X directly
    determines whether Y executes
  • there exists a path from X to Y s.t. every node
    in the path other than X and Y is post-dominated
    by Y
  • X is not strictly post-dominated by Y
  • Can a statement be control dependent on two predicates?

13
(3) PDG - Control Dependence is Tricky!
  • A node (basic block) Y is
    control-dependent on another X iff X directly
    determines whether Y executes
  • there exists a path from X to Y s.t. every node
    in the path other than X and Y is post-dominated
    by Y
  • X is not strictly post-dominated by Y
  • Can one statement be control dependent on two
    predicates?

1: if (p1 ... p2)  2: s1  3: s2

[Figure: CFG in which the compound predicate at line 1 is split into two nodes, one testing p1 and one testing p2, followed by 2: s1 and 3: s2]

What if?
1: if (p1 ... p2)  2: s1  3: s2
Interprocedural CD, CD in case of exception,
14
(3) PDG
  • A program dependence graph consists of a control
    dependence graph and a data dependence graph
  • Why is it so important to software reliability?
  • In debugging, what could possibly induce the
    failure?
  • In security

p = getpassword()
if (p == "zhang")
  send(m)
15
(4) Points-to Graph
  • Aliases: two expressions that denote the same
    memory location.
  • Aliases are introduced by
  • pointers
  • call-by-reference
  • array indexing
  • C unions

17
(4) Why Do We Need Points-to Graphs
  • Debugging

x.lock()
...
y.unlock()   // same object as x?

  • Security

F(x, y) { x.f = password; print(y.f) }
F(a, a)   // disaster!
18
(4) Points-to Graph
  • Points-to Graph
  • at a program point, compute a set of pairs of the
    form p -> x, where p MAY/MUST point to x.

m(p) {
  r = new C();
  p->f = r;
  t = new C();
  if (...) q = p;
  r->f = t;
}

[Figure: points-to graph under construction; nodes so far: r]
19
(4) Points-to Graph
  • Points-to Graph
  • at a program point, compute a set of pairs of the
    form p -> x, where p MAY/MUST point to x.

m(p) {
  r = new C();
  p->f = r;
  t = new C();
  if (...) q = p;
  r->f = t;
}

[Figure: points-to graph under construction; labels so far: r, p, f]
20
(4) Points-to Graph
  • Points-to Graph
  • at a program point, compute a set of pairs of the
    form p -> x, where p MAY/MUST point to x.

m(p) {
  r = new C();
  p->f = r;
  t = new C();
  if (...) q = p;
  r->f = t;
}

[Figure: points-to graph under construction; labels so far: r, p, f, t]
21
(4) Points-to Graph
  • Points-to Graph
  • at a program point, compute a set of pairs of the
    form p -> x, where p MAY/MUST point to x.

m(p) {
  r = new C();
  p->f = r;
  t = new C();
  if (...) q = p;
  r->f = t;
}

[Figure: points-to graph under construction; labels so far: r, p, f, q, t]
22
(4) Points-to Graph
  • Points-to Graph
  • at a program point, compute a set of pairs of the
    form p -> x, where p MAY/MUST point to x.

m(p) {
  r = new C();
  p->f = r;
  t = new C();
  if (...) q = p;
  r->f = t;
}

[Figure: complete points-to graph; labels: r, p, f, f, q, t]

p->f->f and t are aliases
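
The alias conclusion for `m(p)` can be checked with a tiny flow-insensitive MAY points-to analysis. This is a rough sketch under several assumptions that are not in the slides: the two allocations are named `o1` and `o2`, the object passed in by the caller is a single abstract object `op`, and statements are encoded as hand-written tuples. It is not the algorithm the lecture has in mind, only one simple way to get the same answer.

```python
from collections import defaultdict

# Hand-encoded statements of m(p).
stmts = [
    ("new", "r", "o1"),        # r = new C()
    ("store", "p", "f", "r"),  # p->f = r
    ("new", "t", "o2"),        # t = new C()
    ("copy", "q", "p"),        # q = p   (guarded by the if, so a MAY fact)
    ("store", "r", "f", "t"),  # r->f = t
]

def points_to(stmts, initial):
    var = defaultdict(set)    # variable -> abstract objects it may point to
    for v, objs in initial.items():
        var[v] |= objs
    heap = defaultdict(set)   # (object, field) -> abstract objects
    changed = True
    while changed:
        before = sum(map(len, var.values())) + sum(map(len, heap.values()))
        for s in stmts:
            if s[0] == "new":
                var[s[1]].add(s[2])
            elif s[0] == "copy":
                var[s[1]] |= var[s[2]]
            elif s[0] == "store":
                _, base, field, src = s
                for obj in list(var[base]):
                    heap[(obj, field)] |= var[src]
        after = sum(map(len, var.values())) + sum(map(len, heap.values()))
        changed = after != before
    return var, heap

def deref(var, heap, name, *fields):
    """Objects that name->f1->f2... may denote."""
    objs = set(var[name])
    for field in fields:
        objs = set().union(*(heap[(o, field)] for o in objs))
    return objs

var, heap = points_to(stmts, initial={"p": {"op"}})
print(deref(var, heap, "p", "f", "f"), deref(var, heap, "t"))  # {'o2'} {'o2'}
```

Because both expressions may denote `o2`, the sketch reports `p->f->f` and `t` as may-aliases.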
23
(5) Call Graph
  • Call graph
  • nodes are procedures
  • edges are calls
  • Hard cases for building call graph
  • calls through function pointers

Can the password acquired at A be leaked at G?
24
How to acquire and use these representations?
  • Will be covered by later lectures.

25
Program Representations
  • Static program representations
  • Abstract syntax tree
  • Control flow graph
  • Program dependence graph
  • Call graph
  • Points-to relations.
  • Dynamic program representations
  • Control flow trace
  • Address trace, Value trace
  • Dynamic dependence graph
  • Whole execution trace

26
(1) Control Flow Trace
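Input: N = 2

Program:
1: sum = 0
2: i = 1
3: while (i ...
4:   i = i + 1
5:   sum = sum + i
6: print(sum)

Control flow trace:
1^1: sum = 0
2^1: i = 1
3^1: while (i ...
4^1: i = i + 1
5^1: sum = sum + i
3^2: while (i ...
4^2: i = i + 1
5^2: sum = sum + i
3^3: while (i ...
6^1: print(sum)

x is a program point, x^i is an execution point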
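
A control flow trace like this one can be collected by instrumenting each program point with a counter. The sketch below is a hand-instrumented Python version of the example. The loop condition did not survive in the transcript, so `i <= N` is an ASSUMPTION, chosen only because it yields two iterations for N = 2 as in the trace; the helper `at` is hypothetical.

```python
from collections import Counter

def run_traced(N):
    count, trace = Counter(), []

    def at(point):
        # Record execution point x^i: the i-th execution of program point x.
        count[point] += 1
        trace.append((point, count[point]))

    at(1); total = 0
    at(2); i = 1
    while True:
        at(3)
        if not (i <= N):  # ASSUMPTION: the real condition is lost in the source
            break
        at(4); i = i + 1
        at(5); total = total + i
    at(6); print(total)
    return trace, total

trace, total = run_traced(2)
print(trace)
```

For N = 2 the recorded sequence is 1^1, 2^1, 3^1, 4^1, 5^1, 3^2, 4^2, 5^2, 3^3, 6^1.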


27
(1) Control Flow Trace
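Input: N = 2

Program (one line per basic block):
1: sum = 0; 2: i = 1
3: while (i ...
4: i = i + 1; 5: sum = sum + i
6: print(sum)

A more compact CFT (one entry per basic block instance):
1^1: sum = 0; i = 1
3^1: while (i ...
4^1: i = i + 1; sum = sum + i
3^2: while (i ...
4^2: i = i + 1; sum = sum + i
3^3: while (i ...
6^1: print(sum)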
28
(2) Dynamic Dependence Graph (DDG)
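Input: N = 2

 1: z = 0
 2: a = 0
 3: b = 2
 4: p = &b
 5: for i = 1 to N do
 6:   if (i % 2 == 0) then
 7:     p = &a
      endif
 8:   a = a + 1
 9:   z = 2 * (*p)
    endfor
10: print(z)

[Figure: dynamic dependence graph under construction, showing the statement instances 1^1: z = 0, 2^1: a = 0, 3^1: b = 2, 4^1: p = &b, 5^1: for i = 1 to N do, 6^1: if (i % 2 == 0) then, 8^1: a = a + 1]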
29
(2) Dynamic Dependence Graph (DDG)
Input: N = 2

 1: z = 0
 2: a = 0
 3: b = 2
 4: p = &b
 5: for i = 1 to N do
 6:   if (i % 2 == 0) then
 7:     p = &a
      endif
 8:   a = a + 1
 9:   z = 2 * (*p)
    endfor
10: print(z)

One use has only one definition at runtime.
One statement instance control depends on only one
predicate instance.
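
The point that each runtime use has exactly one definition can be seen by tracking the last writer of every memory location while the program runs. The sketch below re-implements the ten-statement example by hand in Python. It models addresses as variable names and tracks only the locations z, a and b (the pointer p's own definitions are not tracked), which is a simplification made for this illustration.

```python
from collections import Counter

def run(N):
    mem = {}
    last_writer = {}   # address -> statement instance that last wrote it
    count = Counter()
    data_deps = {}     # instance -> instances whose values it read

    def inst(stmt):
        count[stmt] += 1
        return (stmt, count[stmt])

    def write(who, addr, value):
        mem[addr] = value
        last_writer[addr] = who

    def read(who, addr):
        data_deps.setdefault(who, set()).add(last_writer[addr])
        return mem[addr]

    write(inst(1), "z", 0)
    write(inst(2), "a", 0)
    write(inst(3), "b", 2)
    inst(4); p = "b"                       # p = &b
    for i in range(1, N + 1):
        inst(5)
        inst(6)
        if i % 2 == 0:
            inst(7); p = "a"               # p = &a
        s8 = inst(8); write(s8, "a", read(s8, "a") + 1)
        s9 = inst(9); write(s9, "z", 2 * read(s9, p))
    s10 = inst(10)
    return read(s10, "z"), data_deps

z, data_deps = run(2)
print(z, data_deps[(9, 1)], data_deps[(9, 2)])  # 4 {(3, 1)} {(8, 2)}
```

Statically, the use of `*p` at statement 9 may depend on statements 2, 3 or 8. Dynamically, instance 9^1 depends only on 3^1 and instance 9^2 only on 8^2.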
30
(3) Whole Execution Trace
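Input: N = 2

T    Statement instance
1    1^1:  z = 0
2    2^1:  a = 0
3    3^1:  b = 2
4    4^1:  p = &b
5    5^1:  for i = 1 to N do
6    6^1:  if (i % 2 == 0) then
7    8^1:  a = a + 1
8    9^1:  z = 2 * (*p)
9    5^2:  for i = 1 to N do
10   6^2:  if (i % 2 == 0) then
11   7^1:  p = &a
12   8^2:  a = a + 1
13   9^2:  z = 2 * (*p)
14   10^1: print(z)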
31
(3) Whole Execution Trace
Multiple streams of numbers.
32
Program Representations
  • Static program representations
  • Abstract syntax tree
  • Control flow graph
  • Program dependence graph
  • Call graph
  • Points-to relations.
  • Dynamic program representations
  • Control flow trace, address trace and value
    trace
  • Dynamic dependence graph
  • Whole execution trace

33
What is a slice?
  • S: ... = f(v)
  • Slice of v at S is the set of statements
    involved in computing v's value at S.
  • Mark Weiser, 1982
  • Data dependence
  • Control dependence

void main() {
  int I = 0;
  int sum = 0;
  while (I ...
    I = add(I, 1);
  }
  printf("sum=%d\n", sum);
  printf("I=%d\n", I);
}
34
How to do slicing?
  • Static analysis
  • Input insensitive
  • May analysis
  • Dependence Graph
  • Characteristics
  • Very fast
  • Very imprecise
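
Static slicing over a dependence graph amounts to backward reachability. The sketch below hard-codes a program dependence graph for the six-statement sum/i example from the control flow graph slides (data and control dependences derived by hand, assuming the loop test at statement 3 reads `i`) and computes backward slices. The name `backward_slice` is illustrative.

```python
# pdg[s] = statements that s directly depends on (data or control).
pdg = {
    1: set(),          # sum = 0
    2: set(),          # i = 1
    3: {2, 4, 3},      # loop test: reads i, and controls its own re-execution
    4: {2, 4, 3},      # i = i + 1
    5: {1, 4, 5, 3},   # sum = sum + i
    6: {1, 5},         # print(sum)
}

def backward_slice(pdg, criterion):
    seen, work = set(), [criterion]
    while work:
        s = work.pop()
        if s not in seen:
            seen.add(s)
            work.extend(pdg[s])
    return seen

slice_i = backward_slice(pdg, 4)
slice_sum = backward_slice(pdg, 6)
print(sorted(slice_i))    # [2, 3, 4]
print(sorted(slice_sum))  # [1, 2, 3, 4, 5, 6]
```

The slice for `i` at statement 4 drops every statement that only touches `sum`, while the slice for the printed `sum` keeps the whole program.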

35
Why is a static slice imprecise?
  • All possible program paths

[Figure: statements S1: x = ..., S2: x = ..., L1: ... = x]

  • Use of pointers: static alias analysis is very
    imprecise

[Figure: statements S1: a = ..., S2: b = ..., L1: ... = *p]

  • Use of function pointers: it is hard to know which
    function is called, and the conservative assumption
    results in imprecision

36
Dynamic Slicing
  • Korel and Laski, 1988
  • Dynamic slicing makes use of all information
    about a particular execution of a program and
    computes the slice based on an execution history
    (trace)
  • A trace consists of a control flow trace and a
    memory reference trace
  • A dynamic slice query is a triple
  • Smaller, more precise, more helpful to the user

37
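Dynamic Slicing Example - background

1: b = 0
2: a = 2
3: for i = 1 to N do
4:   if ((i) % 2 == 1) then
5:     a = a + 1
     else
6:     b = a * 2
     endif
   done
7: z = a + b
8: print(z)

For input N = 2 (statement instance, then the value annotated on the slide):

1^1: b = 0                     [b = 0]
2^1: a = 2
3^1: for i = 1 to N do         [i = 1]
4^1: if ((i) % 2 == 1) then    [i = 1]
5^1: a = a + 1                 [a = 3]
3^2: for i = 1 to N do         [i = 2]
4^2: if (i % 2 == 1) then      [i = 2]
6^1: b = a * 2                 [b = 6]
7^1: z = a + b                 [z = 9]
8^1: print(z)                  [z = 9]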
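
A dynamic slice for this run can be sketched by recording, for every statement instance, the instances it depends on and then walking those edges backward from the criterion. The code below is a hand-instrumented Python version of the example. It treats the branch test as plain `i % 2 == 1` (consistent with the annotated values a = 3, b = 6, z = 9), records the loop header once per iteration as the trace does, and uses the hypothetical helper `step`.

```python
from collections import Counter

def run(N):
    count = Counter()
    last_def = {}   # variable -> instance that last defined it
    deps = {}       # instance -> instances it depends on (data and control)

    def step(stmt, uses=(), defines=None, ctrl=None):
        count[stmt] += 1
        inst = (stmt, count[stmt])
        deps[inst] = {last_def[v] for v in uses if v in last_def}
        if ctrl is not None:
            deps[inst].add(ctrl)            # dynamic control dependence
        if defines is not None:
            last_def[defines] = inst
        return inst

    step(1, defines="b"); b = 0
    step(2, defines="a"); a = 2
    loop = None
    for i in range(1, N + 1):
        loop = step(3, defines="i", ctrl=loop)
        branch = step(4, uses=["i"], ctrl=loop)
        if i % 2 == 1:
            step(5, uses=["a"], defines="a", ctrl=branch); a = a + 1
        else:
            step(6, uses=["a"], defines="b", ctrl=branch); b = a * 2
    step(7, uses=["a", "b"], defines="z"); z = a + b
    criterion = step(8, uses=["z"])
    return z, criterion, deps

def dynamic_slice(criterion, deps):
    seen, work = set(), [criterion]
    while work:
        inst = work.pop()
        if inst not in seen:
            seen.add(inst)
            work.extend(deps[inst])
    return seen

z, criterion, deps = run(2)
slice_instances = dynamic_slice(criterion, deps)
slice_statements = sorted({stmt for stmt, _ in slice_instances})
print(z, slice_statements)  # 9 [2, 3, 4, 5, 6, 7, 8]
```

Statement 1 (`b = 0`) falls out of the slice because its value is overwritten by 6^1 before `z` is computed.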
38
Issues about Dynamic Slicing
  • Precision: perfect
  • Running history: very big (GB)
  • Algorithm to compute dynamic slices:
    slow, with a very high space requirement.

39
Backward vs. Forward
 1  main( )
 2  {
 3    int i, sum
 4    sum = 0
 5    i = 1
 6    while (i ...
 7    {
 8      sum = sum + 1
 9      i ...
10    }
11    Cout ...
12    Cout ...
13  }

  • An example program and its forward slice w.r.t. ...

40
Comments
  • Want to know more?
  • Frank Tip's survey paper (1995)
  • Static slicing is very useful for static analysis
  • Code transformation, program understanding, etc.
  • Points-to analysis is the key challenge
  • Not as useful in reliability as dynamic slicing
  • Dynamic slicing
  • Precise
  • good for defect analysis.
  • Solution space is much larger.
  • There exist hybrid techniques.

41
Efficiency
  • How are dynamic slices computed?
  • Execution traces
  • control flow trace -- dynamic control dependences
  • memory reference trace -- dynamic data
    dependences
  • Construct a dynamic dependence graph
  • Traverse dynamic dependence graph to compute
    slices

42
How to Detect Dynamic Dependence
  • Dynamic Data Dependence
  • Shadow space (SS)
  • Addr → Abstract State

[Figure: virtual space and shadow space side by side]

s1^x: ST r1, r2      SS(r2) = s1^x
s2^y: LD r1, r2      s2^y → SS(r1) = s1^x

Dynamic control dependence is more tricky!
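
The shadow-space idea can be sketched as a second map that mirrors memory and stores, for each address, the statement instance that last wrote it. The class below is a minimal illustration with made-up names (`ShadowedMemory`, `store`, `load`) and a made-up address. It is not tied to any particular instrumentation tool.

```python
class ShadowedMemory:
    """Memory plus a shadow map from address to its last writer."""

    def __init__(self):
        self.memory = {}   # virtual space: address -> value
        self.shadow = {}   # shadow space:  address -> statement instance
        self.deps = []     # detected dynamic data dependences (reader, writer)

    def store(self, instance, addr, value):
        self.memory[addr] = value
        self.shadow[addr] = instance           # SS(addr) = instance

    def load(self, instance, addr):
        writer = self.shadow.get(addr)
        if writer is not None:
            self.deps.append((instance, writer))  # instance depends on SS(addr)
        return self.memory[addr]

mem = ShadowedMemory()
mem.store("s1^x", 0x1000, 42)       # plays the role of the ST instruction
value = mem.load("s2^y", 0x1000)    # plays the role of the LD instruction
print(value, mem.deps)  # 42 [('s2^y', 's1^x')]
```
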