Interprocedural Dataflow Analysis in the Presence of Large Libraries

CC 2006, Scott Kagan, PRESTO Research Group

1
Interprocedural Dataflow Analysis in the Presence of Large Libraries
  • Atanas (Nasko) Rountev
  • Scott Kagan
  • Ohio State University
  • Thomas Marlowe
  • Seton Hall University

2
Uses of Interprocedural Dataflow Analysis
  • Performance optimizations in compilers
  • Software understanding and transformation
  • e.g. dependence analysis for program slicing,
    change impact analysis, refactoring, etc.
  • Software testing
  • e.g. dataflow-based testing, testing of object
    interactions in OO software
  • Software checking
  • e.g. object protocols: open (read + write)* close

3
Model for Interprocedural Whole-Program Analysis
[Diagram: code for C1, code for C2, ..., code for Cn → Engine for Whole-Program Dataflow Analysis → dataflow solution for C1 + C2 + ... + Cn]
  • Components C1, C2, ..., Cn form a complete program
  • Assumption: it is possible and desirable to
    analyze the source code of the entire program

4
A Specific Case: Main + Lib
[Diagram: code for Main, code for Lib → Engine for Whole-Program Dataflow Analysis → dataflow solution for Main + Lib]
  • Main + Lib form a complete program
  • What if we are using large libraries that need to
    be re-analyzed from scratch?
  • e.g. the standard Java libraries contain about
    10,000 classes and 80,000 methods
  • need to be re-analyzed with every new Main
    component

5
Example Methods in Java Programs
6
A Specific Case: Main + Lib
[Diagram: code for Lib → Summary Generation Analysis → summary for Lib; code for Main, summary for Lib → Engine for Whole-Program Dataflow Analysis → dataflow solution for Main]
  • Goal: the solution for Main should be as good as
    the solution that would have been computed by a
    whole-program analysis (no loss of precision)

7
Functional Approach to Whole-Program Analysis
  • Sharir-Pnueli 1981
  • Dataflow lattice $L$
  • Edge function $f: L \to L$ for effects of a statement
  • Path function $f_p = f_n \circ f_{n-1} \circ \dots \circ f_2 \circ f_1$
  • Phase 1: summary functions $\phi_n: L \to L$
  • solution at node n as a function of the solution
    at the entry of n's procedure
  • Phase 2: solutions at start nodes of procedures
  • Phase 3: solutions at the remaining nodes
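
The edge-function and path-function bullets above can be illustrated with a minimal sketch. It assumes a reaching-definitions-style problem in which the lattice L is the set of subsets of definitions. The helper names `gen_kill` and `compose_path`, and the definitions `d1` and `d2`, are hypothetical and do not come from the slides.

```python
from functools import reduce

def gen_kill(gen, kill):
    """Edge function L -> L: remove `kill`, then add `gen` (illustrative only)."""
    gen, kill = frozenset(gen), frozenset(kill)
    return lambda x: (frozenset(x) - kill) | gen

def compose_path(edge_functions):
    """Path function f_n o ... o f_1 for edges listed in execution order."""
    return lambda x: reduce(lambda acc, f: f(acc), edge_functions, frozenset(x))

# Hypothetical three-edge path: define d1, define d2 (which kills d1), no-op.
f1 = gen_kill({"d1"}, set())
f2 = gen_kill({"d2"}, {"d1"})
f3 = gen_kill(set(), set())

path = compose_path([f1, f2, f3])
result = path(set())
print(sorted(result))  # ['d2']
```

A summary function for a node would combine such path functions over all paths from the procedure entry to that node; this sketch covers only a single path.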

8
Example Functional Approach
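$\phi_6 = \phi_{13} \circ f_1 \circ f_0$
$\phi_{28} = f_8 \circ f_7$
$\phi_{21} = (f_4 \circ f_5) \wedge (\phi_{28} \circ f_6)$
$\phi_{13} = (\phi_{21} \circ f_2) \wedge (\phi_{21} \circ f_3)$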
9
Callbacks
  • Callbacks
  • e.g. function pointers in C
  • e.g. virtual dispatch in C++ and Java
  • Can no longer determine $\phi_{21}$ and $\phi_{13}$ without code
    for ext

10
Library Summary
  • Idea: run pieces of phase 1
  • Compute functions for sets of library-local paths

$\phi_{14,16} = \mathrm{id}$
$\phi_{14,21} = f_8 \circ f_7 \circ f_6$
$\phi_{17,21} = f_4 \circ f_5$
$\phi_{7,11} = f_2 \wedge f_3$
$\phi_{12,13} = \mathrm{id}$
11
Library Summary Generation
  • Fixed call in the library
  • always invokes the same library procedure
    independent of code for main component
  • Fixed procedure in the library
  • makes no calls, or
  • makes only fixed calls, to fixed procedures
  • standard functional approach can be applied
  • For any other procedure, compute $\phi_{k,n}$ where
  • k is the start node, or
  • k is a return from a non-fixed call, or
  • k is a return from a fixed call to a non-fixed
    procedure

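
The fixed / non-fixed classification described on this slide can be sketched as a small fixpoint computation. The version below assumes that a procedure counts as non-fixed as soon as it contains a non-fixed call or a call to a non-fixed procedure, and that everything else, including self-recursive procedures with only fixed calls, is fixed. The function name `fixed_procedures`, the input encoding, and the procedure names `lib_a` to `lib_d` are hypothetical.

```python
def fixed_procedures(calls):
    """Classify library procedures as fixed.

    calls: {procedure: [(callee_or_None, is_fixed_call), ...]}
    A non-fixed call has an unknown target, encoded here as None.
    """
    # Seed: procedures that directly contain a non-fixed call.
    non_fixed = {p for p, cs in calls.items()
                 if any(not is_fixed for _, is_fixed in cs)}
    # Propagate: calling a non-fixed procedure makes the caller non-fixed.
    changed = True
    while changed:
        changed = False
        for p, cs in calls.items():
            if p not in non_fixed and any(c in non_fixed for c, _ in cs):
                non_fixed.add(p)
                changed = True
    return set(calls) - non_fixed

# Hypothetical library: lib_b may call back into unknown code.
calls = {
    "lib_a": [("lib_b", True)],   # fixed call, but to a non-fixed procedure
    "lib_b": [(None, False)],     # non-fixed call (e.g. a callback)
    "lib_c": [("lib_c", True)],   # only a fixed, self-recursive call
    "lib_d": [],                  # makes no calls
}
fixed = fixed_procedures(calls)
print(sorted(fixed))  # ['lib_c', 'lib_d']
```

Propagating non-fixedness, rather than growing the fixed set from call-free procedures, is a design choice made here so that recursive procedures without callbacks are still treated as fixed.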
12
Example Library Summary Generation
  • Fixed calls
  • 11-12 and 23-24
  • Non-fixed calls
  • 16-17
  • Fixed procedures
  • p3
  • Non-fixed procedures
  • p1 and p2
  • Contexts k for $\phi_{k,n}$
  • 7 and 14: start nodes
  • 17: return from a non-fixed call
  • 12: return from a fixed call to a non-fixed
    procedure

13
The Condensed Graph
$\phi_{14,16} = \mathrm{id}$
$\phi_{14,21} = f_8 \circ f_7 \circ f_6$
$\phi_{17,21} = f_4 \circ f_5$
$\phi_{7,11} = f_2 \wedge f_3$
$\phi_{12,13} = \mathrm{id}$
14
Analysis of a Main Component
  • Create a fake graph for the whole program
  • Run a whole-program analysis engine
  • Safe solutions for non-library nodes
  • precise for distributive problems
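
To make "distributive" concrete: a function f is distributive when f(x ∧ y) = f(x) ∧ f(y) for all lattice elements x and y. The sketch below checks this on small samples for two textbook cases that are not taken from the slides: a gen/kill function with set union as the meet, and a constant-propagation transfer function for `z = x + y`. All names are hypothetical.

```python
from itertools import product

NAC, UNDEF = "NAC", "UNDEF"  # "not a constant" (bottom) and "undefined" (top)

def meet_value(a, b):
    if a == UNDEF:
        return b
    if b == UNDEF:
        return a
    return a if a == b else NAC

def meet_env(e1, e2):
    """Pointwise meet of two variable environments with the same keys."""
    return {v: meet_value(e1[v], e2[v]) for v in e1}

def assign_sum(env):
    """Constant-propagation transfer function for `z = x + y`."""
    x, y = env["x"], env["y"]
    if NAC in (x, y):
        z = NAC
    elif UNDEF in (x, y):
        z = UNDEF
    else:
        z = x + y
    return {**env, "z": z}

def gen_kill(gen, kill):
    return lambda s: (s - kill) | gen

def is_distributive_on(f, meet, samples):
    """Check f(a meet b) == f(a) meet f(b) for every pair of samples."""
    return all(f(meet(a, b)) == meet(f(a), f(b))
               for a, b in product(samples, repeat=2))

subsets = [frozenset(s) for s in ([], ["d1"], ["d2"], ["d1", "d2"])]
reaching = gen_kill(frozenset({"d1"}), frozenset({"d2"}))
envs = [{"x": 1, "y": 2, "z": UNDEF}, {"x": 2, "y": 1, "z": UNDEF}]

gen_kill_ok = is_distributive_on(reaching, lambda a, b: a | b, subsets)
const_ok = is_distributive_on(assign_sum, meet_env, envs)
print(gen_kill_ok, const_ok)  # True False
```

In the second case both inputs give z = 3, but merging them first loses x and y, so z becomes "not a constant". For problems like this one, the slide promises a safe solution but not necessarily a precise one.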

15
Original vs. Condensed Library CFGs: Number of Nodes
16
Original vs. Condensed Library CFGs: Number of Edges
17
Discussion
  • Flow and context insensitivity
  • Cost reduction: time and memory
  • Compact representation of functions
  • IFDS, IDE
  • Use assumptions about the callback methods?
  • e.g. assume callback methods are good
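
On the "compact representation of functions" point: summary functions are only practical to store if compositions and meets stay in a small closed form. The sketch below uses a plain gen/kill pair with set union as the meet. This is a simpler stand-in chosen for illustration, not the representation used by IFDS or IDE, and the class name `GenKill` and its methods are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class GenKill:
    """The function x -> (x - kill) | gen, stored as two sets."""
    gen: frozenset = frozenset()
    kill: frozenset = frozenset()

    def __call__(self, x):
        return (frozenset(x) - self.kill) | self.gen

    def then(self, later):
        """Composition `later o self`, again a gen/kill pair."""
        return GenKill((self.gen - later.kill) | later.gen,
                       self.kill | later.kill)

    def meet(self, other):
        """Pointwise union of results, again a gen/kill pair."""
        return GenKill(self.gen | other.gen, self.kill & other.kill)

a = GenKill(frozenset({"d1"}), frozenset({"d2"}))
b = GenKill(frozenset({"d3"}), frozenset({"d1"}))

# Compare the closed forms against direct evaluation on every subset.
universe = ["d1", "d2", "d3"]
all_subsets = [frozenset(c) for r in range(len(universe) + 1)
               for c in combinations(universe, r)]
compose_ok = all(a.then(b)(x) == b(a(x)) for x in all_subsets)
meet_ok = all(a.meet(b)(x) == a(x) | b(x) for x in all_subsets)
print(compose_ok, meet_ok)  # True True
```

With such a representation, a summary for a set of library-local paths costs two sets regardless of how many edges the paths contain.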