Title: Component-Level Dataflow Analysis
1. Component-Level Dataflow Analysis
Atanas (Nasko) Rountev, Ohio State University
2. Outline
- Interprocedural dataflow analysis
- Whole-program analysis limitations
- Problem: making dataflow analysis usable and useful for component-based software
- Technical challenges
- Ongoing and future work
3. Uses of Dataflow Analysis
- Software understanding tools
  - e.g., dependence analysis for program slicing, change impact analysis, refactoring, etc.
- Software testing
  - e.g., dataflow-based testing; testing of object interactions in OO software
- Software checking
  - e.g., object protocols such as open, then read/write, then close
- Performance optimizations in compilers
4. Model for Whole-Program Analysis
[Diagram: code for C1, C2, ..., Cn -> Whole-Program Dataflow Analysis -> dataflow solution for C1, C2, ..., Cn]
- C1, C2, ..., Cn constitute a complete program
- Implicit assumption: it is possible and desirable to analyze the source code of the entire program as a single unit
5. Limitations of Whole-Program Analysis
- What if some of the components are only available in binary form?
- What if we are building a library?
- What if we are using large libraries that need to be re-analyzed from scratch?
  - e.g., the standard Java libraries contain a few thousand classes
- What if one part of the program changes?
  - may have to re-analyze the entire program
6. Outline
- Interprocedural dataflow analysis
- Whole-program analysis limitations
- Problem: making dataflow analysis usable and useful for component-based software
  - Dozens of existing analyses could potentially become useful for component-based software
  - In tools for software understanding, testing, checking, and optimization
- Technical challenges
7. A Simple Case: Main and Lib
[Diagram: code for Lib -> Component-Level Dataflow Analysis -> dataflow solution for Lib, summary for Lib]
[Diagram: code for Main, summary for Lib -> Component-Level Dataflow Analysis -> dataflow solution for Main]
- Goal: the solution for Main should be as good as the solution that would have been computed by a whole-program analysis (no loss of precision)
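The two-phase scheme can be illustrated with a deliberately small analysis. The sketch below is not the model from the talk: it assumes a toy "may-modify" (MOD) analysis, a Lib that never calls back into Main, and a hypothetical program encoding (`LIB`, `MAIN`, `mod_sets`, and all variable names are made up for illustration). It analyzes Lib once, keeps only a summary for the interface procedure Q, and then checks that analyzing Main against that summary gives the same answer as analyzing everything together.

```python
# Hypothetical toy program: each procedure is a list of statements.
# ("assign", var) writes a variable; ("call", proc) invokes a procedure.
LIB = {
    "Q": [("assign", "buf"), ("call", "helper")],
    "helper": [("assign", "count")],
}
MAIN = {
    "main": [("assign", "x"), ("call", "Q")],
}


def mod_sets(code, summary=None):
    """Fixed-point MOD analysis: the variables each procedure may modify.

    Calls to procedures that are not in `code` are resolved through
    `summary`, a mapping from procedure name to its precomputed MOD set.
    """
    summary = summary or {}
    mod = {proc: set() for proc in code}
    changed = True
    while changed:
        changed = False
        for proc, body in code.items():
            new = set()
            for kind, arg in body:
                if kind == "assign":
                    new.add(arg)
                else:  # a call: use the local solution or the external summary
                    new |= mod[arg] if arg in code else summary[arg]
            if new != mod[proc]:
                mod[proc] = new
                changed = True
    return mod


# Phase 1: analyze Lib alone and export a summary for its interface.
lib_solution = mod_sets(LIB)
lib_summary = {"Q": frozenset(lib_solution["Q"])}

# Phase 2: analyze Main using only the summary, not the code of Lib.
main_solution = mod_sets(MAIN, lib_summary)

# Reference point: whole-program analysis over Main and Lib together.
whole_program = mod_sets({**LIB, **MAIN})
assert main_solution["main"] == whole_program["main"]  # no loss of precision
```

In this simple setting the summary is just a set per interface procedure; richer analyses need summary functions rather than summary sets.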
8. Component Model and Summary Info
- Component: set of related procedures or classes
- Component interactions: synchronous calls, shared variables
  - Challenge: more sophisticated component models
- Summary information is computed based only on the source code of Lib
  - Challenge: use info from component specifications
9. Summary Functions
Main calls procedure Q
[Diagram: Main calls Q in Lib; inside Q, path p1 has dataflow function f1 and path p2 has dataflow function f2]
Summary function for Q: fQ = f1 ∧ f2 (the meet of the two path functions), computed by the analysis of Lib
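Reading the summary function for Q as the meet of its per-path functions, the construction can be written out as follows. This is a sketch in standard meet-over-all-paths notation; the statement names $s_1, \dots, s_k$ and the set $\mathrm{paths}(Q)$ are introduced here for illustration and do not appear on the slide.

```latex
\[
  f_p = f_{s_k} \circ \cdots \circ f_{s_1}
  \quad \text{for a path } p = s_1 s_2 \cdots s_k \text{ through } Q,
\]
\[
  f_Q = \bigwedge_{p \,\in\, \mathrm{paths}(Q)} f_p ,
  \qquad \text{here } f_Q = f_1 \wedge f_2 ,
  \quad \text{i.e. } f_Q(x) = f_1(x) \wedge f_2(x).
\]
```

Main then applies $f_Q$ at each call site of Q instead of re-analyzing the body of Q.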
10. Open Questions
- Challenge: compact representation of dataflow functions and their transitive composition and meet
  - Existing work solves this problem for some analysis categories; generalizations are needed
- Challenge: callbacks
  - e.g., function pointers in C
  - e.g., virtual dispatch in C++ and Java
  - Fundamental problem, not addressed adequately by existing work
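For the first challenge, one well-known family that does admit a compact representation is gen/kill functions: each function is stored as two sets, and both composition and meet again yield a gen/kill pair. The sketch below assumes a "may" problem in which meet is set union; the class name `GenKill` and the fact names are hypothetical, and this is only one instance of the analysis categories the slide alludes to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenKill:
    """Dataflow function f(X) = (X - kill) | gen, stored as two sets."""
    gen: frozenset = frozenset()
    kill: frozenset = frozenset()

    def __call__(self, facts):
        return (frozenset(facts) - self.kill) | self.gen

    def then(self, nxt):
        """Composition: the function for 'self followed by nxt'."""
        return GenKill((self.gen - nxt.kill) | nxt.gen, self.kill | nxt.kill)

    def meet(self, other):
        """Meet for a 'may' problem: union of the two results."""
        return GenKill(self.gen | other.gen, self.kill & other.kill)


# Hypothetical paths through a library procedure Q.
# Path p1 has two statements; path p2 has one.
f1 = GenKill(frozenset({"x@Q1"}), frozenset({"x@Main"})).then(
    GenKill(frozenset({"y@Q2"}), frozenset({"y@Main"})))
f2 = GenKill(frozenset({"x@Q3"}), frozenset({"x@Main"}))

# The summary for Q is a single gen/kill pair, however long the paths are.
f_q = f1.meet(f2)

entry = {"x@Main", "y@Main", "z@Main"}
assert f_q(entry) == f1(entry) | f2(entry)
```

The summary stays the same size regardless of how many statements each path contains, which is what makes it usable as a stand-in for the library code.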
11. Callbacks
Main calls procedure Q; during the call, Lib calls R
[Diagram: Main, containing R, calls Q in Lib; Q calls back to R]
The function for p2 cannot be computed until Main is analyzed
Solution: summary functions for subpaths, computed during the analysis of Lib. Later, compose them with the functions from Main
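The subpath idea can be sketched as follows. This is an illustrative toy, not the theoretical model from the talk: it assumes gen/kill functions stored as `(gen, kill)` pairs, a "may" problem where meet is union, and a callback that is identified simply by name. All helper names (`fn`, `compose`, `summarize_path`, `instantiate`) and fact names are hypothetical. The analysis of Lib collapses every run of known statements into one function and leaves a named hole where the callback occurs; the analysis of Main later fills the hole with the function for R and composes the pieces.

```python
from functools import reduce


def fn(gen=(), kill=()):
    """A gen/kill dataflow function, stored as a pair of frozensets."""
    return (frozenset(gen), frozenset(kill))


IDENTITY = fn()


def compose(f, g):
    """Function for 'f followed by g'."""
    (g1, k1), (g2, k2) = f, g
    return ((g1 - k2) | g2, k1 | k2)


def meet(f, g):
    """Meet for a 'may' problem: union of the two results."""
    (g1, k1), (g2, k2) = f, g
    return (g1 | g2, k1 & k2)


def apply_fn(f, facts):
    gen, kill = f
    return (frozenset(facts) - kill) | gen


def summarize_path(steps):
    """Lib side: collapse known functions, keep callback names as holes."""
    segments, acc = [], IDENTITY
    for step in steps:
        if isinstance(step, str):      # unresolved callback, e.g. "R"
            segments += [acc, step]
            acc = IDENTITY
        else:
            acc = compose(acc, step)
    segments.append(acc)
    return segments


def instantiate(segments, callbacks):
    """Main side: plug in the callback functions and compose everything."""
    resolved = (callbacks[s] if isinstance(s, str) else s for s in segments)
    return reduce(compose, resolved, IDENTITY)


# Analysis of Lib: p1 is fully known, p2 passes through the callback R.
s1, s2, s3 = fn({"a1"}, {"a0"}), fn({"b1"}, {"b0"}), fn({"c1"}, {"c0"})
p1 = summarize_path([fn({"a2"}, {"a0"})])
p2 = summarize_path([s1, "R", s2, s3])   # -> [s1, "R", s2-then-s3]

# Analysis of Main: the function for R is now available.
callbacks = {"R": fn({"r1"}, {"a1"})}
f_q = meet(instantiate(p1, callbacks), instantiate(p2, callbacks))
```

The partial summary for p2 has three segments no matter how many statements surround the callback, so the work done while analyzing Lib is not repeated when Main is analyzed.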
12. Ongoing Work
- Goal 1 (achieved): theoretical model for computing and using summary functions in the presence of callbacks
- Goal 2 (ongoing): instantiate the model to common categories of analyses
  - dependence analysis, pointer analysis, etc.
- Goal 3: experimental evaluation
  - e.g., how large are the summaries?
  - Eclipse plug-in for call graph construction needs summary info for all Java 1.4 libraries
13. Future Work
- Beyond the traditional restrictions
  - Use not only code, but also component specifications, e.g., sharpen the summary functions based on preconditions
  - Higher level of abstraction for component interfaces and interactions
    - Right now: low-level mechanisms such as procedure calls and shared variables
- Extensive experimental evaluation on real-world software systems