Title: Integrating Tools for Practical Software Analysis
1Integrating Tools for Practical Software Analysis
- Aaron Bradley, Zohar Manna,
- Henny Sipma, Sarah Solter
- Stanford University
CUE Workshop, Vienna, October 7, 2004
2Objective
- Program Analysis
- From point solutions to a tool infrastructure
3Tools
software producing world
?
4Tool-tree
software producing world
intelligent compiler
5Outline
- Survey of tools
- Tool integration
- Integration of methodologies
- An attempt at integration
- Proposal
- Topics for discussion
6Survey of tools Overview
- Isolated tools
- state-based
- syntax-based
- language-based
- Supporting tools
- Parsers
- Foundational
- Language integration
7What is an isolated tool?
- Based on one research insight
- Mostly tightly integrated with other components to make it work
- Performs a single task
- Usually not a source for further research
- May be extensible
8Isolated Tools State-based
- Whole program analysis
- behavioral properties
- Historically known as formal methods
- User interaction moderate to extensive
- Types of analysis
- Model checking: CBMC, CMC, JPF, Spin
- Predicate abstraction: Bandera, Blast, Slam
- Abstract interpretation: Astree
- Theorem proving: ACL2, ESC, PVS, STeP
9Isolated tools Syntax-based
- Spectrum of analyses: pattern-matching to dataflow analysis over program statements
- Bug finding (unsound and incomplete)
- Minimal user interaction
- Availability
- Java: FindBugs
- C: ESP, PREFix, xgcc
10Isolated tools language-based
- Based on programming language semantics
- Extend programming language semantics to check deeper behavioral properties
- Type qualifiers: Cqual
- Dataflow analyses
11What is a supporting tool?
- Does not solve a whole problem; product is not used by a person
- Indirectly assists in solving other problems
- Crucial to solving real problems
- Usually result of specialized research
- Usually incorporated via APIs
12Parsers
- Transforms source code into a form amenable to analysis (AST, 3- or 4-address code, byte code)
- Desirable features
- support for annotating code
- support for querying other analyses
- enforce standards of information presentation
- support incremental analysis
- Examples
- C: CIL
- Java: Flex, Soot, Barat, PMD
13Foundational tools
- Engines of analysis tools
- Results of specialized research
- Examples
- Decision procedures: CVC, CVC-Lite, ICS
- SAT solvers: Chaff
- BDD packages
- Polyhedra manipulation: PPL, Polka
- Algebraic packages: Reduce, Mathematica
- Linear programming/linear algebra: Matlab, GLPK
- Constraint solvers
14Tool Integration frameworks
- Formal conversion between different languages and property representations
- Examples
- VeriTech
- tools supported: Murphi, SMV, Spin, STeP, ...
- maintaining semantics of properties proved
- SAL
- expressive language to allow compilation of other languages
- provides analysis
- Common intermediate representation for C, C++, Java? Byte code?
15Options for Integration
- Loose coupling
- via input/output language
- Tight coupling
- one monolithic system
- API coupling
- component-based
16Loose coupling
- Results of analyses are written to file and read from file by other analyses
- Properties proved by one tool must be translated into other tools; semantics may be different
- Examples: Veritech, JPF version 1, STeP Spin, Bandera
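A minimal sketch of loose coupling, assuming a hypothetical JSON exchange file: one analysis writes its proved properties to disk and another reads them back. The function names, the file layout, and the `semantics` tag are illustrative assumptions, not the format used by any tool named above.

```python
import json
import os
import tempfile


def export_results(path, tool, semantics, properties):
    """Producer side: write proved properties to a file (hypothetical format)."""
    with open(path, "w") as f:
        json.dump({"tool": tool, "semantics": semantics, "properties": properties}, f)


def import_results(path, expected_semantics):
    """Consumer side: read another tool's results, rejecting mismatched semantics."""
    with open(path) as f:
        data = json.load(f)
    if data["semantics"] != expected_semantics:
        # The receiving tool cannot reuse the properties without a translation step.
        raise ValueError("properties need translation: %s -> %s"
                         % (data["semantics"], expected_semantics))
    return data["properties"]


def demo(consumer_semantics="interleaving"):
    """Round trip through a temporary file, as two separate tools would do."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "results.json")
        export_results(path, "tool_a", "interleaving", ["x >= 0"])
        return import_results(path, consumer_semantics)
```

The explicit semantics check stands in for the translation problem noted on the slide: the file format is the only contract between the two tools.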
17Tight coupling
- Isolated tools: monolithic systems
- All components dependent on shared representations
- Cannot be pulled apart
- Components cannot be used in isolation
- Sometimes can extend the system with new analyses, e.g. PVS, BANE, REDUCE
18API coupling and layering
- Agreed upon standards for APIs and representations
- Layers
- user-level interfaces
- software analysis tools
- common dataflow and type inference framework, etc.
- parsers and foundational tools
- Each component can be made available to other researchers as is
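The layering above can be sketched as a set of small interfaces. This is an illustration under assumptions: the class names (`Parser`, `Analysis`, `LineParser`, `GotoFinder`) and the list-of-statements representation are hypothetical and do not describe any existing tool's API.

```python
from abc import ABC, abstractmethod


class Parser(ABC):
    """Bottom layer: turns source text into the agreed-upon representation."""

    @abstractmethod
    def parse(self, source):
        """Return the shared program representation."""


class Analysis(ABC):
    """Middle layer: consumes the shared representation."""

    @abstractmethod
    def run(self, program):
        """Return a list of finding strings."""


class LineParser(Parser):
    def parse(self, source):
        # Shared representation in this sketch: a list of non-empty statements.
        return [line.strip() for line in source.splitlines() if line.strip()]


class GotoFinder(Analysis):
    def run(self, program):
        return ["statement %d: goto used" % (i + 1)
                for i, stmt in enumerate(program) if stmt.startswith("goto")]


def user_interface(source, parser, analyses):
    """Top layer: a single access point that wires the lower layers together."""
    program = parser.parse(source)
    findings = []
    for analysis in analyses:
        findings.extend(analysis.run(program))
    return findings
```

Because each layer depends only on the interface below it, a parser or an analysis can be swapped or reused on its own, which is the point of the last bullet.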
19What about integrating methodologies?
- Classical
- traditionally compiler optimization
- programming language semantics
- State-space
- model checking/theorem proving
- program semantics
- Language-based
- sequences of statements
- language semantics
20Classical Program Analysis
- Foundations
- Type inference
- controlflow/dataflow analysis
- Exploits the semantics of programming languages
- Examples
- Alias analysis
- Class hierarchy analysis
- Traditional objective: compiler optimization
- Requirement: low complexity
- Can be borrowed by other tools
- Recent extension: CQual
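As one concrete instance of a classical dataflow analysis, here is a minimal sketch of live-variable analysis iterated to a fixed point. The control-flow graph encoding (dictionaries keyed by node) is an assumption made for this example.

```python
def live_variables(cfg, use, defs):
    """Backward dataflow analysis iterated to a fixed point.

    cfg maps each node to its list of successors; use and defs map each node
    to the set of variables it reads and writes. Returns, per node, the set
    of variables live on entry to that node.
    """
    live_in = {n: set() for n in cfg}
    changed = True
    while changed:
        changed = False
        for n in cfg:
            live_out = set()
            for succ in cfg[n]:
                live_out |= live_in[succ]
            new_in = use[n] | (live_out - defs[n])
            if new_in != live_in[n]:
                live_in[n] = new_in
                changed = True
    return live_in


# 1: x = 0   2: while x < n   3: x = x + 1   4: return x
EXAMPLE_CFG = {1: [2], 2: [3, 4], 3: [2], 4: []}
EXAMPLE_USE = {1: set(), 2: {"x", "n"}, 3: {"x"}, 4: {"x"}}
EXAMPLE_DEFS = {1: {"x"}, 2: set(), 3: {"x"}, 4: set()}
```

On the example, only `n` is live on entry to node 1, since `x` is overwritten there before any use.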
21State-space analysis
- Traditional objective: verification and program understanding
- Model behaviors
- Examples
- invariant generation
- termination analysis
- verification condition generation (Floyd-Hoare)
- model checking
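A minimal sketch of the state-space view, assuming an explicit, finite set of states: a breadth-first search over reachable states that checks an invariant. It is a toy illustration of explicit-state model checking, not the algorithm of any tool listed in this talk.

```python
from collections import deque


def check_invariant(initial, successors, invariant):
    """Explore reachable states breadth-first.

    Returns the first reachable state that violates the invariant,
    or None if the invariant holds on every reachable state.
    """
    seen = set(initial)
    queue = deque(initial)
    while queue:
        state = queue.popleft()
        if not invariant(state):
            return state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None


def counter_successors(state):
    """Example transition system: a counter that wraps around modulo 4."""
    return [(state + 1) % 4]
```

For the wrapping counter, the invariant `state < 4` holds on all reachable states, while `state != 3` is violated once the search reaches 3.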
22Language perspective
- View the program as a sequence of statements
- Specify properties on these sequences, for example ( xmalloc(...) free(x))
- Pattern matching on statements
- Successful in bugfinding
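A minimal sketch of pattern matching over a statement sequence. The property on the slide is not fully legible in this transcript, so the sketch assumes a simple pairing rule: every variable assigned from `malloc` should later be passed to `free`. The regular expressions and the function name are illustrative assumptions.

```python
import re

# Matches statements such as "p = malloc(8);" and captures the variable name.
MALLOC = re.compile(r"^\s*(\w+)\s*=\s*malloc\(.*\)\s*;?\s*$")
# Matches statements such as "free(p);" and captures the variable name.
FREE = re.compile(r"^\s*free\((\w+)\)\s*;?\s*$")


def unfreed(statements):
    """Return variables allocated by malloc and never freed later in the sequence."""
    allocated = []
    for stmt in statements:
        m = MALLOC.match(stmt)
        if m:
            allocated.append(m.group(1))
            continue
        f = FREE.match(stmt)
        if f and f.group(1) in allocated:
            allocated.remove(f.group(1))
    return allocated
```

The check is purely syntactic: it ignores aliasing and control flow, so it is unsound and incomplete in the sense used earlier for bug finders.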
23state space
Blast, SLAM, ESC
classical
language
CQual
ESP
24state space
Blast, SLAM, ESC
predicate abstraction
class hierarchy analysis
decision procedures
dataflow framework
quantified LTL
alias analysis
alias analysis
classical
language
CQual
ESP
25A researcher's experience
- Topic: termination analysis
- objective: prove that all loops in a program terminate
- approach: polynomial analysis
26Termination analysis
Implementation in Mathematica
transition systems
27A researcher's experience
- Topic: termination analysis
- objective: prove that all loops in a program terminate
- approach: polynomial analysis
- claim: scales to large programs
- claim: is useful for practical programs
- Needs experiments on real programs
28Termination analysis
alias analysis
class hierarchy analysis
Implementation in Mathematica
C programs
CIL
CFG construction
invariant generation
dataflow framework
29A researcher's experience
- Topic: termination analysis
- objective: prove that all loops in a program terminate
- approach: polynomial analysis
- claim: scales to large programs
- claim: is useful for practical programs
- Needs experiments on real programs
- Successfully applies the analysis to several large programs
- Writes a paper with impressive test results
30Tools
software producing world
31Problems with current approach
- Large integration effort
- Efforts are repeated
- Multiple languages
- Tools not available to the user community
- Too many tools of unknown trustworthiness
32Wishlist
- Users
- single access point
- Tool developers
- easy access to other tools
- infrastructure to insert tools
33User level interface
- Single-access point
- Keep new versions hidden from the user
- Intelligent compiler
- prioritization and hierarchical presentation of errors, warnings, proofs, presented like a compiler
- extensible for user interaction for more sophisticated users
34Front end Eclipse
- Integrated development environment
- Provides
- single access point for users
- hooks for tool developers
- Lacks
- infrastructure to communicate analysis results from one tool to another
- Burden of integration of analysis tools and methodologies still on the user
- No real integration
- Does not solve problems of too many tools
35Wishlist contd
- Approach: component-based with API coupling and layering
- Infrastructure that enables insertion of tools without the need for adaptation
- Single intermediate language
- Spectrum of interfaces: from intelligent compiler to building of customized applications
- Benchmarks for comparison
- Identification of strengths relative to categories of applications
- Refereed tools to ensure trustworthiness
36Tool-tree
software producing world
intelligent compiler
benchmarks
API
37Tool-tree
software producing world
intelligent compiler
38Tool-tree
software producing world
intelligent compiler
39Addition of new tool to tool tree
- Conform to API for performed analysis (or create new analysis for new type of analysis)
- Run standard benchmarks
- Compare performance with similar analyses
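The three steps above can be sketched as a small registry. Everything here is hypothetical: the `ToolTree` class, its method names, and the pass-rate score are assumptions made for illustration, not a proposed API.

```python
class ToolTree:
    """Hypothetical registry that groups tools by the analysis they perform."""

    def __init__(self):
        self.benchmarks = {}  # analysis name -> list of (input, expected) pairs
        self.scores = {}      # analysis name -> {tool name: fraction of cases passed}

    def new_analysis(self, analysis, cases):
        """Create a new type of analysis together with its standard benchmarks."""
        self.benchmarks[analysis] = list(cases)
        self.scores[analysis] = {}

    def add_tool(self, analysis, name, tool):
        # Step 1: conform to the API (in this sketch, the tool must be callable).
        if not callable(tool):
            raise TypeError("tool does not conform to the API")
        # Step 2: run the standard benchmarks for this analysis.
        cases = self.benchmarks[analysis]
        passed = sum(1 for inp, expected in cases if tool(inp) == expected)
        self.scores[analysis][name] = passed / len(cases)

    def ranking(self, analysis):
        # Step 3: compare with similar analyses, best score first.
        return sorted(self.scores[analysis].items(), key=lambda kv: -kv[1])
```

A usage example: after `new_analysis("parity", [(1, "odd"), (2, "even")])`, registering two tools with `add_tool` and calling `ranking("parity")` lists them by the fraction of benchmark cases each one answers correctly.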
40Termination analysis
gcc t .......
loops proven
A 90 15 55
B 25 65 42
C 45 55 15
D 65 15 78
41Tool-tree
software producing world
Portal: resource selection, resource discovery
database with test results
can be implemented as a webservice
42Tool tree
- Similar to the network protocol stack
- Provides a lot of functionality through a narrow interface
- Child APIs are a superset of their parents
- Richer APIs for more sophisticated/demanding users
- Can provide many alternatives for the same analysis
- Allows addition of new analyses
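One way to read "child APIs are a superset of their parents" is as interface inheritance. The sketch below assumes that reading; the class names and the toy loop encoding (a dictionary describing `while var > 0: var += step`) are hypothetical.

```python
class TerminationAPI:
    """Parent API (hypothetical): a narrow yes/no interface."""

    def terminates(self, loop):
        raise NotImplementedError


class DetailedTerminationAPI(TerminationAPI):
    """Child API: everything the parent offers, plus a witness for demanding users."""

    def ranking_variable(self, loop):
        raise NotImplementedError


class CountdownProver(DetailedTerminationAPI):
    """Toy analysis for integer loops of the form `while var > 0: var += step`."""

    def terminates(self, loop):
        return loop["step"] < 0

    def ranking_variable(self, loop):
        # With a negative step, the loop variable decreases and is bounded below by 0.
        return loop["var"] if self.terminates(loop) else None


def basic_client(tool, loop):
    """Written against the parent API only; accepts any child implementation."""
    return "terminates" if tool.terminates(loop) else "unknown"
```

A client that only needs the narrow interface keeps working when handed a richer tool, while a more demanding client can ask the same object for the witness.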
43How do we make it happen?
44Discussion topics
45Publication vs Implementation
- Considerations
- Is implementation of research results a respectable research pursuit?
- Is it worthwhile to spend time on implementation?
- Authorship often goes to the developer of an idea; implementors are considered secondary
- Implementations require a lot of effort, but decrease the effort of others. How to reward the effort?
- Ideas
- Conference that requires submissions to be accompanied by implementations
- Journal of foundational tool descriptions and documentation
46Funding structure
- Considerations
- Funding agencies often require revolutionary new approaches; implementation and integration of existing results are not considered
- Ideas
- Follow-up grants specifically for implementation and integration
- Infrastructure grants
- Existing effort: ESCHER?
47Working group to define APIs
- Considerations
- Who should participate?
- Process to reach agreement on APIs
- Core implementation: an academic pursuit?
- Compiler-like interface: an industry pursuit?
- Funding?
- Ideas
- Integrate working group with tool conference
- Maintain a portal
- Tool submission is refereed, provides a source of
prestige and promotes use
48Enforcement or Voluntary Adoption?
- Considerations
- Analysis requirements imposed by regulatory agencies (FAA, FDA, ...)?
- Government intervention could put too many restrictions on research activity
- Economic concerns drive non-safety-critical applications
- Usability
- Product liability?
49Other topics
- Component/API based integration realistic?
- One intermediate language for C/C++/Java?
- Collaboration vs competition