Title: Bandera: Extracting Finite-state Models from Java Code
1Bandera: Extracting Finite-state Models from Java Code
Faculty
- James Corbett
- Matthew Dwyer
- John Hatcliff
Students and Post-docs
- Shawn Laubach
- Corina Pasareanu
- Robby
- Hongjun Zheng
- Roby Joehanes
- Ritesh Desai
- Venkatesh Ranganath
- Oksana Tkachuk
2Goal: Increase Software Reliability
Trends
Size, complexity, concurrency, distributed
Cost of software engineer
Cost of CPU cycle
Future: Automated Fault Detection
3The Dream
Program:
  void add(Object o) {
    buffer[head] = o;
    head = (head+1)%size;
  }
  Object take() {
    tail = (tail+1)%size;
    return buffer[tail];
  }
Requirement: Property 1, Property 2
Program + Requirement -> Checker -> OK or Error trace
4Model Checking
Finite-state model + Temporal logic formula -> Model Checker -> OK or Error trace
Temporal logic formula: (F W)
Error trace: Line 5, Line 12, Line 15, Line 21, Line 25, Line 27, Line 41, Line 47
5Spin Example
proctype A(chan in, out)
{
    byte mt;   /* message data */
    bit vr;
L1: mt = (mt+1)%MAX;
    out!mt,1;
    goto L2;
L2: in?vr;
    if
    :: (vr == 1) -> goto L1
    :: (vr == 0) -> goto L3
    :: printf("Error") -> goto L5
    fi;
L3: out!mt,1;
    goto L2;
L4: in?vr;
    if
    :: goto L1
    :: printf("Error") -> goto L5
    fi;
L5: out!mt,0;
    goto L4
}
Fragment of Alternating Bit Protocol
6Explicit State Model-checking
[Figure, built up step by step over slides 6 to 10. Panels: "Conceptual View", showing the explored state space as a computation tree, and "Implementation". Each state is labeled with a location and its variable values, e.g. "L1, (mt1, vr1), ...". By slide 10 the explored states are L1, L2, L3, and L5.]
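The exploration pictured on these slides can be sketched in Java as a depth-first search with a visited set. This is only an illustration under stated assumptions: the transition relation below is a hand-written approximation of process A from the Spin example (each channel receive is replaced by a nondeterministic choice of vr, and MAX is assumed to be 2), and the names ExplicitStateSearch, state, successors, and explore are hypothetical, not part of Spin or Bandera.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ExplicitStateSearch {
    static final int MAX = 2; // assumed bound on message values, keeps the space finite

    // A state is (location, mt, vr), like the "L1, (mt1, vr1), ..." labels.
    static List<Integer> state(int loc, int mt, int vr) {
        return List.of(loc, mt, vr);
    }

    // Hypothetical transition relation: process A with each channel
    // receive replaced by a nondeterministic choice of vr in {0, 1}.
    static List<List<Integer>> successors(List<Integer> s) {
        int loc = s.get(0), mt = s.get(1), vr = s.get(2);
        List<List<Integer>> next = new ArrayList<>();
        if (loc == 1) {
            next.add(state(2, (mt + 1) % MAX, vr));
        } else if (loc == 2) {
            next.add(state(1, mt, 1)); // received vr == 1
            next.add(state(3, mt, 0)); // received vr == 0
            next.add(state(5, mt, 0)); // error branch, either received value
            next.add(state(5, mt, 1));
        } else if (loc == 3) {
            next.add(state(2, mt, vr));
        } else if (loc == 4) {
            for (int v = 0; v <= 1; v++) {
                next.add(state(1, mt, v));
                next.add(state(5, mt, v));
            }
        } else if (loc == 5) {
            next.add(state(4, mt, vr));
        }
        return next;
    }

    // Depth-first search: the stack holds states still to be expanded and
    // the visited set stops the search from re-expanding a known state.
    static Set<List<Integer>> explore(List<Integer> initial) {
        Set<List<Integer>> visited = new HashSet<>();
        Deque<List<Integer>> stack = new ArrayDeque<>();
        visited.add(initial);
        stack.push(initial);
        while (!stack.isEmpty()) {
            List<Integer> current = stack.pop();
            for (List<Integer> n : successors(current)) {
                if (visited.add(n)) {
                    stack.push(n);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Set<List<Integer>> reachable = explore(state(1, 0, 0));
        System.out.println("reachable states: " + reachable.size());
    }
}
```

The visited set is what makes the search terminate on cyclic models: a state that is reached a second time is not expanded again.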
11Why Try to Use Model Checking for Software?
- Automatically check, e.g.,
- invariants, simple safety and liveness properties,
- absence of deadlock and livelock,
- complex event sequencing properties, e.g.,
"Between the window open and the window close, button X can be pushed at most twice."
- In contrast to testing, gives complete coverage by exhaustively exploring all paths in the system
- It has been used for years with good success in hardware and protocol design
This suggests that model checking can complement existing software quality assurance techniques.
12What makes model-checking software difficult?
Problems using existing checkers
13Model Construction Problem
Program:
  void add(Object o) {
    buffer[head] = o;
    head = (head+1)%size;
  }
  Object take() {
    tail = (tail+1)%size;
    return buffer[tail];
  }
Program -> Model Description -> Model Checker
Programming Languages
methods, inheritance, dynamic creation, exceptions, etc.
Model Description Languages
automata
14What makes model-checking software difficult?
Problems using existing checkers
15Property Specification Problem
- Difficult to formalize a requirement in temporal logic
"Between the window open and the window close, button X can be pushed at most twice."
is rendered in LTL as...
[]((open /\ <>close) -> ((!pushX /\ !close) U
  (close \/ ((pushX /\ !close) U
  (close \/ ((!pushX /\ !close) U
  (close \/ ((pushX /\ !close) U
  (close \/ (!pushX U close))))))))))
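To make the intent of the requirement concrete, here is a minimal Java sketch of a monitor over a finite event trace. It is not the LTL semantics itself and not how a model checker evaluates the formula: it assumes one event per step, the event names "open", "pushX", and "close", and it counts individual pushX events, whereas the LTL formula counts maximal runs of pushX states. The class and method names are hypothetical.

```java
import java.util.List;

class WindowPushMonitor {
    // Returns true if no open...close segment of the trace contains more
    // than two pushX events. A window that is opened but never closed is
    // ignored, mirroring the "between open and close" scope.
    static boolean holds(List<String> trace) {
        boolean windowOpen = false;
        int pushes = 0;
        for (String event : trace) {
            switch (event) {
                case "open":
                    if (!windowOpen) {
                        windowOpen = true;
                        pushes = 0; // start counting at the earliest open
                    }
                    break;
                case "pushX":
                    if (windowOpen) {
                        pushes++;
                    }
                    break;
                case "close":
                    if (windowOpen && pushes > 2) {
                        return false; // third push seen before this close
                    }
                    windowOpen = false;
                    break;
                default:
                    break; // other events do not affect the property
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holds(List.of("open", "pushX", "pushX", "close")));
        System.out.println(holds(List.of("open", "pushX", "pushX", "pushX", "close")));
    }
}
```

Even this small monitor needs a flag and a counter, which gives some feeling for why the hand-written LTL formula is so deeply nested.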
16Property Specification Problem
Forced to state property in terms of model rather than source
- We want to write source level specifications...
Heap.b.head == Heap.b.tail
- We are forced to write model level specifications...
(((_collect(heap_b) == 1) &&
  (BoundedBuffer_col.instance[_index(heap_b)].head ==
   BoundedBuffer_col.instance[_index(heap_b)].tail)) ||
 ((_collect(heap_b) == 3) &&
  (BoundedBuffer_col_0.instance[_index(heap_b)].head ==
   BoundedBuffer_col_0.instance[_index(heap_b)].tail)) ||
 ((_collect(heap_b) == 0) && TRAP))
17What makes model-checking software difficult?
Problems using existing checkers
18State Explosion Problem
- Cost is exponential in the number of components
- Moore's law and algorithm advances can help
- Holzmann: 7 days (1980) to 7 seconds (2000)
- Explosive state growth in software limits scalability
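The exponential cost can be illustrated with a small calculation. The sketch below assumes the simplest case, n independent components with k local states each, so the product state space has at most k^n global states; the class and method names are hypothetical.

```java
import java.math.BigInteger;

class StateSpaceBound {
    // Upper bound on the number of global states when `components`
    // independent components each have `localStates` local states.
    static BigInteger productStates(int localStates, int components) {
        return BigInteger.valueOf(localStates).pow(components);
    }

    public static void main(String[] args) {
        // Each doubling of the component count squares the bound.
        for (int n = 1; n <= 16; n *= 2) {
            System.out.println(n + " components of 10 states: at most "
                    + productStates(10, n) + " global states");
        }
    }
}
```

Under this assumption a constant-factor speedup in hardware only buys a few additional components, which is why the later slides focus on reducing the model itself.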
19What makes model-checking software difficult?
Problems using existing checkers
20Output Interpretation Problem
Program:
  void add(Object o) {
    buffer[head] = o;
    head = (head+1)%size;
  }
  Object take() {
    tail = (tail+1)%size;
    return buffer[tail];
  }
Model Description
- Raw error trace may be 1000s of steps long
- Must map line listing onto model description
- Mapping to source is made difficult by
- semantic gap and clever encodings of complex features
- multiple optimizations and transformations
21Bandera: An open tool set for model-checking Java source code
22Addressing the Model Construction Problem
- Numerous analyses, optimizations, two intermediate languages, multiple back-ends
- Slicing, abstract interpretation, specialization
- Variety of usage modes: simple ... highly tuned
23Addressing the Property Specification Problem
An extensible language based on field-tested
temporal property specification patterns
[]((open /\ <>close) -> ((!pushX /\ !close) U
  (close \/ ((pushX /\ !close) U
  (close \/ ((!pushX /\ !close) U
  (close \/ ((pushX /\ !close) U
  (close \/ (!pushX U close))))))))))
24Addressing the State Explosion Problem
Java Source:
  void add(Object o) {
    buffer[head] = o;
    head = (head+1)%size;
  }
Java Source -> Model Compiler -> Model Descriptions
- Aggressive customization via slicing, abstract
interpretation, program specialization
25Addressing the Output Interpretation Problem
Model Description
Intermediate Representations
Model Checker
Model Compiler
Error trace
- Run error traces forwards and backwards
- Program state queried
- Heap structures navigated
- Locks, wait sets, blocked sets displayed
26Bandera Architecture
27Front End
- Translates Java source to Jimple IR
- Supports specification of property
- Provides debugger-like step facilities for error
traces
Java:
  if (x > 0) x = y * 2;
Jimple:
  Label1: if (x <= 0) goto Label2;
          t0 = y * 2;
          x = t0;
  Label2:
28Property Specification
/**
 * @observable
 *   EXP Full: (head == tail);
 */
class BoundedBuffer {
  Object [] buffer;
  int head, tail, bound;
  public synchronized void add(Object o) { ... }
  public synchronized Object take () { ... }
}
29Property-directed Slicing
Source program
- slicing criterion generated automatically from
observables mentioned in the property
- backwards slicing automatically finds all
components that might influence the observables.
30Property-directed Slicing
/**
 * @observable EXP Full: (head == tail)
 */
class BoundedBuffer {
  Object [] buffer_;
  int bound;
  int head, tail;
  public synchronized void add(Object o) {
    while ( tail == head )
      try { wait(); } catch ( InterruptedException ex) {}
    buffer_[head] = o;
    head = (head+1) % bound;
    notifyAll();
  }
  ...
}
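The effect of slicing on this class can be illustrated with a hand-derived sketch. It is not output of Bandera's slicer. It assumes a single-threaded setting (synchronization and wait/notify are omitted), an initial state of head = 0 and tail = bound - 1, and a take() like the one shown on the earlier slides. Since the observable Full depends only on head, tail, and bound, the stores into the buffer array do not influence it and are dropped in the sliced version. All names below are hypothetical.

```java
class SlicingSketch {
    // Original buffer logic, single-threaded for the sake of the example.
    static class Buffer {
        final Object[] buffer;
        final int bound;
        int head, tail;

        Buffer(int bound) {
            this.bound = bound;
            this.buffer = new Object[bound];
            this.head = 0;
            this.tail = bound - 1; // assumed initial state
        }

        void add(Object o) {
            buffer[head] = o;
            head = (head + 1) % bound;
        }

        Object take() {
            tail = (tail + 1) % bound;
            return buffer[tail];
        }

        boolean full() {
            return head == tail; // the observable Full
        }
    }

    // Hand-derived slice with respect to Full: only statements that can
    // influence head or tail remain, so the array and its stores are gone.
    static class SlicedBuffer {
        final int bound;
        int head, tail;

        SlicedBuffer(int bound) {
            this.bound = bound;
            this.head = 0;
            this.tail = bound - 1;
        }

        void add() {
            head = (head + 1) % bound;
        }

        void take() {
            tail = (tail + 1) % bound;
        }

        boolean full() {
            return head == tail;
        }
    }

    // Drives both versions with the same operations ('a' = add, 't' = take)
    // and reports whether the observable agrees after every step.
    static boolean agree(String ops, int bound) {
        Buffer original = new Buffer(bound);
        SlicedBuffer sliced = new SlicedBuffer(bound);
        for (char op : ops.toCharArray()) {
            if (op == 'a') {
                original.add("item");
                sliced.add();
            } else {
                original.take();
                sliced.take();
            }
            if (original.full() != sliced.full()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agree("aatata", 3));
    }
}
```

The sliced class has no Object data at all, which is the kind of reduction that shrinks the state space the model checker has to explore.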
31AbstractionSpecializer
Collapses data domains via abstract
interpretation
Data domains
Code
int x = 0; if (x == 0) x = x + 1;
32Abstraction Component Functionality
Variable   Concrete type   Abstract type
x          int             Signs
y          int             Signs
done       bool            Bool
count      int             intAbs
...
o          Object          Point
b          Buffer          Buffer
Abstraction Library: Signs
33Abstraction Specification
abstraction Signs abstracts int
begin
  TOKENS = { NEG, ZERO, POS };
  abstract(n)
    begin
      n < 0  -> {NEG};
      n == 0 -> {ZERO};
      n > 0  -> {POS};
    end
  operator + add
    begin
      (NEG , NEG)  -> {NEG};
      (NEG , ZERO) -> {NEG};
      (ZERO, NEG)  -> {NEG};
      (ZERO, ZERO) -> {ZERO};
      (ZERO, POS)  -> {POS};
      (POS , ZERO) -> {POS};
      (POS , POS)  -> {POS};
      (_,_) -> {NEG, ZERO, POS};  /* case (POS,NEG), (NEG,POS) */
    end

public class Signs {
  public static final int NEG  = 0; // mask 1
  public static final int ZERO = 1; // mask 2
  public static final int POS  = 2; // mask 4
  public static int abstract(int n) {
    if (n < 0) return NEG;
    if (n == 0) return ZERO;
    if (n > 0) return POS;
  }
  public static int add(int arg1, int arg2) {
    if (arg1==NEG && arg2==NEG) return NEG;
    if (arg1==NEG && arg2==ZERO) return NEG;
    if (arg1==ZERO && arg2==NEG) return NEG;
    if (arg1==ZERO && arg2==ZERO) return ZERO;
    if (arg1==ZERO && arg2==POS) return POS;
    if (arg1==POS && arg2==ZERO) return POS;
    if (arg1==POS && arg2==POS) return POS;
    return Bandera.choose(7); /* case (POS,NEG), (NEG,POS) */
  }
}
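The Signs abstraction can also be sketched as a small self-contained Java class. This is an illustration only, not the code Bandera generates: it represents an abstract value directly as a bit set over the tokens (NEG = 1, ZERO = 2, POS = 4, following the mask comments above), so that the nondeterministic case (_,_) becomes the set of all three tokens instead of a call to Bandera.choose. The names SignsSketch, abstractOf, addTokens, and add are hypothetical.

```java
class SignsSketch {
    // One bit per token, so a set of tokens is an int mask.
    static final int NEG = 1, ZERO = 2, POS = 4;
    static final int ANY = NEG | ZERO | POS;

    // Abstraction function: maps a concrete int to its token.
    static int abstractOf(int n) {
        if (n < 0) return NEG;
        if (n == 0) return ZERO;
        return POS;
    }

    // The add table on single tokens: ZERO is neutral, equal signs are
    // preserved, and mixed signs can produce any token.
    static int addTokens(int a, int b) {
        if (a == ZERO) return b;
        if (b == ZERO) return a;
        if (a == b) return a;
        return ANY; // case (POS,NEG), (NEG,POS)
    }

    // Lifts addTokens to sets of tokens by taking the union over all pairs.
    static int add(int setA, int setB) {
        int result = 0;
        for (int a = NEG; a <= POS; a <<= 1) {
            for (int b = NEG; b <= POS; b <<= 1) {
                if ((setA & a) != 0 && (setB & b) != 0) {
                    result |= addTokens(a, b);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Abstract run of: int x = 0; if (x == 0) x = x + 1;
        int x = abstractOf(0);
        if (x == ZERO) {
            x = add(x, abstractOf(1));
        }
        System.out.println(x == POS);
    }
}
```

A useful sanity check for any such table is soundness: for concrete a and b, the token of a + b should always be contained in add applied to the tokens of a and b.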
34Specification Creation Tools
abstraction Signs abstracts int
begin
  TOKENS = { NEG, ZERO, POS };
  abstract(n)
    begin
      n < 0  -> {NEG};
      n == 0 -> {ZERO};
      n > 0  -> {POS};
    end
  operator + add
    begin
      (NEG , NEG)  -> {NEG};
      (NEG , ZERO) -> {NEG};
      (ZERO, NEG)  -> {NEG};
      (ZERO, ZERO) -> {ZERO};
      (ZERO, POS)  -> {POS};
      (POS , ZERO) -> {POS};
      (POS , POS)  -> {POS};
      (_,_) -> {NEG, ZERO, POS};
    end
35Back End
- Bandera Intermediate Representation (BIR)
- guarded command language
- includes locks, threads, references, heap
- info to help translators (live vars, invisible)
Jimple:
  entermonitor r0;
  r1.count = 0;
BIR:
  loc s5: live { r0, r1 }
    when lockAvail(r0.lock) do { lock(r0.lock); } goto s6;
  loc s6: live { r1 }
    when true do invisible { r1.count := 0; } goto s7;
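The guarded-command reading of the two BIR locations can be sketched in Java as follows. This is a toy interpreter under stated assumptions, not Bandera's BIR implementation: the store has a single lock flag and a single count field, and the names GuardedCommandSketch, Store, Command, program, and step are hypothetical.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

class GuardedCommandSketch {
    // Hypothetical store: one lock flag, one field, and the current location.
    static class Store {
        boolean lockHeld = false;
        int count = 7;
        int loc = 5;
    }

    // One guarded command: "loc from: when guard do action goto to".
    static class Command {
        final int from;
        final Predicate<Store> guard;
        final Consumer<Store> action;
        final int to;

        Command(int from, Predicate<Store> guard, Consumer<Store> action, int to) {
            this.from = from;
            this.guard = guard;
            this.action = action;
            this.to = to;
        }
    }

    // The two locations from the slide, with lockAvail modeled as !lockHeld.
    static List<Command> program() {
        return List.of(
            new Command(5, s -> !s.lockHeld, s -> s.lockHeld = true, 6),
            new Command(6, s -> true, s -> s.count = 0, 7));
    }

    // Fires the first enabled command at the current location. Returns
    // false if nothing is enabled, i.e. the process is blocked or finished.
    static boolean step(Store s, List<Command> program) {
        for (Command c : program) {
            if (c.from == s.loc && c.guard.test(s)) {
                c.action.accept(s);
                s.loc = c.to;
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Store s = new Store();
        while (step(s, program())) {
            System.out.println("now at s" + s.loc);
        }
    }
}
```

Making the guard explicit is what lets a translator express blocking on a lock directly in the checker's input language: a process whose guard is false simply has no enabled transition.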
36Bounded Buffer BIR
process BoundedB()
  BoundedBuffer_ref = ref { BoundedBuffer_col, BoundedBuffer_col_0 };
  BoundedBuffer_rec = record {
    bound_ : range -1..4;
    head_ : range -1..4;
    tail_ : range -1..4;
    BIRLock : lock wait reentrant;
  };
  BoundedBuffer_col : collection [3] of BoundedBuffer_rec;
  BoundedBuffer_col_0 : collection [3] of BoundedBuffer_rec;
  . . .
  loc s34: live { b2, b1, add_JJJCTEMP_0, add_JJJCTEMP_6, add_JJJCTEMP_8 }
    when true do invisible { add_JJJCTEMP_8 := (add_JJJCTEMP_6 % add_JJJCTEMP_8); } goto s35;
  loc s35: live { b2, b1, add_JJJCTEMP_0, add_JJJCTEMP_8 }
    when true do { add_JJJCTEMP_0.head_ := add_JJJCTEMP_8; } goto s36;
  loc s36: live { b2, b1, add_JJJCTEMP_0 }
    when true do { notifyAll(add_JJJCTEMP_0.BIRLock); } goto s37;
  loc s37: live { b2, b1, add_JJJCTEMP_0 }
    when true do { unlock(add_JJJCTEMP_0.BIRLock); } goto s38;
37Bounded Buffer Promela
typedef BoundedBuffer_rec {
  type_8 bound_;
  type_8 head_;
  type_8 tail_;
  type_18 BIRLock;
}

loc_25:
atomic {
  printf("BIR 25 0 1 OK\n");
  if
  :: (_collect(add_JJJCTEMP_0) == 1) ->
       add_JJJCTEMP_8 = BoundedBuffer_col.instance[_index(add_JJJCTEMP_0)].tail_;
  :: (_collect(add_JJJCTEMP_0) == 2) ->
       add_JJJCTEMP_8 = BoundedBuffer_col_0.instance[_index(add_JJJCTEMP_0)].tail_;
  :: else ->
       printf("BIR 25 0 1 NullPointerException\n");
       assert(0);
  fi;
  goto loc_26;
}
38Translators
- Plug-in component that interfaces to a specific model checker
- Translates BIR to checker input language
- Parses output of checker for error trace
- Currently
- SPIN, dSPIN, SMV translators complete
- JPF (from NASA Ames) integrated
- XMC, FDR translators in progress
39Case Studies
- Small examples thus far (< 2000 loc)
- illustrating use of property-pattern system and other components
- Scheduler from DEOS real-time OS kernel
- (1600 loc, 22 classes, seven tasks)
- Now trying systems up to 20,000 loc
- collection of 15 open-source 100% pure Java
- Jigsaw web-server from W3C
- Tomcat, James (from Apache/Jakarta)
- In general, 1-2 minutes for model extraction on (2000k systems)
- State space reductions can dramatically reduce cost
40Summary
- Bandera provides an open platform for experimentation
- Separates model checking from extraction
- uses existing model checkers
- supports multiple model checkers
- Specializes models for specific properties using automated support for slicing, abstraction, etc.
- Designed for extensibility
- well-defined internal representations and interfaces
- We hope this will contribute to the definition of APIs for software model-checkers
41Context of Project
- Researchers with different backgrounds (programming languages, static analysis, verification of concurrent systems, software engineering)
- Started on Bandera in November 1998 (previously built verification tools for Ada)
- Funding from NASA, National Science Foundation, Honeywell, US Air Force
42Current Status
- A reasonable subset of concurrent Java
- not handled: recursive methods, exceptions, inner classes, native methods, libraries()
- You can play around with a pre-alpha version of the tools accompanied by a draft tutorial
- Public release October 2000
http://www.cis.ksu.edu/santos/bandera
43Schedule of BRICS Mini-Course
- Monday -- Overview
- overview talk
- basic demo
- Tuesday -- Specifying Temporal Properties of Software
- overview of temporal specification
- review of CTL, LTL
- temporal specification design patterns
- example-driven presentation of Bandera's specification tools
- Wednesday -- Details of Bandera Components
- slicing concurrent Java programs
- Bandera abstraction tools
- model generation via Bandera's back-end
- summary of case studies (e.g., space-craft controller examples)