Title: Model Checking Software Artifacts
1Model Checking Software Artifacts
SAnToS Laboratory, Kansas State University, USA
http://www.cis.ksu.edu/cadena
http://www.cis.ksu.edu/bandera
http://www.cis.ksu.edu/bogor
Principal Investigators: Matt Dwyer, John Hatcliff, Gurdip Singh
Students: William Deng, Georg Jung, Oksana Tkachuk, Robby, Venkatesh Ranganath, Jesse Greenwald, Todd Wallentine
Support
- US National Science Foundation (NSF)
- US National Aeronautics and Space Administration (NASA)
- US Department of Defense Advanced Research Projects Agency (DARPA)
- US Army Research Office (ARO)
- Rockwell-Collins ATC
- Honeywell Technology Center and NASA Langley
- Sun Microsystems
- Intel
2For the past decade
- We've been developing program analysis frameworks
- Standard tensions
- Scalable versus Precise
- How semantic is the analysis?
- Property-specific versus Language-based
- How rich are the properties?
- Push-button versus Configurable
- How usable is the technology?
3Analyzing Source Code
- Worked on a broad range of case studies
- SPIN, SMV, ...
- Extracting models by hand
- Developed a series of tool frameworks for analyzing safety properties of concurrent programs
- FLAVERS (Ada)
- INCA Translators (Ada)
- Bandera (Java)
4Successful?
- Tools are widely used
- For education
- As a basis for further work by us and others
- Tools have been used to find bugs in real systems
- 1000-10000 LOC
- <10 threads
- Bugs that eluded significant testing efforts
5Whole Program Analysis
- will never scale to large code bases
- even for highly abstract analyses (e.g., control flow)
- even for simple properties (e.g., def-use)
- Must perform modular analyses
- Hard to do for truly global properties?
- Hard to do in presence of concurrency?
- What are the natural module boundaries?
- How big can a module be?
6A Solution
- Target the full range of software artifacts
- Requirements models
- Architectural descriptions
- Designs (at various levels of refinement)
- Code
- Use semantic analyses
- within artifacts (properties)
- across different artifacts (conformance)
7Features of our Vision
- Early and varied semantic modeling
- structural modeling is useful as well
- Analysis driven feedback and refinement
- Artifact generating analyses
- Proofs, reachable modes,
- Synthesize code wherever possible
- Aspects of an agile process
- continuous delivery of working artifacts
- Exploit "domain information" throughout
- ultimately meta-tools may be useful
8Development Flow
Users' informal requirements
Query checker, visualization tools
Requirements Model (shown four times in the diagram)
Consistency and completeness checker
9Development Flow
Users' informal requirements
Model-specific analysis
Inter-model consistency and completeness checking
10Development Flow
Conformance checker(s)
Design Model (shown four times in the diagram)
11Development Flow
Multi-layer conformance checking
Structural Design Model
Synchronization Policy Spec
Abstract Behavioral Model
Quality of Service Spec
12Development Flow
Structural Design Model (shown four times in the diagram)
Synchronization Policy Spec
13Development Flow
Structural Design Model
Synchronization Policy Spec
Abstract Behavioral Model
Quality of Service Spec
14Development Flow
Conformance checker(s)
Code
15Development Flow
Domain-appropriate Implementation Framework
Model/spec dependent synthesis procedures (proof
generating)
16Lessons
- Adapt methods to developers
- Ease of use, leverage domain abstractions
- Use layered, incremental methods
- Low entry barrier, early and focused feedback
- Focus technology on the hard part
- Synchronization, timing, global properties
- Synthesize as much code as possible
- Developer buy-in, reduce code-level reasoning
- Developers won't write specs, so tell them they are writing code
17and now for Bogor
18Model Checking in Cadena
- Steps toward our vision
- Hard problems here are
- not component coding (localized)
- inter-component coordination (sequencing, synchronization, timing, ...)
- Theme
- exploit domain semantics
- exploit implementation infrastructures
19An Overview of
- Component modeling
- Middleware modeling
- Develop an abstract model that captures semantics of actual middleware
- Environment modeling
- Exploit environment information to reduce state space
- Property specification
- Structural reductions
- Exploit structure of state space of periodic RT systems
20Modal SP
21Component Behavior
input ports
component BMModal {
  uses ReadData dataIn;
  consumes DataAvailable inDataAvailable;
  publishes DataAvailable outDataAvailable;
  provides ReadData dataOut;
  provides ChangeMode modeChange;

  enum Modes {enabled, disabled};
  Modes m;

  behavior {
    handles dataInReady (DataAvailable e) {
      case m of {
        enabled:
          dataOut::data <- dataIn.getData();
          push dataOutReady;
        disabled:
      }
    }
  }
}
22Component Behavior
output ports
23Component Behavior
mode declaration using CORBA IDL
24Component Behavior
behavior for events on dataInReady port
25Component Behavior
behavior mode cases
26Component Behavior
data flow specification
27Component Behavior
publish event
28Towards a Complete Model
We have transition semantics for intra-component
behavior.
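As a rough illustration of what "transition semantics for intra-component behavior" can mean for the BMModal example, the Python sketch below treats the handler as a function from the current component state to the next one. The class layout, the `get_data` callback standing in for the `dataIn` port, and the `published` list are all hypothetical names chosen for this sketch; none of them are Cadena or Bogor APIs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class BMModal:
    """Toy model of the modal component (all names are illustrative)."""
    get_data: Callable[[], int]            # stands in for the dataIn port
    mode: str = "enabled"                  # the mode variable m
    data_out: Optional[int] = None         # value exposed on dataOut
    published: List[str] = field(default_factory=list)

    def handle_data_in_ready(self) -> None:
        """One transition: react to an event on dataInReady."""
        if self.mode == "enabled":
            self.data_out = self.get_data()         # data flow step
            self.published.append("dataOutReady")   # publish event
        # In mode "disabled" the event is consumed with no effect.


c = BMModal(get_data=lambda: 42)
c.handle_data_in_ready()     # enabled: copies data and publishes
c.mode = "disabled"
c.handle_data_in_ready()     # disabled: nothing happens
```

Only the first call changes `data_out` and publishes an event, which is the mode-dependent behavior the `case m of` construct describes.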
29Middleware/Service Semantics
- Weak CCM and Event Services Specs (OMG)
- Informal English and examples
- Intentionally under-specified to allow implementor freedom
- Looked at implemented semantics of existing ORBs and Event Services
- ACE/TAO, OpenCCM, K-State
- Developed a family of semantic models that
captured their behavior
30Outline of Real System
Event channel with internal thread pool
Thread Pool
60Hz
20Hz
5Hz
1Hz
correlation filtering
proxy consumer holds list of consumer references
31Lots of details
- What events are users interested in reasoning about?
- publish, dispatch, timeouts, ...
- What state information?
- modes, dispatch queues, correlation automata, ...
- Minimize detail, but retain
- ability to express properties of interest
- correspondence with implementation semantics
32System Observations
Event channel with internal thread pool
Thread Pool
60Hz
20Hz
5Hz
1Hz
correlation filtering
33Parts of Model
- Components
- Modes and attributes defined over simple types
- Handler/method state machines
- Method calls
- Middleware
- Subscriber lists, correlators, dispatch queues
- Scheduler, thread pool
- Environment
- Time-triggering of events
- Data from devices, etc.
34Modeling Strategy
Event channel with internal thread pool
Thread Pool
60Hz
20Hz
5Hz
1Hz
correlation filtering
35Modeling Strategy
Event Channel Model
Component Models
Component Models
Connection Models
Environment Model
36Model Checker Support
- Ability to define efficient building blocks
- Sets, queues (with symmetry support)
- Flexible atomicity control
- Programmable scheduler
- Static data definition
- State compaction
- Flexibility in search strategies
- Full-state space
- Bounded/heuristic search
- State-less search
- BOGOR has all this and more
37Modeling of Components
handles dataInReady (DataAvailable e) {
  case m of {
    enabled:
      dataOut::data <- dataIn.getData();
      push dataOutReady;
    disabled:
  }
}

function tacticalSteering_push_inDataAvailable(CAD.Event event) {
  Data d;

  loc loc0: live {}
    when tacticalSteeringMode do {} goto loc1;
    when !tacticalSteeringMode do {} return;

  loc loc1: live {d}
    when true do {
      d := CAD.getField<Data>(AirFrame, "ReadData.data");
    } goto loc2;

  loc loc2: live {}
    when true do {
      CAD.setField<Data>(TacticalSteering, "dataOut.data", d);
    } goto loc3;

  loc loc3: live {}
    invoke pushOfProxy(TacticalSteering, dataOutReady)
    return;
}
38Modeling of Connections
instance AirFrame of BMLazyActive on l2 {
  connect dataAvailable to GPS.dataCurrent atRate 20;
  connect dataIn to GPS.dataOut;
}
Modeled very directly in BOGOR:
CAD.connectEvent(GPS, "dataCurrent", AirFrame, "inDataAvailable", 20, false);
39Modeling Middleware (Threads)
Dispatch queue polling
thread threadgroup5() {
  Pair.type<EventHandlerEnum, CAD.Event> pair;
  EventHandlerEnum handler;
  CAD.Event event;

  loc loc0: live {handler, event}
    when Queue.size<Pair.type<EventHandlerEnum, CAD.Event>>(Q5) > 0
    do invisible {
      pair := Queue.getFront<Pair.type<EventHandlerEnum, CAD.Event>>(Q5);
      Queue.dequeue<Pair.type<EventHandlerEnum, CAD.Event>>(Q5);
      handler := Pair.first<EventHandlerEnum, CAD.Event>(pair);
      event := Pair.second<EventHandlerEnum, CAD.Event>(pair);
    } goto loc1;

  loc loc1: live {}
    invoke virtual f(handler, event)
    goto loc0;
}
40Modeling Middleware (Queues)
extension Queue for edu.ksu.cis.cadena.bogor.ext.Queue {
  typedef type<'a>;
  expdef int size<'a>(Queue.type<'a>);
  expdef int capacity<'a>(Queue.type<'a>);
  expdef boolean isFull<'a>(Queue.type<'a>);
  expdef boolean isEmpty<'a>(Queue.type<'a>);
  expdef Queue.type<'a> create<'a>(int);
  actiondef enqueue<'a>(Queue.type<'a>, 'a);
  expdef 'a getFront<'a>(Queue.type<'a>);
  actiondef dequeue<'a>(Queue.type<'a>);
  expdef boolean containsPair<'a>(Queue.type<'a>, 'a);
}
Data in state space, operations implemented as Java code
41Modeling Middleware (Scheduling)
- Typically model checkers use non-deterministic scheduling
- i.e., choose from set of enabled transitions in a state
- set of all such schedules contains all real schedules
Thread Pool
60Hz
20Hz
5Hz
1Hz
- Bold Stroke Systems are scheduled based on RMA
- run highest-priority (i.e., rate) enabled action
- many fewer schedules, contains all real schedules
- BOGOR allows encoding specific schedules
- Java plugin filters enabled actions in state exploration algorithm
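The idea of a scheduler that filters the enabled actions can be sketched in a few lines of Python. This is only an illustration of the filtering step under the assumption that priority equals rate; the `Action` type and both function names are hypothetical and do not correspond to the actual Bogor plugin interface.

```python
from typing import List, NamedTuple


class Action(NamedTuple):
    name: str
    rate_hz: int   # rate group that owns the action; higher rate = higher priority


def nondeterministic(enabled: List[Action]) -> List[Action]:
    """Default model-checker view: every enabled action may run next."""
    return list(enabled)


def rate_monotonic(enabled: List[Action]) -> List[Action]:
    """Keep only the enabled actions that belong to the highest rate group."""
    if not enabled:
        return []
    top = max(a.rate_hz for a in enabled)
    return [a for a in enabled if a.rate_hz == top]


enabled = [Action("dispatch5", 5), Action("dispatch20", 20), Action("timeout20", 20)]
choices = rate_monotonic(enabled)
```

With the filter in place the search branches over two actions instead of three in this state, which is where the reduction in explored schedules comes from.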
42Modeling of Environment
System behavior is driven by periodic
time-triggered events
- Model time directly
- expensive (state space becomes acyclic)
- hard to get accurate timing info (platform specific)
- Boeing isn't very interested in real-time properties other than schedulability (?)
- Abstract time modeling strategies
- Timeouts can happen at any time
- Bound number of timeouts in hyper-period
- Bound relative number of timeouts in adjacent rate groups
- Approximate passage of time
43Modeling of Environment
- The maximum rate timeout count (tc) keeps track of the number of timeouts that have occurred during a hyper-period
- Assume a group of harmonic rates (R1, ..., Rmax), where Rmax is the maximum rate
- Let p(Ri) be the normalized period of Ri with respect to Rmax
- e.g., 5, 10, 20 => p(5) = 200ms/50ms, p(10) = 100ms/50ms, p(20) = 50ms/50ms
- For rate Ri, timeouts are issued whenever
- tc mod p(Ri) = 0
- Increment tc
- between 0 and p(R1)
- after every timeout of rate Rmax is issued
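The counting scheme above can be worked through for the 5, 10, 20 example with a short Python sketch. It assumes that the normalized period is Rmax divided by Ri, and that tc wraps back to 0 once it reaches the normalized period of the slowest rate; both function names are made up for this illustration.

```python
from typing import Dict, List


def normalized_periods(rates_hz: List[int]) -> Dict[int, int]:
    """p(Ri): period of Ri divided by the period of the maximum rate."""
    r_max = max(rates_hz)
    for r in rates_hz:
        if r_max % r != 0:
            raise ValueError("each rate must divide the maximum rate")
    return {r: r_max // r for r in rates_hz}


def timeouts_in_hyper_period(rates_hz: List[int]) -> List[List[int]]:
    """For each value of tc in one hyper-period, list the rates that time out."""
    p = normalized_periods(rates_hz)
    hyper = max(p.values())          # normalized period of the slowest rate
    schedule = []
    tc = 0
    for _ in range(hyper):
        schedule.append(sorted(r for r in rates_hz if tc % p[r] == 0))
        tc = (tc + 1) % hyper        # incremented after each max-rate timeout
    return schedule


periods = normalized_periods([5, 10, 20])
schedule = timeouts_in_hyper_period([5, 10, 20])
```

Here `periods` is `{5: 4, 10: 2, 20: 1}`, so all three rates time out together at tc = 0, the 20 Hz rate times out at every step, and the 10 Hz rate at every second step.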
44Relative Timeout Counts
R1, R2, R3
R1
R1, R2
R1
R1 R2 R3
- Assume that worst case execution time of Ri work can be performed in the period of Ri
- There is a pattern to the number of timeout counts in a frame
- e.g., in frame of Ri there are two timeouts of Ri-1
45Relative Timeout Counts
R1, R2
- Enforce only the relative number of timeouts for adjacent rates
- Timeout for Ri is enabled after
- work for Ri is complete
- proper number of timeouts for Ri-1 are performed
R1 R2
Problem: we don't know how long R2 work takes
46Relative Timeout Counts
R1, R2
- Enforce only the relative number of timeouts for adjacent rates
- Timeout for Ri is enabled after
- work for Ri is complete
- proper number of timeouts for Ri-1 are performed
R1 R2
Problem: we don't know how long R2 work takes
Must consider all interleavings of R1 timeout and actions performed in R2 work (or R3 work, ...)
47Relative Timeout Counts
R1, R2
R1 R2
48Modeling of Environment
- Previous model does not relate component execution with passage of time
- Assume we have worst-case execution bounds for event handlers
- e.g., from schedulability analysis
- Keep track of intra-hyper-period time (ihp) normalized by duration of shortest action
- Increment ihp by duration bounds for handlers as they are executed
- One tricky issue
49Lazily-Timed Components
ihp
thandler
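One way to picture the ihp bookkeeping described on the previous slide is the Python sketch below. It advances ihp by each handler's duration bound and reports which timeouts fall due along the way. The rule used here, that a timeout is due whenever ihp crosses a multiple of its period, is an assumption of this sketch rather than something stated on the slides, and the function and parameter names are hypothetical.

```python
from typing import Dict, List, Tuple


def run_handlers(
    handler_bounds: List[Tuple[str, int]],
    periods: Dict[str, int],
    hyper_period: int,
) -> List[Tuple[str, int, List[str]]]:
    """Advance ihp by each handler's bound; report timeouts that became due.

    All durations are in units of the shortest action. hyper_period is
    assumed to be a multiple of every period.
    """
    ihp = 0
    log = []
    for name, bound in handler_bounds:
        start, ihp = ihp, ihp + bound
        due: List[str] = []
        for rate, period in sorted(periods.items()):
            # number of multiples of `period` in the interval (start, ihp]
            crossed = ihp // period - start // period
            due.extend([rate] * crossed)
        ihp %= hyper_period              # wrap at the end of the hyper-period
        log.append((name, ihp, due))
    return log


log = run_handlers(
    handler_bounds=[("h1", 1), ("h2", 1), ("h3", 2)],
    periods={"fast": 2, "slow": 4},
    hyper_period=4,
)
```

In this run no timeout is due after `h1`, the fast timeout is due after `h2`, and both timeouts are due after `h3`, at which point ihp wraps to 0.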
50Preliminary Results (Full Search)
System                                                           ND        Priority    Lazily-Timed
Basic (1 rate, 3 components, 2 events per hyper-period)          20, .12s  12, .11s    14, .11s
Multi-Rate (2 rates, 6 components, 6 events per hyper-period)    120k, 5m  100, .38s   33, .19s
Modal (3 rates, 8 components, 125 events per hyper-period)       3M, ?     9.1k, 8.6s  900, 1.3s
Medium (2 rates, 50 components, 820 events per hyper-period)     13M, ?    740k, 29m   4k, 8.6s
51Functional Properties
Property I: System never reaches a state where TacticalSteering and NavSteering are both disabled
Property II: If navSteering is enabled when 20Hz timeout occurs, then airFrame should fetch navSteering data before end of frame
52Lack of Model Analysis
Boeing OEP Challenge Problems
If component 1 is in mode A when component 2
produces event E, then component 3 will consume
event F (Section 4.1.5.3.6)
A temporal property well-suited for
model-checking!
53Event-based Specifications
- Many properties of interest in this domain are event oriented
- Some require access to state information
- A state qualified event (written e f)
- defines the occurrence of an observable event, e, in a state that satisfies a given formula, f.
- For example,
- If component c1 is in mode A when component c2 produces event E, then component c3 will consume event F
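To make the notion of a state-qualified event concrete, the Python sketch below checks the example property on a finite trace of (event, state) pairs. It is a simplified stand-in for a model checker's property evaluation: the trace encoding, the event names such as `c2.E`, and the finite-trace reading of "will consume" are all assumptions of this sketch.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, str]
Step = Tuple[str, State]   # (observable event, state in which it occurs)


def occurs(step: Step, event: str, formula: Callable[[State], bool]) -> bool:
    """True if `step` is an occurrence of `event` in a state satisfying `formula`."""
    return step[0] == event and formula(step[1])


def responds(
    trace: List[Step],
    trigger_event: str,
    trigger_formula: Callable[[State], bool],
    response_event: str,
) -> bool:
    """Every state-qualified trigger is eventually followed by the response."""
    pending = False
    for step in trace:
        if occurs(step, trigger_event, trigger_formula):
            pending = True
        elif step[0] == response_event:
            pending = False
    return not pending


def in_mode_a(state: State) -> bool:
    return state["c1.mode"] == "A"


good = [("c2.E", {"c1.mode": "A"}), ("c3.F", {"c1.mode": "A"})]
bad = [("c2.E", {"c1.mode": "A"}), ("c2.E", {"c1.mode": "B"})]
vacuous = [("c2.E", {"c1.mode": "B"})]

results = [responds(t, "c2.E", in_mode_a, "c3.F") for t in (good, bad, vacuous)]
```

The third trace satisfies the property vacuously: event E occurs, but not in a state where c1 is in mode A, so no response is required.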
54Exploiting System Structure
- Symmetry reductions
- Equivalence between states
- Partial order reductions
- Commutativity between transitions
- Collapse compression
- Sharing between state components
- Can we exploit the time-triggered nature of
real-time systems?
55A Simple Transition System
l1: y := 0; goto l2
l2: x := 0; goto l3
l3: true -> x := 2; goto l4
    true -> x := 3; goto l4
l4: y := y + x; goto l5
l5: y > 5 -> skip; goto end
    y <= 5 -> skip; goto l2
end
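The state space of this small system can be enumerated directly. The Python sketch below assumes the operators lost in extraction read as y := 0, x := 0, y := y + x, with guards y > 5 and y <= 5 at l5, and that both variables start at 0; under a different reading the numbers would change.

```python
from typing import List, Set, Tuple

State = Tuple[str, int, int]   # (location, x, y)


def successors(state: State) -> List[State]:
    """Transition relation of the simple system (operators are assumed)."""
    loc, x, y = state
    if loc == "l1":
        return [("l2", x, 0)]
    if loc == "l2":
        return [("l3", 0, y)]
    if loc == "l3":
        return [("l4", 2, y), ("l4", 3, y)]   # non-deterministic choice
    if loc == "l4":
        return [("l5", x, y + x)]
    if loc == "l5":
        return [("end", x, y)] if y > 5 else [("l2", x, y)]
    return []                                   # "end" has no successors


def reachable(initial: State) -> Set[State]:
    """Plain depth-first exploration with a visited set."""
    seen = {initial}
    stack = [initial]
    while stack:
        s = stack.pop()
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


states = reachable(("l1", 0, 0))
final_y = {y for loc, _, y in states if loc == "end"}
x_at_l3 = {x for loc, x, _ in states if loc == "l3"}
y_at_l3 = {y for loc, _, y in states if loc == "l3"}
```

Every state at l3 has x = 0 while y varies, so x keeps returning to a fixed value while y is free to differ between visits.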
56System State Space
57State Space Decomposition
58Synopsis
- A system is quasi-cyclic if
- A subset of its state variables repeatedly reach certain fixed values (e.g., initial values)
- The rest of the variables can vary freely
- Decompose DFS of quasi-cyclic system
- BFS of quasi-cyclic regions
- DFS within regions
- Memory bounded by region DFS
- Time penalty due to redundant state visits
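A minimal Python sketch of the decomposition follows: an outer breadth-first search over region entry states, and an inner depth-first search whose visited set is discarded after each region. It is a sketch of the idea under simplifying assumptions, not the Bogor implementation; the toy system (a phase counter that cycles through 0..3 and a mode bit that may flip at phase 2) and all names are invented for the example.

```python
from collections import deque
from typing import Callable, List, Set, Tuple

State = Tuple[int, int]   # (phase, mode) for the toy system below


def region_dfs(entry, successors, is_boundary):
    """DFS from one region entry, stopping at the next boundary states."""
    seen = {entry}
    stack = [entry]
    next_entries = set()
    while stack:
        s = stack.pop()
        for t in successors(s):
            if is_boundary(t):
                next_entries.add(t)      # start of another region
            elif t not in seen:
                seen.add(t)
                stack.append(t)
    return seen, next_entries


def quasi_cyclic_search(initial, successors, is_boundary):
    """BFS over region entries; only entries are remembered globally."""
    entries_seen = {initial}
    work = deque([initial])
    visits = 0    # total state visits, counting repeats across regions
    peak = 0      # largest visited set held by any single region DFS
    while work:
        entry = work.popleft()
        seen, next_entries = region_dfs(entry, successors, is_boundary)
        visits += len(seen)
        peak = max(peak, len(seen))
        for e in next_entries - entries_seen:
            entries_seen.add(e)
            work.append(e)
    return entries_seen, visits, peak


def toy_successors(state: State) -> List[State]:
    phase, mode = state
    nxt = (phase + 1) % 4                 # phase repeatedly returns to 0
    modes = [mode, 1 - mode] if phase == 2 else [mode]
    return [(nxt, m) for m in modes]


entries, visits, peak = quasi_cyclic_search(
    (0, 0), toy_successors, lambda s: s[0] == 0
)
```

The toy system has 8 distinct states, but the search performs 10 state visits because states at phase 3 are explored in both regions, while no single region DFS holds more than 5 states at a time. That is the memory-for-time trade described in the bullets above.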
59Cadena models are quasi-cyclic
60Parallel Quasi-cyclic Search
- Region DFS are completely independent
- Embarrassingly parallel
- Naïve implementation on 4 processors overcomes overhead
- This summer
- check realistic scenario
- 400 components, 130 modal components
61Ultimate Modeling View
CCM IDL Model Layer
Check mode behaviors, temporal properties, timing
constraints
Generate code, fill-in skeletons, check for
refinement
We don't do all of this yet!
62Some Ongoing Work
- Integration of specification support into Cadena
- Model generation from Cadena
- Counter-example display
- Refinement checking (e.g., dependences against state machines)
- Incorporating synchronization info
- Modeling distributed nodes
- Incorporating time into spec language