Formal Verification Methods Applied to Compute Cluster Software
Ganesh Gopalakrishnan, Mike Kirby, Robert Palmer, Yu Yang, Salman Pervez, Geof Sawaya, Subodh Sharma, Igor Melatti, Sonjong Hwang, Michael DeLisi
School of Computing and SCI Institute, University of Utah, Salt Lake City, UT 84112
Introduction
In the process of writing or optimizing High Performance Computing (HPC) software, which today mostly uses MPI, designers can inadvertently introduce errors due to (i) the richness of MPI, (ii) the associated complexity of the MPI semantics, and (iii) the well-known fact that concurrent programming is tricky. Traditional debugging methods (such as software testing and running incomplete execution analyzers) are insufficiently incisive.
The Utah Gauss group is researching several pragmatic adaptations of Formal Methods, and is developing methodological and tool support that (i) enhances designer understanding of MPI parallel and distributed program semantics, (ii) helps designers develop more efficient MPI programs by reducing communication overheads, and (iii) helps detect bugs such as deadlocks and other invariant violations early.
Preliminary results include (i) a formal semantics of (a growing number of) MPI primitives, (ii) a program analysis framework based on the Microsoft Phoenix compiler that can analyze and help debug MPI programs, (iii) case studies involving tricky uses of MPI one-sided communication, and (iv) an in-situ model checker that can directly run MPI programs as if in a model checker.
Conclusions
As the stakes grow higher in HPC (expensive clusters, shortness of active professional lives), more incisive specification and verification techniques are essential. HPC is a rapidly growing area, directly tied to growth in compute capabilities (e.g. threads and multicores) and to simulation aspirations heading toward Petaflop computing. Informal specification techniques and ad hoc validation techniques cannot serve as the foundation for building debugging tools in such a critical area. On the other hand, the use of non-scalable formal verification methods is also ineffectual.
The Utah Gauss Group has expertise in formal methods and HPC, collaborates with experts (e.g. Argonne National Labs), and is examining how best to make an impact using formal methods in HPC. Our experience has been that in such a fast-moving area, agile de facto standards such as MPI provide ample opportunities to pursue the formal route. Using model checking, we recently formally verified a tricky byte-range locking algorithm that uses MPI 1-sided communication (one of three EuroPVM/MPI 06 distinguished papers). Our formalization of MPI has already revealed several imprecise specifications in the standard.
Our long-term goal is to demonstrate these ideas and ultimately incorporate them into the Microsoft Compute Cluster Software as well as open source releases of MPI. Formal methods for handling multithreading in MPI, as well as formal test generation for MPI libraries, are also planned.
What is Model Checking?
Abstraction / Refinement / Model Checking
Execution Checking
Navier-Stokes Equations are a mathematical model of fluid flow physics
V&V: Validation and Verification
Validate Models, Verify Codes
proctype MPI_Send(chan out, int c) { out!c }
proctype MPI_Bsend(chan out, int c) { out!c }
proctype MPI_Isend(chan out, int c) { out!c }
typedef MPI_Status { int MPI_SOURCE; int MPI_TAG; int MPI_ERROR }
MPI Library Model
int y
active proctype T1() int x x  1 if  x  0  x  2 fi y  x
active proctype T2() int x x  2 if  y  x  1  y  0 fi assert( y  0 )
Compiler
Program Model
Model Generator
Formal models can be generated either automatically or by a modeler who translates and abstracts algorithms and implementations.
[Figure: abstraction / refinement / model checking loop. Labels: Environment Model, Model Checking, Error Simulator, Refinement, Abstractor, Model Checker, Result Analyzer, MC Client (x9), OK]
Case Study 1: Byte Range Locking using MPI 1-sided communication (EuroPVM/MPI 2006)
Materials and methods
We begin with a formal and executable specification of MPI communication semantics expressed in TLA+ (Lamport, MSR). Using Phoenix, we extract a control skeleton of the MPI program, with its constituent MPI calls represented using the descriptions in our formal semantics. We are developing many abstraction methods that can reduce the fraction of the state space that must be visited to verify properties of these models. Using TLC (a model checker for TLA+ from MSR), we apply finite-state model checking methods that can detect deadlocks and assertion violations. Many of these assertions are inserted automatically (e.g., is every MPI_Isend followed by an MPI_Wait or MPI_Test?).
An in-situ model checker (under construction) arranges to interrupt control flow before any MPI call is attempted, and transfers control to a scheduler. In effect, each MPI process is forced to seek permission from this scheduler as to when it may go ahead and make the MPI call. The scheduler permits one interleaved execution to manifest; thereafter, it runs enough variants of the interleaving order to exhaust the execution space up to a certain depth bound. Abstraction procedures (under development) are expected to generate simpler programs to verify that contain the same classes of error. Ways to parallelize in-situ model checking are also under investigation.
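The depth-bounded exploration described above can be illustrated with a small sketch. The Python below is not the Gauss in-situ checker: it treats each process as a hypothetical list of MPI call names, assumes every pending call is always enabled (no blocking semantics), and enumerates every order in which a scheduler could grant the calls, up to a depth bound. The function name `interleavings` and the two sample processes are assumptions made for illustration.

```python
from typing import Iterator, List, Tuple

Step = Tuple[int, str]  # (process id, MPI call name)


def interleavings(procs: List[List[str]], depth: int) -> Iterator[Tuple[Step, ...]]:
    """Yield every schedule of at most `depth` steps, one MPI call per step."""

    def go(pcs: Tuple[int, ...], trace: List[Step]) -> Iterator[Tuple[Step, ...]]:
        # A process is enabled while it still has calls left to issue.
        enabled = [p for p, pc in enumerate(pcs) if pc < len(procs[p])]
        if not enabled or len(trace) == depth:
            yield tuple(trace)  # finished, or depth bound reached
            return
        for p in enabled:
            call = procs[p][pcs[p]]
            next_pcs = pcs[:p] + (pcs[p] + 1,) + pcs[p + 1:]
            yield from go(next_pcs, trace + [(p, call)])

    yield from go(tuple(0 for _ in procs), [])


# Hypothetical control skeletons of two MPI processes.
procs = [["MPI_Isend", "MPI_Wait"], ["MPI_Irecv", "MPI_Wait"]]
schedules = list(interleavings(procs, depth=4))
for schedule in schedules:
    print(schedule)
```

Two processes with two calls each already yield six complete schedules, which suggests why abstraction and a depth bound matter as programs grow.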
Lock Acquire
Lock Release
Deadlock!
lock_acquire (start, end) {
  /* Stage 1 */
  val[0] = 1; /* flag */  val[1] = start;  val[2] = end;
  while (1) {
    lock_win
    place val in win
    get values of other processes from win
    unlock_win
    for all i, if (Pi conflicts with my range)
      conflict = 1;
    /* Stage 2 */
    if (conflict) {
      val[0] = 0;
      lock_win
      place val in win
      unlock_win
      MPI_Recv(ANY_SOURCE)
    }
    else {
      /* lock is acquired */
      break;
    }
  } // end while
}
lock_release (start, end) {
  val[0] = 0; /* flag */  val[1] = -1;  val[2] = -1;
  lock_win
  place val in win
  get values of other processes from win
  unlock_win
  for all i, if (Pi conflicts with my range)
    MPI_Send(Pi);
}
Process 0 Process 1
lock_acquire(3,5) lock_release() lock_acquire(3,5) lock_acquire(3,5)
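To see how exhaustive interleaving search can expose a deadlock in a protocol of this shape, here is a minimal explicit-state sketch in Python. It is a simplified model, not the authors' formal model, and it rests on several assumptions: each lock_win ... unlock_win section is one atomic step; a "conflict" means the other process has its flag set and an overlapping range; Process 0 runs acquire, release, acquire while Process 1 runs a single acquire (one reading of the scenario above); and both use the range (3, 5). All names (`PROGRAMS`, `successors`, `find_deadlock`) are hypothetical.

```python
from collections import deque

PROGRAMS = (("acquire", "release", "acquire"), ("acquire",))  # assumed scenario
RANGE = (3, 5)      # both processes lock bytes 3..5
IDLE = (0, -1, -1)  # window triple: flag, start, end


def overlaps(a, b):
    """True if the byte ranges stored in two window triples intersect."""
    return a[1] <= b[2] and b[1] <= a[2]


def successors(state):
    """Yield ((pid, label), next_state) for every process that can step."""
    pcs, stages, win, mail = state
    for p, prog in enumerate(PROGRAMS):
        if pcs[p] == len(prog):
            continue  # process p has finished its program
        pc, st, w, m = list(pcs), list(stages), list(win), list(mail)
        op = prog[pcs[p]]
        if stages[p] == "reset":    # stage 2: clear own flag, then wait
            w[p] = (0,) + RANGE
            st[p] = "recv"
            label = "reset flag"
        elif stages[p] == "recv":   # blocked in MPI_Recv(ANY_SOURCE)
            if mail[p] == 0:
                continue            # nothing to receive: not enabled
            m[p] -= 1
            st[p] = "run"
            label = "recv"
        elif op == "acquire":       # stage 1: put own triple, read the others
            w[p] = (1,) + RANGE
            conflict = any(w[q][0] == 1 and overlaps(w[p], w[q])
                           for q in range(len(w)) if q != p)
            if conflict:
                st[p] = "reset"
            else:
                pc[p] += 1          # lock acquired
            label = "acquire stage 1"
        else:                       # release: clear triple, wake conflicting waiters
            w[p] = IDLE
            for q in range(len(w)):
                if q != p and w[q][0] == 1 and overlaps((1,) + RANGE, w[q]):
                    m[q] += 1       # models MPI_Send(Pq)
            pc[p] += 1
            label = "release"
        yield (p, label), (tuple(pc), tuple(st), tuple(w), tuple(m))


def find_deadlock():
    """Breadth-first search for a state where every process is stuck in recv."""
    n = len(PROGRAMS)
    start = ((0,) * n, ("run",) * n, (IDLE,) * n, (0,) * n)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if all(s == "recv" for s in state[1]) and not any(state[3]):
            trace = []
            while parent[state] is not None:
                state, step = parent[state]
                trace.append(step)
            return trace[::-1]
        for step, nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = (state, step)
                queue.append(nxt)
    return None


trace = find_deadlock()
for pid, label in trace or []:
    print(f"P{pid}: {label}")
```

Under these assumptions the search reports a trace in which both processes deduce a conflict, reset their flags, and block on receive with no message in flight. Whether this matches the exact interleaving in the published case study is not something the sketch establishes.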
Literature cited
Robert Palmer, Ganesh Gopalakrishnan, and Robert M. Kirby. The Communication Semantics of the Message Passing Interface. Technical Report UUCS-06-012, School of Computing, University of Utah, 2006.
Robert Palmer, Steve Barrus, Yu Yang, Ganesh Gopalakrishnan, and Robert M. Kirby. Gauss: A framework for verifying scientific computing software. Workshop on Software Model Checking, 2005. Electronic Notes in Theoretical Computer Science (ENTCS), No. 953.
Salman Pervez, Ganesh Gopalakrishnan, Robert M. Kirby, Rajeev Thakur, and William Gropp. Formal verification of programs that use MPI one-sided communication. Recent Advances in Parallel Virtual Machine and Message Passing Interface (EuroPVM/MPI), LNCS 4192, pages 30-39, 2006. (Outstanding Paper)
[Figure: deadlock scenario for byte-range locking. Each window snapshot lists (flag, start, end) per process; entries shown include (0, -1, -1) and (0, 3, 5). P0 and P1 each deduce a conflict, enter Stage 2, and block on receive: DEADLOCK.]
Acknowledgments
Supported in part by Microsoft Corporation under the Microsoft HPC Institutes program, and NSF CNS 0509379. We also acknowledge our collaborations with Argonne National Labs (R. Thakur and W. Gropp).
Case Study 2: Parallel / Distributed Model Checking in Eddy (SPIN 2006)
MPI Thread Funneled, Dist Mem Model Checking
Results from Parallel / Distributed Model 
Checking
Worker Thread
Communication Thread
MPI User Program Instrumented at MPI Functions 
 (e.g., MPI_SEND, RECV, WIN_LOCK)
Scheduler that receives process requests, and 
permits one interleaving at a time
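The permission protocol named in these labels can be sketched with ordinary threads. The Python below is an illustration under assumptions, not the instrumented MPI runtime: each "process" is a thread, `Scheduler.request` stands in for the instrumentation placed at each MPI function, and the scheduler grants permission in one fixed, caller-supplied order. The `order` list must name each process exactly as many times as it has calls, otherwise the sketch will hang.

```python
import threading


class Scheduler:
    """Grants permission to one process at a time, following a fixed order."""

    def __init__(self, order):
        self.order = list(order)  # process ids, one entry per MPI call
        self.turn = 0
        self.log = []             # calls in the order they were permitted
        self.cond = threading.Condition()

    def request(self, pid, call_name):
        """Block until the schedule names `pid`, then let its call proceed."""
        with self.cond:
            self.cond.wait_for(lambda: self.order[self.turn] == pid)
            self.log.append((pid, call_name))  # the real MPI call would run here
            self.turn += 1
            self.cond.notify_all()


def run(programs, order):
    """Run each program in its own thread under a single interleaving."""
    sched = Scheduler(order)

    def process(pid, calls):
        for name in calls:
            sched.request(pid, name)

    threads = [threading.Thread(target=process, args=(pid, calls))
               for pid, calls in enumerate(programs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sched.log


# Hypothetical call sequences for two processes.
programs = [["MPI_Isend", "MPI_Wait"], ["MPI_Irecv", "MPI_Wait"]]
log = run(programs, order=[1, 0, 0, 1])
print(log)
```

Re-running `run` with a different `order` replays a different interleaving of the same program, which is the hook a model checker needs to cover the execution space systematically.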
Consumption Queue
Evolution of Line State
For further information
Please check our project website: http://www.cs.utah.edu/formal_verification
Email contacts: ganesh, kirby _at_ cs.utah.edu
School of Computing and SCI Institute, University of Utah
Communication thread: receive and process inbound messages; initiate Isends; check completion of Isends.
Worker thread: take state off consumption queue; expand state (get new set of states); make decision about set of states.
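The division of labor listed above can be sketched as a single-process simulation. This is not Eddy's implementation: the toy transition relation `successors`, the `owner` function (a plain hash modulo the node count), and the synchronous delivery loop that stands in for MPI_Isend are all assumptions chosen to keep the example self-contained.

```python
from collections import deque


def successors(state, modulus=50):
    """Hypothetical toy transition relation over small integers."""
    return [(2 * state) % modulus, (state + 3) % modulus]


def owner(state, n_nodes):
    """Assign each state to exactly one node by hashing it."""
    return hash(state) % n_nodes


class Node:
    def __init__(self, rank, n_nodes):
        self.rank, self.n_nodes = rank, n_nodes
        self.visited = set()
        self.consumption = deque()    # states waiting to be expanded locally
        self.communication = deque()  # (destination, state) waiting to be sent

    def receive(self, state):
        """Communication-thread role: accept an inbound state if it is new."""
        if state not in self.visited:
            self.visited.add(state)
            self.consumption.append(state)

    def work(self):
        """Worker-thread role: expand one state and route its successors."""
        state = self.consumption.popleft()
        for nxt in successors(state):
            dest = owner(nxt, self.n_nodes)
            if dest == self.rank:
                self.receive(nxt)                      # owned locally
            else:
                self.communication.append((dest, nxt))  # owned elsewhere


def explore(initial, n_nodes):
    """Explore the state space, partitioned across `n_nodes` simulated nodes."""
    nodes = [Node(rank, n_nodes) for rank in range(n_nodes)]
    nodes[owner(initial, n_nodes)].receive(initial)
    while any(n.consumption or n.communication for n in nodes):
        for node in nodes:
            if node.consumption:
                node.work()
            while node.communication:  # stands in for asynchronous sends
                dest, state = node.communication.popleft()
                nodes[dest].receive(state)
    return nodes


nodes = explore(initial=0, n_nodes=3)
for node in nodes:
    print(node.rank, len(node.visited))
```

Because each state has exactly one owner, no state is stored or expanded on more than one node, so memory for the visited set is spread across the nodes.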
[Figure labels: Hash; Communication Queue; P1, P2; "Permesso?", "Avanti!"; WTBA, Active, WTBS, CBS]