Title: Specifying Multithreaded Java semantics for Program Verification
1 Specifying Multithreaded Java semantics for Program Verification
- Abhik Roychoudhury
- National University of Singapore
- (Joint work with Tulika Mitra)
2 Java Multithreading
- Threads communicate via shared variables.
- Threads run on uni- or multi-processors
- Semantics given as abstract rules: the Java Memory Model (JMM). Any multithreading implementation must respect the JMM.
- Supports locking via synchronized statements
- Locking may be avoided for performance.
3 Unsynchronized code
Fully synchronized version:
    synchronized (Singleton.class) {
        if (inst == null)
            inst = new Singleton();
        return inst;
    }
Double-checked locking (first check made without the lock):
    if (inst == null) {
        synchronized (Singleton.class) {
            if (inst == null)
                inst = new Singleton();
        }
    }
    return inst;
4 Shared variable access without locks
Initially A = 0, B = 0
Thread 1: B = 1; A = 1;
Thread 2: while (A != 1) {} return B;
Expected returned value = 1
5 What may happen
Thread 1: B = 1; A = 1;
Thread 2: while (A != 1) {} return B;
Possible execution order: A = 1; return B; B = 1;
Returned value = 0
This may happen if the threads are executed on different processors, or even through compiler re-ordering.
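6 Sequential Consistency
Programmers expect the statements within a thread to complete in program order. This is Sequential Consistency (SC):
- Each thread proceeds in program order
- Operations across threads are interleaved
- An interleaving such as Op1 Op Op2 Op, which breaks a thread's program order, violates SC
[Figure: two threads and their operations, labeled Op, Op, Op1, Op2]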
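The difference between an SC execution and a re-ordered one can be checked by brute force on the two-thread program of slides 4 and 5. The following is a minimal sketch, not part of the talk: the class `ScInterleavings` and its methods are hypothetical names, and the reader's spin loop is modeled as a step that is simply blocked while A != 1. It enumerates every interleaving that keeps each thread in program order and collects the values the reader can return.

```java
import java.util.Set;
import java.util.TreeSet;

class ScInterleavings {
    /**
     * Writer program: each element names the shared variable that is set to 1.
     * Reader program (fixed): wait until A == 1, then return B.
     * Returns every value the reader can return under interleaving semantics.
     */
    static Set<Integer> outcomes(char[] writer) {
        Set<Integer> results = new TreeSet<>();
        explore(writer, 0, false, 0, 0, results);
        return results;
    }

    // wPc: next writer statement; exited: reader has left its loop; a, b: shared variables.
    private static void explore(char[] writer, int wPc, boolean exited,
                                int a, int b, Set<Integer> results) {
        // Option 1: the writer executes its next statement.
        if (wPc < writer.length) {
            int na = (writer[wPc] == 'A') ? 1 : a;
            int nb = (writer[wPc] == 'B') ? 1 : b;
            explore(writer, wPc + 1, exited, na, nb, results);
        }
        // Option 2: the reader takes a step.
        if (!exited) {
            // The loop exits only once A == 1; otherwise the reader spins (no state change).
            if (a == 1) explore(writer, wPc, true, a, b, results);
        } else {
            results.add(b); // "return B" executed in this state
        }
    }

    public static void main(String[] args) {
        System.out.println(outcomes(new char[] {'B', 'A'})); // program order: prints [1]
        System.out.println(outcomes(new char[] {'A', 'B'})); // writes re-ordered: prints [0, 1]
    }
}
```

With the writer kept in program order, only 1 is reachable; swapping the two writes, as a compiler or processor may do, makes 0 reachable as well.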
7 Is this a problem?
- Programmers expect SC
- Verification techniques assume SC execution
- SC is not guaranteed by execution platforms
- It is not demanded by the Java language spec.
- It is unrealistic for any future spec. to demand it
- YES!!
8 Organization
- Shared variable access without locks
- Candidate solutions
- Specifying the Java Memory Model (JMM)
- Using JMM for verification
9 1. Program with caution
- Always synchronize (and acquire lock) before writing a shared object
- For these programs, any execution is SC
- Unacceptable performance overheads for low-level libraries
- Software libraries from other sources cannot be guaranteed to be properly synchronized
10 2. Change the Semantics
- Current semantics called Java Memory Model (JMM). Part of Java language spec.
- Weaker than Sequential Consistency. Allows certain re-orderings within threads
- Currently being considered for revision
- Existing/future JMM bound to be weaker than SC
Does not solve the problem
11 3. Use the semantics
- Develop a formal executable description of the Java Memory Model
- Use it for program debugging, verification
- JMM captures all possible behaviors for any implementation: platform-independent reasoning
- Adds value to existing program verification techniques
12 Organization
- Shared variable access without locks
- Candidate solutions
- Specifying the Java Memory Model (JMM)
- Using JMM for verification
13 JMM Overview
- Each shared variable v has
- A master copy
- A local copy in each thread
- Threads read and write local/master copies by actions
- Imposes ordering constraints on actions
- Not a multithreaded implementation guide
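14 JMM Actions
[Figure: transfers between the master copy of v and the local copy of v in thread t]
- Master copy to local copy: read(t,v), then load(t,v)
- Local copy to master copy: store(t,v), then write(t,v)
Actions invoked by program execution:
- use/assign(t,v): read from / write into the local copy of v in t
- lock/unlock(t): acquire/release lock on shared variables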
15 Formal Specification
- Asynchronous concurrent composition
- Th1 || Th2 || ... || Thn || MM (the threads composed with main memory MM)
- Local state of each thread modeled as cache
- ( Local copy, Stale bit, Dirty bit )
- Queues for incomplete reads/writes
- Local state of MM
- ( Master copies, Lock ownership info )
16 Specifying JMM Actions
- Each action is a guarded command G -> B
- Evaluate G; if true, execute B atomically
- Example: use of variable v by thread t
- ¬stale[t,v] -> return local_copy[t,v]
- Applicability of action stated as guard
- In rule-based description, several rules determine the applicability
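A guarded command maps naturally onto a pair of functions, one for the guard and one for the body. The sketch below is an illustration only: `GuardedCommandSketch`, `CacheEntry`, and `GuardedCommand` are hypothetical names, the initial value of the stale bit is an assumption, and `load` is simplified to an always-enabled step that takes its value as a parameter (the talk's model takes it from a queue of incomplete reads). The model is single-threaded, so each call to `fire` counts as one atomic step.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

class GuardedCommandSketch {
    /** Thread-local view of one shared variable: (local copy, stale bit, dirty bit). */
    static final class CacheEntry {
        int localCopy;
        boolean stale = true;  // assumption: an entry starts out unusable until loaded
        boolean dirty = false; // set when an assigned value has not yet been stored
    }

    /** A guarded command G -> B: the body B may run only in states where G holds. */
    static final class GuardedCommand<T> {
        private final BooleanSupplier guard;
        private final Supplier<T> body;

        GuardedCommand(BooleanSupplier guard, Supplier<T> body) {
            this.guard = guard;
            this.body = body;
        }

        boolean enabled() {
            return guard.getAsBoolean();
        }

        /** Evaluates the guard and, if it holds, executes the body as one step. */
        T fire() {
            if (!enabled()) {
                throw new IllegalStateException("guard is false: the action is blocked");
            }
            return body.get();
        }
    }

    /** use(t,v): not stale[t,v] -> return local_copy[t,v] */
    static GuardedCommand<Integer> use(CacheEntry e) {
        return new GuardedCommand<>(() -> !e.stale, () -> e.localCopy);
    }

    /** Simplified load(t,v): installs a value in the cache entry and clears the stale bit. */
    static GuardedCommand<Void> load(CacheEntry e, int value) {
        return new GuardedCommand<>(() -> true, () -> {
            e.localCopy = value;
            e.stale = false;
            return null;
        });
    }
}
```

Stating applicability as a guard keeps each action's precondition in one place: a `use` on a fresh entry is blocked until some `load` has fired.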
17 Understanding JMM
assign(t,v) < load(t,v)  =>  assign(t,v) < store(t,v) < load(t,v)
A store must intervene between an assign and a load action for a variable v by thread t
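A rule of this kind can be checked mechanically on a recorded sequence of actions. The sketch below is a hypothetical helper, not the talk's tool: `OrderingRuleCheck` and its `Action` enum are illustrative names, and the trace is assumed to contain only the actions of one thread t on one variable v, in the order they occurred.

```java
import java.util.List;

class OrderingRuleCheck {
    enum Action { ASSIGN, STORE, WRITE, READ, LOAD, USE }

    /**
     * Returns true if every load that follows an assign has a store
     * somewhere between the two.
     */
    static boolean storeIntervenes(List<Action> trace) {
        boolean pendingAssign = false; // an assign seen with no store after it yet
        for (Action a : trace) {
            switch (a) {
                case ASSIGN:
                    pendingAssign = true;
                    break;
                case STORE:
                    pendingAssign = false;
                    break;
                case LOAD:
                    if (pendingAssign) {
                        return false; // assign < load with no store in between
                    }
                    break;
                default:
                    break; // other actions do not affect this rule
            }
        }
        return true;
    }
}
```

Tracking only the most recent assign is enough: if that assign has a store before the load, so does every earlier one.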
18 Understanding JMM
assign(t,v) < load(t,v)
=>  assign(t,v) < store(t,v) < load(t,v)
=>  assign(t,v) < store(t,v) < write(t,v) < read(t,v) < load(t,v)
read/write actions on the master copy of v by thread t are performed in the order of the corresponding load/store actions
19 Understanding JMM
- Applicability of an action depends on several rules in the rule-based description
- This is what makes the model hard to understand
- Operational description specifies applicability as a guard
assign(t,v): empty(read_queue[t,v]) -> ...
20 Executable model
- Java threads invoke use, assign, lock, unlock
- Action executed by corresponding commands
- Threads block if the next action is not enabled
- To enable these, store, write, load, read are executed in any order provided the guard holds
- Example: to enable assign, a load is executed (if there was an earlier read)
21 Organization
- Shared variable access without locks
- Candidate solutions
- Specifying the Java Memory Model (JMM)
- Using JMM for verification
22 Verifying Unsynchronized Code
Thread 1: assign(B,1); assign(A,1)
Thread 2: while (use(A) != 1) {} use(B)
- use/assign invoke corresponding guarded commands
- load/store/read/write executed to enable use/assign
- Exhaustive state space exploration shows use(B) may return 1 or 0
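The exploration can be imitated on a drastically simplified model. The sketch below is not the talk's executable JMM; it rests on these assumptions: store and write are merged into a single flush step, read and load are merged so that the reader always sees the current master copy, and locks and queues are omitted. `TinyJmmExplorer` is a hypothetical name. Even in this reduced form, the decoupling of assign from the later flush is enough to make both results reachable.

```java
import java.util.Set;
import java.util.TreeSet;

class TinyJmmExplorer {
    static final int A = 0, B = 1;

    /** Values use(B) can return over all executions of the simplified model. */
    static Set<Integer> explore() {
        Set<Integer> results = new TreeSet<>();
        step(0, new int[2], new boolean[2], new int[2], false, results);
        return results;
    }

    // pc: next writer statement (0: assign(B,1), 1: assign(A,1), 2: finished)
    // local, dirty: the writer's cache; master: main memory; exited: reader left its loop
    private static void step(int pc, int[] local, boolean[] dirty, int[] master,
                             boolean exited, Set<Integer> results) {
        // Writer thread: the next assign in program order updates only the local copy.
        if (pc < 2) {
            int v = (pc == 0) ? B : A;
            int[] l = local.clone();
            boolean[] d = dirty.clone();
            l[v] = 1;
            d[v] = true;
            step(pc + 1, l, d, master, exited, results);
        }
        // Memory actions: any dirty variable may be flushed (store + write), in any order.
        for (int v = 0; v < 2; v++) {
            if (dirty[v]) {
                int[] m = master.clone();
                boolean[] d = dirty.clone();
                m[v] = local[v];
                d[v] = false;
                step(pc, local, d, m, exited, results);
            }
        }
        // Reader thread: the loop exits once the master copy of A is 1, then use(B).
        if (!exited) {
            if (master[A] == 1) {
                step(pc, local, dirty, master, true, results);
            }
        } else {
            results.add(master[B]);
        }
    }

    public static void main(String[] args) {
        System.out.println(explore()); // prints [0, 1]
    }
}
```

The result 0 arises on the path where A is flushed to main memory before B, although the writer assigned B first.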
23 Program verification
- Composing executable JMM allows search
- The state space explosion problem?
- Most Java programs are properly synchronized, and hence have SC execution
- Unsynchronized code appears in low-level fragments which are frequently executed
- These programs are small, so ?
24 One possibility
- User chooses one program path in each thread (creative step)
- Exhaustively check all possible execution traces from the chosen program paths (automated state space exploration; can verify invariants)
- Choosing program paths requires understanding source code, not JMM
25 Case Study: Double-Checked Locking
- Idiom for lazy instantiation of a singleton
- Check whether garbage data-fields can be returned by this object initialization routine
- Verification by composing the JMM reveals
- Incompletely instantiated object can be returned
- Due to re-ordering of writes within constructor
- Detected by prototype invariant checker in 0.15 sec
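To make "garbage data-fields" concrete, the idiom can be written out as a complete class with one data field. This is an illustrative sketch: the class name `LazySingleton` and the field `payload` are hypothetical, and the comments mark where the re-ordering found in the case study would bite. Run single-threaded, the code behaves as intended; the problem appears only when a second thread reads `inst` without the lock.

```java
class LazySingleton {
    // Deliberately a plain (non-volatile) field, as in the idiom under study.
    private static LazySingleton inst;

    // Hypothetical data field initialized by the constructor.
    int payload;

    private LazySingleton() {
        // If this write is re-ordered after the write that publishes inst,
        // another thread can observe the object with payload still 0.
        payload = 42;
    }

    static LazySingleton getInstance() {
        if (inst == null) {                      // first check, made without the lock
            synchronized (LazySingleton.class) {
                if (inst == null) {              // second check, made under the lock
                    inst = new LazySingleton();  // allocate, initialize, publish
                }
            }
        }
        // A thread that skipped the synchronized block has no ordering guarantee
        // between its read of inst and the constructor's writes.
        return inst;
    }
}
```

An invariant of the form "a returned instance always has payload == 42" is the kind of property an exhaustive exploration can try to falsify.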
26 Summary
- Executable specification of Multithreaded Java semantics
- Using the specification for debugging multithreaded programs
- Similar approach has been studied before in the context of hardware multiprocessors
- How to correct the bugs found (the harmful re-orderings)?