Title: RELAY: Static Race Detection on Millions of Lines of Code
1. RELAY: Static Race Detection on Millions of Lines of Code
speaker
Jan Voung, Ranjit Jhala, and Sorin Lerner, UC San Diego
2. Definition of data race
- A data race is an event
  - between two threads, where there are
  - unordered accesses to the same memory location
  - and at least one of the accesses is a write
thread 1: temp = g;
thread 2: g = V;
3. Bugs from races
- Example bug
  - incorrect resource bounds
thread 1: temp = size;
thread 2: free(); size = size - n;
4. RELAY against data races
- RELAY finds data races
- RELAY is a static tool
  - analyzes the program before it runs
- RELAY is scalable
  - analyzed the Linux kernel (4.5 million LOC)
  - in 5 hours on 32 CPUs
- Found 53 races in a sample of 149 warnings
5. Outline
- Introduction
- Computing locksets and recording accesses
  - Relative locksets
  - Guarded accesses
- Experiments with Linux
  - Categorizing warnings and false positives
  - Filters targeting categories
6. Checking with Locksets
- Locks are a mechanism for mutual exclusion
  - Only one thread holds a particular lock at a time.
- No race if the same lock must have been acquired for each of two different shared accesses
thread 1: lock(l); temp = g; unlock(l); read(g)
thread 2: lock(l); g = 0
[Diagram labels: locks held by each thread (l); common lock: l]
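The lockset check described on this slide can be sketched in C as follows. This is a minimal illustration under stated assumptions, not RELAY's implementation: the bitmask encoding of locksets, the `access_t` record, and the names `may_race` and `demo_common_lock` are all invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: a lockset is a bitmask with one bit per lock. */
typedef uint32_t lockset_t;
#define LOCK_L ((lockset_t)1u << 0)   /* stands for the lock l on the slide */

/* One shared access: which location, read or write, and the locks held. */
typedef struct {
    int       location;  /* hypothetical id of a memory location */
    bool      is_write;
    lockset_t held;      /* locks that must be held at the access */
} access_t;

/* Accesses from two threads may race when they touch the same location,
   at least one of them is a write, and they hold no lock in common. */
bool may_race(access_t a, access_t b)
{
    if (a.location != b.location) return false;
    if (!a.is_write && !b.is_write) return false;
    return (a.held & b.held) == 0;
}

/* The slide's example: one thread reads g, the other writes g, both under l. */
bool demo_common_lock(void)
{
    enum { G = 0 };                       /* id chosen for the global g */
    access_t t1_read  = { G, false, LOCK_L };
    access_t t2_write = { G, true,  LOCK_L };
    return may_race(t1_read, t2_write);   /* common lock l, so no race */
}
```

Calling `demo_common_lock()` reports no race because the intersection of the two locksets contains `l`. Dropping the lock from either access empties the intersection and flips the answer.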
7. A more realistic (but simple) example
  work(void *d) {
    o = d->priv;
    lock(o->lock);
    read_stats(o);
  }

  read_stats(x) {
    if (x->f1 < 10) {
      unlock(x->lock);
      x->stats = ...;
    } else {
      unlock(x->lock);
    }
  }
No race: read/write with lock (x->f1)
Race: write without lock (x->stats)
8. Key components of RELAY
GOAL: scalability. KEY: modularity.
  work(void *d) { o = d->priv; lock(o->lock); read_stats(o); }
  read_stats(x) { if (x->f1 < 10) { unlock(x->lock); x->stats = ...; } else { unlock(x->lock); } }
[Diagram: call graph with nodes work( ) and read_stats( )]
9. Key components of RELAY
1) Relative locksets: locks acquired/released in the function; the caller handles locks held before the call
   L+: locks that MUST have been acquired. L-: locks that MAY have been released.
2) Guarded accesses: pair accesses with relative locksets to catch races
3) Summaries
4) Symbolic execution: what is the same memory location? Normalize to globals and formals.

  work(void *d) {
    o = d->priv;
    lock(o->lock);         L+ = {d->priv->lock}, L- = {}
    read_stats(o);
  }

  read_stats(x) {          L+ = {}, L- = {}
    if (x->f1 < 10) {
      unlock(x->lock);     L+ = {}, L- = {x->lock}
      x->stats = ...;
    } else {
      unlock(x->lock);
    }
  }                        L+ = {}, L- = {x->lock}
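The relative lockset idea, with L+ holding locks that must have been acquired and L- holding locks that may have been released since function entry, can be sketched with three small transfer functions. This is an illustration under assumptions, not RELAY's code: the bitmask encoding, the type `rel_lockset_t`, and the functions `rl_lock`, `rl_unlock`, `rl_join`, and `summarize_read_stats` are hypothetical names. The join rule (intersect L+, union L-) follows from the must/may reading of the two sets.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t lockset_t;            /* assumed: one bit per lock */
#define X_LOCK ((lockset_t)1u << 0)    /* stands for x->lock */

/* Relative lockset: changes since function entry. */
typedef struct {
    lockset_t plus;   /* L+: locks that MUST have been acquired */
    lockset_t minus;  /* L-: locks that MAY have been released  */
} rel_lockset_t;

/* lock(l): l is now definitely held and no longer counts as released. */
rel_lockset_t rl_lock(rel_lockset_t s, lockset_t l)
{
    s.plus  |= l;
    s.minus &= ~l;
    return s;
}

/* unlock(l): l is no longer definitely held and may have been released. */
rel_lockset_t rl_unlock(rel_lockset_t s, lockset_t l)
{
    s.plus  &= ~l;
    s.minus |= l;
    return s;
}

/* Control-flow merge: "must" sets intersect, "may" sets union. */
rel_lockset_t rl_join(rel_lockset_t a, rel_lockset_t b)
{
    rel_lockset_t r = { a.plus & b.plus, a.minus | b.minus };
    return r;
}

/* read_stats(x) from the slide: both branches unlock x->lock. */
rel_lockset_t summarize_read_stats(void)
{
    rel_lockset_t entry       = { 0, 0 };                 /* L+ = {}, L- = {} */
    rel_lockset_t then_branch = rl_unlock(entry, X_LOCK);
    rel_lockset_t else_branch = rl_unlock(entry, X_LOCK);
    return rl_join(then_branch, else_branch);             /* L+ = {}, L- = {x->lock} */
}
```

Because the sets only describe what changed inside the function, the result for `read_stats` is independent of what any particular caller held, which is what makes it reusable as a summary.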
10. How RELAY runs
1) Assume symbolic execution ran.
2) Compute relative locksets
3) Compute guarded accesses

  work(void *d) { o = d->priv; lock(o->lock); read_stats(o); }

  read_stats(x) {          L+ = {}, L- = {}
    if (x->f1 < 10) {      access x->f1: L+ = {}, L- = {}
      unlock(x->lock);     L+ = {}, L- = {x->lock}
      x->stats = ...;      access x->stats: L+ = {}, L- = {x->lock}
    } else {
      unlock(x->lock);     L+ = {}, L- = {x->lock}
    }
  }

summary of read_stats:
  lockset:  L+ = {}, L- = {x->lock}
  x->f1:    L+ = {}, L- = {}
  x->stats: L+ = {}, L- = {x->lock}
11. How RELAY runs
  work(void *d) { o = d->priv; lock(o->lock); read_stats(o); }
  read_stats(x)

summary of read_stats:
  lockset:  L+ = {}, L- = {x->lock}
  x->f1:    L+ = {}, L- = {}
  x->stats: L+ = {}, L- = {x->lock}
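12. Applying summaries
  work(void *d) {          L+ = {}, L- = {}
    o = d->priv;
    lock(o->lock);         BEFORE the call: L+ = {d->priv->lock}, L- = {}
    read_stats(o);         AFTER the call:  L+ = {}, L- = {d->priv->lock}
  }

summary of work:
  lockset:  L+ = {}, L- = {d->priv->lock}

summary of read_stats(x) (the DIFFERENCE applied at the call):
  lockset:  L+ = {}, L- = {x->lock}
  x->f1:    L+ = {}, L- = {}
  x->stats: L+ = {}, L- = {x->lock}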
13. Applying summaries
  work(void *d) {          L+ = {}, L- = {}
    o = d->priv;           access d->priv: L+ = {}, L- = {}
    lock(o->lock);         BEFORE the call: L+ = {d->priv->lock}, L- = {}
    read_stats(o);         AFTER the call:  L+ = {}, L- = {d->priv->lock}
  }

summary of work:
  lockset:        L+ = {}, L- = {d->priv->lock}
  d->priv:        L+ = {}, L- = {}
  d->priv->f1:    L+ = {d->priv->lock}, L- = {}
  d->priv->stats: L+ = {}, L- = {d->priv->lock}

summary of read_stats(x) (the DIFFERENCE applied at the call):
  lockset:  L+ = {}, L- = {x->lock}
  x->f1:    L+ = {}, L- = {}
  x->stats: L+ = {}, L- = {x->lock}
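Applying a callee summary at a call site, as these two slides illustrate, amounts to renaming the callee's locks into the caller's terms and then layering the callee's relative changes on top of the caller's lockset before the call. The sketch below is one plausible formulation and is an assumption rather than RELAY's exact rule: the composition formula, the bitmask encoding, and the names `rebind`, `apply_summary`, and `lift_into_work` are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t lockset_t;                 /* assumed: one bit per lock */
typedef struct { lockset_t plus, minus; } rel_lockset_t;  /* L+ and L- */

#define X_LOCK     ((lockset_t)1u << 0)     /* callee's name: x->lock       */
#define DPRIV_LOCK ((lockset_t)1u << 3)     /* caller's name: d->priv->lock */

/* Rename callee lock i to the caller lock(s) given by map[i]. */
lockset_t rebind(lockset_t callee_set, const lockset_t *map, int nlocks)
{
    lockset_t out = 0;
    for (int i = 0; i < nlocks; i++) {
        if (callee_set & ((lockset_t)1u << i))
            out |= map[i];
    }
    return out;
}

/* Layer the callee's relative changes (already in caller terms) on top of
   the caller's lockset before the call. */
rel_lockset_t apply_summary(rel_lockset_t before, rel_lockset_t callee)
{
    rel_lockset_t after;
    after.plus  = (before.plus  & ~callee.minus) | callee.plus;
    after.minus = (before.minus & ~callee.plus)  | callee.minus;
    return after;
}

/* The call read_stats(o) inside work: x is bound to d->priv, and the
   caller holds d->priv->lock just before the call. */
rel_lockset_t lift_into_work(rel_lockset_t callee_relative)
{
    const lockset_t map[1] = { DPRIV_LOCK };        /* x->lock -> d->priv->lock */
    const rel_lockset_t before = { DPRIV_LOCK, 0 };
    rel_lockset_t renamed = {
        rebind(callee_relative.plus,  map, 1),
        rebind(callee_relative.minus, map, 1)
    };
    return apply_summary(before, renamed);
}
```

The same function lifts both the callee's exit lockset and the lockset of each guarded access. Lifting the access `x->f1` (L+ = {}, L- = {}) yields L+ = {d->priv->lock}, and lifting `x->stats` (L- = {x->lock}) yields L+ = {}, L- = {d->priv->lock}, matching the rows shown for `work`.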
14. Checking for Races
  work(void *d) { o = d->priv; lock(o->lock); read_stats(o); }
[Diagram: the summary of work is compared row by row against the summary of work(void *d) running in another thread]

summary of work:
  lockset:        L+ = {}, L- = {d->priv->lock}
  d->priv:        L+ = {}, L- = {}
  d->priv->f1:    L+ = {d->priv->lock}, L- = {}
  d->priv->stats: L+ = {}, L- = {d->priv->lock}

row 1 (d->priv): reads only => no race
row 2 (d->priv->f1): common lock => no race
row 3 (d->priv->stats): no common lock => race
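The row-by-row comparison can be sketched as a pairwise check over two summaries of guarded accesses. This is a simplified illustration, not RELAY's checker: it assumes that identical normalized access paths in the two threads denote the same memory location, and the type `guarded_access_t` and the functions `pair_races`, `count_races`, and `demo_work_vs_work` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t lockset_t;                 /* assumed: one bit per lock */
#define DPRIV_LOCK ((lockset_t)1u << 0)     /* stands for d->priv->lock  */

/* A guarded access: an access path paired with the locks that must be held. */
typedef struct {
    const char *path;      /* normalized access path */
    bool        is_write;
    lockset_t   plus;      /* L+ at the access */
} guarded_access_t;

/* Summary of work() as on the slide; the read/write flags are assumptions. */
const guarded_access_t work_summary[3] = {
    { "d->priv",        false, 0          },
    { "d->priv->f1",    false, DPRIV_LOCK },
    { "d->priv->stats", true,  0          },
};

/* Same path (assumed to mean same location), at least one write,
   and no common lock in the two L+ sets. */
bool pair_races(const guarded_access_t *a, const guarded_access_t *b)
{
    if (strcmp(a->path, b->path) != 0) return false;
    if (!a->is_write && !b->is_write) return false;
    return (a->plus & b->plus) == 0;
}

/* Count racy pairs between the summaries of two thread entry points. */
size_t count_races(const guarded_access_t *t1, size_t n1,
                   const guarded_access_t *t2, size_t n2)
{
    size_t races = 0;
    for (size_t i = 0; i < n1; i++)
        for (size_t j = 0; j < n2; j++)
            if (pair_races(&t1[i], &t2[j]))
                races++;
    return races;
}

/* Two threads both running work(): only d->priv->stats is reported. */
size_t demo_work_vs_work(void)
{
    return count_races(work_summary, 3, work_summary, 3);
}
```

Comparing paths by string equality is the crudest possible aliasing assumption. A real analysis has to decide when two different paths may name the same location, which is where many false positives come from.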
15. Modular Unsoundness
- Pointer-arithmetic corner cases
- Accesses in assembly code
- Function pointers
- Not enforcing must-alias for lockset intersection
- Filters
Revisit each and improve
16. Outline
- Introduction
- Computing locksets and recording accesses
  - Relative locksets
  - Guarded accesses
- Experiments with Linux
  - Categorizing warnings and false positives
  - Filters targeting categories
17. Linux experiments
- 5000 warnings
- Sample 90 and categorize
- Design and apply filters to zoom-in on races
18. Categories of false positives
- Initialization: thread allocates object and initializes it before sharing
- Aliasing: mixed up different data structures
- Unsharing: objects removed from shared structures
- Recursive locks: Big kernel lock
- Non-lock synchronization: spawn, wait, signal, etc.
- Conditional locking: locking correlated with return value, conditionals, etc.
19. Example filter: Thread ownership
- To reduce initialization false positives
  - remove accesses originating from the thread that allocated the object.
thread 1: x = malloc(); init(x); share(x); update(x);
thread 2: x = get(); update(x);
thread 3: x = get(); update(x);
[Diagram label "filtered" marks the accesses removed by the filter]
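A deliberately coarse sketch of the thread-ownership filter: drop every recorded access made by the thread that allocated the object, and keep the rest. The record layout, the `allocator_of` table, and the names `filter_owned` and `demo_thread_ownership` are assumptions for illustration; how RELAY actually tracks allocation and ownership is not shown on the slide.

```c
#include <assert.h>
#include <stddef.h>

/* One recorded access (hypothetical layout). */
typedef struct {
    int object;   /* id of the accessed object, used as an index */
    int thread;   /* id of the thread performing the access */
} obj_access_t;

/* Remove accesses made by the allocating thread of each object.
   allocator_of[obj] is the id of the thread that allocated obj.
   The array is compacted in place; the new length is returned. */
size_t filter_owned(obj_access_t *acc, size_t n, const int *allocator_of)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (acc[i].thread != allocator_of[acc[i].object])
            acc[kept++] = acc[i];
    }
    return kept;
}

/* The slide's scenario: thread 1 allocates x; threads 2 and 3 update it. */
size_t demo_thread_ownership(void)
{
    const int allocator_of[] = { 1 };   /* object 0 (x) allocated by thread 1 */
    obj_access_t acc[] = {
        { 0, 1 },   /* thread 1: init(x)   */
        { 0, 1 },   /* thread 1: update(x) */
        { 0, 2 },   /* thread 2: update(x) */
        { 0, 3 },   /* thread 3: update(x) */
    };
    return filter_owned(acc, sizeof acc / sizeof acc[0], allocator_of);
}
```

In this scenario two accesses survive, those from threads 2 and 3. The filter trades soundness for fewer warnings: a real race between the allocating thread and another thread after the object is shared would also be hidden.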
20. Before filters: 11% data races
21. After filters: 80% data races
22. The absolute numbers
[Chart; series labels: initialization; non-aliasing, unsharing; recursive locks; non-lock sync.]
23. Related Work
- Dynamic techniques
  - Locksets and extensions: Savage et al. 97, Choi et al. 02, Yu et al. 05, Elmas et al. 07
  - Atomicity: Flanagan et al. 04, Wang et al. 06
  - Benign vs. harmful: Narayanasamy et al. 07
- Static techniques for Java
  - Type systems: Flanagan et al. 99, Boyapati et al. 02
  - Aliasing, must-not aliasing: Naik et al. 06, 07
- Static techniques for C
  - Scalability, ranking: Engler et al. 03
  - Aliasing and sharing: Pratikakis et al. 06, Kahlon et al. 07
24. Summary
- Relative locksets: modular, summary-based analysis
  - Can analyze 46K functions of the Linux kernel
  - modular => parallelizable
  - on a grid of 32 CPUs: approx. 5 hours
- Modular unsoundness
  - finds 53 races (or 25 after all filters)
  - future work: better analyses, better filters
  - whether races are benign or not is another question!