RELAY: Static Race Detection on Millions of Lines of Code

Learn more at: https://cseweb.ucsd.edu
1
RELAY: Static Race Detection on Millions of Lines of Code
Jan Voung (speaker), Ranjit Jhala, and Sorin Lerner
UC San Diego
2
Definition of data race
  • It is an event
  • between 2 threads, where there are
  • unordered accesses to the same memory location
  • and at least one of the accesses is a write

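thread 1 and thread 2 run concurrently, with no ordering between them:
  one thread executes:        temp = g
  the other thread executes:  g = V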
3
Bugs from races
  • Example bug: incorrect resource bounds

thread 1 and thread 2 run concurrently:
  one thread executes:        temp = size
  the other thread executes:  free(); size = size - n
4
RELAY against data races
  • RELAY finds data races
  • RELAY is a static tool
  • analyzes the program before it runs
  • RELAY is scalable
  • analyzed the Linux kernel (4.5 million LOC)
  • in 5 hours on 32 CPUs
  • Found 53 races in a sample of 149 warnings

5
Outline
  • Introduction
  • Computing locksets and recording accesses
  • Relative locksets
  • Guarded accesses
  • Experiments with Linux
  • Categorizing warnings and false positives
  • Filters targeting categories

6
Checking with Locksets
  • Locks are a mechanism for mutual exclusion
  • Only one thread holds a particular lock at a
    time.
  • No race if the same lock must have been acquired
    for each of two different shared accesses

thread 1:  lock(l); temp = g; unlock(l); read(g)
thread 2:  lock(l); g = 0
[Diagram annotations: the locks held in each thread and the common lockset, each labeled l]
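The lockset check on this slide can be sketched in a few lines. This is a minimal illustration, not RELAY's code: the function names `common_locks` and `may_race` and the variable names are made up for the example, and the locksets are written out by hand from the two thread bodies shown above.

```python
def common_locks(held_a, held_b):
    """Locks held at both accesses."""
    return set(held_a) & set(held_b)


def may_race(held_a, held_b, a_writes, b_writes):
    """Flag a pair of accesses to the same location when at least one
    is a write and the two accesses share no lock."""
    return (a_writes or b_writes) and not common_locks(held_a, held_b)


# thread 1: lock(l); temp = g; unlock(l); read(g)
# thread 2: lock(l); g = 0
held_at_temp_read = {"l"}  # temp = g runs between lock(l) and unlock(l)
held_at_late_read = set()  # read(g) runs after unlock(l)
held_at_write = {"l"}      # g = 0 runs after lock(l)

flagged_locked = may_race(held_at_temp_read, held_at_write, False, True)
flagged_unlocked = may_race(held_at_late_read, held_at_write, False, True)
print(flagged_locked, flagged_unlocked)
```

The read inside the critical section shares lock l with the write and is not flagged; the read after unlock(l) shares nothing with it and is.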
7
A more realistic (but simple) example
work(void *d) {
  o = d->priv;
  lock(o->lock);
  read_stats(o);
}

read_stats(x) {
  if (x->f1 < 10) {        No race: read/write with lock
    unlock(x->lock);
    x->stats = ...;        Race: write without lock
  } else {
    unlock(x->lock);
  }
}
8
Key components of RELAY
GOAL: scalability. KEY: modularity.

work(void *d) {
  o = d->priv;
  lock(o->lock);
  read_stats(o);
}

read_stats(x) {
  if (x->f1 < 10) {
    unlock(x->lock);
    x->stats = ...;
  } else {
    unlock(x->lock);
  }
}
[Diagram: work( ) and read_stats( ) drawn as separate units]
9
Key components of RELAY
1) Relative locksets: locks acquired/released in the function; the caller handles locks held before the call.
   L+: locks that MUST have been acquired. L-: locks that MAY have been released.
2) Guarded accesses: pair accesses with relative locksets to catch races.
3) Summaries
4) Symbolic execution: what is the same memory location? Normalize to globals and formals.

work(void *d) {
  o = d->priv;
  lock(o->lock);           L+ = {d->priv->lock}, L- = {}
  read_stats(o);
}

read_stats(x) {            L+ = {}, L- = {}
  if (x->f1 < 10) {
    unlock(x->lock);       L+ = {}, L- = {x->lock}
    x->stats = ...;
  } else {
    unlock(x->lock);       L+ = {}, L- = {x->lock}
  }
}
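To make the relative-lockset idea concrete, here is a small sketch of how lock and unlock could update a pair (L+, L-), where L+ holds locks that must have been acquired since function entry and L- holds locks that may have been released. The function names `acquire` and `release` are hypothetical, and the update rules are an assumption inferred from the annotations on this slide rather than RELAY's actual implementation.

```python
EMPTY = (frozenset(), frozenset())  # (L+, L-) at function entry


def acquire(state, lock):
    """lock(l): l is now definitely held, so it is no longer 'released'."""
    plus, minus = state
    return (plus | {lock}, minus - {lock})


def release(state, lock):
    """unlock(l): l is no longer held and may have been released."""
    plus, minus = state
    return (plus - {lock}, minus | {lock})


# work: after lock(o->lock), with o normalized to d->priv
after_lock_in_work = acquire(EMPTY, "d->priv->lock")

# read_stats: after unlock(x->lock), relative to the function's entry
after_unlock_in_read_stats = release(EMPTY, "x->lock")

print(after_lock_in_work)
print(after_unlock_in_read_stats)
```

Because the pair is relative to function entry, read_stats can record "x->lock may have been released" without knowing whether any caller actually held it.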
10
How RELAY runs
1) Assume symbolic execution ran.
2) Compute relative locksets.
3) Compute guarded accesses.

work(void *d) {
  o = d->priv;
  lock(o->lock);
  read_stats(o);
}

read_stats(x) {            L+ = {}, L- = {}
  if (x->f1 < 10) {        access x->f1:    L+ = {}, L- = {}
    unlock(x->lock);       L+ = {}, L- = {x->lock}
    x->stats = ...;        access x->stats: L+ = {}, L- = {x->lock}
  } else {
    unlock(x->lock);       L+ = {}, L- = {x->lock}
  }
}

summary of read_stats:
  lockset at exit:   L+ = {}, L- = {x->lock}
  guarded accesses:  x->f1     L+ = {}, L- = {}
                     x->stats  L+ = {}, L- = {x->lock}
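The summary of read_stats shown above can be reproduced by hand-walking its two branches. The sketch below is illustrative only: `release`, `join`, and `summarize_read_stats` are hypothetical names, the walk over the function body is hard-coded rather than derived from a real control-flow graph, and the join rule (intersect the must-set, union the may-set) is an assumption based on the MUST/MAY wording of the slides.

```python
def release(state, lock):
    """unlock(l): remove l from L+ and add it to L-."""
    plus, minus = state
    return (plus - {lock}, minus | {lock})


def join(a, b):
    """Merge two branches: keep locks held on both paths (must),
    and locks released on either path (may)."""
    return (a[0] & b[0], a[1] | b[1])


def summarize_read_stats():
    """Hand-coded walk over read_stats(x) from the slide."""
    entry = (frozenset(), frozenset())
    accesses = [("x->f1", "read", entry)]          # if (x->f1 < 10)
    then_state = release(entry, "x->lock")         # unlock(x->lock)
    accesses.append(("x->stats", "write", then_state))
    else_state = release(entry, "x->lock")         # unlock(x->lock)
    return join(then_state, else_state), accesses


exit_state, guarded_accesses = summarize_read_stats()
print(exit_state)
for access in guarded_accesses:
    print(access)
```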
11
How RELAY runs
work(void *d) {
  o = d->priv;
  lock(o->lock);
  read_stats(o);
}

read_stats(x)
  summary:
    lockset at exit:   L+ = {}, L- = {x->lock}
    guarded accesses:  x->f1     L+ = {}, L- = {}
                       x->stats  L+ = {}, L- = {x->lock}
12
Applying summaries
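work(void *d) {            L+ = {}, L- = {}
  o = d->priv;
  lock(o->lock);           BEFORE the call: L+ = {d->priv->lock}, L- = {}
  read_stats(o);           AFTER the call:  L+ = {}, L- = {d->priv->lock}
}
summary of work:
  lockset at exit:   L+ = {}, L- = {d->priv->lock}

read_stats(x)
  summary (the DIFFERENCE applied at the call site):
    lockset at exit:   L+ = {}, L- = {x->lock}
    guarded accesses:  x->f1     L+ = {}, L- = {}
                       x->stats  L+ = {}, L- = {x->lock}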
13
Applying summaries
work(void *d) {            L+ = {}, L- = {}
  o = d->priv;             access d->priv: L+ = {}, L- = {}
  lock(o->lock);           BEFORE the call: L+ = {d->priv->lock}, L- = {}
  read_stats(o);           AFTER the call:  L+ = {}, L- = {d->priv->lock}
}
summary of work:
  lockset at exit:   L+ = {}, L- = {d->priv->lock}
  guarded accesses:  d->priv         L+ = {}, L- = {}
                     d->priv->f1     L+ = {d->priv->lock}, L- = {}
                     d->priv->stats  L+ = {}, L- = {d->priv->lock}

read_stats(x)
  summary (the DIFFERENCE applied at the call site):
    lockset at exit:   L+ = {}, L- = {x->lock}
    guarded accesses:  x->f1     L+ = {}, L- = {}
                       x->stats  L+ = {}, L- = {x->lock}
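Applying a callee summary at a call site involves two steps: renaming the callee's formal (x) to the caller's actual (d->priv), and rebasing the callee's relative locksets onto the lockset held before the call. The sketch below reproduces the BEFORE/AFTER values from the slide. The helper names `rename`, `compose`, and `apply_summary` are hypothetical, and the composition rule is one plausible reading of the slide, not necessarily RELAY's exact definition.

```python
def rename(lval, formal, actual):
    """Rewrite an lvalue rooted at the formal in terms of the actual."""
    if lval == formal or lval.startswith(formal + "->"):
        return actual + lval[len(formal):]
    return lval


def compose(caller, callee):
    """Rebase a callee-relative (L+, L-) onto the caller's (L+, L-)."""
    plus = (caller[0] | callee[0]) - callee[1]
    minus = (caller[1] | callee[1]) - callee[0]
    return (plus, minus)


def apply_summary(before, summary_state, summary_accesses, formal, actual):
    def substitute(state):
        return tuple(frozenset(rename(l, formal, actual) for l in part)
                     for part in state)

    after = compose(before, substitute(summary_state))
    accesses = [(rename(lval, formal, actual), kind,
                 compose(before, substitute(state)))
                for lval, kind, state in summary_accesses]
    return after, accesses


NONE = frozenset()
before_call = (frozenset({"d->priv->lock"}), NONE)
read_stats_state = (NONE, frozenset({"x->lock"}))
read_stats_accesses = [
    ("x->f1", "read", (NONE, NONE)),
    ("x->stats", "write", (NONE, frozenset({"x->lock"}))),
]

after_call, work_accesses = apply_summary(
    before_call, read_stats_state, read_stats_accesses, "x", "d->priv")
print(after_call)
for access in work_accesses:
    print(access)
```

The access to f1 inherits the caller's lock, while the access to stats ends up with d->priv->lock in L-, which is what later exposes it as unprotected.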
14
Checking for Races
work(void *d) appears twice, each copy with the same summary:

work(void *d) {            L+ = {}, L- = {}
  o = d->priv;
  lock(o->lock);           L+ = {d->priv->lock}, L- = {}
  read_stats(o);           L+ = {}, L- = {d->priv->lock}
}
summary of work:
  lockset at exit:   L+ = {}, L- = {d->priv->lock}
  guarded accesses:
    row 1:  d->priv         L+ = {}, L- = {}
    row 2:  d->priv->f1     L+ = {d->priv->lock}, L- = {}
    row 3:  d->priv->stats  L+ = {}, L- = {d->priv->lock}

row 1: reads only => no race
row 2: common lock => no race
row 3: no common lock => race
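The three-row comparison can be sketched as a pairwise check over guarded accesses. This is a simplified illustration: `find_races` is a hypothetical name, only the must-held set L+ of each access is kept, both copies of work are assumed to receive pointers to the same object, and the read/write kinds are taken from the code on the slides (f1 is read in the condition, stats is written).

```python
def find_races(accesses_a, accesses_b):
    """Report locations accessed by both threads where at least one
    access is a write and the must-held locksets share no lock."""
    races = []
    for lval_a, kind_a, held_a in accesses_a:
        for lval_b, kind_b, held_b in accesses_b:
            same_location = lval_a == lval_b
            has_write = "write" in (kind_a, kind_b)
            if same_location and has_write and not (held_a & held_b):
                races.append(lval_a)
    return races


LOCK = frozenset({"d->priv->lock"})
work_accesses = [
    ("d->priv", "read", frozenset()),         # row 1
    ("d->priv->f1", "read", LOCK),            # row 2
    ("d->priv->stats", "write", frozenset()), # row 3
]

# Variant in which f1 is also written: the common lock still protects it.
work_accesses_f1_written = [
    ("d->priv", "read", frozenset()),
    ("d->priv->f1", "write", LOCK),
    ("d->priv->stats", "write", frozenset()),
]

races = find_races(work_accesses, work_accesses)
races_variant = find_races(work_accesses_f1_written, work_accesses_f1_written)
print(races, races_variant)
```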
15
Modular Unsoundness
  • Pointer-arithmetic corner cases
  • Accesses in assembly code
  • Function pointers
  • Not enforcing must-alias for lockset intersection
  • Filters

Revisit each and improve
16
Outline
  • Introduction
  • Computing locksets and recording accesses
  • Relative locksets
  • Guarded accesses
  • Experiments with Linux
  • Categorizing warnings and false positives
  • Filters targeting categories

17
Linux experiments
  • 5000 warnings
  • Sample 90 and categorize
  • Design and apply filters to zoom-in on races

18
Categories of false positives
  • Initialization: thread allocates object and
    initializes it before sharing
  • Aliasing: mixed up different data structures
  • Unsharing: objects removed from shared structures
  • Recursive locks: big kernel lock
  • Non-lock synchronization: spawn, wait, signal,
    etc.
  • Conditional locking: locking correlated with
    return value, conditionals, etc.

19
Example filter: thread ownership
  • To reduce initialization false positives,
    remove accesses originating from the thread that
    allocated the object.

Three threads (thread 1, thread 2, thread 3):
  allocating thread:  x = malloc(); init(x); share(x); update(x)
  other thread:       x = get(); update(x)
  other thread:       x = get(); update(x)
[Diagram label: "filtered", marking the accesses removed by the filter]
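A minimal sketch of the thread-ownership filter as the bullet describes it: drop every access made by the thread that allocated the object. The function name `filter_owner_accesses`, the tuple layout, and the thread names are all hypothetical; how RELAY actually identifies the allocating thread is not shown on the slide.

```python
def filter_owner_accesses(accesses, allocator_of):
    """Keep only accesses made by threads other than the one that
    allocated the accessed object."""
    return [access for access in accesses
            if allocator_of.get(access[1]) != access[0]]


# (thread, object, operation), following the diagram on the slide
accesses = [
    ("allocator", "x", "init"),
    ("allocator", "x", "update"),
    ("worker_a", "x", "update"),
    ("worker_b", "x", "update"),
]
allocator_of = {"x": "allocator"}

remaining = filter_owner_accesses(accesses, allocator_of)
print(remaining)
```

Only the updates from the two non-allocating threads survive, so any remaining warning on x is between threads that obtained it after sharing.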
20
Before filters: 11% data races
21
After filters: 80% data races
22
The absolute numbers
[Chart legend: initialization; non-aliasing, unsharing; recursive locks; non-lock sync.]
23
Related Work
  • Dynamic techniques
    • Locksets and extensions: Savage et al. 97, Choi et al. 02, Yu et al. 05, Elmas et al. 07
    • Atomicity: Flanagan et al. 04, Wang et al. 06
    • Benign vs. harmful: Narayanasamy et al. 07
  • Static techniques for Java
    • Type systems: Flanagan et al. 99, Boyapati et al. 02
    • Aliasing, must-not aliasing: Naik et al. 06, 07
  • Static techniques for C
    • Scalability, ranking: Engler et al. 03
    • Aliasing and sharing: Pratikakis et al. 06, Kahlon et al. 07

24
Summary
  • Relative locksets: modular, summary-based analysis
    • Can analyze 46K functions of the Linux kernel
    • modular => parallelizable
    • on a grid of 32 CPUs: approx. 5 hours
  • Modular unsoundness
    • finds 53 races (or 25 after all filters)
    • future work: better analyses, better filters
  • Whether races are benign or not is another
    question!