Title: Program Analysis and Design Conformance
1Program Analysis and Design Conformance
- Martin Rinard
- Laboratory for Computer Science
- Massachusetts Institute of Technology
2Research Overview
- Program Analysis
- Commutativity Analysis for C Programs PLDI96
- Memory Disambiguation for Multithreaded C Programs
- Pointer Analysis PLDI99
- Region Analysis PPoPP99, PLDI00
- Pointer and Escape Analysis for Multithreaded Java Programs OOPSLA99, PLDI01, PPoPP01
3Research Overview
- Transformations
- Automatic Parallelization
- Object-Oriented Programs with Linked Data Structures PLDI96
- Divide and Conquer Programs PPoPP99, PLDI00
- Synchronization Optimizations
- Lock Coarsening POPL97,PLDI98
- Synchronization Elimination OOPSLA99
- Optimistic Synchronization Primitives PPoPP97
- Memory Management Optimizations
- Stack Allocation OOPSLA99,PLDI01
- Per-Thread Heap Allocation
4Research Overview
- Verifications of Safety Properties
- Data Race Freedom PLDI00
- Array Bounds Checks PLDI00
- Correctness of Region-Based Allocation PPoPP01
- Credible Compilation RTRV99
- Correctness of Dataflow Analysis Results
- Correctness of Standard Compiler Optimizations
5Talk Overview
- Memory Disambiguation
- Goal: Verify Data Race Freedom for Multithreaded Divide and Conquer Programs
- Analyses
- Pointer Analysis
- Accessed Region Analysis
- Experience integrating information from the developer into the memory disambiguation analysis
- Role Verification
- Design Conformance
6Basic Memory Disambiguation Problem
*p = v (write v into the memory location that p points to)
What memory locations may *p = v access?
Without Any Analysis
*p = v may access any location
7Basic Memory Disambiguation Problem
*p = v (write v into the memory location that p points to)
What memory locations may *p = v access?
With Analysis
*p = v may access this location
*p = v does not access these memory locations!
8Static Memory Disambiguation
- Analyze the program to characterize the memory locations that statements in the program read and write
- Fundamental problem in program analysis with many applications
9Application: Verify Data Race Freedom
[Figure: parallel writes *p = v1 and *q = v2, contrasted under the labels "Program Does This" and "NOT This"]
10Example - Divide and Conquer Sort
[Figure: the array 4 7 6 1 5 3 8 2 is split into subarrays (Divide), each subarray is sorted (Conquer), and the sorted subarrays are merged into 1 2 3 4 5 6 7 8 (Combine)]
17Divide and Conquer Algorithms
- Lots of Recursively Generated Concurrency
- Recursively Solve Subproblems in Parallel
- Combine Results in Parallel
18Sort n Items in d, Using t as Temporary Storage
void sort(int *d, int *t, int n) {
  if (n > CUTOFF) {
    spawn sort(d, t, n/4);
    spawn sort(d+n/4, t+n/4, n/4);
    spawn sort(d+2*(n/4), t+2*(n/4), n/4);
    spawn sort(d+3*(n/4), t+3*(n/4), n-3*(n/4));
    sync;
    spawn merge(d, d+n/4, d+n/2, t);
    spawn merge(d+n/2, d+3*(n/4), d+n, t+n/2);
    sync;
    merge(t, t+n/2, t+n, d);
  } else insertionSort(d, d+n);
}
- Divide array into subarrays and recursively sort subarrays in parallel
- Subproblems identified using pointers into middle of array: d, d+n/4, d+n/2, d+3*(n/4)
- Sorted results written back into input array
- Merge sorted quarters of d into halves of t: t, t+n/2
- Merge sorted halves of t back into d
- Use a simple sort for small problem sizes: insertionSort(d, d+n) sorts the items from d up to, but not including, d+n
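The sort can be tried out sequentially by erasing the Cilk keywords. The sketch below is one way to do that, not the benchmark code from the talk: CUTOFF = 4 is an assumed value, spawn and sync are defined away as empty macros, and the half boundary is written as 2*(n/4) rather than n/2 so that the quarters and halves line up when n is not a multiple of 4. The insertion sort is also rewritten so that no pointer ever moves below l.

```c
#include <assert.h>

#define CUTOFF 4   /* assumed threshold; the slides do not give a value */
#define spawn      /* Cilk keyword erased: the call simply runs inline */
#define sync       /* Cilk keyword erased: nothing to wait for */

/* Sort the items from l up to, but not including, h. */
void insertionSort(int *l, int *h) {
    for (int *p = l; p < h; p++) {
        int k = *p;
        int *q = p;
        while (q > l && k < *(q - 1)) {  /* shift larger items right */
            *q = *(q - 1);
            q--;
        }
        *q = k;
    }
}

/* Merge the sorted runs [l1, m) and [m, h2) into the buffer at d. */
void merge(int *l1, int *m, int *h2, int *d) {
    int *h1 = m, *l2 = m;
    while (l1 < h1 && l2 < h2)
        *d++ = (*l2 < *l1) ? *l2++ : *l1++;
    while (l1 < h1) *d++ = *l1++;
    while (l2 < h2) *d++ = *l2++;
}

/* Sort n items in d, using t (also n items) as temporary storage. */
void sort(int *d, int *t, int n) {
    assert(n >= 0);
    if (n > CUTOFF) {
        int q = n / 4;                          /* quarter size */
        spawn sort(d, t, q);
        spawn sort(d + q, t + q, q);
        spawn sort(d + 2 * q, t + 2 * q, q);
        spawn sort(d + 3 * q, t + 3 * q, n - 3 * q);
        sync;
        spawn merge(d, d + q, d + 2 * q, t);    /* quarters 1, 2 -> first part of t */
        spawn merge(d + 2 * q, d + 3 * q, d + n, t + 2 * q);
        sync;
        merge(t, t + 2 * q, t + n, d);          /* both parts of t -> d */
    } else {
        insertionSort(d, d + n);
    }
}
```

Calling sort(d, t, n) on an n-element array d with an n-element scratch buffer t leaves d sorted. Every call between two sync points writes a different part of d or t, which is the property the analysis has to establish.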
26What Do You Need To Know To Verify Data Race Freedom?
- Points-to Information (data blocks that pointers point into)
- Region Information (accessed regions within data blocks)
27Information Needed To Verify Race Freedom
- d and t point to different memory blocks
- Calls to sort access disjoint parts of d and t
- Together, calls access [d, d+n-1] and [t, t+n-1]
- sort(d, t, n/4)
- sort(d+n/4, t+n/4, n/4)
- sort(d+n/2, t+n/2, n/4)
- sort(d+3*(n/4), t+3*(n/4), n-3*(n/4))
[Figure: for each call, the parts of [d, d+n-1] and [t, t+n-1] that it accesses]
28Information Needed To Verify Race Freedom
- d and t point to different memory blocks
- First two calls to merge access disjoint parts of d and t
- Together, calls access [d, d+n-1] and [t, t+n-1]
- merge(d, d+n/4, d+n/2, t)
- merge(d+n/2, d+3*(n/4), d+n, t+n/2)
- merge(t, t+n/2, t+n, d)
[Figure: for each call, the parts of [d, d+n-1] and [t, t+n-1] that it accesses]
29Information Needed To Verify Race Freedom
- Calls to insertionSort access [d, d+n-1]
- insertionSort(d, d+n)
30What Do You Need To Know To Verify Data Race Freedom?
- Points-to Information (d and t point to different data blocks)
- Symbolic Region Information (accessed regions within d and t blocks)
31How Hard Is It To Figure These Things Out?
32How Hard Is It For the Program Analysis To Figure
These Things Out?
Challenging
33How Hard Is It For the Program Analysis To Figure
These Things Out?
void insertionSort(int *l, int *h) {
  int *p, *q, k;
  for (p = l+1; p < h; p++) {
    for (k = *p, q = p-1; l <= q && k < *q; q--)
      *(q+1) = *q;
    *(q+1) = k;
  }
}
- Not immediately obvious that insertionSort(l,h) accesses [l, h-1]
34How Hard Is It For the Program Analysis To Figure
These Things Out?
void merge(int *l1, int *m, int *h2, int *d) {
  int *h1 = m;
  int *l2 = m;
  while ((l1 < h1) && (l2 < h2))
    if (*l1 < *l2) *d++ = *l1++;
    else *d++ = *l2++;
  while (l1 < h1) *d++ = *l1++;
  while (l2 < h2) *d++ = *l2++;
}
- Not immediately obvious that merge(l,m,h,d) accesses [l, h-1] and [d, d+(h-l)-1]
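The claimed regions can be observed at run time by instrumenting merge. The sketch below is only an illustration: it records the lowest and highest element offsets that one execution reads (relative to l) and writes (relative to d). The Span type, touch helper, and traced_merge name are hypothetical. A single run shows what that run touched, whereas the analysis derives the region for all runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lowest and highest element offsets touched so far. */
typedef struct { ptrdiff_t lo, hi; } Span;

static void touch(Span *s, ptrdiff_t off) {
    if (off < s->lo) s->lo = off;
    if (off > s->hi) s->hi = off;
}

/* merge from the slide, plus bookkeeping: rd collects offsets read from the
   source (relative to l1), wr collects offsets written (relative to d). */
void traced_merge(int *l1, int *m, int *h2, int *d, Span *rd, Span *wr) {
    assert(l1 <= m && m <= h2);
    int *base = l1, *out = d;
    int *h1 = m, *l2 = m;
    rd->lo = wr->lo = PTRDIFF_MAX;   /* empty spans: lo > hi */
    rd->hi = wr->hi = -1;
    while (l1 < h1 && l2 < h2) {
        touch(rd, l1 - base);
        touch(rd, l2 - base);
        touch(wr, d - out);
        if (*l1 < *l2) *d++ = *l1++;
        else           *d++ = *l2++;
    }
    while (l1 < h1) { touch(rd, l1 - base); touch(wr, d - out); *d++ = *l1++; }
    while (l2 < h2) { touch(rd, l2 - base); touch(wr, d - out); *d++ = *l2++; }
}
```

For two sorted runs of four items each, the recorded read span is offsets 0 through 7, matching [l, h-1], and the write span is offsets 0 through 7, matching [d, d+(h-l)-1].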
35Issues
- Heavy Use of Pointers
- Pointers into Middle of Arrays
- Pointer Arithmetic
- Pointer Comparison
- Multiple Procedures
- sort(int *d, int *t, int n)
- insertionSort(int *l, int *h)
- merge(int *l, int *m, int *h, int *t)
- Recursion
- Multithreading
36Pointer Analysis
- For each program point, computes where each pointer may point
- e.g. p → x before statement *p = 1
- Complications
- 1. Statically unbounded number of locations
- recursive data structures (lists, trees)
- dynamically allocated arrays
- 2. Multiple possible executions of the program
- may create different dynamic data structures
37Memory Abstraction
[Figure: physical memory (stack and heap, with variables p, head, i, j, q, r, v) mapped to abstract memory]
- Allocation block for each variable declaration
- Allocation block for each memory allocation site
39Pointer Analysis Summary
- Key Challenge for Multithreaded Programs: Analyzing interactions between threads
- Solution: Interference Edges
- Record edges generated by each thread
- Captures effect of parallel threads on points-to information of other threads
40What Pointer Analysis Gives Us
- Disambiguation of Memory Accesses Via Pointers
- Pointer-based loads and stores: use pointer analysis results to derive the allocation block that each pointer-based load or store statement accesses
- MOD-REF or READ-WRITE SETS Analysis
- All loads and stores
- Procedures: use the memory access information for loads and stores to compute the allocation blocks that each procedure accesses
41Is This Information Enough?
42Is This Information Enough?
- NO
- Necessary but not Sufficient
- Parallel Tasks Access (Disjoint) Regions of Same
Allocated Block of Memory
43Structure of Analysis
- Pointer Analysis: Disambiguate Memory at the Granularity of Allocation Blocks
- Bounds Analysis: Symbolic Upper and Lower Bounds for Each Memory Access in Each Procedure
- Region Analysis: Symbolic Regions Accessed By Execution of Each Procedure
- Data Race Freedom: Check that Parallel Threads Are Independent
44Running Example Array Increment
void f(char *p, int n) {
  if (n > CUTOFF) {
    spawn f(p, n/2);      /* increment first half */
    spawn f(p+n/2, n/2);  /* increment second half */
    sync;
  } else {
    /* base case: increment small array */
    int i = 0;
    while (i < n) { *(p+i) += 1; i++; }
  }
}
45Intra-procedural Bounds Analysis
- Pointer Analysis
- Bounds Analysis: Symbolic Upper and Lower Bounds for Each Memory Access in Each Procedure
- Region Analysis
- Data Race Detection
46Intraprocedural Bounds Analysis
- GOAL: For each pointer and array index variable at each program point, derive lower and upper bounds
- E.g. 0 ≤ i ≤ n-1 at statement *(p+i) += 1
- Bounds are symbolic expressions
- variables represent initial values of parameters of enclosing procedure
- bounds are combinations of variables
- example expression for f(p,n): p+(n/2)-1
47Intraprocedural Bounds Analysis
- What are upper and lower bounds for i at each program point in base case?
- int i = 0;
- while (i < n) { *(p+i) += 1; i++; }
48Bounds Analysis, Step 1
Build control flow graph
- i = 0
- i < n
- *(p+i) += 1; i = i+1
49Bounds Analysis, Step 2
Set up bounds at beginning of basic blocks
- l1 ≤ i ≤ u1 before i = 0
- l2 ≤ i ≤ u2 before i < n
- l3 ≤ i ≤ u3 before *(p+i) += 1; i = i+1
50Bounds Analysis, Step 3
Compute transfer functions
- i = 0: l1 ≤ i ≤ u1 becomes 0 ≤ i ≤ 0
- i < n: l2 ≤ i ≤ u2 becomes l2 ≤ i ≤ n-1 (true branch) and n ≤ i ≤ u2 (false branch)
- *(p+i) += 1; i = i+1: l3 ≤ i ≤ u3 becomes l3+1 ≤ i ≤ u3+1
52Bounds Analysis, Step 4
Key Step: set up constraints for bounds
- Procedure entry: -∞ ≤ i ≤ +∞
- After i = 0: 0 ≤ i ≤ 0
- Before i < n: l2 ≤ i ≤ u2
- After i < n: l2 ≤ i ≤ n-1 (true branch), n ≤ i ≤ u2 (false branch)
- Before *(p+i) += 1; i = i+1: l3 ≤ i ≤ u3
- After *(p+i) += 1; i = i+1: l3+1 ≤ i ≤ u3+1
Build Region Constraints
- [0, 0] ⊆ [l2, u2]
- [l3+1, u3+1] ⊆ [l2, u2]
- [l2, n-1] ⊆ [l3, u3]
Inequality Constraints
- l2 ≤ 0, l2 ≤ l3+1, l3 ≤ l2
- 0 ≤ u2, u3+1 ≤ u2, n-1 ≤ u3
58Bounds Analysis, Step 5
Generate symbolic expressions for bounds
Goal: express bounds in terms of parameters
- l2 = c1*p + c2*n + c3
- l3 = c4*p + c5*n + c6
- u2 = c7*p + c8*n + c9
- u3 = c10*p + c11*n + c12
60Bounds Analysis, Step 6
Substitute expressions into constraints
- c1*p + c2*n + c3 ≤ 0
- c1*p + c2*n + c3 ≤ c4*p + c5*n + c6 + 1
- c4*p + c5*n + c6 ≤ c1*p + c2*n + c3
- 0 ≤ c7*p + c8*n + c9
- c10*p + c11*n + c12 + 1 ≤ c7*p + c8*n + c9
- n - 1 ≤ c10*p + c11*n + c12
61Bounds Analysis, Step 7
Reduce symbolic inequalities to linear inequalities
c1*p + c2*n + c3 ≤ c4*p + c5*n + c6 if c1 ≤ c4, c2 ≤ c5, and c3 ≤ c6
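The reduction replaces a comparison of values with a comparison of coefficients. A small check makes the idea concrete. The sketch below assumes that the parameters p and n are non-negative; the slide does not state this, but the coefficient-wise test is only sufficient under that assumption. The Lin type and the function names are hypothetical.

```c
#include <assert.h>

/* A symbolic bound cp*p + cn*n + c0 over the parameters p and n. */
typedef struct { long cp, cn, c0; } Lin;

long eval(Lin e, long p, long n) { return e.cp * p + e.cn * n + e.c0; }

/* The Step 7 test: compare coefficients instead of values. */
int coeffwise_le(Lin a, Lin b) {
    return a.cp <= b.cp && a.cn <= b.cn && a.c0 <= b.c0;
}

/* Brute-force check that a <= b for every 0 <= p, n <= max. */
int le_on_grid(Lin a, Lin b, long max) {
    assert(max >= 0);
    for (long p = 0; p <= max; p++)
        for (long n = 0; n <= max; n++)
            if (eval(a, p, n) > eval(b, p, n)) return 0;
    return 1;
}
```

For example, n-1 and n pass the coefficient test and also compare correctly on every non-negative grid point. The constant 0 and n pass the coefficient test too, yet 0 > n once n is negative, which is why the non-negativity assumption matters.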
62Bounds Analysis, Step 8
Apply reduction and generate a linear program
Lower bounds
- c1 ≤ 0, c2 ≤ 0, c3 ≤ 0
- c1 ≤ c4, c2 ≤ c5, c3 ≤ c6+1
- c4 ≤ c1, c5 ≤ c2, c6 ≤ c3
Upper bounds
- 0 ≤ c7, 0 ≤ c8, 0 ≤ c9
- c10 ≤ c7, c11 ≤ c8, c12+1 ≤ c9
- 0 ≤ c10, 1 ≤ c11, -1 ≤ c12
Objective Function: max (c1 + ... + c6) - (c7 + ... + c12)
64Bounds Analysis, Step 9
Solve linear program to extract bounds
Solution
- c1 = 0, c2 = 0, c3 = 0, c4 = 0, c5 = 0, c6 = 0
- c7 = 0, c8 = 1, c9 = 0, c10 = 0, c11 = 1, c12 = -1
Symbolic Bounds
- l2 = 0, l3 = 0
- u2 = n, u3 = n-1
66Bounds Analysis, Step 10
Substitute bounds at each program point
- Procedure entry: -∞ ≤ i ≤ +∞
- After i = 0: 0 ≤ i ≤ 0
- Before i < n: 0 ≤ i ≤ n
- After i < n: 0 ≤ i ≤ n-1 (true branch), n ≤ i ≤ n (false branch)
- Before *(p+i) += 1; i = i+1: 0 ≤ i ≤ n-1
- After *(p+i) += 1; i = i+1: 1 ≤ i ≤ n
67Access Regions
Compute access regions at each load or store
- 0 ≤ i ≤ n-1 at *(p+i) += 1
- *(p+i) += 1 accesses [p, p+n-1]
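The reported solution can be checked mechanically against the linear program. The sketch below starts from the Step 4 inequalities (l2 ≤ 0, l2 ≤ l3+1, l3 ≤ l2, 0 ≤ u2, u3+1 ≤ u2, n-1 ≤ u3) and compares them coefficient by coefficient. The array layout and the names feasible and objective are hypothetical. It only verifies a candidate solution and does not solve the linear program.

```c
#include <assert.h>

/* c[1..12] are the coefficients of the symbolic bounds (c[0] is unused):
   l2 = c1*p + c2*n + c3,    l3 = c4*p + c5*n + c6,
   u2 = c7*p + c8*n + c9,    u3 = c10*p + c11*n + c12. */

/* Returns 1 if c satisfies every coefficient-wise constraint. */
int feasible(const int c[13]) {
    assert(c != 0);
    int lower =
        c[1] <= 0 && c[2] <= 0 && c[3] <= 0                  /* l2 <= 0    */
        && c[1] <= c[4] && c[2] <= c[5] && c[3] <= c[6] + 1  /* l2 <= l3+1 */
        && c[4] <= c[1] && c[5] <= c[2] && c[6] <= c[3];     /* l3 <= l2   */
    int upper =
        0 <= c[7] && 0 <= c[8] && 0 <= c[9]                     /* 0 <= u2    */
        && c[10] <= c[7] && c[11] <= c[8] && c[12] + 1 <= c[9]  /* u3+1 <= u2 */
        && 0 <= c[10] && 1 <= c[11] && -1 <= c[12];             /* n-1 <= u3  */
    return lower && upper;
}

/* Objective: (c1 + ... + c6) - (c7 + ... + c12). Larger is tighter. */
int objective(const int c[13]) {
    assert(c != 0);
    int lower = 0, upper = 0;
    for (int k = 1; k <= 6; k++)  lower += c[k];
    for (int k = 7; k <= 12; k++) upper += c[k];
    return lower - upper;
}
```

With the solution from the slides (c8 = 1, c11 = 1, c12 = -1, all other coefficients 0), feasible returns 1 and objective returns -1. Loosening u2 to n+1 and u3 to n stays feasible but scores -3, so the maximization prefers the tighter bounds.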
68Interprocedural Region Analysis
- Pointer Analysis
- Bounds Analysis
- Region Analysis: Symbolic Regions Accessed By Execution of Each Procedure
- Data Race Detection
69Interprocedural Region Analysis
GOAL: Compute accessed regions of memory for each procedure
E.g. f(p,n) accesses [p, p+n-1]
- Same Approach
- Set up target bounds of accessed regions
- Build a constraint system to compute these bounds
- Constraint System
- Accessed regions for a procedure must include
- 1. Regions accessed by statements in the procedure
- 2. Regions accessed by invoked procedures
70Region Analysis in Example
f(p,n) accesses [l(p,n), u(p,n)]
void f(char *p, int n) {
  if (n > CUTOFF) {
    spawn f(p, n/2);
    spawn f(p+n/2, n/2);
    sync;
  } else {
    int i = 0;
    while (i < n) { *(p+i) += 1; i++; }
  }
}
- spawn f(p, n/2) accesses [l(p,n/2), u(p,n/2)]
- spawn f(p+n/2, n/2) accesses [l(p+n/2,n/2), u(p+n/2,n/2)]
- *(p+i) += 1 accesses [p, p+n-1]
73Derive Constraint System
- Region constraints
- [l(p,n/2), u(p,n/2)] ⊆ [l(p,n), u(p,n)]
- [l(p+n/2,n/2), u(p+n/2,n/2)] ⊆ [l(p,n), u(p,n)]
- [p, p+n-1] ⊆ [l(p,n), u(p,n)]
- Reduce to inequalities between lower/upper bounds
- Further reduce to a linear program and solve
- l(p,n) = p
- u(p,n) = p+n-1
- Access region for f(p,n): [p, p+n-1]
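The solution l(p,n) = p, u(p,n) = p+n-1 can be checked against the three region constraints by direct evaluation. The sketch below models pointers as integer offsets, which is a simplification, and treats a region with lo > hi as empty. The Region type and the helper names are hypothetical.

```c
#include <assert.h>

/* Inclusive range of byte offsets; empty when lo > hi. */
typedef struct { long lo, hi; } Region;

/* Candidate solution: f(p, n) accesses [p, p+n-1]. */
Region region_f(long p, long n) {
    assert(n >= 0);
    Region r = { p, p + n - 1 };
    return r;
}

/* inner ⊆ outer; an empty inner region is contained in anything. */
int contains(Region outer, Region inner) {
    if (inner.lo > inner.hi) return 1;
    return outer.lo <= inner.lo && inner.hi <= outer.hi;
}

/* The three constraints: both recursive calls and the base-case loop
   must stay inside the region claimed for f(p, n). */
int constraints_hold(long p, long n) {
    Region whole = region_f(p, n);
    Region base  = { p, p + n - 1 };   /* region of *(p+i) += 1 */
    return contains(whole, region_f(p, n / 2))
        && contains(whole, region_f(p + n / 2, n / 2))
        && contains(whole, base);
}
```

constraints_hold returns 1 for every non-negative n, including odd n, where the second recursive call ends one element short of p+n-1 and is therefore still contained.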
74Data Race Freedom
- Pointer Analysis
- Bounds Analysis
- Region Analysis
- Data Race Freedom: Check that Parallel Threads Are Independent
75Data Race Freedom
- Dependence testing of two statements
- Do accessed regions intersect?
- Based on comparing upper and lower bounds of accessed regions
- Absence of data races
- Check that all the statements that execute in parallel are independent
76Data Race Freedom
f(p,n) accesses [p, p+n-1]
void f(char *p, int n) {
  if (n > CUTOFF) {
    spawn f(p, n/2);
    spawn f(p+n/2, n/2);
    sync;
  } else {
    int i = 0;
    while (i < n) { *(p+i) += 1; i++; }
  }
}
- spawn f(p, n/2) accesses [p, p+n/2-1]
- spawn f(p+n/2, n/2) accesses [p+n/2, p+n-1]
No data races!
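The independence check for the two spawned calls comes down to an interval intersection test on their access regions. The sketch below is a minimal version with concrete numbers standing in for the symbolic bounds. Pointers are modeled as integer offsets, and the names are hypothetical.

```c
#include <assert.h>

/* Inclusive range of byte offsets; empty when lo > hi. */
typedef struct { long lo, hi; } Region;

/* Access region of f(p, n) as computed by the region analysis. */
Region region_f(long p, long n) {
    assert(n >= 0);
    Region r = { p, p + n - 1 };
    return r;
}

/* Two regions intersect if neither is empty and their bounds overlap. */
int regions_intersect(Region a, Region b) {
    if (a.lo > a.hi || b.lo > b.hi) return 0;
    return a.lo <= b.hi && b.lo <= a.hi;
}

/* The two spawned calls in f(p, n) are independent if the regions of
   f(p, n/2) and f(p+n/2, n/2) do not intersect. */
int spawns_independent(long p, long n) {
    return !regions_intersect(region_f(p, n / 2),
                              region_f(p + n / 2, n / 2));
}
```

spawns_independent holds for every n, because the first call ends at p+n/2-1 and the second starts at p+n/2. Widening the first call by one element, as in f(p, n/2+1), would make the regions overlap and the check fail.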
79Fundamental Property of the Analysis: No Fixed Point Computations
- The analysis does not use fixed-point computations
- The problem is reduced to a linear program
- The solution to the linear program directly gives the symbolic lower and upper bounds
- Fixed-point approaches
- Termination is not guaranteed: analysis domain of symbolic expressions has infinite ascending chains
- Use imprecise techniques to ensure termination
- Artificially truncate number of iterations
- Use imprecise widening operators
80Experience
- Set of benchmark programs
- Two versions of each benchmark
- Sequential version written in C
- Multithreaded version written in Cilk
- Experiments
- Data Race Freedom for the multithreaded versions
- Array Bounds Violation Detection for both sequential and multithreaded versions
- Automatic Parallelization for the sequential version
81Data Races and Array Bounds Violations
Application   Data Races        Array Bounds Violations   Array Bounds Violations
              (multithreaded)   (multithreaded)           (sequential)
QuickSort     NO                NO                        NO
MergeSort     NO                NO                        NO
BlockMul      NO                NO                        NO
NoTempMul     NO                NO                        NO
LU            NO                NO                        NO
Knapsack      YES               NO                        NO
Heat          NO                NO                        NO
82Parallel Performance
[Figure: parallel performance plots for Quicksort, Mergesort, Heat, BlockMul, NoTempMul, and LU]
83Summary
- Sophisticated Memory Disambiguation Analysis
- Points-to Information
- Accessed Region Information
- Automatic
- Interprocedural
- Handles Multithreaded Programs
- Other Uses Besides Data Race Freedom
- Bitwidth Analysis
- Array-Bounds Check Elimination
- Buffer Overrun Detection
84Bigger Picture
- Analysis has a very specific goal
- Developer understands and cares about results
- Points-to and region information is (implicitly) part of the interface of each procedure
- Developer understands interfaces
- Developer has expectations about analysis results
- Analysis can identify serious programming errors
- Developer expectations are implicit
85Idea
- Enhance procedure interface to make points-to and region information explicit
- Points-to language
- Points-to graphs at entry and exit
- Effect on points-to relationships
- Region language
- Symbolic specification of accessed regions
- Developer provides information
- Analysis verifies that it is correct, and that
correctness implies data race freedom
86Points-to Language
f(p, q, n) {
  context {
    entry: p->_a, q->_b;
    exit: p->_a, _a->_c, q->_b, _b->_d;
  }
  context {
    entry: p->_a, q->_a;
    exit: p->_a, _a->_c, q->_a;
  }
}
[Figure: entry and exit points-to graphs for the two contexts of f(p,q,n)]
88Verifying Points-to Information
- One (flow sensitive) analysis per context
- Start with entry points-to graph
- Analyze procedure
- Check result against exit points-to graph
- Similarly for other context
[Figure: contexts for f(p,q,n), each with an entry and an exit points-to graph]
97Analysis of Call Statements
g(r,n) { ... f(r,s,n); ... }
- Analysis produces points-to graph before call
- Retrieve declared contexts from callee
- Find context with matching entry graph
- Apply corresponding exit points-to graph
- Continue analysis after call
- Result
- Points-to declarations separate analysis of multiple procedures
- Transformed global, whole-program analysis into local analysis that operates on each procedure independently
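The context lookup at a call site can be sketched in a few lines. The version below is heavily simplified and is not the matching algorithm from the talk: an entry graph is reduced to the symbolic block each pointer parameter of f(p,q,n) points to, and two graphs match when they agree on whether the parameters point into the same block. The EntryGraph type and match_context are hypothetical names.

```c
#include <assert.h>

/* Entry graph of one declared context for f(p, q, n): the symbolic block
   that each pointer parameter points to. Equal ids mean both parameters
   point into the same block. */
typedef struct { int p_block, q_block; } EntryGraph;

/* Return the index of the first declared context whose entry graph has the
   same aliasing shape as the call site f(r, s, n), or -1 if none matches.
   r_block and s_block are the blocks that r and s point to before the call. */
int match_context(const EntryGraph *ctx, int nctx, int r_block, int s_block) {
    assert(ctx != 0 && nctx >= 0);
    int site_aliased = (r_block == s_block);
    for (int k = 0; k < nctx; k++) {
        int ctx_aliased = (ctx[k].p_block == ctx[k].q_block);
        if (ctx_aliased == site_aliased) return k;
    }
    return -1;
}
```

With the two contexts declared for f (p and q in different blocks, p and q in the same block), a call where r and s point to different blocks selects the first context and a call where they alias selects the second. A result of -1 would mean the call is not covered by any declared context.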
105Experience
- Implemented points-to and region languages
- Integrated with points-to and region analyses
- Divide and Conquer Benchmarks
- Sorting Programs: Quicksort (QS), Mergesort (MS)
- Dense Matrix Computations: Matrix multiply (MM), LU decomposition (LU)
- Scientific Computation: Heat (H)
- We added points-to and region information
106Programming Overhead
[Figure: bar chart showing, on a scale from 0.00 to 1.00, the proportion of C Code, Region Declarations, and Points-to Declarations for QS, MS, MM, LU, and H]
107Evaluation
- How difficult is it to provide declarations?
- Not that difficult.
- Have to write comparatively little code
- Must know information anyway
- How much benefit does analysis obtain?
- Substantial benefit.
- Simpler analysis software (no complex interprocedural analysis)
- More scalable, precise analysis
108Evaluation
- Software Engineering Benefits of Points-to and Region Declarations
- Improved communication between developer and analysis
- Analysis reflects developer's expectations
- Enhanced code reliability
- Enhanced interface information
- Analyze incomplete programs
- Programs that use libraries
- Programs under development
109Evaluation
- Drawbacks of Points-to and Region Declarations
- Have to learn new language
- Have to integrate into development process
- Legacy software issues (programmer
may not know points-to and region information)
110Steps to Design Conformance
- Verify that Program Correctly Implements Key Design Properties as Expressed by Developer or Designer
- Role Verification
- Design Conformance for Object Models (joint with Daniel Jackson, MIT LCS)
- Context: Air Traffic Control Software
- MIT LCS (Daniel Jackson, Martin Rinard)
- MIT Aero-Astro Department (R. John Hansman)
- NASA Ames Research Center (Michelle Eshow)
- Kansas State University CS Dept. (David Schmidt)
- CTAS (Center/TRACON Automation System)
111Role Verification
- Objects play different roles during their lifetime in computation
- Parked Aircraft, Taxiing Aircraft, Cleared for Takeoff Aircraft, In Flight Aircraft
- Roles reflect constraints on activities of object
- System actions must respect role constraints
- Parked Aircraft can't take off
- Action violations indicate system confusion
- Goals
- Obtain role information from developer
- Check that program uses roles correctly
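A role constraint such as "Parked Aircraft can't take off" can be pictured as a guard on each action. The sketch below is a run-time illustration only; the talk describes a static verification. The transition order Parked, Taxiing, Cleared for Takeoff, In Flight and the action names are assumptions made for the example and are not taken from CTAS.

```c
#include <assert.h>

typedef enum { PARKED, TAXIING, CLEARED_FOR_TAKEOFF, IN_FLIGHT } Role;
typedef enum { START_TAXI, CLEAR_FOR_TAKEOFF, TAKE_OFF } Action;

/* Apply an action if the current role permits it. Returns 1 and updates
   *role on success; returns 0 and leaves *role unchanged on a violation. */
int apply_action(Role *role, Action a) {
    assert(role != 0);
    switch (a) {
    case START_TAXI:
        if (*role != PARKED) return 0;
        *role = TAXIING;
        return 1;
    case CLEAR_FOR_TAKEOFF:
        if (*role != TAXIING) return 0;
        *role = CLEARED_FOR_TAKEOFF;
        return 1;
    case TAKE_OFF:
        if (*role != CLEARED_FOR_TAKEOFF) return 0;
        *role = IN_FLIGHT;
        return 1;
    }
    return 0;
}
```

Starting from PARKED, TAKE_OFF is rejected and the role stays PARKED, while the sequence START_TAXI, CLEAR_FOR_TAKEOFF, TAKE_OFF ends in IN_FLIGHT. Role verification aims to show ahead of time that the rejected case can never occur in the program.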
112Role Classification
- Two General Kinds of Classification
- Content-based (predicate on object fields determines role)
- Relative (points-to relationships determine role)
- Role Classification is Application Dependent
[Figure: class Aircraft with roles Flying Aircraft, Parked Aircraft, Taxiing Aircraft, Cleared Aircraft]
113Standard View of Object
[Figure: an object with incoming references, fields, and outgoing references]
- Flight Plan: List of Meter Fixes
- Trajectory: Sequence Of Points
- Flight Name: String
- Runway: Runway Object
- Gate: Gate Object
114Relative Role Classification
- Points-to relationships define roles
- Specify sources of incoming edges
- Field of an object playing a given role
- Global or local variable
- Specify target of outgoing edges
- Specify available fields in each role
115Example Roles
Parked Aircraft
Flight Plan
Gate Object
Trajectory
Aircraft
Flight Name
Runway
Gate
116Example Roles
Cleared for Takeoff Aircraft
List of Meter Fixes
Flight Plan
Trajectory
String
Flight Name
Runway Object
Runway
Gate
Aircraft
117Role Verification
- Analysis Obtains
- Role Definitions
- Method Information
- Roles of parameters and globals on entry
- Role changes that method performs
- Role of return value
- Intraprocedural Analysis
- Simulates potential executions of method
- Precise abstraction of heap
- Use role information for invoked methods
- Verify correctness of role information
118Benefits of Roles
- Software Engineering Benefits
- Safety checks that take application semantics into account
- Enhanced implementation transparency
- Transformations Enabled By Precise Referencing Behavior
- Safe real-time memory management
- Parallelization and race detection for programs with linked data structures
- Optimized Atomic Transactions
119Key Issue Obtaining Role Information
- Range of Developer and Designer Involvement
- Some Involvement Reasonable and Necessary: Roles Reflect Application-Specific Properties
- Primary Focus: Role Definitions
- Determine analysis distinctions
- Relevance of extracted information
- Secondary Focus Method Specifications
- Developer specifies roles of parameters
- Analysis extracts role changes
120Design Conformance
- Software Development Activities
- Requirements
- Design
- Implementation
- Design is Partial
- Focus on Important Aspects
- Omit Many Low-Level Details
- Design and Implementation are Disconnected
- No guarantee that code conforms to design
121Goal of Design Conformance
- Establish and mechanically check conformance
- Use specific design formalism (object models)
- Boxes (objects) and Arrows (relations between objects)
[Figure: object model with Aircraft (Flying Aircraft, Parked Aircraft, Taxiing Aircraft, Cleared Aircraft), Flight Plan, and Meter Fix]
122Key Issue
- Establishing correspondence between object model and implementation
- Object models usually at a higher level of abstraction
- Many relations in object model realized as group of objects and references
- Object model may entirely omit some objects or references
- Enables designer to focus on important aspects
- But complicates path to conformance analysis
123
[Figure: Abstract Object Model (Aircraft, Flight Plan, Meter Fix), Intermediate Object Model, Concrete Object Model, Roles]
124Concretization Specifications
- Maps Between Object Models
- Enables Designer/Developer to Establish Correspondence Between Object Models
- Specify how Object Model is Realized in Code
- Foundation for design conformance analysis
- Guides implementation of object model
- Implementation patterns for object models
125Design Conformance Benefits
- Higher Confidence in Software
- Promote clean implementation of design
- Guarantee important design properties
- Design becomes useful throughout entire development cycle
- Updated as implementation changes
- Reliable source of information
- Enables more precise, relevant analysis
126Related Work
- Pointer Analysis
- Landi, Ryder, Zhang PLDI93
- Emami, Ghiya, Hendren PLDI94
- Wilson, Lam PLDI96
- Rugina, Rinard PLDI99
- Rountev, Ryder CC01
- Salcianu, Rinard PPoPP01
- Region Analysis
- Triolet, Irigoin, Feautrier PLDI86
- Havlak, Kennedy IEEE TPDS91
- Rugina, Rinard PLDI00
- Pointer Specifications
- Hendren, Hummel, Nicolau PLDI92
- Guyer, Lin LCPC00
127Related Work
- Shape Analysis CWZ90,GH96,FL97,SRW99,MS01
- Extended Type Systems
- FX/87 GJLS87
- Dependent Types XF99
- Program Verification
- ESC DLNS98
- PVS ORRSS96
- Implementations of Object Models HBR00
128Conclusion
- Developer and Designer Interact with Analysis
- Benefits
- More precise, relevant analysis
- Verify key safety and design properties
- Enhance utility of design
- Enable powerful transformations
- Key Issue
- Determining appropriate abstractions to leverage
- Access regions, roles, object models
- Abstractions Share Several Features
- Identify important properties of data
- Relate properties of data to behavior of
computation