Probabilistic Pointer Analysis (PPA)

1
Probabilistic Pointer Analysis PPA
  • Presented by Jeff DaSilva

CARG April 12, 2005
2
The Pointer Alias Analysis Problem
A B
  • Statically decide for any pair of pointers, at
    any point in the program, whether two pointers
    point to the same memory location.

This problem is known to be undecidable (Landi, 1992).
3
The Compiler Writer vs. The Programming Language
Designer
  • Pointers are needed by programmers to realize
    complex data structures
  • Pointers can make life difficult for optimizing
    compilers

4
Concluding Remarks (from last time)
  • Traditional pointer analysis techniques are
    either overly conservative or are so complex that
    they fail to scale with respect to code size
  • Examples include address-taken, Andersen's
    analysis, Steensgaard, and Emami
  • Pointer analysis is a very difficult problem that
    may never be adequately solved.
  • Does hardware support for data speculation make
    the analysis easier for the compiler?

5
Support for Data Speculation Exists
  • EPIC instruction sets
  • Uses explicit load/store speculation
  • Thread Level Speculation (TLS)
  • Speculative Compiler Optimizations
  • Speculative PRE, register promotion and strength
    reduction
  • Speculative behavioral synthesis

Why are these techniques not more widely used?
One reason: currently, they rely on data
dependence profiling.
6
Example: Speculative Compiler Optimization
7
Example: Thread Level Speculation (TLS)
Can this loop be parallelized?
To take advantage of data speculation support,
a probability metric and a cost function are
required.
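
As a rough sketch of how a probability metric and a cost function could combine into a speculation decision: the function names and the cycle counts below are hypothetical, chosen only for illustration, and are not taken from the presentation.

```python
def expected_speculation_benefit(p_alias, gain, penalty):
    """Expected cycles saved by speculating past a possibly aliasing store.

    p_alias : estimated probability that the load and the store alias
    gain    : cycles saved when the speculation succeeds (hypothetical)
    penalty : cycles lost to recovery when it fails (hypothetical)
    """
    return (1.0 - p_alias) * gain - p_alias * penalty


def should_speculate(p_alias, gain, penalty):
    """Speculate only when the expected benefit is positive."""
    return expected_speculation_benefit(p_alias, gain, penalty) > 0.0


# Illustrative values only: a rarely aliasing pair is worth speculating on,
# a frequently aliasing pair is not.
rare = should_speculate(p_alias=0.04, gain=10.0, penalty=100.0)
frequent = should_speculate(p_alias=0.60, gain=10.0, penalty=100.0)
print(rare, frequent)
```

A real cost function would likely depend on the target (EPIC, TLS) and on how recovery is implemented; the linear expected-value form here is just the simplest choice.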
8
Probabilistic Pointer Analysis (PPA)
A B
  • Statically decide for any pair of pointers, at
    any point in the program, whether two pointers
    point to the same memory location.
  • Statically estimate for any pair of pointers, at
    any point in the program, the probability that
    two pointers point to the same memory location.

This problem is known to be undecidable (Landi, 1992).
Isn't this problem even worse?
9
Outline
  • PPA Objectives
  • Probabilistic Pointer Analysis (PPA) Theory
  • The Probabilistic Points-To Graph
  • The Points-To Matrix
  • The Transformation Matrix
  • An Example
  • Some Preliminary Results

10
How is Pointer Analysis used?
[Diagram: Source Code → Pointer Analysis → Executable]
11
Traditional Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z;
12
Traditional Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z;
x = &r; y = &s; z = &t; r = &a; s = &b; t = &c;

13
Traditional Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z;
x = &r; y = &s; z = &t; r = &a; s = &b; t = &c;
if (...) { x = &s; s = &c; }
14
Traditional Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z;
x = &r; y = &s; z = &t; r = &a; s = &b; t = &c;
if (...) { x = &s; s = &c; }
r = &b; z = &r;
15
Traditional Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z;
x = &r; y = &s; z = &t; r = &a; s = &b; t = &c;
if (...) { x = &s; s = &c; }
r = &b; z = &r;
if (...) y = x;
16
Traditional Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z;
x = &r; y = &s; z = &t; r = &a; s = &b; t = &c;
if (...) { x = &s; s = &c; }
r = &b; z = &r;
if (...) y = x;
*x = &a;
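
A minimal sketch of how a traditional, non-probabilistic may-points-to graph could be built for this example. The statement sequence is a reading of the slide code (its operators were lost in the transcript), and the helper names are hypothetical. Branches are merged by set union, which is where the "maybe" edges come from.

```python
def assign_addr(state, p, v):
    """p = &v : strong update, p now points only to v."""
    new = {k: set(t) for k, t in state.items()}
    new[p] = {v}
    return new


def copy_ptr(state, p, q):
    """p = q : p takes over every target of q."""
    new = {k: set(t) for k, t in state.items()}
    new[p] = set(state[q])
    return new


def store_addr(state, p, v):
    """*p = &v : strong update if p has one target, weak update otherwise."""
    new = {k: set(t) for k, t in state.items()}
    targets = state[p]
    for t in targets:
        new[t] = {v} if len(targets) == 1 else state.get(t, set()) | {v}
    return new


def merge(s1, s2):
    """Join two branch states at a control-flow merge by set union."""
    return {k: s1.get(k, set()) | s2.get(k, set()) for k in s1.keys() | s2.keys()}


state = {}
for p, v in [("x", "r"), ("y", "s"), ("z", "t"), ("r", "a"), ("s", "b"), ("t", "c")]:
    state = assign_addr(state, p, v)

# if (...) { x = &s; s = &c; }
taken = assign_addr(assign_addr(state, "x", "s"), "s", "c")
state = merge(taken, state)

state = assign_addr(state, "r", "b")   # r = &b
state = assign_addr(state, "z", "r")   # z = &r

# if (...) y = x;
state = merge(copy_ptr(state, "y", "x"), state)

state = store_addr(state, "x", "a")    # *x = &a (weak: x has two targets)

for pointer in sorted(state):
    print(pointer, "->", sorted(state[pointer]))
```

Every edge in the result is only a "may point to" fact; nothing says how likely each edge is.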
17
How is PPA used?
[Diagram: Source Code → Probabilistic Pointer Analysis (with optional Control Flow Edge Profiling) → Speculative Executable with Data Speculation Support]
18
Definition: Probability Analysis
  • Let <p,v> denote a points-to relationship from a
    pointer p to a location v.
  • At every static program point s there exists a
    probability function P(s, <p,v>) that denotes the
    probability that p points to v during dynamic
    program execution.

P(s, <p,v>) = D(s, <p,v>) / D(s)
  • Where D(s) is the number of times s is (expected
    to be) dynamically visited and D(s, <p,v>) is the
    number of times that the points-to relation <p,v>
    dynamically holds at s.
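
A minimal sketch of this definition, assuming a hypothetical execution trace in which every dynamic visit to program point s records the location that p points to. The trace format and the function name are illustrative, not part of the presentation.

```python
from collections import Counter


def points_to_probability(observations):
    """Estimate P(s, <p,v>) = D(s, <p,v>) / D(s) for one pointer p at one point s.

    `observations` holds one entry per dynamic visit to s: the location that
    p pointed to on that visit (hypothetical trace format).
    """
    visits = len(observations)          # D(s)
    if visits == 0:
        return {}                       # s was never reached
    counts = Counter(observations)      # D(s, <p,v>) for each location v
    return {v: n / visits for v, n in counts.items()}


# Hypothetical trace: s is visited 5 times; p points to a on 3 visits, to b on 2.
probs = points_to_probability(["a", "a", "b", "a", "b"])
print(probs)
```

PPA itself estimates these values statically; a trace like this would only serve as a reference to compare the static estimate against.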

19
The Algorithm Should Output a Safe, Conservative
May-Alias Probability
  • A probability of 0.0, P(s, <p,v>) = 0.0, indicates
    that the points-to relation <p,v> will never hold.
  • The converse is not necessarily true.
  • A probability of 1.0, P(s, <p,v>) = 1.0, indicates
    that the points-to relation <p,v> will always hold.
  • The converse is also not necessarily true: a
    dynamic points-to relationship <p,v> that always
    exists may not be reported with a probability of
    1.0.

20
Probabilistic Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z;
x = &r; y = &s; z = &t; r = &a; s = &b; t = &c;

[Graph: six edge labels, each 1.0]
21
Probabilistic Points-To Graph
int a, b, c; int *r, *s, *t; int **x, **y, **z;
x = &r; y = &s; z = &t; r = &a; s = &b; t = &c;
if (...) /* 60% taken */ { x = &s; s = &c; }
[Graph edge probabilities: 0.4, 0.6, 0.6, 0.4]
22
Probabilistic Points-To Graph
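int a, b, c; int *r, *s, *t; int **x, **y, **z;
x = &r; y = &s; z = &t; r = &a; s = &b; t = &c;
if (...) /* 60% taken */ { x = &s; s = &c; }
r = &b; z = &r;
[Graph edge probabilities: 0.6, 0.4, 0.6, 0.4]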
23
Probabilistic Points-To Graph
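int a, b, c; int *r, *s, *t; int **x, **y, **z;
x = &r; y = &s; z = &t; r = &a; s = &b; t = &c;
if (...) /* 60% taken */ { x = &s; s = &c; }
r = &b; z = &r;
if (...) /* 10% taken */ y = x;
[Graph edge probabilities: 0.04, 0.96, 0.6, 0.4, 0.6, 0.4]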
24
Probabilistic Points-To Graph
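int a, b, c; int *r, *s, *t; int **x, **y, **z;
x = &r; y = &s; z = &t; r = &a; s = &b; t = &c;
if (...) /* 60% taken */ { x = &s; s = &c; }
r = &b; z = &r;
if (...) /* 10% taken */ y = x;
*x = &a;
[Graph: nodes x, y, z, r, s, t, a, b, c; edge probabilities: 0.04, 0.6, 0.96, 0.4, 0.4, 0.16, 0.4, 0.24, 0.6, 0.6, 0.6]
What is the probability that y points to a?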
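
A sketch of how the edge probabilities on this slide could be derived, under several assumptions: the statement sequence is a reading of the slide code (its operators were lost in the transcript), branch outcomes are treated as independent of the points-to state, and the helper names are hypothetical. The final function reads the question as "what does *y point to", which is only one possible interpretation.

```python
def assign_addr(state, p, v):
    """p = &v : p points to v with probability 1.0."""
    new = {k: dict(d) for k, d in state.items()}
    new[p] = {v: 1.0}
    return new


def copy_ptr(state, p, q):
    """p = q : p inherits q's distribution."""
    new = {k: dict(d) for k, d in state.items()}
    new[p] = dict(state[q])
    return new


def store_addr(state, p, v):
    """*p = &v : each target t of p is overwritten with probability P(p -> t)."""
    new = {k: dict(d) for k, d in state.items()}
    for t, w in state[p].items():
        updated = {u: (1.0 - w) * pr for u, pr in state.get(t, {}).items()}
        updated[v] = updated.get(v, 0.0) + w
        new[t] = updated
    return new


def merge(taken, not_taken, prob_taken):
    """Weight the two branch states by the branch probability."""
    out = {}
    for p in taken.keys() | not_taken.keys():
        dist = {}
        for v, pr in taken.get(p, {}).items():
            dist[v] = dist.get(v, 0.0) + prob_taken * pr
        for v, pr in not_taken.get(p, {}).items():
            dist[v] = dist.get(v, 0.0) + (1.0 - prob_taken) * pr
        out[p] = dist
    return out


def deref_probability(state, p, v):
    """Probability that *p points to v, assuming the two hops are independent."""
    return sum(w * state.get(t, {}).get(v, 0.0) for t, w in state[p].items())


st = {}
for p, v in [("x", "r"), ("y", "s"), ("z", "t"), ("r", "a"), ("s", "b"), ("t", "c")]:
    st = assign_addr(st, p, v)
st = merge(assign_addr(assign_addr(st, "x", "s"), "s", "c"), st, 0.6)  # 60% taken
st = assign_addr(st, "r", "b")
st = assign_addr(st, "z", "r")
st = merge(copy_ptr(st, "y", "x"), st, 0.1)                            # 10% taken
st = store_addr(st, "x", "a")

for pointer in sorted(st):
    print(pointer, {v: round(pr, 3) for v, pr in sorted(st[pointer].items())})
print("P(*y -> a) =", round(deref_probability(st, "y", "a"), 3))
```

The derived edges for x, y, r and s agree with the values 0.04, 0.96, 0.16 and 0.24 shown on the slide. The two-hop figure is an illustration under the independence assumption, not a result quoted from the presentation.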
25
Our PPA Algorithm Objectives
  • An Interprocedural, Flow Sensitive, Context
    Sensitive approach that uses Linear Transfer
    Functions.
  • Must be scalable in time and space.
  • Provides an approximate probability for the
    Maybe output.

26
Accuracy/Efficiency Tradeoff
  • Doubly Exponential
  • Accurate: very few "maybe" outputs
  • Does not scale

BDD based
Chen et al. (only other PPA)
Address-taken
Andersen
SPAN
Steensgaard
Emami
  • Linear Time Complexity
  • Inaccurate: many false "maybe" outputs
  • Memory required: negligible

27
Encoding the Probabilistic Points-To Graph: The
Points-To Matrix
  • The probabilistic points-to graph is encoded
    using a sparse Markov matrix
  • All elements are real numbers in the closed
    interval [0, 1]
  • All rows sum to 1.0
  • Each pointer set and location set is given a
    unique id representing its matrix row and column
  • Rules for linearity:
  • Pointers can only point to location sets
  • Location sets always point to themselves with
    probability 1.0
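
The encoding rules above can be sketched as follows. This is an illustration only: it uses a dense list-of-lists instead of a sparse matrix, 0-based ids with pointer sets numbered before location sets, and hypothetical function names and probabilities.

```python
def build_points_to_matrix(num_pointers, num_locations, edges):
    """Build an N x N points-to matrix, N = num_pointers + num_locations.

    `edges` maps (pointer_id, location_id) to a probability. Ids are 0-based,
    pointer sets first, location sets after them (an assumed numbering).
    """
    n = num_pointers + num_locations
    X = [[0.0] * n for _ in range(n)]
    for (pointer_id, location_id), prob in edges.items():
        X[pointer_id][location_id] = prob
    for loc in range(num_pointers, n):
        X[loc][loc] = 1.0               # location sets point to themselves
    return X


def is_valid_points_to_matrix(X, num_pointers, tol=1e-9):
    """Check the rules: entries in [0, 1], rows sum to 1, linearity rules."""
    for i, row in enumerate(X):
        if any(e < 0.0 or e > 1.0 for e in row):
            return False
        if abs(sum(row) - 1.0) > tol:
            return False
        if i < num_pointers:
            # pointers can only point to location sets
            if any(row[j] != 0.0 for j in range(num_pointers)):
                return False
        elif row[i] != 1.0:
            return False
    return True


# Two pointer sets (ids 0, 1) and three location sets (ids 2, 3, 4).
X = build_points_to_matrix(2, 3, {(0, 2): 0.6, (0, 3): 0.4, (1, 4): 1.0})
print(is_valid_points_to_matrix(X, 2))

# A pointer row that sums to 0.9 violates the Markov property.
broken = build_points_to_matrix(2, 3, {(0, 2): 0.5, (0, 3): 0.4, (1, 4): 1.0})
print(is_valid_points_to_matrix(broken, 2))
```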

28
Points-To Matrix Structure
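[Figure: N x N points-to matrix with rows and columns numbered 1..N, pointer sets first and location sets after them. Pointer-set rows: a zero block (ø) followed by the area of interest. Location-set rows: a zero block (ø) followed by an identity block (I).]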
29
Points-To Matrix Example
30
Another Points-To Matrix Example
31
What about double pointers?
32
Basic Pointer Assignments
33
Transforming the points-to matrix
  • Let Xs represent the probabilistic points-to
    matrix at a specific program point s.

XIN
Basic pointer assignment instruction
XOUT
  • Claim: There exists a transformation function
    Ti(X) for every instruction i, such that XOUT =
    Ti(XIN).

34
The fundamental PPA Equation
XOUT = T · XIN
where XOUT is the points-to matrix out, T is the
transformation matrix, and XIN is the points-to matrix in.
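
A small sketch of this equation as a plain matrix product. The 5 x 5 matrices assume two pointer sets (rows 1 and 2) and three location sets (rows 3 to 5). Reading the transformation matrix as "pointer 1 = address of location 3" is an interpretation, and the function name is hypothetical.

```python
def matmul(A, B):
    """Plain-Python product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Points-to matrix in: both pointers point to location 5.
X_in = [
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]

# Transformation matrix: row 1 selects row 3 of X_in, all other rows are identity.
T = [
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]

X_out = matmul(T, X_in)   # XOUT = T · XIN
for row in X_out:
    print(row)
```

After the multiplication, pointer 1 points to location 3 and pointer 2 is unchanged. The location rows stay an identity block, so the result is again a valid points-to matrix.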

35
Transformation Matrix Structure
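[Figure: N x N transformation matrix with rows and columns numbered 1..N, pointer sets first and location sets after them. Pointer-set rows: the area of interest. Location-set rows: a zero block (ø) followed by an identity block (I).]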
36
Example
37
Example
0.0 0.0 0.0 0.0 1.0
0.0 0.0 0.0 0.0 1.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0

0.0 0.0 1.0 0.0 0.0
0.0 1.0 0.0 0.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0
38
Example
0.0 0.0 0.0 0.0 1.0
0.0 0.0 0.0 0.0 1.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0

0.0 0.0 1.0 0.0 0.0
0.0 1.0 0.0 0.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0

39
Combining Transformation Matrices
XIN
T1 Basic pointer assignment instruction
T2 Basic pointer assignment instruction
XOUT
  • XOUT = T2 · T1 · XIN

40
Combining Transformation Matrices Example
0.0 0.0 1.0 0.0 0.0
0.0 1.0 0.0 0.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0
41
Combining Transformation Matrices Example
0.0 0.0 1.0 0.0 0.0
0.0 1.0 0.0 0.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0

1.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0
42
Combining Transformation Matrices Example
0.0 0.0 1.0 0.0 0.0
0.0 1.0 0.0 0.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0

1.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0

0.001 0.01 0.989 0.0 0.0
0.0 0.001 0.999 0.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0
43
Combining Transformation Matrices Example
0.0 0.0 1.0 0.0 0.0
0.0 1.0 0.0 0.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0

1.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0

0.001 0.01 0.989 0.0 0.0
0.0 0.001 0.999 0.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0

0.0 0.0 0.99 0.01 0.0
0.0 0.0 0.999 0.001 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0

44
Combining Transformation Matrices Example
0.0 0.0 0.99 0.01 0.0
0.0 0.0 0.999 0.001 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0
45
Combining Transformation Matrices Example
0.0 0.0 0.0 0.0 1.0
0.0 0.0 0.0 0.0 1.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0

0.0 0.0 0.99 0.01 0.0
0.0 0.0 0.999 0.001 0.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
0.0 0.0 0.0 0.0 1.0
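
The matrices in this worked example can be checked mechanically. The sketch below reads each matrix as 5 x 5, row by row, from the values on the slides. The names T1, T2, T3 and T_slide are labels chosen here, and the multiplication order T3 · T2 · T1 follows XOUT = T2 · T1 · XIN.

```python
def matmul(A, B):
    """Plain-Python product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def close(A, B, tol=1e-9):
    """True when two matrices agree entry by entry within `tol`."""
    return all(abs(a - b) <= tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


LOCATION_ROWS = [[0.0, 0.0, 1.0, 0.0, 0.0],
                 [0.0, 0.0, 0.0, 1.0, 0.0],
                 [0.0, 0.0, 0.0, 0.0, 1.0]]

# Three transformation matrices, in the order they appear on the slides.
T1 = [[0.0, 0.0, 1.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0]] + LOCATION_ROWS
T2 = [[1.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0, 0.0]] + LOCATION_ROWS
T3 = [[0.001, 0.01, 0.989, 0.0, 0.0], [0.0, 0.001, 0.999, 0.0, 0.0]] + LOCATION_ROWS

# Combined matrix as shown on the slides.
T_slide = [[0.0, 0.0, 0.99, 0.01, 0.0], [0.0, 0.0, 0.999, 0.001, 0.0]] + LOCATION_ROWS

T_combined = matmul(T3, matmul(T2, T1))
print(close(T_combined, T_slide))

# Apply the combined transformation to the input points-to matrix.
X_in = [[0.0, 0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 0.0, 1.0]] + LOCATION_ROWS
X_out = matmul(T_combined, X_in)
for row in X_out:
    print([round(e, 3) for e in row])
```

Because matrix multiplication is associative, a sequence of instructions can be collapsed into one matrix once and then reused for any input points-to matrix.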

46
How to handle Control Flow?
[Figure: control-flow graph with nodes 1, 2, 3, 4 and edge labels 0.5, 0.5 and 10, summarized as a transfer function TF_foo]
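
The figure for this slide did not survive, so the following is only a guess at one piece of the approach: forming the transformation matrix of an if-statement as a probability-weighted sum of its two paths. It is consistent with the "60% taken" example on the probabilistic points-to graph slides, but the function names and the small 3 x 3 setup (pointer x, locations r and s) are assumptions. Loops are not covered.

```python
def matmul(A, B):
    """Plain-Python product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """Transformation matrix of a path that changes no pointer."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def merge_branches(T_taken, T_not_taken, prob_taken):
    """Weighted sum of the two path matrices, weighted by edge probability."""
    n = len(T_taken)
    return [[prob_taken * T_taken[i][j] + (1.0 - prob_taken) * T_not_taken[i][j]
             for j in range(n)] for i in range(n)]


# Row/column order: x (pointer), r (location), s (location).
X_in = [[0.0, 1.0, 0.0],      # x points to r
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0]]

T_assign = [[0.0, 0.0, 1.0],  # x = &s
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

# if (...) /* 60% taken */ x = &s;
T_if = merge_branches(T_assign, identity(3), 0.6)
X_out = matmul(T_if, X_in)
print([round(e, 3) for e in X_out[0]])
```

The merged matrix keeps every row summing to 1.0, so it can be composed with other transformation matrices like any single-instruction matrix.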
47
PPA Infrastructure
Suif Infrastructure
.spd files
.spx files
PPA Results
ICFG
Points-To Matrix Propagator TD
Edge Profile
Abstract Memory Model (AMM)
Transformation Matrix Collector BU
Transfer Function Builder (Suif2TF)
PPA Matrix Builder
MATLAB C Library
MATLAB Debugging Script
48
Current Results
49
Current Results
50
The End
  • Any Comments or Questions?

51
References
  • Peng-Sheng Chen, Ming-Yu Hung, Yuan-Shin Hwang,
    Roy Dz-Ching Ju, and Jenq Kuen Lee. Compiler
    Support for Speculative Multithreading
    Architecture with Probabilistic Points-to
    Analysis. PPoPP 2003, pp. 25-36. (The only other
    PPA research group.)
  • Jin Lin, Tong Chen, Wei-Chung Hsu, Peng-Chung
    Yew, Roy Dz-Ching Ju, Tin-Fook Ngai, and Sun Chan.
    A Compiler Framework for Speculative Analysis
    and Optimizations. In Proceedings of the ACM
    SIGPLAN 2003 Conference on Programming Language
    Design and Implementation, San Diego, California,
    June 9-11, 2003, pp. 289-299.
  • Roy Dz-Ching Ju. Probabilistic Memory
    Disambiguation and its Application to Data
    Speculation (1996). (Probabilistic array
    subscript analysis.)
  • Manel Fernandez and Roger Espasa. Speculative
    Alias Analysis for Executable Code (2002).

52
BACKUP SLIDES
53
Does Probabilistic Aliasing Exist?
54
Does Probabilistic Aliasing Exist?
55
Does Probabilistic Aliasing Exist?
56
Does Probabilistic Aliasing Exist?
57
Does Probabilistic Aliasing Exist?
58
Does Probabilistic Aliasing Exist?
59
Probabilistic Dependence Matrix
Compress95, ref input set. Flow-sensitive
analysis, location-oriented DDA. 85x85 static
load/store pairs.
60
Pointer Analysis Issues
  • Scalability vs. Accuracy
  • Generally, a difficult tradeoff exists between
    the amount of computation and memory required and
    the accuracy of the analysis.
  • Precision/Efficiency tradeoff, where is the sweet
    spot?
  • Which metric should be used?
  • Direct metric
  • Report performance applied to an optimization
  • Dynamically measure false positives
  • Which benchmark suite?
  • Are the results reproducible?

61
Pointer Analysis Issues
  • Complications associated with pointer arithmetic,
    casting, function pointers, long jumps, and
    multithreaded applications.
  • Can these be ignored?
  • Different pointer analysis uses have different
    needs.
  • A universal pointer analysis probably doesn't
    exist.

62
Optimizing Compilers 101
  • Optimizing compilers must preserve program
    correctness
  • All compiler algorithms (code transformations)
    are developed within this rule
  • What if this rule were relaxed?
  • If compilers are not bound by this rule of
    program correctness, every optimization that was
    originally created within it can be reevaluated.
  • If given a framework for speculation recovery
    (hardware or software), this rule becomes
    flexible.

63
Compiler Support for Speculation
  • Control Speculation
  • Executing instructions before determining that
    they will execute in the normal flow of
    execution.
  • Existing compiler frameworks can adequately
    incorporate and exploit control speculation using
    control flow profiling or simple heuristic rules.
  • Data speculation
  • Executing loads before potentially dependent
    stores
  • Very little has been done to effectively exploit
    data speculation.
  • Data dependence profiling is expensive
  • No effective heuristics exist

64
Pointer Analysis
  • Pointer analysis is a critical compiler component
    used to analyze programs written in C-like
    programming languages, which utilize pointers and
    pointer-based data structures
  • It attempts to disambiguate indirect memory
    references, so that subsequent compiler passes
    have a more accurate view of program behaviour.

65
How is Pointer Analysis used?
Parallelizing Compiler: Can the loop be
parallelized? (TLP)
void foo(int *a, int *b) {
  for (i = 1; i < N; i++) a[i] = b[i] / 13;
  for (i = 1; i < N; i++) *a = *b + 1;
}
Guide Compiler Optimizations: load/store
redundancy elimination, register allocation,
CSE, dead code elimination, live variable analysis,
instruction scheduling (e.g. VLIW, ILP), loop
invariant code motion, etc.
Behavioral Synthesis / Data Flow Processors:
Necessary for partitioning and instruction
scheduling.
Error Detection / Program Understanding: The
programmer can detect errors or discover poorly
written code.
66
Pointer Analysis Design Choices
  • Flow/Path sensitivity
  • Context sensitivity
  • Heap modeling
  • Aggregate modeling
  • Alias representation
  • Whole Program
  • Incremental Compilation

67
How is Pointer Analysis used?
Parallelizing Compiler: Can the loop be
parallelized? (TLP)
void foo(int *a, int *b) {
  for (i = 1; i < N; i++) a[i] = b[i-1];
  for (i = 1; i < N; i++) *a = *b + 1;
}
Guide Compiler Optimizations: load/store
redundancy elimination, register allocation,
CSE, dead code elimination, live variable analysis,
instruction scheduling (e.g. VLIW, ILP), loop
invariant code motion, etc.
Behavioral Synthesis / Data Flow Processors:
Necessary for partitioning and instruction
scheduling.
Error Detection / Program Understanding: The
programmer can detect errors or discover poorly
written code.
Any more?
68
Pointer Analysis Published by Year
Haven't we solved this problem yet?