Refinement-Based Context-Sensitive Points-To Analysis for Java

1
Refinement-Based Context-Sensitive Points-To Analysis for Java
  • Manu Sridharan, Rastislav Bodík
  • UC Berkeley
  • PLDI 2006

2
What Does Refinement Buy You?
  • Increased scalability: enables new clients
  • Memory: orders-of-magnitude savings
  • Time: answer for a variable comes back in under 1 second
  • ⇒ Suitable for use in an IDE

[Chart: precision for the cast safety client]
3
Approach: Focus on the Client
  • Demand-driven: only do the requested work
  • Client-driven refinement: stop when the client is satisfied
  • Example:
    • Client asks: can x point to o?
    • We refine until we answer NO (the good answer) or we time out
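The client-driven loop in the example can be sketched as follows. This is a minimal illustration, not the authors' implementation: `refine_until_satisfied`, `toy_query`, and the integer "refinement level" are hypothetical names, and a fixed pass budget stands in for the timeout.

```python
from typing import Callable, Optional


def refine_until_satisfied(may_point_to: Callable[[int], bool],
                           max_passes: int) -> Optional[bool]:
    """Re-run a query at increasing refinement levels.

    Returns False as soon as one pass proves "x cannot point to o".
    Returns None when the budget runs out, meaning the client must
    keep the conservative answer "x may point to o".
    """
    for level in range(max_passes):
        if not may_point_to(level):
            return False  # the good answer: stop refining immediately
    return None  # budget exhausted (stands in for the timeout)


def toy_query(level: int) -> bool:
    # Hypothetical analysis: only precise enough to say NO at level 2.
    return level < 2


answer = refine_until_satisfied(toy_query, max_passes=5)
gave_up = refine_until_satisfied(lambda level: True, max_passes=3)
```

Here `answer` is `False` after three passes, while `gave_up` is `None` because no level ever rules out the flow.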

4
Context-Sensitive Analysis Is Costly
  • Context-sensitive analysis (definition):
    • Compute the result as if all calls were inlined
    • But collapse recursive methods
  • Exponential blowup (code growth)

5
Why Not Existing Techniques?
  • Most analyses approximate the same way in all code
    • E.g., k-CFA
    • Precision is lost, especially for data structures
  • Our analysis focuses precision where it matters
    • Fully precise in the limit
    • Only a small amount of code is analyzed precisely
    • First refinement algorithm for Java

6
Points-To Analysis Overview
  • Compute the objects each variable can point to
    • For each variable x, a points-to set pt(x)
  • Model objects with abstract locations
    • 1: x = new Foo() yields pt(x) = {o1}
  • Flow-insensitive: statements may execute in any order
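To make pt(x) and flow-insensitivity concrete, here is a small sketch that handles only allocations and copies. It is a plain context-insensitive fixpoint, not the refinement-based analysis of this talk; the tuple encoding of statements and the name `points_to` are assumptions made for illustration.

```python
from collections import defaultdict


def points_to(statements):
    """Fixpoint over ("new", var, obj) and ("copy", dst, src) statements."""
    pt = defaultdict(set)
    changed = True
    while changed:  # repeat until no points-to set grows
        changed = False
        for kind, lhs, rhs in statements:
            incoming = {rhs} if kind == "new" else pt[rhs]
            if not incoming <= pt[lhs]:
                pt[lhs] |= incoming
                changed = True
    return pt


# The copy is listed before the allocation on purpose: the result does
# not depend on statement order.
program = [
    ("copy", "z", "x"),  # z = x
    ("new", "x", "o1"),  # x = new Obj()  // o1
    ("new", "y", "o2"),  # y = new Obj()  // o2
]
pt = points_to(program)
```

Even though `z = x` appears first, the fixpoint still gives `pt["z"] == {"o1"}`.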

7
Points-To Analysis as CFL-Reachability
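1) Assignments:
     x = new Obj(); // o1
     y = new Obj(); // o2
     z = x;
2) Method calls:
     id(p) { return p; }
     a = id(x); // call site 1
     b = id(y); // call site 2
3) Heap accesses:
     c.f = x;
     c.g = y;
     d = c.f;

[Figure: flow graph over o1, o2, x, y, z, a, b, c, d, p_id, and ret_id; call edges are labeled (1, )1, (2, )2 and heap edges are labeled f and g]
  • pt(x) = { o | o flowsTo x }
  • flowsTo: a path exists
  • flowsTo: a path with balanced call parens exists
  • flowsTo: a path with balanced call and field parens exists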
8
Summary of Formulation
  • Graph represents program
  • Compute reachability with two filters
  • Language of balanced call parens
  • Language of balanced field parens
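The call-paren filter can be illustrated on the id() example from the previous slide. The sketch below is an assumption-laden toy, not the algorithm from the talk: the edge encoding and the name `flows_to` are hypothetical, field parens are left out, a close paren on an empty stack is allowed, and recursion (which would let the stack grow without bound) is assumed to be collapsed beforehand.

```python
def flows_to(edges, source, check_calls=True):
    """Nodes reachable from `source`, optionally requiring that call
    parens along the path match (open "(" i must meet close ")" i)."""
    reached, seen = set(), set()
    work = [(source, ())]  # (node, stack of open call sites)
    while work:
        node, stack = work.pop()
        if (node, stack) in seen:
            continue
        seen.add((node, stack))
        reached.add(node)
        for src, dst, label in edges:
            if src != node:
                continue
            if label is None or not check_calls:
                work.append((dst, stack))                # plain assignment
            elif label[0] == "(":
                work.append((dst, stack + (label[1],)))  # enter a call
            elif not stack:
                work.append((dst, stack))                # unmatched return
            elif stack[-1] == label[1]:
                work.append((dst, stack[:-1]))           # matching return
    return reached


# id(p) { return p; }  a = id(x);  b = id(y);
edges = [
    ("o1", "x", None), ("o2", "y", None), ("x", "z", None),
    ("x", "p", ("(", 1)), ("y", "p", ("(", 2)),
    ("p", "ret", None),
    ("ret", "a", (")", 1)), ("ret", "b", (")", 2)),
]
precise = flows_to(edges, "o1")
imprecise = flows_to(edges, "o1", check_calls=False)
```

With the filter on, o1 reaches a but not b; with it off, plain reachability also reports the spurious flow from o1 to b.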

9
Single path problem


[Figure: graph of a single path from o to x through temporaries t0 to t12, with field edges (f, g, h, j, k) and call edges such as (1, )1, )5, (7, and )8]
  • Problem: show that the path is unbalanced
  • Goal: reduce the number of visited edges
  • Insight: it is enough to find one unbalanced paren
10
Approximation via Match Edges
[Figure: path from o to x through t0 to t4 with field parens f g h h j f; match edges connect the matched parens]
  • Match edges connect matched field parens
  • From source of open to sink of close
  • Initially, all pairs connected
  • Use match edges to skip subpaths

11
Refining the Approximation
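[Figure: the same path from o to x with field parens f g h h j f, after the outer match edge is removed; the h pair is still skipped by its match edge]
  • Refine by removing some match edges
    • Exposes more of the original path for checking
  • Soundness: traversing a match edge ⇒ assume field parens are balanced on the skipped path
  • Remove match edges where unbalanced parens are expected
    • Explore deeper levels of pointer indirection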
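A rough way to see refinement at work is to check the field-paren string f g h h j f from the figure at increasing depth. This is a simplified sketch under stated assumptions: nesting depth stands in for "which match edges have been removed" (the actual analysis removes match edges selectively), and `may_be_balanced` and the token encoding are hypothetical.

```python
def may_be_balanced(path, depth):
    """Check field parens only down to nesting level `depth`.

    Deeper parens are skipped, as if a match edge jumped over them,
    so they are assumed balanced (the sound, conservative choice).
    """
    stack = []  # fields of the opens that were actually checked
    level = 0   # current nesting level
    for kind, field in path:
        if kind == "open":
            level += 1
            if level <= depth:
                stack.append(field)
        else:
            if level == 0:
                return False  # close paren with no open at all
            if level <= depth and stack.pop() != field:
                return False  # found one unbalanced paren: no flow
            level -= 1
    return level == 0


# Field parens f g h h j f: the open g is closed by j, so the path
# is unbalanced, but only a deep enough check can see that.
path = [("open", "f"), ("open", "g"), ("open", "h"),
        ("close", "h"), ("close", "j"), ("close", "f")]
coarse = may_be_balanced(path, depth=1)   # only the outer f pair checked
refined = may_be_balanced(path, depth=2)  # g versus j is now exposed
```

At depth 1 the path still looks balanced (`coarse` is `True`); one more level of refinement exposes the g/j mismatch (`refined` is `False`) without ever inspecting the h pair.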

12
Refinement With Both Languages
[Figure: path from o to x through t0 to t6]
Fields: f g g f
Calls: (1 )1 (2 )3
  • Match edges enable approximation of calls
    • Calls can only be checked on match-free subpaths
    • Match edge removal ⇒ more call checking
  • Key point: refine heap and calls together

13
Evaluation
14
Experimental Configuration
  • Implemented in the Soot framework
  • Tested on large benchmarks × 2 clients
    • Benchmarks: SPECjvm98, DaCapo suite
    • Clients: downcast checking, factory method properties
  • Refines a context-insensitive result
  • Timeout for long-running queries

15
Precision: Cast Checking
16
Scalability: Time and Memory
  • Average query time less than 1 second
    • Interactive performance (for IDE)
    • At most 13 minutes for casts, 4 minutes for the factory client
  • Very low memory usage: at most 35MB
    • Of this, 30MB for the context-insensitive result
    • Compare with >2GB for 1-ObjSens analysis

17
Demand-Driven vs. Exhaustive
  • Demand advantage: no caching required
    • Hence, low memory overhead
    • No engineering of efficient sets
    • Good for changing code: just re-compute
  • Demand advantage: faster for many clients
    • Often only care about some variables
  • Demand disadvantage: slower when querying all variables
    • At most 90 minutes for all application variables
    • But still good precision and memory usage

18
Conclusions
  • Novel refinement-based analysis
    • More precise for tested clients
    • Interactive performance for queries
    • Low memory: could scale even more
    • Relatively easy to implement
  • Insight: refine heap and calls together
    • Useful for other balanced-paren analyses?