Dave Binkley and Mark Harman - PowerPoint PPT Presentation

Slides: 43
Provided by: markh61
Transcript and Presenter's Notes

1
Dave Binkley and Mark Harman
  • Locating Dependence Clusters and Dependence
    Pollution
  • Preview of ICSM '05 talk

2
Overview
  • Dependence and slicing
  • Monotone Slice Size Graphs (MSGs)
  • Approximating similarity
  • Verification: Does the approximation work?
  • Validation: Do the clusters occur in real code?
  • Pollution
  • Refactoring

3
Dependence
4
(No Transcript)
5
(No Transcript)
6
(No Transcript)
7
(No Transcript)
8
program slicing
the essential idea ...
which other lines affect the selected line?
don't waste time on the grey part
reuse only the red part
debugging
re-use
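
The question "which other lines affect the selected line?" can be illustrated as reachability over a dependence graph. The following is a minimal sketch, not the authors' tooling: the graph `deps`, the line numbers, and the function name `backward_slice` are all hypothetical, and a real slicer would build the graph from control and data dependence analysis.

```python
from collections import deque

def backward_slice(depends_on, criterion):
    """Return the set of lines that may affect `criterion` (including itself)."""
    seen = {criterion}
    queue = deque([criterion])
    while queue:
        line = queue.popleft()
        # Follow each direct dependence backwards.
        for dep in depends_on.get(line, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen

# Hypothetical dependence graph: line -> lines it directly depends on.
deps = {
    1: [],    # x = read()
    2: [],    # y = read()
    3: [1],   # a = x + 1
    4: [2],   # b = y * 2
    5: [3],   # print(a)
}

slice_for_5 = backward_slice(deps, 5)
```

Here lines 2 and 4 fall outside the slice for line 5, which corresponds to the "grey part" that can be ignored when debugging that line.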
9
Manifesto
  • Slicing as a means to an end
  • To produce metrics
  • To produce visualizations
  • To understand dependence structure

DASL
10
MSGs
  • Monotone Slice Size Graph
  • Plot all slices of program in one graph
  • Order by monotonically increasing slice size
  • This gives a landscape
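
The construction above can be sketched in a few lines. This assumes the slice sizes have already been computed by some slicer; the function name `msg_points` and the sample numbers are illustrative only. The axes follow the example MSG slides: percentage of slices on the X axis, normalised slice size on the Y axis.

```python
def msg_points(slice_sizes, program_size):
    """Return (x, y) points of a Monotone Slice Size Graph.

    x is the percentage of slices represented so far,
    y is the slice size normalised by the program size.
    """
    ordered = sorted(slice_sizes)  # monotonically increasing slice size
    n = len(ordered)
    return [((i + 1) * 100.0 / n, size / program_size)
            for i, size in enumerate(ordered)]

# Hypothetical slice sizes for a 50-line program.
sizes = [40, 10, 40, 25, 40]
points = msg_points(sizes, program_size=50)
```

A flat stretch in the resulting y values (here the three slices of size 40) is the kind of plateau the talk goes on to interpret as a candidate cluster.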

11
Example MSG
X axis: percentage of slices represented
Y axis: normalised slice size
13
Conjecture
  • Clusters correspond to sheer cliff drops
  • We implicitly assume same size is a close
    approximation to same slice
  • In the previous example there were no clusters
  • Let's look at another example

14
(No Transcript)
15
Verification
  • How good is the approximation?
  • What is the chance that two slices could have the
    same size and yet be different slices?
  • Well, the same within a tolerance

16
Tolerance: different yet considered the same
Agreement: of slices which are the same
19
Verified
  • Agreement is close for reasonable tolerance
  • 20 programs studied
  • 0.4 of slices need > 1 tolerance for total
    agreement
  • But 0.4 is the number of clusters
  • Only 0.00533 of pairwise slice comparisons
    require more than 1 tolerance to agree
  • These are the slices which simply happen to have
    the same size yet are different slices
  • The chance of a false positive is therefore very
    low
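
One way to picture the verification step is to compare same-size slices by content and count how many pairs agree within a tolerance. The sketch below is one possible reading, not the measurement used in the study: the difference measure (symmetric difference over union), the names `difference_fraction` and `agreement`, and the sample slices are all assumptions.

```python
from itertools import combinations

def difference_fraction(a, b):
    """Fraction of elements not shared by two slices (0.0 means identical)."""
    union = a | b
    if not union:
        return 0.0
    return len(a ^ b) / len(union)

def agreement(slices, tolerance):
    """Among same-size pairs, the fraction whose contents differ by at most `tolerance`."""
    same_size = [(a, b) for a, b in combinations(slices, 2) if len(a) == len(b)]
    if not same_size:
        return 1.0  # nothing to disagree about
    agreeing = sum(1 for a, b in same_size
                   if difference_fraction(a, b) <= tolerance)
    return agreeing / len(same_size)

# Hypothetical slices, each a set of line numbers.
slices = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 3}),
    frozenset({4, 5, 6}),  # same size as the first two, different content
    frozenset({1, 2}),
]

strict = agreement(slices, tolerance=0.0)
loose = agreement(slices, tolerance=1.0)
```

In this toy data only one of the three same-size pairs is truly the same slice, so agreement at zero tolerance is low; the slides report that for real programs such coincidences are rare.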

20
Validation
  • Do these clusters occur much in real code?
  • We studied 20 programs
  • The results were startling
  • We expected clusters here and there
  • We found them everywhere
  • Well, not quite

21
Validation
  • Of course, two slices of the same size are a
    cluster
  • So we search for large clusters
  • We chose 10 as our threshold for large
  • This is conservative
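
The search for large clusters can be sketched as grouping slices by size and keeping only the groups that exceed the threshold. The slide gives the threshold as 10 without a unit; reading it as a percentage of all slices is an assumption made here, as are the function name `large_clusters` and the sample data.

```python
from collections import Counter

def large_clusters(slice_sizes, threshold_percent=10.0):
    """Return {slice_size: count} for same-size groups larger than the threshold.

    A group needs at least two slices, and must cover more than
    `threshold_percent` of all slices (assumed unit, see lead-in).
    """
    counts = Counter(slice_sizes)
    total = len(slice_sizes)
    return {size: n for size, n in counts.items()
            if n >= 2 and 100.0 * n / total > threshold_percent}

# Hypothetical slice sizes: six of the ten slices share size 12.
sizes = [5, 12, 12, 12, 12, 12, 12, 20, 21, 22]
clusters = large_clusters(sizes)
```

Raising `threshold_percent` makes the search more selective; with a threshold of 70 the example above reports no large cluster.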

22
(No Transcript)
23
Results
  • 6 had no clusters larger than 10
  • 14 had at least one cluster larger than 10
  • 4 of the 14 were extreme

24
No clusters
25
With clusters
26
Extreme Cases
27
Manifesto
Active, not passive: use dependence analysis to
change programs
31
(No Transcript)
32
(No Transcript)
33
What is pollution?
34
What is pollution?
  • Like noise pollution, it is in the ear of the
    beholder
  • It could be thought of as a bad thing mixed into
    a good thing unnecessarily
  • We define dependence pollution to be avoidable
    dependence clusters

36
Why
  • Why are dependence clusters bad?
  • Impact of change
  • Comprehension
  • Testing
  • Reuse
  • How could they be avoidable?
  • Capillary data flow (CDF)
  • Mutually recursive cluster (MRC)

37
Case Study
  • copia
  • Example of a mutually recursive cluster (MRC)
  • We can remove it by refactoring
  • This removes the dependence pollution
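
As a rough illustration of how candidate mutually recursive groups might be located before refactoring, one can look for strongly connected components of more than one function in a call graph. This is a swapped-in, call-graph-level approximation using Tarjan's algorithm, not the slice-based analysis from the talk; the call graph and function names below are hypothetical and are not taken from copia.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm over a dict mapping node -> list of successors."""
    index, low = {}, {}
    stack, on_stack = [], set()
    result = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop it off the stack.
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            result.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return result

def mutually_recursive_groups(call_graph):
    """Keep only components in which several functions call each other."""
    return [c for c in strongly_connected_components(call_graph) if len(c) > 1]

# Hypothetical call graph: caller -> callees.
calls = {
    "main": ["parse"],
    "parse": ["expr"],
    "expr": ["term"],
    "term": ["factor"],
    "factor": ["expr"],  # closes the cycle expr -> term -> factor -> expr
    "log": [],
}
groups = mutually_recursive_groups(calls)
```

Breaking one edge of such a cycle is the kind of refactoring that could split the group, although whether the dependence cluster disappears still has to be checked by re-slicing.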

38
(No Transcript)
39
Case Study
  • bc
  • A calculator program
  • We looked for capillary data flow (CDF)
  • We tried removing the variable which contributed
    most to dependence

40
(No Transcript)
41
Related work
  • Clustering in the large
  • This is fine grained clustering based on SDG
  • Slicing in general
  • But we use slicing as a means to an end
  • We are interested in dependence profile
  • Slicing as a means to an end?
  • Bieman and Ott
  • Canfora, De Lucia and Munro
  • Korel and Rilling
  • Beszédes et al.
  • Krinke and Snelting
  • Visualising Dependence
  • Balmas
  • Ball and Eick

42
Future work
  • Extend empirical results
  • Categorise
  • Other ways to reduce dependence
  • By being more sophisticated about what it is
  • Domain specific dependence
  • Application specific dependence

43
Conclusions
  • Dependence clusters are prevalent
  • They can be discovered using slicing
  • The approximation is very close
  • The visualisation can be a help