Title: Bug Isolation via Remote Program Sampling
1. Bug Isolation via Remote Program Sampling
- Ben Liblit, Alex Aiken
- Alice X. Zheng, Michael I. Jordan
- UC Berkeley
Presented by Chao Liu
2. Debugging Is Hard
- Limited resources
- Time
- Human effort
- Test cases
- Triage and guesswork
Windows 2000: 35M LOC, 63,000 known bugs at the time of release, 2 per 1000 lines (quoted from Monica Lam's slides)
3. Leverage End-users
[Figure: workflow diagram. Labeled components: Program Source, Predicates, Sampler, Compiler, Shipping Application, Counts and run outcomes (success/failure), Statistical Debugging, Top bugs with likely causes.]
Courtesy of Ben Liblit
4. Outline
- Low-overhead Sampling
- Bug Isolation
- Related Work
- Conclusion and Discussion
5. Low-overhead Sampling
- Program predicate
- Any proposition
- Fingerprint of execution
- Straightforward Checking
6. Periodic Countdown
    counter = 100;
    while (...) {
        if (--counter == 0) {
            check(p != NULL);
            counter = 100;
        }
        p = p->next;
        if (--counter == 0) {
            check(i < max);
            counter = 100;
        }
        total += size[i];
    }
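As a self-contained illustration of the countdown scheme on this slide, the sketch below walks a linked list and runs a check only on every 100th opportunity. The `node` type, the `check` helper, the two counters, and `walk_sampled` are hypothetical names introduced for this example; they are not taken from the slides.

```c
#include <stddef.h>

#define SAMPLE_PERIOD 100

struct node { struct node *next; };

static int checks_run = 0;     /* hypothetical: number of checks that fired */
static int checks_failed = 0;  /* hypothetical: fired checks whose condition was false */

/* Hypothetical stand-in for the slide's check(): record one observation. */
static void check(int condition)
{
    checks_run++;
    if (!condition)
        checks_failed++;
}

/* Walk a list, sampling the check once per SAMPLE_PERIOD opportunities.
 * Returns the number of nodes visited. */
static int walk_sampled(struct node *p)
{
    int counter = SAMPLE_PERIOD;
    int visited = 0;

    while (p != NULL) {
        if (--counter == 0) {      /* slow path: taken once per period */
            check(p != NULL);
            counter = SAMPLE_PERIOD;
        }
        p = p->next;               /* fast path cost is one decrement and test */
        visited++;
    }
    return visited;
}
```

Because the period is fixed, the same opportunities are sampled on every run, which is the weakness that a randomized countdown addresses.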
7. Randomize It!
8. From Bernoulli to Geometric
- Randomized
- Fair
- Low-overhead
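One way to read the slide title: tossing a biased coin (a Bernoulli trial) at every sampling opportunity is statistically equivalent to drawing, in advance, the number of opportunities until the next sample, which follows a geometric distribution. The sketch below is a minimal illustration under that reading. The names `next_countdown`, `sampler_init`, and `should_sample`, and the use of `rand()`, are assumptions made for this example rather than the authors' implementation.

```c
#include <stdlib.h>

static long countdown = 0;   /* opportunities left before the next sample */

/* Draw a geometric countdown by counting Bernoulli(1/density) trials up to
 * and including the first success. rand() % density has a slight modulo
 * bias, which is acceptable for a sketch. */
static long next_countdown(unsigned density)
{
    long n = 1;
    if (density < 1)
        density = 1;               /* avoid division by zero */
    while (rand() % density != 0)
        n++;
    return n;
}

/* Seed the generator and draw the first countdown. */
static void sampler_init(unsigned seed, unsigned density)
{
    srand(seed);
    countdown = next_countdown(density);
}

/* Call at each sampling opportunity; returns 1 when this one is sampled. */
static int should_sample(unsigned density)
{
    if (--countdown > 0)
        return 0;                          /* fast path: one decrement and test */
    countdown = next_countdown(density);   /* slow path: refill the countdown */
    return 1;
}
```

With a density of 100, roughly one opportunity in 100 is sampled, and every opportunity has the same chance of being chosen. A production sampler would likely draw the countdown in one step (for example by inverse-transform sampling) instead of looping over simulated coin tosses.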
10. Bug Isolation
- Assumptions
- Predicates capture incorrect behavior.
- Each predicate P should always be false during correct execution.
- Therefore, when P is true, the program
- either fails (a deterministic bug)
- or is at increased risk of failing (a nondeterministic bug).
11. Isolating Deterministic Bugs
- Winnowing strategy: keep only predicates that meet both conditions
- Predicates observed true on some bad runs
- Predicates never observed true on any good run
- Case study: ccrypt
- Instrument 570 scalar return sites
- 3 × 570 = 1710 counters (return value < 0, = 0, > 0 at each site)
- Simulate a large user community
- 2990 randomized runs, 88 crashes
12. Winnowing
- 1710 counters
- 1569 are always zero
- 141 remain
- 139 are nonzero on some successful run
- Not much left!
- file_exists() > 0
- xreadline() == 0
Courtesy of Ben Liblit
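The elimination step can be sketched as a single pass over per-predicate summaries. This is a minimal illustration, assuming the run reports have already been aggregated into two tallies per predicate. The `predicate_stats` layout and the `winnow` function are hypothetical, not the authors' code.

```c
#include <stddef.h>

struct predicate_stats {
    const char *name;
    unsigned true_in_failed;      /* failed runs in which the counter was nonzero */
    unsigned true_in_successful;  /* successful runs in which the counter was nonzero */
};

/* Keep only predicates observed true on some failed run and never observed
 * true on any successful run. Survivors are compacted to the front of the
 * array; the return value is how many survive. */
static size_t winnow(struct predicate_stats *preds, size_t n)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (preds[i].true_in_failed == 0)
            continue;   /* never true on a bad run (includes always-zero counters) */
        if (preds[i].true_in_successful > 0)
            continue;   /* true on some good run, so it cannot be a deterministic cause */
        preds[kept++] = preds[i];
    }
    return kept;
}
```

Predicates whose counters are always zero are dropped by the first test, so the two filters together mirror the two winnowing conditions.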
13. Nondeterministic Bugs
14. Maximum Likelihood Estimation
- Maximize the log-likelihood function
[Equation and its definitions not preserved in the slide text]
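The slide's formula is not preserved in this text. As a sketch only, assuming the standard logistic regression model for a binary run outcome, the objective would take the following form. The symbols are assumptions made here: M is the number of runs, x_i is the vector of predicate counters reported by run i, y_i is 1 if run i failed and 0 otherwise, and (β_0, β) are the model coefficients.

```latex
\[
LL(\beta_0, \beta) \;=\; \sum_{i=1}^{M} \Bigl[\, y_i \log \mu_i + (1 - y_i) \log (1 - \mu_i) \,\Bigr],
\qquad
\mu_i \;=\; \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \beta^{\top} x_i)\bigr)}
\]
```

Under this reading, a large positive coefficient β_j marks predicate j as a strong predictor of failure.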
15. Regularized Logistic Regression
- Maximize the penalized log-likelihood function
[Equation and its definitions not preserved in the slide text]
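The penalized objective is likewise missing from the slide text. A hedged reconstruction, assuming an ℓ1 penalty on the coefficients, is given below. Here LL(β_0, β) denotes the logistic regression log-likelihood over the runs, N is the number of predicate counters, and λ ≥ 0 is a regularization strength; all three symbols are assumptions introduced for this sketch.

```latex
\[
(\hat{\beta}_0, \hat{\beta}) \;=\; \arg\max_{\beta_0,\, \beta} \;\; LL(\beta_0, \beta) \;-\; \lambda \,\lVert \beta \rVert_1,
\qquad
\lVert \beta \rVert_1 \;=\; \sum_{j=1}^{N} \lvert \beta_j \rvert
\]
```

An ℓ1 penalty tends to drive most coefficients to exactly zero, so the few predicates left with nonzero coefficients form a short ranked list of candidate causes.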
16. Case Study: bc-1.06
    void more_arrays ()
    {
      old_count = a_count;
      a_count += STORE_INCR;
      /* Copy the old arrays. */
      for (indx = 1; indx < old_count; indx++)
        arrays[indx] = old_ary[indx];
      /* Initialize the new elements. */
      for (; indx < v_count; indx++)
        arrays[indx] = NULL;
    }
Top-ranked predicates:
1. indx > scale
2. indx > use_math
3. indx > opterr
4. indx > next_func
5. indx > i_base
Courtesy of Ben Liblit
17. Bug Found: Buffer Overrun
    void more_arrays ()
    {
      old_count = a_count;
      a_count += STORE_INCR;
      /* Copy the old arrays. */
      for (indx = 1; indx < old_count; indx++)
        arrays[indx] = old_ary[indx];
      /* Initialize the new elements. */
      for (; indx < v_count; indx++)   /* overrun: bounded by v_count, not by the size of arrays (a_count) */
        arrays[indx] = NULL;
    }
Courtesy of Ben Liblit
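To make the overrun concrete, here is a minimal self-contained sketch; it is not the bc source. The names `grow_table`, `table`, `table_count`, and `GROW_INCR` are hypothetical stand-ins for `more_arrays`, `arrays`, `a_count`, and `STORE_INCR`. The sketch bounds the initialization loop by the new size of the array being grown, on the assumption that `a_count` is the bound the slide's second loop was meant to use. It also indexes from 0 for simplicity, whereas the slide's loop starts at 1.

```c
#include <stdlib.h>

#define GROW_INCR 32             /* hypothetical growth step */

static void **table = NULL;      /* hypothetical stand-in for `arrays` */
static size_t table_count = 0;   /* hypothetical stand-in for `a_count` */

/* Grow the table by GROW_INCR slots.
 * Returns 0 on success, -1 if allocation fails (table is left unchanged). */
static int grow_table(void)
{
    size_t old_count = table_count;
    size_t new_count = old_count + GROW_INCR;
    void **new_table = malloc(new_count * sizeof *new_table);
    size_t indx;

    if (new_table == NULL)
        return -1;

    /* Copy the old entries. */
    for (indx = 0; indx < old_count; indx++)
        new_table[indx] = table[indx];

    /* Initialize the new entries, bounded by the size of this allocation. */
    for (; indx < new_count; indx++)
        new_table[indx] = NULL;

    free(table);
    table = new_table;
    table_count = new_count;
    return 0;
}
```

Tying the loop bound to the allocation made a few lines earlier keeps the write range and the buffer size in step, so an unrelated counter cannot push the loop past the end of the array.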
19. Related Work
- Fault localization
- Program spectra-based
- NN/Perm [RR03], ASE'03
- Memory graph-based
- Delta Debugging [Z02], FSE'02
- Cause-Transition (CT) [CZ05], ICSE'05
- Predicate-based
- Liblit03 [LA03], PLDI'03
- Liblit05 [LN05], PLDI'05
- SOBER [LY05], FSE'05
20. Quality Comparison
21. Shameless Advertisement [LX05]
23. Conclusions
- Fault localization is possible
- Semantic bugs can also be localized
- Competition on this problem is intense
24. Discussion
- How many of you believe in the applicability of fault localization?
- Industry use
- Personal use
- Is low overhead, say less than 10%, acceptable to you?
25. References
- [RR03] M. Renieris and S. Reiss. Fault localization with nearest neighbor queries. In Proc. 18th IEEE Int. Conf. Automated Software Engineering (ASE'03), 2003.
- [CZ05] H. Cleve and A. Zeller. Locating causes of program failures. In Proc. 27th Int. Conf. Software Engineering (ICSE'05), 2005.
- [LN05] B. Liblit, M. Naik, A. Zheng, A. Aiken, and M. Jordan. Scalable statistical bug isolation. In Proc. ACM SIGPLAN 2005 Int. Conf. Programming Language Design and Implementation (PLDI'05), 2005.
- [LA03] B. Liblit, A. Aiken, A. Zheng, and M. Jordan. Bug isolation via remote program sampling. In Proc. ACM SIGPLAN 2003 Int. Conf. Programming Language Design and Implementation (PLDI'03), pp. 141-154, 2003.
- [Z02] A. Zeller. Isolating cause-effect chains from computer programs. In Proc. ACM 10th Int. Symp. Foundations of Software Engineering (FSE'02), 2002.
- [LY05] C. Liu, X. Yan, L. Fei, J. Han, and S. Midkiff. SOBER: Statistical model-based bug localization. In Proc. ACM 13th Int. Symp. Foundations of Software Engineering (FSE'05), 2005.