Title: Eternal Testing
1. Eternal Testing
Approaches and Implementation Issues
Christian Murphy
Candidacy Exam, April 7, 2008
2. What is Eternal Testing?
- Testing done in the development environment typically is considered finished prior to deployment
- Eternal testing refers to all approaches to testing/monitoring deployed software to look for faults not found in development
3. Why Eternal Testing?
- Infeasible to fully test a large system prior to deployment, considering:
- different runtime environments
- different configuration options
- different patterns of usage
- new versions of hardware, OS, libraries, etc. that are released after software is deployed
4. Why Eternal Testing?
- Developers using 3rd-party components have little knowledge of their quality and future volatility
- Can be used to gather field data to assist in-house testing efforts
- Orso03, Elbaum05
5. Overview
- Background
- Approaches to Eternal Testing
- Who has done what in this field?
- Implementation issues
- How are these approaches realized?
- Conclusion / future directions
- What technical and non-technical challenges remain?
6. Background
- Self-checking software is not a new concept
- Yau75, Yau76
- Diagnostics of embedded systems
- Telephone switches
- Health care systems
- Air traffic control
7. Background
- Programming with assertions
- Rosenblum92
- Clarke06
- Perpetual testing
- Clarke00
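As a concrete illustration of the "programming with assertions" idea above, here is a minimal Python sketch (not taken from any of the cited papers) of runtime assertion checking: pre- and postconditions are re-checked on every call, which the perpetual-testing view extends beyond development into deployment.

```python
# Minimal sketch of runtime assertion checking (illustrative only,
# not the notation used in Rosenblum92).

def binary_search(items, target):
    # Precondition: the input list must already be sorted.
    assert all(items[i] <= items[i + 1] for i in range(len(items) - 1)), \
        "precondition violated: items not sorted"

    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid

    result = lo if lo < len(items) and items[lo] == target else -1

    # Postcondition: a non-negative result must index the target value.
    assert result == -1 or items[result] == target, \
        "postcondition violated: wrong index returned"
    return result

if __name__ == "__main__":
    print(binary_search([1, 3, 5, 7, 9], 7))   # 3
    print(binary_search([1, 3, 5, 7, 9], 4))   # -1
```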
8. Approaches to Eternal Testing
10. Summary of Approaches
- Passive monitoring/profiling
- Debugging / fault localization
- Active testing
- Ongoing testing during development
- Built-in testing of 3rd-party components
- Failure prediction
11. Monitoring: Defects Targeted
- In general, anything that can be monitored, e.g. coverage, performance, gross failures, etc.
- In particular, anything that comes from code that was not tested prior to deployment because it was deemed unreachable or uncommon (residue)
12. Monitoring/Profiling Software
- Residual testing
- Pavlopoulou99
- Expectation-driven residual testing
- Naslavsky04
- Data structure consistency checking
- Demsky06
- Gamma
- Orso02, Bowring03
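To make the residual-testing idea concrete, here is an illustrative sketch: only program points left uncovered by in-house testing keep a probe in the deployed software, and the first field execution of such a point is reported. The site names, the UNCOVERED_IN_HOUSE set, and report_residual_coverage are invented for this example; this is not the tool described in Pavlopoulou99.

```python
# Illustrative sketch of residual coverage monitoring: only program points
# that were NOT covered during in-house testing keep a probe in the field.

UNCOVERED_IN_HOUSE = {"parse_config:legacy_format", "sync:retry_path"}
_hit_in_field = set()

def probe(point_id):
    """Record the first field execution of a point missed by in-house tests."""
    if point_id in UNCOVERED_IN_HOUSE and point_id not in _hit_in_field:
        _hit_in_field.add(point_id)
        report_residual_coverage(point_id)

def report_residual_coverage(point_id):
    # A real system would send this back to the developers; here we just log it.
    print(f"[residual coverage] first field execution of {point_id}")

def parse_config(text):
    if text.startswith("v1:"):          # legacy branch never exercised in-house
        probe("parse_config:legacy_format")
        return dict(pair.split("=") for pair in text[3:].split(";"))
    return dict(pair.split("=") for pair in text.split(";"))

if __name__ == "__main__":
    print(parse_config("v1:host=example.org;port=80"))
```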
13. Debugging / Fault Localization
- Cooperative Bug Isolation (CBI)
- Liblit03, Liblit04
- Automated Debugging of Deployed Applications
(ADDA) - Clause07
- Triage
- Tucek07
- Offline approaches
- Jones02, Cleve05, Gupta05
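The cooperative approaches above rely on lightweight instrumentation in deployed code. The toy sketch below illustrates the sampled predicate counting idea behind CBI (Liblit03): predicate outcomes are recorded only with a small probability, keeping per-run overhead low, and counts from many runs are aggregated later to find predicates that correlate with failures. The sampling rate and site names are invented for the example; this is not the actual CBI instrumentation.

```python
# Toy sketch of sampled predicate counting in the spirit of CBI (Liblit03).

import random
from collections import Counter

SAMPLE_RATE = 0.01            # fraction of evaluations actually recorded
predicate_counts = Counter()  # (site, outcome) -> observed count

def observe(site, outcome):
    if random.random() < SAMPLE_RATE:
        predicate_counts[(site, outcome)] += 1
    return outcome

def process(order):
    # The instrumented predicate: is the discount larger than the total?
    if observe("discount_gt_total", order["discount"] > order["total"]):
        # suspicious branch that may correlate with failing runs
        return 0.0
    return order["total"] - order["discount"]

if __name__ == "__main__":
    for _ in range(10000):
        process({"total": 100.0, "discount": random.choice([5.0, 120.0])})
    print(predicate_counts)
```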
14. Summary of Approaches
- Passive monitoring/profiling
- Debugging / fault localization
- Active testing
- Ongoing testing during development
- Built-in testing of 3rd-party components
- Failure prediction
15. Active Testing: Defects Targeted
- Functionality that does not meet system specifications
- Anything that comes from platforms or configuration options that were not tested prior to deployment
- Defects in the in-house testing process
16. Testing during Development
- Traditional testing approaches
- Nightly builds
- Continuous Testing
- Saff03
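Continuous testing (Saff03) reruns the test suite during the developer's idle moments whenever the code changes, rather than waiting for a nightly build. The stand-alone polling loop below is only an illustrative sketch of that idea; the watched directory and test command are assumptions, and this is not the IDE-integrated tool the paper describes.

```python
# Simplistic polling sketch of continuous testing: rerun the project's test
# suite whenever a watched source file changes.

import os
import subprocess
import time

WATCHED_DIR = "src"                               # assumed project layout
TEST_COMMAND = ["python", "-m", "pytest", "-q"]   # assumed test runner
POLL_SECONDS = 2

def snapshot(root):
    """Map each source file to its last-modified time."""
    stamps = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                stamps[path] = os.path.getmtime(path)
    return stamps

def watch():
    previous = snapshot(WATCHED_DIR)
    while True:
        time.sleep(POLL_SECONDS)
        current = snapshot(WATCHED_DIR)
        if current != previous:
            previous = current
            subprocess.run(TEST_COMMAND)   # surface failures as soon as they appear

if __name__ == "__main__":
    watch()
```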
17. Active Testing in the Field
- Skoll
- Memon04
- MuGamma
- Kim06
- In vivo testing (Invite)
- Chu08
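In vivo testing (Chu08) executes unit tests against the live state of the running application, so tests see configurations and program states that were never reproduced in the lab. The Unix-only sketch below illustrates one way to realize the idea by forking the process at an instrumented point and running the test in the child; it is a simplified stand-in, not the Invite framework itself.

```python
# Illustrative, Unix-only sketch of in vivo testing: with small probability,
# fork the process at an instrumented point and run a test against the current
# in-memory state in the child, so the production process is not disturbed.

import os
import random

TEST_PROBABILITY = 0.05

class Cache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = {}

    def put(self, key, value):
        if len(self.items) >= self.capacity:
            self.items.pop(next(iter(self.items)))   # evict oldest entry
        self.items[key] = value
        maybe_test_in_vivo(self)

def test_cache_invariant(cache):
    """The in vivo test: capacity must never be exceeded, whatever the state."""
    assert len(cache.items) <= cache.capacity, "cache grew past its capacity"

def maybe_test_in_vivo(cache):
    if random.random() >= TEST_PROBABILITY:
        return
    pid = os.fork()                 # child gets a copy of the live state
    if pid == 0:                    # child: run the test, then exit quietly
        try:
            test_cache_invariant(cache)
        except AssertionError as failure:
            print(f"[in vivo] test failed in production state: {failure}")
        os._exit(0)
    # Parent: continue normal execution. A real framework would not block here.
    os.waitpid(pid, 0)

if __name__ == "__main__":
    cache = Cache(capacity=3)
    for i in range(100):
        cache.put(i, str(i))
    print("done:", len(cache.items), "items in cache")
```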
18. Built-in Testing: Defects Targeted
- Anything hidden from the user of a 3rd-party component
- Those introduced through changes to components
- Those that come about when ubiquitous computing components interact: the syntax may be okay, but the semantics may not
19. Built-in Testing of Components
- Retrospectors
- Liu98
- BIT and Component
- Momotko04
- Mao07
- Self-testing COTS components (STECC)
- Beydeda05
- MORABIT
- Merdes06, Brenner07
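The approaches above attach test capabilities to the component itself. As an illustration of the general idea, the sketch below shows a component that exposes a built-in self-test alongside its functional interface, so the deployment environment can check semantic assumptions after the component is wired to concrete collaborators. The class and method names are invented and do not reflect the STECC or MORABIT APIs.

```python
# Illustrative sketch of a component with built-in testing: besides its
# functional interface, it exposes a self-test that can be invoked after the
# component is (re)wired to its dependencies in the field.

class CurrencyConverter:
    """Third-party component whose internals are hidden from its user."""

    def __init__(self, rate_provider):
        self.rate_provider = rate_provider   # injected collaborator

    # --- normal functional interface --------------------------------------
    def convert(self, amount, source, target):
        return amount * self.rate_provider.rate(source, target)

    # --- built-in test interface -------------------------------------------
    def self_test(self):
        """Check semantic assumptions about the environment it was wired into."""
        failures = []
        if self.rate_provider.rate("EUR", "EUR") != 1.0:
            failures.append("identity rate EUR->EUR is not 1.0")
        if self.rate_provider.rate("EUR", "USD") <= 0:
            failures.append("EUR->USD rate is not positive")
        return failures

class FixedRates:
    def rate(self, source, target):
        return 1.0 if source == target else 1.1

if __name__ == "__main__":
    component = CurrencyConverter(FixedRates())
    problems = component.self_test()        # run after deployment/rewiring
    print("built-in tests passed" if not problems else problems)
```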
20. Summary of Approaches
- Passive monitoring/profiling
- Debugging / fault localization
- Active testing
- Ongoing testing during development
- Built-in testing of 3rd-party components
- Failure prediction
21. Fault Prediction: Defects Targeted
- A sequence of events (fault propagation chain) that signals an impending failure
- Control/data flow
- Predicate values
- Resource utilization
22. Fault Prediction
- Classifying executions as failing or passing
- Haran05
- Predicting failure based on execution anomalies
- Elbaum03
- Baah06
- Predicting fault manifestation (Kheiron)
- Griffith06
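The cited approaches learn from execution data which runs are likely to fail. The toy sketch below captures only the flavor of anomaly-based prediction, using a per-feature z-score over profiles of known-passing runs; the actual work (Haran05, Elbaum03, Baah06) uses considerably more sophisticated statistical and machine-learning techniques, and the feature names here are invented.

```python
# Toy sketch of anomaly-based failure prediction: build a baseline from
# profiles of known-passing runs and flag new runs whose profile deviates.

from statistics import mean, stdev

def train_baseline(passing_profiles):
    """passing_profiles: list of dicts mapping feature name -> value."""
    features = passing_profiles[0].keys()
    return {f: (mean(p[f] for p in passing_profiles),
                stdev(p[f] for p in passing_profiles))
            for f in features}

def predict_failure(baseline, profile, threshold=3.0):
    """Flag the run if any feature is more than `threshold` deviations out."""
    for feature, (mu, sigma) in baseline.items():
        if sigma > 0 and abs(profile[feature] - mu) / sigma > threshold:
            return True
    return False

if __name__ == "__main__":
    passing = [{"loop_iterations": 100 + i, "open_handles": 4} for i in range(20)]
    baseline = train_baseline(passing)
    print(predict_failure(baseline, {"loop_iterations": 110, "open_handles": 4}))   # False
    print(predict_failure(baseline, {"loop_iterations": 5000, "open_handles": 4}))  # True
```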
23. Kheiron
24. Implementation Issues
25. Assignment
26. How are tasks assigned to application instances?
- Software tomography: determine appropriate subtasks and assign them to clients, using either dynamic usage data or static analysis; tasks can then be reassigned as needed (Gamma)
- The server treats exploration of the configuration space as a planning problem and identifies all acceptable plans (Skoll)
- Each instance does random sampling (CBI); both assignment styles are sketched below
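A minimal sketch of the two assignment styles mentioned above: a server-side round-robin split of probe sites across instances (in the spirit of software tomography / Gamma) and per-instance random sampling (in the spirit of CBI). The site names and policies are invented for the example; these are not the actual Gamma or CBI protocols.

```python
# Illustrative sketch of dividing monitoring subtasks across deployed instances.

import random

ALL_PROBE_SITES = [f"method_{i}" for i in range(100)]   # hypothetical probe sites

def assign_subtasks(instance_ids, sites):
    """Tomography-style split: every site is owned by exactly one instance."""
    assignment = {iid: [] for iid in instance_ids}
    for index, site in enumerate(sites):
        assignment[instance_ids[index % len(instance_ids)]].append(site)
    return assignment   # the server can reassign later as instances come and go

def random_sample_assignment(sites, per_instance=10):
    """CBI-style alternative: each instance independently samples a few sites."""
    return random.sample(sites, per_instance)

if __name__ == "__main__":
    clients = ["client_a", "client_b", "client_c"]
    server_side = assign_subtasks(clients, ALL_PROBE_SITES)
    print({iid: len(sites) for iid, sites in server_side.items()})   # ~33 sites each
    print(random_sample_assignment(ALL_PROBE_SITES))
```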
27. How does instrumentation get delivered into the field?
- Instrumentation simply built into the shipped software (Invite) or supporting libraries (Triage)
- As a separate component bundled with the software (STECC, MORABIT), as sketched below
- On request from the user (Skoll)
- Sent from a central server, and then hot-swapped or injected into the software (Gamma)
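For the bundled-component option, the sketch below shows one simple realization: the field tests ship as an optional module next to the application and are loaded only if present, so the same code runs with or without instrumentation. The module name field_tests and its run_all entry point are invented for this example and are not part of STECC or MORABIT.

```python
# Illustrative sketch of loading an optionally bundled field-test component.

import importlib

def load_bundled_tests(module_name="field_tests"):
    """Return the bundled test module, or None if it was not shipped."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError:
        return None

if __name__ == "__main__":
    tests = load_bundled_tests()
    if tests is None:
        print("no field-test component bundled; running without instrumentation")
    else:
        tests.run_all()   # assumed entry point of the bundled component
```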
28. How does the system know when to execute the instrumentation?
- Constant monitoring (Gamma)
- On demand from the user (Skoll)
- Periodically, randomly, or during idle time (CBI, Invite)
- Context-sensitive: component lookup, method call, change in topology (MORABIT)
- Resource-aware: threshold, priority, delay (MORABIT); a simple trigger combining these ideas is sketched below
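The sketch combines two of the triggering styles above into a single decision function: a random trigger (as in CBI or Invite) gated by a crude resource-awareness check (loosely in the MORABIT spirit). The probability, latency threshold, and load measure are invented for the example and do not come from any of the cited systems.

```python
# Illustrative sketch of deciding *when* field instrumentation may run.

import random
import time

RUN_PROBABILITY = 0.02       # run on roughly 2% of opportunities
MAX_RECENT_LATENCY = 0.05    # seconds; skip testing when the app looks busy

_last_call_duration = 0.0

def record_call(duration_seconds):
    """Application code reports how long its last operation took."""
    global _last_call_duration
    _last_call_duration = duration_seconds

def should_run_instrumentation():
    busy = _last_call_duration > MAX_RECENT_LATENCY   # crude load indicator
    return (not busy) and random.random() < RUN_PROBABILITY

if __name__ == "__main__":
    runs = 0
    for _ in range(10000):
        start = time.perf_counter()
        sum(range(1000))                    # the "real work"
        record_call(time.perf_counter() - start)
        if should_run_instrumentation():
            runs += 1                       # here the test/probe would execute
    print(f"instrumentation ran {runs} times out of 10000 opportunities")
```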
29. How is the instrumentation executed?
- As event monitors (Gamma, Triage)
- As assertions (CBI)
- Tests invoked by a separate component (MORABIT, STECC)
- Tests executed on software in a separate/parallel execution environment (MuGamma, Invite)
30. What happens when an error is detected?
- Send results to the developers for analysis (Skoll, Invite), as sketched below
- Send results to the server for reassignment of tasks (Gamma)
- Replay/analyze the error either locally (Triage) or in the development environment (ADDA, CBI)
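For the first option, a deployed instance needs to package what it observed into something the developers can analyze. The sketch below builds a small, self-describing failure report; the field names and reporting channel are invented for the example, and a real system would transmit the report to a collection server while addressing the privacy and security concerns raised later.

```python
# Illustrative sketch of reporting an in-the-field check failure to developers.

import json
import platform
import time
import traceback

def build_failure_report(test_name, exception):
    return {
        "test": test_name,
        "timestamp": time.time(),
        "platform": platform.platform(),
        "python": platform.python_version(),
        "traceback": traceback.format_exception_only(type(exception), exception),
    }

def send_to_developers(report):
    # Stand-in for transmission to a collection server: just serialize it.
    print(json.dumps(report, indent=2))

if __name__ == "__main__":
    try:
        assert 1 + 1 == 3, "arithmetic invariant violated"
    except AssertionError as failure:
        send_to_developers(build_failure_report("sanity_check", failure))
```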
31. Conclusion
32. Future Directions / Concerns
- Privacy and security issues
- Evaluation of applicability, cost, and benefit
- Detecting computation errors
- Optimizing placement and execution of instrumentation
- Efficient filtering, aggregation, and interpretation of large amounts of field data
33. Future Directions / Concerns
- Emerging computing models
- Ubiquitous computing
- Multicore architectures
- Embedded/Realtime systems
- Parallel/Concurrent systems
- Distributed/Clustered systems and applications
- Distinguishing between OS-level and application-level testing
34. Conclusion
- Monitoring and testing could possibly be combined to predict failures
- Work to date has taken advantage of some (but very few) emerging computational models
- Still many open areas for research, both technical and non-technical
35. Thank You!
36. To ETERNITY and beyond!!
Any Questions???
37. Bibliography
- Baah06: George Kofi Baah, Alexander Gray, Mary Jean Harrold. On-line Anomaly Detection of Deployed Software: A Statistical Machine Learning Approach. In Proceedings of the Third International Workshop on Software Quality Assurance (SOQUA 2006), 2006.
- Beydeda05: S. Beydeda. Research in testing COTS components - built-in testing approaches. In Proc. of the 3rd ACS/IEEE International Conference on Computer Systems and Applications, 2005.
- Bowring03: J. Bowring, A. Orso, M.J. Harrold. Monitoring deployed software using software tomography. In Proceedings of the 2002 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE), 2003.
- Brenner07: D. Brenner, C. Atkinson, et al. Reducing verification effort in component-based software engineering through built-in testing. Information Systems Frontiers, 2007.
- Chu08: M. Chu, C. Murphy, G. Kaiser. Distributed In Vivo Testing of Software Applications. In Proc. of the First International Conference on Software Testing, Verification, and Validation, 2008.
- Clarke00: Lori A. Clarke and Leon J. Osterweil. Continuous Self-Evaluation for the Self-Improvement of Software. In Proceedings of the First International Workshop on Self-Adaptive Software, 2000.
- Clarke06: L. Clarke and D. Rosenblum. A historical perspective on runtime assertion checking in software development. ACM SIGSOFT Software Engineering Notes, 2006.
- Clause07: J. Clause and A. Orso. A technique for enabling and supporting debugging of field failures. In Proc. of the 29th ICSE, 2007.
- Cleve05: Holger Cleve and Andreas Zeller. Locating Causes of Program Failures. In Proc. of the 27th International Conference on Software Engineering, 2005.
- Demsky06: Brian Demsky, Michael D. Ernst, Philip J. Guo, Stephen McCamant, Jeff H. Perkins, Martin Rinard. Inference and enforcement of data structure consistency specifications. In Proc. of ISSTA, 2006.
- Elbaum03: S. Elbaum, S. Kanduri, A. Amschler. Anomalies as precursors of field failures. In 14th International Symposium on Software Reliability Engineering, 2003.
- Elbaum05: S. Elbaum and M. Diep. Profiling Deployed Software: Assessing Strategies and Testing Opportunities. IEEE Transactions on Software Engineering, 2005.
- Griffith06: Rean Griffith and Gail Kaiser. A Runtime Adaptation Framework for Native C and Bytecode Applications. In Proc. of the Third IEEE International Conference on Autonomic Computing, 2006.
- Gupta05: Neelam Gupta, Haifeng He, Xiangyu Zhang, Rajiv Gupta. Locating faulty code using failure-inducing chops. In Proceedings of the 20th IEEE/ACM International Conference on Automated Software Engineering, 2005.
- Haran05: M. Haran, A. Karr, A. Orso, A. Porter, A. Sanil. Applying classification techniques to remotely-collected program execution data. ACM SIGSOFT Software Engineering Notes, Vol. 30, No. 5, 2005.
- Jones02: James A. Jones, Mary Jean Harrold, John Stasko. Visualization of Test Information to Assist Fault Localization. In Proceedings of the International Conference on Software Engineering (ICSE 2002), 2002.
- Kim06: S.W. Kim, M.J. Harrold, Y.R. Kwon. MUGAMMA: Mutation Analysis of Deployed Software to Increase Confidence and Assist Evolution. In Second Workshop on Mutation Analysis, 2006.
- Liblit03: B. Liblit, A. Aiken, A. Zheng, and M. Jordan. Bug Isolation via Remote Program Sampling. In Programming Language Design and Implementation, 2003.
- Liblit04: B. Liblit, M. Naik, A.X. Zheng, A. Aiken, and M.I. Jordan. Public deployment of cooperative bug isolation. In Proceedings of the Second International Workshop on Remote Analysis and Measurement of Software Systems, 2004.
- Liu98: C. Liu and D. Richardson. Software components with retrospectors. In Proc. of the International Workshop on the Role of Software Architecture in Testing and Analysis, 1998.
- Mao07: C. Mao, Y. Lu, and J. Zhang. Regression testing for component-based software via built-in test design. In Proc. of the 2007 ACM Symposium on Applied Computing, 2007.
- Memon04: A. Memon, A. Porter, C. Yilmaz, A. Nagarajan, D. Schmidt, and B. Natarajan. Skoll: distributed continuous quality assurance. In Proc. of the 26th ICSE, 2004.
- Merdes06: M. Merdes, et al. Ubiquitous RATs: how resource-aware run-time tests can improve ubiquitous software systems. In Proc. of the 6th International Workshop on Software Engineering and Middleware, 2006.
- Momotko04: Mariusz Momotko and Lilianna Zalewska. Component Built-in Testing: A Technology for Testing Software Components, 2004.
- Naslavsky04: Leila Naslavsky, Roberto Silva Filho, Cleidson de Souza, Marcio Dias, Debra Richardson, David Redmiles. Distributed expectation-driven residual testing. In Second International Workshop on Remote Analysis and Measurement of Software Systems, 2004.
- Orso02: A. Orso, D. Liang, M.J. Harrold, R. Lipton. Gamma System: Continuous Evolution of Software after Deployment. In Proc. of ISSTA, 2002.
- Orso03: A. Orso, T. Apiwattanapong, and M.J. Harrold. Leveraging field data for impact analysis and regression testing. In Proc. of the 9th European Software Engineering Conference, 2003.
- Pavlopoulou99: C. Pavlopoulou and M. Young. Residual test coverage monitoring. In Proc. of the 21st ICSE, 1999.
- Rosenblum92: D.S. Rosenblum. Towards a method of programming with assertions. In Proc. of the 14th ICSE, 1992.
- Saff03: D. Saff and M.D. Ernst. Reducing wasted development time via continuous testing. In Proc. of the 14th International Symposium on Software Reliability Engineering, 2003.
- Tucek07: J. Tucek, S. Lu, C. Huang, S. Xanthos, Y. Zhou. Triage: Diagnosing Production Run Failures at the User Site. In Proc. of SOSP, 2007.
- Yau75: S. Yau and R.C. Cheung. Design of self-checking software. In Proc. of the International Conference on Reliable Software, 1975.
- Yau76: S.S. Yau, R.C. Cheung, D.C. Cochrane. An approach to error-resistant software design. In Proc. of the 2nd ICSE, 1976.