Title: The Distribution of Faults in a Large Industrial Software System
1The Distribution of Faults in a Large Industrial
Software System
- Thomas J. Ostrand
- ostrand_at_research.att.com
- AT&T Labs - Research
- Elaine J. Weyuker
- weyuker_at_research.att.com
- AT&T Labs - Research
2Introduction
- What is the Problem?
- Identify the characteristics of files that can be
used as predictors of fault-proneness
- What is the motivation?
- Where to focus limited testing resources
- What is the subject program?
- Case study of a large industrial inventory system
3The questions
- 1. Distribution of faults over different files
(release, lifecycle stage, severity)
- 2. Density of faults among different files
- 3. Persistence of faults (stages and releases)
- 4. New files vs. old files
4Topics of Discussion
- Approach
- Common Belief
- Their Experimental Study
- Conclusion
- Related Work
- Limitations & Future Work
5Approach
- Statistical analysis: investigate data collected
from an inventory system
- Related work: compared with the work of Fenton &
Ohlsson
- Thirteen releases vs. two releases
- Nine phases vs. four phases
- Severity vs. none
- Basic component: file (1974 files, 500,000 lines
of code; non-code files removed from the study)
6Pareto Distribution of Faults(1-1)
- Common belief: Pareto distribution of faults
- What is a Pareto Distribution?
- Experimental Work
- By Release
- By Stage
- By Severity
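The concentration that a Pareto distribution of faults implies can be checked with a short computation: rank the files by fault count and measure what share of all faults falls in the top fraction of files. The sketch below is only an illustration; the function name `fault_share_of_top_files` and the per-file counts are hypothetical and are not taken from the study.

```python
def fault_share_of_top_files(fault_counts, top_fraction):
    """Return the fraction of all faults found in the most faulty files.

    fault_counts: per-file fault counts (hypothetical data, not from the study).
    top_fraction: fraction of files to consider, e.g. 0.10 for the top 10%.
    """
    if not fault_counts:
        raise ValueError("fault_counts must not be empty")
    total = sum(fault_counts)
    if total == 0:
        return 0.0
    # Rank files from most to least faulty.
    ranked = sorted(fault_counts, reverse=True)
    # Size of the top slice; always include at least one file.
    n_top = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:n_top]) / total


# Hypothetical release: 10 files, 50 faults in total.
counts = [30, 10, 5, 3, 2, 0, 0, 0, 0, 0]
print(fault_share_of_top_files(counts, 0.10))  # top 1 of 10 files -> 0.6
print(fault_share_of_top_files(counts, 0.20))  # top 2 of 10 files -> 0.8
```

The same function can be applied separately to the counts for each release, stage, or severity level.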
7Pareto Distribution of Faults(1-2)
- Fault Concentration By Release
- Supportive table
- Conclusion: concentrate on the small number of
fault-prone files
8Pareto Distribution of Faults(1-3)
9Pareto Distribution of Faults(1-4)
- Fault Concentration By Stage
- Early-pre-release
- Late-pre-release
- Post-release
- Conclusion
10Pareto Distribution of Faults(1-5)
11Pareto Distribution of Faults(1-6)
- Fault Concentration By Severity
- Severity 1 and Severity 4 faults together accounted
for only 4% of the faults
- Severity 3 faults accounted for 81% of the faults
- The remaining 15% were Severity 2 faults
- Conclusion
12Effect of Module Size (2-1)
- Common belief: large modules are much more
fault-prone than small ones?
- Earlier empirical studies: contrary
- Experimental Work
- Hatton
- Fenton and Ohlsson
- Limitations & Future Work
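One way to examine the effect of module size is to group files into size bins and compare aggregate fault density (faults per 1000 lines of code) across the bins. The following is a minimal sketch under stated assumptions: the bin boundaries, the function name `density_by_size_bin`, and the sample files are all hypothetical and do not reproduce the study's data or results.

```python
from bisect import bisect_left


def density_by_size_bin(files, upper_bounds):
    """Aggregate faults per KLOC for each size bin.

    files: iterable of (lines_of_code, fault_count) pairs.
    upper_bounds: ascending, inclusive upper LOC bounds; files larger than
        the last bound go into a final overflow bin.
    Returns one density per bin (None for a bin with no code).
    """
    n_bins = len(upper_bounds) + 1
    loc_totals = [0] * n_bins
    fault_totals = [0] * n_bins
    for loc, faults in files:
        # Index of the first bound that is >= loc (inclusive upper bound).
        i = bisect_left(upper_bounds, loc)
        loc_totals[i] += loc
        fault_totals[i] += faults
    return [
        1000.0 * f / l if l else None
        for f, l in zip(fault_totals, loc_totals)
    ]


# Hypothetical files: (lines of code, faults).
sample = [(50, 1), (100, 1), (500, 3), (1000, 3), (4000, 8)]
# Hypothetical bins: up to 100 LOC, 101 to 1000 LOC, more than 1000 LOC.
print(density_by_size_bin(sample, [100, 1000]))
```

Comparing densities rather than raw fault counts keeps a large file from looking fault-prone merely because it contains more code.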
13Effect of Module Size (2-2)
14Persistence of Faults (3-1)
- Common belief
- Files with high concentration of faults detected
during pre-release also tend to have high
concentration of faults detected during
post-release.
- Faultiness persists between releases.
- Application: identify files that are unusually
fault-prone and focus test resources on them.
15Persistence of Faults (3-2)
- Experimental work
- Fault persistence between stages:
late-pre-release and post-release
- Fault persistence between releases (13)
- Results
- 72-94% of pre-release faults in files with no
post-release faults
- 100% of post-release faults in files with 6% to
28% of pre-release faults
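The stage-persistence measurement can be sketched as follows: for one release, compute the percentage of pre-release faults that sit in files with no post-release faults. The remainder is the share of pre-release faults in the files that account for all post-release faults. The function name and the per-file numbers below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def pre_release_share_in_clean_files(files):
    """Percent of pre-release faults in files with no post-release faults.

    files: list of (pre_release_faults, post_release_faults) per file
        (hypothetical data, not from the study).
    """
    total_pre = sum(pre for pre, _ in files)
    if total_pre == 0:
        raise ValueError("no pre-release faults to distribute")
    # Pre-release faults in files that had zero post-release faults.
    clean_pre = sum(pre for pre, post in files if post == 0)
    return 100.0 * clean_pre / total_pre


# Hypothetical release: 100 pre-release faults spread over five files.
release = [(40, 0), (30, 0), (10, 0), (15, 2), (5, 1)]
clean_share = pre_release_share_in_clean_files(release)
print(clean_share)          # 80.0
print(100.0 - clean_share)  # 20.0, in files holding all post-release faults
```

The two percentages are complements, which is why a 72-94% range for one corresponds to a 6-28% range for the other.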
16Persistence of Faults (3-3)
- In contrast with common belief
- Each release has 584 to 1772 -> 20 faults
- Conclusion: not enough data
17Persistence of Faults (3-4)
- Related work: Fenton and Ohlsson
- Two successive releases, pre-release and
post-release
- Many of the post-release faults occur in modules
with no pre-release faults
- 100% of post-release faults in modules with 7
or 23 pre-release faults
- Conclusion
- In contrast with common belief
18Persistence of Faults (3-5)
Persistence of High-Fault Files
- Conclusion: supports common belief
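Persistence of high-fault files between releases can be illustrated by asking what fraction of one release's most faulty files are again among the most faulty files in the next release. This is a sketch, not the study's procedure: the 20% cutoff, the function names, and the file names are assumptions introduced for the example.

```python
def high_fault_files(faults_by_file, fraction=0.2):
    """Names of the most faulty files (top `fraction`, an assumed cutoff)."""
    # Sort by descending fault count; break ties by name for determinism.
    ranked = sorted(faults_by_file, key=lambda name: (-faults_by_file[name], name))
    n_top = max(1, round(len(ranked) * fraction))
    return set(ranked[:n_top])


def persistence(prev_release, next_release, fraction=0.2):
    """Fraction of the previous release's high-fault files that are
    high-fault again in the next release."""
    if not prev_release or not next_release:
        raise ValueError("both releases must contain at least one file")
    prev_top = high_fault_files(prev_release, fraction)
    next_top = high_fault_files(next_release, fraction)
    return len(prev_top & next_top) / len(prev_top)


# Hypothetical fault counts for the same ten files in two successive releases.
names = ["a.c", "b.c", "c.c", "d.c", "e.c", "f.c", "g.c", "h.c", "i.c", "j.c"]
release_n = dict(zip(names, [12, 9, 1, 0, 0, 0, 0, 0, 0, 0]))
release_n_plus_1 = dict(zip(names, [7, 0, 8, 0, 0, 0, 0, 0, 0, 0]))
print(persistence(release_n, release_n_plus_1))  # 0.5
```

A value near 1.0 would indicate that the same files stay fault-prone from release to release; a value near 0.0 would indicate that they do not.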
19Old Files, New Files (4-1)
- Common belief
- New files have more faults
- Experimental work
- Compute the percentage of faulty new files and
the percentage of faulty pre-existing files
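The comparison described above amounts to computing, for each group, the percentage of files with at least one fault. A minimal sketch follows; the function name `percent_faulty` and the fault counts are hypothetical and are not results from the study.

```python
def percent_faulty(fault_counts):
    """Percentage of files that contain at least one fault."""
    if not fault_counts:
        raise ValueError("fault_counts must not be empty")
    faulty = sum(1 for count in fault_counts if count > 0)
    return 100.0 * faulty / len(fault_counts)


# Hypothetical per-file fault counts for one release.
new_files = [3, 0, 1, 2]
pre_existing_files = [0, 0, 1, 0, 0, 0, 2, 0]
print(percent_faulty(new_files))           # 75.0
print(percent_faulty(pre_existing_files))  # 25.0
```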
20Old Files, New Files (4-2)
- Conclusion: allocate more resources for testing
new files than pre-existing ones
21Discussion
22Discussion Questions
- 1. The evaluation of their work is certainly
valuable for internal use, but how meaningful
would it be for external use?
- 2. Would it be more interesting or more important
if they elaborated on their methodology rather than
just generalizing the result?