Title: Software Quality Metrics Overview
1. Software Quality Metrics Overview
2. Types of Software Metrics
- Product metrics, e.g., size, complexity, design features, performance, quality level
- Process metrics, e.g., effectiveness of defect removal, response time of the fix process
- Project metrics, e.g., number of software developers, cost, schedule, productivity
3. Software Quality Metrics
- The subset of metrics that focus on quality
- Software quality metrics can be divided into:
- End-product quality metrics
- In-process quality metrics
- The essence of software quality engineering is to investigate the relationships among in-process metrics, project characteristics, and end-product quality, and, based on the findings, engineer improvements in quality to both the process and the product.
4. Three Groups of Software Quality Metrics
- Product quality
- In-process quality
- Maintenance quality
5. Product Quality Metrics
- Intrinsic product quality
- Mean time to failure
- Defect density
- Customer related
- Customer problems
- Customer satisfaction
6. Intrinsic Product Quality
- Intrinsic product quality is usually measured by:
- the number of bugs (functional defects) in the software (defect density), or
- how long the software can run before crashing (MTTF, mean time to failure)
- The two metrics are correlated but different
7. Difference Between Errors, Defects, Faults, and Failures (IEEE/ANSI)
- An error is a human mistake that results in incorrect software.
- The resulting fault is an accidental condition that causes a unit of the system to fail to function as required.
- A defect is an anomaly in a product.
- A failure occurs when a functional unit of a software-related system can no longer perform its required function or cannot perform it within specified limits.
8. What's the Difference between a Fault and a Defect?
9. The Defect Density Metric
- This metric is the number of defects over the opportunities for error (OPE) during some specified time frame.
- We can use the number of unique causes of observed failures (failures are just defects materialized) to approximate the number of defects.
- The size of the software in either lines of code or function points is used to approximate OPE.
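As a minimal sketch, the defect density calculation above reduces to a simple ratio; the defect count and KLOC figure below are invented for illustration.

```python
# Hypothetical numbers, not data from the text: defect density is the
# count of unique defects over the opportunities for error, approximated
# here by size in thousands of lines of code (KLOC).

def defect_density(defects, size_kloc):
    """Defects per KLOC."""
    return defects / size_kloc

# 120 unique causes of observed failures against 80 KLOC of code.
print(defect_density(120, 80))  # 1.5 defects per KLOC
```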
10. Lines of Code
- Possible variations
- Count only executable lines
- Count executable lines plus data definitions
- Count executable lines, data definitions, and comments
- Count executable lines, data definitions, comments, and job control language
- Count lines as physical lines on an input screen
- Count lines as terminated by logical delimiters
11. Lines of Code (Cont'd)
- Other difficulties
- LOC measures are language dependent
- Can't make comparisons when different languages are used or different operational definitions of LOC are used
- For productivity studies the problems in using LOC are greater, since LOC is negatively correlated with design efficiency
- Code enhancements and revisions complicate the situation; must calculate the defect rate of new and changed lines of code only
12. Defect Rate for New and Changed Lines of Code
- Depends on the availability of LOC counts for both the entire product and the new and changed code
- Depends on tracking defects to the release origin (the portion of code that contains the defects) and at what release that code was added, changed, or enhanced
13. Function Points
- A function can be defined as a collection of executable statements that performs a certain task, together with declarations of the formal parameters and local variables manipulated by those statements.
- In practice, functions are measured indirectly.
- Many of the problems associated with LOC counts are addressed.
14. Measuring Function Points
- The number of function points is a weighted total of five major components that comprise an application:
- Number of external inputs x 4
- Number of external outputs x 5
- Number of logical internal files x 10
- Number of external interface files x 7
- Number of external inquiries x 4
15. Measuring Function Points (Cont'd)
- The function count (FC) is a weighted total of five major components that comprise an application:
- Number of external inputs x (3 to 6)
- Number of external outputs x (4 to 7)
- Number of logical internal files x (7 to 15)
- Number of external interface files x (5 to 10)
- Number of external inquiries x (3 to 6)
- The weighting factor depends on complexity
16. Measuring Function Points (Cont'd)
- Each number is multiplied by the weighting factor and then they are summed.
- This weighted sum (FC) is further refined by multiplying it by the Value Adjustment Factor (VAF).
- Each of the 14 general system characteristics is assessed on a scale of 0 to 5 as to its impact on (importance to) the application.
17. The 14 System Characteristics
- Data Communications
- Distributed functions
- Performance
- Heavily used configuration
- Transaction rate
- Online data entry
- End-user efficiency
18. The 14 System Characteristics (Cont'd)
- Online update
- Complex processing
- Reusability
- Installation ease
- Operational ease
- Multiple sites
- Facilitation of change
19. The 14 System Characteristics (Cont'd)
- VAF is the sum of these 14 characteristics divided by 100, plus 0.65.
- Notice that if an average rating is given to each of the 14 factors, their sum is 35 and therefore VAF = 1.
- The final function point total is then the function count multiplied by VAF: FP = FC x VAF
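The FC, VAF, and FP steps can be sketched numerically. The component counts and characteristic ratings below are invented; the weights are the average-complexity values (4, 5, 10, 7, 4) from the five-component list earlier.

```python
# Invented example of a function point calculation: FC is the weighted
# component total, VAF adjusts it by the 14 system characteristics.

components = [
    (25, 4),   # external inputs
    (30, 5),   # external outputs
    (10, 10),  # logical internal files
    (5, 7),    # external interface files
    (12, 4),   # external inquiries
]
fc = sum(count * weight for count, weight in components)  # 433

# 14 general system characteristics, each rated 0-5; an all-average
# rating summing to 35 yields VAF = 1.
ratings = [2.5] * 14
vaf = sum(ratings) / 100 + 0.65

fp = fc * vaf
print(fc, fp)
```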
20. Customer Problems Metric
- Customer problems are all the difficulties customers encounter when using the product.
- They include:
- Valid defects
- Usability problems
- Unclear documentation or information
- Duplicates of valid defects (problems already fixed but not known to the customer)
- User errors
- The problem metric is usually expressed in terms of problems per user month (PUM)
21. Customer Problems Metric (Cont'd)
- PUM = Total problems that customers reported for a time period / Total number of license-months of the software during the period
- where
- Number of license-months = Number of installed licenses of the software x Number of months in the calculation period
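The PUM formula can be sketched directly; the license base, period, and problem count below are invented numbers.

```python
# PUM sketch with hypothetical inputs: problems reported over a quarter
# against the installed license base.

def pum(total_problems, installed_licenses, months):
    # License-months per the definition above.
    license_months = installed_licenses * months
    return total_problems / license_months

# 450 reported problems, 1000 installed licenses, a 3-month period.
print(pum(450, 1000, 3))  # 0.15 problems per license-month
```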
22. Approaches to Achieving a Low PUM
- Improve the development process and reduce the product defects.
- Reduce the non-defect-oriented problems by improving all aspects of the products (e.g., usability, documentation), customer education, and support.
- Increase the sales (number of installed licenses) of the product.
23. Defect Rate and Customer Problems Metrics
- Numerator: Defect Rate uses valid and unique product defects; PUM uses all customer problems (defects and nondefects, first time and repeated)
- Denominator: Defect Rate uses size of product (KLOC or function point); PUM uses customer usage of the product (user-months)
- Measurement perspective: Defect Rate takes the producer's (software development organization's) view; PUM takes the customer's
- Scope: Defect Rate covers intrinsic product quality; PUM covers intrinsic product quality plus other factors
24. Customer Satisfaction Metrics
- (Figure: hierarchy relating defects, customer problems, and customer satisfaction issues)
25. Customer Satisfaction Metrics (Cont'd)
- Customer satisfaction is often measured by customer survey data via the five-point scale:
- Very satisfied
- Satisfied
- Neutral
- Dissatisfied
- Very dissatisfied
26. IBM Parameters of Customer Satisfaction
- CUPRIMDSO
- Capability (functionality)
- Usability
- Performance
- Reliability
- Installability
- Maintainability
- Documentation
- Service
- Overall
27. HP Parameters of Customer Satisfaction
- FURPS
- Functionality
- Usability
- Reliability
- Performance
- Service
28. Examples: Metrics for Customer Satisfaction
- Percent of completely satisfied customers
- Percent of satisfied customers (satisfied and completely satisfied)
- Percent of dissatisfied customers (dissatisfied and completely dissatisfied)
- Percent of nonsatisfied customers (neutral, dissatisfied, and completely dissatisfied)
29. In-Process Quality Metrics
- Defect density during machine testing
- Defect arrival pattern during machine testing
- Phase-based defect removal pattern
- Defect removal effectiveness
30. Defect Density During Machine Testing
- Defect rate during formal machine testing (testing after code is integrated into the system library) is usually positively correlated with the defect rate in the field.
- The simple metric of defects per KLOC or function point is a good indicator of quality while the product is being tested.
31. Defect Density During Machine Testing (Cont'd)
- Scenarios for judging release quality:
- If the defect rate during testing is the same as or lower than that of the previous release, then ask: "Did the testing for the current release deteriorate?"
- If the answer is no, the quality perspective is positive.
- If the answer is yes, you need to do extra testing.
32. Defect Density During Machine Testing (Cont'd)
- Scenarios for judging release quality (cont'd):
- If the defect rate during testing is substantially higher than that of the previous release, then ask: "Did we plan for and actually improve testing effectiveness?"
- If the answer is no, the quality perspective is negative.
- If the answer is yes, then the quality perspective is the same or positive.
33. Defect Arrival Pattern During Machine Testing
- The pattern of defect arrivals gives more information than defect density during testing.
- The objective is to look for defect arrivals that stabilize at a very low level, or times between failures that are far apart, before ending the testing effort and releasing the software.
34. Two Contrasting Defect Arrival Patterns During Testing
35. Three Metrics for Defect Arrival During Testing
- The defect arrivals during the testing phase by time interval (e.g., week). These are raw arrivals, not all of which are valid.
- The pattern of valid defect arrivals when problem determination is done on the reported problems. This is the true defect pattern.
- The pattern of defect backlog over time. This is needed because development organizations cannot investigate and fix all reported problems immediately. This metric is a workload statement as well as a quality statement.
36. Phase-Based Defect Removal Pattern
- This is an extension of the test defect density metric.
- It requires tracking defects in all phases of the development cycle.
- The pattern of phase-based defect removal reflects the overall defect removal ability of the development process.
37. Defect Removal by Phase for Two Products
38. Defect Removal Effectiveness
- DRE = (Defects removed during a development phase / Defects latent in the product) x 100
- The denominator can only be approximated.
- It is usually estimated as:
- (Defects removed during the phase) + (Defects found later)
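The DRE estimate above can be sketched as follows; both defect counts are invented for illustration.

```python
# DRE sketch. Latent defects at a phase are approximated, as described
# above, by defects removed during the phase plus defects found later.

def dre(removed_in_phase, found_later):
    """Defect removal effectiveness, as a percentage."""
    latent = removed_in_phase + found_later
    return removed_in_phase * 100 / latent

# 160 defects removed during a phase; 40 more surfaced in later phases.
print(dre(160, 40))  # 80.0 percent
```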
39. Defect Removal Effectiveness (Cont'd)
- When done for the front end of the process (before code integration), it is called early defect removal effectiveness.
- When done for a specific phase, it is called phase effectiveness.
40. Phase Effectiveness of a Software Product
41. Metrics for Software Maintenance
- The goal during maintenance is to fix the defects as soon as possible with excellent fix quality.
- The following metrics are important:
- Fix backlog and backlog management index
- Fix response time and fix responsiveness
- Percent delinquent fixes
- Fix quality
42. Fix Backlog
- Fix backlog is a workload statement for software maintenance.
- It is related to both the rate of defect arrivals and the rate at which fixes for reported problems become available.
- It is a simple count of reported problems that remain open at the end of each time period (week, month, etc.)
43. Backlog Management Index (BMI)
- BMI = (Number of problems closed during the month / Number of problem arrivals during the month) x 100
- If BMI is larger than 100, the backlog is reduced.
- If BMI is less than 100, the backlog is increased.
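The BMI calculation is a one-liner; the monthly counts below are invented.

```python
# BMI sketch with hypothetical monthly counts. Multiplying before
# dividing keeps the arithmetic exact for whole-number counts.

def bmi(closed, arrivals):
    """Backlog Management Index: problems closed vs. arrivals, x 100."""
    return closed * 100 / arrivals

print(bmi(110, 100))  # 110.0 -> backlog shrank this month
print(bmi(80, 100))   # 80.0  -> backlog grew this month
```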
44. Opened Problems, Closed Problems, and Backlog Management Index by Month
45. Fix Response Time and Fix Responsiveness
- The fix response time metric is usually calculated as the mean time of all problems from open to closed.
- The metric may be used for different defect severity levels.
- Fix response time relates to customer satisfaction.
- But meeting the agreed-to fix time is more than just achieving a short fix time.
- A possible metric is the percentage of delivered fixes meeting committed dates to customers.
46. Percent Delinquent Fixes
- The mean response time metric is a central tendency measure.
- A more sensitive metric is the percentage of delinquent fixes (for each fix, if the turnaround time greatly exceeds the required response time, it is classified as delinquent).
- Percent delinquent fixes = (Number of fixes that exceeded the response time criteria by severity level / Number of fixes delivered in a specified time) x 100
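A small sketch of this metric follows. The fix records and the per-severity response-time criteria (in days) are invented for illustration.

```python
# Percent-delinquent-fixes sketch over hypothetical fix records.

# (severity, turnaround_days) for each fix delivered in the period.
fixes = [(1, 3), (1, 9), (2, 12), (2, 20), (3, 25), (3, 40)]
criteria = {1: 7, 2: 14, 3: 30}  # required response time by severity

# A fix is delinquent when its turnaround exceeds the criterion for
# its severity level.
delinquent = sum(1 for sev, days in fixes if days > criteria[sev])
percent_delinquent = delinquent * 100 / len(fixes)
print(percent_delinquent)  # 50.0
```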
47. Percent Delinquent Fixes (Cont'd)
- This is not a real-time metric because it is for closed problems only.
- For a real-time metric we must factor in problems that are still open.
- We can use the following metric:
- Real-Time Delinquency Index = 100 x Delinquent / (Backlog + Arrivals)
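The real-time index can be sketched the same way; all counts below are invented, and the denominator is read as the backlog plus the period's arrivals.

```python
# Real-time delinquency index sketch: delinquent problems (open or
# closed) measured against the backlog plus current arrivals.

def rtdi(delinquent, backlog, arrivals):
    return 100 * delinquent / (backlog + arrivals)

# 12 delinquent problems against a backlog of 60 plus 40 new arrivals.
print(rtdi(12, 60, 40))  # 12.0
```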
48. Real-Time Delinquency Index
49. Fix Quality
- The number of defective fixes is another quality metric for maintenance.
- The metric of percent defective fixes is simply the percentage of all fixes in a time interval that are defective.
- Record both the time the defective fix was discovered and the time the fix was made, in order to calculate the latent period of the defective fix.
50. Examples of Metrics Programs
- Motorola
- Follows the Goal/Question/Metric paradigm of Basili and Weiss
- Goals:
- Improve project planning
- Increase defect containment
- Increase software reliability
- Decrease software defect density
- Improve customer service
- Reduce the cost of nonconformance
- Increase software productivity
51. Examples of Metrics Programs (Cont'd)
- Motorola (cont'd)
- Measurement Areas:
- Delivered defects and delivered defects per size
- Total effectiveness throughout the process
- Adherence to schedule
- Accuracy of estimates
- Number of open customer problems
- Time that problems remain open
- Cost of nonconformance
- Software reliability
52. Examples of Metrics Programs (Cont'd)
- Motorola (cont'd)
- For each goal, the questions to be asked and the corresponding metrics were formulated.
- Goal 1: Improve Project Planning
- Question 1.1: What was the accuracy of estimating the actual value of project schedule?
- Metric 1.1: Schedule Estimation Accuracy (SEA)
- SEA = (Actual project duration) / (Estimated project duration)
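The SEA metric is a simple ratio; the two durations (in weeks) below are invented.

```python
# Schedule Estimation Accuracy sketch with hypothetical durations.
# A value above 1 means the project ran longer than estimated.

def sea(actual_duration, estimated_duration):
    return actual_duration / estimated_duration

# Estimated 20 weeks; actually took 26 weeks.
print(sea(26, 20))  # 1.3 -> 30% longer than estimated
```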
53. Examples of Metrics Programs (Cont'd)
- Hewlett-Packard
- The software metrics program includes both primitive and computed metrics.
- Primitive metrics are directly measurable.
- Computed metrics are mathematical combinations of primitive metrics:
- (Average fixed defects)/(working day)
- (Average engineering hours)/(fixed defect)
- (Average reported defects)/(working day)
- Bang: a quantitative indicator of net usable function from the user's point of view
54. Examples of Metrics Programs (Cont'd)
- Hewlett-Packard (cont'd)
- Computed metrics are mathematical combinations of primitive metrics (cont'd):
- (Branches covered)/(total branches)
- Defects/KNCSS (thousand noncomment source statements)
- Defects/LOD (lines of documentation not included in program source code)
- Defects/(testing time)
- Design weight: sum of module weights (a function of token and decision counts) over the set of all modules in the design
55. Examples of Metrics Programs (Cont'd)
- Hewlett-Packard (cont'd)
- Computed metrics are mathematical combinations of primitive metrics (cont'd):
- NCSS/(engineering month)
- Percent overtime = (average overtime)/(40 hours per week)
- Phase: (engineering months)/(total engineering months)
56. Examples of Metrics Programs (Cont'd)
- IBM Rochester
- Selected quality metrics:
- Overall customer satisfaction
- Postrelease defect rates
- Customer problem calls per month
- Fix response time
- Number of defect fixes
- Backlog management index
- Postrelease arrival patterns for defects and problems
57. Examples of Metrics Programs (Cont'd)
- IBM Rochester (cont'd)
- Selected quality metrics (cont'd):
- Defect removal model for the software development process
- Phase effectiveness
- Inspection coverage and effort
- Compile failures and build/integration defects
- Weekly defect arrivals and backlog during testing
- Defect severity
58. Examples of Metrics Programs (Cont'd)
- IBM Rochester (cont'd)
- Selected quality metrics (cont'd):
- Defect cause and problem component analysis
- Reliability (mean time to initial program loading during testing)
- Stress level of the system during testing
- Number of system crashes and hangs during stress testing and system testing
- Various customer feedback metrics
- S curves for project progress
59. Collecting Software Engineering Data
- The challenge is to collect the necessary data without placing a significant burden on development teams.
- Limit the metrics to those actually needed, to avoid collecting unnecessary data.
- Automate the data collection whenever possible.
60. Data Collection Methodology (Basili and Weiss)
- Establish the goal of the data collection
- Develop a list of questions of interest
- Establish data categories
- Design and test data collection forms
- Collect and validate data
- Analyze data
61. Reliability of Defect Data
- Testing defects are generally more reliable than inspection defects, since inspection defects are more subjective.
- An inspection defect is a problem found during the inspection process that, if not fixed, would cause one or more of the following to occur:
- A defect condition in a later inspection phase
62. Reliability of Defect Data (Cont'd)
- An inspection defect is one that would cause (cont'd):
- A defect condition during testing
- A field defect
- Nonconformance to requirements and specifications
- Nonconformance to established standards
63. An Inspection Summary Form