Title: Why the existing theory of software reliability must be discarded... and what should replace it
1 Why the existing theory of software reliability
must be discarded... and what should replace it?
Aditya P. Mathur Professor, Department of
Computer Science, Associate Dean, Graduate
Education and International Programs Purdue
University
Wednesday, July 26, 2006. Microsoft, Redmond, WA,
USA.
2 Reliability
- Probability of failure-free operation in a given
environment over a given time.
Mean Time To Failure (MTTF)
Mean Time To Disruption (MTTD)
Mean Time To Restore (MTTR)
3 Operational profile
- Probability distribution of usage of features
and/or scenarios.
Captures the usage pattern with respect to a
class of customers.
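As a rough illustration of the definition above, an operational profile can be written down as a mapping from scenarios to usage probabilities and then sampled to drive test selection. The scenario names and probabilities below are hypothetical, chosen only for the sketch.

```python
import random

# Hypothetical operational profile for one class of customers:
# scenario name -> probability that a given use exercises that scenario.
profile = {"open_file": 0.5, "edit": 0.3, "print": 0.15, "export": 0.05}


def validate_profile(profile, tol=1e-9):
    """Return True if the profile is a valid probability distribution."""
    non_negative = all(p >= 0 for p in profile.values())
    sums_to_one = abs(sum(profile.values()) - 1.0) <= tol
    return non_negative and sums_to_one


def sample_scenarios(profile, n, seed=0):
    """Draw n scenarios at random according to the profile."""
    rng = random.Random(seed)  # seeded so that a run is repeatable
    names = list(profile)
    weights = [profile[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


if __name__ == "__main__":
    print(validate_profile(profile))
    print(sample_scenarios(profile, 5))
```

A different class of customers would simply be a second dictionary with different weights over the same scenarios.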
4 Reliability estimation
[Figure: reliability estimation diagram; the only surviving label is "Operational profile".]
5 Issues: Operational profile
- Variable. Becomes known only after customers
have access to the product. It is a stochastic
process: a moving target!
- Random test generation requires an oracle. Hence
it is generally limited to specific outcomes, e.g.
crash, hang.
6 Issues: Failure data
- Should we analyze the failures?
- If yes, then once the cause is removed the
reliability estimate is invalid.
- If the cause is not removed because the
failure is a minor incident, then the
reliability estimate corresponds to irrelevant
incidents.
7 Issues: Model selection
- Rarely does a model fit the failure data.
Model selection becomes a problem. 200 models to
choose from? New ones keep arriving! More
research papers!
8 Issues: Markovian models
[Figure: Markov chain state diagram; the edges between states are labeled with transition probabilities.]
- Markov chain models suffer from a lack of
estimates of transition probabilities.
- To compute these probabilities, you need to
execute the application.
- During execution you obtain failure data. Then
why proceed further with the model?
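To make the point about transition probabilities concrete, here is a minimal sketch of how they might be estimated: by counting state-to-state moves in recorded execution traces. The trace format (a list of state labels per run) and the function name are assumptions made for illustration. Note that the traces only exist once the application has been executed, which is exactly when failure data becomes available too.

```python
from collections import Counter, defaultdict


def estimate_transitions(traces):
    """Estimate Markov transition probabilities from execution traces.

    Each trace is a list of state labels in the order they were visited.
    The estimate for (src -> dst) is the fraction of departures from src
    that went to dst.
    """
    counts = defaultdict(Counter)
    for trace in traces:
        for src, dst in zip(trace, trace[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: c / sum(dsts.values()) for dst, c in dsts.items()}
        for src, dsts in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical traces over three states named "1", "2" and "3".
    traces = [["1", "2", "3"], ["1", "3"], ["1", "2", "1", "3"]]
    print(estimate_transitions(traces))
```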
9 Issues: Assumptions
- Software does not degrade over time. A memory leak
is not degradation and is not a random process; a
new version is a different piece of software.
- The reliability estimate varies with the operational
profile. Different customers see different
reliability.
- Can we not have a reliability estimate that is
independent of the operational profile?
- Can we not advertise quality based on metrics that
are a true representation of reliability, not
with respect to a subset of features but over the
entire set of features?
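A small worked example shows why different customers see different reliability. It assumes, purely for illustration, that each feature has a fixed probability of failing per use and that per-use reliability is one minus the profile-weighted failure probability. All feature names and numbers are hypothetical.

```python
def reliability_per_use(profile, failure_prob):
    """Per-use reliability under a given operational profile.

    profile:      feature -> probability that a use exercises the feature
    failure_prob: feature -> assumed probability that one use of it fails
    """
    expected_failure = sum(p * failure_prob[f] for f, p in profile.items())
    return 1.0 - expected_failure


# Hypothetical per-feature failure probabilities for one piece of software.
failure_prob = {"edit": 0.001, "print": 0.05, "export": 0.2}

# Two hypothetical customers who use the same software differently.
customer_a = {"edit": 0.9, "print": 0.1, "export": 0.0}
customer_b = {"edit": 0.2, "print": 0.3, "export": 0.5}

if __name__ == "__main__":
    print(reliability_per_use(customer_a, failure_prob))  # about 0.9941
    print(reliability_per_use(customer_b, failure_prob))  # about 0.8848
```

The software is identical in both cases; only the weights change, and the estimate moves from roughly 0.99 to roughly 0.88.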
10 Sensitivity of Reliability to test adequacy
11 Basis for an alternate approach
Why not develop a theory based on coverage of
testable items and test adequacy?
Testable items: variables, statements, conditions, loops,
data flows, methods, classes, etc.
Pros: Errors hide in testable items.
Cons: Coverage of testable items is inadequate.
Is it a good predictor of reliability?
Yes, but only when used carefully. Let us see
what happens when coverage is not used or not
used carefully.
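12 Saturation Effect
[Figure: Reliability (vertical axis) versus Testing Effort (horizontal axis). Each adequacy criterion levels off at its own reliability: Functional at Rf, Decision at Rd, Dataflow at Rdf, Mutation at Rm. Effort-axis markers: tfs, tfe, tds, tde, tdfs, tdfe, tms, tfe.]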
FUNCTIONAL, DECISION, DATAFLOW AND MUTATION
TESTING PROVIDE TEST ADEQUACY CRITERIA.
13 Modeling an application
14 Reliability of a component
Reliability, the probability of correct operation,
of function f based on a given finite set of
testable items:
$R(f) = \alpha \cdot (\text{covered} / \text{total}), \quad 0 < \alpha < 1$
Issue: How to compute $\alpha$?
Approach: Empirical studies provide an estimate of $\alpha$
and its variance for different sets of testable
items.
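The component formula can be sketched in a few lines. This assumes the formula reads R(f) = alpha * (covered / total) with 0 < alpha < 1, where `alpha` stands for the empirically estimated factor mentioned above. The value of alpha used in the example is hypothetical, not a result from any study.

```python
def component_reliability(covered, total, alpha):
    """Coverage-based reliability estimate: alpha * (covered / total).

    covered: number of testable items exercised by the tests
    total:   number of testable items in the component
    alpha:   empirically estimated factor, strictly between 0 and 1
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= covered <= total:
        raise ValueError("covered must lie between 0 and total")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return alpha * covered / total


if __name__ == "__main__":
    # Hypothetical component: 45 of 50 decisions covered, alpha assumed 0.9.
    print(component_reliability(45, 50, 0.9))  # about 0.81
```

Because alpha is below 1, even full coverage never yields an estimate of 1, which matches the bound stated with the formula.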
15 Reliability of a subsystem
$C = \{f_1, f_2, \ldots, f_n\}$ is a collection of components
that collaborate with each other to provide
services.
$R(C) = g(R(f_1), R(f_2), \ldots, R(f_n), R(I))$
Issue 1: How to compute R(I), the reliability of
component interactions?
Issue 2: What is g?
Issue 3: The theory of systems reliability creates
problems when components (a) are in a loop and
(b) are dependent on each other.
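The slides leave g open. As one possible candidate only, the sketch below uses the classical series-system product, which assumes that every component and the interactions must all work and that they fail independently. That is precisely the assumption Issue 3 warns about, so this is a baseline to react to rather than the proposed g.

```python
import math


def subsystem_reliability(component_reliabilities, interaction_reliability):
    """Series-system candidate for g: product of all reliabilities.

    Assumes independence and no loops among components (an assumption
    made here for illustration, not a claim from the talk).
    """
    values = list(component_reliabilities) + [interaction_reliability]
    if any(not 0.0 <= r <= 1.0 for r in values):
        raise ValueError("reliabilities must lie in [0, 1]")
    return math.prod(values)


if __name__ == "__main__":
    # Hypothetical values for R(f1), R(f2) and R(I).
    print(subsystem_reliability([0.9, 0.8], 0.95))  # about 0.684
```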
16 Scalability
Is the component-based approach scalable?
Powerful coverage measures lead to better
reliability estimates, whereas measurement of
coverage becomes increasingly difficult as more
powerful criteria are used.
Solution: Use a component-based, incremental
approach. Estimate reliability bottom-up. There is no need
to measure coverage of components whose
reliability is known.
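The bottom-up idea can be sketched as a recursive walk over a component tree that stops as soon as it reaches a component whose reliability is already known. The `Component` class, the coverage-based leaf estimate with a factor `alpha`, and the product used to combine children are all assumptions made for this illustration. Interaction reliability is ignored here for brevity.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Component:
    name: str
    known_reliability: Optional[float] = None  # set if already established
    covered: int = 0                           # testable items covered
    total: int = 0                             # testable items in total
    children: List["Component"] = field(default_factory=list)


def estimate(component, alpha, measured):
    """Estimate reliability bottom-up.

    Names of components whose coverage had to be measured are appended
    to `measured`; components with known reliability are never measured.
    """
    if component.known_reliability is not None:
        return component.known_reliability
    if component.children:
        # Assumed combination rule: product over children (independence).
        return math.prod(estimate(c, alpha, measured) for c in component.children)
    measured.append(component.name)
    return alpha * component.covered / component.total


if __name__ == "__main__":
    # Hypothetical system: one reused component with known reliability,
    # one new component that still needs coverage measurement.
    system = Component("app", children=[
        Component("parser", known_reliability=0.99),
        Component("ui", covered=80, total=100),
    ])
    measured = []
    print(estimate(system, 0.9, measured))  # about 0.7128
    print(measured)                         # ['ui']
```

Only the new component appears in `measured`, which is the saving the incremental approach is after.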
17 Next steps
Develop a component-based theory of reliability.
Base the new theory on existing work in software
testing and reliability.
Experiment with large systems to
investigate the applicability of the theory and
its effectiveness in predicting and estimating
various reliability metrics.