Title: Software Reliability
1Software Reliability
2Software Reliability
- Informally--
- The reliability of a software system is a measure
of how well users think it meets their needs.
3Software Reliability
- Formally--
- Reliability is the probability of failure-free
operation for a specified time in a specified
environment for a specified purpose.
- Adapted from http://www.ece.cmu.edu/~koopman/des_s99/sw_reliability/
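One way to make the formal definition concrete is to estimate it from observed data. The sketch below assumes a hypothetical list of times to first failure, all measured in the same specified environment, and reports the fraction of runs that survived a given mission time. The function name and the sample values are illustrative and do not come from the source material.

```python
def empirical_reliability(times_to_failure, mission_time):
    """Estimate P(no failure before mission_time) from observed runs.

    times_to_failure: time to first failure for each observed run,
    all measured in the same environment and units (assumed data).
    """
    if not times_to_failure:
        raise ValueError("need at least one observed run")
    # A run counts as failure-free if its first failure came after the mission ended.
    survived = sum(1 for t in times_to_failure if t > mission_time)
    return survived / len(times_to_failure)


# Hypothetical observations, in hours of operation.
observed_hours = [5, 12, 30, 8, 50]
estimate = empirical_reliability(observed_hours, mission_time=10)
print(f"Estimated reliability for a 10-hour mission: {estimate:.2f}")
```

Changing the mission time or the environment in which the runs were observed changes the estimate, which is why the definition fixes all three.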
4Software Reliability
- Software reliability is not a direct function of
time, even though it is defined as a probability
over a specified time.
- Software does not become old and break down or
wear out with use.
- Software will not change over time unless
intentionally changed or upgraded.
- Adapted from http://www.ece.cmu.edu/~koopman/des_s99/sw_reliability/
5Software Reliability
- Software reliability is a primary factor in
software quality.
- Software quality also involves functionality,
usability, performance, serviceability,
capability, installability, maintainability, and
documentation.
- Software reliability is hard to achieve because
of the inherent complexity of software.
- Adapted from http://www.ece.cmu.edu/~koopman/des_s99/sw_reliability/
6Software Failures
- Software failures may be due to:
- Errors, ambiguities, oversights, or
misinterpretation of the specification that the
software is supposed to satisfy.
- Carelessness or incompetence in writing code,
inadequate testing, incorrect or unexpected usage
of the software, or other unforeseen problems.
- Adapted from http://www.ece.cmu.edu/~koopman/des_s99/sw_reliability/
7Software vs. Hardware Reliability
- Software and hardware reliability don't follow
the same rules
- Hardware faults are mostly physical faults
- Software faults are design faults
- Design faults are:
- Harder to visualize, classify, detect, and
correct.
- Related to fuzzy human factors and to the design
process itself, of which we don't have a solid
understanding.
- While hardware may also suffer from design
faults, these faults do not dominate.
- Adapted from http://www.ece.cmu.edu/~koopman/des_s99/sw_reliability/
8Software vs. Hardware Reliability
- Characteristics of software that make it
different from hardware:
- Failure cause: Software defects are mainly design
defects.
- Wear-out: Software does not have an
energy-related wear-out phase. Errors can occur
without warning.
- Repairable system concept: Periodic restarts can
help fix software problems.
- Time dependency and life cycle: Software
reliability is not a function of operational
time.
- Environmental factors: Do not affect software
reliability, except that they might affect
program inputs.
- http://www.ece.cmu.edu/~koopman/des_s99/sw_reliability/
9Software vs. Hardware Reliability
- Reliability prediction: Software reliability
cannot be predicted from any physical basis, since
it depends completely on human factors in design.
- Redundancy: Cannot improve software reliability
if identical software components are used.
- Interfaces: Software interfaces are conceptual,
not visual.
- Failure rate motivators: Usually not predictable
from analyses of separate statements.
- Built with standard components: Well-understood
and extensively tested standard parts will help
improve maintainability and reliability.
- Code reuse has been around for some time, but to
a very limited extent.
- There are no standard parts for software, except
to a limited degree.
- http://www.ece.cmu.edu/~koopman/des_s99/sw_reliability/
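The redundancy point can be illustrated with a small sketch. Three identical copies of a component share the same design fault, so a majority vote among them returns the same wrong answer. The component names and the injected fault are hypothetical. The second, independently written version goes beyond what the slide states and is included only as an assumed contrast.

```python
from collections import Counter


def abs_v1(x):
    """Hypothetical component with a design fault for x == -1."""
    if x == -1:
        return -1  # design fault: should return 1
    return abs(x)


def abs_v2(x):
    """Hypothetical independently written version without that fault."""
    return x if x >= 0 else -x


def majority_vote(components, x):
    """Run every component on x and return the most common result."""
    results = [component(x) for component in components]
    value, _count = Counter(results).most_common(1)[0]
    return value


identical = [abs_v1, abs_v1, abs_v1]
diverse = [abs_v1, abs_v2, abs_v2]
print(majority_vote(identical, -1))  # all copies share the fault
print(majority_vote(diverse, -1))    # the faulty copy is outvoted
```

Replicating hardware helps because physical faults tend to strike copies independently. A design fault is present in every copy of the same code.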
10Hardware Reliability
- http://www.ece.cmu.edu/~koopman/des_s99/sw_reliability/
11Software Reliability
- http://www.ece.cmu.edu/~koopman/des_s99/sw_reliability/
12Software Reliability
- Reliability is a direct reflection of users'
expectations.
- Software reliability is a function of the number
of failures experienced by a particular user with
a particular piece of software.
13Software Reliability
- Reliability is related to the probability of an
error occurring in operational use.
- Removing software problems from parts of the
system that are rarely used does little to
increase perceived reliability.
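A rough calculation shows why. The sketch below assumes a hypothetical usage profile in which each feature has a probability of being used and a probability of failing when used. The feature names and numbers are made up for illustration.

```python
def failure_probability(profile):
    """Probability that a randomly chosen user operation fails.

    profile maps a feature name to a pair
    (probability the feature is used, probability it fails when used).
    """
    return sum(p_use * p_fail for p_use, p_fail in profile.values())


def with_fault_removed(profile, feature):
    """Return a copy of profile in which `feature` never fails."""
    fixed = dict(profile)
    p_use, _ = fixed[feature]
    fixed[feature] = (p_use, 0.0)
    return fixed


# Hypothetical usage profile; the numbers are made up for illustration.
profile = {
    "search": (0.90, 0.01),
    "export": (0.09, 0.01),
    "bulk_import": (0.01, 0.05),
}

baseline = failure_probability(profile)
fix_rare = failure_probability(with_fault_removed(profile, "bulk_import"))
fix_common = failure_probability(with_fault_removed(profile, "search"))
print(f"baseline={baseline:.4f} fix_rare={fix_rare:.4f} fix_common={fix_common:.4f}")
```

In this made-up profile the rarely used feature fails five times as often per use as the others, yet repairing it barely changes what users experience, while repairing the heavily used feature removes most observed failures.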
14Software Reliability
- Why would a program with known faults be
perceived as reliable by users?
15Software Reliability: When is a fault not a
fault?
- Users may never enter "faulty" input, so program
failures never occur.
- Users may never use a portion of the program
that has faults.
- Experienced users often "work around" software
faults which are known to cause failures.
- What the users experience is all they know.
16Software Reliability
- If the customer does not perceive a problem,
should we repair the faults in these sections of
code?
17Software Reliability
- Reliability is difficult to quantify.
- Reliability depends on the pattern of usage of
that particular system.
18Software Reliability
- Some claim that the use of formal methods for
system development leads to more reliable
systems.
19Software Reliability
- Experience shows that formal specification and
proofs do not guarantee that the software will be
reliable in practical use.
- Why?
20Why do formal specifications and proofs not
guarantee reliability?
- The specification may not reflect the real
requirements of system users.
- Formal program proofs may contain errors because
they are large and complex, reflecting large and
complex programs.
- The system may not be used as anticipated.
21Software Reliability
- It is NOT possible to certify that a system is
100% reliable.
- The time involved in doing so would be equal to
the lifetime of the system.
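A related statistical argument points the same way. Assuming test runs are independent and drawn from the real usage pattern (an assumption not made in the slides), n failure-free runs support a per-run reliability R with confidence C only when R^n <= 1 - C. The sketch below computes the smallest such n. The function name is illustrative.

```python
import math


def failure_free_runs_needed(target_reliability, confidence):
    """Smallest n with target_reliability ** n <= 1 - confidence.

    Assumes independent runs sampled from the operational usage pattern.
    """
    if not 0.0 < target_reliability < 1.0:
        # At exactly 1.0 no finite number of runs is enough.
        raise ValueError("target reliability must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(target_reliability))


for target in (0.9, 0.99, 0.999, 0.9999):
    print(target, failure_free_runs_needed(target, 0.95))
```

The required number of runs grows without bound as the target approaches 1, so under these assumptions a claim of 100% reliability can never be demonstrated by testing alone.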
22Software Reliability
- As reliability requirements increase, system
costs usually rise exponentially.
- Reliable systems are often less efficient.
- Why?
23How can reliable systems be less efficient?
- We MUST include extra, often redundant, code to
perform the necessary checking for exceptional
conditions.
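As a minimal sketch of what that extra code looks like, compare two hypothetical versions of an averaging routine. The checked version does more work on every call, but it rejects exceptional input explicitly instead of crashing or silently returning a meaningless value. Both function names are illustrative.

```python
import math


def mean_fast(values):
    """No checking: shortest and fastest, but fragile."""
    return sum(values) / len(values)


def mean_checked(values):
    """Same computation with explicit checks for exceptional input."""
    if not values:
        raise ValueError("cannot average an empty list")
    total = 0.0
    for v in values:
        if not isinstance(v, (int, float)):
            raise TypeError(f"not a number: {v!r}")
        if not math.isfinite(v):
            raise ValueError(f"not a finite number: {v!r}")
        total += v
    return total / len(values)


print(mean_checked([1, 2, 3]))
print(mean_fast([1.0, float("nan")]))  # silently propagates a NaN
```

The per-element checks are the redundant work the slide refers to: they cost time on every call even though valid input never triggers them.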
24But -- Reliability should always take precedence
over efficiency because
- Computers are now cheap and fast.
- Unreliable software is liable to be discarded
by users.
- System failure costs may be enormous.
- Unreliable systems are difficult to improve.
- Inefficiency is predictable.
- Unreliable systems may cause information loss.
25Software Reliability
- Users often complain that systems are unreliable.
- This may be due to poor software engineering
- -- BUT --
- The most common cause of perceived unreliability
is incomplete specifications.
26The most common cause of perceived unreliability
is incomplete specifications
27Unreliability: incomplete specifications
- The system performs as specified but the
specifications do not set out how the software
should behave in exceptional situations (or even
not so exceptional situations).
28Building in reliability
- Most large systems are composed of several
sub-systems which often have different
reliability requirements.
- Reliability requirements of each sub-system must
be assessed separately rather than imposing the
maximum reliability requirement on all
sub-systems.
29Failure class   Description
Transient         Occurs only with certain inputs
Permanent         Occurs with all inputs
Recoverable       System can recover without operator intervention
Unrecoverable     Operator intervention is needed to recover
Non-corrupting    Failure does not corrupt system state or data
Corrupting        Failure corrupts system state or data
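The classification above could be recorded in code roughly as follows. This sketch assumes the six classes form three independent pairs, so that a single failure is described by one choice from each pair. That grouping and all type names are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Occurrence(Enum):
    TRANSIENT = "occurs only with certain inputs"
    PERMANENT = "occurs with all inputs"


class Recovery(Enum):
    RECOVERABLE = "system can recover without operator intervention"
    UNRECOVERABLE = "operator intervention is needed to recover"


class Effect(Enum):
    NON_CORRUPTING = "failure does not corrupt system state or data"
    CORRUPTING = "failure corrupts system state or data"


@dataclass(frozen=True)
class FailureClass:
    """One choice from each pair describes a single observed failure."""
    occurrence: Occurrence
    recovery: Recovery
    effect: Effect

    def describe(self):
        axes = (self.occurrence, self.recovery, self.effect)
        return "; ".join(axis.value for axis in axes)


worst_case = FailureClass(
    Occurrence.PERMANENT, Recovery.UNRECOVERABLE, Effect.CORRUPTING
)
print(worst_case.describe())
```

Recording failures this way makes it possible to state different reliability requirements for different classes, for example tolerating transient, recoverable failures while forbidding corrupting ones.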
30Software Validation
- Software managers must decide how much effort
should be devoted to system testing.
- Customers expect their software to operate
without failures.
31Reliability strategies
- (1) Fault avoidance
- (2) Fault tolerance
- (3) Fault detection
32Reliability strategies: Fault avoidance
- The design and implementation process should be
organized around the goal of producing fault-free
systems!
33Reliability strategies: Fault tolerance
- The product is designed with the expectation that
some faults will persist past the testing stage.
- Facilities (recovery procedures) are provided in
the software to allow operation to continue when
these faults cause system failures.
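One possible shape for such a recovery procedure is sketched below: try a primary routine, check its result with an acceptance test, and fall back to an alternate if the result is rejected or the routine raises. This is only one of several fault-tolerance techniques and is not taken from the slides. The routine names and the injected fault are hypothetical.

```python
import math


def recovery_block(alternates, acceptance_test, *args):
    """Return the first result that passes the acceptance test."""
    for alternate in alternates:
        try:
            result = alternate(*args)
        except Exception:
            continue  # this alternate failed outright; try the next one
        if acceptance_test(result, *args):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def faulty_sqrt(x):
    """Hypothetical primary routine with a residual fault at x == 4.0."""
    if x == 4.0:
        return 3.0  # residual fault that slipped past testing
    return x ** 0.5


def close_to_square_root(result, x):
    """Acceptance test: result must be non-negative and square back to x."""
    return result >= 0 and abs(result * result - x) <= 1e-9 * max(1.0, x)


print(recovery_block([faulty_sqrt, math.sqrt], close_to_square_root, 4.0))
print(recovery_block([faulty_sqrt, math.sqrt], close_to_square_root, 9.0))
```

The acceptance test and the alternate routine are extra code that runs only to guard against faults, which ties back to the earlier point that reliable systems are often less efficient.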
34Reliability strategies: Fault detection
- The goal is to detect faults BEFORE the software
is put into operation.
35Software Reliability
- A small number of faults may be acceptable.
- However--
- When software is safety-critical the program MUST
be designed so that residual faults do not cause
catastrophic failure.