Title: Idaho RISE
Slide 1: Idaho RISE System Reliability and Designing to Reduce Failure
ENGR204, 19 Sept 2005
Slide 2: Reliability Analysis
Let R = probability that the system (or instrument) will operate without failure for time t (Success Probability):
$R = e^{-\lambda t}$
Note: $\lambda$ = failure rate (failures/second), in s$^{-1}$; $\lambda = \tau^{-1}$, where $\tau$ = average seconds/failure
Failure Probability = $1 - R$
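As a minimal sketch of the exponential reliability formula above, the following Python computes R and the failure probability for one instrument. The function name `reliability`, the 1000-hour mean time to failure, and the 2-hour mission are illustrative assumptions, not values from the lecture.

```python
import math

def reliability(failure_rate: float, t: float) -> float:
    """Success probability R = exp(-lambda * t) for a constant failure rate."""
    return math.exp(-failure_rate * t)

# Hypothetical instrument: average time to failure tau = 1000 hours (assumed).
tau_seconds = 1000 * 3600.0
lam = 1.0 / tau_seconds        # failure rate lambda = 1/tau, in 1/s
mission_seconds = 2 * 3600.0   # assumed 2-hour mission

R = reliability(lam, mission_seconds)
F = 1.0 - R                    # failure probability
print(f"R = {R:.4f}, failure probability = {F:.4f}")
```

Note that at t = tau the reliability falls to 1/e, about 37%, so a mission should be short compared with the average time to failure.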
Slide 3:
If a system comprises n nonredundant systems, all equally essential for mission success, then the total system reliability is
$R_s = R_1 R_2 R_3 \cdots R_n = e^{-\lambda_1 t} e^{-\lambda_2 t} e^{-\lambda_3 t} \cdots e^{-\lambda_n t}$
where $\lambda_i$ is the failure rate of the ith system
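The series formula can be checked numerically. This sketch assumes three hypothetical failure rates and a one-hour mission; it also shows that multiplying the individual reliabilities is the same as summing the failure rates in the exponent.

```python
import math

def series_reliability(failure_rates, t):
    """R_s = product of exp(-lambda_i * t), i.e. exp(-t * sum(lambda_i))."""
    return math.exp(-t * sum(failure_rates))

# Hypothetical failure rates (1/s) for three essential subsystems (assumed values).
rates = [1e-6, 5e-6, 2e-5]
t = 3600.0  # assumed one-hour mission, in seconds

r_each = [math.exp(-lam * t) for lam in rates]  # R_i for each subsystem
r_s = math.prod(r_each)                         # R_s = R_1 * R_2 * R_3

print("individual:", [round(r, 4) for r in r_each])
print("series:", round(r_s, 4))
```

The series result is always lower than the reliability of the weakest subsystem, which is one argument for keeping the number of essential parts small.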
Slide 4:
If a system comprises n redundant systems in parallel, each of which can satisfy the mission requirements individually, then the system parallel (redundant) reliability is
$R_p = 1 - (1 - R_1)(1 - R_2)(1 - R_3) \cdots (1 - R_n) = 1 - F_1 F_2 F_3 \cdots F_n$
where $F_i = (1 - R_i)$ is the failure probability of the ith system
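A short sketch of the parallel formula follows, assuming three hypothetical units that are each 90% reliable and that fail independently (the independence assumption is implicit in multiplying the failure probabilities).

```python
import math

def parallel_reliability(reliabilities):
    """R_p = 1 - product of F_i, where F_i = 1 - R_i (independent failures assumed)."""
    failure_product = math.prod(1.0 - r for r in reliabilities)
    return 1.0 - failure_product

# Three hypothetical redundant units, each with R = 0.90 (assumed values).
units = [0.90, 0.90, 0.90]
r_p = parallel_reliability(units)
print(f"parallel reliability = {r_p:.4f}")
```

With these assumed numbers the system fails only if all three units fail, so the failure probability drops from 0.1 for one unit to 0.1 cubed for the set.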
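Slide 5: Series Reliability
[Diagram: blocks A, B, C connected in series]
$R_{tot} = R_A R_B R_C$
Full Redundancy
[Diagram: blocks A, B, C connected in parallel]
$R_{tot} = 1 - (1 - R_A)(1 - R_B)(1 - R_C)$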
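Slide 6: Partial Redundancy (A and B are redundant, C is essential)
[Diagram: blocks A and B in parallel, followed by C in series]
$R_{tot} = R_C \left[ 1 - (1 - R_A)(1 - R_B) \right]$
Non-Identical Full Redundancy (A and B are essential, C is redundant)
[Diagram: blocks A and B in series, in parallel with C]
$R_{tot} = 1 - (1 - R_A R_B)(1 - R_C)$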
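The four block arrangements can be compared by composing two small helpers. This is a sketch only: the helper names `series` and `parallel` and the block reliabilities for A, B, and C are assumptions chosen for illustration.

```python
import math

def series(*r):
    """All blocks required: multiply the reliabilities."""
    return math.prod(r)

def parallel(*r):
    """Any one block suffices: one minus the product of failure probabilities."""
    return 1.0 - math.prod(1.0 - x for x in r)

# Hypothetical block reliabilities (assumed values).
ra, rb, rc = 0.90, 0.80, 0.95

configs = {
    "series": series(ra, rb, rc),
    "full redundancy": parallel(ra, rb, rc),
    "partial redundancy": series(rc, parallel(ra, rb)),
    "non-identical full redundancy": parallel(series(ra, rb), rc),
}

for name, value in configs.items():
    print(f"{name}: {value:.4f}")
```

Nesting the helpers mirrors the block diagrams directly, so a larger system can be evaluated by reading its diagram from the inside out.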
Slide 7: Designing for Reliability
1. Keep It Simple!
2. Design Margin - Assure adequate strength of all mechanical and electrical parts, including allowance for unusual loads due to environmental extremes. This includes environmental shielding.
3. Redundancy - Provide alternative means of accomplishing required functions where design for excess strength is not suitable or reasonable. This includes most electronics.
Slide 8: Notes on Redundancy
- Same Design Redundancy: two or more identical components or systems
  - Switching allows only one system to be active
  - Outputs can be combined so switching is not necessary (e.g., power distribution systems)
  - Voting for combining outputs of redundant units. Requires three or more units (e.g., accelerometer activation of critical sequence)
  - Offers high protection against random failures
  - Not effective against design deficiencies
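Voting among redundant units can be sketched as a simple majority function. The example below is illustrative only: the function name `majority_vote` and the three accelerometer flags are assumptions, and a real flight system would also need to handle timing and disagreement reporting.

```python
def majority_vote(signals):
    """Return True when more than half of the redundant units report True."""
    votes = sum(1 for s in signals if s)
    return votes > len(signals) / 2

# Three hypothetical accelerometer launch-detect flags (assumed data).
# Unit 2 has failed and reports False, but the other two outvote it.
flags = [True, False, True]
start_sequence = majority_vote(flags)
print("start critical sequence:", start_sequence)
```

With only two units a disagreement cannot be resolved, which is why voting requires three or more.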
Slide 9: Notes on Redundancy, cont.
- Diverse Design Redundancy: utilize two or more systems of different design
  - High protection against failures due to design deficiencies
  - Can offer lower cost if the backup is a "lifeboat" with lesser accuracy and functionality, but still adequate for minimum mission needs
Slide 10: Notes on Redundancy, cont.
- Functional (Analytic) Redundancy: addressing requirements by different techniques. For example, determination of spacecraft attitude by gyroscope or by star tracker.
  - Avoids cost and weight penalties of physical redundancy
  - Provides protection against design faults
  - Disadvantage: backup usually provides reduced performance.
- Temporal Redundancy: repetition of unsuccessful operation (i.e., retry after failure)
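Temporal redundancy can be illustrated with a small retry wrapper. This is a sketch under stated assumptions: the helper `retry`, the attempt limit, and the simulated sensor read `flaky_read` are all hypothetical and stand in for whatever operation a payload might repeat.

```python
def retry(operation, max_attempts=3):
    """Call operation() until it succeeds or max_attempts is reached."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return operation()
        except Exception as err:  # any failure triggers another attempt
            last_error = err
    raise RuntimeError(f"failed after {max_attempts} attempts") from last_error

# Simulated sensor read that fails twice, then succeeds (assumed behavior).
attempts = {"count": 0}

def flaky_read():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise IOError("sensor read timed out")
    return 21.5

value = retry(flaky_read)
print("reading:", value, "after", attempts["count"], "attempts")
```

Retrying only helps with transient faults; a permanent hardware failure will exhaust the attempts and still needs a physical or functional backup.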
Slide 11: Apollo Design Principles
The primary consideration governing the design of the Apollo system was that, if it could be made so, no single failure should cause the loss of any crewmember, prevent the successful continuation of the mission, or, in the event of a second failure in the same area, prevent a successful abort of the mission.
To implement this policy, the following specific principles were established:
1. Use established technology
2. Stress hardware reliability
3. Comply with safety standards
4. Minimize in-flight maintenance and testing for failure isolation
5. Simplify operations
6. Minimize interfaces
7. Make maximum use of experience gained from previous manned-space missions.
Reference: NASA SP-287
Slide 12: Qualification and Acceptance Testing
- Assume:
  - Engineering data is complete and exact
  - Engineering data completely controls manufacture
  - All items manufactured to the same engineering data are identical.
- Therefore:
  - The results of Qualification Tests for one component are considered valid for all components.
  - If a representative component passes a sequence of qualification tests, all other components built to the same engineering specifications should also pass
Design is said to be "Qualified"
Acceptance Testing is less severe, and is for the purpose of certifying workmanship
Slide 13: Failure Mode Definitions
Catastrophic failure: complete loss of mission, including flight hardware. (Examples: loss of GPS; parachute failure)
Major failure: significant loss of mission primary goals; significant degradation expected. (Example: power supply failure)
Minor: minor loss of data or ability to achieve mission goals; system failure that is overcome by other flight systems. (Examples: loss of primary temp sensor, but temp data still retrieved from backup sensor; loss of single GPS)
Negligible: negligible impact on achieving mission goals.
Slide 14: Team Assignment
Consider Catastrophic and Single Point failure possibilities.
1. Initiate a list of potential Catastrophic, Major, and Minor Failures.
2. How can Catastrophic and Major failure possibilities be prevented? Consider simplifying design, redundancy, and design margins.
3. Which failures are Single Point (i.e., if a failure occurs there is no viable means of recovery)?
Example of Catastrophic Single Point Failure: heat shield on atmospheric entry probe