Title: Week 9 - Systems Engineering
1Week 9 - Systems Engineering
How bad can a weakest link problem be? This is
the Silver Bridge at Point Pleasant, WV, which
collapsed into the Ohio River during rush hour on
Dec 15, 1967. The cause was the failure of a
single eyebar in the suspension chain, due to a
defect 0.1 inch deep.
- System-Wide Requirements: The Ilities
- Reliability
2Worst case reliability - Engineering disasters
- AT&T Network Crash story (See http://users.csc.calpoly.edu/jdalbey/SWE/Papers/att_collapse.html.)
- Kansas City Hotel story (See for example http://ethics.tamu.edu/Portals/3/Case%20Studies/HyattRegency.pdf.)
- Challenger (discussed here)
AT&T network map
3The Ilities
- Quality
- Reliability
- Blanchard and Fabrycky, Systems Engineering and Analysis, 4th Ed., Ch. 12
- Wasson, Ch. 50
- Interoperability
- Usability
- Maintainability
- Serviceability
- Producibility and Disposability
4The Ilities-2
- All are System Wide in Scope.
- All are desirable system outcomes.
- Technical, engineering, mathematical definitions behind each one.
- Included as Technology and System-Wide requirements when critical enough.
- How to measure and quantify?
5The Second Ility - Reliability
- Our focus
- Reliability Definitions.
- Series and Parallel Systems.
- Reliability Improvement Methods.
- Reliability Prediction and Testing.
- Risk (Ch. 19)
6Definition of Reliability
- The reliability of an item is the probability that it will adequately perform its function for a specified period of time.
- Time is involved:
- specify units: hrs, miles, etc.
- specify time duration.
7Reliability vs. Quality
- Reliability includes passage of time.
- Quality: a static descriptor.
- Or, may include Reliability as one component.
- High reliability implies high quality; the converse is not true.
- Tire example
- Ones made in 1960 and 2000.
- Both high quality wrt the standards of their day.
- New ones last longer: more reliable.
- Microsoft example
- Quality means three dimensions: Reliability, Feature Set, and Schedule!
8Reliability Example
- Space Shuttle Challenger accident on January 28, 1986.
- O-rings sealed the joints in the solid rocket motors.
- Engineers used two O-rings: one for backup.
12Launch Details
- During flight, the rocket casing bulges, which widens the gap between sections.
- Due to low temperature and the bulging effect, both O-rings failed, resulting in the accident (not independent systems).
- Launch reliability calculated (after the accident) as 0.87 at 31 deg F (but 0.98 at 60 deg F).
16Three Aspects of Reliability
- Analysis: how to quantify, equations
- Testing: how to test
- Prediction: how do I know in advance
- We'll look at analysis first.
17Measures of Reliability (B&F 12.2, Wasson Ch 50)
- Reliability Function, R(t): probability that system will be successful for some time period t.
- $R(t) = 1 - F(t)$
- F(t) is the failure distribution or unreliability function.
- Like, what are the odds of the system staying up for a year?
- At $t = 0$, $R(t) = 1.0$. At $t = \infty$, $F(t) = 1.0$.
18R(t) for Exponential distn.
Integral from t to infinity is the rest of the probability beyond t, i.e., the probability it didn't fail up to time t.
- $R(t) = 1 - F(t) = \int_t^{\infty} f(t)\,dt$
- If time to failure is (assumed to be) described by the exponential function (constant failure rate), then
- $f(t) = \frac{1}{\theta} e^{-t/\theta}$, where $\theta$ is the mean life.
Like, if half fail in year 1, then half of the remaining ones will fail in year 2, etc.
19Resulting R(t) function
- $R(t) = \int_t^{\infty} \frac{1}{\theta} e^{-t/\theta}\,dt = e^{-t/\theta}$
- Mean life ($\theta$) is average lifetime of all items considered.
- For exponential distribution, MTBF is $\theta$.
This is the accumulated value, what you get doing the integration.
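The integration step can be written out as a short worked derivation. This is a sketch that assumes the exponential density with mean life $\theta$; the dummy variable $x$ is introduced here only to keep the limits clear.

```latex
\begin{align*}
R(t) &= \int_t^{\infty} \frac{1}{\theta}\, e^{-x/\theta}\,dx
      = \Bigl[ -e^{-x/\theta} \Bigr]_t^{\infty}
      = e^{-t/\theta}, \\
F(t) &= 1 - R(t) = 1 - e^{-t/\theta}, \\
R(2t) &= e^{-2t/\theta} = \bigl( R(t) \bigr)^2 .
\end{align*}
```

The last line is the "half, then half of the remainder" behavior: if $R(1\ \text{yr}) = 0.5$, then $R(2\ \text{yr}) = 0.25$.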
20Failure rate and MTBF
- $R(t) = e^{-t/\theta} = e^{-\lambda t}$
- $\lambda$ is instantaneous failure rate
- M or $\theta$ are MTBF.
- $\lambda = 1/\theta = 1/\text{MTBF}$
21Wasson MTTF
Light bulb failures
22Wasson MTBF
- Wasson suggests:
- MTBF = MTTF + MTTR
- MTBF: Mean Time Between Failures
- MTTF: Mean Time To Failure
- MTTR: Mean Time To Repair
- Since MTTR is small, MTBF ≈ MTTF
23Systems Perspective of Failures
- A failure is any event where the system is not functioning properly.
- Failures may be classified as primary, secondary, etc. (Table 12.1).
- Wasson suggests MIL-HDBK-470A: failure of mission-critical items.
- Systems engineers must consider all failure modes and types.
Failure distributions consider many modes of failure and are therefore often difficult to characterize.
24Useful fact
- If a system has a constant failure rate, the reliability of that item at its mean life is 37%.
- 37% probability that it will survive to its mean life without failure.
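Where the 37% comes from: a one-line worked step, assuming the constant-failure-rate (exponential) model $R(t) = e^{-t/\theta}$ evaluated at the mean life $t = \theta$.

```latex
R(\theta) = e^{-\theta/\theta} = e^{-1} \approx 0.368
```

So roughly 63% of items are expected to fail before reaching the mean life, regardless of the value of $\theta$.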
25Exponential Distn: $f(t) = e^{-t}$
See Figure 12.1 errors
26Failure and Hazard Rates
27The Failure Rate
- Failure Rate is:
- Number of Failures / Total Operating Hrs
- Failure rate expressed as failures per hour, failures per million hours, etc.
28Failure Rate Example
- 10 components tested for 600 hrs.
- 5 failed during the test, so the other 5 lasted the full 600 hours.
- Total of 4180 operating hours in the test, for all 10.
- Failure rate per hr: $\lambda = 5/4180 = 0.001196$
- MTBF? (This is a prediction for all.)
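The arithmetic in this example can be checked with a short Python sketch. The individual failure times below are hypothetical: the slide gives only the total of 4180 hours, and these five values were chosen so that the total matches. The function name `failure_rate` is also just illustrative.

```python
def failure_rate(failure_times, n_units, test_hours):
    """Point estimate: number of failures / total operating hours.

    Failed units contribute their time to failure; survivors contribute
    the full test duration.
    """
    survivors = n_units - len(failure_times)
    total_hours = sum(failure_times) + survivors * test_hours
    return len(failure_times) / total_hours, total_hours


# Hypothetical failure times (hrs), chosen to reproduce the 4180-hour total.
failure_times = [75, 125, 130, 325, 525]

lam, total_hours = failure_rate(failure_times, n_units=10, test_hours=600)
mtbf = 1.0 / lam  # valid under the constant-failure-rate assumption

print(total_hours, round(lam, 6), round(mtbf, 1))  # 4180 0.001196 836.0
```

Under the constant-failure-rate assumption, the MTBF is the reciprocal of the failure rate, about 836 hours here.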
29Reliability Nomograph - Fig 12.3
- For exponential distribution.
- Relationship between MTBF, $\lambda$, R(t).
- Example: MTBF is 200 hrs ($\lambda = 0.005$) and operating time is 2 hrs, then R(t) = 0.99
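The nomograph is a graphical lookup for the exponential relationship, so the same example can be reproduced numerically. A minimal sketch, assuming the constant-failure-rate model; the function name `reliability` is illustrative.

```python
import math


def reliability(t, mtbf):
    """R(t) = exp(-t / MTBF) for a constant failure rate."""
    return math.exp(-t / mtbf)


r = reliability(t=2, mtbf=200)              # the nomograph example
r_at_mean_life = reliability(t=200, mtbf=200)

print(round(r, 4))               # 0.99
print(round(r_at_mean_life, 3))  # 0.368
```

The second value is the reliability at the mean life, which is the same for any MTBF.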
30$\lambda = 1/\theta = 1/\text{MTBF}$
31Failure Rates vs. Life
32Wasson Bathtub Curve
Burn-in of electronics devices
33Wasson Electronic Equip
34Reliability of Component Relationships
- Engineers assemble systems from components and sub-systems.
- How to analyze the reliability of the whole based on structure and component reliabilities?
- Two simple structures: series and parallel.
35Series Networks
- Series components: all must function.
- $R = (R_A)(R_B)(R_C)$ (multiply Rs)
- $R = e^{-(\lambda_A + \lambda_B + \lambda_C)t}$ (add $\lambda$s)
36Sample Problem Series
- Series system of four components, expected to operate to 1000 hrs.
- MTBFs:
- A (6000 hrs), B (4500), C (10500), D (3200)
- What is R for the series system?
- (Ans. 0.4508)
- What is MTBF for the series system?
37Solution
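              A         B         C         D         Sum       R(system)
MTBF (hrs)    6000      4500      10500     3200
lambda (/hr)  0.000167  0.000222  9.52E-05  0.000313  0.000797  0.450847
R(1000)       0.846482  0.800737  0.909156  0.731616
Prod Rs       0.450847
Sum the lambdas: $R = e^{-1000\,\lambda_{sum}}$ with $\lambda_{sum} \approx 0.000797$ per hr (unrounded) gives 0.450847.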
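The series calculation can be reproduced with a few lines of Python. This is a sketch assuming constant failure rates for all four components; the function name `series_system` is illustrative.

```python
import math


def series_system(mtbfs, t):
    """Series system with constant failure rates.

    Failure rates add, so R(t) = exp(-sum(lambda_i) * t) and the system
    MTBF is 1 / sum(lambda_i).
    """
    lam_total = sum(1.0 / m for m in mtbfs)
    return math.exp(-lam_total * t), 1.0 / lam_total


mtbfs = [6000, 4500, 10500, 3200]  # components A, B, C, D
r_sys, mtbf_sys = series_system(mtbfs, t=1000)

# Cross-check: multiplying the individual reliabilities gives the same result.
r_product = math.prod(math.exp(-1000 / m) for m in mtbfs)

print(round(r_sys, 6), round(mtbf_sys, 1))
```

The system MTBF comes out near 1255 hours, lower than that of the weakest component (3200 hours), which is the weakest-link effect in numeric form.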
38Parallel Networks
- Parallel components: all must fail for system to fail.
- $R = R_A + R_B - (R_A)(R_B)$ (two components)
- $R = 1 - (1 - R_A)(1 - R_B)(1 - R_C)$
- (extends the same way to n components)
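A small sketch of the parallel rule for any number of components, assuming the components fail independently. The component values are hypothetical and the function name `parallel_reliability` is illustrative.

```python
import math


def parallel_reliability(reliabilities):
    """System fails only if every component fails (independence assumed)."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


# Hypothetical components, each with R = 0.9
r_two = parallel_reliability([0.9, 0.9])
r_three = parallel_reliability([0.9, 0.9, 0.9])

# Two-component form: R_A + R_B - R_A * R_B
r_two_alt = 0.9 + 0.9 - 0.9 * 0.9

print(round(r_two, 4), round(r_three, 4))  # 0.99 0.999
```

Each added parallel component buys less than the one before it, which is one reason redundancy is applied selectively.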
39Reliability and Redundancy
40Series and Parallel Networks
- Figure 12.10: Reduce parallel blocks to an equivalent series element.
41Sample Problems
- Figure 12.10 a and c.
- $R_A = 0.99$
- $R_B = 0.96$
- $R_C = 0.98$
- $R_D = 0.92$
- $R_E = 0.8$
- $R_F = 0.8$
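The reduction technique (collapse each parallel block into one equivalent element, then multiply along the series chain) can be sketched as follows. The network layout used here is hypothetical, since the layouts of Figure 12.10 are not reproduced in these notes; only the component reliabilities come from the slide.

```python
import math


def series(*rs):
    """All elements must work: multiply reliabilities."""
    return math.prod(rs)


def parallel(*rs):
    """All elements must fail for the block to fail (independence assumed)."""
    return 1.0 - math.prod(1.0 - r for r in rs)


R = {"A": 0.99, "B": 0.96, "C": 0.98, "D": 0.92, "E": 0.8, "F": 0.8}

# Hypothetical layout (NOT necessarily Figure 12.10):
# A in series with (B parallel C), then D, then (E parallel F).
r_bc = parallel(R["B"], R["C"])   # equivalent series element for B, C
r_ef = parallel(R["E"], R["F"])   # equivalent series element for E, F
r_system = series(R["A"], r_bc, R["D"], r_ef)

print(round(r_bc, 4), round(r_ef, 2), round(r_system, 4))
```

For a different layout, only the nesting of `series` and `parallel` calls changes.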
42Related Figures of Merit (FOM)
- Mean Time Between Maintenance: MTBM
- Scheduled
- Unscheduled
- Availability: A
- Probability that the system, when used under stated conditions in an ideal/actual operational environment, will operate satisfactorily.
- Wasson: RAM
- Reliability
- Availability
- Maintainability
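One common way to put a number on availability is the inherent availability, MTBF / (MTBF + MTTR). That formula is not stated on the slide; it is the usual textbook form, shown here as a sketch with hypothetical values.

```python
def inherent_availability(mtbf, mttr):
    """Fraction of time the system is up, counting only corrective repair.

    Assumption: steady state, no scheduled maintenance or logistics delay.
    """
    return mtbf / (mtbf + mttr)


# Hypothetical values: 500 hrs between failures, 5 hrs to repair.
a = inherent_availability(mtbf=500.0, mttr=5.0)
print(round(a, 4))  # 0.9901
```

Operational availability would use MTBM and total downtime instead, so it is normally lower than this figure.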
43Figure 12.11
- How to calculate MTBF, MTBM?
- MTBF: 58% failed?
- MTBM: 100% failed?
A common service shop finding: NTF, no trouble found
44Service Life Extension
45Reliability and System Life Cycles (Section 12.3)
- What reliability should the system have to accomplish its mission, over its life cycle, under the expected environment?
- Requirements that affect reliability:
- System performance factors,
- Mission profile,
- Use conditions, duty cycle, etc.
- Environment: temp, vibration, etc.
46Review of Key Concepts
- Ilities are System Wide Requirements.
- Specify Reliability as MTBF, MTBM, R(t), ...
- Flow down/allocate top level requirements to functional blocks (Fig 12.16, 12.17).
- We have functional architecture.
- We have series/parallel tools to do this.
47Reliability Flow Down
Series: add lambdas
Component MTBFs have to get larger than the system MTBF (see slide 33)
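Because failure rates add in series, a system-level MTBF requirement can be flowed down by splitting the system failure rate among the blocks. The sketch below uses a simple weighted split; the weighting scheme, the function name `allocate_mtbf`, and all numbers are hypothetical, not the method of the Boeing example.

```python
def allocate_mtbf(system_mtbf, weights):
    """Split the system failure-rate budget among series blocks.

    weights: relative share of the failure-rate budget per block
    (a larger weight means the block is allowed to fail more often).
    """
    lam_system = 1.0 / system_mtbf
    total_weight = sum(weights.values())
    return {
        name: 1.0 / (lam_system * w / total_weight)
        for name, w in weights.items()
    }


# Hypothetical: 1000-hr system MTBF, four series blocks with equal shares.
allocation = allocate_mtbf(1000.0, {"A": 1, "B": 1, "C": 1, "D": 1})
print(allocation)

# The allocated failure rates add back up to the system failure rate.
lam_check = sum(1.0 / m for m in allocation.values())
```

With four equal shares, each block needs an MTBF of about 4000 hours to support a 1000-hour system, which is why the required MTBFs grow at lower levels.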
48Boeing Flowdown Example
KPP: Key Performance Parameter
49Ways to Manage/Improve Reliability
- Failure Analysis
- Component Selection
- Pick standardized components.
- Evaluate prior to acceptance.
- Custom parts/testing take time and money.
- Part Derating
- Electrical part concept.
- Operate below rated conditions for longer life.
- Redundancy
50Redundant Subsystems
51Ways to Manage/Improve Reliability-2
- Redundancy
- Parallel paths: higher reliability.
- But penalties of weight, space, cost, etc.
- Must truly be independent systems.
- (Buede pg. 242, Sioux City plane crash)
- Genesis spacecraft
- Often cannot be applied, or only on a limited basis.
52UA232
53Ways to Manage/Improve Reliability-3
- Until now we have considered operating redundancy: all subsystems working.
- Standby redundancy: if A fails, switch to and operate B. B is not operating while A operates.
- Equation 12.27 for one standby.
- R(standby) > R(operating)
54Sample Problem - Standby
- One operating, one standby (identical)
- 200 hrs operating period, $\lambda = 0.002$ per hr.
- Calculate R (standby)
- Calculate R (operating) assuming both operating.
- Switch R = ? (100%?)
- Why Standby > Operating ??
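The two cases in this problem can be compared numerically. The sketch assumes identical units with constant failure rate and uses the standard result for one operating unit plus one standby, $R = e^{-\lambda t}(1 + R_{sw}\lambda t)$, which is presumably what the textbook equation expresses. The `switch_r` parameter (probability the switch works when called on) and the function names are illustrative.

```python
import math


def standby_reliability(lam, t, switch_r=1.0):
    """One operating unit plus one identical cold standby.

    switch_r = 1.0 models a perfect (100% reliable) switch.
    """
    return math.exp(-lam * t) * (1.0 + switch_r * lam * t)


def operating_reliability(lam, t):
    """Two identical units both operating in parallel."""
    r = math.exp(-lam * t)
    return 1.0 - (1.0 - r) ** 2


lam, t = 0.002, 200
r_standby = standby_reliability(lam, t)
r_operating = operating_reliability(lam, t)
r_standby_weak_switch = standby_reliability(lam, t, switch_r=0.9)

print(round(r_standby, 4), round(r_operating, 4))  # 0.9384 0.8913
```

Standby comes out ahead because the spare accumulates no operating hours until it is needed, whereas in operating redundancy both units age from time zero. An unreliable switch erodes that advantage.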
55Reliability Analysis Methods (Section 12.4)
- FMECA: failure mode, causes, effects, and criticality analysis.
- Identify failure modes early.
- Focus on high risk/problem items.
- Stress/strength analysis
- Operate at critical/maximum stress conditions.
- Identify weak, critical components.
56Cause and Effect Chart
- Graphical document of possible causes for an effect (problem, error, fault).
- Usually consider 5 Ms + E as main branches.
57FMEA Steps Review
- Team activity
- Select component, system, process step, etc.
- Identify possible failure modes.
- Identify causes of failure modes.
- Identify effects of failures.
- Estimate (1-10 ranking):
- Occurrence: how often (1 = not, 10 = often)
- Severity: how bad (1 = not, 10 = severe)
- Detection: how easy (1 = easy, 10 = difficult)
- Calculate RPN: risk priority number
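The RPN is conventionally the product of the three rankings (occurrence x severity x detection). A minimal sketch of scoring and ranking failure modes; the failure modes and their scores are hypothetical.

```python
def rpn(occurrence, severity, detection):
    """Risk priority number: product of the three 1-10 rankings."""
    for score in (occurrence, severity, detection):
        if not 1 <= score <= 10:
            raise ValueError("each ranking must be between 1 and 10")
    return occurrence * severity * detection


# Hypothetical failure modes: (occurrence, severity, detection)
failure_modes = {
    "seal leaks": (3, 9, 6),
    "connector corrodes": (5, 4, 3),
    "fastener loosens": (2, 6, 8),
}

# Highest RPN first: these are the items to address first.
ranked = sorted(failure_modes, key=lambda m: rpn(*failure_modes[m]), reverse=True)

for mode in ranked:
    print(mode, rpn(*failure_modes[mode]))
```

Note that a rare but severe and hard-to-detect mode can outrank a frequent but benign one.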
58Reliability Prediction
- Predict based on similar equipment: easy but inaccurate.
- Predict from Parts Count
- Predict from Life/Stress Analysis
59Example Parts Count
$\lambda = \sum_{i=1}^{n} N_i \, \lambda_i \, \pi_i$
where
n = number of part categories
$N_i$ = quantity of ith part
$\lambda_i$ = failure rate of ith part
$\pi_i$ = quality factor of ith part (handbook)
60MTBF = $1/\lambda$
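A parts-count prediction sums quantity x failure rate x quality factor over the part categories, then inverts the total to get MTBF. The sketch below assumes that form; the part list, failure rates, and quality factors are hypothetical placeholders, not handbook values.

```python
def parts_count_failure_rate(parts):
    """Sum of N_i * lambda_i * pi_i over all part categories.

    parts: iterable of (quantity, failure_rate_per_hr, quality_factor).
    """
    return sum(n * lam * pi_q for n, lam, pi_q in parts)


# Hypothetical part categories (NOT handbook data):
parts = [
    (10, 0.5e-6, 1.0),   # e.g. resistors
    (4, 2.0e-6, 3.0),    # e.g. commercial-grade ICs
    (1, 10.0e-6, 1.0),   # e.g. a connector
]

lam_total = parts_count_failure_rate(parts)
mtbf = 1.0 / lam_total

print(lam_total, round(mtbf))
```

In this made-up example the four lower-grade ICs dominate the total, which illustrates how the quality factor can drive the prediction.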
61Reliability Testing - 12.6
- Part of test and qualification.
- Assure that MTBF requirements are met.
- Testing
- Either accept, reject, or continue test (Fig. 12.30)
- Test under simulated mission profile (Fig 12.31)
Run some tests: how confident are we in the results?
62Sequential Test Plan
63Simulated Mission Profile
64Reliability Testing-2
- Establish criteria for accept, reject, and risks of false decisions.
- Equations 12.29, 12.30.
- Determine regions for accept, reject, continue, with defined acceptance risks.
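The accept/reject/continue logic can be illustrated with Wald's sequential probability ratio test for exponentially distributed times to failure. This is a generic sketch, not a reproduction of Equations 12.29 and 12.30 or of the MIL-STD-781 plans, which use their own tabulated boundaries and truncation rules. All parameter values and the name `sprt_decision` are hypothetical.

```python
import math


def sprt_decision(failures, test_hours, theta0, theta1, alpha, beta):
    """Wald SPRT for MTBF with exponential failures.

    theta0: acceptable MTBF, theta1: unacceptable (lower) MTBF.
    alpha:  producer's risk (rejecting good equipment).
    beta:   consumer's risk (accepting bad equipment).
    """
    # Log likelihood ratio of "MTBF = theta1" versus "MTBF = theta0".
    log_lr = (failures * math.log(theta0 / theta1)
              - (1.0 / theta1 - 1.0 / theta0) * test_hours)
    reject_bound = math.log((1.0 - beta) / alpha)
    accept_bound = math.log(beta / (1.0 - alpha))
    if log_lr >= reject_bound:
        return "reject"
    if log_lr <= accept_bound:
        return "accept"
    return "continue"


# Hypothetical plan: acceptable MTBF 400 hrs, unacceptable 200 hrs, 10% risks.
plan = dict(theta0=400.0, theta1=200.0, alpha=0.1, beta=0.1)

print(sprt_decision(0, 500, **plan))    # continue
print(sprt_decision(4, 100, **plan))    # reject
print(sprt_decision(5, 3200, **plan))   # accept
```

Each failure pushes the statistic toward the reject boundary, while accumulated failure-free hours push it toward acceptance, which produces the two sloped decision lines of a sequential test chart.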
65Example MIL-STD-781 Fig. 12.32
66Actual Test Conditions Fig. 12-33
- MTBF = 400 hrs
- Max test time = 4000 hrs
- Failures noted and fixed.
- Accept at 3200 hrs.
67Test Results