Safety, Reliability, and Robust Design - PowerPoint PPT Presentation

Description:

Safety, Reliability, and Robust Design in Embedded Systems

Slides: 51
Provided by: andrel1
Learn more at: https://eecs.ceas.uc.edu
Transcript and Presenter's Notes

1
  • Safety, Reliability, and Robust Design
  • in Embedded Systems

2
  • Risk analysis: managing uncertainty
  • GOAL: be prepared for whatever happens
  • Risk analysis should be done for ALL PHASES of a project:
  • --planning phase
  • --development phase
  • --the product itself
  • Identify risks
  • What could you have done during the planning stage to manage each of these risks?
  • How likely is it (what is the probability) that each one will occur?
  • How likely is it (what is the probability) that more than one will occur?
  • What actions will best manage the risk if it occurs?
3
  • risk management: identify, plan for risks
  • During planning, a Risk Table can be generated with the columns: Risks, Type, Probability, Impact, Plan (Pointer)
  • Example risks: System not available; Hardware failure; Color printer unavailable; Personnel absent (one meeting); Personnel unavailable (several meetings); Personnel have left project
  • Type: Performance (product won't meet requirements); Cost (budget overruns); Support (project can't be maintained as planned); Schedule (project will fall behind)
  • Probability: probability of this risk occurring
  • Impact: e.g., catastrophic, critical, marginal, negligible
4
  • risk management: identify, plan for risks
  • Then the table is sorted by probability and impact, and a cutoff line is defined. Everything above this line must be managed (with a management plan pointed to in the last column).
  • Useful reference: Embedded Syst. Prog., Nov. 00 (examples), http://www.embedded.com/2000/0011/0011feat1.htm
  • Additional interesting reference: H. Petroski, To Engineer is Human: The Role of Failure in Successful Design, Vintage, 1992.
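
The sort-and-cutoff step can be sketched in C. This is only an illustration under stated assumptions: the struct layout, the function names, and the choice to express the cutoff line as a minimum probability are all hypothetical, and any probabilities fed to it would be estimates made by the project team.

```c
#include <stddef.h>
#include <stdlib.h>

/* Impact levels from the risk table, ordered so a larger value is more severe. */
enum impact { NEGLIGIBLE = 0, MARGINAL, CRITICAL, CATASTROPHIC };

struct risk {
    const char *name;     /* e.g. "Hardware failure" */
    double probability;   /* estimated probability, 0.0 to 1.0 */
    enum impact impact;
    const char *plan;     /* pointer to the management plan, or NULL */
};

/* qsort comparator: highest probability first, ties broken by higher impact. */
static int by_probability_then_impact(const void *a, const void *b)
{
    const struct risk *x = a;
    const struct risk *y = b;
    if (x->probability != y->probability)
        return (x->probability < y->probability) ? 1 : -1;
    return (int)y->impact - (int)x->impact;
}

/* Sort the table in place and return the number of rows above the cutoff
 * line, assumed here to be "probability >= min_probability". */
size_t sort_and_cut(struct risk *table, size_t n, double min_probability)
{
    qsort(table, n, sizeof table[0], by_probability_then_impact);
    size_t managed = 0;
    while (managed < n && table[managed].probability >= min_probability)
        managed++;
    return managed;
}
```

Rows `table[0]` through `table[managed - 1]` are the ones that need a non-NULL `plan`.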
5
professional risk analysis is proactive, not
reactive
6
  • Important concepts for embedded systems
  • Risk = (Probability of failure) × Severity
  • Increased risk → decreased safety
  • Safety failures, possible causes: incorrect or incomplete specification; bad design; improper implementation; faulty component; improper use
  • RELIABILITY: what is the probability of failure?
7
  • Some ways to determine reliability:
  • --product performs consistently as expected
  • --MTBF (mean time between failures) is long
  • --system behavior is DETERMINISTIC
  • --system responds or FAILS GRACEFULLY to out-of-bounds or unexpected conditions and recovers if possible
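
The MTBF criterion can be turned into a number under one common modelling assumption that the slides do not state: a constant failure rate $\lambda = 1/\mathrm{MTBF}$. Under that assumption, the probability $R(t)$ that a unit runs for time $t$ without failing is sketched below. The MTBF of 10000 hours is an illustrative value, not a figure from the presentation.

```latex
R(t) = e^{-\lambda t} = e^{-t/\mathrm{MTBF}}
\qquad\text{e.g.}\qquad
R(1000\,\mathrm{h}) = e^{-1000/10000} = e^{-0.1} \approx 0.905
```

Under the same model, $R(\mathrm{MTBF}) = e^{-1} \approx 0.37$, so a long MTBF does not by itself mean that most units survive that long.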
8
  • Definitions
  • Fault: incorrect or unacceptable state or condition
  • Fault duration and frequency determine classification:
  • --transient: from an unexpected external condition ("soft")
  • --intermittent: unstable hardware or marginal design; periodic / aperiodic
  • --permanent: e.g., failed component ("hard")
  • Error: static, inherent characteristic of the system
  • Failure: dynamic, occurs at a specific time
  • Possible fault consequences:
  • --inappropriate action
  • --timing: event occurs too early or too late
  • --sequence of events incorrect
  • --quantity: wrong amount of energy or substance used
9
  • Achieving reliability
  • safe design
  • fault detection
  • fault management
  • fault tolerant: system recovers, fault not
    detected
  • e.g., packet transfers
  • Definition of reliability for embedded system
  • probability that a failure is detected by the
    user is less than a specified threshold

10
  • Examples: section 8.5, read these carefully!
  • Ariane 5 rocket: register overflow; 64-bit word assigned to 16-bit register in a reused subsystem
  • Mars Pathfinder mission, 1997: lower priority tasks were allowed to hog resources, higher priority tasks could not execute
  • 2004 Mars mission: file management problems
  • Many more examples in articles at embedded.com
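
The Ariane 5 item describes a wide value stored into a narrow register. A minimal C sketch of a guarded narrowing conversion follows. The function names are hypothetical and this is not the flight software; it only illustrates checking the range before the assignment and making the out-of-range policy explicit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Store a 64-bit value into a 16-bit destination only if it fits.
 * Returns false and leaves *out untouched when the value is out of range,
 * so the caller must decide what to do (flag a fault, use a fallback, ...). */
bool narrow_to_i16(int64_t wide, int16_t *out)
{
    if (wide < INT16_MIN || wide > INT16_MAX)
        return false;
    *out = (int16_t)wide;
    return true;
}

/* Alternative policy: clamp to the nearest representable value. */
int16_t saturate_to_i16(int64_t wide)
{
    if (wide > INT16_MAX)
        return INT16_MAX;
    if (wide < INT16_MIN)
        return INT16_MIN;
    return (int16_t)wide;
}
```

Whether to reject or to saturate depends on the system; either choice is a deliberate decision, unlike an unchecked assignment.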
11
  • How do we define safety?
  • One criterion: single-point failure: failure of a single component will not lead to an unsafe condition
  • common-mode failure: failure of multiple components due to a single failure event will not lead to an unsafe condition
  • Safety must be considered THROUGHOUT the project
12
fig_08_00
  • Embedded system design: project components
  • Development process (waterfall model)
  • Alternative process models
  • Need risk analysis AT EACH INCREMENT (A = analysis, D = design, I = implement, T = test, M = maintenance)
  • Basic waterfall model: A-->D-->I-->T-->M
  • Prototyping: A-->D-->I-->T-->M
  • Incremental: A-->D-->I-->T-->M-->A-->D-->I-->T--> ... -->M
  • Component based: A-->D-->Library-->Integrate-->T-->M
13
  • Specifications
  • Identify hazards
  • Calculate risk
  • Define safety measures
  • Specification document should include the safety
    standards and guidelines with which the system complies
  • e.g. Underwriters Laboratories, FCC, FDA, FAA,
    AEC, NASA, ISO, NHTSA, etc.
  • Some industry standards / procedures
  • FAA: DO-178B (and the newer DO-178C)
  • Medical device industry: ISO 14971
  • Nuclear power industry (and others): IEC 61508,
    "Functional Safety of Electrical/Electronic/Programmable
    Electronic Safety-related Systems (E/E/PE, or E/E/PES)"

14
Methods
  • --Process and Tool Chain evaluation (this is the
    main focus of DO-178B)
  • --Probability-based models
  • --Formal methods
  • --Traditional methods for code testing, e.g.,
    basis path testing
  • --Standard code-checking tools (e.g., avoiding
    inclusion of redundant code)

15
fig_08_01
Design and review process steps
16
fig_08_02
  • Coding
  • Trade-off: traditional efficiency (speed/space) vs. better
    reliability
  • Some examples:
  • Array declarations: const may not be required but
    is preferred, e.g.
  • const int size = 5; int myarray[size];
  • Make sure initialization is explicit; do not
    depend on the compiler, e.g.
  • int tot = 0; for (int j = 0; j < 10; j++) tot = tot + j;
  • Do not depend on lazy (short-circuit) evaluation, e.g.
  • if ((a != 0) && (b/a < 0)) → if (a != 0)
  • if (b/a < 0)
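
The three coding examples above can be put together in a compilable C sketch. The function names are hypothetical, and the array size is written as an enum constant on the assumption that plain C is the target: in C, unlike C++, a `const int` is not a compile-time constant.

```c
#include <stddef.h>

/* Named compile-time constant for the array size. */
enum { SIZE = 5 };

/* Explicitly initialized; nothing is left to compiler defaults. */
static int myarray[SIZE] = {0, 0, 0, 0, 0};

/* Number of elements in myarray, derived from the declaration itself. */
size_t myarray_length(void)
{
    return sizeof myarray / sizeof myarray[0];
}

/* Sum of 0 .. n-1 with the accumulator initialized explicitly. */
int sum_below(int n)
{
    int tot = 0;
    for (int j = 0; j < n; j++)
        tot = tot + j;
    return tot;
}

/* Returns 1 when b/a is negative. The division sits inside its own guard
 * instead of relying on the evaluation order of a combined condition. */
int quotient_is_negative(int a, int b)
{
    if (a != 0) {
        if (b / a < 0)
            return 1;
    }
    return 0;
}
```

The nested form costs one extra line, which is the kind of efficiency-for-reliability trade the slide describes.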

17
fig_08_02
Primitive C error handling: may not be
sufficient for an embedded system
Assert
18
fig_08_03
Example: good for debugging stage,
allows controlled crash. Not robust enough for
final code
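
The contents of fig_08_03 are not reproduced in this transcript, so the following C sketch is only an assumed illustration of the point being made: `assert` gives a controlled crash during debugging, while final code needs a path that reports the problem and lets the caller recover. The status codes and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum status { STATUS_OK = 0, STATUS_BAD_ARG };

/* Debug-stage version: assert() aborts with a diagnostic when the condition
 * is false, and compiles to nothing when NDEBUG is defined. */
int scale_debug(int raw, int divisor)
{
    assert(divisor != 0);
    return raw / divisor;
}

/* Final-code version: the check is always present and the caller receives
 * a status it can act on instead of a crash. */
enum status scale_checked(int raw, int divisor, int *out)
{
    if (out == NULL || divisor == 0)
        return STATUS_BAD_ARG;
    *out = raw / divisor;
    return STATUS_OK;
}
```

Note that with `NDEBUG` defined, `scale_debug` would divide by zero unchecked, which is one reason assertions alone are not enough for released code.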
19
fig_08_04
Jump statements: consequences may not be
acceptable
20
fig_08_05
Example. Better: high compiler
warning level, variable typing, e.g.
21
fig_08_06
Example system: Control, Memory, Data /
comm, Power / reset, Peripherals, Clock
22
fig_08_07
Basic method: redundancy (triple)
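
Triple redundancy is usually paired with a majority voter. A minimal C sketch of a bitwise two-out-of-three vote follows; the function names are hypothetical, and the figure may show a hardware voter instead of software.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bitwise majority vote: each output bit takes the value that at least
 * two of the three channels agree on. */
uint32_t vote3(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (a & c) | (b & c);
}

/* True when any channel differs from the voted result, so a faulty
 * channel can be logged or scheduled for repair even though it was outvoted. */
bool channels_disagree(uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t v = vote3(a, b, c);
    return a != v || b != v || c != v;
}
```

The voter itself remains a single point of failure, which is one motivation for the higher-redundancy arrangements on the following slides.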
23
fig_08_08
Higher redundancy
24
fig_08_09
Reduced capability in case of failure / error
25
fig_08_10
Alternative: monitor only
26
fig_08_11
Bussing: interconnection architectures
27
fig_08_12
Sequential: still can fail at one point
28
fig_08_13
Better: ring
29
fig_08_14
Even better: ring with redundancy
30
fig_08_15
Signal values (magnitude, duration): ignore, detect / warn, react
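
One way to read this slide is as a policy that maps an out-of-range excursion to a response according to its size and how long it lasts. The C sketch below assumes that reading. The thresholds, units, and names are invented placeholders, and the figure itself may draw the regions differently.

```c
/* Possible responses named on the slide. */
enum response { RESPONSE_IGNORE, RESPONSE_WARN, RESPONSE_REACT };

/* Classify an excursion by magnitude (arbitrary units beyond the nominal
 * range) and duration in milliseconds. All limits are illustrative. */
enum response classify_excursion(int magnitude, unsigned duration_ms)
{
    const int warn_magnitude = 10;      /* below this: noise */
    const int react_magnitude = 50;     /* at or above this: serious */
    const unsigned glitch_ms = 2;       /* shorter than this: glitch */
    const unsigned sustained_ms = 100;  /* at or above this: sustained */

    /* Small or very brief excursions are treated as noise. */
    if (magnitude < warn_magnitude || duration_ms < glitch_ms)
        return RESPONSE_IGNORE;
    /* Large or long-lasting excursions need corrective action. */
    if (magnitude >= react_magnitude || duration_ms >= sustained_ms)
        return RESPONSE_REACT;
    /* Everything in between is reported but not acted on. */
    return RESPONSE_WARN;
}
```

In this sketch a very large but very brief spike is ignored; a real system might treat that case differently.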
31
fig_08_16
Data errors: detect / correct. Example: errors
in 3 bits
32
fig_08_17
Error detection example
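
The figure for this slide is not available in the transcript. As a stand-in, here is a minimal C sketch of the simplest error-detection scheme, a single even-parity bit per byte. The 9-bit word layout and the function names are assumptions made for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns 1 if x has an odd number of set bits, otherwise 0. */
unsigned parity8(uint8_t x)
{
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 1u;
}

/* Build a 9-bit word: data in bits 0..7, even-parity bit in bit 8,
 * so the total number of set bits is always even. */
uint16_t add_even_parity(uint8_t data)
{
    return (uint16_t)(data | (parity8(data) << 8));
}

/* True when the received 9-bit word still has even overall parity. */
bool parity_ok(uint16_t word)
{
    unsigned data_parity = parity8((uint8_t)(word & 0xFFu));
    unsigned parity_bit = (word >> 8) & 1u;
    return (data_parity ^ parity_bit) == 0;
}
```

A single parity bit detects any odd number of flipped bits but misses an even number, and it cannot say which bit is wrong.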
33
fig_08_18
Hamming code (review)
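
As a refresher, a Hamming(7,4) code can be sketched in C as follows. The bit layout (parity bits at positions 1, 2, and 4; data bits at positions 3, 5, 6, and 7) is the textbook one and is assumed here; the figure may use a different code length or ordering, and the function names are hypothetical.

```c
#include <stdint.h>

/* Encode the low 4 bits of nibble into a 7-bit code word.
 * Bit (i - 1) of the result holds code position i. */
uint8_t hamming74_encode(uint8_t nibble)
{
    unsigned d1 = (nibble >> 0) & 1u;  /* position 3 */
    unsigned d2 = (nibble >> 1) & 1u;  /* position 5 */
    unsigned d3 = (nibble >> 2) & 1u;  /* position 6 */
    unsigned d4 = (nibble >> 3) & 1u;  /* position 7 */
    unsigned p1 = d1 ^ d2 ^ d4;        /* checks positions 1, 3, 5, 7 */
    unsigned p2 = d1 ^ d3 ^ d4;        /* checks positions 2, 3, 6, 7 */
    unsigned p4 = d2 ^ d3 ^ d4;        /* checks positions 4, 5, 6, 7 */
    return (uint8_t)(p1 | (p2 << 1) | (d1 << 2) | (p4 << 3) |
                     (d2 << 4) | (d3 << 5) | (d4 << 6));
}

/* Syndrome: XOR of the positions of all set bits. It is 0 for a valid
 * code word, and equals the position of the flipped bit after one error. */
unsigned hamming74_syndrome(uint8_t cw)
{
    unsigned s = 0;
    for (unsigned pos = 1; pos <= 7; pos++) {
        if ((cw >> (pos - 1)) & 1u)
            s ^= pos;
    }
    return s;
}

/* Correct at most one flipped bit, then extract the 4 data bits. */
uint8_t hamming74_decode(uint8_t cw)
{
    unsigned s = hamming74_syndrome(cw);
    if (s != 0)
        cw ^= (uint8_t)(1u << (s - 1));
    return (uint8_t)(((cw >> 2) & 1u) | (((cw >> 4) & 1u) << 1) |
                     (((cw >> 5) & 1u) << 2) | (((cw >> 6) & 1u) << 3));
}
```

This code corrects any single-bit error per code word; two flipped bits produce a nonzero syndrome that points at the wrong position, so the result is miscorrected.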
34
fig_08_22
Block codes. Example: lateral and longitudinal
parity
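
Lateral and longitudinal parity can be sketched in C by treating a block as up to eight bytes, with one parity bit per byte (lateral) and one parity bit per bit column (longitudinal). The block size, struct, and function names are assumptions for the illustration, not taken from the figure.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if x has an odd number of set bits, otherwise 0. */
static unsigned byte_parity(uint8_t x)
{
    x ^= x >> 4;
    x ^= x >> 2;
    x ^= x >> 1;
    return x & 1u;
}

struct block_check {
    uint8_t lateral;       /* bit i = parity of row i (at most 8 rows) */
    uint8_t longitudinal;  /* XOR of all rows = parity of each column */
};

/* Compute both parity sets over the first n rows (n is capped at 8). */
struct block_check block_parity(const uint8_t *rows, size_t n)
{
    struct block_check c = {0, 0};
    for (size_t i = 0; i < n && i < 8; i++) {
        c.lateral |= (uint8_t)(byte_parity(rows[i]) << i);
        c.longitudinal ^= rows[i];
    }
    return c;
}

/* Compare against the transmitted check values and repair a single-bit
 * error in place. Returns 0 if clean, 1 if one bit was corrected, and
 * -1 if the mismatch is not a single-bit pattern. */
int block_correct(uint8_t *rows, size_t n, struct block_check sent)
{
    struct block_check now = block_parity(rows, n);
    uint8_t row_diff = now.lateral ^ sent.lateral;
    uint8_t col_diff = now.longitudinal ^ sent.longitudinal;

    if (row_diff == 0 && col_diff == 0)
        return 0;
    /* Exactly one row and exactly one column must be flagged. */
    if (row_diff == 0 || col_diff == 0 ||
        (row_diff & (row_diff - 1)) != 0 || (col_diff & (col_diff - 1)) != 0)
        return -1;

    size_t r = 0;
    while (((row_diff >> r) & 1u) == 0)
        r++;
    rows[r] ^= col_diff;  /* flip the bit at the row/column intersection */
    return 1;
}
```

The failing row and failing column intersect at the bad bit, which is what lets this scheme correct a single error where one parity bit alone can only detect it.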
35
fig_08_23
36
fig_08_24
More complex codes: use the field Z2
37
fig_08_25
Shift register for encoding, decoding
38
fig_08_26
Checking data
39
fig_08_27
Syndrome calculator
40
fig_08_28
Encoding
41
fig_08_29
Some polynomials: must choose correct one
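
The shift-register encoding and syndrome check on the preceding slides can be sketched in software as a bit-at-a-time CRC over Z2. The polynomial used here, x^16 + x^12 + x^5 + 1 (0x1021, with initial value 0xFFFF), is one widely used choice and is an assumption; the slides do not say which polynomial the figures use. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC16_POLY 0x1021u  /* x^16 + x^12 + x^5 + 1, top term implicit */

/* Feed len bytes through a 16-bit shift register, most significant bit
 * first. Each step shifts left by one and, when the bit shifted out is 1,
 * XORs in the generator polynomial (subtraction in Z2). */
uint16_t crc16_update(uint16_t crc, const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u)
                crc = (uint16_t)((crc << 1) ^ CRC16_POLY);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}
```

To encode, the sender appends the 16-bit result to the message, high byte first. To check, the receiver runs the same routine over message plus check bytes; a result of zero plays the role of a zero syndrome, and any nonzero value signals an error.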
42
fig_08_30
Power system
43
fig_08_31
Redundancy and power monitoring
44
fig_08_32
Potential actions
45
fig_08_33
Using backups
46
fig_08_34
Backups: short-term fix
47
fig_08_35
Bus faults: buffering
48
fig_08_36
Bus testing
49
fig_08_37
Interface system monitoring and testing
50
table_08_00
Example: common fault analysis