Title: Design for Safety
Slide 1: Design for Safety
- Hazard Identification and Fault Tree Analysis
- Risk Assessment
- Define Safety Measures
- Create Safe Requirements
- Implement Safety (we will talk about software specifically here)
- Assure Safety Process
- Test, Test, Test, Test, Test
All of this happens in parallel, not just once per design
Slide 2: Hazard Identification
- Two Approaches
  - Hazard analysis: start from the hazard and work backwards
    - Ventilator example
      - Hypoventilation hazard -> no pressure in air reservoir -> reservoir vent stuck open (single failure)
      - Hyperventilation hazard -> pressure sensor failure -> overpressure valve stuck closed (double failure)
  - FMEA (Failure Modes and Effects Analysis): start from the failure and work forward
    - Fuel cell example
      - H2 sensor stuck normal -> failure to detect internal leak -> chassis vent blocked -> H2 concentration > 45 -> explosion hazard (double failure)
      - H2 sensor stuck as if H2 present -> system shutdown on H2 leak error code -> no hazard
- Single fault tolerance requires timing analysis too: is the first fault detected before it causes a hazard, and before a second fault can happen?
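The timing question in the last bullet can be written down as a simple check. The sketch below is only an illustration: the class name, the parameters, and all the numbers are assumptions, not values from the slides. It treats a system as single fault tolerant when detection plus reaction finishes before the hazard develops and before a second fault is expected.

```java
// Hypothetical timing check for single fault tolerance.
// All names and numbers here are illustrative assumptions.
class FaultTiming {
    /**
     * Returns true when the first fault is detected and handled before it can
     * cause a hazard and before a second fault is expected to occur.
     */
    static boolean singleFaultTolerant(long detectMs, long reactMs,
                                       long toleranceMs, long minTimeToSecondFaultMs) {
        long handledAtMs = detectMs + reactMs;
        return handledAtMs < toleranceMs && handledAtMs < minTimeToSecondFaultMs;
    }

    public static void main(String[] args) {
        // Assumed case: fault seen within 100 ms, shutdown takes 50 ms,
        // hazard develops after 500 ms, no second fault expected within an hour.
        System.out.println(singleFaultTolerant(100, 50, 500, 3_600_000L));
        // Same system, but the fault is only seen after 2 s: the hazard arrives first.
        System.out.println(singleFaultTolerant(2000, 50, 500, 3_600_000L));
    }
}
```

In a real analysis the detection time would come from the polling or self-test period, and the tolerance time from the physics of the hazard.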
Slide 3: FMEA: Working Forward
- Failure mode: how a device can fail
  - Battery: never a voltage spike, only low voltage
  - Valve: stuck open? Stuck closed?
  - Motor controller: stuck fast, stuck slow?
  - Hydrogen sensor: will the failure be latent, or will it mimic the presence of hydrogen?
- FMEA
  - For each mode of each device, perform hazard analysis as in the previous flow chart
  - Huge search space
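To get a feel for the "huge search space", the failure modes listed above can be enumerated mechanically. This is a minimal sketch, assuming the four devices and modes from the slide; the class and method names are made up for illustration. It lists every single-fault case and counts the double-fault cases that pair modes of two different devices.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical FMEA enumeration; device and mode names follow the slide.
class FmeaEnumeration {
    static Map<String, List<String>> failureModes() {
        Map<String, List<String>> modes = new LinkedHashMap<>();
        modes.put("Battery", Arrays.asList("low voltage"));
        modes.put("Valve", Arrays.asList("stuck open", "stuck closed"));
        modes.put("Motor controller", Arrays.asList("stuck fast", "stuck slow"));
        modes.put("Hydrogen sensor", Arrays.asList("latent (stuck normal)", "mimics hydrogen"));
        return modes;
    }

    /** One entry per (device, failure mode) pair; each needs its own forward analysis. */
    static List<String> singleFaults(Map<String, List<String>> modes) {
        List<String> cases = new ArrayList<>();
        for (Map.Entry<String, List<String>> device : modes.entrySet()) {
            for (String mode : device.getValue()) {
                cases.add(device.getKey() + ": " + mode);
            }
        }
        return cases;
    }

    /** Number of double-fault cases, pairing modes of two different devices. */
    static int doubleFaultCount(Map<String, List<String>> modes) {
        List<List<String>> perDevice = new ArrayList<>(modes.values());
        int total = 0;
        for (int i = 0; i < perDevice.size(); i++) {
            for (int j = i + 1; j < perDevice.size(); j++) {
                total += perDevice.get(i).size() * perDevice.get(j).size();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, List<String>> modes = failureModes();
        for (String c : singleFaults(modes)) {
            System.out.println(c);
        }
        System.out.println("single faults: " + singleFaults(modes).size()); // 7
        System.out.println("double faults: " + doubleFaultCount(modes));    // 18
    }
}
```

Even this toy system has 7 single-fault and 18 double-fault cases; a real design with hundreds of components grows far faster.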
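Slide 4: Fault Tree Analysis
- AND gates are good! (A hazard behind an AND gate needs more than one fault to occur.)
- [Figure: fault tree diagram, with one branch labeled "single fault hazard"]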
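A fault tree can be evaluated in a few lines of code, which makes the value of AND gates concrete. The sketch below is an assumption-laden illustration: `FaultNode`, `FaultTree`, and the tree layout are hypothetical, and the events are borrowed from the ventilator example. It reports which basic events reach the top hazard on their own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// A node is true when its event occurs for the given set of failed components.
interface FaultNode {
    boolean occurs(Set<String> failed);
}

// Hypothetical fault tree helper; the tree shape is an illustrative assumption.
class FaultTree {
    static FaultNode event(String name) {
        return failed -> failed.contains(name);
    }

    static FaultNode and(FaultNode... inputs) {
        return failed -> Arrays.stream(inputs).allMatch(n -> n.occurs(failed));
    }

    static FaultNode or(FaultNode... inputs) {
        return failed -> Arrays.stream(inputs).anyMatch(n -> n.occurs(failed));
    }

    /** Lists the basic events that trigger the top event all by themselves. */
    static List<String> singleFaultHazards(FaultNode top, List<String> basicEvents) {
        List<String> hazards = new ArrayList<>();
        for (String e : basicEvents) {
            if (top.occurs(Collections.singleton(e))) {
                hazards.add(e);
            }
        }
        return hazards;
    }

    /** Ventilator hazards: one single-failure branch, one AND-gated branch. */
    static FaultNode ventilatorTree() {
        FaultNode hypoventilation = event("reservoir vent stuck open");
        FaultNode hyperventilation = and(event("pressure sensor failure"),
                                         event("overpressure valve stuck closed"));
        return or(hypoventilation, hyperventilation);
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList("reservoir vent stuck open",
                "pressure sensor failure", "overpressure valve stuck closed");
        System.out.println(singleFaultHazards(ventilatorTree(), events));
    }
}
```

Only the branch without an AND gate shows up as a single fault hazard; the AND-gated branch needs both failures at once.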
Slide 5: 2. Risk Assessment
- Determine how risky your system is (TUV standard)
- Risk parameters
  - S, Extent of Damage: slight injury, serious injury, few deaths, catastrophe
  - E, Exposure Time: infrequent, continuous
  - G, Preventability: possible, impossible
  - W, Probability: low, medium, high
- Note: "few deaths" was "single death" and "several deaths" in the source (Hard Time)
- Toy oven: S2 E1 G2 W2, risk level < 2
Slide 6: Example Risk Assessment
Device                       | Hazard                             | Extent of Damage | Exposure Time | Hazard Prevention | Probability | TUV Risk Level
Microwave oven               | Irradiation                        | S2               | E2            | G2                | W3          | 5
Pacemaker                    | Pacing too slowly, pacing too fast | S2               | E2            | G2                | W3          | 5
Power station burner control | Explosion                          | S3               | E1            | --                | W3          | 6
Airliner                     | Crash                              | S4               | E2            | G2                | W2          | 8
Slide 7: 3. Define the Safety Measures
- Obviation: make it physically impossible (mechanical hookups, etc.).
- Education: educate users to prevent misuse or dangerous use.
- Alarming: inform the users/operators or higher-level automatic monitors of hazardous conditions.
- Interlocks: take steps to eliminate the hazard when conditions exist (shut off power, fuel supply, explode, etc.).
- Restrict access: high voltage sources should be in compartments that require tools to access, with proper labels.
- Labeling
- Consider
  - Tolerance time
  - Supervision of the system: constant, occasional, unattended. Airport people movers have to be designed to a much higher level of safety than attended trains, even if both have fully automated control.
Slide 8: 4. Create Safe Requirements Specifications
- Document the safety functionality
  - e.g. The system shall NOT pass more than 10 mA through the ECG lead.
  - Typically the use of NOT implies a much more general requirement about functionality in ALL CASES
- Create Safe Designs
  - Start with a safe architecture
  - Keep hazard/risk analysis up to date.
  - Search for common mode failures
  - Assign responsibility for safe design: hire a safety engineer.
  - Design systems that check for latent faults
  - Use safe design practices: this is very domain specific; we will talk about software
Slide 9: 5. Implement Safety: Safe Software
- Language Features
- Type and Range Safe Systems
- Exception Handling
- Re-use, Encapsulation
- Objects
- Operating Systems
- Protocols
- Testing
- Regression Testing
- Exception Testing (Fault Seeding)
- Nuts and Bolts
Slide 10: Language Features
- Type and Range Safe Systems: Pascal, Ada. Java?

    program WontCompile1;
    type
      MySubRange = 10 .. 20;
      Day = (Mo, Tu, We, Th, Fr, Sa, Su);
    var
      MyVar: MySubRange;
      MyDate: Day;
    begin
      MyVar := 9;   { will not compile: range error }
      MyDate := 0;  { will not compile: wrong type }
    end.

- True type safety also requires runtime checking.
  - a[j] := b: what must be checked here to guarantee type safety?
    - range/type of j, range/type of b
  - Overhead in time and code size. But safety may require this.
- Does type-safe mean safe?
  - If no, then what good is a type safe system?
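For the "Java?" question above: Java has no built-in subrange types, so the range check moves to run time, while an enum still gives the compile-time type check. The sketch below is one possible way to mirror the Pascal example; the class names are illustrative assumptions.

```java
// Sketch of a range-checked value in Java, which has no built-in subrange types.
final class MySubRange {
    static final int MIN = 10;
    static final int MAX = 20;
    private final int value;

    MySubRange(int value) {
        // Runtime range check: Java cannot reject new MySubRange(9) at compile time.
        if (value < MIN || value > MAX) {
            throw new IllegalArgumentException(
                    "value " + value + " outside " + MIN + ".." + MAX);
        }
        this.value = value;
    }

    int get() {
        return value;
    }
}

// An enum gives the compile-time type check: `Day d = 0;` does not compile.
enum Day { MO, TU, WE, TH, FR, SA, SU }

class RangeDemo {
    public static void main(String[] args) {
        MySubRange ok = new MySubRange(15);
        Day today = Day.MO;
        System.out.println(ok.get() + " " + today);
        try {
            new MySubRange(9);
        } catch (IllegalArgumentException ex) {
            System.out.println("rejected at runtime: " + ex.getMessage());
        }
    }
}
```

The cost is the overhead noted on the slide: every construction pays for the check, and the error appears at run time instead of compile time.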
Slide 11: Guidelines
- Make it right before you make it fast
- Verify during program execution
  - Pre-condition invariants
    - Things that must be true before you attempt to perform an operation.
  - Post-condition invariants
    - Things that must be true after an operation is performed
  - e.g.

        while (item != null) {
            process(item);
            item = item->next;
        }
        assert(item == tail); // post-condition invariant

- Exception handling
  - What should happen in the event of an exception (assert(false))?
  - Who should be responsible for this check?
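As a rough illustration of verifying during execution, the sketch below checks a pre-condition before a traversal and a post-condition after it. The method and its data are hypothetical. It uses explicit exceptions rather than Java's `assert` statement, because `assert` is disabled unless the JVM is started with `-ea`.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical example of run-time pre- and post-condition checks.
class Invariants {
    /** Sums sensor samples, verifying invariants before and after the loop. */
    static int sumSamples(List<Integer> samples) {
        // Pre-condition: must hold before the operation is attempted.
        if (samples == null) {
            throw new IllegalArgumentException("samples must not be null");
        }
        int processed = 0;
        int sum = 0;
        Iterator<Integer> it = samples.iterator();
        while (it.hasNext()) {
            sum += it.next();
            processed++;
        }
        // Post-condition: every item was visited exactly once.
        if (processed != samples.size()) {
            throw new IllegalStateException("visited " + processed
                    + " of " + samples.size() + " samples");
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumSamples(Arrays.asList(1, 2, 3)));
    }
}
```

Who catches these exceptions, and what the system does next, is exactly the responsibility question raised on the slide.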
Slide 12: Exception Handling
- It's NOT okay to just let the system crash if some operation fails! You must, at least, get into safe mode.
- Standard C: it is up to the app writer to perform error checking on the values returned by f1 and f2. Easily put off, or ignored. Can't distinguish error handling from normal flow; no guarantee that all errors are handled gracefully.

    a = f1(b, c);
    if (a) switch (a) {
        case 1: /* handle exception 1 */ break;
        case 2: /* handle exception 2 */ break;
    }
    d = f2(e, f);
    if (d) switch (d) {
        case 1: /* handle exception 1 */ break;
        case 2: /* handle exception 2 */ break;
    }
Slide 13: Exception Handling in Java

    void myMethod() throws FatalException {
        try { // normal functional flow
            a = x.f1(b); // a is return value, b is parameter
            d = x.f2(e); // d is return value, e is parameter
        } catch (IOException ex) {
            // recover and continue
        } catch (ArrayIndexOutOfBoundsException ex) {
            // not recoverable
            throw new FatalException("I'm Dead");
        } finally {
            // finish up and exit
        }
    }

- Exceptions that are thrown, or not handled, will terminate the current procedure and raise the exception to the caller, and so on. Exceptions are subclassed so that you can have very general or very specific exception handlers. No errors go unhandled.
- Separates: throwing exceptions, functional code, exception handling
Slide 14: Safety of Object Oriented SW
- Strongly typed at compile time
- Run time checking is not native, but can be built into class libraries for extensive modularization and re-use. The class author can force the app to deal with exceptions by throwing them!

    class embeddedList extends embeddedObject {
        public void add(embeddedObject item) throws tooBigException {
            if (this.len() > this.max())
                throw new tooBigException("List size too big");
            else
                addItem2List(item);
        }
    }

- If you call embeddedList.add() you have three choices:
  - Catch the exception and handle it.
  - Catch the exception and map it into one of your exceptions by throwing an exception of a type declared in your own throws clause.
  - Declare the exception in your throws clause and let the exception pass through your method (although you might have a finally clause that cleans up first).
- Compiler will make you aware of any exceptions you forgot to consider!
- When to use exceptions and when to use status codes or other means?
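The three choices for a caller can be shown side by side. In this sketch `BoundedList`, `TooBigException`, `AppException`, and `Callers` are hypothetical stand-ins for the slide's classes, written only to make the example self-contained; the capacity check is an assumption about how such a list might behave.

```java
// Hypothetical stand-ins for the slide's embeddedList and tooBigException.
class TooBigException extends Exception {
    TooBigException(String msg) { super(msg); }
}

// An application-level exception used for choice 2 (mapping).
class AppException extends Exception {
    AppException(String msg, Throwable cause) { super(msg, cause); }
}

class BoundedList {
    private final Object[] items;
    private int len = 0;

    BoundedList(int max) { items = new Object[max]; }

    void add(Object item) throws TooBigException {
        if (len >= items.length) throw new TooBigException("List size too big");
        items[len++] = item;
    }

    int len() { return len; }
}

class Callers {
    // Choice 1: catch the exception and handle it here.
    static boolean tryAdd(BoundedList list, Object item) {
        try {
            list.add(item);
            return true;
        } catch (TooBigException ex) {
            return false; // handled: the item is dropped
        }
    }

    // Choice 2: catch it and map it to one of our own exception types.
    static void addOrFail(BoundedList list, Object item) throws AppException {
        try {
            list.add(item);
        } catch (TooBigException ex) {
            throw new AppException("cannot queue item", ex);
        }
    }

    // Choice 3: declare it and let it pass through, cleaning up in finally.
    static void addAndLog(BoundedList list, Object item, StringBuilder log)
            throws TooBigException {
        try {
            list.add(item);
        } finally {
            log.append("add attempted;");
        }
    }

    public static void main(String[] args) throws Exception {
        BoundedList list = new BoundedList(1);
        System.out.println(tryAdd(list, "a"));
        System.out.println(tryAdd(list, "b"));
    }
}
```

Removing the `throws` clause or the `catch` block from any of the three methods makes the compiler reject the code, which is the compile-time reminder the slide refers to.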
Slide 15: More Language Features
- Garbage collection
  - What is this for?
  - Is it good or bad for embedded systems?
- Inheritance
  - Means that type safe systems can still have functions that operate on generic objects.
  - Means that we can re-use commonalities between objects.
- Encapsulation
  - Means that the creator of the data structure also gets to define how the data structure is accessed and used, and when it is used improperly.
  - Means that the data structure can change without changing the users of the data structure (is the queue an array or a linked list? Who cares!)
- Re-use
  - Use trusted systems that have been thoroughly tested
    - OS
    - Networking
    - etc.
- We have a project group looking into pros/cons of embedded Java
Slide 16: 6. Testing
- Unit test (white box)
  - Requires knowledge of the detailed implementation of a single sub-system.
  - Test local functionality
    - Control algorithms
    - Boundary conditions and fault response
- Integration test (gray box)
  - Distributed processor systems with ongoing communications
  - Subsystems are already unit tested
  - Primarily for interfaces and component interaction
  - Fault seeding includes breaking the bus, disabling a subsystem, EMI exposure, power supply fluctuation, etc.
  - Embedded systems require physical test environments
- Validation testing
  - Complete system
  - Environmental chamber
  - More fault seeding, bad user, etc.
- Fault Seeding and Regression Testing!!!
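In software, fault seeding can be approximated by wrapping a component so that a test can switch a fault on. The sketch below is a hypothetical example loosely based on the fuel cell H2 sensor: the interface, the wrapper, the controller, and the threshold value are all assumptions made for illustration. The test seeds a broken bus and checks that the controller ends up in safe mode.

```java
// Hypothetical sensor interface; a failed read is reported as an exception.
interface H2Sensor {
    double readPercent() throws SensorFault;
}

class SensorFault extends Exception {
    SensorFault(String msg) { super(msg); }
}

// Wrapper used only in tests: lets the test seed a fault into a healthy sensor.
class SeededSensor implements H2Sensor {
    private final H2Sensor real;
    boolean busBroken = false; // seeded fault: simulate a broken bus

    SeededSensor(H2Sensor real) { this.real = real; }

    public double readPercent() throws SensorFault {
        if (busBroken) throw new SensorFault("no response from sensor");
        return real.readPercent();
    }
}

class Controller {
    static final double LIMIT_PERCENT = 1.0; // assumed shutdown threshold
    boolean safeMode = false;

    void step(H2Sensor sensor) {
        try {
            if (sensor.readPercent() > LIMIT_PERCENT) safeMode = true;
        } catch (SensorFault ex) {
            safeMode = true; // a failed read must not be treated as "no hydrogen"
        }
    }
}

class FaultSeedingDemo {
    public static void main(String[] args) {
        SeededSensor sensor = new SeededSensor(() -> 0.0); // healthy reading
        Controller controller = new Controller();
        controller.step(sensor);
        System.out.println("safe mode before fault: " + controller.safeMode);
        sensor.busBroken = true; // seed the fault
        controller.step(sensor);
        System.out.println("safe mode after fault: " + controller.safeMode);
    }
}
```

Keeping such seeded-fault cases in the regression suite means every later change is re-checked against the same faults. It does not replace the physical fault seeding listed above (EMI, power supply fluctuation).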
Slide 17: 7. Safe Design Process
- Mainly, the hazard/risk/FMEA analysis is a process, not an event!
- How you do things is as important as what you do.
- Standards for specification, documentation, design, review, and test
  - ISO 9000 defines quality process: one quality level is stable and predictable.
- There are many processes, but the good ones include release/test early and often! Incremental analysis, development, and testing.