Title: Embedded Fault Tolerant Distributed Schedule Synthesis
1Embedded Fault Tolerant Distributed Schedule Synthesis
- Recovery Oriented Computing Seminar
- Claudio Pinello (pinello_at_eecs.berkeley.edu)
- Sam Williams (samw_at_cs.berkeley.edu)
- Professor Dave Patterson
- Professor Armando Fox
2Overview / Motivation
- Generate Distributed Schedules that are
- Applied to embedded real-time systems
- Fault tolerant
- Support degraded functionality/accuracy
- Sensor replication and determinism
- Actuator replication and determinism
- Processors and wires are far cheaper than tubes, bars, etc. -> more and more functionality is implemented in electronics
- Allows for reduced hardware requirements (fewer/slower) than simple replication of boards
- Reduced accuracy allows for graceful degradation and reduced hardware requirements
- Example is BMW drive-by-wire system
3Model Extensions
[Figure: Task Graph. Nodes: Brake Pedal1, Brake Pedal2, Brake Pedalpos, ABS, Stability, Brake Force, Arbiter, Brake Pad1, Brake Pad2. Stage labels: sensors, Inputs, dataflow, Arbitration, Outputs, actuators.]
- Dataflow model
- Requires all inputs to fire -> deterministic behavior
- Does not account for sensor failure/optional operations
- Input operations
- Need only a subset of inputs (sensor operations) to fire -> non-determinism
- Arbitration operations
- Similar to Input, but for optional operations
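The three firing rules above can be sketched in a few lines of Python. This is only an illustration: the function names, the use of None for a missing value, and the choice of which input is optional are assumptions, not part of the original model.

```python
from typing import Dict, Optional, Set

Inputs = Dict[str, Optional[float]]  # None marks a value that never arrived


def dataflow_fires(inputs: Inputs) -> bool:
    """Plain dataflow operation: fires only if every input is present."""
    return all(v is not None for v in inputs.values())


def input_fires(inputs: Inputs, min_needed: int) -> bool:
    """Input operation: fires once at least `min_needed` sensor values arrived."""
    present = sum(v is not None for v in inputs.values())
    return present >= min_needed


def arbiter_fires(inputs: Inputs, optional: Set[str]) -> bool:
    """Arbitration operation: a missing input is tolerated only if optional."""
    return all(v is not None or name in optional for name, v in inputs.items())


# Hypothetical readings: the second pedal sensor has failed.
pedals = {"Brake Pedal1": 0.42, "Brake Pedal2": None}
print(dataflow_fires(pedals))   # the strict dataflow rule blocks
print(input_fires(pedals, 1))   # the input rule still fires with one sensor

# Hypothetical function results: the optional function produced nothing.
results = {"Function1": 1.0, "Function2": None}
print(arbiter_fires(results, optional={"Function2"}))
```

The non-determinism mentioned above shows up here as a dependence on which subset of inputs happened to arrive; replicas that see different subsets may compute different values.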
4Generating Schedules and Merging into FTS
- Full architecture
- Schedule all functionalities
- For each failure pattern
- Mark the faulty architecture components (critical functionalities cannot run there)
- Schedule all functionalities
- Merge the schedules
- Care must be taken to deal with multiple routings; the merged schedule is clearly overly redundant
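A minimal sketch of the loop above, assuming a toy scheduler that simply places each task on the least-loaded healthy ECU. The names failure_patterns, schedule and synthesize are hypothetical, and the real synthesis also has to handle timing, routing and sensor/actuator placement, which this ignores.

```python
from itertools import combinations


def failure_patterns(ecus, max_faults):
    """Yield every set of up to `max_faults` failed ECUs (never all of them)."""
    for k in range(1, min(max_faults, len(ecus) - 1) + 1):
        for faulty in combinations(ecus, k):
            yield frozenset(faulty)


def schedule(tasks, ecus, faulty=frozenset()):
    """Toy scheduler (assumption): put each task on the least-loaded healthy ECU."""
    healthy = [e for e in ecus if e not in faulty]
    load = {e: 0 for e in healthy}
    placement = set()
    for task in tasks:
        ecu = min(healthy, key=lambda e: (load[e], e))
        load[ecu] += 1
        placement.add((task, ecu))
    return placement


def synthesize(tasks, ecus, max_faults=1):
    """Schedule the full architecture, then every failure pattern, and merge."""
    merged = schedule(tasks, ecus)
    for faulty in failure_patterns(ecus, max_faults):
        merged |= schedule(tasks, ecus, faulty)
    return merged


if __name__ == "__main__":
    for task, ecu in sorted(synthesize(["Function1", "Function2"], ["ECU0", "ECU1"])):
        print(ecu, task)
```

In this toy case the merged result places every task on every ECU, which illustrates why the plain merge is overly redundant.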
[Figure: example schedule on two ECUs. ECU0: Sensor1, Input receiver (requires 1), Function1 (required), Function2 (optional), Arbiter, Output driver (requires 1), Actuator1. ECU1: Sensor2, Input receiver (requires 1), Function1 (required), Arbiter, Output driver (requires 1), Actuator2.]
5Heuristics Limit CPU and BUS Load
Heuristics 1
- Full architecture
- Schedule all functionalities
- For each failure pattern
- Mark the faulty architecture components
- Re-schedule only critical functionalities (constrain non-critical as in full architecture)
- Merge the schedules
Heuristics 2
- For each data dependency
- If result available from memory
- Add memory arc
- Remove bus arc
- For each data dependency
- If no outgoing arcs
- Remove data dependency
- Repeat until stable
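The memory-arc and pruning steps above could look roughly like this. The representation is an assumption made for illustration: a task instance is a (task, ECU) pair, an arc between two instances is a bus arc when the ECUs differ and a memory arc when they match, and "no outgoing arcs" is read as "an instance whose result nobody consumes". The sketch does not check that the pruned schedule is still fault tolerant.

```python
def prefer_memory(instances, arcs):
    """Swap a bus arc for a memory arc when the consumer's ECU already
    hosts a replica of the producer task."""
    result = set()
    for (src_task, src_ecu), (dst_task, dst_ecu) in arcs:
        if src_ecu != dst_ecu and (src_task, dst_ecu) in instances:
            src_ecu = dst_ecu  # read the local replica's result instead
        result.add(((src_task, src_ecu), (dst_task, dst_ecu)))
    return result


def prune_dead(instances, arcs, sinks):
    """Drop instances with no outgoing arcs (except sinks such as actuators)
    together with their incoming arcs; repeat until nothing changes."""
    instances, arcs = set(instances), set(arcs)
    while True:
        has_out = {src for src, _ in arcs}
        dead = {i for i in instances if i not in has_out and i not in sinks}
        if not dead:
            return instances, arcs
        instances -= dead
        arcs = {(s, d) for s, d in arcs if d not in dead}


if __name__ == "__main__":
    f1_0, f1_1 = ("Function1", "ECU0"), ("Function1", "ECU1")
    arb_0, act_0 = ("Arbiter", "ECU0"), ("Actuator1", "ECU0")
    instances = {f1_0, f1_1, arb_0, act_0}
    arcs = {(f1_0, arb_0), (f1_1, arb_0), (arb_0, act_0)}
    arcs = prefer_memory(instances, arcs)
    instances, arcs = prune_dead(instances, arcs, sinks={act_0})
    print(sorted(instances))
    print(sorted(arcs))
```

In the demo, the bus arc from the ECU1 replica of Function1 is redirected to the local ECU0 replica, which leaves the ECU1 replica without consumers, so the pruning pass removes it.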
6Functional Verification
- Schedule verification is separable from verifying the correctness of the algorithms, and is performed offline (not time critical)
- Apply equivalence checking methods to the distributed schedule
- Create Task DAGs and System Architecture Model
- Generate failure patterns based on combinations of device failures
- For each failure pattern
- Generate FT DAGs (one for each actuator on each processor)
- Compare the FT DAG for each actuator for each output driver to the corresponding Task DAG (satisfied by the minimum needed by the driver)
- Output from input drivers is valid iff the minimum number of inputs is present
- Arbiter produces output iff every input is present or can be ignored due to criticality
- Takes milliseconds to run small cases; a few minutes for large schedules
- Allows for failures at task granularity (not just iteration), but requires significantly more time to verify (any processor can fail at any point)
- Tool was written in Perl (performance was sufficient)
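A simplified sketch of the per-failure-pattern check, written in Python rather than Perl. It is not the equivalence check itself: it only propagates validity through a replicated schedule using the two "iff" rules above and reports the failure patterns under which no output driver remains valid. The SCHEDULE wiring, the instance names and the function names are all assumptions made for the example.

```python
from itertools import combinations

# name -> (ecu, kind, inputs, param), listed in topological order.
#   "source":  sensor, always valid while its ECU is up
#   "atleast": input receiver / output driver, needs `param` valid inputs
#   "all":     plain function, needs every input
#   "arbiter": needs every input except those named in `param` (optional)
SCHEDULE = {
    "ECU0.Sensor1":   ("ECU0", "source", [], None),
    "ECU1.Sensor2":   ("ECU1", "source", [], None),
    "ECU0.Input":     ("ECU0", "atleast", ["ECU0.Sensor1", "ECU1.Sensor2"], 1),
    "ECU1.Input":     ("ECU1", "atleast", ["ECU0.Sensor1", "ECU1.Sensor2"], 1),
    "ECU0.Function1": ("ECU0", "all", ["ECU0.Input"], None),
    "ECU1.Function1": ("ECU1", "all", ["ECU1.Input"], None),
    "ECU0.Function2": ("ECU0", "all", ["ECU0.Input"], None),
    "ECU0.Arbiter":   ("ECU0", "arbiter",
                       ["ECU0.Function1", "ECU0.Function2"], {"ECU0.Function2"}),
    "ECU1.Arbiter":   ("ECU1", "arbiter",
                       ["ECU1.Function1", "ECU0.Function2"], {"ECU0.Function2"}),
    "ECU0.Output":    ("ECU0", "atleast", ["ECU0.Arbiter", "ECU1.Arbiter"], 1),
    "ECU1.Output":    ("ECU1", "atleast", ["ECU0.Arbiter", "ECU1.Arbiter"], 1),
}


def valid_instances(schedule, faulty):
    """Return the instances that still produce output when `faulty` ECUs are down."""
    ok = set()
    for name, (ecu, kind, inputs, param) in schedule.items():
        if ecu in faulty:
            continue
        present = sum(i in ok for i in inputs)
        if kind == "source":
            fires = True
        elif kind == "atleast":
            fires = present >= param
        elif kind == "arbiter":
            fires = all(i in ok or i in param for i in inputs)
        else:  # "all"
            fires = present == len(inputs)
        if fires:
            ok.add(name)
    return ok


def verify(schedule, ecus, outputs, max_faults=1):
    """List the failure patterns under which no output driver stays valid."""
    bad = []
    for k in range(max_faults + 1):
        for faulty in combinations(ecus, k):
            ok = valid_instances(schedule, set(faulty))
            if not any(o in ok for o in outputs):
                bad.append(faulty)
    return bad


if __name__ == "__main__":
    print(verify(SCHEDULE, ["ECU0", "ECU1"], ["ECU0.Output", "ECU1.Output"]))
```

Raising max_faults to 2 makes the pattern with both ECUs down show up as unsatisfied, as expected for a two-ECU system.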
7Functional Verification (example - continued)
[Figure: Task Graph Actuator1 (Sensor1, Sensor2, Input receiver (requires 1), Function1 (required), Function2 (optional), Arbiter, Output driver (requires 1), Actuator1); Task Graph Actuator2 (same nodes, ending in Actuator2); replicated schedule on ECU0 (Sensor1, Input receiver (requires 1), Function1 (required), Function2 (optional), Arbiter, Output driver (requires 1), Actuator1) and ECU1 (Sensor2, Input receiver (requires 1), Function1 (required), Arbiter, Output driver (requires 1), Actuator2).]
- For the full functionality case, the arbiter must include both functions.
- The output function only requires that one of the actuators be visible.
- In the other graphs (which include failures), the arbiter only needs the single required input (Function1).
8Conclusions
- Results
- Off-line synthesis of fault tolerant schedules
- Off-line functional verification
- Extended dataflow model (failures, multiple levels of criticality)
- Heuristics (1,2) to limit CPU and BUS load
- Replica determinism analysis
- Future Work
- Tradeoff heuristics (1,2) vs 3 (passive replicas)
- Timing verification
- Full DBW example (Usability issues)
9DBW example
- System Architecture (6 µP, 3 channels) and Dataflow model (sensors -> functions -> actuators)
- Example of scheduling without fault tolerance (much more replication with fault tolerance)