Ardea: A Reconfigurable Architecture for Fault Tolerant Distributed Embedded Systems


Title: Ardea: A Reconfigurable Architecture for Fault Tolerant Distributed Embedded Systems


1
Ardea: A Reconfigurable Architecture for Fault
Tolerant Distributed Embedded Systems
  • Osamah A. Rawashdeh
  • Ph.D. Defense
  • Department of Electrical and Computer Engineering
  • University of Kentucky
  • Lexington, KY
  • November 29, 2005

2
Outline
  • Motivation and Background
  • Objective and Contributions
  • Ardea Framework Overview
  • Ardea Hardware Architecture
  • Software Module Dependency Graphs
  • Ardea Fault Tolerance
  • Runtime Behavior
  • Related Work
  • Example Implementation
  • Future Work
  • Conclusion

3
Motivation
  • Majority of Processors are in Embedded Systems
  • Systems are becoming more complex and perform
    more critical operations
  • Correct operation must be ensured for the safety
    of the public and the environment
  • Faults will always occur, so systems must be
    designed to be dependable

4
Dependable Systems
  • Dependability: trustworthiness of a system
    allowing reliance to be justifiably placed on
    its services
  • Failures: deviation of the service provided from
    compliance with specifications
  • Faults: the cause of failures
  • Failure semantics: omission, timing, response,
    and crash
  • Hardware versus software faults
  • Fault tolerance: ability to continue operation
    despite failures

Figure 1 - Page 6
5
Traditional Fault Tolerance
  • Fault tolerance entails fault detection and
    subsequent handling
  • Fault tolerance requires redundancy
  • Static redundancy (spatial redundancy)
  • Modular redundancy
  • Design Diversity
  • Dynamic redundancy (temporal redundancy)
  • Recovery blocks
  • Failover programming

6
BIG BLUE Fault Tolerance
  • BIG BLUE I (May 2003)
  • Single processor design
  • Static sensor redundancies
  • Ad hoc fault tolerance
  • BIG BLUE II (May 2004)
  • Distributed 3 processor design
  • I2C communication bus
  • Shared data memory for communication
  • Static redundancies
  • BIG BLUE III (May 2005)
  • Single processor design
  • Real-time multi-tasking OS used
  • Task interdependencies limited by using a
    mailman

7
Reconfiguration Based FT
  • Run-time reconfiguration FT is feasible in
    distributed embedded systems
  • Cost, size, power constraints
  • Availability of non-critical resources
  • Graceful degradation: a loss of or reduction in
    the quality of services a system provides in
    response to faults
  • Graceful degradation for distributed embedded
    systems is a new research area

8
The Challenge
  • How to specify a dynamically reconfiguring system
    that includes static and dynamic redundancies
  • How to manage the redundancies
  • What infrastructure is needed to run these
    dynamic applications

9
Objective and Contributions
  • Ardea is a systematic framework for designing and
    implementing real-time, dynamically reconfiguring,
    fault tolerant distributed embedded systems
  • Contributions
  • A graphical software specification technique for
    real-time applications that supports static
    redundancies as well as reconfiguration fault
    tolerance
  • An infrastructure supporting run-time application
    reconfiguration

10
The Ardea Framework
  • Ardea: Automatically Reconfigurable Distributed
    Embedded Architectures
  • Ardea herodias: the Great Blue Heron, a wading
    bird of the heron family Ardeidae, common all
    over North and Central America. This is the
    largest North American heron.

11
Ardea Overview
  • Software is developed in a modular fashion
  • Mobile software modules can have several
    implementations with different resource
    requirements and output qualities
  • Dependencies among modules are graphically
    captured in software module dependency graphs
    (DGs) specifying application operating modes and
    execution parameters
  • A set of networked processors for running
    application software

12
Ardea Overview cont.
  • A global system manager tracks status of hardware
    and software resources
  • System manager computes new system configurations
    (a mapping of software modules onto processing
    elements)
  • Local management tasks are responsible for OS
    scheduling and data routing
  • Target applications: real-time distributed
    embedded control/periodic applications

13
Outline
  • Motivation and Background
  • Objective and Contributions
  • Framework Overview
  • Ardea Hardware Architecture
  • Software Module Dependency Graphs
  • Ardea Fault Tolerance
  • Runtime Behavior
  • Related Work
  • Example Implementation
  • Future Work
  • Conclusion

14
HW Architecture Overview
  • Processing Elements (PEs)
  • - Homogeneous set of processors
  • - Real-time OS.
  • - Local management tasks (scheduler,
    network interface, loader)
  • I/O Devices
  • - Sensors and actuators
  • - Hosted by PEs
  • Communication Network
  • - Broadcast Network
  • - Bandwidth and Latency
  • System Manager
  • - Fault tolerant by other means
  • - Tracks status of resources
  • - Finds and deploys configurations

Figure 14 - Page 34
15
Application Software Specification
  • Dependency graphs show the periodic flow of
    information from sensors to actuators (i.e., data
    pipelines)
  • Graph nodes: software modules, data variables,
    I/O devices, and dependency gates
  • Software modules
  • Executable code schedulable on a processing
    element
  • Suspended while input(s) unavailable
  • Produce and consume data variables
  • Attributes: worst case execution time and rate
    factor

Figure 3 - Page 18
16
Data Exchange
  • Data variables
  • Application data exchanged between software
    modules
  • State data variables are local to a software
    module
  • Management data variables contain data consumed
    by the system manager
  • Attributes
  • Size
  • Quality value or function

Figure 5 - Page 19
17
Specifying Dependencies
  • Dependency gates
  • k-out-of-n OR gates: n > 0, 0 ≤ k ≤ n
  • AND: all inputs required
  • XOR: only one input required
  • DEMUX: for fanning out
  • OR gates can be specified to distribute inputs
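
The gate semantics listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Ardea implementation: the function names (`k_out_of_n`, `and_gate`, `xor_select`) are hypothetical, the bounds n > 0 and 0 ≤ k ≤ n are assumed, and the XOR gate is assumed to pick the first available input.

```python
from typing import Optional, Sequence


def k_out_of_n(available: Sequence[bool], k: int) -> bool:
    """k-out-of-n OR gate: satisfied when at least k of the n inputs are available."""
    n = len(available)
    if n == 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    return sum(available) >= k


def and_gate(available: Sequence[bool]) -> bool:
    """AND gate: every input is required (the n-out-of-n special case)."""
    return k_out_of_n(available, len(available))


def xor_select(available: Sequence[bool]) -> Optional[int]:
    """XOR gate: only one input is needed.

    Assumption: the first available input is the one used. Returns its
    index, or None when no input is available.
    """
    for index, ok in enumerate(available):
        if ok:
            return index
    return None
```

For example, with three redundant sensors of which only the first works, `k_out_of_n([True, False, False], 1)` is satisfied while a 2-out-of-3 gate on the same inputs is not.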

Figure 5 - Page 20
18
DG with Node Attributes
  • ID: yaw_cntrl1; Exec_T: 900 cycles; Rate_factor: 15
  • ID: out1, out2; Criticality: critical; Priority: 1; Rate: 10 Hz; State: Enabled
  • ID: rud_Angle1; Size: 2 bytes; Quality: 1
  • ID: mag_drv1, mag_drv2; Exec_T: 300 cycles; Rate_factor: n/a
  • ID: yaw_history; Size: 8 bytes; Quality: n/a
  • ID: servo1_drv, servo2_drv; Exec_T: 200 cycles; Rate_factor: 11
  • ID: yaw1, yaw2; Size: 2 bytes; Quality: 1, 2
  • ID: rud_Angle2; Size: 2 bytes; Quality: 2
  • ID: yaw_cntrl2; Exec_T: 400 cycles; Rate_factor: 12
19
Ardea Fault Detection and Handling
  • Failure detection of sensors, actuators and
    software modules is the responsibility of
    application software
  • Ardea built-in fault detection
  • PE crash failures by heartbeat messages
  • Network link failures detected and handled as PE
    failures
  • Software module crashes detected locally by
    module execution monitors
  • Critical output modules detect missed deadlines
  • Fault handling: masking, reconfiguration, or
    fail-stop
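
Heartbeat-based detection of PE crash failures can be sketched as follows. This is a minimal illustration, not the Ardea system manager: the class name `HeartbeatMonitor`, the timeout value, and the use of explicit timestamps are all assumptions made for the example.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time of each processing element (PE)."""

    def __init__(self, pe_ids, timeout):
        # timeout: longest silence tolerated before a PE is declared failed
        # (hypothetical parameter, in the same units as the timestamps).
        self.timeout = timeout
        self.last_seen = {pe: 0.0 for pe in pe_ids}

    def beat(self, pe_id, now):
        """Record a heartbeat message received from pe_id at time now."""
        self.last_seen[pe_id] = now

    def failed(self, now):
        """Return the PEs whose heartbeat is overdue at time now."""
        return sorted(
            pe for pe, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

Because a broken network link also silences the heartbeat, the same check covers the slide's point that link failures are detected and handled as PE failures.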

20
Sensor Fault Detection
Figure 22 - Page 52
Figure 21 - Page 51
21
Actuator Fault Detection
Figure 23 - Page 53
22
Software Fault Detection
Figure 25 - Page 55
23
Triple Modular Redundancy
Figure 20 - Page 50
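
As a reminder of how triple modular redundancy masks a single fault, a majority voter can be sketched as below. The function name `tmr_vote` and the choice to return `None` when all three replicas disagree are assumptions for illustration; the figure in the dissertation may structure the voter differently.

```python
from collections import Counter


def tmr_vote(a, b, c):
    """Return the value reported by at least two of three replicas.

    Returns None when all three disagree, i.e. the fault cannot be masked.
    """
    value, count = Counter([a, b, c]).most_common(1)[0]
    return value if count >= 2 else None
```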
24
Ardea Runtime Behavior
  • Supporting mobile software modules (moving object
    code, scheduling/unscheduling, and data
    re-routing)
  • Tracking resource availability
  • Finding Configurations
  • Deploying Resources
  • Managing state data variables

25
The System Manager
Figure 18 - Page 45
26
Processing Elements (PEs)
  • Memory Loader: copies code into program memory
  • Scheduler: starts and stops execution of modules
  • Network Interface: handles public data variables
    (data routing)

Figure 17 - Page 42
27
Mobility and Data Routing
  • Module I/O data passed through mailboxes
  • Data routing transparent to modules
  • Starting, stopping of modules
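
The idea that modules exchange data only through mailboxes, with routing hidden from them, can be sketched as follows. This is a simplified single-process illustration under assumptions: `Mailbox` and `Router` are hypothetical names, and the real system routes data across PEs over the network.

```python
from collections import deque


class Mailbox:
    """FIFO mailbox from which a software module reads its input."""

    def __init__(self):
        self._queue = deque()

    def post(self, value):
        self._queue.append(value)

    def pend(self):
        """Return the oldest message, or None when the mailbox is empty."""
        return self._queue.popleft() if self._queue else None


class Router:
    """Maps data-variable names to the mailboxes of their consumers."""

    def __init__(self):
        self.routes = {}

    def connect(self, variable, mailbox):
        self.routes.setdefault(variable, []).append(mailbox)

    def disconnect(self, variable, mailbox):
        self.routes[variable].remove(mailbox)

    def publish(self, variable, value):
        # The producer names only the variable; it never learns who consumes it.
        for mailbox in self.routes.get(variable, []):
            mailbox.post(value)
```

Moving a consumer then amounts to a `disconnect` followed by a `connect` with the new mailbox, while the producing module keeps calling `publish` unchanged.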

Figure 26 - Page 61
28
Reconfiguration Policies
  • Two configuration finding algorithms
  • High-fidelity (NP-hard): finds high-utility
    configurations
  • Low-fidelity (fast): ensures that critical
    services keep running
  • Response based on criticality of
    detected/reported fault
  • Deploying configurations starting from sensor
    side of a DG
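
To make the low-fidelity idea concrete, here is a hedged sketch of a fast greedy placement that maps critical modules first. It is not the algorithm from the dissertation: the first-fit strategy, the function name `low_fidelity_config`, and the use of processor load in percent as the only constraint are assumptions for illustration.

```python
def low_fidelity_config(modules, pe_capacity):
    """Greedy first-fit mapping of software modules onto PEs.

    modules: list of (name, load_percent, critical) tuples.
    pe_capacity: dict mapping PE name to free processor load in percent.
    Returns a dict module -> PE, or None when a critical module cannot
    be placed (the caller would then enter fail-stop).
    """
    free = dict(pe_capacity)
    mapping = {}
    # Critical modules first; within each class, heavier modules first.
    ordered = sorted(modules, key=lambda m: (not m[2], -m[1]))
    for name, load, critical in ordered:
        host = next((pe for pe in sorted(free) if free[pe] >= load), None)
        if host is None:
            if critical:
                return None
            continue  # non-critical module is dropped: graceful degradation
        free[host] -= load
        mapping[name] = host
    return mapping
```

With modules `yaw_cntrl` (60, critical), `servo_drv` (30, critical) and `logger` (50, non-critical), losing one of two PEs drops the logger but keeps both critical modules running on the survivor.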

See Figure 31 - Page 75
29
Outline
  • Motivation and Background
  • Objective and Contributions
  • Framework Overview
  • Software Module Dependency Graphs
  • Ardea Hardware Architecture
  • Ardea Fault Tolerance
  • Runtime Behavior
  • Related Work
  • Example Implementation
  • Future Work
  • Conclusion

30
Related Work: Analysis Tools
  • Goal Based Success Trees
  • Failure Mode Analysis (fault trees)
  • AADL: Architecture Analysis and Design Language
  • By SAE
  • A textual modeling language for specification of
    real-time embedded systems
  • System is defined as a set of components with
    resource and timing properties
  • No support for components with degraded modes of
    operation
  • Graphical tools currently under development

31
Related Work: Graceful Degradation
  • RoSES at CMU and Chameleon at the Technical
    University of Kaiserslautern
  • Both are abstract, not considering implementation
  • Both consider non-safety-critical and
    non-real-time applications
  • RoSES focuses on the complexity of configuration
    search algorithms and on product families
  • Chameleon focuses on modeling and analysis of
    gracefully degrading systems

32
Related Work: Distributed Object Computing
  • Examples: Jini, CORBA, RT-CORBA, and ARMORs
  • Principle: service based computing, where
    services are brokered at runtime
  • Designed with large information systems in mind
  • Depend on TCP/UDP (not reliable)
  • Not suitable for embedded systems

33
Outline
  • Motivation and Background
  • Objective and Contributions
  • Framework Overview
  • Software Module Dependency Graphs
  • Ardea Hardware Architecture
  • Ardea Fault Tolerance
  • Runtime Behavior
  • Related Work
  • Example Implementation
  • Future Work
  • Conclusion

34
Example Ardea Implementation
  • Specified an implementation of a control system
    and scientific data collection system for a light
    UAV
  • Application includes redundant sensors,
    actuators, yaw controllers with different
    fidelities
  • Application includes non-critical functions in
    form of a scientific data collection system
  • Limitations
  • Object code is preloaded on PEs
  • Uses pre-computed configurations
  • Considers only processor time as constraint

35
Pseudocode of Modules
1 Check connection to magnetometer i
2 IF no connection, write failure message into mag i fail mailbox and suspend
3 ELSE
4   Sample magnetometer n times
5   Average the n samples
6   Place average into yaw i mailbox
7   Suspend for sampling period amount of time
8 GOTO 1
Figure 33: Magnetometer Drivers Pseudocode, p. 83
1 Check connection to servo
2 IF no connection, write failure message into servo i fail mailbox and suspend
3 ELSE
4   Read rudder angle from input mailbox
5   IF rudder angle is fail-stop code, move servo to 180 degrees
5   ELSE
6     Set servo to rudder angle
7   Delete data in input mailbox
8 Reset deadline timer
9 Suspend until input mailbox full or deadline timer overflow
10 IF deadline timer overflow
11   transmit fail-stop to system manager
12 ELSE GOTO 1
Figure 35: Servo Drivers Pseudocode, p. 84
36
COTS Components
  • Network: CAN 2.0B bus
  • Controller Area Network
  • Robust differential signaling, CRC, fail-silent
    nodes
  • CSMA/CD with non-destructive bitwise arbitration
  • PEs: high-performance Silicon Labs 8051 core
    microcontrollers
  • OS: microC/OS-II
  • Preemptive, multitasking, priority based, ROMable
  • DO-178B Level-A certified
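
Non-destructive bitwise arbitration on CAN can be illustrated with a short simulation. On the bus a dominant bit (0) overrides a recessive bit (1), so a node that sends recessive but reads dominant stops transmitting, and the frame with the lowest identifier wins without being corrupted. The function name `arbitrate` is hypothetical; the 29-bit default matches the extended identifiers of CAN 2.0B.

```python
def arbitrate(identifiers, width=29):
    """Return the identifier that wins bitwise arbitration.

    identifiers: identifiers of frames whose transmission starts at the
    same time (assumed unique per node, at least one present).
    width: identifier length in bits (29 for CAN 2.0B extended frames).
    """
    contenders = set(identifiers)
    for bit in range(width - 1, -1, -1):  # most significant bit first
        # Wired-AND bus: a single dominant 0 pulls the bus level to 0.
        bus_level = min((ident >> bit) & 1 for ident in contenders)
        # Nodes that sent recessive 1 but read dominant 0 back off.
        contenders = {i for i in contenders if (i >> bit) & 1 == bus_level}
    (winner,) = contenders
    return winner
```

For instance, among identifiers 0x120, 0x0F0 and 0x1FF the frame 0x0F0 wins, which is why lower identifiers are typically assigned to higher-priority messages.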

Figure 39 - Page 87
37
microC/OS-II API
  • OS Functions called by the scheduler task for
    managing task execution and mailboxes
  • Other calls: read/write mailboxes, suspend while
    mailbox empty, and suspend for time t

Table 8 Operating System API, p. 93
38
CAN API and Messages
Table 10 The CAN Bus API, p. 95
Table 11 Overview of Messages, p. 96
39
Resource Tracking and Finding New Configurations
Table 12 Highest Utility Configuration Array, p. 99
Table 13 Example Resource Status Array, p. 101
40
Future Work: Ardea System Monitor
Figure 46 - Page 104
41
Future Work: A Wireless Bus Extension
Figure 47 - Page 105
42
Future Work: The Ardea CAD Tool
Figures 12,13 - Pages 31,32
43
Future Work: Honeywell's EAFTC
  • Honeywell's Environmentally Adaptive Fault
    Tolerant Computing System (EAFTC)
  • One of four technology validation payloads on the
    New Millennium Program's Space Technology 8 (ST8)
    Mission scheduled for 2008.
  • Purpose: fault tolerant high-rate onboard
    parallel processing for science data
  • We are currently investigating the
    use/modification of Ardea for EAFTC
  • Supported by the Kentucky Space Grant Consortium

Testing Tomorrow's Technology Today!
44
Ardea Benefits
  • More flexible fault tolerance at reduced cost
  • Ability to analyze reconfigurable architectures
    using DGs
  • Simplified debugging and maintenance
  • Runtime system testing
  • Graceful upgrade and repair
  • Reduction of design errors
  • Software reusability

45
Conclusion
  • Graceful degradation in distributed embedded
    systems is a new research area currently focusing
    on either abstract modeling or on
    non-real-time/non-critical systems
  • Ardea provides a structured framework for the
    design and implementation of real-time systems
  • Dependency graphs were presented to capture fault
    tolerant, dynamically reconfiguring, software
    architectures
  • An infrastructure supporting reconfigurable
    distributed applications was presented

46
Misc. Publications and Patents
  • Patents
  • Vallance, R.R., S. Chikkamaranahalli, O.A.
    Rawashdeh, J.E. Lumpp, B. Walcott, and E.
    Wolsing. System and Device for Characterizing
    Shape Memory Alloy Wires, U.S. Patent 6,916,115,
    July 12, 2005.
  • Wermeling, D., R. Vallance, B. Walcott, J. Main,
    J. Lumpp, O.A. Rawashdeh,  Programmable
    Multi-Dose Intranasal Drug Delivery Device, U.S.
    patent pending, application for utility patent
    filed December 2002.
  • Balasubramanian, A., R.R. Vallance, B.L. Walcott,
    J.E. Lumpp, O.A. Rawashdeh, Linear Actuator
    Using Shape Memory Wire with Controller, U.S.
    patent pending, provisional application filed
    September, 2002.
  • Publications
  • D. Jackson, A. Groves, O. Rawashdeh, G. Chandler,
    W. Smith, and J. Lumpp, Evolution of an Avionics
    System for a High-Altitude UAV, proc. AIAA
    Infotech@Aerospace Conference, paper
    AIAA-2005-7152, September 2005.
  • S. Chikkamaranahalli, R.R. Vallance, and A.Khan,
    E.R. Marsh, O.A. Rawashdeh, J. E. Lumpp, and B.L.
    Walcott, Precision Instrument for Characterizing
    Shape Memory Alloy Wires in Bias Spring
    Actuation, Review of Scientific Instruments
    Journal, v. 76, June 2005.
  • G. Chandler, D. Jackson, A. Groves, O.A.
    Rawashdeh, N.A. Rawashdeh, W. Smith, J. Jacob,
    and J.E. Lumpp, Jr., A Low-Cost Control System
    for a High-Altitude UAV, IEEE Aerospace
    Conference, IEEEAC paper 1438, March 2005.
  • A. Simpson, O.A. Rawashdeh, S. Smith, J. Jacob,
    W. Smith, and J.E. Lumpp, JR., BIG BLUE A
    High-Altitude UAV Demonstrator of Mars Airplane
    Technology, IEEE Aerospace Conference, IEEEAC
    paper 1436, March 2005.
  • A. Simpson, J. Jacob, S. Smith, O. A. Rawashdeh,
    J. E. Lumpp, and W. Smith, BIG BLUE II Mars
    Aircraft Prototype with Inflatable-Rigidizable
    Wings, 43rd AIAA Aerospace Sciences Meeting and
    Exhibit, January 2005.
  • K.N. Roberts, K.M. Miller, J.E. Lumpp, M. Wells,
    C.P. Harr, O.A. Rawashdeh, and S.W. Scheff,
    Computer Controlled Cortical Contusion Device
    for the Mouse, Journal of Neurotrauma, vol. 21,
    no. 1296, November 2004.
  • O.A. Rawashdeh, Design of a Computer Controller
    for a Nasal Drug Delivery Device using SMA
    Actuators, Thesis, Masters of Science in
    Electrical Engineering, Dept. of Electrical and
    Computer Engr., University of Kentucky, May 2003.
  • S. Chikkamaranahalli, R.R. Vallance, O.A.
    Rawashdeh, J.E. Lumpp, and B. Walcott, Setup to
    Characterize Nitinol Wires, Int.Conf. on Shape
    Memory and Superelastic Technologies (SMST-2003),
    Pacific Grove, CA, May 4-8, 2003.
  • S. Chikkamaranahalli, R.R. Vallance, O.A.
    Rawashdeh, J.E. Lumpp, and B. Walcott,
    Characterization of SMA Wire in Bias Spring
    Actuation, Proceedings of the International
    Conference on Shape Memory and Superelastic
    Technologies (SMST-2003), Pacific Grove, CA,
    May 4-8, 2003.
  • J.E. Lumpp, K.N. Roberts, M. Wells, J.A. Main,
    C.P. Harr, O.A. Rawashdeh, and S.W. Scheff,
    Characterization of a Computer Controlled
    Non-penetrating Cortical Contusion Device,
    Journal of Neurotrauma, vol. 20, no. 1087, May
    2003.
  • S. Chikkamaranahalli, R.R. Vallance, O.A.
    Rawashdeh, J.E. Lumpp, and B. Walcott, Precision
    Instrument for Characterizing Contraction and
    Extension of Nitinol Wire, Proceedings of the
    17th Annual Meeting of the American Society for
    Precision Engineering (ASPE), October 20-25, 2002.

47
Ardea Related Publications
  • O.A. Rawashdeh and J.E. Lumpp, Jr., Run-Time
    Behavior of Ardea: A Dynamically Reconfiguring
    Distributed Embedded Control Architecture, to
    appear, IEEE Aerospace Conference, IEEEAC paper
    1516, March 2006.
  • O.A. Rawashdeh and J.E. Lumpp, Jr., Ardea: A
    Dynamic Reconfiguration Framework for
    Fault-Tolerant Distributed Embedded Systems,
    under review, Journal of Systems and Software,
    Special Issue: Architecting Dependable Systems,
    submitted October 2005.
  • G. Chandler, C. Harr, O. Rawashdeh, D. Feinauer,
    D. Jackson, A. Groves, and J. Lumpp, Wireless
    Extension of an Avionics Bus for Prototyping and
    Testing Reconfigurable UAVs, proc. AIAA
    Infotech@Aerospace Conference, paper
    AIAA-2005-7151, September 2005.
  • O. Rawashdeh, D. Feinauer, C. Harr, G. Chandler,
    D. Jackson, A. Groves, and J. Lumpp, A
    Dynamically Reconfiguring Avionics Architecture
    for UAVs, proc. AIAA Infotech@Aerospace
    Conference, paper AIAA-2005-7050, September
    2005.
  • O.A. Rawashdeh, G.D. Chandler, and J. E. Lumpp,
    Jr., A UAV Test and Development Environment
    Based on Dynamic System Reconfiguration,
    International Conference on Software Engineering
    (ICSE), proc. of the 2005 Workshop on
    Architecting Dependable Systems (WADS05),
    pp. 1-7, May 2005.
  • O.A. Rawashdeh and J.E. Lumpp, Jr., A Technique
    for Specifying Dynamically Reconfigurable
    Embedded Systems, IEEE Aerospace Conference,
    IEEEAC paper 1435, March 2005.

49
Reconfiguration Policies
Figure 31 - Page 75
50
Related Work: RoSES
  • Robust Self-Configuring Embedded Systems
  • Long term abstract graceful degradation research
    at CMU
  • Composes system into feature subsets, each
    having a utility value depending on the
    operation of its software components
  • Offline exponential reconfiguration search
    algorithms find optimal configurations
  • Not deterministic and not testable
  • Focus
  • Reducing complexity of search algorithms
  • Software fault tolerance for non-critical
    functionality
  • Use as product family specification

51
Related Work: Chameleon
  • Focus is on modeling and analysis of gracefully
    degrading distributed embedded systems
  • System modeled as a set of services, each with
    input requirements
  • Each service has a configuration tree (or success
    trees)
  • Abstract modeling work, no implementation or
    runtime considerations

52
I/O Devices
  • I/O devices
  • Interfaces to the environment
  • Output device attributes
  • Criticality
  • Priority
  • Real-time deadline
  • Status
  • Attributes are modifiable
  • I/O software modules
  • Input modules are time triggered
  • Output modules monitor deadlines

Figure 8 - Page 24
53
Ardea Fault Handling
  • Static redundancies do not cause reconfiguration
  • Reports of failures to the system manager
    trigger reconfiguration to employ redundancies
    dynamically
  • Fail-stop mode initiated when critical deadlines
    are missed (due to undetected failures or due to
    reconfiguration delays)

54
Example Dependency Graph
Figure 9 - Page 24 (two gates labeled 1/3)
55
Scheduling and Unscheduling
  • Starting, stopping, and restarting modules
  • Restarting requires
  • State Preservation
  • Unprocessed data preservation

Figures 27,30 - Pages 65,69