Title: Run-Time Behavior of Ardea: A Dynamically Reconfigurable Distributed Embedded Control Architecture
1Run-Time Behavior of Ardea: A Dynamically Reconfigurable Distributed Embedded Control Architecture
- Osamah A. Rawashdeh and James E. Lumpp, Jr.
- Department of Electrical and Computer Engineering
- University of Kentucky
- Lexington, KY
2Outline
- Motivation/Background
- Objective and Contributions
- Ardea Framework Overview
- Ardea Hardware Architecture
- Software Module Dependency Graphs
- Ardea Fault Tolerance
- Runtime Behavior
- Summary and Conclusion
3Embedded Control
- Distributed embedded control system
- Mission critical tasks
- Non-mission critical tasks
- Array of sensors and actuators
- Set of computing resources
- Interconnection network
4Distributed Fault-Tolerance
- Nodes, links and software can fail
- Fault tolerance entails fault detection/handling
- Requires redundancy
- Static redundancy (spatial redundancy)
- Modular redundancy
- Design Diversity
- Dynamic redundancy (temporal redundancy)
- Recovery blocks
- Failover programming
- Roll-back/roll-forward
- Real-Time deadlines
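As an illustration of dynamic (temporal) redundancy, the following is a minimal sketch of a recovery block: alternates are tried in order from the same checkpointed input until one passes an acceptance test. All names here (`recovery_block`, the two heading filters, the plausibility test) are hypothetical and chosen only for this example.

```python
from typing import Callable, Sequence


class RecoveryBlockError(Exception):
    """Raised when every alternate fails its acceptance test."""


def recovery_block(
    alternates: Sequence[Callable[[float], float]],
    acceptance_test: Callable[[float], bool],
    checkpoint: float,
) -> float:
    """Try each alternate on the checkpointed input until one is accepted."""
    for alternate in alternates:
        try:
            # Every attempt restarts from the same checkpoint (roll-back).
            result = alternate(checkpoint)
        except Exception:
            continue  # a crash counts as a failed attempt
        if acceptance_test(result):
            return result
    raise RecoveryBlockError("no alternate passed the acceptance test")


def primary_heading_filter(raw_deg: float) -> float:
    # Hypothetical primary version with a deliberate defect for illustration.
    return raw_deg * 10.0


def backup_heading_filter(raw_deg: float) -> float:
    # Hypothetical simpler backup version.
    return raw_deg % 360.0


def heading_is_plausible(value: float) -> bool:
    return 0.0 <= value < 360.0


heading = recovery_block(
    [primary_heading_filter, backup_heading_filter],
    heading_is_plausible,
    370.0,
)
```

Because alternates run one after another, the worst-case time of the whole block has to fit within the real-time deadline, which is the main cost of this kind of redundancy.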
5BIG BLUE
- BIG BLUE: Baseline Inflatable-wing Glider, Balloon-Launched Unmanned Experiment.
- Ongoing project at UK to develop a test bed for Mars airplane technology.
- 40 undergraduate students involved per year.
6UAV Research
- BIG BLUE is funded by NASA Workforce Development Program.
- Dependable UAVs for Homeland Security
- BIG BLUE III, with inflatable-only wings, and a UAV for entry in the AUVSI 3rd Annual Student UAV Competition.
- READY UAV
7BIG BLUE II Architecture
- Mission Controller
- Auto-Sequencing
- Data Acquisition
- Ground Communication
- Flight Controller
- Control Glider
- Chute Control
- Monitor System Status
- Deploy Recovery Chute
- Camera Driver
- Capture Images
- Store to NVRAM
- Shared Memory Space
- Mailbox-Based Messaging
8Reconfiguration Based FT
- Run-time reconfiguration feasible in distributed embedded systems
- Cost, size, power constraints
- Availability of non-critical resources
- Graceful degradation: a loss of or reduction in the quality of services a system provides in response to faults
9Approach
- Hardware/software faults degrade performance instead of causing system failure.
- Resources dedicated to non-critical functions serve as backup resources for critical functions.
- No need to consider every failure combination at design time.
- Objective: to develop a framework for specifying gracefully degrading distributed embedded systems.
10Primary Challenges
- Specify static and dynamic redundancies as well as graceful degradation
- Provide run-time infrastructure for these applications
11The Ardea Framework
- Ardea: Automatically Reconfigurable Distributed Embedded Architectures
- Ardea herodias: the Great Blue Heron, a wading bird of the heron family Ardeidae, common all over North and Central America. This is the largest North American heron.
12HW Architecture Overview
- Processing Elements (PEs)
  - Homogeneous set of processors
  - Real-time OS
  - Local management tasks (scheduler, network interface, loader)
- I/O Devices
  - Sensors and actuators
  - Hosted by PEs
- Communication Network
  - Broadcast network
  - Bandwidth and latency
- System Manager
  - Fault tolerant by other means
  - Tracks status of resources
  - Finds and deploys configurations
13Micro C/OS-II
- Portable, ROMable, scalable, preemptive, real-time, multi-tasking, priority-based kernel.
- Source available, ANSI C, and free for academic use.
- Ported to 40 architectures (8 to 64 bit) since 1992.
- Meets RTCA DO-178B Level A
- Uses 4 CPU and 3 KB - 30 KB RAM
14CANaerospace
- Stock Flight Systems
- NASA Langley AGATE/SATS
- NASA Ames SOFIA
15Specifying Fault Tolerance
- Specifying static redundancy
- Modular redundancy
- N-version programming
- Specifying dynamic redundancy
- Roll-back
- Roll-forward
- Specifying graceful degradation
- Multi-version software modules
- Shedding non-critical services
- Reducing update/output rate of services
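The graceful-degradation options above (multi-version modules, shedding non-critical services) can be sketched as a simple selection under a shrinking resource budget. This is only an illustration under stated assumptions: the `Version` and `Service` types, the abstract `cost` and `utility` numbers, and the greedy rule are hypothetical and are not taken from Ardea.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Version:
    name: str
    cost: int     # hypothetical resource cost (abstract units)
    utility: int  # hypothetical benefit; higher is better


@dataclass(frozen=True)
class Service:
    name: str
    critical: bool
    versions: List[Version]


def degrade(services: List[Service], capacity: int) -> Dict[str, Optional[str]]:
    """Pick one version per service; None means the service is shed."""
    plan: Dict[str, Optional[str]] = {}
    remaining = capacity
    # Critical services choose first, so non-critical ones are shed first.
    for svc in sorted(services, key=lambda s: not s.critical):
        affordable = [v for v in svc.versions if v.cost <= remaining]
        if affordable:
            best = max(affordable, key=lambda v: v.utility)
            plan[svc.name] = best.name
            remaining -= best.cost
        elif svc.critical:
            raise RuntimeError(f"cannot run critical service {svc.name}")
        else:
            plan[svc.name] = None
    return plan


SERVICES = [
    Service("camera", False, [Version("capture", 5, 1)]),
    Service("yaw_control", True, [Version("full", 9, 2), Version("reduced", 4, 1)]),
]

plan_healthy = degrade(SERVICES, capacity=14)
plan_degraded = degrade(SERVICES, capacity=10)
plan_minimal = degrade(SERVICES, capacity=5)
```

With 14 units both services run at full quality, with 10 units the camera is shed, and with 5 units the critical yaw control falls back to its reduced version. Reducing update rates would be a third lever, which could be modeled by making `cost` depend on the rate.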
16Application Specification
- Dependency graphs show the periodic flow of information from sensors to actuators (i.e., data pipelines)
- Graph nodes: software modules, data variables, I/O devices, and dependency gates
- Software modules
- Executable code schedulable on a processing element
- Suspended while input(s) unavailable
- Produce and consume data variables
- Attributes: worst-case execution time and rate factor
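A minimal sketch of how such a data pipeline might be represented: each software module lists the data variables it consumes and produces, and a module stays suspended until all of its inputs are available. The module names are borrowed from the node-attribute slide, but the wiring between them, the `mag_raw` variable, and the `run_pipeline_once` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class SoftwareModule:
    name: str
    wcet_cycles: int           # worst-case execution time
    consumes: Tuple[str, ...]  # input data variables
    produces: Tuple[str, ...]  # output data variables


def run_pipeline_once(
    modules: List[SoftwareModule], sensor_data: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Run every module whose inputs become available; return (executed, suspended)."""
    available = set(sensor_data)
    pending = list(modules)
    executed: List[str] = []
    progress = True
    while pending and progress:
        progress = False
        for module in list(pending):
            if set(module.consumes) <= available:
                executed.append(module.name)
                available |= set(module.produces)
                pending.remove(module)
                progress = True
    return executed, [m.name for m in pending]


# Assumed wiring: magnetometer driver -> yaw controller -> rudder servo driver.
PIPELINE = [
    SoftwareModule("servo1_drv", 200, ("rud_Angle1",), ()),
    SoftwareModule("yaw_cntrl1", 900, ("yaw1",), ("rud_Angle1",)),
    SoftwareModule("mag_drv1", 300, ("mag_raw",), ("yaw1",)),
]

ran, suspended = run_pipeline_once(PIPELINE, ["mag_raw"])
ran_no_sensor, suspended_no_sensor = run_pipeline_once(PIPELINE, [])
```

Execution order follows data availability rather than list order, so the sensor-side driver runs first even though it is listed last. With no sensor data, every module remains suspended.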
17Data Flow
- Data variables
- Application data between software modules
- State data variables are local to a software module
- Management data variables contain data consumed by the system manager.
- Attributes
- Size
- Quality value or function
Figure 5 - Page 19
18Specifying Dependencies
- Dependency gates
- k-out-of-n OR gates: n > 0, 0 ≤ k ≤ n
- AND: all inputs required
- XOR: only one input required
- DEMUX: for fanning out
- OR gates can be specified to distribute inputs
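The gate semantics can be sketched as predicates over input availability. This is a hedged illustration, not Ardea's implementation: it assumes the bounds n > 0 and 0 <= k <= n, treats AND as the n-out-of-n case, and models XOR as selecting the first available input. The function names are hypothetical.

```python
from typing import Optional, Sequence


def k_out_of_n(k: int, available: Sequence[bool]) -> bool:
    """True when at least k of the n inputs are available."""
    n = len(available)
    if n == 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    return sum(1 for a in available if a) >= k


def and_gate(available: Sequence[bool]) -> bool:
    """AND is the special case where all n inputs are required."""
    return k_out_of_n(len(available), available)


def xor_select(available: Sequence[bool]) -> Optional[int]:
    """Assumed XOR behavior: pass on exactly one input, the first available."""
    for index, is_available in enumerate(available):
        if is_available:
            return index
    return None


# Two redundant yaw sources, one of which has failed.
yaw_inputs = [False, True]
one_of_two_ok = k_out_of_n(1, yaw_inputs)
both_ok = and_gate(yaw_inputs)
selected = xor_select(yaw_inputs)
```

Under this reading, a 0-out-of-n gate marks all of its inputs as optional: the downstream module can run even when none of them is available.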
19Node Attributes
- ID: yaw_cntrl1; Exec_T: 900 cycles; Rate_factor: 15
- ID: out1, out2; Criticality: critical; Priority: 1; Rate: 10 Hz; State: Enabled
- ID: rud_Angle1; Size: 2 bytes; Quality: 1
- ID: mag_drv1, mag_drv2; Exec_T: 300 cycles; Rate_factor: n/a
- ID: yaw_history; Size: 8 bytes; Quality: n/a
- ID: servo1_drv, servo2_drv; Exec_T: 200 cycles; Rate_factor: 11
- ID: yaw1, yaw2; Size: 2 bytes; Quality: 1, 2
- ID: rud_Angle2; Size: 2 bytes; Quality: 2
- ID: yaw_cntrl2; Exec_T: 400 cycles; Rate_factor: 12
20Ardea Fault Detection and Handling
- Failure detection of sensors, actuators, and software modules is the responsibility of application software
- Ardea built-in fault detection
- PE crash failures by heartbeat messages
- Network link failures detected and handled as PE failures
- Software module crashes detected locally by module execution monitors
- Critical output modules detect missed deadlines
- Fault handling: masking, reconfiguration, or fail-stop
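Heartbeat-based detection of PE crash failures can be sketched as a timeout check on the last message seen from each PE. The class name, the timeout value, and the explicit `now` timestamps (passed in so the example is deterministic) are assumptions for illustration only.

```python
from typing import Dict, List


class HeartbeatMonitor:
    """Flags a PE as failed when its heartbeat is older than the timeout."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen: Dict[str, float] = {}

    def beat(self, pe_id: str, now: float) -> None:
        """Record a heartbeat message received from a PE at time `now`."""
        self.last_seen[pe_id] = now

    def failed(self, now: float) -> List[str]:
        """Return the PEs whose last heartbeat is older than the timeout."""
        return sorted(
            pe for pe, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=0.5)
monitor.beat("pe1", now=0.0)
monitor.beat("pe2", now=0.0)
monitor.beat("pe1", now=1.0)           # pe2 has gone silent
suspects = monitor.failed(now=1.2)
```

A silent PE and a broken network link look identical from the monitor's point of view, which is consistent with handling link failures as PE failures.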
21Sensor Fault Detection
22Actuator Fault Detection
23Software Fault Detection
24Triple Modular Redundancy
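Triple modular redundancy masks a single faulty replica by majority voting over three outputs. The sketch below is a generic exact-match voter with a hypothetical function name, not Ardea's voter. Analog sensor values would typically need a tolerance band or a median instead of exact equality.

```python
from collections import Counter
from typing import Hashable


def tmr_vote(a: Hashable, b: Hashable, c: Hashable) -> Hashable:
    """Return the value at least two of the three replicas agree on."""
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count >= 2:
        return value
    raise ValueError("no majority: all three replicas disagree")


# One replica produces a wrong rudder command; the voter masks it.
rudder_cmd = tmr_vote(10, 10, 99)
```

When all three replicas disagree the fault cannot be masked, so the caller has to fall back to reconfiguration or fail-stop.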
25The Challenges
- Software module location independence
- Moving object code
- Routing module I/O data
- Fault Recognition
- User/application code reported faults
- Ardea built-in fault detection
- Tracking status/availability of HW and SW resources
- Configuration Management
- Tracking resource availability (HW and SW)
- Finding new configurations (mapping of modules to PEs)
- Deploying new configurations (starting, stopping, and restarting modules)
- Managing state data variables
- Reconfiguration time and critical deadlines (multiple system reconfiguration policies support reconfiguration before deadlines are missed; if a deadline is missed, the system fail-stops)
26The System Manager
27Processing Elements (PEs)
- Memory Loader: copies code into program memory
- Scheduler: starts and stops execution of modules
- Network Interface: handles public data variables (data routing)
28Scheduling and Unscheduling
- Starting, stopping, and restarting modules
- Restarting requires
- State Preservation
- Unprocessed data preservation
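Restarting a module elsewhere requires carrying over both its state data and any inputs it had not yet processed. The following sketch shows that hand-off with a snapshot taken at stop time. The `ModuleInstance` class, its running-sum state, and the `restart` helper are hypothetical stand-ins for illustration.

```python
import copy
from collections import deque
from typing import Any, Dict, Iterable, Optional


class ModuleInstance:
    """A toy module that keeps a running sum of its inputs as state."""

    def __init__(
        self,
        name: str,
        state: Optional[Dict[str, Any]] = None,
        pending: Iterable[int] = (),
    ) -> None:
        self.name = name
        self.state: Dict[str, Any] = dict(state) if state else {"sum": 0}
        self.inbox = deque(pending)

    def step(self) -> None:
        """Consume one queued input, if any, and update the state."""
        if self.inbox:
            self.state["sum"] += self.inbox.popleft()

    def stop(self) -> Dict[str, Any]:
        """Snapshot state and unprocessed inputs before unscheduling."""
        return {
            "state": copy.deepcopy(self.state),
            "pending": list(self.inbox),
        }


def restart(name: str, snapshot: Dict[str, Any]) -> ModuleInstance:
    """Recreate the module (for example on another PE) from a snapshot."""
    return ModuleInstance(name, snapshot["state"], snapshot["pending"])


old = ModuleInstance("yaw_cntrl1", pending=[1, 2, 3])
old.step()                       # processes the first input only
snapshot = old.stop()
new = restart("yaw_cntrl1", snapshot)
new.step()
new.step()
```

After the restart the new instance ends with the same sum it would have reached without the interruption, because neither the state nor the queued inputs were lost.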
29Reconfiguration Policies
- Two configuration finding algorithms
- High-fidelity (NP-hard) to find high-utility configurations
- Low-fidelity (fast) to ensure running of critical services
- Response based on criticality of detected/reported fault
- Deploying configurations starting from the sensor side of a DG
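To make the fast policy concrete, here is a first-fit heuristic that places critical modules first and sheds non-critical ones that do not fit, followed by a deployment order that starts at the sensor side of the graph. This is a stand-in under stated assumptions, not the authors' algorithm: the load numbers, PE capacities, `camera` module, and graph wiring are invented for the example, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set, Tuple

Module = Tuple[str, int, bool]  # (name, load, critical); loads are hypothetical


def low_fidelity_mapping(
    modules: List[Module], capacity: Dict[str, int]
) -> Dict[str, str]:
    """First-fit mapping of modules to PEs, critical modules first."""
    free = dict(capacity)
    mapping: Dict[str, str] = {}
    # Critical before non-critical, larger loads first within each group.
    for name, load, critical in sorted(modules, key=lambda m: (not m[2], -m[1])):
        target = next((pe for pe in sorted(free) if free[pe] >= load), None)
        if target is None:
            if critical:
                raise RuntimeError(f"no PE can host {name}: fail-stop")
            continue  # shed the non-critical module
        mapping[name] = target
        free[target] -= load
    return mapping


def deployment_order(upstream: Dict[str, Set[str]]) -> List[str]:
    """Order modules so that producers (sensor side) start before consumers."""
    return list(TopologicalSorter(upstream).static_order())


MODULES = [
    ("camera", 6, False),
    ("servo1_drv", 2, True),
    ("yaw_cntrl1", 9, True),
    ("mag_drv1", 3, True),
]
# Assumed graph: each module maps to the modules it depends on.
UPSTREAM = {"servo1_drv": {"yaw_cntrl1"}, "yaw_cntrl1": {"mag_drv1"}}

healthy = low_fidelity_mapping(MODULES, {"pe1": 16, "pe2": 16})
after_pe2_failure = low_fidelity_mapping(MODULES, {"pe1": 16})
order = deployment_order(UPSTREAM)
```

When `pe2` is lost, the heuristic keeps all three critical modules on `pe1` and drops the camera. A high-fidelity search could then look for a better configuration in the background.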
30Reconfiguration Policies
31Ardea Features
- Ardea provides a structured framework for the design and implementation of real-time systems
- Dependency graphs for application software specification
- Run-time support for portable software modules, fault recognition, and handling
- More flexible fault-tolerance at reduced cost
- Specify reconfigurable systems using DGs
- Simplified debugging and maintenance
- Graceful upgrade and repair
- Software reusability
32Current Work
- Applying techniques to UAVs for student UAV competitions.
- Avionics system for BIG BLUE Mars Airplane.
- READY UAV Project
- Expand bus via wireless link
- Rapid prototyping
- Minimize risk to hardware
- Flexible Reconfiguration