Title: COTS Challenges for Embedded Systems
1Middleware Aspect Frameworks in Support of
Model-Driven Component Middleware
Christopher D. Gill
Center for Distributed Object Computing
Department of Computer Science and Engineering
Washington University, St. Louis, MO
cdgill_at_cse.wustl.edu
This research has been supported by DARPA and
AFRL, in collaboration with Boeing, BBN, Lockheed
Martin, and OOMWorks
Thanks to the many people who have contributed to
this work
2Presentation Roadmap
- Shameless Plug: Call for Participation
- Multi-Property Systems Challenges
- Open Questions in QoS Middleware Research
- Research Approach and Assumptions
- QoS Middleware Research Initiatives
- CIAO
- nORB
- Kokyu/RTCORBA 2.0
- FT/RT EC
- WSOA
- Summary and Concluding Remarks
3Shameless Plug: Call for Participation
- OMG Real-Time and Embedded Workshop
- One page abstracts due Friday 3/7
- RTAS 2003 Workshop on Model-Driven Embedded Systems
- May 27th at RTAS (May 27th-30th)
- 6 page position/experience papers due 3/31
- Also, OMG TC meeting
- QoS CCM RFP
- MDA
- RT CORBA 2.0
4Multi-Property Systems Challenges
- Inherent Tension
- Abstraction/separation of concerns
- Entangled properties
- Competing constraints
- Design Tension
- Single-system solutions
- Too expensive to maintain
- Difficult to evolve
- Purely off-the-shelf solutions
- May not meet key needs
- E.g., Real-time constraints
- E.g., Security requirements
- Motivates several research topics
- Component models
- Configuration/checking by MIC/MDA
- Systemic aspect frameworks
System Properties
5Open QoS Middleware Research Questions
- Real-Time, Component Middleware
- Can we manage QoS configuration complexity?
- Real-Time, Small Footprint
- Can we reduce footprint while maintaining RT assurances?
- Real-Time, Distribution
- Can we schedule distributed groups of threads and events?
- Real-Time, Fault Tolerance
- Can we trade off fault-tolerance and real-time properties?
- Real-Time, Adaptation
- Can we adapt QoS effectively in a dynamic environment?
- Can we make adaptation rigorous?
6Research Approach and Assumptions
- RT Middleware Focus
- Distribution assumed
- Other issues
- Time, Faults
- Concurrency
- Synchronization
- Consistency
- Multiple QoS Dimensions at Once
- Real-time
- Small footprint
- Fault-tolerant
- Secure
- Management
- Adaptation/control
- Static/dynamic
[Figure: overlapping QoS dimensions (Real-Time, Fault-Tolerance, Security), labeled with criticality, laxity, contention, heartbeat, pacing, detection, fail-over, delivery, replicas, consensus, defense, adaptation, detection, authorization]
7Distributed Real-Time as a Starting Point
- Crucial features
- Relative criticality
- Resource competition
- Resource utilization
- What's needed next?
- Automated configuration
- AO, OO, generic versions
- Correctness by composition
- Type-decorated primitives
- Model-driven weaving
- Wider integration
- CIAO
- DSRT CORBA 2.0
- New TAO Event Channel
- Additional dimensions
- FT + Secure + Embedded
[Figure: overlapping Real-Time and Distribution dimensions, labeled with criticality, laxity, contention, pacing, latency, clock consistency, endsystem concurrency, global policies, distributed identity]
8Relevant Projects
- Real-Time, Component Middleware
- CIAO: Component-Integrated ACE ORB
- Real-Time, Small Footprint
- nORB: small footprint real-time ORB
- Real-Time, Distribution
- Kokyu, RTCORBA 2.0, distributed scheduling
- Real-Time, Fault Tolerance
- FT/RT Event Channel
- Real-Time, Adaptation
- ASFD/ASTD/WSOA
9Real-Time Component Middleware
- Overarching Open Research Question
- Can we manage QoS configuration complexity?
- Observations
- CIAO addresses component-level accidental complexity
- We can then complement CIAO with lower-level aspect frameworks
- Reduce accidental complexity of models, modeling tools
- Analogy: KSU's experience with OpenCCM under Cadena
- Existing frameworks (Kokyu, nORB) are a good starting point
- Collaborations with Boeing, Vanderbilt, KSU, BBN, Stanford
- CIAO: Component-Integrated ACE ORB
- Cadena has used OpenCCM, migrating to CIAO
- QuO Qoskets and the components / QoS aspects relationship
- New ideas for integrating filtering, correlation, dispatching
- Strongly complementary approaches
- Model Integrated Computing / Model Driven Architecture
- Component Middleware / QoS metadata configuration
- Aspect Weaving at component and infrastructure levels
10Real-Time, Small Footprint
- Open Research Question
- Can we reduce footprint while maintaining performance?
- Collaboration with Boeing
- nORB: small footprint real-time ORB
11Boeing NEST OEP Challenge
- Large Sensor Networks (100-10,000 nodes)
- Small footprint constraints (500K is a lot)
- Real-time communication between services
- E.g., node ping
- Minimalist implementation (features/footprint)
- Real-time performance
- Middleware up to 100Hz
- ACE as our substrate: portable, proven
- TAO as our guide: techniques, benchmark
- Standard communication protocol: IIOP 1.0
12Just Enough Middleware
Client
- Client invokes a method call on the remote object
- Client Stub: stub creates a collection of parameters
- CDR Marshaller: parameters are marshalled
- ORB Core: IIOP Request sent
Server
- ORB Core: IIOP Request received
- CDR Demarshaller: parameters are de-marshalled
- Simple Object Adapter: servant object located
- Server Skeleton: dispatch method call to servant
- Server method invoked
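The request path on this slide can be sketched in a few lines. This is a hypothetical Python illustration, not nORB code: the wire format is a simplified stand-in for CDR over IIOP (two length-prefixed strings followed by two 32-bit integers), and the names `marshal_request`, `demarshal_request`, `SimpleObjectAdapter`, and `Adder` are assumptions made for the example.

```python
import struct


def marshal_request(object_key: str, operation: str, a: int, b: int) -> bytes:
    """Client stub side: pack target, operation, and parameters into bytes."""
    key = object_key.encode("utf-8")
    op = operation.encode("utf-8")
    # Network byte order: two length-prefixed strings, then two signed 32-bit ints.
    return (struct.pack("!I", len(key)) + key
            + struct.pack("!I", len(op)) + op
            + struct.pack("!ii", a, b))


def demarshal_request(data: bytes):
    """Server skeleton side: unpack what marshal_request produced."""
    offset = 0
    (key_len,) = struct.unpack_from("!I", data, offset)
    offset += 4
    key = data[offset:offset + key_len].decode("utf-8")
    offset += key_len
    (op_len,) = struct.unpack_from("!I", data, offset)
    offset += 4
    op = data[offset:offset + op_len].decode("utf-8")
    offset += op_len
    a, b = struct.unpack_from("!ii", data, offset)
    return key, op, (a, b)


class Adder:
    """A servant: the object that implements the remote operation."""

    def add(self, a, b):
        return a + b


class SimpleObjectAdapter:
    """Maps object keys to servants and dispatches incoming requests."""

    def __init__(self):
        self._servants = {}

    def register(self, key, servant):
        self._servants[key] = servant

    def dispatch(self, data: bytes):
        key, op, args = demarshal_request(data)
        servant = self._servants[key]        # servant object located
        return getattr(servant, op)(*args)   # server method invoked


adapter = SimpleObjectAdapter()
adapter.register("adder-1", Adder())
request = marshal_request("adder-1", "add", 2, 3)  # stands in for the IIOP request
result = adapter.dispatch(request)
```

Everything a full ORB adds beyond this path (policies, interceptors, multiple protocols) is what a "just enough" design would leave out unless the application needs it.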
13Development Approach
- Refactor/Reengineer components in ACE
- Right-size key ACE components
- Decouple component dependencies
- Build a new component using finer components
- What is the minimum ORB feature set we need?
- Look at minimalist frameworks like UBI-core
- A guide to mine/refine components from ACE
- Project basis onto refined ACE substrate
- Compare nORB performance to TAO
- Use a realistic sensor network application (ColorCSP)
- Highly useful in avoiding mistakes (some subtle)
14Re-factoring the ACE Substrate
- Decoupling concerns
- Pattern-oriented role decomposition
- Reactor
- Acceptor
- Connector
- Event Handler
- Svc Handler
- Intervening classes
- Aspects from other pattern implementations
- Component Configurator
- Active Object
[Figure: class relationships among ACE_Event_Handler, ACE_Service_Object, PS_Event_Handler, ACE_Task_Base, ACE_Task, ACE_Svc_Handler, and the peer stream]
15nORB Benchmarking
- Evaluation outside Boeing OEP environment
- Isolate nORB itself from effects of domain/system services
- Offer a quantitative baseline in footprint and performance
- Show experimental approach to apply to OEP services
- Test application
- Distributed graph coloring algorithm
- Simple distributed constraint satisfaction problem
- Represents, e.g., ping node scheduling in Boeing OEP
- We used it as a touchstone for footprint and performance
- Comparisons between ACE, TAO, nORB
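To make the test application concrete, here is a minimal single-process simulation of round-based graph coloring on a 10x10 mesh. It is a sketch under stated assumptions, not the ColorCSP implementation: the tie-breaking rule (a conflicted node recolors only if it outranks every conflicted neighbor), the choice of 5 colors, and all function names are hypothetical.

```python
import random


def mesh_neighbors(rows, cols):
    """Build a 4-connected rows x cols mesh as {node: [neighbors]}."""
    nbrs = {}
    for r in range(rows):
        for c in range(cols):
            adj = []
            if r > 0:
                adj.append((r - 1, c))
            if r < rows - 1:
                adj.append((r + 1, c))
            if c > 0:
                adj.append((r, c - 1))
            if c < cols - 1:
                adj.append((r, c + 1))
            nbrs[(r, c)] = adj
    return nbrs


def conflicted(node, colors, nbrs):
    """A node is in conflict if any neighbor currently has the same color."""
    return any(colors[n] == colors[node] for n in nbrs[node])


def run_round(colors, nbrs, num_colors):
    """One message-passing round; returns how many nodes changed color."""
    unhappy = {n for n in nbrs if conflicted(n, colors, nbrs)}
    # Only a node that outranks all of its conflicted neighbors moves, so no
    # two adjacent nodes change in the same round (assumed tie-break rule).
    movers = [n for n in unhappy
              if all(n > m for m in nbrs[n] if m in unhappy)]
    for n in movers:
        used = {colors[m] for m in nbrs[n]}
        colors[n] = min(c for c in range(num_colors) if c not in used)
    return len(movers)


def color_mesh(rows=10, cols=10, num_colors=5, seed=0):
    """Start from a random coloring and run rounds until no node changes."""
    rng = random.Random(seed)
    nbrs = mesh_neighbors(rows, cols)
    colors = {n: rng.randrange(num_colors) for n in nbrs}
    rounds = 0
    while run_round(colors, nbrs, num_colors) > 0:
        rounds += 1
    return colors, nbrs, rounds
```

With at most four neighbors per node, five colors always leave a free choice, and because adjacent nodes never move together each round strictly reduces the number of conflicting edges, so the loop terminates. In a distributed setting each round would correspond to one exchange of messages between neighboring nodes.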
16Node Footprint Comparison
- Middleware layer with only ACE costs 212KB
- Middleware layer with nORB+ACE costs 345KB (133KB over ACE)
- Middleware layer using TAO+ACE costs 1.7MB (1.2MB over nORB+ACE)
- Node application alone costs 164KB
17Experimental Trials
[Figure: timeline of one message-passing round among nodes X, Y, and Z: choose color, compare, improve, store time]
- 500,000 repeated trials to generate large sample population
- Want confidence in fine-grain time bounds distinctions
- Measure time of each message passing round
- 100 nodes in 10x10 square mesh on 4 networked machines
- 2.53GHz P4, 512MB RAM, KURT-Linux, 100Kb/s Ethernet
18CSP Cycle Performance Comparison
19Real-Time, Distributed
- Open Research Question
- Can we schedule distributed groups of events and threads?
- Building Blocks
- ACE
- A low-level framework for portable concurrent distributed objects
- nORB
- An aspect framework for minimal ORB middleware, built on ACE
- Kokyu
- An aspect framework for RT (event) dispatching, built on ACE
- TAO RTCORBA 2.0 scheduler/dispatcher implementation
- Builds on re-factored and extended version of Kokyu
- Appears likely to refine ACE semantics, i.e., thread identity
- Collaborations with OOMWorks, URI, OU, KU
- Kokyu evolution, RTCORBA 2.0
- Multi-level scheduling aspects
- Group and distributed scheduling, resource control
20Anatomy of a Kokyu Dispatcher
- Lanes
- Fixed partitions isolate dispatch requests
- Currently separated by priority
[Figure: Dispatcher with two lanes, each labeled laxity]
21Kokyu Queues
[Figure: Dispatcher with static and laxity queues]
- Queues
- Hold dispatch requests
- Order according to queue type (deadline, laxity, etc.)
- May be implemented using different data structures (ring buffer, hash map, linked list, tree)
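As an illustration of ordering by queue type, the sketch below uses one queue class with a pluggable ordering key. It is not the Kokyu API: `Request`, `DispatchQueue`, `by_deadline`, and `by_laxity` are hypothetical names, and a binary heap stands in for whichever data structure an implementation would choose.

```python
import heapq
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    name: str
    deadline: float   # absolute time by which the request must complete
    exec_time: float  # estimated execution time


def by_deadline(req):
    """Earliest deadline first."""
    return req.deadline


def by_laxity(req):
    # At any common instant `now`, laxity = deadline - now - exec_time, so
    # ordering by (deadline - exec_time) yields the same order as laxity.
    return req.deadline - req.exec_time


class DispatchQueue:
    """Holds dispatch requests and releases them in key order."""

    def __init__(self, order_key):
        self._order_key = order_key
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: FIFO among equal keys

    def enqueue(self, req):
        heapq.heappush(self._heap, (self._order_key(req), next(self._seq), req))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


def drain(queue):
    """Dequeue everything and return the request names in dispatch order."""
    names = []
    while len(queue) > 0:
        names.append(queue.dequeue().name)
    return names


requests = [Request("a", deadline=10.0, exec_time=1.0),
            Request("b", deadline=12.0, exec_time=5.0)]

deadline_queue = DispatchQueue(by_deadline)
laxity_queue = DispatchQueue(by_laxity)
for req in requests:
    deadline_queue.enqueue(req)
    laxity_queue.enqueue(req)

deadline_order = drain(deadline_queue)
laxity_order = drain(laxity_queue)
```

The two orderings differ here: request "a" has the earlier deadline, but request "b" has less slack once its longer execution time is counted.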
22Kokyu Threads
- Dispatch Threads
- May be single, or thread pools
- All threads in a lane are at same priority
- Priorities differ between lanes
[Figure: Dispatcher with dispatch threads serving two lanes, each labeled laxity]
23Kokyu Timers
- Timers
- Distinguished by interval, lane
- Currently separated by priority
- Logical timer per distinct rate in each lane
- Future optimized mapping of logical timers onto
physical timers
[Figure: Dispatcher with timers feeding two lanes, each labeled laxity]
24Flexible Event Scheduling
[Figure: Event Channel between Suppliers and Consumers, containing Proxy, Filtering, and Correlation stages and a Dispatcher with static and laxity queues]
- Threads are all from dispatcher
- Timers trigger suppliers
- Suppliers push to EC
- Events land in queues
- Worker threads pull from queues and push to
consumers
25nORB RMS Request Scheduling
Servants
Clients
nORB
- Innovations and Differences
- Passing GIOP messages instead of events
- With RMS, have FIFO within each lane
- So, we can dispense with queues
- Still have priority threads
- Client may be timer-triggered
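A rough sketch of the rate-monotonic (RMS) idea behind this slide: each distinct rate gets its own lane, higher rates get higher priority, and requests within a lane are served FIFO. The function names are hypothetical and this is not nORB code. The utilization test is the classic Liu and Layland sufficient bound, U <= n(2^(1/n) - 1).

```python
def rms_lanes(rates_hz):
    """Map each distinct rate to a lane priority (larger number = higher)."""
    distinct = sorted(set(rates_hz), reverse=True)
    top = len(distinct) - 1
    return {rate: top - i for i, rate in enumerate(distinct)}


def liu_layland_bound(n):
    """Utilization bound below which n periodic tasks are RMS-schedulable."""
    return n * (2 ** (1.0 / n) - 1)


def rms_schedulable(tasks):
    """Sufficient (not necessary) test; tasks is a list of (exec_time, period)."""
    utilization = sum(c / t for c, t in tasks)
    return utilization <= liu_layland_bound(len(tasks))


lanes = rms_lanes([1, 5, 10, 20, 5])
```

Because priorities are fixed by rate, ordering within a lane never changes at run time, which is why a FIFO lane can do without a separate ordering queue.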
26nORB Flexible Request Scheduling
Servants
Clients
Dispatcher
nORB
- Add dynamic scheduling to nORB
- Simply configure the dispatcher with queues for laxity or deadline ordering
- For any FIFO lane, can still omit queue
- Also may only queue on server side!
- So, the queues are an aspect!
27Groups of Events and Threads
Servants
- A new concurrency abstraction
- Distributable Thread (DT) can cross distribution boundaries
- Traverses (and returns from!) OS threads, sockets, queues
- A DT may span different threads and queues on an endsystem
- Semantics defined in DSRT CORBA 2.0
- Follows end-to-end computational path
- Generalize Kokyu ? scheduling support for multiple concurrency models (at once?)
- Key issue: DTs may start between application entry points
28Event Dispatching Semantics
Consumers
Suppliers
One Lane (possibly of many)
- Key (potential) scheduling points
- Occur at well defined middleware entry points
- When a timer goes off
- When an event is queued
- When an event is pushed
descriptor
event
- Command Pattern
- Packages event with descriptor
- Descriptor holds ordering info
- Event holds delivery info
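The Command pattern packaging described above might look like the following minimal sketch. The class and field names (`Descriptor`, `DispatchCommand`, `priority`, `deadline`) are assumptions for illustration, as is the convention that a lower priority number is more urgent.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(order=True)
class Descriptor:
    """Ordering information only: compared field by field, in this order."""
    priority: int    # assumed convention: lower number = more urgent
    deadline: float


@dataclass
class DispatchCommand:
    """Packages an event with its descriptor and a way to deliver it."""
    descriptor: Descriptor
    event: Any
    deliver: Callable[[Any], None]

    def execute(self):
        self.deliver(self.event)


received = []
commands = [
    DispatchCommand(Descriptor(1, 30.0), "status", received.append),
    DispatchCommand(Descriptor(0, 50.0), "alarm", received.append),
    DispatchCommand(Descriptor(1, 10.0), "track", received.append),
]

# The dispatcher looks only at descriptors; it never inspects the events.
for command in sorted(commands, key=lambda c: c.descriptor):
    command.execute()
```

Keeping ordering data in the descriptor and delivery data in the event lets the same dispatcher schedule any payload that can be wrapped in a command.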
29GIOP Message Dispatching Semantics
Servants
Clients
One Lane (possibly of many)
- Key (potential) scheduling points
- Still at well defined middleware entry points
- When a timer goes off
- When a message is sent over network
- When a message is queued
- When a message is delivered
descriptor
GIOP msg
- Command Pattern
- Packages GIOP message with descriptor
- Descriptor holds ordering info
- GIOP message holds delivery info
30Distributable Thread Dispatching Semantics
Distributable Threads
One Lane (possibly of many)
- Key (potential) scheduling points
- May occur within application components!
- So, can't rely on command pattern!
- Must be GUID aware!
- May need additional tables of DT mapping
- Some preemption strategies may require a queue
- May not cover all cancellation points!
descriptor
CV
- Command Pattern
- Packages condition variable (CV) with descriptor
- Descriptor holds ordering info
- CV is GUID aware, allows thread to block on it
- CV logically carries blocked thread through queue
- DT may block on CV after it enqueues command
- But thread to which it's mapped may or may not block
- Alternative cancellation strategies: mark-yield/rollback/roll-forward/return/
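The idea of a command that carries a blocked thread through the queue can be sketched with ordinary Python threads. This is an illustrative assumption, not the RTCORBA 2.0 or Kokyu implementation: `threading.Event` stands in for the GUID-aware condition variable, deadline ordering stands in for the descriptor's ordering info, and `DTCommand`, `DTDispatcher`, and `run_demo` are hypothetical names.

```python
import heapq
import threading
import time


class DTCommand:
    """Packages a blocking primitive with ordering info and the DT's GUID."""

    def __init__(self, guid, deadline):
        self.guid = guid
        self.deadline = deadline
        self.released = threading.Event()  # stand-in for the CV
        self.finished = threading.Event()

    def __lt__(self, other):
        return self.deadline < other.deadline


class DTDispatcher:
    def __init__(self):
        self._heap = []
        self._lock = threading.Lock()

    def enqueue_and_block(self, cmd):
        """Called by the thread carrying the DT: enqueue, then block."""
        with self._lock:
            heapq.heappush(self._heap, cmd)
        cmd.released.wait()

    def pending(self):
        with self._lock:
            return len(self._heap)

    def release_next(self):
        """Wake the most urgent blocked thread and wait for it to finish."""
        with self._lock:
            cmd = heapq.heappop(self._heap)
        cmd.released.set()
        cmd.finished.wait()
        return cmd.guid


def run_demo():
    dispatcher = DTDispatcher()
    order = []

    def body(cmd):
        dispatcher.enqueue_and_block(cmd)  # thread blocks until dispatched
        order.append(cmd.guid)             # the scheduled segment runs
        cmd.finished.set()

    cmds = [DTCommand("dt-a", 30.0), DTCommand("dt-b", 10.0),
            DTCommand("dt-c", 20.0)]
    threads = [threading.Thread(target=body, args=(c,)) for c in cmds]
    for t in threads:
        t.start()
    while dispatcher.pending() < len(cmds):  # wait until all are enqueued
        time.sleep(0.001)
    for _ in cmds:
        dispatcher.release_next()
    for t in threads:
        t.join()
    return order
```

The sketch covers only the blocking case. Cancellation, and the case where the underlying thread keeps running while the DT is logically blocked, would need the additional mapping tables mentioned on the slide.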
31Next Groups of Events and Threads
[Figure: DTs passing between servants through RTCORBA 2.0 Dispatchers in TAO, with one hop marked "DT?"]
- Need a unified approach to scheduling distributed invocations
- Per RTCORBA 2.0, a one-way call from a DT spawns a new DT
- So, does a supplier event from a DT carry a (possibly very short lived) DT?
- Also may want to coordinate groups of events and DTs
- Group scheduling semantics
- Different groups may have different progress semantics
- Need a unifying scheduling model
- Need hierarchical composition of scheduling within that model
- Need flexible composition of dispatching infrastructure aspects
- Goals: distributed assurances and optimization (ultimately end-to-end)
32Real-Time, Adaptive
- Open Research Questions
- Can we adapt QoS effectively in a dynamic environment?
- Can we adapt QoS accurately in a dynamic environment?
- Can we make adaptation rigorous?
- Ongoing collaborations with Boeing
- ASFD/ASTD/WSOA (AFRL WSSTS)
- Program in its final stages, flew in December
- Final lab experiments completed
- WSOA results suggest applying Control Theory
- Linear control of requested compression levels
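One way to picture linear control of the requested level is a proportional step driven by how early or late the previous tile arrived. The sketch below is a hypothetical illustration, not the WSOA controller: the gain, the 1..100 quality range, the per-tile time budget, and the assumption that a higher Q level means less compression are all made up for the example.

```python
def next_quality(current_q, tile_time, tile_budget,
                 gain=20.0, q_min=1, q_max=100):
    """Proportional step: slack raises the requested Q level, lateness lowers it."""
    slack_fraction = (tile_budget - tile_time) / tile_budget
    proposed = current_q + gain * slack_fraction
    # Clamp to the legal range so the request is always valid.
    return max(q_min, min(q_max, round(proposed)))


early = next_quality(50, tile_time=0.8, tile_budget=1.0)       # tile finished early
late = next_quality(50, tile_time=1.5, tile_budget=1.0)        # tile finished late
saturated = next_quality(99, tile_time=0.1, tile_budget=1.0)   # clamped at q_max
```

A purely proportional rule like this is easy to reason about, but choosing the gain rigorously is exactly the kind of question that control theory is meant to answer.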
33Weapon System Open Architecture
Key Operational Themes
Collaborative sensor-to-shooter en-route planning
- Collaborative Strike Planning
- Time Critical Target Prosecution
- Browser Image Interface
- JTIDS/SATCOM links used to support tactical Net Meeting
- Demonstrate JTIDS imagery standard (potential add-on)
First extension of Bold Stroke virtual
backplane across multiple systems
34WSOA Distributed System Architecture
35WSOA QuO Adaptation Domain
- Request higher Q level on next tile
- Finish early
- Request more bandwidth on next tile
- Request higher priority
- Request lower Q level on next tile
- Notify application
Image A
Image B
36WSOA Preliminary Results
- Three experimental trials
- QuO+RTARM
- Image compression
- Schedule decompression
- Meets deadlines well
- Aggressive compression
- QuO alone
- Image compression only
- Misses deadlines
- Similar compression
- Naïve Control
- Aim for on-time point
- Misses deadlines
- Better compression fit
- Conclusions
- Potential benefit of control
- Need rigorous control
- Integrated w/ WSOA MW
37Can We Make Adaptation Rigorous?
- Adaptation Infrastructure
- Reasonable responsiveness
- Manageable overhead
- But, control is still ad hoc
- How do we improve?
- Apply Control Theory
- E.g., FCS (Lu, Stankovic)
- FCS with Kokyu/nORB
- New project at WU DOC
- Paper submitted to RTAS
- Architectures and patterns
- Where do we put control?
- How do we integrate it?
- WSOA 2 and beyond
[Figure: control loop among Controlled Plant, Middleware, and Controller(s), with the placement of control marked "?"]
38Real-Time and Fault-Tolerance
- Open Research Questions
- Can we trade off fault-tolerance and real-time properties?
- What are the pragmatic limitations in a COTS world?
- Collaboration with Lockheed Martin
- LMCO Egan MINERS Quality Connector Façade
- Washington University FT/RT Event Channel
- LMCO ATL SCTP pluggable protocol (SCIOP)
39FT/RT Event Channel
[Figure: Suppliers push to a primary event channel, which replicates to replica event channels; Consumers subscribe and receive pushes; the event channels share an alive bus with the Replication Manager, which sends the IOGR to the Naming Service]
- Provide fault-tolerance (fail-stop) within real-time constraints
- Offer useful configuration knobs to Quality Connectors
- Replicas: where and how many, transactional replication depths
- Possibly others, e.g., connection topology for crash detection
- Leverage new technologies
- I.e., LMCO ATL's SCTP Inter-ORB Protocol (SCIOP) for ACE
40First Goal: Stream Persistence
- Crash an event channel
- If replica, no effect to suppliers or consumers
- If primary, replica allows all event streams to
resume
41Fault-Detection and Fail-Over
- Maintain connections
- Connected means alive
- Among primary/replicas
- To replication manager service
- Currently uses TCP
- Planned Improvements
- Use SCTP
- Tune SCTP heartbeat
- Replication management as a distributed protocol
[Figure: primary and replica event channels connected to the Replication Manager over an alive bus]
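The detection and fail-over logic on this slide can be approximated by a small membership tracker. This is a sketch under assumptions, not the FT/RT Event Channel code: explicit heartbeats with a fixed timeout stand in for the connection-based liveness described above, time is passed in explicitly so the example is deterministic, and `ReplicationManager` with its method names is hypothetical.

```python
class ReplicationManager:
    """Tracks liveness of event channels; the first live member is primary."""

    def __init__(self, members, timeout):
        self.timeout = timeout
        self.members = list(members)
        self.last_seen = {m: 0.0 for m in members}

    def heartbeat(self, member, now):
        """Record that `member` was heard from at time `now`."""
        if member in self.last_seen:
            self.last_seen[member] = now

    def primary(self):
        return self.members[0] if self.members else None

    def check(self, now):
        """Drop members whose heartbeat is overdue; return the primary."""
        dead = [m for m in self.members
                if now - self.last_seen[m] > self.timeout]
        for m in dead:
            self.members.remove(m)
            del self.last_seen[m]
        return self.primary()


manager = ReplicationManager(["ec-primary", "ec-replica-1", "ec-replica-2"],
                             timeout=3.0)
for name in ["ec-primary", "ec-replica-1", "ec-replica-2"]:
    manager.heartbeat(name, now=1.0)
before = manager.check(now=2.0)

# The primary crashes: only the replicas keep reporting in.
manager.heartbeat("ec-replica-1", now=4.0)
manager.heartbeat("ec-replica-2", now=4.0)
after = manager.check(now=5.0)
```

The timeout is the knob that trades detection speed against false positives, which is the role a tuned transport-level heartbeat would play in the planned improvements.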
42RT/FT/Programming Trade-Offs
- Moderately Stable State
- Supplier/Consumer Registration
- Transaction depth vs. timeliness
- Future Transient State
- Individual Events
- May require another approach, i.e., parallel data paths
- Message vs. State Replication
- EC has many interfaces
- Message replication
- Bound transaction depth
- A New Vision for EC FT/RT
- Bold Stroke architecture
- Message replication for events
- Semi-Active data replication
- Will SCIOP take us farther?
- Rapid failure detection
- Event replication/recovery?
- Slow or soft real time
[Figure: a primary event channel receives subscribe requests and propagates them to replica event channels using assured-replicate and soft-replicate paths]
43Summary and Concluding Remarks
- The Big Picture
- Model Integrated Component Middleware
- Aspect frameworks for middleware infrastructure
- Focus on quality of service dimensions
- And increasingly on interactions between them
- An Invitation
- Send abstracts/papers to OMG/RTAS workshops
- Help lead/propel/shape standards at the OMG
- QoS CCM, RTCORBA 2.0, MDA
- What are your current/next open questions?
- Are there opportunities for further collaboration?