Title: Fault Tolerance in CORBA and Wireless CORBA
1- Fault Tolerance in CORBA and Wireless CORBA
- Chen Xinyu
- 18/9/2002
2Outline
- Introduction to CORBA and Wireless CORBA
- What is Fault Tolerance
- Fault Tolerant CORBA
- Fault Tolerance in Wireless CORBA
- Conclusion
- Future Work
3What is CORBA
- Common Object Request Broker Architecture
- A Distributed Object Computing (DOC) open standard
- Compare to platform/language-specific alternatives
- e.g., Java RMI, Microsoft's DCOM
- A language-neutral environment
- A middleware infrastructure specification
- Administered by the Object Management Group
- a.k.a., the OMG
4Wireless CORBA Architecture
- Keeps track of the associated access bridges
- Redirects requests for services on the terminal
- Abstract transport-independent tunnel for GIOP messages
- Concrete tunnels for TCP/IP, UDP/IP and WAP
- Only one GIOP tunnel
- Encapsulates, forwards or ignores incoming GIOP messages
- Decapsulates and forwards messages from the GIOP tunnel
- Generates mobility events
- Lists available services
- Similar to the Access Bridge
- Does not provide forwarding
- Generates mobility events
- Does not list services
Source: Telecom Wireless CORBA, OMG Document dtc/01-06-02
5Wireless CORBA
[Diagram: end-to-end protocol path labeled GIOP, GTP, GIOP, IIOP, with an Access Point, a GTP Tunnel and a TCP/IP Network]
CORBA objects may be invoked anywhere along the end-to-end path
6Fault, Error and Failure
Fault tolerance is the ability of a system to continue providing its specified service despite component failure
- Fault: an anomalous condition occurring in the system hardware or software
- Error: the part of the system state that is liable to lead to a failure
- Failure: occurs when the delivered service of a system or a component deviates from its specification
7Fault Tolerant CORBA Architecture
Source: Bell Labs Research
8Object Replication Styles
- Passive Replication
- Only one replica processes each request, other replicas are available as backups
- Lower memory and processing costs
- Slower recovery from faults
- Duplicate message detection during recovery from faults
- Active Replication
- Several replicas process each request
- Faster recovery from faults
- State transfer to initialize new replicas
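The two styles above can be illustrated with a minimal sketch. It is not the Fault Tolerant CORBA API: the classes `Replica`, `PassiveGroup` and `ActiveGroup`, the counter object and the request ids are all hypothetical, and the passive group copies state to its backups after every request purely to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical replicated counter object."""
    name: str
    state: int = 0
    alive: bool = True

    def add(self, amount: int) -> int:
        self.state += amount
        return self.state


class PassiveGroup:
    """Only the primary processes requests; backups receive its state."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.seen = {}  # request id -> reply, used for duplicate detection

    def primary(self):
        # Assumption: the first live replica acts as primary.
        return next(r for r in self.replicas if r.alive)

    def invoke(self, request_id, amount):
        if request_id in self.seen:           # retried request after failover
            return self.seen[request_id]
        primary = self.primary()
        reply = primary.add(amount)
        for backup in self.replicas:
            if backup.alive and backup is not primary:
                backup.state = primary.state  # state transfer to backups
        self.seen[request_id] = reply
        return reply


class ActiveGroup:
    """Every live replica processes every request in the same order."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def invoke(self, amount):
        replies = [r.add(amount) for r in self.replicas if r.alive]
        return replies[0]  # replies are identical, so only one is returned


passive = PassiveGroup([Replica("p1"), Replica("p2")])
passive.invoke("req-1", 5)
passive.replicas[0].alive = False             # primary crashes
passive_reply = passive.invoke("req-1", 5)    # client retry is not re-executed
passive_next = passive.invoke("req-2", 1)     # backup has taken over

active = ActiveGroup([Replica("a1"), Replica("a2")])
active.invoke(5)
active.replicas[0].alive = False              # one replica crashes
active_reply = active.invoke(1)               # the survivor answers immediately
```

In the passive group the retried `req-1` is answered from the duplicate table instead of being applied twice; in the active group the surviving replica already holds the current state, so no failover step is needed.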
9Passive Replication
[Diagram: Client invokes a method of Server A. Labels: Server A and Server B, each with a primary replica; ORBs]
Source: Eternal Systems, Inc
10Active Replication
[Diagram: Client invokes a method of Server A. Labels: Server A, Server B, ORBs, reliable totally ordered multicast]
Source: Eternal Systems, Inc
11Device, Wireless Mobile Issues
- Device Issues
- Slow processor
- Small memory
- Small disk space
- Low power supply
- Physical damage
- Wireless Issues
- High bit error rate
- Little bandwidth
- Long transfer delay
12Recovery Scheme
- Uncoordinated checkpointing
- time
- predefined number of messages
- Pessimistic message logging
- no extra communication overhead
- Independent rollback recovery
- only failed objects roll back
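A minimal sketch of how the three parts of this scheme fit together, under stated assumptions: the `LoggedObject` class, its integer messages and the `checkpoint_every` parameter are hypothetical, in-memory fields stand in for stable storage, and only the message-count trigger is modeled (not the time trigger).

```python
import copy


class LoggedObject:
    """Hypothetical object that checkpoints on its own and logs each message first."""

    def __init__(self, checkpoint_every=3):
        self.state = {"total": 0}
        self.checkpoint_every = checkpoint_every  # assumed message-count trigger
        self.stable_checkpoint = copy.deepcopy(self.state)
        self.stable_log = []       # messages received since the last checkpoint
        self.since_checkpoint = 0

    def _apply(self, message):
        self.state["total"] += message

    def deliver(self, message):
        # Pessimistic logging: the message is logged before it is processed.
        self.stable_log.append(message)
        self._apply(message)
        self.since_checkpoint += 1
        if self.since_checkpoint >= self.checkpoint_every:
            self.take_checkpoint()

    def take_checkpoint(self):
        # Uncoordinated: no other object is consulted.
        self.stable_checkpoint = copy.deepcopy(self.state)
        self.stable_log.clear()
        self.since_checkpoint = 0

    def crash(self):
        self.state = None  # volatile state is lost

    def recover(self):
        # Independent rollback: restore own checkpoint, then replay own log.
        self.state = copy.deepcopy(self.stable_checkpoint)
        for message in self.stable_log:
            self._apply(message)
        self.since_checkpoint = len(self.stable_log)


obj = LoggedObject(checkpoint_every=3)
for m in [1, 2, 3, 4, 5]:
    obj.deliver(m)
before_crash = obj.state["total"]
obj.crash()
obj.recover()
after_recovery = obj.state["total"]
```

Because every message is logged before it is applied, the recovered object reaches the same state it had before the crash without asking any other object to roll back.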
13Fault Tolerance Architecture
[Diagram: Mobile Host, Access Bridge and Remote Server, each with an ORB, a Recovery Mechanism and a Platform; the labels also include one Terminal Bridge and two Logging Mechanisms]
14Checkpoint and Logs Collection Strategies
- Pessimistic
- checkpoint and logs are transferred during handoff
- generates heavy volume of data transfer
- Lazy
- creates a linked list of Access Bridges
- complicated recovery
- Frequency-based
- the number of handoffs
- Distance-based
- the distance between mobile host and the Access Bridge carrying its latest checkpoint
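The four strategies differ only in when a handoff triggers a transfer of the checkpoint and logs. The sketch below makes that decision explicit. Everything in it is an assumption made for illustration: the `MobileHostRecord` class, the `threshold` parameter, Access Bridges identified by integer positions, and distance measured as the difference between positions.

```python
class MobileHostRecord:
    """Hypothetical bookkeeping for where a mobile host's checkpoint and logs live."""

    def __init__(self, first_bridge, strategy, threshold=2):
        self.strategy = strategy      # "pessimistic", "lazy", "frequency" or "distance"
        self.threshold = threshold    # assumed limit for the last two strategies
        self.current = first_bridge
        self.checkpoint_at = first_bridge
        self.chain = [first_bridge]   # bridges still holding recovery data
        self.handoffs_since_transfer = 0
        self.transfers = 0

    def _should_transfer(self, distance):
        if self.strategy == "pessimistic":
            return True               # transfer on every handoff
        if self.strategy == "lazy":
            return False              # only extend the linked list
        if self.strategy == "frequency":
            return self.handoffs_since_transfer >= self.threshold
        if self.strategy == "distance":
            return distance(self.current, self.checkpoint_at) >= self.threshold
        raise ValueError(f"unknown strategy: {self.strategy}")

    def handoff(self, new_bridge, distance=lambda a, b: abs(a - b)):
        self.current = new_bridge
        self.handoffs_since_transfer += 1
        self.chain.append(new_bridge)
        if self._should_transfer(distance):
            # Collect checkpoint and logs at the new Access Bridge.
            self.checkpoint_at = new_bridge
            self.chain = [new_bridge]
            self.handoffs_since_transfer = 0
            self.transfers += 1


def run(strategy, path=(0, 1, 2, 3, 4)):
    """Return (number of transfers, length of the bridge chain) after walking the path."""
    record = MobileHostRecord(path[0], strategy)
    for bridge in path[1:]:
        record.handoff(bridge)
    return record.transfers, len(record.chain)


results = {name: run(name) for name in ("pessimistic", "lazy", "frequency", "distance")}
ping_pong = run("distance", path=(0, 1, 0, 1, 0))
```

On the straight path the pessimistic strategy transfers on all four handoffs, while the lazy one never transfers and ends with a chain of five bridges to walk during recovery. On the ping-pong path the distance-based strategy never transfers, because the host never moves far from the bridge holding its checkpoint.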
15Mobile Host Crash
16Mobile Host Crash
17Mobile Host Crash
18Mobile Host Crash
19Conclusion
- Fault Tolerant CORBA is based on Object Replication
- Fault tolerance in Wireless CORBA is based on Rollback-Recovery Protocol
- Checkpoint and message logs collection is important in Wireless CORBA
20Future Work
- Low-cost Checkpointing Algorithm
- forces a minimum number of objects to take checkpoints
- minimizes the number of synchronization messages
- makes checkpointing nonblocking
- Failure Detection in Wireless Environment
21Question and Answer
22Thank You