Lightweight Fault Tolerance for Distributed Real-Time Systems

1

Lightweight Fault Tolerance for Distributed Real-Time Systems
Revised Submission realtime/2008-05-01
Ottawa OMG Technical Meeting, June 2008
Realtime/08-06-01
Andy Foster, PrismTech
Robert Kukura, PrismTech
2
Contents
  • Overview
  • Basic Operation
  • CORBA-specific Details
  • Fault Tolerance
  • LWFT Interfaces
  • Summary of ORB Changes
  • Revisions Made Since the Initial Proposal
  • Conclusions

3
Overview
  • Objective: to define a new Lightweight Fault
    Tolerance for Distributed Real-Time Systems
    specification.
  • The existing FT CORBA specification is not
    compatible with the RT CORBA specification.
  • It also requires the implementation of the IOGR
    type.
  • Here we propose an alternative to FT CORBA that:
  • Makes use of location forwarding techniques to
    provide indirect binding to Server replicas.
  • Uses the standard multi-profile IOR.
  • Requires minimal ORB modifications.
  • Is compatible with the RT CORBA specification.
  • Supports legacy ORB clients

4
Basic Operation
  • Based on the concept of redirecting Clients to a
    suitable replicated Server entity.
  • Servers are registered to a central Registry
    which records their endpoint information.
  • Servers substitute the endpoint information of a
    Forwarder component into any IORs that they
    produce.
  • As a result, clients using these IORs will be
    directed to the Forwarder, which will forward
    them to the IOR of the Server replica that they
    should use.

5
Basic Operation
  • Replication is managed on a per-process
    granularity.
  • Primary focus is on keeping track of a
    registration of transport endpoints that are to
    host fault tolerant entities.
  • Will allow the querying of registered endpoints
    regarding individual objects hosted there. This
    is required for compatibility with Real-time
    CORBA.
  • Does not attempt to handle application state
    consistency.
  • Mechanisms are provided to allow applications to
    manage their own state consistency.
  • Focuses on ensuring the safe delivery of requests
    and responses.

6
Basic Operation
[Diagram: the Client, the Forwarder, the Registry, and Server replicas 1A, 1B, and 1C, connected by the numbered interactions 1 to 6 listed here.]
1. Servers register themselves with the Registry.
2. The Forwarder accesses the Registry's records to
update its own.
3. The Client requests access to a Fault Tolerant
Object on Server 1.
4. The Forwarder selects a replica (1B) to
handle this request.
5. Server 1B returns an object reference to the
Forwarder...
6. ...which is passed to the Client in the
response to its original request.
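
The six steps can be sketched as a small in-memory simulation. This is only an illustration: the class names, the `register`, `refresh`, and `locate` methods, the addresses, and the "first registered replica wins" policy are all assumptions made for the example, not part of the proposal.

```python
class Registry:
    """Hypothetical in-memory Registry: group -> {replica name: endpoint}."""

    def __init__(self):
        self.groups = {}

    def register(self, server_id, process_id, endpoint):
        # Step 1: a Server replica records its endpoint.
        self.groups.setdefault(server_id, {})[process_id] = endpoint


class Forwarder:
    """Hypothetical Forwarder holding its own copy of the Registry's records."""

    def __init__(self, registry):
        self.registry = registry
        self.records = {}

    def refresh(self):
        # Step 2: update local records from the Registry.
        self.records = {g: dict(r) for g, r in self.registry.groups.items()}

    def locate(self, server_id):
        # Steps 3-6: choose a replica and return its direct reference.
        replicas = self.records[server_id]
        process_id = next(iter(replicas))  # assumption: first registered wins
        return process_id, replicas[process_id]


registry = Registry()
for name, port in (("1A", 5001), ("1B", 5002), ("1C", 5003)):
    registry.register("Server1", name, ("10.0.0.1", port))

forwarder = Forwarder(registry)
forwarder.refresh()
chosen = forwarder.locate("Server1")
```

A real Forwarder would answer with a LOCATION_FORWARD reply rather than a Python tuple; the tuple simply stands in for the direct reference handed back to the Client.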
7
CORBA-Specific Details
  • Object References
  • Will use regular IORs rather than FT CORBA's
    IOGRs.
  • Every replica in the same group will share a
    common Object Key.
  • Every member of the same replication group will
    share a common -ORBServerID property value, which
    should be recoverable from their Object Key.
  • Servers will use their own Object Key and the
    Forwarder's endpoint information (obtained during
    registration) in any IORs that they create for FT
    Objects.
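
To make the substitution concrete, here is a minimal sketch of how a Server might assemble such a reference. The `Profile` and `Ior` classes are simplified stand-ins invented for this example (a real IOR is a CDR-encoded structure), and the repository ID, key bytes, and host names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Profile:
    """Simplified stand-in for one IIOP profile."""
    host: str
    port: int
    object_key: bytes


@dataclass(frozen=True)
class Ior:
    """Simplified stand-in for a multi-profile IOR."""
    type_id: str
    profiles: Tuple[Profile, ...]


def make_ft_ior(type_id, object_key, forwarder_endpoints):
    # The Server's own Object Key is paired with the Forwarder's endpoints,
    # so clients that use this reference contact the Forwarder first.
    return Ior(type_id, tuple(Profile(host, port, object_key)
                              for host, port in forwarder_endpoints))


forwarders = [("fwd-a.example", 2809), ("fwd-b.example", 2809)]
key = b"Server1/AccountPOA/42"  # hypothetical key shared by the whole group
ior_from_1a = make_ft_ior("IDL:Demo/Account:1.0", key, forwarders)
ior_from_1b = make_ft_ior("IDL:Demo/Account:1.0", key, forwarders)
```

Because every replica in a group shares the Object Key and receives the same Forwarder endpoints, the references produced by replicas 1A and 1B compare equal in this sketch.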

8
CORBA-Specific Details
  • FT_Locate()
  • Fully compatible ORBs will contact the Forwarder
    using the new reserved operation name
    'FT_Locate'. The only correct, non-exceptional
    response to it is a reply with a
    LOCATION_FORWARD status value.
  • A response to an FT_Locate call containing a
    BAD_OPERATION exception should be interpreted as
    meaning the object exists, but the server ORB
    does not support LOCATION_FORWARD.
  • In this situation, the Forwarder can build an IOR
    for that Object from the Server's registration
    data, but it will not be suitable for use in
    Real-time CORBA applications.
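
The two reply cases can be sketched as a small decision function. The `ReplyStatus` values follow the GIOP reply status ordering; the function name, its parameters, and the returned `rt_compatible` flag are assumptions introduced for this example.

```python
from enum import Enum


class ReplyStatus(Enum):
    NO_EXCEPTION = 0
    USER_EXCEPTION = 1
    SYSTEM_EXCEPTION = 2
    LOCATION_FORWARD = 3


def resolve_ft_locate(status, forwarded_ref=None, exception_name=None,
                      build_from_registration=None):
    """Return (reference, rt_compatible) for an FT_Locate reply."""
    if status is ReplyStatus.LOCATION_FORWARD:
        # The only correct, non-exceptional outcome.
        return forwarded_ref, True
    if (status is ReplyStatus.SYSTEM_EXCEPTION
            and exception_name == "BAD_OPERATION"):
        # The object exists but the server ORB cannot forward: fall back to
        # a reference built from registration data (not suitable for RT use).
        return build_from_registration(), False
    raise RuntimeError(
        f"unexpected FT_Locate reply: {status.name} {exception_name}")
```

For example, `resolve_ft_locate(ReplyStatus.SYSTEM_EXCEPTION, exception_name="BAD_OPERATION", build_from_registration=lambda: "built-ior")` yields the rebuilt reference together with `False`, marking it as unsuitable for Real-time CORBA applications.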

9
CORBA-Specific Details
  • Behaviour of the Forwarder
  • Upon receiving an FT_Locate request, the
    Forwarder will decode the incoming Object Key.
  • Using the extracted -ORBServerID value to
    identify the replication group, the Forwarder
    will select an appropriate Server to use (each
    replica will have been initialised with its own
    unique -ORBProcessID property value).
  • The Forwarder will use the recorded endpoint
    information for that Server to send an FT_Locate
    request to it. The response to this request will
    be a LOCATION_FORWARD which points directly to
    the location of an FT Object.
  • The Forwarder will now return a LOCATION_FORWARD
    reply to the Client which points to this received
    location.
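
The Forwarder's handling of an FT_Locate request can be sketched as follows. All collaborators are passed in as plain callables so the sketch stays self-contained; the function name, the dictionary-based key contents, and the demo key format are assumptions, since real Object Key layouts are ORB-specific.

```python
def handle_ft_locate(object_key, decode_key, select_process, endpoints,
                     send_ft_locate):
    """Hypothetical Forwarder logic for one incoming FT_Locate request."""
    # 1. Decode the incoming Object Key.
    contents = decode_key(object_key)
    if contents is None:
        raise LookupError("object key could not be decoded")
    # 2. The server ID names the replication group; pick one replica from it.
    process_id = select_process(contents["server_id"])
    # 3. Forward FT_Locate to that replica using its recorded endpoint.
    direct_ref = send_ft_locate(endpoints[process_id], object_key)
    # 4. Relay the replica's answer to the Client as a LOCATION_FORWARD.
    return ("LOCATION_FORWARD", direct_ref)


# Demo stand-ins (all hypothetical).
def demo_decode(key):
    server_id = key.split(b"/")[0].decode()
    return {"server_id": server_id} if server_id else None


demo_endpoints = {"1A": ("10.0.0.1", 5001), "1B": ("10.0.0.1", 5002)}
reply = handle_ft_locate(
    b"Server1/AccountPOA/42",
    decode_key=demo_decode,
    select_process=lambda server_id: "1B",
    endpoints=demo_endpoints,
    send_ft_locate=lambda endpoint, key: ("direct", endpoint, key),
)
```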

10
CORBA-Specific Details
11
Fault Tolerance
  • Fault Detection using FT CORBA's FT_HB()
    operation.
  • '23.2.9 Transport Heartbeats' in the CORBA
    specification.
  • Replicas detected to have failed will have their
    registrations removed from the Registry.
  • Fault Recovery using the standard CORBA IOR
    fall-back mechanisms.
  • If a Server replica fails, the direct IOR passed
    to the client by the Forwarder will no longer
    work.
  • The client will simply fall back to the original
    IOR as per normal transparent reinvocation
    behaviour, which will result in it requesting
    another replica's IOR from the Forwarder.
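
The recovery path can be sketched from the client's point of view. In practice the ORB performs this reinvocation transparently; the `FtClient` class, its `locate` and `send` callables, and the retry limit are assumptions made to keep the example self-contained.

```python
class FtClient:
    """Hypothetical client-side view of the fall-back behaviour."""

    def __init__(self, locate, send):
        self._locate = locate  # asks the Forwarder (original IOR) for a replica
        self._send = send      # sends a request to a direct reference
        self._direct = None    # forwarded reference currently in use

    def invoke(self, request, max_attempts=3):
        for _ in range(max_attempts):
            if self._direct is None:
                self._direct = self._locate()
            try:
                return self._send(self._direct, request)
            except ConnectionError:
                # The replica has failed: drop the direct reference and fall
                # back to the original IOR, which leads to the Forwarder.
                self._direct = None
        raise ConnectionError("no replica reachable")


alive = {"1B"}
candidates = iter(["1A", "1B"])  # replicas the Forwarder hands out in turn


def demo_send(endpoint, request):
    if endpoint not in alive:
        raise ConnectionError(endpoint)
    return f"{endpoint} handled {request}"


client = FtClient(lambda: next(candidates), demo_send)
first_reply = client.invoke("get_balance")
second_reply = client.invoke("deposit")
```

The first call fails against replica 1A, falls back, and succeeds against 1B; the second call reuses the cached direct reference without consulting the Forwarder again.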

12
LWFT Interfaces
  • Server endpoint data is represented in the system
    using the following data structures
  • Endpoint
  • Location
  • The key components which form this mechanism are
  • LWFTRegistry
  • LWFTRegistryAdmin
  • LWFTForwarder
  • LWFTObjectKeyDecoder
  • LWFTProcessSelector

13
LWFT Interfaces - Data Structures
  • Endpoint
  • struct Endpoint {
      Location endpoint_key;
      Object profiles;
    };
  • Used to represent an accessible Server endpoint
    in the system.
  • Built from the information gained by decoding an
    Object Key into its constituent parts.

14
LWFT Interfaces - Data Structures
  • Location
  • struct Location {
      ObjectLifeSpan lifespan;
      unsigned long vendor_orb_id;
      string server_id;
      string orb_id;
      CORBA::StringSeq poa_names;
      CORBA::OctetSeq object_id;
    };
  • Decoded Object Keys will contain some or all of
    these variables (the more the better).
  • The Object Key must contain the server_id
    parameter in order to identify the requested
    replication group.
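
As an illustration of the "some or all" rule, the sketch below decodes a made-up key layout into a Location-like record and rejects keys without a server ID. The `|`-separated layout is purely hypothetical (real Object Key formats are vendor-specific), and the `lifespan` field is omitted because its values are not given here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Location:
    """Python stand-in for the Location struct (lifespan omitted)."""
    server_id: str
    orb_id: str = ""
    poa_names: Tuple[str, ...] = ()
    object_id: bytes = b""
    vendor_orb_id: int = 0


def decode_key(object_key: bytes) -> Optional[Location]:
    """Decode the hypothetical layout server_id|orb_id|poa/path|object_id."""
    parts = object_key.split(b"|", 3)
    if len(parts) != 4 or not parts[0]:
        # Without a server_id the replication group cannot be identified.
        return None
    server_id, orb_id, poa_path, object_id = parts
    poa_names = tuple(p.decode() for p in poa_path.split(b"/") if p)
    return Location(server_id.decode(), orb_id.decode(), poa_names, object_id)


decoded = decode_key(b"Server1|orbA|Root/Accounts|\x00\x2a")
```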

15
LWFT Interfaces - Registry
  • LWFTRegistry
  • Manages Server replica registrations and passes
    them the Forwarder endpoint information to use in
    their IORs.
  • Responsible for listening to Servers' still_alive
    calls and deregistering them should a timeout
    occur.
  • Other Registry components can be registered as
    listeners on a set of defined Locations. The
    level of listening used can be defined by
    policy.
  • The LWFTRegistryAdmin interface can be used by
    applications to access the Registry's records of
    registered processes, and to manipulate the order
    in which backups will be selected within
    individual replication groups.
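
The still_alive bookkeeping might look like the sketch below. The method names mirror the Registry operations, but the timeout value, the injected clock, and the explicit `expire` sweep are assumptions; the slides do not say how or when timeouts are checked.

```python
import time


class HeartbeatRegistry:
    """Hypothetical Registry fragment tracking still_alive calls."""

    def __init__(self, timeout, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock
        self._last_seen = {}  # process_id -> time of last sign of life

    def register_process(self, process_id):
        self._last_seen[process_id] = self._clock()

    def still_alive(self, process_id):
        if process_id not in self._last_seen:
            raise KeyError(process_id)  # stands in for the NotFound exception
        self._last_seen[process_id] = self._clock()

    def expire(self):
        """Deregister and return every process whose heartbeat timed out."""
        now = self._clock()
        dead = [p for p, seen in self._last_seen.items()
                if now - seen > self._timeout]
        for process_id in dead:
            del self._last_seen[process_id]
        return dead


now = [0.0]  # fake clock so the example is deterministic
reg = HeartbeatRegistry(timeout=10.0, clock=lambda: now[0])
reg.register_process("1A")
reg.register_process("1B")
now[0] = 5.0
reg.still_alive("1A")
now[0] = 12.0
expired = reg.expire()
```

Here replica 1B is removed because it has been silent for 12 seconds, while 1A survives because it reported in 7 seconds ago.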

16
LWFT Interfaces - Registry
interface Registry {
  void register_process(inout ProcessID process_id,
                        in EndpointSeq process_endpoints,
                        out EndpointSeq forwarder_endpoints)
    raises (ReplicaMismatch, ProcessIDInUse, UnParsableEndpoints);
  void deregister_process(in ProcessID process_id,
                          in string message)
    raises (NotFound);
  void still_alive(in ProcessID from) raises (NotFound);
  void process_ready(in ProcessID process_id);
  enum LISTEN_LEVEL { DEREGISTRATION, REGISTRATION, FULL };
  void register_listener(in Registry ft_registry,
                         in LISTEN_LEVEL level,
                         in LocationSeq locations);
  void shutdown(in boolean wait);
};
17
LWFT Interfaces - RegistryAdmin
  • LWFTRegistryAdmin
  • Can be used by applications to gain direct access
    to the contents of the Registry.
  • Allows applications to query the registered
    details of a Server and manipulate the order of
    backups used by individual replication groups.

18
LWFT Interfaces - RegistryAdmin
interface RegistryAdmin {
  readonly attribute Registry the_registry;
  void get_all_processes(out ProcessIDSeq all_processes);
  void get_all_locations(out LocationSeq all_locations);
  void get_location(in Location the_location,
                    out ProcessIDSeq processes,
                    out ForwardPolicy policy,
                    out RegistrySeq listeners,
                    out ForwarderSeq forwarders);
  void set_location(in Location the_location,
                    in ForwardPolicy policy,
                    in ProcessIDSeq ordered_list,
                    in ForwarderSeq forwarders);
  void get_process(in ProcessID process,
                   out EndpointSeq registered_endpoints);
};

19
LWFT Interfaces - Forwarder
  • LWFTForwarder
  • Responsible for retrieving the direct IOR of an
    FT Object for a client.
  • Uses the LWFTObjectKeyDecoder and
    LWFTProcessSelector interfaces to decode an
    ObjectKey and identify a suitable replica for the
    client to use.
  • Supports the registration of other Forwarders as
    fall-backs. If a request is received for an
    unknown replication group, the Forwarder redirects
    the client to the fall-back Forwarder using a
    LOCATION_FORWARD.

20
LWFT Interfaces - Forwarder
interface Forwarder {
  // Pseudo operation
  // void FT_Locate();
  // Pseudo operation
  // void FT_HB();
  void register_forwarder(in Forwarder fall_back,
                          in LocationSeq locations);
};
21
LWFT Interfaces - ObjectKeyDecoder and ProcessSelector
  • LWFTObjectKeyDecoder
  • Provides methods to decode an Object Key octet
    sequence into an LWFTLocation data structure.
  • Two versions of this interface will be used,
    LWFTObjectKeyDecoder and a local interface
    version LWFTObjectKeyDecoderLocal.
  • LWFTProcessSelector
  • A local interface which provides methods to
    select a replica's unique process ID value from a
    particular replication group, given that group's
    unique server ID value.
  • The order in which replicas will be selected is
    determined by the implementation of this
    interface or any interfaces which inherit from
    it.
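
One possible selection policy, sketched under stated assumptions: replicas are tried in the order they were added, so the first entry acts as the primary and the rest as backups. The class name, the `group_of` lookup (a stand-in for querying a RegistryAdmin), and the use of a plain server ID string in place of a Location are all hypothetical.

```python
class OrderedProcessSelector:
    """Hypothetical selector: earliest-added live replica is chosen first."""

    def __init__(self, group_of):
        self._group_of = group_of  # callable: process ID -> server ID
        self._order = {}           # server ID -> ordered list of process IDs

    def add_process(self, process_id):
        self._order.setdefault(self._group_of(process_id), []).append(process_id)

    def remove_process(self, process_id):
        members = self._order.get(self._group_of(process_id), [])
        if process_id in members:
            members.remove(process_id)

    def ordered_process_set(self, server_id):
        return list(self._order.get(server_id, []))

    def select_process(self, server_id):
        members = self._order.get(server_id)
        if not members:
            raise LookupError(f"no replica registered for {server_id}")
        return members[0]


groups = {"1A": "Server1", "1B": "Server1", "1C": "Server1"}
selector = OrderedProcessSelector(groups.__getitem__)
for pid in ("1A", "1B", "1C"):
    selector.add_process(pid)
primary = selector.select_process("Server1")
selector.remove_process("1A")  # e.g. after 1A is deregistered
backup = selector.select_process("Server1")
```

Other policies (round-robin, load-based, or an order set through the RegistryAdmin) would replace only `select_process` and `ordered_process_set`.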

22
LWFT Interfaces - ObjectKeyDecoder and ProcessSelector
interface ObjectKeyDecoder {
  readonly attribute unsigned long vendor_orb_id;
  boolean get_key_contents(in CORBA::OctetSeq object_key,
                           out Location key_contents);
  // For symmetry only
  boolean get_object_key(in Location contents,
                         out CORBA::OctetSeq object_key);
};

local interface ProcessSelector {
  void set_registry(in RegistryAdmin admin);
  void add_process(in ProcessID process);
  void remove_process(in ProcessID process);
  ProcessID select_process(in Location the_location);
  ProcessIDSeq ordered_process_set(in Location the_location);
};
23
Summary of ORB modifications
  • Additional Object Key parameters
  • server_id, used to uniquely identify replication
    groups, is required.
  • It is preferred that as many of the Location
    structure's parameters as possible are provided.
  • New reserved operation name FT_Locate
  • The expected response is a LOCATION_FORWARD
    reply.
  • If the requested object does exist but the
    receiving ORB does not support LOCATION_FORWARD,
    return a BAD_OPERATION exception

24
Revisions Made Since the Initial Proposal
  • Separate interfaces to handle Server registration
    and Client forwarding.
  • Introduced a new LWFTRegistryAdmin interface
  • Allows manual manipulation of the Registry and
    its records, including the ordering of backups.
  • Introduced a new LWFTProcessSelector interface
  • Allows developers to define their own methods of
    selecting which replica to use upon each
    request.
  • Can be linked to an LWFTRegistryAdmin to
    provide access to the full range of data in the
    system when making a decision.

25
Conclusions
  • There is a great need for a Fault Tolerance
    specification which is lightweight, flexible, and
    fully compatible with the Real-time CORBA
    specification.
  • This solution
  • Minimizes ORB changes to a few additional Object
    Key values and a new reserved operation name.
  • Uses regular IORs.
  • Provides replication on a per-process
    granularity.
  • Provides developers with the means to implement
    replica state consistency however they choose to
    do so (if at all).
  • Is compatible with RT CORBA.