TOTEM: A FAULT-TOLERANT MULTICAST GROUP COMMUNICATION SYSTEM

Learn more at: https://www2.cs.uh.edu

1
TOTEM: A FAULT-TOLERANT MULTICAST GROUP
COMMUNICATION SYSTEM
  • L. E. Moser, P. M. Melliar-Smith, D. A. Agarwal,
    R. K. Budhia, C. A. Lingley-Papadopoulos
  • University of California, Santa Barbara

2
INTRODUCTION
  • Totem provides reliable totally-ordered
    multicasting of messages over LANs
  • Intended for complex applications with critical
    requirements for
    • fault tolerance
    • real-time performance
  • Exploits hardware broadcast of most LANs

3
TOTEM SERVICES
  • Built as a hierarchy of protocols
    • Application layer
    • Process group interface
    • Multiple-ring protocol
    • Single-ring protocol
    • Physical medium

4
Single-ring protocol
  • Built on top of a best-effort multicast service,
    using UDP to exploit the hardware broadcasts of
    the LAN
  • Converts these multicasts into the service of
    reliable totally ordered delivery of messages on
    a single LAN
  • Also provides fault-detection, recovery and
    configuration change service

5
Multiple-ring protocol
  • Uses information from the process group interface
    above it
  • Provides total ordering of messages as well as
    network topology maintenance services

6
Process group interface
  • Delivers messages to the application processes in
    the appropriate process groups
  • Provides process group membership services.

7
Services provided by Totem
  • Two reliable totally ordered message delivery
    services
    • Agreed delivery
    • Safe delivery
  • Both services deliver messages in a single
    system-wide total order that respects Lamport's
    causal order

8
Agreed Delivery
  • Guarantees that a processor will not deliver a
    message before it has delivered all prior
    messages that
    • Have been issued by processors in the current
      configuration and
    • Have timestamps within the duration of that
      configuration
  • All processes receive all messages in the order
    they were sent
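
The gap-free, in-order property above can be illustrated with a hold-back buffer. This is only a minimal sketch, assuming each message carries a ring-wide sequence number starting at 1; the class name `AgreedReceiver` and its methods are hypothetical and not taken from Totem.

```python
class AgreedReceiver:
    """Hold-back buffer: delivers messages in sequence order, with no gaps."""

    def __init__(self):
        self.next_seq = 1      # sequence number we must deliver next
        self.pending = {}      # received but not yet deliverable: seq -> payload
        self.delivered = []    # payloads in delivery order

    def receive(self, seq, payload):
        """Record a received message; return the payloads delivered as a result."""
        if seq < self.next_seq or seq in self.pending:
            return []          # duplicate or already delivered
        self.pending[seq] = payload
        newly_delivered = []
        # Deliver as long as the next expected message is available.
        while self.next_seq in self.pending:
            newly_delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        self.delivered.extend(newly_delivered)
        return newly_delivered
```

A message that arrives early is simply held until every earlier message has been delivered, so no processor delivers a message ahead of its predecessors.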

9
Safe Delivery
  • Further guarantees that a processor will not
    deliver a message unless all processors in its
    configuration have received it (everyone or
    nobody).
  • All processes receive all messages in the same
    order at the same time

10
Why Lamport's causal order?
  • Otherwise processes that belong to two or more
    groups could receive messages from different
    groups in different orders
  • A and B both in groups G and H
    • A receives m from group G, then m'' from group H
      and finally m' from group G
    • B could receive m'' from group H, then m from
      group G and finally m' from group G

11
Example
Group G sends messages m and m' to A and B
Group H sends message m'' to A and B
Both A and B will receive m and m'' in the same
order
Without total ordering, A could receive m before
m'' and B could receive m'' before m

12
Delivery guarantees
  • Extended virtual synchrony ensures that these
    guarantees are honored within every configuration
  • When a fault occurs, Totem forms a transitional
    configuration with a reduced membership
  • Message order is guaranteed even in the presence
    of network partitions

13
Extended virtual synchrony (I)
  • We want to ensure that
    • Messages are received in the same order by all
      processes
    • All processes share the same view of the process
      group to which they belong

14
Extended virtual synchrony (II)
  • Virtual synchrony model (K. Birman, ISIS) orders
    group membership changes along with the regular
    messages
  • Ensures that failures do not result in
    • Incomplete delivery of multicast messages
    • Holes in the causal delivery order
  • Problems remain if the network can partition

15
Extended virtual synchrony (III)
  • Extended virtual synchrony model (Totem) extends
    the virtual synchrony model to systems in which
    • Processes can fail and recover
    • Network can partition and remerge
  • Guarantees that the same message sent to processes
    in two or more components of a partitioned network
    will be in a consistent order in all these
    components

16
Ordering of messages
  • Messages are born-ordered
  • Each message includes a timestamp
  • Relative order of messages is determined by the
    messages themselves as created by their senders
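
A small sketch of how timestamps carried by the messages themselves can fix their relative order. It uses a Lamport logical clock, which the talk names later for the multiple-ring protocol. The class `LamportClock`, the function `born_order`, and the tie-break on sender name are assumptions made for illustration, not Totem's actual code.

```python
class LamportClock:
    """Logical clock: timestamps respect the causal order of sends and receives."""

    def __init__(self):
        self.time = 0

    def stamp_send(self):
        """Advance the clock and return the timestamp for an outgoing message."""
        self.time += 1
        return self.time

    def observe(self, timestamp):
        """Advance the clock past the timestamp of a received message."""
        self.time = max(self.time, timestamp) + 1
        return self.time


def born_order(messages):
    """Sort (timestamp, sender, payload) tuples by timestamp.

    Ties are broken by sender name, an assumption made here so that
    the resulting order is total.
    """
    return sorted(messages, key=lambda m: (m[0], m[1]))
```

Because a receiver moves its clock past every timestamp it observes, any message it sends afterwards is stamped later than the messages that caused it.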

17
The single-ring protocol (I)
  • Uses a circulating token containing, among other
    fields,
    • A seq field with the sequence number of the last
      message that was sent
    • An aru field with the sequence number of the last
      message that has been received by all processors
  • Only the processor that holds the token can send
    a message
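
The token rule above can be sketched as a toy simulation: a processor may send only while it holds the token, and it takes each sequence number from the token's seq field. The names `Token`, `Processor`, `on_token` and `rotate` are hypothetical, and the sketch ignores message loss, retransmission and flow control.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    seq: int = 0   # sequence number of the last message sent on the ring
    aru: int = 0   # carried along but not updated in this sketch


@dataclass
class Processor:
    name: str
    outbox: list = field(default_factory=list)   # payloads waiting for the token

    def on_token(self, token, broadcast):
        """Send all queued payloads, numbering each from the token's seq field."""
        while self.outbox:
            token.seq += 1
            broadcast((token.seq, self.name, self.outbox.pop(0)))
        return token


def rotate(ring, token, log):
    """Pass the token once around the ring; broadcasts are appended to log."""
    for processor in ring:
        token = processor.on_token(token, log.append)
    return token
```

Since only the token holder increments seq, every message on the ring gets a unique sequence number, which is what yields a single total order.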

18
The single-ring protocol (II)
  • aru field used to implement safe delivery
    • Tells processors which messages have been
      received by every processor in the ring
  • Token also provides information about the
    aggregate message backlog of the processors on
    the ring
  • Results in a fairer bandwidth allocation among
    processors than FDDI
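
The role of aru in safe delivery can be illustrated as follows. This sketch assumes global knowledge of what every processor has received, whereas the protocol itself learns this from the token as it circulates. The function names are hypothetical.

```python
def local_aru(received):
    """Highest n such that messages 1..n have all been received."""
    n = 0
    while n + 1 in received:
        n += 1
    return n


def ring_aru(received_by_processor):
    """Highest sequence number received, without gaps, by every processor."""
    return min(local_aru(r) for r in received_by_processor.values())


def safe_messages(received_by_processor):
    """Sequence numbers that may be delivered as safe messages."""
    return list(range(1, ring_aru(received_by_processor) + 1))
```

With `{"A": {1, 2, 3, 4}, "B": {1, 2, 4}, "C": {1, 2, 3}}`, processor B is missing message 3, so only messages 1 and 2 are safe.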

19
Local membership protocol (I)
  • Part of the single-ring protocol
  • Allows
    • Inclusion of new or recovering processors
    • Deletion of faulty processors

20
Local membership protocol (II)
  • Ensures
    • Consensus among all members of a configuration
      about the configuration membership
    • Termination: each configuration will be
      installed on every processor within a bounded
      time or not at all

21
The multiple-ring protocol (I)
  • Operates over several LANs linked by gateways
  • Each LAN is organized as a virtual token ring
    and managed by the single-ring protocol
  • Offers same services and same guarantees as
    single-ring protocol

[Figure: Ring A, Ring B and Ring C]

22
The multiple-ring protocol (II)
  • Uses Lamport timestamps and delivers messages
    in timestamp order
  • When a gateway forwards a message from one ring
    to another, it gives the message a new
    sequence number for the new ring
  • Processor faults and network partitions are
    detected by the single-ring protocol
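
A minimal sketch of the forwarding step, assuming the timestamp travels unchanged with the message while only the ring-local sequence number is replaced. The `Message` and `Gateway` types are hypothetical, and the per-ring counter merely stands in for the seq field of the destination ring's token.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Message:
    ring: str        # ring on which this copy of the message travels
    seq: int         # sequence number, meaningful only on that ring
    timestamp: int   # timestamp assigned by the original sender
    payload: str


class Gateway:
    def __init__(self):
        # Last sequence number used on each ring (stand-in for the token's seq).
        self.last_seq = {}

    def forward(self, message, dest_ring):
        """Return a copy of the message renumbered for the destination ring."""
        seq = self.last_seq.get(dest_ring, 0) + 1
        self.last_seq[dest_ring] = seq
        # Timestamp and payload are kept, so timestamp ordering is preserved.
        return replace(message, ring=dest_ring, seq=seq)
```

Keeping the timestamp intact is what lets processors on different rings still agree on one delivery order.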

23
Message delivery (I)
  • Each processor maintains one recv_msgs list of
    messages received but not yet delivered for each
    ring from which it can receive messages

24
Message delivery (II)
  • A processor will deliver a message as an agreed
    message as soon as
    • The message has the lowest timestamp of all the
      messages in its recv_msgs lists
    • None of these lists is empty
  • Without guaranteed vector messages, we could wait
    forever for rings that have no messages to send

25
Example (I)
  • Consider the following recv_msgs list where the
    numbers in parentheses indicate the message
    timestamps.

26
Example (II)
  • We can deliver messages mA1, mB1 and mC1

27
Example (III)
  • We cannot deliver more messages because we might
    miss a message mCn from ring C with an earlier
    timestamp, say ts 11.
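
The delivery rule and the example above can be sketched as a short function. It assumes each recv_msgs list holds (timestamp, name) pairs already sorted by timestamp. The function name `deliver_agreed` and the timestamps used below are hypothetical, since the original figure is not part of this transcript.

```python
def deliver_agreed(recv_msgs):
    """Deliver messages in timestamp order while no ring's list is empty.

    recv_msgs maps a ring name to its list of (timestamp, name) pairs,
    sorted by timestamp. Delivered messages are removed from the lists.
    """
    delivered = []
    # Stop as soon as any list is empty: that ring might still send a
    # message with an earlier timestamp than the ones we hold.
    while recv_msgs and all(recv_msgs.values()):
        ring = min(recv_msgs, key=lambda r: recv_msgs[r][0][0])
        delivered.append(recv_msgs[ring].pop(0))
    return delivered


# Hypothetical timestamps chosen to mirror the example in the slides.
recv_msgs = {
    "A": [(1, "mA1"), (12, "mA2")],
    "B": [(2, "mB1"), (13, "mB2")],
    "C": [(3, "mC1")],
}
first_batch = deliver_agreed(recv_msgs)
```

Here `first_batch` holds mA1, mB1 and mC1. Delivery then stops with mA2 and mB2 still queued, because the list for ring C is empty and a later mCn with timestamp 11 would have to be delivered before them.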

28
Related issues
  • From time to time, Totem sends guaranteed vector
    messages
  • They specify, among other things, which rings have
    sent messages
  • Processor faults and network partitions are
    detected by the single-ring protocol
  • Configuration and topology change messages have
    timestamps and are delivered in strict timestamp
    order