1
CTIS 490 DISTRIBUTED SYSTEMS
  • WEEK 10
  • RECOVERY
  • OTHER ISSUES

2
RECOVERY
  • Another issue in fault tolerance is recovery from an error.
  • The idea of error recovery is to replace an erroneous state with an error-free state.
  • The most widely used recovery method is backward recovery.
  • Backward recovery brings the system from its present erroneous state back into a previously correct state. To do so, it is necessary to record the system's state from time to time and to restore such a recorded state when things go wrong.
  • Each time the system's present state is recorded, a checkpoint is said to be made.

3
CHECKPOINTING
  • In distributed systems, a consistent global state
    is also called a distributed snapshot.
  • In a distributed snapshot, if a process P has
    recorded the receipt of a message, then there
    should be a process Q that has recorded the
    sending of that message.

4
INDEPENDENT CHECKPOINTING
  • Each process saves its state from time to time to locally available stable storage (which is designed to survive anything except major disasters), and a consistent global state has to be constructed from these local states.
  • A recovery line corresponds to the most recent consistent collection of checkpoints, that is, the most recent distributed snapshot.
  • The distributed nature of checkpointing may make it difficult to find a recovery line. Discovering a recovery line requires that each process first be rolled back to its most recently saved state.
  • If these local states jointly do not form a distributed snapshot, further rolling back is necessary.
  • This process of cascaded rollbacks may lead to the domino effect (see the sketch after this list).
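As a rough illustration of the rollback search, here is a minimal sketch. It assumes (beyond what the slides state) that each checkpoint is modeled as the sets of message ids whose send and receive events it has recorded, and that every process keeps its checkpoints in chronological order.

def is_consistent(snapshot):
    # A snapshot is consistent when every recorded receive has a matching
    # recorded send somewhere in the snapshot.
    sent, received = set(), set()
    for cp in snapshot.values():
        sent |= cp["sent"]
        received |= cp["received"]
    return received <= sent

def find_recovery_line(checkpoints):
    # checkpoints: {process: [cp0, cp1, ...]} in chronological order; cp0 is
    # the initial state. Start from the most recent checkpoint of each process
    # and roll processes back until the chosen checkpoints form a distributed
    # snapshot; in the worst case this degenerates to the initial state.
    index = {p: len(cps) - 1 for p, cps in checkpoints.items()}
    while True:
        snapshot = {p: checkpoints[p][i] for p, i in index.items()}
        if is_consistent(snapshot):
            return snapshot                              # the recovery line
        all_sent = set().union(*(cp["sent"] for cp in snapshot.values()))
        rolled_back = False
        for p, cp in snapshot.items():
            # Roll back any process that recorded a receive with no sender.
            if not cp["received"] <= all_sent and index[p] > 0:
                index[p] -= 1                            # cascaded rollback
                rolled_back = True
        if not rolled_back:
            return {p: cps[0] for p, cps in checkpoints.items()}  # initial state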

5
INDEPENDENT CHECKPOINTING
  • The state saved by P2 indicates the receipt of a
    message m, but no other process can be identified
    as its sender. So, P2 needs to be rolled back to
    an earlier state.
  • P1 has recorded the receipt of message m, but
    there is no recorded event of this message being
    sent.
  • In this example, the recovery line is the initial
    state of the system.

6
COORDINATED CHECKPOINTING
  • In coordinated checkpointing, all processes synchronize to jointly write their state to local stable storage. The main advantage is that the saved state is automatically globally consistent.
  • A coordinator first multicasts a CHECKPOINT_REQUEST message to all processes. When a process receives such a message, it takes a local checkpoint, queues any subsequent messages, and acknowledges that it has taken a checkpoint.
  • When the coordinator has received an acknowledgment from all processes, it multicasts a CHECKPOINT_DONE message to allow the blocked processes to continue (a coordinator-side sketch follows).
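A minimal sketch of the coordinator side of this two-phase protocol, using the message names from the slide; the multicast/receive helpers and the CHECKPOINT_TAKEN acknowledgment name are assumptions for illustration.

def coordinated_checkpoint(coordinator, processes):
    # Phase 1: ask every process to take a local checkpoint.
    coordinator.multicast("CHECKPOINT_REQUEST", processes)
    acked = set()
    while acked != set(processes):
        sender, msg = coordinator.receive()      # hypothetical blocking receive
        if msg == "CHECKPOINT_TAKEN":
            acked.add(sender)
    # Phase 2: every process has checkpointed and is queueing subsequent
    # messages; tell them all to resume.
    coordinator.multicast("CHECKPOINT_DONE", processes)

# Participant side (informal): on CHECKPOINT_REQUEST, write the local state to
# stable storage, queue subsequent messages, and acknowledge; on
# CHECKPOINT_DONE, flush the queue and continue normal operation.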

7
MESSAGE LOGGING
  • Many distributed systems combine checkpointing with message logging.
  • There are two types of message logging:
  • Sender-based logging: a process logs its messages before sending them off.
  • Receiver-based logging: a receiving process logs incoming messages before processing them.
  • Message logging allows the messages to be replayed during recovery (see the sketch below).
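A minimal sketch contrasting the two logging points; the log, channel, and handler objects are hypothetical placeholders.

def send_with_sender_based_logging(log, channel, msg):
    log.append(("sent", msg))       # log before the message leaves the sender
    channel.send(msg)

def deliver_with_receiver_based_logging(log, handler, msg):
    log.append(("received", msg))   # log before the message is processed
    handler(msg)

# After a crash, replaying the logged messages in their original order brings
# a process from its most recent checkpoint back to the state it had before
# the failure.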

8
REPLICA MANAGEMENT
  • Replica management involves two issues: where to place replicas and which mechanisms to use for keeping them consistent.
  • Placing replica servers: finding the best locations for servers that can host (part of) a data store.
  • Placing content: finding the best servers for placing the content itself.

9
REPLICA-SERVER PLACEMENT
  • The optimal placement of replica servers is not an intensively studied problem, since it is more of a management issue.
  • Analysis of client and network properties is useful for coming to informed decisions.
  • One approach is to consider the topology of the Internet as formed by the Autonomous Systems (AS).
  • An AS can best be viewed as a network in which the nodes all run the same routing protocol and which is managed by a single organization, typically an Internet Service Provider (ISP).

10
CONTENT REPLICATION PLACEMENT
  • There are three types of replicas: permanent replicas, server-initiated replicas, and client-initiated replicas.

11
PERMANENT REPLICAS
  • Permanent replicas can be considered as the initial set of replicas that constitute a data store.
  • For example, the distribution of a Web site generally comes in two forms.
  • First, the files that constitute the site are replicated across a number of servers at a single location. Whenever a request comes in, it is forwarded to one of the servers, for instance using a round-robin strategy.
  • Second, mirroring can be used. In this case, the Web site is copied to a limited number of servers, called mirror sites, which are geographically spread across the Internet. Clients choose one of the sites offered to them.

12
SERVER-INITIATED REPLICAS
  • Server-initiated replicas are used to enhance performance by dynamically placing temporary replicas to handle a sudden burst of requests.
  • They are used mainly by Web hosting services.
  • Each server keeps track of access counts per file and where access requests come from.
  • Given a client C, each server can determine which of the servers in the Web hosting service is closest to C (such information can be obtained from routing databases).
  • If clients C1 and C2 share the same closest server P, all their access requests for file F are jointly registered at P.
  • When the number of requests for a specific file F drops below a certain threshold, that file can be removed from the server (see the sketch after this list).
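A minimal sketch of this bookkeeping. The closest_server() lookup, the is_permanent()/delete_replica() helpers, and the threshold value are assumptions for illustration; only the existence of a deletion threshold is stated on the slide.

from collections import defaultdict

DELETION_THRESHOLD = 10          # requests per interval below which a copy is dropped

class ReplicaServer:
    def __init__(self, name, topology):
        self.name = name
        self.topology = topology                 # hypothetical routing-database view
        self.access_count = defaultdict(int)     # (file, closest_server) -> count

    def register_request(self, file, client):
        # Requests from clients that share the same closest server P are
        # counted together under P.
        p = self.topology.closest_server(client)
        self.access_count[(file, p)] += 1

    def end_of_interval(self):
        totals = defaultdict(int)
        for (file, _), count in self.access_count.items():
            totals[file] += count
        for file, total in totals.items():
            # Drop temporary copies that are no longer popular enough.
            if total < DELETION_THRESHOLD and not self.is_permanent(file):
                self.delete_replica(file)        # hypothetical helpers
        self.access_count.clear()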

13
SERVER-INITIATED REPLICAS
  • Server-initiated replicas are generally used for
    placing read-only copies.

14
CLIENT-INITIATED REPLICAS
  • Client-initiated replicas are more commonly known as client caches. A cache is a local storage facility used by a client to temporarily store a copy of data.
  • Managing the cache is left to the client. However, the client can rely on the server to inform it when cached data has become stale.
  • When most operations involve only reading data, performance can be improved by letting the client store requested data in a nearby cache. Such a cache can be located on the client's machine or on a separate machine in the same LAN.
  • Whenever requested data can be fetched from the local cache, a cache hit is said to have occurred.

15
CONTENT DISTRIBUTION
  • There are three ways to propagate updated content to the replica servers.
  • Propagate only a notification of an update: other copies are informed that an update has taken place and that the data they contain is no longer valid. The main advantage is that this uses little network bandwidth; it works best when there are many update operations compared to read operations, that is, when the read-to-write ratio is relatively small.
  • Transfer the data from one copy to another: useful when the read-to-write ratio is relatively high. In that case, the probability that an update will be effective, in the sense that the modified data will be read before the next update takes place, is high.

16
CONTENT DISTRIBUTION
  • Propagate the update operation from one copy to another: tell each replica which update operation it should perform, sending only the parameter values that the operation needs. This approach, also referred to as active replication, assumes that each replica is represented by a process capable of actively keeping its associated data up to date. (The three propagation options are sketched below.)
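A minimal sketch of the three propagation options; the replica connection objects and the message names are hypothetical.

def propagate_invalidation(replicas, item_id):
    # Option 1: send only a notification; a replica marks its copy invalid
    # and fetches fresh data later, if and when the item is actually read.
    for r in replicas:
        r.send(("INVALIDATE", item_id))

def propagate_data(replicas, item_id, new_value):
    # Option 2: ship the modified data itself; pays off when the copy is
    # likely to be read before it is overwritten again.
    for r in replicas:
        r.send(("UPDATE_DATA", item_id, new_value))

def propagate_operation(replicas, operation, params):
    # Option 3 (active replication): ship the operation and its parameters;
    # each replica re-executes it to keep its local copy up to date.
    for r in replicas:
        r.send(("UPDATE_OP", operation, params))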

17
ACTIVE REPLICATION
  • Active replication requires that operations be carried out in the same order everywhere.
  • Such an ordering can be achieved using a central coordinator, also called a sequencer.
  • Each operation is forwarded to the sequencer, which assigns it a unique sequence number and then forwards the operation to all replicas (as in the sketch below).
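A minimal sketch of such a sequencer; the forward() transport call is a hypothetical placeholder.

import itertools

class Sequencer:
    def __init__(self, replicas):
        self.replicas = replicas
        self.counter = itertools.count(1)    # unique, monotonically increasing numbers

    def submit(self, operation):
        seq_no = next(self.counter)
        # Replicas apply operations in sequence-number order, so every copy
        # executes the same updates in the same order.
        for r in self.replicas:
            r.forward(seq_no, operation)
        return seq_no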

18
PULL versus PUSH PROTOCOLS
  • Yet another design issue is whether updates are pulled or pushed.
  • In a push-based approach, also referred to as a server-based protocol, updates are propagated to other replicas without those replicas asking for them. A push-based approach is used when replicas need to maintain a high degree of consistency, i.e., when replicas need to be kept identical. The server needs to keep track of all client caches; a Web server may need to keep track of tens of thousands of client caches.
  • In a pull-based approach, a server or client requests another server to send it any updates it has at that moment. This approach, also called a client-based protocol, is often used by client caches, for example Web caches (see the sketch below).
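As a rough illustration of the pull-based case, here is a minimal client-cache sketch; the get_if_modified_since() server call and the max_age refresh interval are hypothetical stand-ins for, e.g., the conditional requests made by a Web cache.

import time

class PullCache:
    def __init__(self, server):
        self.server = server
        self.entries = {}            # key -> (value, time of last check)

    def get(self, key, max_age=60):
        value, checked = self.entries.get(key, (None, 0.0))
        if value is None or time.time() - checked > max_age:
            # The client asks the server for any update it has at this moment.
            update = self.server.get_if_modified_since(key, checked)
            if update is not None:
                value = update
            self.entries[key] = (value, time.time())
        return value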

19
PULL versus PUSH PROTOCOLS
  • A comparison between push-based and pull-based protocols in the case of multiple-client, single-server systems.

20
ELECTION ALGORITHMS
  • Many distributed algorithms require one process to act as coordinator or initiator, or to perform some other special role. There are algorithms for electing such a coordinator.
  • We will assume that each process has a unique number, for example its network address (for simplicity, we assume one process per machine).
  • Furthermore, we also assume that every process knows the process number of every other process. What the processes do not know is which processes are currently running and which ones are down.
  • In general, election algorithms attempt to locate the process with the highest process number and designate it as coordinator.
  • The goal of an election algorithm is to ensure that when an election starts, it concludes with all processes agreeing on who the new coordinator is to be.

21
BULLY ALGORITHM
  • When any process notices that the coordinator is no longer responding to requests, it initiates an election. A process P holds an election as follows:
  • 1. P sends an ELECTION message to all processes with higher numbers.
  • 2. If no one responds, P wins the election and becomes the coordinator.
  • 3. If one of the higher-numbered processes answers, it takes over and P's job is done.
  • At any moment, a process can get an ELECTION message from one of the lower-numbered processes; it sends an OK message back to indicate that it is alive and will take over.
  • The receiver then holds an election of its own in the same manner. Eventually, the highest-numbered process will be the new coordinator.
  • The biggest guy wins; that is why it is called the bully algorithm (see the sketch after this list).
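A minimal single-process sketch of the bully algorithm, using the message names above. The transport object is hypothetical; its send() is assumed to return whether the destination is up and answered.

class BullyProcess:
    def __init__(self, pid, all_pids, transport):
        self.pid = pid
        self.all_pids = all_pids              # every process knows every number
        self.transport = transport
        self.coordinator = max(all_pids)

    def hold_election(self):
        higher = [p for p in self.all_pids if p > self.pid]
        # send() is assumed to return True if the destination answered.
        responses = [self.transport.send(p, ("ELECTION", self.pid)) for p in higher]
        if not any(responses):
            self.become_coordinator()         # no higher-up answered: P wins
        # Otherwise a higher-numbered process takes over and runs its own election.

    def on_message(self, sender, msg):
        if msg[0] == "ELECTION" and sender < self.pid:
            self.transport.send(sender, ("OK", self.pid))   # bully the sender
            self.hold_election()              # then hold an election of its own
        elif msg[0] == "COORDINATOR":
            self.coordinator = msg[1]

    def become_coordinator(self):
        self.coordinator = self.pid
        for p in self.all_pids:
            if p != self.pid:
                self.transport.send(p, ("COORDINATOR", self.pid))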

22
BULLY ALGORITHM
The bully election algorithm. (a) Process 4 holds
an election. (b) Processes 5 and 6 respond,
telling 4 to stop. (c) Now 5 and 6 each hold an
election.
23
BULLY ALGORITHM
  • The bully election algorithm. (d) Process 6
    tells 5 to stop.
  • (e) Process 6 wins and tells everyone.

24
RING ALGORITHM
  • Another election algorithm is based on the use of a ring. Unlike some ring algorithms, this one does not use a token.
  • We assume that the processes are physically or logically ordered, so that each process knows who its successor is.
  • When any process notices that the coordinator is not functioning, it builds an ELECTION message containing its own process number and sends it to its successor.
  • If the successor is down, the sender skips over it and goes to the next member along the ring, until a running process is located.
  • At each step along the way, the sender adds its own process number to the list in the message, thereby making itself a candidate to be elected as coordinator.

25
RING ALGORITHM
  • Eventually, the message gets back to the process that started it all. That process recognizes the event when it receives an incoming message containing its own process number.
  • The message type is then changed to COORDINATOR and circulated once again, this time to inform everyone else who the coordinator is (the list member with the highest number) and who the members of the new ring are (see the sketch after this list).
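A minimal sketch of a ring member, using the message names above; the ring ordering, the transport object, and its send() return value (assumed to report whether delivery succeeded) are illustrative assumptions.

class RingMember:
    def __init__(self, pid, ring, transport):
        self.pid = pid
        self.ring = ring                 # process numbers in ring order
        self.transport = transport
        self.coordinator = None

    def start_election(self):
        self.circulate(("ELECTION", [self.pid]))

    def on_message(self, msg):
        kind, candidates = msg
        if kind == "ELECTION":
            if self.pid in candidates:
                # Full circle: announce the winner (the highest number in the
                # list) and, implicitly, the membership of the new ring.
                self.coordinator = max(candidates)
                self.circulate(("COORDINATOR", candidates))
            else:
                # Add our own number, making ourselves a candidate, and pass on.
                self.circulate(("ELECTION", candidates + [self.pid]))
        else:  # COORDINATOR
            if self.coordinator != max(candidates):
                self.coordinator = max(candidates)
                self.circulate(msg)      # keep forwarding until it has gone round once

    def circulate(self, msg):
        # Send to the successor; if it is down, skip to the next live member.
        i = self.ring.index(self.pid)
        for k in range(1, len(self.ring)):
            nxt = self.ring[(i + k) % len(self.ring)]
            if self.transport.send(nxt, msg):    # assumed to report delivery
                return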

26
RING ALGORITHM
  • Election algorithm using a ring.

27
NETWORK TIME PROTOCOL
  • The Network Time Protocol (NTP) is a protocol built on top of the TCP/IP protocol suite and used to synchronize the clocks of distributed systems. NTP uses UDP on port 123 to communicate between clients and servers (see the sketch below).
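The slide only names the protocol; as background, an NTP-style exchange estimates the clock offset and round-trip delay from four timestamps (T1 and T4 taken on the client, T2 and T3 on the server). A minimal sketch of that calculation:

def ntp_offset_and_delay(t1, t2, t3, t4):
    # t1: client sends the request     t2: server receives it
    # t3: server sends the reply       t4: client receives it
    offset = ((t2 - t1) + (t3 - t4)) / 2   # estimated offset of the client's clock
    delay = (t4 - t1) - (t3 - t2)          # round-trip network delay
    return offset, delay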

28
THE BERKELEY ALGORITHM
  • In many algorithms such as NTP, the time server is passive. Other machines ask it for the time, and it responds to their queries.
  • In the Berkeley algorithm, the time server (time daemon) is active, polling every machine from time to time to ask what time it is there.
  • Based on the answers, it computes an average time and tells all the other machines how to adjust their clocks (see the sketch below).
  • The time server's clock is set manually.
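A minimal sketch of one polling round; the ask_time()/adjust_clock() message primitives are hypothetical, and a full implementation would also compensate for message delays and discard outlying readings.

def berkeley_round(daemon_clock, machines):
    # The time daemon actively polls every machine for its current time.
    readings = {m: m.ask_time() for m in machines}
    daemon_reading = daemon_clock.now()
    # Compute the average over all clocks, including the daemon's own.
    average = (sum(readings.values()) + daemon_reading) / (len(readings) + 1)
    # Tell every machine how to adjust its clock (an adjustment, not an
    # absolute time value).
    for m, reading in readings.items():
        m.adjust_clock(average - reading)
    daemon_clock.adjust(average - daemon_reading)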

29
THE BERKELEY ALGORITHM
  • (a) The time daemon asks all the other machines for their clock values.
  • (b) The machines answer.
30
THE BERKELEY ALGORITHM
  • (c) The time daemon tells everyone how to adjust their clock.