Fault Tolerance - PowerPoint PPT Presentation

Title:

Fault Tolerance

Description:

An important goal in distributed systems design is to construct the system in ... extra bits are added to allow recovery from garbled bits (e.g. Hamming code) ... – PowerPoint PPT presentation

Slides: 40
Provided by: steve1816


1
Fault Tolerance
  • Chapter 7

2
Fault Tolerance
  • An important goal in distributed systems design
    is to construct the system in such a way that it
    can automatically recover from partial failure
    without seriously affecting the overall
    performance.
  • A distributed system should tolerate faults and
    continue to operate to some extent even in their
    presence.

3
Basic Concepts
  • The essence of a fault tolerant system is a
    dependable system. Dependability includes:
  • Availability: the system is ready to be used
    immediately.
  • Reliability: the system can run continuously
    without failure.
  • Safety: when the system temporarily fails to
    operate, nothing catastrophic happens.
  • Maintainability: how easily a failed system can
    be repaired.

4
Basic Concepts
  • A system is said to fail when it cannot meet its
    promises.
  • An error is a part of a system's state that may
    lead to a failure.
  • The cause of an error is a fault.
  • Fault tolerance means that a system can provide
    its services even in the presence of faults.
  • A transient fault occurs once and then
    disappears.
  • An intermittent fault occurs, then vanishes of
    its own accord, then reappears, and so on.
  • A permanent fault is one that continues to exist
    until the faulty component is repaired.

5
Failure Models
  • Different types of failures.

6
Failure Model
  • The most serious failures are arbitrary failures,
    also known as Byzantine failures.
  • Crash failures are also referred to as fail-stop
    failures.
  • Fail-silent systems are systems which do not
    announce they are going to stop.
  • When a server produces random output that other
    processes recognize as plain junk, the failure
    is referred to as fail-safe.

7
Failure masking by redundancy
  • The key technique for masking faults is to use
    redundancy.
  • With information redundancy, extra bits are added
    to allow recovery from garbled bits (e.g. Hamming
    code).
  • With time redundancy, an action is performed, and
    then, if needed, it is performed again (e.g.,
    message retransmission, transaction restart).
  • With physical redundancy, extra equipment or
    processes are added to make it possible for the
    system as a whole to tolerate the loss or
    malfunctioning of some components.
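
Information redundancy can be illustrated with a small Hamming(7,4) code. The sketch below is only an illustration, not taken from the slides: the function names `hamming_encode` and `hamming_decode` are hypothetical, and it assumes the classic layout with parity bits at positions 1, 2 and 4. It corrects any single garbled bit in a 7-bit codeword.

```python
def hamming_encode(data):
    """Encode 4 data bits as a 7-bit Hamming codeword (parity at positions 1, 2, 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4          # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4          # covers positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # 0 means no error detected
    if position:
        c[position - 1] ^= 1          # flip the garbled bit back
    return [c[2], c[4], c[5], c[6]]


if __name__ == "__main__":
    word = hamming_encode([1, 0, 1, 1])
    word[4] ^= 1                      # garble one bit in transit
    print(hamming_decode(word))
```

The three extra parity bits are the redundancy: the syndrome computed by the receiver names the position of the damaged bit, so no retransmission is needed for single-bit errors.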

8
Failure masking by redundancy
  • TMR (Triple Modular Redundancy) is a design in
    which:
  • Each device is replicated 3 times.
  • Following each stage in the circuit is a
    triplicated voter.
  • Each voter is a circuit that has three inputs and
    one output.
  • If two or three of the inputs are the same, the
    output is equal to that input. If all three
    inputs are different, the output is undefined.
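
The voter rule above can be sketched in a few lines. This is a software illustration of a hardware idea, under assumptions: `vote` and `tmr_stage` are hypothetical names, and `None` stands in for the undefined output.

```python
def vote(a, b, c):
    """Majority voter: return the value at least two inputs agree on, else None."""
    if a == b or a == c:
        return a
    if b == c:
        return b
    return None  # all three inputs differ: output is undefined


def tmr_stage(devices, x):
    """Run one stage: three replicated devices followed by three voters.

    Each voter sees all three device outputs, so the three voter outputs
    are identical and feed the three devices of the next stage.
    """
    outputs = [device(x) for device in devices]
    return [vote(*outputs) for _ in range(3)]


if __name__ == "__main__":
    good = lambda v: v + 1
    faulty = lambda v: v + 100      # one malfunctioning replica
    print(tmr_stage([good, faulty, good], 1))
```

Triplicating the voters as well as the devices matters: a single voter would itself be a single point of failure.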

9
Failure Masking by Redundancy
  • Triple modular redundancy.

10
Flat Groups versus Hierarchical Groups
  • Communication in a flat group.
  • Communication in a simple hierarchical group

11
Agreement in Faulty Systems
  • The Byzantine generals problem for 3 loyal
    generals and 1 traitor.
  • The generals announce their troop strengths (in
    units of 1 kilosoldier).
  • The vectors that each general assembles based on
    (a)
  • The vectors that each general receives in step 3.

12
Agreement in Faulty Systems
  • The same as in previous slide, except now with 2
    loyal generals and one traitor.
  • In general, in a system with m faulty processes,
    agreement can be achieved only if 2m + 1
    correctly functioning processes are present, for
    a total of 3m + 1.
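
The exchange described on these two slides can be simulated. The following is a minimal sketch under stated assumptions, not the slides' own code: `byzantine_agreement` is a hypothetical name, generals are numbered from 0, and the traitor is modeled as sending a different made-up value to every receiver.

```python
from collections import Counter

UNKNOWN = None  # placeholder for an element without a majority


def byzantine_agreement(values, traitors):
    """Return the result vector computed by each loyal general."""
    n = len(values)

    def told(sender, receiver):
        # Step 1: a loyal general reports its true value; a traitor lies
        # differently to each receiver.
        if sender in traitors:
            return ("lie", sender, receiver)
        return values[sender]

    # Step 2: each general assembles a vector from what it was told.
    vectors = {j: [values[i] if i == j else told(i, j) for i in range(n)]
               for j in range(n)}

    def forwarded(sender, receiver):
        # Step 3: generals pass on their vectors; a traitor passes on junk.
        if sender in traitors:
            return [("junk", sender, receiver, k) for k in range(n)]
        return vectors[sender]

    results = {}
    for j in range(n):
        if j in traitors:
            continue
        received = [forwarded(s, j) for s in range(n) if s != j]
        result = []
        for k in range(n):
            # Step 4: keep an element only if a majority of vectors agree on it.
            value, count = Counter(vec[k] for vec in received).most_common(1)[0]
            result.append(value if count > len(received) / 2 else UNKNOWN)
        results[j] = result
    return results


if __name__ == "__main__":
    print(byzantine_agreement([1, 2, 3, 4], traitors={2}))  # 3 loyal, 1 traitor
    print(byzantine_agreement([1, 2, 3], traitors={2}))     # 2 loyal, 1 traitor
```

With four generals, every loyal general ends up with the same vector, with only the traitor's entry unknown. With three generals no element reaches a majority, which matches the requirement of 3m + 1 processes in total for m traitors.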

13
Reliable client-server communication
  • Point-to-point communication
  • Omission failures occur in the form of lost
    messages, and can be masked by using
    acknowledgements and retransmissions.
  • Connection crash failures are often not masked.
    The client can be informed of the channel crash
    by raising an exception.
  • RPC semantics in the presence of failures
  • The client is unable to locate the server.
  • The request message from the client to the server
    is lost.
  • The server crashes after receiving a request.
  • The reply message from the server to the client
    is lost.
  • The client crashes after sending a request.

14
The client is unable to locate the server
  • A possible cause: an obsolete client stub that
    does not match the current server skeleton.
  • A possible solution: raise an exception or
    signal to the client.
  • Drawbacks of the solution:
  • Not every language has exceptions or signals.
  • Requiring the programmer to write an exception or
    signal handler destroys the transparency.

15
Lost Request and Reply Messages
  • The solution: retransmission (using a timer and
    sequence numbers).
  • If multiple retransmissions are lost, the client
    gives up and falsely concludes that the server is
    down (back to the "cannot locate server"
    problem).
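
Retransmission with sequence numbers can be sketched as follows. All names here (`Server`, `call`, `ServerUnreachable`, the `losses` script) are hypothetical, and the timer is simulated: a scripted list says which message is lost on each attempt, and a lost message is treated as a timer expiry.

```python
class ServerUnreachable(Exception):
    """Raised when the client gives up after repeated timeouts."""


class Server:
    def __init__(self):
        self.counter = 0
        self.replies = {}  # (client_id, seq) -> cached reply

    def handle(self, client_id, seq, amount):
        key = (client_id, seq)
        if key not in self.replies:        # first copy: execute the request
            self.counter += amount
            self.replies[key] = self.counter
        return self.replies[key]           # duplicate: resend the cached reply


def call(server, client_id, seq, amount, losses=(), max_tries=3):
    """Send a request, retransmitting with the same sequence number on timeout.

    losses[i] is "request", "reply", or None and scripts what is lost
    on attempt i (an assumption made to keep the example deterministic).
    """
    for attempt in range(max_tries):
        lost = losses[attempt] if attempt < len(losses) else None
        if lost == "request":
            continue                       # request never arrived; timer expires
        reply = server.handle(client_id, seq, amount)
        if lost == "reply":
            continue                       # reply lost; timer expires
        return reply
    raise ServerUnreachable("no reply after %d tries" % max_tries)


if __name__ == "__main__":
    server = Server()
    print(call(server, "c1", 1, 5, losses=("reply",)))  # executed once, reply resent
```

The sequence number is what lets the server tell a retransmission from a new request, so a lost reply does not cause the operation to run twice.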

16
Server Crashes
  • A server in client-server communication:
  • (a) Normal case
  • (b) Crash after execution
  • (c) Crash before execution
  • The problem is that the client cannot
    differentiate case (b) from case (c).

17
Server Crashes
Possible RPC semantics: (1) at least once, (2) at
most once, (3) no guarantee (the easiest to
implement), and (4) exactly once (impossible to
implement, as explained below).
  • Different combinations of client and server
    strategies in the presence of server crashes.

18
Client Crashes
  • What happens if the client sends a request to a
    server to do some work and crashes before the
    server replies? The computation left running on
    the server is called an orphan. It wastes CPU
    cycles, ties up system resources, and causes
    confusion.
  • Possible solutions to the orphan problem:
  • 1. Extermination: before sending an RPC message,
    the client makes a log entry on disk. After the
    client reboots, the log is checked and the
    orphan is explicitly killed off.
  • 2. Reincarnation: divide time into sequentially
    numbered epochs. When a client reboots, it
    broadcasts a message to declare a new epoch.
    After receiving such a message, all remote
    computations for that client are killed off.
  • 3. Gentle reincarnation: when an epoch broadcast
    comes in, a remote computation is killed only if
    its owner cannot be contacted.
  • 4. Expiration: each RPC is given a standard amount
    of time T to do the job. If it cannot finish, it
    must ask for another quantum from the client. If
    the client waits a time T after crashing before
    it reboots, all orphans are sure to be gone.
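
The reincarnation scheme can be sketched from the server's side. This is an illustrative assumption of how the bookkeeping might look: `OrphanTracker`, `start` and `on_new_epoch` are hypothetical names, and "killing" a computation is modeled as removing it from a table.

```python
class OrphanTracker:
    """Tracks remote computations per client and kills orphans on a new epoch."""

    def __init__(self):
        self.running = {}  # computation id -> (client, epoch)

    def start(self, comp_id, client, epoch):
        self.running[comp_id] = (client, epoch)

    def on_new_epoch(self, client, epoch):
        """Handle an epoch broadcast: kill that client's older computations."""
        killed = [comp_id for comp_id, (owner, started) in self.running.items()
                  if owner == client and started < epoch]
        for comp_id in killed:
            del self.running[comp_id]
        return killed


if __name__ == "__main__":
    server = OrphanTracker()
    server.start("job-1", "clientA", epoch=1)
    server.start("job-2", "clientB", epoch=1)
    print(server.on_new_epoch("clientA", epoch=2))  # clientA rebooted
```

Gentle reincarnation would add one check before the kill: try to contact the owner first and keep the computation if the owner answers.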

19
Basic Reliable-Multicasting Schemes
  • A simple solution to reliable multicasting when
    all receivers are known and are assumed not to
    fail
  • Message transmission
  • Reporting feedback

20
Nonhierarchical Feedback Control
  • Several receivers have scheduled a request for
    retransmission, but the first retransmission
    request leads to the suppression of others.

21
Hierarchical Feedback Control
  • The essence of hierarchical reliable
    multicasting.
  • Each local coordinator forwards the message to
    its children.
  • A local coordinator handles retransmission
    requests.

22
Virtual Synchrony
  • The logical organization of a distributed system
    to distinguish between message receipt and
    message delivery

23
Virtual Synchrony
  • The principle of virtually synchronous multicast.

24
Message Ordering
  • Unordered multicasts
  • Three communicating processes in the same group.
    The ordering of events per process is shown along
    the vertical axis.

25
Message Ordering
  • FIFO-ordered multicasts
  • Four processes in the same group with two
    different senders, and a possible delivery order
    of messages under FIFO-ordered multicasting
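
FIFO-ordered delivery can be sketched with a hold-back queue, which also shows the difference between receiving a message and delivering it. The class name `FifoReceiver` and the use of per-sender sequence numbers starting at 1 are assumptions made for this example.

```python
class FifoReceiver:
    """Delivers messages from each sender in the order they were sent."""

    def __init__(self):
        self.next_seq = {}    # sender -> next sequence number to deliver
        self.held = {}        # (sender, seq) -> message received but not delivered
        self.delivered = []   # messages handed to the application, in order

    def receive(self, sender, seq, message):
        expected = self.next_seq.get(sender, 1)
        if seq < expected:
            return            # duplicate of an already delivered message
        self.held[(sender, seq)] = message
        # Deliver as many consecutive messages from this sender as possible.
        while (sender, expected) in self.held:
            self.delivered.append(self.held.pop((sender, expected)))
            expected += 1
        self.next_seq[sender] = expected


if __name__ == "__main__":
    r = FifoReceiver()
    r.receive("P1", 2, "m2")   # arrives early, held back
    r.receive("P4", 1, "m3")
    r.receive("P1", 1, "m1")   # releases m1, then m2
    r.receive("P4", 2, "m4")
    print(r.delivered)
```

Messages from different senders may still interleave in any order; FIFO ordering only constrains messages that come from the same sender.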

26
Implementing Virtual Synchrony
  • Six different versions of virtually synchronous
    reliable multicasting.

27
Implementing Virtual Synchrony
  • Process 4 notices that process 7 has crashed,
    sends a view change
  • Process 6 sends out all its unstable messages,
    followed by a flush message
  • Process 6 installs the new view when it has
    received a flush message from everyone else

28
Two-Phase Commit
  • Phase I (Voting)
  • The coordinator sends a VOTE_REQUEST message to
    all participants.
  • When a participant receives a VOTE_REQUEST
    message, it returns either a VOTE_COMMIT message
    to the coordinator telling the coordinator that
    it is prepared to locally commit its part of the
    transaction, or otherwise a VOTE_ABORT message.
  • Phase II (Decision)
  • The coordinator collects all votes from the
    participants. If all have voted to commit, then
    so will the coordinator. In that case, it sends a
    GLOBAL_COMMIT message to all participants.
    However, if one participant had voted to abort
    the transaction, the coordinator will also decide
    to abort and multicast a GLOBAL_ABORT message.
  • Each participant that voted for a commit waits
    for the final reaction from the coordinator. If a
    participant receives a GLOBAL_COMMIT, it locally
    commits the transaction. Otherwise, when
    receiving a GLOBAL_ABORT, it locally aborts the
    transaction as well.

29
Two-Phase Commit
  • The finite state machine for the coordinator in
    2PC.
  • The finite state machine for a participant.

30
Two-Phase Commit
  • Actions taken by a participant P when residing in
    state READY and having contacted another
    participant Q.

31
Two-Phase Commit
actions by coordinator:
    write START_2PC to local log
    multicast VOTE_REQUEST to all participants
    while not all votes have been collected
        wait for any incoming vote
        if timeout
            write GLOBAL_ABORT to local log
            multicast GLOBAL_ABORT to all participants
            exit
        record vote
    if all participants sent VOTE_COMMIT and coordinator votes COMMIT
        write GLOBAL_COMMIT to local log
        multicast GLOBAL_COMMIT to all participants
    else
        write GLOBAL_ABORT to local log
        multicast GLOBAL_ABORT to all participants
  • Outline of the steps taken by the coordinator in
    a two phase commit protocol
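
The coordinator's steps can be turned into a small runnable sketch. It simplifies heavily and rests on assumptions: `run_coordinator` is a hypothetical name, each participant is modeled as a function that returns its vote, and a return value of `None` stands for a timeout. Real messaging and durable logging are not modeled.

```python
def run_coordinator(participants, coordinator_votes_commit=True):
    """Return (decision, log) for one run of the coordinator.

    participants: list of callables returning "VOTE_COMMIT", "VOTE_ABORT",
    or None when no vote arrives before the timeout.
    """
    log = ["START_2PC"]
    votes = []
    for ask_for_vote in participants:       # stands in for the VOTE_REQUEST multicast
        vote = ask_for_vote()
        if vote is None:                    # timeout while collecting votes
            log.append("GLOBAL_ABORT")
            return "GLOBAL_ABORT", log
        votes.append(vote)                  # record vote
    if coordinator_votes_commit and all(v == "VOTE_COMMIT" for v in votes):
        decision = "GLOBAL_COMMIT"
    else:
        decision = "GLOBAL_ABORT"
    log.append(decision)                    # written to the log before multicasting
    return decision, log


if __name__ == "__main__":
    commit = lambda: "VOTE_COMMIT"
    abort = lambda: "VOTE_ABORT"
    print(run_coordinator([commit, commit]))
    print(run_coordinator([commit, abort]))
```

Writing the decision to the log before sending it is what allows a recovering coordinator to repeat the same decision after a crash.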

32
Two-Phase Commit
actions by participant (the main thread):
    write INIT to local log
    wait for VOTE_REQUEST from coordinator
    if timeout
        write VOTE_ABORT to local log
        exit
    if participant votes COMMIT
        write VOTE_COMMIT to local log
        send VOTE_COMMIT to coordinator
        wait for DECISION from coordinator
        if timeout
            multicast DECISION_REQUEST to other participants
            wait until DECISION is received  /* remain blocked */
            write DECISION to local log
        if DECISION == GLOBAL_COMMIT
            write GLOBAL_COMMIT to local log
        else if DECISION == GLOBAL_ABORT
            write GLOBAL_ABORT to local log
    else
        write VOTE_ABORT to local log
        send VOTE_ABORT to coordinator
  • Steps taken by participant process in 2PC.

33
Two-Phase Commit
actions for handling decision requests:  /* executed by separate thread */
    while true
        wait until any incoming DECISION_REQUEST is received  /* remain blocked */
        read most recently recorded STATE from the local log
        if STATE == GLOBAL_COMMIT
            send GLOBAL_COMMIT to requesting participant
        else if STATE == INIT or STATE == GLOBAL_ABORT
            send GLOBAL_ABORT to requesting participant
        else
            skip  /* participant remains blocked */
  • Steps taken for handling incoming decision
    requests.
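
The decision-request handling can be sketched as a pure function over the logged state. The names `answer_decision_request` and `resolve_blocked` are hypothetical, and the peers are represented only by the most recent state in their logs, which is an assumption made to keep the example self-contained.

```python
def answer_decision_request(state):
    """Reply a participant gives to a DECISION_REQUEST, or None to stay silent."""
    if state == "GLOBAL_COMMIT":
        return "GLOBAL_COMMIT"
    if state in ("INIT", "GLOBAL_ABORT"):
        return "GLOBAL_ABORT"
    return None  # this participant is itself undecided, so it cannot help


def resolve_blocked(peer_states):
    """Ask each peer in turn; return the first decision learned, else None."""
    for state in peer_states:
        answer = answer_decision_request(state)
        if answer is not None:
            return answer
    return None  # every peer is undecided: the requester remains blocked


if __name__ == "__main__":
    print(resolve_blocked(["VOTE_COMMIT", "GLOBAL_COMMIT"]))
    print(resolve_blocked(["VOTE_COMMIT", "VOTE_COMMIT"]))
```

A peer still in INIT can safely answer with an abort because it has not voted yet, so the coordinator cannot have decided to commit. When all peers have voted to commit and none knows the outcome, nobody can decide, which is the blocking case of two-phase commit.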

34
Three-Phase Commit
  • Two necessary and sufficient conditions for a
    commit protocol to be nonblocking
  • There is no single state from which it is
    possible to make a transition directly to either
    a COMMIT or an ABORT state.
  • There is no state in which it is not possible to
    make a final decision, and from which a
    transition to a COMMIT state can be made.
  • Finite state machine for the coordinator in 3PC
  • Finite state machine for a participant

35
Recovery
  • Two forms of error recovery
  • Backward recovery: bring the system from its
    present erroneous state back into a previously
    correct state. It is necessary to record the
    system's state from time to time (by state
    checkpointing and message logging). E.g., lost
    message retransmission.
  • Forward recovery: bring the system from its
    present erroneous state forward to a correct new
    state from which it can continue to execute. It
    has to know in advance which errors may occur.
    E.g., error correction by special encoding of
    messages.
  • Backward recovery is the most widely used error
    recovery technique due to its generality, but it
    also has the following drawbacks
  • Checkpointing is costly in terms of performance.
  • Not all errors are reversible.

36
Recovery Stable Storage
  • Stable Storage
  • Crash after drive 1 is updated
  • Bad spot

37
Checkpointing
  • A recovery line.

38
Independent Checkpointing
  • The domino effect.

39
Message Logging
  • Incorrect replay of messages after recovery,
    leading to an orphan process.