6. Fault Tolerance
  • Introduction
  • Process Resilience
  • Recovery

Learning Objectives
  • To understand the basic concepts of fault
    tolerance and different types of failures in DS,
    and failure masking by redundancy
  • To study the main design issues in process
    resilience, and how process replication (process
    groups) can be used to mask failures and reach
    agreement in faulty systems
  • To understand the concept of error recovery, and
    the basic idea of combining checkpointing with
    message logging.

Introduction Basic Concept
  • A characteristic feature of DSs is the notion of
    partial (independent) failures: each component of
    the system can fail independently, leaving the
    others still running, and the failure may not be
    immediately made known to the other components.
  • Fault tolerance: An important goal in DSs is to
    construct the system in such a way that it can
    automatically recover from partial failures
    without seriously affecting the overall
    performance.
  • Being fault-tolerant is strongly related to the
    notion of dependable systems, which covers
    availability, reliability, safety, and
    maintainability.

Introduction Basic Concept
  • Availability: A system is ready to be used
    immediately.
  • Reliability: A system can run continuously
    without failure, which is defined in terms of a
    time interval (instead of an instant in time, as
    in availability).
  • Safety: When a system temporarily fails to
    operate correctly, nothing catastrophic happens.
  • Maintainability: How easily a failed system can be
    repaired.
  • A system is said to fail when it cannot meet its
    promises. An error is a part of a system's state
    that may lead to a failure. The cause of an error
    is called a fault.

Introduction Failure Models and Masking Failure
by Redundancy
  • To get a better understanding of how serious a
    failure in a DS actually is, several classification
    schemes have been developed, as shown in the next
    slide.
  • The key technique for masking faults is to use
    redundancy, including information redundancy,
    time redundancy, and physical redundancy.
  • Information redundancy: Extra bits are added to
    allow recovery from garbled bits, such as by using
    a Hamming code.
  • Time redundancy: An action is performed, and
    then, if need be, it is performed again.
  • Physical redundancy: Extra physical components
    exist to provide fault tolerance.
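
Information redundancy with a Hamming code can be sketched as follows. This is a minimal illustration of the classic Hamming(7,4) code, which adds 3 parity bits to 4 data bits so that any single flipped bit can be located and corrected. The function names are our own and the bit layout (parity bits at positions 1, 2 and 4) is the conventional one, chosen here as an assumption.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]  # positions 1..7


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the flipped bit (0 = no error).
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


sent = hamming74_encode([1, 0, 1, 1])
garbled = list(sent)
garbled[4] ^= 1  # one bit is garbled in transit
recovered = hamming74_decode(garbled)
```

The receiver recovers the original data bits without asking for a retransmission, which is what distinguishes information redundancy from time redundancy.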

Failure Models
  • Different types of failures.

Process Resilience
  • It is critical to protect against process
    failures, which is achieved by replicating
    processes into groups.
  • The purpose of having groups is to allow
    processes to deal with collections of processes
    as a single abstraction. When a message is sent
    to the group itself, all members of the group can
    receive it.
  • Process groups may be dynamic. New groups can be
    created and old groups can be removed. A process
    can join or leave a group during system
    operation. A process can be a member of several
    groups at the same time.
  • Internal structure of a group: In a flat group, all
    processes are equal. In a hierarchical group, some
    kind of hierarchy exists, for example one process
    is the coordinator and all the others are workers.

Flat Groups versus Hierarchical Groups
  • Communication in a flat group.
  • Communication in a simple hierarchical group

Recovery
  • Once a failure has occurred, it is essential that
    the faulty process can recover to a correct state.
  • An error is that part of a system's state that may
    lead to a failure. There are two forms of error
    recovery: forward recovery and backward recovery.
  • In forward recovery, when the system has just
    entered an erroneous state, an attempt is made to
    bring the system into a correct new state from
    which it can continue to execute. The main problem
    is that it has to be known in advance which errors
    may occur.
  • Backward recovery brings the system from
    its present erroneous state back into a previous
    correct state.

  • For backward recovery, it is necessary to record
    the system's state from time to time. Each time
    (part of) the system's state is recorded, a
    checkpoint is said to be made.
  • Backward recovery techniques have been widely
    applied as a general error recovery mechanism, as
    they are often independent of any specific system
    or process.
  • Taking a checkpoint is often a costly operation.
    Thus many fault-tolerant DSs combine
    checkpointing with message logging.
  • In message logging, after a checkpoint has been
    taken, a process logs its messages before sending
    them off (for possible later replay), which makes
    it possible to restore a state that lies beyond
    the most recent checkpoint without the cost of
    checkpointing.
Summary
  • A system is fault tolerant if it can continue to
    operate in the presence of failures.
  • Several types of failures in DSs have been
    identified, such as crash failure, omission
    failure, timing failure, response failure, and
    Byzantine (arbitrary) failure.
  • Redundancy is the key technique needed to achieve
    fault tolerance. It can be applied to processes
    by having a group of processes work closely
    together to provide a service.
  • Recovery in fault-tolerant systems is invariably
    achieved by checkpointing the state of the system
    on a regular basis.

7. Security
  • Introduction and Cryptography
  • Security Channels Authentication
  • Security Channel Message Confidentiality and
    Integrity
  • Access Control
  • Security Management
  • Case Study

Learning Objectives
  • To become familiar with the range of security
    threats faced by networked and distributed
    systems (DSs)
  • To examine various cryptographic techniques
    fundamental to security in DSs, such as symmetric
    and asymmetric cryptosystems
  • To study in depth the two main parts of security
    in DSs, secure channels and authorization (access
    control), using the main techniques of encryption,
    authentication, and access control
  • To gain an understanding of the major methods in
    security management.

Introduction
  • The security problems in DSs arise from the
    openness of the Internet and distributed systems.
  • Security measures must be incorporated into
    computer systems whenever they are potential
    targets for malicious or mischievous attacks.
  • Security in computer systems is strongly related
    to the notion of dependability: a dependable
    system is one that we justifiably trust to
    deliver its services. Confidentiality and
    integrity are two major properties of such
    systems.
  • Confidentiality: Information is disclosed
    only to authorized parties.
  • Integrity: Alterations to a system's assets
    (hardware, software, data, etc.) can be made only
    in an authorized way.

Security Model Threats and forms of attack
  • Masquerading: assuming the identity of another
    user/principal
  • Eavesdropping (interception): obtaining private or
    secret information
  • Message tampering (modification): altering the
    content of messages in transit
  • Replaying (fabrication): storing secure messages
    and sending them at a later date
  • Denial of service (interruption): flooding a
    channel or other resource, denying access to
    others

Security Policy and Mechanisms
  • Security policy is a set of requirements and
    guidelines to ensure a desired level of security
    for the activities that are performed in the
  • Security mechanisms are employed to implement the
    security policy.
  • Security in DSs can be roughly divided into two
    major parts: secure channels and authorization.
  • Secure channel: ensures secure communication,
    including authentication, message confidentiality
    and integrity.
  • Authorization (access control): ensures that a
    process gets only those access rights to the
    resources in a DS that it is entitled to.

Introduction Secure channels
  • Properties
  • Each process is sure of the identity of the other
  • Data is private and protected against
    eavesdropping (confidentiality)
  • Protection against alteration of data
  • Employs cryptographic techniques
  • Authentication based on proof of ownership of
  • Confidentiality and integrity based on
    cryptographic techniques

Introduction Authorization
  • Object (or resource)
  • Mailbox, system file, part of a commercial web
  • Principal
  • User or process that has authority (rights) to
    perform actions
  • Identity of principal is important

Important Security Mechanisms
  • Encryption: Using cryptographic techniques,
    encryption transforms data into something an
    attacker cannot understand (for confidentiality).
    It also provides support for integrity checks.
  • Authentication: It is used to verify the claimed
    identity of a user, client, server and so on.
  • Authorization: It is necessary to check whether a
    client is authorized to perform the action
    requested.
  • Auditing: It is used to trace which clients
    accessed what, and in which way, for later
    security analysis.

Cryptography
  • Fundamental to security in DSs is the use of
    cryptographic techniques: the sender first
    encrypts message P (plaintext) into an
    unintelligible message C (ciphertext), and then
    sends C to the receiver, who must decrypt C into
    its original form P.
  • Encryption and decryption are achieved by using
    cryptographic methods parameterized by keys, as
    shown in the next slide, which can protect
    against eavesdropping (for confidentiality) and
    tampering (for integrity).
  • Notations
  • C = E_K(P): ciphertext C is obtained by
    encrypting plaintext P using key K in a
    cryptographic method (encryption function).
  • P = D_K(C): plaintext P is obtained by
    decrypting ciphertext C using key K in a
    cryptographic method.

Cryptography (1)
  • Intruders and eavesdroppers in communication.

Cryptography Symmetric Cryptosystem
  • Message P, key K, published encryption functions
    E, D
  • Symmetric (secret key)
  • C = E_K(P); P = D_K(E_K(P))
  • Same key K for E and D
  • P must be hard (infeasible) to compute if K is
    not known
  • The usual form of attack is brute force: try all
    possible key values for a known pair P, C.
    Resisted by making K sufficiently large, e.g. 128
    bits.
Cryptography Asymmetric Cryptosystem
  • Message P, keys Ke, Kd, published encryption
    functions E, D
  • Asymmetric (public key)
  • Separate encryption and decryption keys Ke and
    Kd
  • C = E_Ke(P); P = D_Kd(E_Ke(P))
  • One of the keys is kept private, the other is made
    public
  • Depends on the use of a trap-door function to
    make the keys. A trap-door function is a one-way
    function with a secret exit, e.g. the product of
    two large numbers is easy to compute but very hard
    (infeasible) to factorize
  • E has high computational cost
  • Very large keys: > 512 bits.
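
The relationship P = D_Kd(E_Ke(P)) can be illustrated with textbook RSA on deliberately tiny numbers. This is a sketch for intuition only: the primes 61 and 53 and the exponent 17 are our own choice, real keys are hundreds of digits long, and real systems add padding that is omitted here. It assumes Python 3.8 or later for the modular inverse via `pow`.

```python
# Two small primes; their product n is public, the factors stay secret.
p, q = 61, 53
n = p * q                      # 3233
phi = (p - 1) * (q - 1)        # 3120, computable only if p and q are known

Ke = 17                        # public (encryption) key, coprime with phi
Kd = pow(Ke, -1, phi)          # private (decryption) key: inverse of Ke mod phi


def E(message: int) -> int:
    """Encrypt with the public key Ke."""
    return pow(message, Ke, n)


def D(ciphertext: int) -> int:
    """Decrypt with the private key Kd."""
    return pow(ciphertext, Kd, n)


P = 65
C = E(P)
recovered = D(C)               # equals P
```

The trap door is the factorization of n: anyone can compute E from (Ke, n), but deriving Kd requires phi, and hence p and q.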

Cryptography (2)
  • Other notations used in this chapter.

Cryptography Examples of Symmetric Cryptographic
Algorithm
  • DES: The US Data Encryption Standard (1977). No
    longer strong in its original form. 56-bit key,
    operating on 64-bit blocks of data.
  • TEA: A simple but effective algorithm developed
    at Cambridge U (1994) for teaching and
    explanation. 128-bit key.
  • Triple-DES: Applies DES three times with two
    different keys. 112-bit key.
  • IDEA: International Data Encryption Algorithm
    (1990). Resembles TEA. 128-bit key.
  • AES: The US Advanced Encryption Standard (called
    for in 1997, standardized in 2001).
    128/192/256-bit key.
  • There are also many other effective algorithms.

Cryptography Examples of Asymmetric
Cryptographic Algorithm
  • RSA: The first practical algorithm (Rivest,
    Shamir and Adleman, 1978) and still the most
    frequently used. Key length is variable, 512-2048
    bits. The security of RSA comes from the fact
    that no methods are known to efficiently find the
    prime factors of large numbers.
  • Elliptic curve: A recently developed method,
    with shorter keys and faster operation.
  • Asymmetric algorithms are about 1000 times
    slower than symmetric ones and are therefore not
    practical for bulk encryption, but their other
    properties make them ideal for key distribution
    and for authentication.
Security Channels
  • The client-server model has been used as a
    convenient way to organize a DS. When considering
    security in DSs, it is also useful to think in
    terms of clients and servers.
  • Two major issues in security: secure channels and
    authorization (access control).
  • A secure channel is achieved by
    authentication and by protecting message
    confidentiality and integrity.
  • Intuitively, authentication and message
    integrity are strongly related and should go
    together (why?).
  • A session key is a shared key that is generally
    used only for as long as the channel exists. It
    is commonly used with secret-key cryptography to
    ensure message confidentiality and integrity
    after authentication.

Security Channels Authentication Based on Shared
Secret Key
  • Authentication based on a shared secret key:
    Suppose that both Alice (A) and Bob (B) have a
    shared secret key KA,B (how they obtain KA,B will
    be discussed later). A protocol known as the
    challenge-response protocol is shown in the next
    slide.
  • In this protocol, after B receives A's identity
    and request for setting up a communication
    channel between A and B (Message 1), it sends a
    challenge RB (Message 2; it could be a random
    number) to A. A is required to encrypt the
    challenge with KA,B and return KA,B(RB) (Message
    3) to B. A also sends a challenge RA (Message 4),
    which B responds to by returning KA,B(RA)
    (Message 5). If A can decrypt KA,B(RA) and B can
    decrypt KA,B(RB), each recovering its own
    challenge, then they can be sure about each
    other's identity.
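
The five-message exchange can be sketched in a few lines. Note the substitution: instead of encrypting the challenge as the protocol above does, this sketch answers a challenge with an HMAC keyed by the shared secret, because HMAC is available in the Python standard library. Either way, only a holder of KA,B can produce a valid response. All function and variable names are our own.

```python
import hashlib
import hmac
import secrets


def respond(shared_key: bytes, challenge: bytes) -> bytes:
    """Stand-in for K_AB(R): a keyed digest only a key holder can compute."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()


def verify(shared_key: bytes, challenge: bytes, response: bytes) -> bool:
    """Check a response against the challenge we issued ourselves."""
    return hmac.compare_digest(respond(shared_key, challenge), response)


K_AB = secrets.token_bytes(16)          # secret shared by Alice and Bob

# Message 1: A -> B: "I am Alice" (identity only, proves nothing yet)
R_B = secrets.token_bytes(16)           # Message 2: B -> A: challenge R_B
msg3 = respond(K_AB, R_B)               # Message 3: A -> B: K_AB(R_B)
R_A = secrets.token_bytes(16)           # Message 4: A -> B: challenge R_A
msg5 = respond(K_AB, R_A)               # Message 5: B -> A: K_AB(R_A)

bob_accepts_alice = verify(K_AB, R_B, msg3)
alice_accepts_bob = verify(K_AB, R_A, msg5)

# An intruder without K_AB cannot answer Bob's challenge.
chuck_response = respond(secrets.token_bytes(16), R_B)
bob_accepts_chuck = verify(K_AB, R_B, chuck_response)
```

Each side checks only the response to the challenge it generated itself, which is what ties a response to the current run of the protocol.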

Authentication Based on Shared Secret Key (1)
  • Authentication based on a shared secret key.

Security Channels Authentication Using a Key
Distribution Center (KDC)
  • The Key Distribution Center (KDC) shares a secret
    key with each of the hosts, e.g., the KDC has a
    shared secret key with A, KA,KDC. A system with
    N hosts now needs to manage only N keys, instead
    of the N(N-1)/2 keys needed when every pair of
    hosts shares its own key.
  • The principle of using a KDC is shown in Slide
    31. A first sends a message to the KDC, telling
    it that she wants to talk to B. The KDC returns a
    shared secret key KA,B (encrypted with the secret
    key KA,KDC). The KDC also sends KA,B to B
    (encrypted with KB,KDC).
  • A may want to start setting up a secure channel
    with B even before B has received the shared key
    from the KDC. The KDC may instead pass
    KB,KDC(KA,B), called a ticket, to A and let A take
    care of connecting to B, as shown in Slide 32.
    This is actually a variation of the well-known
    Needham-Schroeder authentication protocol in
    Slide 33.

Authentication Using a Key Distribution Center (1)
  • The principle of using a KDC.

Authentication Using a Key Distribution Center (2)
  • Using a ticket and letting Alice set up a
    connection to Bob.

Authentication Using a Key Distribution Center (3)
  • The Needham-Schroeder authentication protocol.

Security Channels Authentication Using a Key
Distribution Center (KDC)
  • In the Needham-Schroeder protocol in Slide 33, the
    challenge RA1 in Message 1 from A to the KDC is
    also known as a nonce: a random number that is
    used only once.
  • A nonce is mainly used to uniquely relate two
    messages to each other, such as Message 1 and
    Message 2 in the protocol, which can be used to
    prevent replay attacks.
  • The problem with the Needham-Schroeder protocol:
    if the intruder, Chuck (C), gets hold of an old
    key KA,B, he can replay Message 3 and get B to
    set up a channel. To prevent this, we need to
    relate Message 3 to Message 1 so that the replay
    is detected.
Security Channels Authentication Using Public
Key Cryptography
  • Consider the situation in which A wants to set up
    a secure channel to B, and both A and B have each
    other's public key. A typical authentication
    protocol based on public keys is shown in the next
    slide.
  • In the protocol, A first sends a challenge RA to
    B, encrypted with B's public key. Only B can
    decrypt the message.
  • After B receives A's request, he returns the
    decrypted challenge, along with his own challenge
    to authenticate A and a newly generated session
    key KA,B. These are put into one message, which
    is encrypted with A's public key.
  • A finally returns her response to B's challenge
    using the session key, proving her identity by
    showing that she could decrypt Message 2.

Authentication Using Public-Key Cryptography
  • Mutual authentication in a public-key
    cryptosystem.

Security Overview
  • Two major parts of security in DSs:
  • Secure channel: authentication, message
    confidentiality, and message integrity
  • Authorization (access control): access control
    lists and capabilities
  • Cryptography: symmetric (secret key) cryptosystems
    and asymmetric (public key) cryptosystems
  • Examples of symmetric (secret key) cryptosystem
    algorithms (DES, TEA, etc.) and asymmetric (public
    key) cryptosystem algorithms (RSA, elliptic curve)

  • Secure channel
  • Authentication: based on a shared secret key,
    on a Key Distribution Center (Needham-Schroeder
    protocol), or on public-key cryptography
  • Message confidentiality: using cryptography
  • Message integrity: using digital signatures
    based on cryptography
  • Authorization (access control): access control
    lists and capabilities
  • Security management: key distribution and
    authorization management
  • Case study: the Kerberos system

Security Channels Message Confidentiality and
Integrity
  • In addition to authentication, a secure channel
    should also provide guarantees for message
    confidentiality and integrity.
  • Message confidentiality can be achieved by simply
    encrypting a message before sending it, which can
    be done either with a secret key or with the
    receiver's public key.
  • Protecting message integrity is more complicated.
    A digital signature is often used to ensure
    message integrity.
  • A message m can be signed by a principal A by
    encrypting a copy of m with a key KA, and
    attaching it to a plaintext copy of m (as well as
    A's identifier).

Digital Signatures
  • Requirements of digital signatures
  • To authenticate stored document files as well as
  • To protect against forgery
  • To prevent the signer from repudiating a signed
    document (denying their responsibility).
  • One popular form is to use a public-key
    (asymmetric) cryptosystem.
  • The public-key method is particularly well suited
    to generating digital signatures, as it is
    relatively simple and does not require any
    communication between the sender and receiver.

Digital Signatures with Public Keys
  • When A (Alice) sends a message m to B (Bob), she
    encrypts it with her private key KA-, giving the
    signed version KA-(m), which is sent along with m.
  • If A also wants to keep the message content
    secret, she can use B's public key KB+ to encrypt
    the message, that is, send KB+(m, KA-(m)) to B.
  • When the message arrives at B, B first decrypts
    it using his own private key KB- to get (m,
    KA-(m)). He then uses A's public key KA+ to
    decrypt the signed version of m (i.e., KA-(m)),
    and compares KA+(KA-(m)) with m.
  • If they are the same, it means that the message
    really came from A, and has not been modified.

Digital Signatures (1)
  • Digitally signing a message using public-key
    cryptography.

Digital Signatures with Public Keys
  • Major points in using public keys:
  • B (Bob) must be sure that the public key he has
    is indeed owned by A (Alice). This is handled
    with certificates, described later.
  • The validity of A's signature holds only as
    long as A's private key remains secret, so A must
    keep the private key secret.
  • Once A changes her private key, her signed
    statements sent to B become worthless. This can
    be handled by a central authority that keeps
    track of the keys.
  • A encrypts the entire message, which may be very
    long, resulting in a high cost. This can be
    reduced by signing a message digest, shown in the
    next slide.

Digital Signatures with Public Keys
  • Instead of encrypting the message M, we encrypt
    its secure digest.
  • A secure digest function computes a fixed-length
    hash H(M) that characterizes the document M.
  • H(M) should be:
  • Fast to compute: easy to compute H(M) given M
  • Hard to invert: hard to compute M given H(M)
  • Given M and H(M), it is computationally
    infeasible to find another M', M' ≠ M, such that
    H(M') = H(M)
Digital Signatures with Secure Digest Functions
  • Some popular digests (hash functions):
  • MD5: Developed by Rivest (1992). Computes a
    128-bit digest.
  • SHA (1995): Based on Rivest's MD4 but made
    more secure by producing a 160-bit digest.
  • Digitally signing a message using a message
    digest is shown in the next slide.
  • A first computes a message digest H(m), encrypts
    it using her private key, and sends KA-(H(m)),
    together with m, to B. B then decrypts KA-(H(m))
    using A's public key and compares the result
    with the digest he calculates from m himself. If
    they match, B knows that the message has been
    signed by A.
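
Signing a digest rather than the whole message can be sketched with toy RSA. This is an illustration under stated assumptions, not a usable signature scheme: the two Mersenne primes are picked only because they are well-known primes, the modulus is far too small and too structured for real use, no padding is applied, and SHA-256 is our choice of digest. It assumes Python 3.8 or later.

```python
import hashlib

# Toy key pair for Alice (hypothetical values, insecure by design).
p, q = 2**61 - 1, 2**89 - 1                      # two known Mersenne primes
n = p * q
KA_plus = 65537                                   # Alice's public key
KA_minus = pow(KA_plus, -1, (p - 1) * (q - 1))    # Alice's private key


def H(message: bytes) -> int:
    """Message digest, reduced so that it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message: bytes) -> int:
    """Alice computes KA-(H(m)) with her private key."""
    return pow(H(message), KA_minus, n)


def verify(message: bytes, signature: int) -> bool:
    """Bob applies KA+ to the signature and compares with his own H(m)."""
    return pow(signature, KA_plus, n) == H(message)


m = b"I owe Bob 10 dollars"
signature = sign(m)

genuine = verify(m, signature)
tampered = verify(b"I owe Bob 90 dollars", signature)
```

Only the short digest goes through the expensive private-key operation, and any change to m makes the recomputed digest disagree with the signed one.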

Digital Signatures (2)
  • Digitally signing a message using a message

Access Control
  • In the client-server model, the client can issue
    requests that are to be carried out by the server
    once a secure channel between client and server
    has been set up.
  • Requests from a client involve carrying out
    operations on resources controlled by the server.
    Such requests can be carried out only if the
    client has sufficient access right for the
  • Verifying access rights is referred to as access
    control, whereas authorization is about granting
    access rights. The two terms are strongly related
    and are often used interchangeably.

Access Control ACL and Capability
  • Access control list (ACL): Each object maintains
    a list (ACL) of the access rights of subjects who
    want to access the object (e.g. Unix file access
    permissions). It corresponds to the access control
    matrix being distributed column-wise, with empty
    entries left out.
  • Using an ACL, when a client sends a request to a
    server to access an object, the server's
    reference monitor checks whether it knows the
    client and whether the client has the right to
    access the object by checking the object's ACL.
  • Capabilities: A capability corresponds to an
    entry in the access control matrix (the access
    control matrix is distributed row-wise).

Access Control ACL and Capability
  • A capability is similar to a ticket (or a key)
    its holder is given rights associated with the
    ticket. It should be protected against
  • One way to protect capabilities is to use the
    issuer's signature. A capability may have the
    format <resource id, permitted operations,
    checking code>, with the signature in the checking
    code.
  • Using capabilities, a client simply sends its
    request to the server, which does not check the
    client's identity (why?). The server need only
    check whether the capability is valid and whether
    the requested operation is listed in the
    capability.
  • Problems with capabilities: eavesdropping,
    difficulty of cancellation, etc.
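
The two approaches can be contrasted in a short sketch. Everything here is hypothetical and chosen for illustration: the `ACL` dictionary, the resource name, the server key, and the use of an HMAC as the checking code that carries the issuer's signature.

```python
import hashlib
import hmac

# ACL: one column of the access control matrix, stored with the object.
ACL = {"report.txt": {"alice": {"read", "write"}, "bob": {"read"}}}


def acl_check(subject: str, resource: str, operation: str) -> bool:
    """The server must know who the client is and look it up in the ACL."""
    return operation in ACL.get(resource, {}).get(subject, set())


SERVER_KEY = b"hypothetical issuer secret"


def checking_code(resource: str, operations) -> str:
    """Issuer's signature over <resource id, permitted operations>."""
    data = (resource + "|" + ",".join(sorted(operations))).encode()
    return hmac.new(SERVER_KEY, data, hashlib.sha256).hexdigest()


def issue_capability(resource: str, operations) -> tuple:
    ops = frozenset(operations)
    return (resource, ops, checking_code(resource, ops))


def capability_check(capability, resource: str, operation: str) -> bool:
    """No identity check: validity of the capability is all that matters."""
    cap_resource, cap_ops, code = capability
    valid = hmac.compare_digest(code, checking_code(cap_resource, cap_ops))
    return valid and cap_resource == resource and operation in cap_ops


cap = issue_capability("report.txt", {"read"})
# A holder tries to widen its rights without re-signing the capability.
forged = ("report.txt", frozenset({"read", "write"}), cap[2])
```

With the ACL the server decides based on who is asking; with the capability it decides based on what the client presents, which is also why a stolen capability is as good as the original.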

ACL and capability
  • Comparison between ACLs and capabilities for
    protecting objects.
  • Using an ACL
  • Using capabilities.

Security Management Key Distribution
  • Security management includes key distribution and
    certification, and authorization management.
  • In a symmetric cryptosystem (using secret keys),
    the initial shared secret key must be delivered
    along a secure channel that provides
    authentication and confidentiality. If no key is
    available to set up the initial secure channel,
    some means of communication other than the
    network should be used, such as snail mail.
  • For an asymmetric cryptosystem (using public
    keys), the private key needs to be delivered in
    the same way as a secret key. The public key
    should be distributed in such a way that the
    receivers can be sure that the key is indeed
    paired with the claimed private key.

Case Study Kerberos
  • Secures communication with servers on a local
    network
  • Developed at MIT in the 1980s to provide security
    across a large campus network (> 5000 users)
  • Based on the Needham-Schroeder protocol
  • Standardized and now included in many operating
    systems
  • Internet RFC 1510, OSF DCE
  • BSD UNIX, Linux, Windows 2000, NT, XP, etc.
  • Available from MIT
  • The Kerberos server creates a shared secret key
    for any required server and sends it (encrypted)
    to the user's computer.

Case Study Kerberos
  • Kerberos can be viewed as a security system that
    assists clients in setting up a secure channel
    with a server. Security is based on shared secret
    keys.
  • There are two components: the Authentication
    Server (AS), which authenticates a user and
    provides a key used to set up secure channels
    with servers, and the Ticket Granting Server
    (TGS), which hands out tickets used to convince a
    server of the identity of the client.
  • The Kerberos system can be explained by the figure
    in the next slide, in which Alice (A) wants to set
    up a secure channel with Bob (B) using Kerberos.

Example Kerberos (1)
  • Authentication in Kerberos.

Case Study Kerberos
  • A logs onto the system using any workstation
    available (Message 1). The workstation then sends
    her name (plaintext) to the AS (Message 2).
  • The AS then returns a session key KA,TGS and a
    ticket KAS,TGS(A, KA,TGS) for her to hand over
    to the TGS (Message 3). Message 3 is encrypted
    with the secret key KA,AS shared between A and
    the AS.
  • After the workstation receives the response
    (Message 3) from the AS, it prompts A for her
    password (Message 4 and Message 5 respectively)
    and uses the password to generate the key KA,AS.
    After that, A's password can be discarded.
  • A can now consider herself logged into the
    system, and she can contact other users or
Case Study Kerberos
  • Assume that A now wants to talk to B. She
    requests the TGS to generate a session key for B
    (Message 6).
  • Message 6 contains the ticket KAS,TGS(A, KA,TGS)
    (to prove she is A) and a timestamp t encrypted
    with key KA,TGS. The timestamp is used to prevent
    intruders from maliciously replaying Message 6
    in an attempt to set up a channel to B. If the
    timestamp differs by more than a few minutes from
    the current time, the request for a ticket is
    rejected.
  • The TGS then responds with a session key KA,B,
    again encapsulated in a ticket that A will later
    have to pass to B (Message 7).
  • Setting up a secure channel with B is
    straightforward, and is shown in the next slide.
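
The timestamp check that rejects stale copies of Message 6 can be sketched as follows. The five-minute window and the function name are assumptions made for illustration; the text above only says "a few minutes".

```python
MAX_SKEW_SECONDS = 5 * 60   # hypothetical acceptance window


def is_fresh(timestamp: float, now: float,
             max_skew: float = MAX_SKEW_SECONDS) -> bool:
    """Accept a request only if its timestamp is close to the current time."""
    return abs(now - timestamp) <= max_skew


request_time = 1_000_000.0                                # when A created Message 6
prompt_delivery = is_fresh(request_time, request_time + 30)
late_replay = is_fresh(request_time, request_time + 3600)
```

The check depends on the clocks of the client and the TGS being roughly synchronized, and it narrows the replay window rather than closing it: a copy replayed within the window would still pass this test.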

Example Kerberos (2)
  • Setting up a secure channel in Kerberos.

Summary I
  • It is essential to protect the resources,
    communication channels and interfaces of
    distributed systems and applications against any
  • This is achieved by the use of secure channels
    and authorization (access control) mechanisms.
  • A secure channel ensures secure communication
    providing authentication, message confidentiality
    and integrity.
  • Public-key and secret-key cryptography provide
    the basis for setting up secure channels. It is
    common practice to use public-key (asymmetric)
    cryptography for distributing short term shared
    secret keys (session keys).

Summary II
  • Authorization (access control) deals with
    protecting resources in such a way that only
    processes that have the proper access rights can
    actually access and use those resources. Access
    control always takes place after authentication.
  • There are two ways of implementing access
    control: access control lists (ACLs) and
    capabilities.
  • Two important issues in security management are
    key management and authorization management.
  • Kerberos is a widely used security system based
    on shared secret keys. Its main focus is on
    authentication, although it also incorporates
    protocols for access control and delegation of
    access rights.

  • Q1. Which mechanism generally prevents each of
    the following security attacks: masquerading
    (impersonation), eavesdropping, message
    tampering, replaying, and denial of service?
  • Q2. Discuss the differences between using a
    secret key and using a public key for secure
    communication. What are the advantages and
    disadvantages of the two approaches?
  • Q3. Assume that Alice wants to send a message m
    to Bob. Instead of encrypting m with Bob's public
    key KB+, she generates a session key KA,B and
    then sends KA,B(m), KB+(KA,B). Why
    is this scheme generally better? (Hint: consider
    performance issues.)

  • Q4. What is the major problem in using a shared
    secret key for authentication? How can
    authentication based on a Key Distribution Center
    (KDC) be used to overcome it?
  • Q5. Would it be safe to join message 3 and
    message 4 in the authentication protocol shown in
    Slide 3 into KA,B(RB,RA)?
  • Q6. Why is it not necessary in the figure in
    Slide 4 for the KDC to know for sure it was
    talking to Alice when it receives a request for a
    secret key that Alice can share with Bob?

  • Q1. Can we safely adapt the authentication
    protocol shown in the figure in the next slide,
    such that message 3 consists only of RB?
  • Q2. Can a secret key be used for digital
    signatures? List the main features if a secret
    key is used for digital signatures.
  • Q3. Initial exchanges of public keys are
    vulnerable to the man-in-the-middle attack.
    Describe as many ways against it as you can.
  • Q4. Does it make sense to restrict the lifetime
    of a session key? If so, give an example how that
    could be established.

  • Q1: Mutual authentication in a public-key
    cryptosystem.

  • Q5. How are ACLs implemented in a UNIX file
    system?
  • Q6. In message 2 of the Needham-Schroeder
    authentication protocol, the ticket is encrypted
    with the secret key shared between Alice and the
    KDC. Is this encryption necessary?
  • Q7. Complete the figure in the next slide by
    adding the communication for authentication
    between Alice and Bob.

  • Q7 Authentication in Kerberos.
