Transcript and Presenter's Notes

Title: Practical Byzantine Fault Tolerance


1
Practical Byzantine Fault Tolerance
2
Byzantine Generals Problem
3
Motivation
  • Malicious attacks
  • Software errors
  • Faulty nodes exhibit Byzantine (arbitrary)
    behavior

4
Failures of previous algorithms
  • Theoretically feasible but inefficient in
    practice
  • Assume synchrony: known bounds on message
    delays and process speeds
  • Relying on synchrony for correctness opens the
    door to denial-of-service attacks

5
Improvements in this algorithm
  • Does not rely on synchrony for safety
  • More than an order of magnitude performance
    improvement over previous algorithms
  • One message round trip for read-only operations
    and two for read-write operations
  • Efficient authentication using message
    authentication codes (MACs) instead of
    public-key cryptography

6
Contributions of the paper
  • First state machine replication protocol that
    survives Byzantine faults in asynchronous
    networks
  • Important optimizations
  • Implementation of a Byzantine fault tolerant
    distributed file system (BFS)
  • Experiments that measure the cost of the
    replication technique

7
System Model and Assumptions
  • Asynchronous distributed system: nodes connected
    by a network
  • Nodes fail independently
  • Cryptographic techniques: public-key signatures,
    MACs, and message digests produced by
    collision-resistant hash functions
  • Messages are signed by signing the digest of the
    message and appending it to the plaintext (see
    the sketch below)
  • All replicas know each other's public keys to
    verify signatures

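A minimal Python sketch of the digest-then-sign scheme above. SHA-256 and the sign callback are stand-ins: the slides only specify a collision-resistant hash function and public-key signatures.

    import hashlib

    def digest(m: bytes) -> bytes:
        # D(m): collision-resistant message digest (SHA-256 as a stand-in)
        return hashlib.sha256(m).digest()

    def signed_message(m: bytes, sign) -> tuple:
        # Sign the digest, not the full message, and append the signature
        # to the plaintext; `sign` is a hypothetical signing function
        # supplied by a crypto library.
        return (m, sign(digest(m)))
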
8
Service Properties
  • Safety
  • - the service satisfies linearizability: it
    behaves like a centralized system that executes
    operations atomically, one at a time
  • - holds regardless of how many clients are faulty
  • - linearizability alone is insufficient to guard
    against faulty clients
  • - damage from faulty clients is limited by
    access-control mechanisms

9
  • Liveness
  • - must rely on synchrony (liveness cannot be
    guaranteed in a fully asynchronous system)
  • - clients eventually receive replies to their
    requests if at most ⌊(n-1)/3⌋ replicas are faulty
    and delay(t) does not grow faster than t
    indefinitely
  • - requires n > 3f, i.e. at least 3f+1 replicas to
    tolerate f faults (see the sketch below)

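A worked example (not from the slides) of the n > 3f bound: the largest number of Byzantine faults an n-replica system tolerates, and the quorum sizes used later in the protocol, follow directly from n.

    def max_faults(n: int) -> int:
        # Largest f such that n > 3f, i.e. f = floor((n - 1) / 3)
        return (n - 1) // 3

    for n in (4, 7, 10):
        f = max_faults(n)
        print(f"n={n}: f={f}, reply quorum f+1={f + 1}, "
              f"prepare/commit quorum 2f+1={2 * f + 1}")
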
10
The Algorithm
  • A client sends a request to invoke a service
    operation to the primary
  • The primary multicasts the request to the backups
  • Replicas execute the request and send a reply to
    the client
  • The client waits for f+1 replies from different
    replicas with the same result; this is the result
    of the operation (sketched below)

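A sketch of that client-side wait, under an assumed shape for the reply data (a hypothetical helper, not the paper's code): a result is accepted once f+1 different replicas report it, since at most f replies can come from faulty replicas.

    def result_from_replies(replies, f: int):
        # replies: iterable of (replica_id, result) pairs
        votes = {}  # result -> set of replica ids that reported it
        for replica_id, result in replies:
            votes.setdefault(result, set()).add(replica_id)
            if len(votes[result]) >= f + 1:
                # f+1 matching replies from *different* replicas: at
                # least one is from a non-faulty replica
                return result
        return None  # not enough matching replies yet
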
11
Normal Case Operation
  • State of a replica: the state of the service, a
    message log containing messages the replica has
    accepted, and a number denoting the replica's
    current view (sketched below)
  • The primary, p, accepts a request message m and
    starts a three-phase protocol: pre-prepare,
    prepare, commit
  • Pre-prepare and prepare order requests sent in
    the same view even if the primary is faulty
  • Prepare and commit ensure that requests that
    commit are ordered across views

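A sketch of that per-replica state (the field names are assumptions; the slide only names the three components). The rule that the primary of view v is replica v mod n is from the PBFT paper.

    from dataclasses import dataclass, field

    @dataclass
    class Replica:
        service_state: dict = field(default_factory=dict)  # service state
        log: list = field(default_factory=list)  # accepted messages
        view: int = 0                            # current view number

        def primary(self, n_replicas: int) -> int:
            # In PBFT the primary of view v is replica v mod n
            return self.view % n_replicas
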
12
(No Transcript)
13
  • In the pre-prepare phase, the primary assigns a
    sequence number n to the request, multicasts a
    pre-prepare message with m piggybacked to all the
    backups, and appends the message to its log:
    ⟨⟨PRE-PREPARE, v, n, d⟩σ_p, m⟩
  • A backup accepts a pre-prepare message provided
    (see the sketch below):
  • The signatures in the request and the pre-prepare
    message are correct and d is the digest of m
  • It is in view v
  • It has not accepted a pre-prepare message for
    view v and sequence number n containing a
    different digest
  • The sequence number in the pre-prepare message is
    between a low water mark h and a high water mark
    H

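The four acceptance conditions as a runnable sketch. The Backup class and its fields are hypothetical, and signature verification is abstracted to a boolean:

    import hashlib

    def digest(m: bytes) -> bytes:
        return hashlib.sha256(m).digest()

    class Backup:
        def __init__(self, view: int, h: int, k: int):
            self.view = view
            self.h, self.H = h, h + k   # low and high water marks
            self.accepted = {}          # (view, seq) -> accepted digest

        def accept_pre_prepare(self, v, n, d, m, sig_ok: bool) -> bool:
            if not sig_ok or digest(m) != d:       # signatures and digest
                return False
            if self.view != v:                     # backup is in view v
                return False
            if self.accepted.get((v, n), d) != d:  # no conflicting digest
                return False
            if not (self.h < n < self.H):          # within water marks
                return False
            self.accepted[(v, n)] = d
            return True
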
14
  • If backup i accepts the ⟨⟨PRE-PREPARE, v, n, d⟩σ_p, m⟩
    message, it enters the prepare phase by
    multicasting a ⟨PREPARE, v, n, d, i⟩σ_i message to
    all other replicas, and adds both messages to its
    log
  • A replica accepts a prepare message if its
    signature is correct, its view number equals the
    replica's current view, and h < seq no < H
  • Def: prepared(m, v, n, i) is true iff replica i
    has inserted in its log the request m, a
    pre-prepare for m in view v with sequence number
    n, and 2f prepares from different backups that
    match the pre-prepare. Replicas check that the
    prepares match the pre-prepare by comparing view,
    sequence number, and digest (see the sketch
    below).

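The prepared predicate from the definition above, sketched over an assumed log format (plain tuples, not the paper's data structures):

    import hashlib

    def digest(m: bytes) -> bytes:
        return hashlib.sha256(m).digest()

    def prepared(log, m: bytes, v: int, n: int, f: int) -> bool:
        # Assumed log entries: ("request", m) for requests and
        # (phase, view, seq, digest, sender) for protocol messages
        d = digest(m)
        has_request = ("request", m) in log
        has_pre_prepare = any(e[:4] == ("pre-prepare", v, n, d) for e in log)
        # Matching prepares: same view, sequence number and digest,
        # from 2f *different* backups
        senders = {e[4] for e in log if e[:4] == ("prepare", v, n, d)}
        return has_request and has_pre_prepare and len(senders) >= 2 * f
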
15
  • Replica i multicasts ⟨COMMIT, v, n, D(m), i⟩σ_i to
    the other replicas when prepared(m, v, n, i)
    becomes true
  • This begins the commit phase
  • Replicas accept commit messages and insert them in
    their log if they are properly signed, the view
    number in the message equals the replica's current
    view, and h < seq no < H

16
  • Def: committed(m, v, n) is true iff
    prepared(m, v, n, i) is true for all i in some set
    of f+1 non-faulty replicas
  • Def: committed-local(m, v, n, i) is true iff
    prepared(m, v, n, i) is true and i has accepted
    2f+1 commits from different replicas that match
    the pre-prepare for m
  • A commit matches a pre-prepare if they have the
    same view, sequence number, and digest
  • If committed-local(m, v, n, i) is true for some
    non-faulty i, then committed(m, v, n) is true
  • This ensures that non-faulty replicas agree on the
    sequence numbers of requests that commit locally,
    even if they commit in different views at each
    replica
  • Also, any request that commits locally at a
    non-faulty replica will eventually commit at f+1
    or more non-faulty replicas
  • Each replica i executes the operation requested by
    m after committed-local(m, v, n, i) is true and
    its state reflects the sequential execution of all
    requests with lower sequence numbers; this ensures
    that all non-faulty replicas execute requests in
    the same order, as required by the safety property
    (sketched below)
  • After executing the requested operation, replicas
    send a reply to the client

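Continuing the same assumed log format, committed-local and the execution rule look roughly like this (a sketch; prepared is the predicate from slide 14, and the replica object with log, last_executed, and execute is hypothetical):

    def committed_local(log, d: bytes, v: int, n: int, f: int,
                        is_prepared: bool) -> bool:
        # is_prepared: result of prepared(m, v, n, i) from slide 14.
        # Assumed commit entries: ("commit", view, seq, digest, sender)
        senders = {e[4] for e in log if e[:4] == ("commit", v, n, d)}
        return is_prepared and len(senders) >= 2 * f + 1

    def maybe_execute(replica, m, d, v, n, f, is_prepared):
        # Execute only once committed-local holds *and* every request
        # with a lower sequence number has executed, so all non-faulty
        # replicas apply operations in the same order
        if (committed_local(replica.log, d, v, n, f, is_prepared)
                and replica.last_executed == n - 1):
            replica.execute(m)
            replica.last_executed = n
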
17
Garbage Collection
  • Proofs of state correctness are generated
    periodically
  • Checkpoints and stable checkpoints (a checkpoint
    with a proof of 2f+1 matching checkpoint messages)
  • Discards pre-prepare, prepare, and commit messages
    with sequence number ≤ n from the log, along with
    earlier checkpoints and checkpoint messages
  • Low water mark h = sequence number of the last
    stable checkpoint
  • High water mark H = h + k, where k is big enough
    that replicas do not stall waiting for a
    checkpoint to become stable (see the sketch below)

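A sketch of the water-mark bookkeeping when a checkpoint becomes stable. The log-entry shapes follow the earlier sketches, and the value of k is an assumption (the slide only says "big enough"):

    def on_stable_checkpoint(replica, n: int, k: int = 200):
        # The checkpoint at sequence number n is now backed by 2f+1
        # matching checkpoint messages, so older messages can go
        replica.log = [e for e in replica.log
                       if e[0] == "request" or e[2] > n]  # drop seq <= n
        replica.h = n        # low water mark = last stable checkpoint
        replica.H = n + k    # high water mark = h + k
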
18
View Changes
  • VIEW-CHANGE message
  • NEW-VIEW message

19
Correctness
  • Safety
  • - holds if all non-faulty replicas agree on the
    sequence numbers of requests that commit locally
  • - shown for requests that commit in the same view
  • - and for requests that commit in different views

20
  • Liveness
  • - avoid starting a view change too soon
  • - prevent starting the next view change too late
  • - faulty replicas are unable to impede progress by
    forcing frequent view changes

21
Optimizations
  • Avoiding sending most large replies: one replica
    sends the full reply, the others send only digests
  • Reduces the number of message delays for an
    operation from 5 to 4
  • Improves the performance of read-only operations
    that do not modify the service state
  • Cryptography
  • - digital signatures are used only for view-change
    and new-view messages
  • - all other messages are authenticated with
    message authentication codes (MACs), as sketched
    below
  • - MACs are faster to compute and need less storage

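A sketch of MAC-based authentication for the normal-case messages. HMAC-SHA256 is a stand-in; the slide only says that MACs replace digital signatures outside of view changes:

    import hashlib
    import hmac

    def mac(key: bytes, message: bytes) -> bytes:
        # Shared-key MAC: far cheaper to compute and store than a
        # public-key signature
        return hmac.new(key, message, hashlib.sha256).digest()

    def verify(key: bytes, message: bytes, tag: bytes) -> bool:
        # Constant-time comparison of the expected and received tags
        return hmac.compare_digest(mac(key, message), tag)
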
22
Implementation and Results
  • BFS, a Byzantine-fault-tolerant NFS service built
    using the replication library
  • The performance of BFS is only 3% worse than a
    standard NFS implementation in Digital Unix
  • Cannot mask software errors that occur at all
    replicas
  • Can mask errors that occur independently at
    different replicas, including non-deterministic
    software errors

23
Conclusions
  • First state machine replication algorithm to work
    correctly in an asynchronous system
  • Improves on previous algorithms by more than an
    order of magnitude in performance

24
Future Work
  • The problem of fault-tolerant privacy: faulty
    replicas may leak secrets
  • Reducing the amount of resources required by the
    system
  • - the number of replicas
  • - the number of copies of the state (to f+1)

25
  • Thanks for being patient!!