From Byzantine Agreement to Practical survivability

Description:

October 02. Dahlia Malkhi. From Byzantine Agreement to ... Active State Machine Replication. Method for consistently replicating arbitrarily typed objects ...


Transcript and Presenter's Notes


1
From Byzantine Agreement to Practical
survivability
  • Dahlia Malkhi
  • The Hebrew University of Jerusalem

2
Why Replicate?
  • Cache data
  • NFS, DNS, WWW
  • Fault tolerance
  • Clusters
  • Remote backup

3
A replication system
  • Servers store x and timestamp t, both initially -

[Figure: five servers, each holding x = -, t = -]
4
A replication system
  • Clients update new values; other clients obtain
    them

x = 7
x = 7, t = 1
5
Active State Machine Replication
  • Method for consistently replicating arbitrarily
    typed objects
  • Start from the same initial state
  • Apply operations in the same order without gaps
    at each replica

a
a
a
b
b
c
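
The two conditions on this slide (same initial state, same order without gaps) can be sketched as follows. This is a minimal illustration, not the presenter's code: the `Replica` class, the key-value object, and the shared `log` list are all assumptions, and agreeing on the log itself is outside the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One replica of a key-value object (hypothetical, for illustration)."""
    state: dict = field(default_factory=dict)  # same initial state everywhere
    next_index: int = 0                        # next log entry to apply

    def apply_log(self, log):
        # Apply entries strictly in log order, resuming where we stopped,
        # so no operation is skipped or reordered.
        while self.next_index < len(log):
            key, value = log[self.next_index]
            self.state[key] = value
            self.next_index += 1


log = [("x", 7), ("x", 8), ("y", 1)]   # the agreed operation order
replicas = [Replica() for _ in range(3)]
replicas[0].apply_log(log)             # fully caught up
replicas[1].apply_log(log[:2])         # lagging: has applied only a prefix
replicas[1].apply_log(log)             # catches up later
replicas[2].apply_log(log)
```

A lagging replica holds a prefix of the order, never a different order, so it converges to the same state once it catches up.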
6
A replication system
  • The decentralized approach
  • Each server contends for next operation
  • When all proposals are collected, everyone decides

x = 7, t = 1
x = 3, t = 1
7
A simple ordering protocol
  • Each server proposes a value in each round

X7,t1 X8,t2
X7
X8
none
none
none
X9,t3 X1,t4 X3,t5
none
none
X1
X3
X9
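
A rough sketch of one way such a round-based protocol could work. The deterministic tie-break by server id and the names `order_round` and `run` are assumptions made for illustration; the slide only says that each server proposes a value (or "none") in each round.

```python
def order_round(proposals):
    """proposals maps server id -> proposed operation, or None for "none".

    Every server applies this same deterministic rule to the same set of
    proposals, so all servers deliver the same sequence.
    """
    return [op for _, op in sorted(proposals.items()) if op is not None]


def run(rounds, n_servers):
    delivered = []
    for proposals in rounds:
        # A round cannot be decided until all n servers have been heard from.
        if len(proposals) != n_servers:
            raise RuntimeError("round blocked: a proposal is missing")
        delivered.extend(order_round(proposals))
    return delivered


rounds = [
    {0: "x=7", 1: "x=8", 2: None, 3: None, 4: None},
    {0: None, 1: "x=9", 2: None, 3: "x=1", 4: "x=3"},
]
print(run(rounds, n_servers=5))
```

The `RuntimeError` branch is where a single silent server stalls the round, which is why this style of protocol has to detect and respond to every failure.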
8
Analysis of simple protocol
  • Each delivery may cost N messages
  • cost is amortized during busy times
  • Need to respond to each failure
  • Reconfigure: costly agreement on membership

9
Searching for efficient replication
  • The leader-based approach
  • Client sends write to leader, leader broadcasts
    to everybody
  • Leader may rotate, dynamically change, etc.

x = 7, t = 1
10
Group communication
  • Leader sets order
  • Variations: revolving token, dynamic leader

X7,t1
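
A minimal sketch of the leader-based idea, assuming a fixed leader and reliable in-process delivery. The `Leader` and `Replica` classes are hypothetical; leader rotation, tokens, and failure handling are deliberately left out.

```python
class Replica:
    def __init__(self):
        self.x, self.t = None, 0

    def deliver(self, value, t):
        # Accept only the next timestamp so updates are applied without gaps.
        if t == self.t + 1:
            self.x, self.t = value, t


class Leader:
    def __init__(self, replicas):
        self.replicas = replicas
        self.t = 0

    def write(self, value):
        self.t += 1                      # the leader alone sets the order
        for replica in self.replicas:    # broadcast to everybody
            replica.deliver(value, self.t)
        return self.t


replicas = [Replica() for _ in range(5)]
leader = Leader(replicas)
leader.write(7)
```

One message per replica is enough here only because nothing fails; the sketch has no answer to a leader that crashes halfway through the broadcast loop.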
11
Analysis of Group Communication
  • Without further interaction, delivery is
    optimistic and may lead to inconsistency
  • Need to respond to leader failure
  • Costly agreement on membership
  • Virtual synchrony simplifies recovery from
    partitioned views

12
Where the costs hurt
  • Servers need to monitor for failures
  • Reconfiguration
  • Recovery from optimistic delivery

13
Some design choices
  • Scale
  • Survivability
  • Trusted clients

14
Why scalability?
  • Yesterday
  • NFS
  • Fault tolerant replicated file system (cluster)
  • Four computers flying a shuttle
  • Today
  • Digital archiving: Anderson's Eternity
  • Ubiquitous computing
  • Peer-to-peer resource sharing
  • eCommerce and eApplication on the Internet

A mobile user
15
Electronic voting system
  • Develop electronic voting system for national
    elections
  • Build on experience with Costa Rica Project (with
    AT&T Secure Systems Research Dept)
  • Goals
  • vote from any polling station
  • usable by all voters
  • security
  • double voting, voter privacy, vote coercion, ...

16
Preventing double voting
  • Scope: 3,000,000 voters, 1,000 polling stations
  • Simple problem: prevent using a voter id twice
  • For privacy, the binding between vote and voter id is
    not kept
  • detecting double voting afterwards doesn't help
  • Globally and permanently lock voter id when vote is
    cast
  • Centralized server or global protocol is no good
  • lose availability, performance
  • Need scalable, survivable solution!

17
Why survivability?
  • Yesterday
  • Closely coupled, locally administered system
  • Today
  • Widespread computing
  • Internet hackers
  • More

18
(No Transcript)
19
(No Transcript)
20
Survivable systems
  • The last frontier of protection
  • Component penetrations will occur, so we should
    build systems to anticipate them
  • Survivable system makes meaningful progress when
    components fail to behave as expected, even when
    they conspire to undermine the operation of the
    system as a whole

21
(No Transcript)
22
Could clients be faulty?
  • Benign faults: yes
  • Byzantine faults: no
  • Employ access control
  • If bypassed, who cares?
  • A malicious client can mess up the data anyway

23
Summary of design choices
  • Scaling
  • thousands of servers, millions of clients
  • Survivability
  • Servers may be penetrated, hence use voting
  • Trusted clients

24
From replicated process to replicated storage
model
  • Fault-tolerant computing in Storage Area Networks
  • Fault-tolerant client/server computing with
    passive servers
  • Servers are playing the role of data stores
  • Servers are not communicating with one another
  • Protocols are carried out by clients

25
Byzantine quorum systems example [Malkhi and
Reiter 98]
  • At most one server can be penetrated
  • Read/write safe register

26
Byzantine quorum systems example
  • At most one server can be penetrated
  • Read/write safe register

27
Masking quorums
  • A b-masking quorum system over a universe U of
    servers is a set such that
  • Justification: let B be the set of actually faulty
    servers
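
The formulas on this slide did not survive in the transcript. As a reconstruction rather than the slide's exact wording, the usual definition of a b-masking quorum system for the threshold case (at most b faulty servers) is given below. The symbols Q_1, Q_2 and the labels are assumptions.

```latex
\[
\begin{aligned}
&\mathcal{Q} \subseteq 2^{U} \text{ is a $b$-masking quorum system if} \\
&\quad \text{(consistency)} \quad
  \forall Q_1, Q_2 \in \mathcal{Q}: \; |Q_1 \cap Q_2| \ge 2b + 1 \\
&\quad \text{(availability)} \quad
  \forall B \subseteq U,\ |B| \le b: \; \exists Q \in \mathcal{Q}: \; Q \cap B = \emptyset \\[4pt]
&\text{Justification, with } |B| \le b: \quad
  |(Q_1 \cap Q_2) \setminus B| \;\ge\; (2b + 1) - b \;=\; b + 1 \;>\; b \;\ge\; |Q_1 \cap B|
\end{aligned}
\]
```

In words: if Q_2 is the quorum of the last write and Q_1 the quorum of a later read, at least b+1 correct servers in the intersection report the latest value, which outnumbers the at most b faulty servers the reader may hear from.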

28
Replication using masking quorum systems
  • Write(v)
  • Read timestamps from quorum
  • Choose higher, unique timestamp
  • Read()
  • Read (value, timestamp) pairs from quorum
  • Identify correct values that appear b+1
    identical times
  • Return highest-timestamp correct value
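
The Write and Read steps above can be sketched as follows, under stated assumptions: a threshold system with n = 4b + 1 servers and quorums of size ceil((n + 2b + 1) / 2), timestamps made unique by pairing a counter with a client id, and a final "write the pair to a quorum" step that the slide does not list. All class and function names are hypothetical.

```python
from collections import Counter

B = 1                           # assumed bound on Byzantine servers
N = 4 * B + 1                   # threshold construction assumes n > 4b
QUORUM = (N + 2 * B + 2) // 2   # ceil((n + 2b + 1) / 2), so overlaps are >= 2b + 1


class Server:
    def __init__(self):
        self.value, self.ts = None, (0, 0)    # ts = (counter, client id)

    def read(self):
        return self.value, self.ts

    def write(self, value, ts):
        if ts > self.ts:                      # keep only the newest pair
            self.value, self.ts = value, ts


class LyingServer(Server):
    def read(self):                           # Byzantine: fabricates a reply
        return "bogus", (10**9, 0)


def write(servers, quorum, value, client_id):
    assert len(quorum) >= QUORUM
    # Read timestamps from a quorum, then choose a higher, unique timestamp.
    top = max(servers[i].read()[1][0] for i in quorum)
    ts = (top + 1, client_id)
    for i in quorum:                          # assumed step, not on the slide
        servers[i].write(value, ts)


def read(servers, quorum):
    assert len(quorum) >= QUORUM
    replies = Counter(servers[i].read() for i in quorum)
    # A pair counts as correct only if b+1 servers report it identically.
    correct = [pair for pair, count in replies.items() if count >= B + 1]
    if not correct:
        return None
    return max(correct, key=lambda pair: pair[1])[0]


servers = [LyingServer()] + [Server() for _ in range(N - 1)]
write(servers, quorum=[0, 1, 2, 3], value=7, client_id=1)
result = read(servers, quorum=[0, 2, 3, 4])
```

The lying server inflates the timestamp the writer picks, but its fabricated pair is reported by only one server and is filtered out by the b+1 rule on reads.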

29
Byzantine Quorums - surprisingly
efficient [Malkhi, Reiter and Wool 98]
30
Quorum-based replication [Fleet, Malkhi and
Reiter 00]
  • No centralized management
  • No locking
  • No server-to-server interaction
  • No client-to-client interaction
  • Quorum tuning
  • benign/Byzantine faults
  • strict/probabilistic guarantees
  • Simple, secure, modular

[Figure: Client 1 and Client 2, each layered as application / Object-stub / Q-RPC, access persistent object servers (Server 1 to Server 5)]
31
Universal object emulation
Servers are data containers
Determine total order of object-operations
x.a()
x.b()
32
The Approach: PAXOS [Lamport 98]
  • Assume a weak leader election primitive
  • Eventually there is a unique leader
  • Ω failure detector, partially synchronous/timed
    asynchronous systems, etc.
  • To order operations, the leader invokes an
    instance of the agreement protocol
  • Never disagree on the operation order
  • Might fail to make progress if there is no unique
    leader

33
Identifying an agreement building-block
34
Adding Ranks (ballots)
35
Ranked Register [Boichat et al. 02, Chockler and
Malkhi 02]
  • The interface
  • rr-read(R), returns <r,v>
  • rr-write(R,v), commits or aborts
  • The Paxos Agreement
  • Collect proposals with a rank
  • Make a new proposal with a rank
  • If rr-write with rank R1 commits, then rr-read
    with rank R2 > R1 must see it
  • return the value written by this rr-write (or by
    a write with rank R > R1)

36
Agreement using RR
Shared: A single ranked register
propose(inp)
  while (true) do
    choose a unique monotonically increasing rank R
    <r,v> ← rr-read(R)
    if (v = ⊥) v ← inp
    if (rr-write(R,v) = commit) return v
  od
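
The propose loop on this slide can be exercised against a toy ranked register. The sketch below is a single-copy, in-memory, sequential stand-in and not the fault-tolerant construction the talk refers to. The field names, the use of `None` for the initial empty value, and the odd/even rank streams are assumptions.

```python
import itertools


class RankedRegister:
    """Single-copy, in-memory ranked register (illustration only)."""

    def __init__(self):
        self.read_rank = 0    # highest rank seen by rr_read
        self.write_rank = 0   # rank of the last committed rr_write
        self.value = None     # None plays the role of the empty value

    def rr_read(self, rank):
        self.read_rank = max(self.read_rank, rank)
        return self.write_rank, self.value

    def rr_write(self, rank, value):
        # Abort if a higher-ranked read or write has already been seen.
        if rank < self.read_rank or rank < self.write_rank:
            return "abort"
        self.write_rank, self.value = rank, value
        return "commit"


def propose(register, inp, ranks):
    """ranks: iterator of unique, increasing ranks owned by this proposer."""
    for rank in ranks:
        _, v = register.rr_read(rank)
        if v is None:
            v = inp
        if register.rr_write(rank, v) == "commit":
            return v


register = RankedRegister()
first = propose(register, "a", itertools.count(1, 2))    # odd ranks
second = propose(register, "b", itertools.count(2, 2))   # even ranks
```

The second proposer reads the committed value before writing, so it re-proposes "a" rather than its own input, and both calls decide the same value.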
37
The complete system
RR
RR
38
Implementability of RR
39
Some historical quotes
  • The Byzantine agreement problem has received
    more attention from the computer science
    community than any other problem
  • Chor and Dwork, 89
  • Essay on the Application of Analysis to the
    Probability of Majority Decisions
  • Condorcet, 85
  • Only five computers would be needed for the
    entire country
  • Thomas Watson Sr., 1943

40
More quotes
  • The challenge of reliability in distributed
    computing is perhaps the unavoidable challenge of
    the coming decade, just as performance was the
    challenge of the past decade
  • Ken Birman, 1996

41
Why is this working?
42
Replication models
43
Operation ordering
  • Easy if there are no failures and/or the system
    is synchronous
  • E.g., leader based, LTS based
  • Real systems are both asynchronous and
    failure-prone
  • Accurate failure detection is impossible:
    multiple/no leaders might exist at times
  • Solution: PAXOS

44
Fault-tolerant client/server (1)
  • Scalable dynamic fault-tolerant services (e.g.,
    Fleet)
  • Replication groups are created on-the-fly by
    clients out of dynamic server universe
  • Servers need neither monitor nor be aware of each
    other
  • Accommodates Byzantine failures
  • Database servers
  • Client-server middleware

45
Paxos in Replicated Storage Model
client
client
client
46
Disk Paxos: Pros and Cons
  • Assumes very weak memory objects
  • Regular registers
  • Not suitable for dynamic environments
  • The number of clients and their IDs are known a
    priori
  • Employs data structure whose size grows with the
    number of clients

47
Our Contribution
  • Identified an abstract building block for Paxos
    agreement
  • Ranked register (RR)
  • Follows deconstruction of BG.. (round-based
    register)
  • Implemented RR in a setting with infinitely many
    dynamic client processes
  • Proven a lower bound on number of R/W registers

48
Identifying an agreement building block
Propose(V)
  Begin RMW
    if (val = ⊥) val ← V
    return val
  End RMW
consensus
RMW
  • Agreement is trivial with a single RMW
  • RMW cannot be emulated out of faulty memory
    objects of any type [Jayanti, Chandra, Toueg 9?]
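
To illustrate why agreement is trivial with a single read-modify-write object, here is a small sketch in which a lock stands in for the atomic RMW step. The `RMWConsensus` class is hypothetical, and a lock in one process says nothing about building RMW from faulty memory, which is the hard part the second bullet points at.

```python
import threading


class RMWConsensus:
    def __init__(self):
        self._lock = threading.Lock()   # stands in for the atomic RMW step
        self._val = None                # None plays the role of the empty value

    def propose(self, v):
        with self._lock:                # Begin RMW
            if self._val is None:
                self._val = v           # first proposal wins
            return self._val            # End RMW


obj = RMWConsensus()
results = []
threads = [
    threading.Thread(target=lambda v=v: results.append(obj.propose(v)))
    for v in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever thread enters the critical section first fixes the decision, and every later call returns that same value.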

49
Conclusions and Future Work
  • Paxos with infinitely many processes based on new
    ranked register abstraction
  • Fault-tolerant replication in SANs
  • Fault-tolerant client/server applications
  • Future work
  • Handling Byzantine memory failures (NR-arbitrary)
  • Specifying/implementing the leader election
    primitive