Fault-Tolerant SemiFast Implementations of Atomic Read/Write Registers

Provided by: nicolasn

1
Fault-Tolerant SemiFast Implementations of
Atomic Read/Write Registers
  • Nicolas Nicolaou, University of Connecticut
  • Joint work with
  • C. Georgiou, University of Cyprus
  • A. A. Shvartsman, University of Connecticut

2
What is an Atomic R/W Register?
[Figure: a register with the operations Read, Write(7), and Write(0).]
3
Prior Results
  • Attiya et al. 1995: Single-Writer Multiple-Reader (SWMR) model where fewer than 1/2 of the processes may crash
  • Pairs $\langle value, tag \rangle$ are used for ordering operations
  • Writer increases the tag and sends $\langle value, tag \rangle$ to a majority
  • Reader
  • Phase 1: obtains the maximum tag from a majority
  • Phase 2: propagates the tag to a majority and then returns the value associated with that tag
  • Lynch, Shvartsman 1997 and Englert, Shvartsman 2000 extend the above result to MWMR
  • Quorums instead of majorities
  • Two-round protocols for read/write operations
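
The two-phase read outlined above can be sketched as a small in-memory simulation. This is a minimal illustration under stated assumptions, not the published algorithm: the `Replica` class, the function names, and the use of randomly chosen majorities in place of real message passing are all hypothetical.

```python
import random


class Replica:
    """Hypothetical replica holding one <value, tag> pair."""

    def __init__(self):
        self.tag, self.value = 0, None

    def query(self):
        return self.tag, self.value

    def update(self, tag, value):
        # Adopt the pair only if it carries a newer tag.
        if tag > self.tag:
            self.tag, self.value = tag, value


def majority(replicas, rng):
    # Any majority will do; a random one stands in for "whoever replies first".
    return rng.sample(replicas, len(replicas) // 2 + 1)


def write(replicas, tag, value, rng):
    # The single writer is assumed to pass an increasing tag.
    for r in majority(replicas, rng):
        r.update(tag, value)


def read(replicas, rng):
    # Phase 1: obtain the maximum tag (and its value) from a majority.
    tag, value = max((r.query() for r in majority(replicas, rng)),
                     key=lambda pair: pair[0])
    # Phase 2: propagate that pair to a majority, then return the value.
    for r in majority(replicas, rng):
        r.update(tag, value)
    return value
```

Because any two majorities intersect, phase 1 always meets at least one replica that holds the latest completed write, which is why the read cannot return an older value.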

4
Fast Implementations
  • Dutta, Guerraoui, Levy, Chakraborty 2004
  • SWMR model
  • Single communication round for all write and read operations
  • Requires $R < (S/t) - 2$
  • R = number of readers, S = number of servers, t = maximum number of server failures
  • Not applicable to MWMR

Question: Can one introduce SemiFast implementations (with fast reads or fast writes) to relax the bound on the number of readers?
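
To make the bound on the number of readers concrete, the following sketch evaluates $R < (S/t) - 2$ with integer arithmetic. The function names are hypothetical and only illustrate the inequality stated on this slide.

```python
def fast_reads_possible(R, S, t):
    """True if R readers satisfy R < S/t - 2 (rewritten as R*t < S - 2*t)."""
    return R * t < S - 2 * t


def max_fast_readers(S, t):
    """Largest integer R with R < S/t - 2, or 0 if no reader count qualifies."""
    return max(0, (S - 2 * t - 1) // t)
```

For example, with S = 5 servers and t = 1 failure the bound allows at most 2 readers, which is the limitation that motivates relaxing the requirement.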
5
Our Contributions
  • Formally define semifast implementations
  • Develop a semifast implementation
  • Based on the fast implementation of Dutta et al. 2004
  • Introduce the notion of virtual nodes
  • Bounds on the number of virtual nodes
  • Show that no semifast implementations are possible for MWMR
  • Simulation results
  • Only a small percentage of read operations require a second communication round.

6
Model
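[Figure: system model. One writer, readers r1, r2, ..., rR, and servers s1, s2, ..., sS. Readers are grouped into virtual nodes vr1, vr2, ..., vrV; readers in the same virtual node are siblings.]
  • Reliable communication channels (for performance, not for safety)
  • Up to t server failures, $t < S/2$
  • Any subset of the readers and the writer may fail by crashing
  • Virtual nodes: $V < (S/t) - 2$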
7
Semifast Implementations
  • Def. An implementation I is semifast if it satisfies the following properties (informally)
  • All writes are fast
  • All complete read operations perform one or two communication rounds
  • If a read operation rd1 performs two communication rounds, then all read operations that precede or succeed rd1 and return the same value as rd1 are fast
  • There exists some execution of I that contains only fast read and write operations
  • Assuming all written values are unique
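
The two read-related properties above can be checked mechanically on a recorded execution. The sketch below is only an illustration: the `Read` record, its fields, and the use of numeric start and end times are assumptions introduced here, not part of the definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    start: float   # invocation time
    end: float     # response time
    value: int     # value returned (written values assumed unique)
    rounds: int    # communication rounds used


def precedes(a, b):
    # a precedes b if a's response comes before b's invocation.
    return a.end < b.start


def satisfies_read_properties(reads):
    # Every complete read uses one or two communication rounds.
    if any(rd.rounds not in (1, 2) for rd in reads):
        return False
    # If a read is slow (two rounds), every read that precedes or succeeds it
    # and returns the same value must be fast (one round).
    for slow in reads:
        if slow.rounds != 2:
            continue
        for other in reads:
            if other is slow:
                continue
            ordered = precedes(other, slow) or precedes(slow, other)
            if ordered and other.value == slow.value and other.rounds != 1:
                return False
    return True
```

Note that two concurrent slow reads of the same value do not violate the property as stated, since neither precedes the other.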

8
SF Implementation
  • Replica consists of
  • A timestamp associated with 2 values
  • The current value and the previous value
  • Writer
  • Sends the new timestamp to the servers and waits for S-t acknowledgments (WACKs)
  • Increases its own timestamp

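[Figure: write example with servers s1-s5. The writer sends (WRITE, ts=1, w) and servers reply (WACK, ts=1); servers that receive the message move from (ts=0, ps=0) to (ts=1, ps=0, seen={w}). Once S-t WACKs are received, the writer increases its timestamp (ts=1 to ts=2) and returns O.K.]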
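
The write path on this slide can be sketched as follows. This is a simplified simulation under stated assumptions: the `Server` and `Writer` classes, the `crashed` flag, and the synchronous method calls that stand in for WRITE and WACK messages are hypothetical, and the seen set and postit handled by the real servers are omitted.

```python
class Server:
    """Hypothetical replica: a timestamp with a current and a previous value."""

    def __init__(self):
        self.ts, self.value, self.prev = 0, None, None
        self.crashed = False

    def on_write(self, ts, value):
        if self.crashed:
            return None  # a crashed server never replies
        if ts > self.ts:
            # Keep the old value as the previous value.
            self.prev, self.value, self.ts = self.value, value, ts
        return ("WACK", ts)


class Writer:
    def __init__(self, servers, t):
        self.servers, self.t = servers, t
        self.ts = 1  # assumed: the first write carries ts=1, as in the figure

    def write(self, value):
        replies = [s.on_write(self.ts, value) for s in self.servers]
        wacks = [r for r in replies if r is not None]
        # One communication round: S-t WACKs are enough to complete.
        if len(wacks) < len(self.servers) - self.t:
            raise RuntimeError("more than t servers failed")
        self.ts += 1  # increase the timestamp for the next write
        return "OK"
```

Waiting for only S-t replies is what keeps the write fast: the writer never blocks on the up to t servers that may have crashed.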
9
SF Implementation (Cont.)
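  • Reader
  • Inquire timestamp from S-t servers
  • Server, on receipt of a read/write message
  • Records the virtual identifiers (vid) of the nodes that inquire about the server's timestamp ts in a set (the seen set); ts' is the timestamp carried by the message
  • If $ts < ts' \Rightarrow seen = \{vid\},\ ts = ts'$
  • If $ts \ge ts' \Rightarrow seen = seen \cup \{vid\}$
  • If $msgType = \mathit{Inform} \Rightarrow postit = ts'$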
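
The server rules above translate almost directly into code. The sketch below is an illustration under assumptions: the class name, the message-type strings, and the reply format are hypothetical, and the postit is kept non-decreasing with `max`, which is an assumption rather than something the slide states.

```python
class Server:
    """Hypothetical server state: timestamp, seen set, and postit."""

    def __init__(self):
        self.ts = 0
        self.seen = set()
        self.postit = 0

    def on_message(self, msg_type, vid, msg_ts):
        if self.ts < msg_ts:
            # Newer timestamp: adopt it and restart the seen set.
            self.ts = msg_ts
            self.seen = {vid}
        else:
            # Same or older timestamp: remember who inquired.
            self.seen.add(vid)
        if msg_type == "INFORM":
            # Assumption: a postit never moves backwards.
            self.postit = max(self.postit, msg_ts)
        return self.ts, frozenset(self.seen), self.postit
```

For instance, a server that receives the write with ts=1 from w and then a read from virtual node vr1 ends in state ts=1, seen={w, vr1}, while a server that missed the write and only hears from vr1 stays at ts=0 with seen={vr1}.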

10
The servers (Example)
11
The servers (Example)
12
SF Implementation (Cont.)
  • Reader
  • Considers the timestamps returned by the servers
  • Returns a timestamp as follows
  • If Predicate = True: return the maximum timestamp
  • If Postit = MaxTS: return the maximum timestamp
  • Otherwise: return the maximum timestamp - 1
  • The definition of the Predicate will be given later
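
The return rule above is a three-way decision. A minimal sketch, assuming the reader has already evaluated the predicate and collected the postit values from the replies (the function and parameter names are hypothetical):

```python
def timestamp_to_return(max_ts, predicate_holds, postits):
    """Pick the timestamp whose value the read returns.

    max_ts          -- maximum timestamp among the replies
    predicate_holds -- result of the reader's predicate
    postits         -- postit values carried by the replies
    """
    if predicate_holds or max_ts in postits:
        return max_ts
    # Otherwise fall back to the previous timestamp; servers keep the
    # previous value alongside the current one for this case.
    return max_ts - 1
```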

13
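Predicate (Key Idea)
[Figure: servers s1-s5 after a write marked "Completed". Server states: (ts=0, ps=0, seen={vr1}) on one server; (ts=1, ps=0, seen={w, vr1}) on three servers; (ts=1, ps=0, seen={w}) on one server. Reader r1 (virtual node vr1, ts=0) observes S-2t servers with ts=1: "I have to return 1".]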
14
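Predicate (Key Idea)
[Figure: servers s1-s5. Server states: (ts=1, ps=0, seen={w, vr1, vr2}) on two servers; (ts=1, ps=0, seen={w, vr1}) on one server; (ts=0, ps=0, seen={vr2}) on one server; (ts=0, ps=0, seen={vr1, vr2}) on one server. Reader r1 (vr1, ts=1): "Completed, returned 1". Reader r2 (vr2, ts=0) observes |MS| = S-3t: "I have to return 1".]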
15
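Predicate (Final Form)
  • The predicate is true if a read operation
  • Receives maxTS from a set MS of $|MS| \ge S - \alpha t$ servers (e.g. $S - 3t$)
  • Observes that $|\bigcap_{m \in MS} m.seen| \ge \alpha$, i.e. the seen sets of those replies have at least $\alpha$ identifiers in common
  • Formally: [formula not preserved in the transcript]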
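
Since the formal statement did not survive in the transcript, the following sketch encodes one reading of the bullets above: the predicate holds if, for some α, at least S-αt of the replies carrying maxTS share at least α identifiers in their seen sets. The range of α (1 to V+1), the function name, and the brute-force search over subsets are assumptions made for illustration.

```python
from itertools import combinations


def predicate(seen_sets, S, t, V):
    """seen_sets: the seen sets of the replies that carry maxTS."""
    for alpha in range(1, V + 2):          # assumed range: 1 .. V+1
        need = max(S - alpha * t, 1)       # required size of MS
        if need > len(seen_sets):
            continue
        # Checking subsets of exactly `need` replies is enough: adding
        # replies can only shrink the intersection.
        for ms in combinations(seen_sets, need):
            common = set.intersection(*map(set, ms))
            if len(common) >= alpha:
                return True
    return False
```

With S = 5, t = 1, and V = 2, three maxTS replies with seen={w, vr1} satisfy the predicate for α = 2, and two replies with seen={w, vr1, vr2} satisfy it for α = 3. Two replies with seen={w, vr1} do not, because S-3t = 2 replies would need three common identifiers.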

16
Sibling Problem
[Figure: servers s1-s5. Server states: (ts=1, ps=0, seen={w, vr1}) on three servers; (ts=0, ps=0, seen={vr1}) on two servers. Reader r1 (vr1, ts=1): "Completed, returned 1". Its sibling r2 (also vr1, ts=0) observes |MS| = S-3t: "Predicate is false! return 0".]
17
SF Implementation (Cont.)
  • Reader must perform a second comm. round if
  • Predicate = True and $|\bigcap m.seen| = \alpha$
  • Postit = MaxTS and $|Postits| < t + 1$
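
The two conditions for a second round can be sketched as a single check. This is one reading of the slide, stated as an assumption: the first condition is taken to mean that the predicate holds with exactly α common identifiers, and the second that at least one but fewer than t+1 replies carry a postit equal to MaxTS. All names are hypothetical.

```python
def needs_second_round(alpha, common_seen_size, max_ts, postits, t):
    """alpha is the value for which the predicate held, or None if it failed."""
    # Condition 1: the predicate holds, but only just.
    tight_predicate = alpha is not None and common_seen_size == alpha
    # Condition 2: some postit equals MaxTS, yet too few servers hold it.
    matching = sum(1 for p in postits if p == max_ts)
    few_postits = 0 < matching < t + 1
    return tight_predicate or few_postits
```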

18
Which readers must write?
  • Observation
  • Two read operations r1 and r2
  • $MS_1$: the servers that replied with maxTS to r1
  • $MS_2$: the servers that replied with maxTS to r2
  • Then $\big||MS_1| - |MS_2|\big| \le t$, so if $|MS_1| \ge S - \alpha t$ then $|MS_2| \ge S - (\alpha + 1)t$
  • If r1 and r2 are siblings
  • Let $s_i \in MS_1 \cap MS_2$
  • $s_i$ sent $m_1$ and $m_2$ to r1 and r2 resp.
  • It may be the case that $m_1.seen = m_2.seen$
  • Thus if then

19
Correctness
  • We need to show the following
  • Writes are globally ordered
  • If a read() returns a value x then a write(x)
    operation immediately precedes or is concurrent
    with that read
  • A read operation does not return an older value
    than a preceding read operation
  • Reads done by sibling readers
  • Reads done by non-sibling readers

20
Impossibility
  • Consider Algorithms
  • With no virtual nodes
  • With grouping mechanisms similar to our approach
  • Theorem: There is no semifast implementation if the number of virtual nodes is $V \ge (S/t) - 2$.

21
MWMR model
  • Theorem: There is no semifast implementation for the MWMR model
  • Proved in the case of 2 writers, 2 readers and 1
    failure
  • We consider n communication rounds

22
Simulation Results
  • NS2 simulator
  • Only 10% of read operations need to perform a 2nd communication round
  • Stochastic environment
  • Fixed-interval environment

23
Conclusions
  • Semifast implementation is defined
  • Only one complete read operation has to perform 2 comm. rounds for every write operation
  • SF implementation presented
  • Number of virtual nodes $< (S/t) - 2$
  • No semifast implementation is possible for the MWMR model

24
References
  • P. Dutta, R. Guerraoui, R. R. Levy, and A. Chakraborty. How fast can a distributed atomic read be? In Proceedings of the 23rd Annual ACM Symposium on Principles of Distributed Computing (PODC 2004), pages 236-245, ACM Press, 2004.
  • S. Dolev, S. Gilbert, N. A. Lynch, A. A. Shvartsman, and J. L. Welch. GeoQuorums: Implementing atomic memory in mobile ad hoc networks. Technical Report LCS-TR-900, MIT, 2003.
  • N. Lynch and A. Shvartsman. RAMBO: A reconfigurable atomic memory service for dynamic networks. In Proceedings of the 16th International Symposium on Distributed Computing, pages 173-190, 2002.
  • H. Attiya, A. Bar-Noy, and D. Dolev. Sharing memory robustly in message-passing systems. Journal of the ACM, January 1995.
  • B. Englert and A. A. Shvartsman. Graceful quorum reconfiguration in a robust emulation of shared memory. In International Conference on Distributed Computing Systems, pages 454-463, 2000.
  • N. A. Lynch and A. A. Shvartsman. Robust emulation of shared memory using dynamic quorum-acknowledged broadcasts. In Symposium on Fault-Tolerant Computing, pages 272-281, 1997.

25
  • Questions?

26
Atomicity
  • Lynch96
  • Valid executions
  • Invalid executions

[Figure: timelines of a write(8) with its ack() and read() operations returning 0 or 8, shown for valid and for invalid executions.]
27
Definitions
  • Each process invokes one operation at a time.
  • Each operation consists of
  • Invocation step
  • Matching response step
  • Incomplete operation: no matching response for the invocation. Complete operation: the invocation has a matching response.
  • op1 precedes op2 if the response for op1 precedes the invocation for op2.
  • If op is a read we write rd
  • If op is a write we write wr

28
Definitions (Cont.)
  • An algorithm implements a register if it satisfies the termination and atomicity properties
  • Termination: every operation by a correct process completes.
  • Atomicity (SWMR, $wr_k$ = the k-th write)
  • If rd returns x then there is a $wr_k$ s.t. $val_k = x$
  • If $wr_k$ precedes rd and rd returns $val_j$, then $j \ge k$
  • If rd returns $val_k$ then $wr_k$ precedes or is concurrent with rd
  • If $rd_1$ returns $val_k$ and a succeeding $rd_2$ returns $val_j$ then $j \ge k$
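
The four atomicity conditions above can be checked on a recorded single-writer history. The sketch below is illustrative only: the `Op` record, the numeric timestamps, and the convention that index 0 denotes the initial value (with no corresponding write) are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    start: float  # invocation time
    end: float    # response time
    k: int        # write: its index k; read: index of the value it returned


def precedes(a, b):
    return a.end < b.start


def is_atomic(writes, reads):
    write_of = {w.k: w for w in writes}
    for rd in reads:
        # 1. The returned value was written (index 0 = initial value).
        if rd.k != 0 and rd.k not in write_of:
            return False
        # 2. A read never returns a value older than a preceding write.
        if any(precedes(w, rd) and rd.k < w.k for w in writes):
            return False
        # 3. The write of the returned value precedes or overlaps the read.
        if rd.k != 0 and precedes(rd, write_of[rd.k]):
            return False
    # 4. A later read never returns an older value than an earlier read.
    for a in reads:
        for b in reads:
            if precedes(a, b) and b.k < a.k:
                return False
    return True
```

A history in which one read returns the new value and a succeeding read returns the old one fails condition 4, which is the kind of invalid execution the atomicity definition rules out.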

29
Atomic vs Shared Register
  • Shared Register
  • Accessible from a single process
  • Write(v): stores the value v and returns OK
  • Read(): returns the last value stored
  • Atomic Register
  • A distributed data structure
  • Accessed by multiple processes concurrently
  • Behaves as a sequential register
  • (Recall Atomicity)

30
Atomic vs Shared Register(Graphical)
  • Sequential Register
  • Atomic Register

[Figure: a sequential register (states Register=0 and Register=8; labels Read(0), WriteAck(), Read(8)) and an atomic register (labels Write(8), WriteAck(), Read1(), ReadAck1(8), Read2(), ReadAck2(0)).]
31
Non-Triviality
  • A semifast implementation is not trivial if
  • For any execution of , if contains the
    operations and some , performs 2comm.
    rounds, then any , , must be fast.
  • For any execution of , if two read
    operations rd1 and rd2 return the same value and
    rd

32
Strict Communication Scheme
  • Only messages from the invoking processes to the
    servers are delivered.
  • No messages between any servers
  • No messages between any invoking processes

33
When a SemiFast Impl. is Impossible?
  • When $V < (S/t) - 2$
  • If V(S/t)-1 then No fast implementation even in
    the case of a skip-free write operation.
    (violates non-triv. Property 3)
  • If V(S/t)-2 then there is an execution where we
    need 2 complete read operations to perform 2 com.
    rounds. (violates Property 1)
  • When V(S/t)-2
  • There exists an execution where 2 read operations
    return the same value and they both perform 2
    com. rounds (violates Prop. 2).

34
No Semifast Implementation for the MWMR Model
  • Proof sketch
  • Split multiple-round operations into
  • Read phases
  • Write phases
  • Show that as soon as an operation performs a write phase, it cannot change its return value.
  • Show a construction where $W = 2$, $R = 2$ and $t = 1$ and atomicity is violated.

35
Challenge
  • How fast can a general implementation of an atomic register be?
  • Dynamic environment (mobility)
  • Hybrid implementations in which some read and write operations perform multiple round trips
  • Communication overhead in such implementations?
  • Quorum-based algorithms: how fast can they be?