Title: CPSC 668 Distributed Algorithms and Systems
1 CPSC 668 Distributed Algorithms and Systems
- Fall 2006
- Prof. Jennifer Welch
2 Problems Solvable in Failure-Prone Asynchronous Systems
- Although consensus is not solvable in failure-prone asynchronous systems (neither message passing nor read/write shared memory), there are some interesting problems that are solvable:
  - set consensus (a weakening of consensus)
  - approximate agreement (a weakening of consensus)
  - renaming ("opposite" of consensus)
  - k-exclusion (fault-tolerant variant of mutex)
3 Model Assumptions
- asynchronous
- shared memory with read/write registers
- at most f crash failures of procs
- results can be translated to message passing if f < n/2 (cf. Chapter 10)
- may be a few asides into message passing
4 Set Consensus Motivation
- By judiciously weakening the definition of the consensus problem, we can overcome the asynchronous impossibility
- We've already seen a weakening of consensus:
  - weaker termination condition for randomized algorithms
- How about weakening the agreement condition?
- One weakening is to allow more than one decision value:
  - allow a set of decisions
5 Set Consensus Definition
- Termination: Eventually, each nonfaulty processor decides.
- k-Agreement (new): The number of different values decided on by nonfaulty processors is at most k.
- Validity: Every nonfaulty processor decides on a value that is the input of some processor.
6 Set Consensus Algorithm
- Uses a shared atomic snapshot object X
  - can be implemented with read/write registers
- update your segment of X with your input
- repeatedly scan X until there are at least n - f nonempty segments
- decide on minimum value appearing in any segment
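The steps above can be sketched as a small single-threaded simulation. This is only an illustration under stated assumptions: the snapshot object is modeled as a Python list, so update and scan are trivially atomic; procs are generators interleaved by a seeded random scheduler; and crashed procs are assumed to crash before taking any step. The names `set_consensus`, `crashed`, and `seed` are hypothetical and not from the lecture.

```python
import random

def set_consensus(inputs, f, crashed=(), seed=0):
    """Simulate the scan-until-(n - f) algorithm; return {proc index: decision}."""
    n = len(inputs)
    assert len(crashed) <= f, "the algorithm tolerates at most f crashes"
    X = [None] * n                      # snapshot object: one segment per proc
    decisions = {}

    def proc(i):
        X[i] = inputs[i]                # update own segment with own input
        yield                           # let the scheduler interleave other procs
        while True:
            view = [v for v in X if v is not None]   # atomic scan of X
            if len(view) >= n - f:      # enough nonempty segments
                decisions[i] = min(view)
                return
            yield

    rng = random.Random(seed)           # assumption: random but reproducible schedule
    running = {i: proc(i) for i in range(n) if i not in crashed}
    while running:
        i = rng.choice(sorted(running))
        try:
            next(running[i])            # run proc i up to its next yield
        except StopIteration:
            del running[i]              # proc i has decided
    return decisions
```

For instance, `set_consensus([5, 3, 8, 1, 9], f=2, crashed={3})` returns decisions for procs 0, 1, 2, and 4, and whatever the schedule, at most f + 1 = 3 distinct values are decided.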
7 Correctness of Set Consensus Algorithm
- Termination: at most f procs crash, so at least n - f segments eventually become nonempty.
- Validity: every decision is some proc's input
- Why does k-agreement hold?
  - We'll show it does as long as k > f.
- Sanity check: When k = 1, we have standard consensus. As long as there is less than 1 failure (f = 0), we can solve the problem.
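8 k-Set Agreement Condition
- Let S be the set of min values in the final scan of each nonfaulty proc; these are the nonfaulty decisions.
- Suppose in contradiction |S| > f + 1.
- Let v be the largest value in S, the decision of p_i.
- The other values in S are smaller than v, so they cannot appear in p_i's final scan. So p_i's final scan misses at least f + 1 values, contradicting the code (the scan must see at least n - f nonempty segments).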
9 Set Consensus Lower Bound
- Theorem: There is no algorithm for solving k-set consensus in the presence of f failures, if f ≥ k.
- Straightforward extensions of the consensus impossibility result fail; even proving the existence of an initial bivalent configuration is quite involved.
- Original proof of set-consensus impossibility used concepts from algebraic topology
- Textbook's proof uses more elementary machinery, but is still rather involved
10 Approximate Agreement Motivation
- An alternative way to weaken the agreement condition for consensus
- Require that the decisions be close to each other, but not necessarily equal
- Seems appropriate for continuous-valued problems (as opposed to discrete)
11 Approximate Agreement Definition
- Termination: Eventually, each nonfaulty processor decides.
- ε-Agreement (new): All nonfaulty decisions are within ε of each other.
- Validity (new): Every nonfaulty decision is within the range of the input values.
12 Approximate Agreement Algorithm
- Assume procs know the range from which input values are drawn
  - let D be the length of this range
- up to n - 1 procs can fail
- algorithm is structured as a series of "asynchronous rounds"
  - exchange values via a snapshot object, one per round
  - compute midpoint for next round
- continue until spread of values is within ε, which requires about log2(D/ε) rounds
13 Approximate Agreement Algorithm
- Initially local variable v = p_i's input
- Initially local variable r = 1
- repeat the following steps:
  - update p_i's segment of ASO_r to be v
  - let scan be set of values obtained by scanning ASO_r
  - v := midpoint(scan)
  - if r = ⌈log2(D/ε)⌉ + 1 then decide v and terminate
  - else r := r + 1
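The round structure above can be sketched as a single-threaded simulation. This is an illustration only, under stated assumptions: each ASO_r is modeled as a Python list (so update and scan are trivially atomic), procs are generators interleaved by a seeded random scheduler, crashed procs crash before taking any step, and D ≥ ε. The names `approx_agreement`, `crashed`, and `seed` are hypothetical; `midpoint` is taken to mean the midpoint of the range of the scanned values.

```python
import math
import random

def midpoint(values):
    """Midpoint of the range spanned by a nonempty collection of numbers."""
    return (min(values) + max(values)) / 2

def approx_agreement(inputs, D, eps, crashed=(), seed=0):
    """Simulate the round-based algorithm; return {proc index: decision}."""
    n = len(inputs)
    M = math.ceil(math.log2(D / eps)) + 1      # number of rounds (assumes D >= eps)
    aso = [[None] * n for _ in range(M + 1)]   # aso[r] models ASO_r; index 0 unused
    decisions = {}

    def proc(i):
        v = inputs[i]
        for r in range(1, M + 1):
            aso[r][i] = v                      # update own segment of ASO_r
            yield
            scan = [x for x in aso[r] if x is not None]   # atomic scan of ASO_r
            v = midpoint(scan)
            yield
        decisions[i] = v                       # decide after round M

    rng = random.Random(seed)                  # assumption: reproducible random schedule
    running = {i: proc(i) for i in range(n) if i not in crashed}
    while running:
        i = rng.choice(sorted(running))
        try:
            next(running[i])                   # run proc i up to its next yield
        except StopIteration:
            del running[i]
    return decisions
```

With inputs drawn from a range of length D = 100 and ε = 1, every proc runs 8 rounds, and the decisions returned should differ by at most 1.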
14 Analysis of Algorithm
- Definitions w.r.t. a particular execution:
  - M = ⌈log2(D/ε)⌉ + 1
  - U_0 = set of input values
  - U_r = set of all values ever written to ASO_r
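15 Helpful Lemma
- Lemma (16.8): Consider any round r < M. Let u be the first value written to ASO_r. Then the values written to ASO_(r+1) are in this range:
  - [(min(U_r) + u)/2, (max(U_r) + u)/2]
[Figure: number line marking min(U_r), (min(U_r) + u)/2, u, (max(U_r) + u)/2, and max(U_r); the elements of U_(r+1) lie between the two midpoints.]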
16 Implications of Lemma
- The range of values written to the ASO object for round r + 1 is contained within the range of values written to the ASO object for round r.
  - range(U_(r+1)) ⊆ range(U_r)
- The spread (max - min) of values written to the ASO object for round r + 1 is at most half the spread of values written to the ASO object for round r.
  - spread(U_(r+1)) ≤ spread(U_r)/2
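Both implications follow from the lemma by a short calculation, sketched here with u denoting the first value written to ASO_r, so that min(U_r) ≤ u ≤ max(U_r):

```latex
\begin{align*}
\operatorname{spread}(U_{r+1})
  &\le \frac{\max(U_r) + u}{2} - \frac{\min(U_r) + u}{2}
   = \frac{\max(U_r) - \min(U_r)}{2}
   = \frac{\operatorname{spread}(U_r)}{2}, \\
\min(U_r) \le u \le \max(U_r)
  &\implies \min(U_r) \le \frac{\min(U_r) + u}{2}
   \quad\text{and}\quad \frac{\max(U_r) + u}{2} \le \max(U_r).
\end{align*}
```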
17 Correctness of Algorithm
- Termination: Each proc executes M asynchronous rounds.
- Validity: The range at each round is contained in the range at the previous round.
- ε-Agreement: round 1 writes inputs, so the halving lemma applies M - 1 times:
  - spread(U_M) ≤ spread(U_0)/2^(M-1)
  - ≤ D/2^(M-1)
  - ≤ ε
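As a worked instance with hypothetical numbers (not from the lecture), take D = 100 and ε = 1. Round 1 writes the inputs themselves, so the spread is halved M - 1 times by the time round M is written:

```latex
\begin{align*}
M &= \lceil \log_2(D/\varepsilon) \rceil + 1 = \lceil \log_2 100 \rceil + 1 = 7 + 1 = 8, \\
\operatorname{spread}(U_M) &\le \frac{D}{2^{M-1}} = \frac{100}{128} \approx 0.78 \le \varepsilon = 1.
\end{align*}
```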
18 Handling Unknown Input Range
- Range might not be known.
- Actual range in an execution might be much smaller than maximum possible range.
- First idea: have a preprocessing phase in which procs try to determine input range
  - but asynchrony and possible failures make this approach problematic
19 Handling Unknown Input Range
- Use just one atomic snapshot object
- Dynamically recalculate how many rounds are needed as more inputs are revealed
- Skip over rounds to try to catch up to maximum observed round
- Only consider values associated with maximum observed round
- Still use midpoint
20 Unknown Input Range Algorithm
- shared atomic snapshot object A; initially all segments ⊥
- update_i(A, <x, 1, x>), where x is p_i's input
- repeat
  - scan A
  - let S be spread of all inputs in non-⊥ segments
  - if S = 0 then maxRound := 0
  - else maxRound := log2(S/ε)
  - let rmax be largest round in non-⊥ segments
  - let values be set of candidates in segments with round number rmax
  - update p_i's segment in A with <x, rmax + 1, midpoint(values)>
- until rmax ≥ maxRound
- decide midpoint(values)
21 Analysis of Unknown Input Range Algorithm
- Definitions w.r.t. a particular execution:
  - U_0 = set of all input values
  - U_r = set of all values ever written to A with round number r
  - M = largest r s.t. U_r is not empty
- With these changes, correctness proof is similar to that for known input range algorithm.
22 Key Differences in Proof
- Why does termination hold?
  - a proc's local maxRound variable can only increase if another proc wakes up and increases the spread of the observable inputs. This can happen at most n - 1 times.
- Why does ε-agreement hold?
  - If p_i's input is observed by p_j the last time p_j computes its maxRound, same argument as before.
  - Otherwise, when p_i wakes up, it ignores its own input and uses values from maxRound or later.
23 Renaming
- Procs start with unique names from a large domain
- Procs should pick new names that are still distinct but that are from a smaller domain
- Motivation: Suppose original names are serial numbers (many digits), but we'd like the procs to do some kind of time slicing based on their ids
24 Renaming Problem Definition
- Termination: Eventually every nonfaulty proc p_i decides on a new name y_i
- Uniqueness: If p_i and p_j are distinct nonfaulty procs, then y_i ≠ y_j.
- We are interested in anonymous algorithms: procs don't have access to their indices, just to their original names. Code depends only on your original name.
25 Performance of Renaming Algorithm
- New names should be drawn from {1, 2, ..., M}.
- We would like M to be as small as possible.
- Uniqueness implies M must be at least n.
- Due to the possibility of failures, M will
actually be larger than n.
26 Renaming Results
- Algorithm for wait-free case (f = n - 1) with M = n + f = 2n - 1.
- Algorithm for general f with M = n + f.
- Lower bound that M must be at least n + 1, for wait-free case.
  - Proof similar to impossibility of wait-free consensus
- Stronger lower bound that M must be at least n + f, if f is the number of failures
  - Proof uses algebraic topology and is related to lower bound for set consensus
27 Wait-Free Renaming Algorithm
- Shared atomic snapshot object A; initially all segments ⊥
- s := 1 // suggestion for my new name
- while true do
  - update p_i's segment of A to be <x, s>, where x is p_i's original name
  - scan A
  - if s is also someone else's suggestion then
    - let r be rank of x among original names of non-⊥ segments
    - let s be r-th smallest positive integer not currently suggested by another proc
  - else decide on s for new name and terminate
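The loop above can be sketched as a single-threaded simulation. This is an illustration under stated assumptions, not the lecture's code: the snapshot object is a Python list (update and scan are trivially atomic), procs are generators interleaved by a seeded random scheduler, and crashed procs crash before taking any step. The names `wait_free_renaming`, `rth_free`, `crashed`, and `seed` are hypothetical. The index `i` is used only to locate a proc's own segment; the choice of new name depends only on original names.

```python
import itertools
import random

def rth_free(r, taken):
    """Return the r-th smallest positive integer not in `taken`."""
    free = (c for c in itertools.count(1) if c not in taken)
    return next(itertools.islice(free, r - 1, None))

def wait_free_renaming(names, crashed=(), seed=0):
    """Simulate the renaming loop; return {proc index: new name}."""
    n = len(names)
    A = [None] * n                       # segment: (original name, suggestion)
    new_names = {}

    def proc(i):
        x, s = names[i], 1               # s is the suggestion for the new name
        while True:
            A[i] = (x, s)                # update own segment
            yield
            view = [(j, seg) for j, seg in enumerate(A) if seg is not None]  # scan
            others = {seg[1] for j, seg in view if j != i}
            if s in others:
                r = sorted(seg[0] for _, seg in view).index(x) + 1  # rank of x
                s = rth_free(r, others)
                yield
            else:
                new_names[i] = s         # decide; the segment keeps suggesting s
                return

    rng = random.Random(seed)            # assumption: reproducible random schedule
    running = {i: proc(i) for i in range(n) if i not in crashed}
    while running:
        i = rng.choice(sorted(running))
        try:
            next(running[i])             # run proc i up to its next yield
        except StopIteration:
            del running[i]
    return new_names
```

With four procs, every run should yield four distinct new names in the range 1 to 2n - 1 = 7.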
28 Analysis of Renaming Algorithm
- Uniqueness: Suppose in contradiction p_i and p_j choose same new name, s.
  - W.l.o.g., p_i's last scan before deciding s precedes p_j's last scan before deciding s.
  - p_i's last update before deciding suggests s, and it precedes p_i's last scan.
  - So p_j's last scan sees s as p_i's suggestion and p_j doesn't decide s: contradiction!
29 Analysis of Renaming Algorithm
- New name space is {1, ..., 2n - 1}.
- Why?
  - rank of a proc p_i's original name is at most n (the largest one)
  - worst case is when each of the n - 1 other procs has suggested a different new name for itself, say 1, ..., n - 1.
  - Then p_i suggests n + (n - 1) = 2n - 1.
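The worst case can be checked with a few lines of arithmetic. The helper name `rth_smallest_unused` and the choice n = 4 are hypothetical, picked only for illustration.

```python
def rth_smallest_unused(r, taken):
    """r-th smallest positive integer not in `taken` (simple linear search)."""
    candidate, found = 0, 0
    while found < r:
        candidate += 1
        if candidate not in taken:
            found += 1
    return candidate

n = 4
others = set(range(1, n))                  # the n - 1 other procs suggest 1, ..., n - 1
worst = rth_smallest_unused(n, others)     # the proc with the largest name has rank n
print(worst)                               # 7, which equals 2n - 1
```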
30 Analysis of Renaming Algorithm
- Termination: Suppose in contradiction some set T of nonfaulty procs never decide in some execution.
- Consider the suffix α of the execution in which
  - each proc in T has already done at least one update and
  - only procs in T take steps (others have either already crashed or decided).
31 Analysis of Renaming Algorithm
- Let F be the set of new names that are free (not suggested at the beginning of α by any proc not in T); the trying procs need to choose new names from this set.
- Let z_1, z_2, ... be the names in F in order.
- By the definition of α, no proc wakes up during α and reveals an additional original name, so all procs in T are working with the same set of original names during α.
- Let p_i be the proc in T whose original name has smallest rank (among this set of original names). Let r be this rank.
32 Analysis of Renaming Algorithm
- Eventually procs other than p_i stop suggesting z_r as a new name:
  - After α starts, every scan indicates a set of free names that is no larger than F.
  - Every trying proc other than p_i has a larger rank and thus continually suggests a new name for itself that is larger than z_r, once it does the first scan in α.
33 Analysis of Renaming Algorithm
- Eventually p_i does suggest z_r as its new name:
  - By choice of z_r as r-th smallest free new name, and the fact that eventually other trying procs stop suggesting z_1 through z_r, eventually p_i sees z_r as the free name with r-th smallest rank.
- Contradicts assumption that p_i is trying (i.e., stuck).
- So termination holds.
34 General Renaming
- Suppose we know that at most f procs will fail, where f is not necessarily n - 1.
- We can use the wait-free algorithm, but it is wasteful in the size of the new name space, 2n - 1, if f < n - 1.
- We can do better (if f < n - 1) with a slightly different algorithm:
  - keep track in the snapshot object of whether you have decided
  - an undecided proc suggests a new name only if its original name is among the f + 1 lowest names of procs that have not yet decided.
35 k-Exclusion Problem
- A fault-tolerant version of mutual exclusion.
- Processors can fail by crashing, even in the critical section (stay there forever).
- Allow up to k processors to be in the critical section simultaneously.
- If fewer than k processors fail, then any nonfaulty processor that wishes to enter the critical section eventually does so.
36 k-Exclusion Algorithm
- cf. paper by Afek et al. [5].
37 k-Assignment Problem
- A specialization of k-Exclusion to include
  - Uniqueness: Each proc in the critical section has a variable called slot, which is an integer between 1 and m. If p_i and p_j are in the C.S. concurrently, then they have different slots.
- Models situation when there is a pool of identical resources, each of which must be used exclusively
  - k is number of procs that can be in the pool concurrently
  - m is the number of resources
- To handle failures, m should be larger than k
38 k-Assignment Algorithm Schema
- k-assignment entry section:
  - k-exclusion entry section
  - renaming using m = 2k - 1 names
- what about repeated invocations?
- k-assignment exit section
39 k-Assignment Algorithm Schema
- k-assignment entry section:
  - k-exclusion entry section
- k-assignment exit section:
  - k-exclusion exit section