CPSC 668 Distributed Algorithms and Systems

1
CPSC 668 Distributed Algorithms and Systems
  • Fall 2006
  • Prof. Jennifer Welch

2
Problems Solvable in Failure-Prone Asynchronous
Systems
  • Although consensus is not solvable in
    failure-prone asynchronous systems (neither
    message passing nor read/write shared memory),
    there are some interesting problems that are
    solvable
  • set consensus and approximate agreement: weakenings of consensus
  • renaming: "opposite" of consensus
  • k-exclusion: fault-tolerant variant of mutex

3
Model Assumptions
  • asynchronous
  • shared memory with read/write registers
  • at most f crash failures of procs.
  • results can be translated to message passing if
    f < n/2 (cf. Chapter 10)
  • may be a few asides into message passing

4
Set Consensus Motivation
  • By judiciously weakening the definition of the
    consensus problem, we can overcome the
    asynchronous impossibility
  • We've already seen a weakening of consensus
  • weaker termination condition for randomized
    algorithms
  • How about weakening the agreement condition?
  • One weakening is to allow more than one decision
    value
  • allow a set of decisions

5
Set Consensus Definition
  • Termination: Eventually, each nonfaulty
    processor decides.
  • k-Agreement (new): The number of different values
    decided on by nonfaulty processors is at most k.
  • Validity: Every nonfaulty processor decides on a
    value that is the input of some processor.

6
Set Consensus Algorithm
  • Uses a shared atomic snapshot object X
  • can be implemented with read/write registers
  • update your segment of X with your input
  • repeatedly scan X until there are at least n - f
    nonempty segments
  • decide on minimum value appearing in any segment
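
The three steps above can be tried out in a small simulation. This is only a sketch under assumptions that are not in the slides: the snapshot object is a plain Python list, each update and each scan counts as one atomic step, a random scheduler picks which proc moves next, and procs listed in the hypothetical `crashed` parameter never take a step.

```python
import random

def set_consensus(inputs, f, crashed=(), seed=0):
    """Simulate update / scan-until-(n - f) / decide-min under a random schedule.

    Assumptions (not from the slides): update and scan are single atomic
    steps, and procs in `crashed` fail before taking any step.
    """
    rng = random.Random(seed)
    n = len(inputs)
    assert len(crashed) <= f < n
    X = [None] * n                  # snapshot object: one segment per proc
    decisions = {}

    def proc(i):
        X[i] = inputs[i]            # update own segment with own input
        yield
        while True:
            view = [v for v in X if v is not None]   # atomic scan
            if len(view) >= n - f:
                decisions[i] = min(view)             # decide on the minimum
                return
            yield

    running = {i: proc(i) for i in range(n) if i not in crashed}
    while running:
        i = rng.choice(sorted(running))              # scheduler picks a proc
        try:
            next(running[i])
        except StopIteration:
            del running[i]
    return decisions
```

For example, `set_consensus([5, 3, 9, 1, 7], f=2, crashed={3})` returns one decision per nonfaulty proc, and the number of distinct decisions never exceeds f + 1 = 3.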

7
Correctness of Set Consensus Algorithm
  • Termination: at most f procs crash, so at least
    n - f segments eventually become nonempty and
    every nonfaulty proc's scan loop ends.
  • Validity: every decision is some proc's input.
  • Why does k-agreement hold?
  • We'll show it does as long as k > f.
  • Sanity check: When k = 1, we have standard
    consensus. As long as there is less than 1
    failure (i.e., f = 0), we can solve the problem.

8
k-Set Agreement Condition
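  • Let S be the set of min values in the final scan of
    each nonfaulty proc; these are the nonfaulty decisions.
  • Suppose in contradiction |S| > f + 1.
  • Let v be the largest value in S, the decision of pi.
  • Every other value in S is an input smaller than v, so
    it cannot appear in pi's final scan (otherwise pi
    would have decided on something smaller than v).
  • So pi's final scan misses at least f + 1 values,
    contradicting the code.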

9
Set Consensus Lower Bound
  • Theorem: There is no algorithm for solving k-set
    consensus in the presence of f failures, if f ≥
    k.
  • Straightforward extensions of the consensus
    impossibility result fail; even proving the
    existence of an initial bivalent configuration is
    quite involved.
  • Original proof of set-consensus impossibility
    used concepts from algebraic topology
  • Textbook's proof uses more elementary machinery,
    but still rather involved

10
Approximate Agreement Motivation
  • An alternative way to weaken the agreement
    condition for consensus
  • Require that the decisions be close to each
    other, but not necessarily equal
  • Seems appropriate for continuous-valued problems
    (as opposed to discrete)

11
Approximate Agreement Definition
  • Termination: Eventually, each nonfaulty
    processor decides.
  • ε-Agreement (new): All nonfaulty decisions are within
    ε of each other.
  • Validity (new): Every nonfaulty decision is within the
    range of the input values.

12
Approximate Agreement Algorithm
  • Assume procs know the range from which input
    values are drawn
  • let D be the length of this range
  • up to n - 1 procs can fail
  • algorithm is structured as a series of
    "asynchronous rounds"
  • exchange values via a snapshot object, one per
    round
  • compute midpoint for next round
  • continue until spread of values is within ε,
    which requires about log2(D/ε) rounds

13
Approximate Agreement Algorithm
  • Initially local variable v = pi's input
  • Initially local variable r = 1
  • while true do
  •   update pi's segment of ASO_r to be v
  •   let scan be the set of values obtained by scanning
      ASO_r
  •   v := midpoint(scan)
  •   if r = ⌈log2(D/ε)⌉ + 1 then decide v and
      terminate
  •   else r := r + 1
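
The round structure can be exercised with a short simulation. This is a sketch under assumptions that are not in the slides: each snapshot object is a Python list, every update and every scan is one atomic step, a random scheduler interleaves the procs, and no proc crashes.

```python
import math
import random

def approximate_agreement(inputs, D, eps, seed=0):
    """Run the known-range algorithm under a random interleaving.

    Assumptions (not from the slides): update and scan are single atomic
    steps and no proc crashes.
    """
    rng = random.Random(seed)
    n = len(inputs)
    M = math.ceil(math.log2(D / eps)) + 1        # number of rounds
    ASO = [[None] * n for _ in range(M + 1)]     # ASO[r] is used for r = 1..M
    decisions = {}

    def proc(i):
        v = inputs[i]
        for r in range(1, M + 1):
            ASO[r][i] = v                        # update own segment of ASO_r
            yield
            scan = [u for u in ASO[r] if u is not None]
            v = (min(scan) + max(scan)) / 2      # midpoint of the scanned values
            yield
        decisions[i] = v                         # decide after round M

    running = {i: proc(i) for i in range(n)}
    while running:
        i = rng.choice(sorted(running))          # scheduler picks a proc
        try:
            next(running[i])
        except StopIteration:
            del running[i]
    return decisions
```

With inputs drawn from a range of length D = 10 and ε = 0.1, each proc runs 8 rounds, and the decisions end up within ε of each other and inside the input range.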

14
Analysis of Algorithm
  • Definitions w.r.t. a particular execution
  • M = ⌈log2(D/ε)⌉ + 1
  • U_0 = set of input values
  • U_r = set of all values ever written to
    ASO_r

15
Helpful Lemma
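  • Lemma (16.8): Consider any round r < M. Let u be
    the first value written to ASO_r. Then the
    values written to ASO_{r+1} are in the range
    [(min(U_r) + u)/2, (max(U_r) + u)/2].
  • [Figure: number line marking min(U_r),
    (min(U_r) + u)/2, u, (max(U_r) + u)/2, max(U_r);
    the elements of U_{r+1} lie between the two midpoints.]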
16
Implications of Lemma
  • The range of values written to the ASO object for
    round r + 1 is contained within the range of
    values written to the ASO object for round r.
  • range(U_{r+1}) ⊆ range(U_r)
  • The spread (max - min) of values written to the
    ASO object for round r + 1 is at most half the
    spread of values written to the ASO object for
    round r.
  • spread(U_{r+1}) ≤ spread(U_r)/2

17
Correctness of Algorithm
  • Termination: Each proc executes M asynchronous
    rounds.
  • Validity: The range at each round is contained
    in the range at the previous round.
  • ε-Agreement: decisions are midpoints of scans of
    ASO_M, so they lie in range(U_M), and
  • spread(U_M) ≤ spread(U_1)/2^(M-1)
  • ≤ D/2^(M-1)
  • ≤ ε
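
Why M rounds are enough can be spelled out as a short calculation. This is a sketch that assumes the halving lemma is applied once per step from ASO_1 to ASO_M (M - 1 halvings), and that the values written to ASO_1 are inputs, so their spread is at most D:

```latex
\begin{align*}
M - 1 = \lceil \log_2 (D/\varepsilon) \rceil \ge \log_2 (D/\varepsilon)
  &\implies 2^{M-1} \ge D/\varepsilon \\
  &\implies \operatorname{spread}(U_M) \le \frac{D}{2^{M-1}} \le \varepsilon .
\end{align*}
```

For instance, D = 10 and ε = 0.1 give M = 8, and D/2^7 ≈ 0.078 ≤ 0.1.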

18
Handling Unknown Input Range
  • Range might not be known.
  • Actual range in an execution might be much
    smaller than maximum possible range.
  • First idea have a preprocessing phase in which
    procs try to determine input range
  • but asynchrony and possible failures make this
    approach problematic

19
Handling Unknown Input Range
  • Use just one atomic snapshot object
  • Dynamically recalculate how many rounds are
    needed as more inputs are revealed
  • Skip over rounds to try to catch up to maximum
    observed round
  • Only consider values associated with maximum
    observed round
  • Still use midpoint

20
Unknown Input Range Algorithm
  • shared atomic snapshot object A; initially all
    segments ⊥
  • update_i(A, ⟨x, 1, x⟩), where x is pi's input
  • repeat
  •   scan A
  •   let S be the spread of all inputs in non-⊥
      segments
  •   if S = 0 then maxRound := 0
  •   else maxRound := log2(S/ε)
  •   let rmax be the largest round in non-⊥ segments
  •   let values be the set of candidates in segments
      with round number rmax
  •   update pi's segment in A with
      ⟨x, rmax + 1, midpt(values)⟩
  • until rmax ≥ maxRound
  • decide midpoint(values)
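
One pass of the repeat loop can be written as a pure function of the scan result, which makes the bookkeeping easier to follow. This is a sketch only: the function name `iteration`, the tuple layout `(input, round, candidate)`, the use of `None` for an empty segment, and the exit test `rmax >= maxRound` are assumptions made for illustration.

```python
import math

def iteration(snapshot, x, eps):
    """One pass of the repeat loop for a proc whose input is x.

    `snapshot` is a scan result: a list of None (empty segment) or
    (input, round, candidate) triples. This layout is hypothetical.
    Returns (triple_to_write, done, candidate).
    """
    segs = [s for s in snapshot if s is not None]
    inputs = [s[0] for s in segs]
    S = max(inputs) - min(inputs)                   # spread of revealed inputs
    max_round = 0 if S == 0 else math.log2(S / eps)
    rmax = max(s[1] for s in segs)                  # largest round observed
    values = [s[2] for s in segs if s[1] == rmax]   # candidates at round rmax
    mid = (min(values) + max(values)) / 2
    done = rmax >= max_round                        # assumed exit test
    return (x, rmax + 1, mid), done, mid
```

A proc would write the returned triple to its own segment, then decide on the returned candidate once `done` is true, or scan again otherwise.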

21
Analysis of Unknown Input Range Algorithm
  • Definitions w.r.t. a particular execution
  • U_0 = set of all input values
  • U_r = set of all values ever written to A with
    round number r
  • M = largest r s.t. U_r is not empty
  • With these changes, correctness proof is similar
    to that for known input range algorithm.

22
Key Differences in Proof
  • Why does termination hold?
  • a proc's local maxRound variable can only
    increase if another proc wakes up and increases
    the spread of the observable inputs. This can
    happen at most n - 1 times.
  • Why does ε-agreement hold?
  • If pi's input is observed by pj the last time pj
    computes its maxRound, same argument as before.
  • Otherwise, when pi wakes up, it ignores its own
    input and uses values from maxRound or later.

23
Renaming
  • Procs start with unique names from a large domain
  • Procs should pick new names that are still
    distinct but that are from a smaller domain
  • Motivation: Suppose original names are serial
    numbers (many digits), but we'd like the procs to
    do some kind of time slicing based on their ids

24
Renaming Problem Definition
  • Termination: Eventually every nonfaulty proc pi
    decides on a new name y_i.
  • Uniqueness: If pi and pj are distinct nonfaulty
    procs, then y_i ≠ y_j.
  • We are interested in anonymous algorithms: procs
    don't have access to their indices, just to their
    original names. Code depends only on your
    original name.

25
Performance of Renaming Algorithm
  • New names should be drawn from {1, 2, ..., M}.
  • We would like M to be as small as possible.
  • Uniqueness implies M must be at least n.
  • Due to the possibility of failures, M will
    actually be larger than n.

26
Renaming Results
  • Algorithm for wait-free case (f = n - 1) with M =
    n + f = 2n - 1.
  • Algorithm for general f with M = n + f.
  • Lower bound that M must be at least n + 1, for
    wait-free case.
  • Proof similar to impossibility of wait-free
    consensus
  • Stronger lower bound that M must be at least n +
    f, if f is the number of failures
  • Proof uses algebraic topology and is related to
    lower bound for set consensus

27
Wait-Free Renaming Algorithm
  • Shared atomic snapshot object A; initially all
    segments ⊥
  • s := 1 // suggestion for my new name
  • while true do
  •   update pi's segment of A to be ⟨x, s⟩, where x
      is pi's original name
  •   scan A
  •   if s is also someone else's suggestion then
  •     let r be the rank of x among original names
        of non-⊥ segments
  •     let s be the r-th smallest positive integer
        not currently suggested by another proc
  •   else decide on s for new name and terminate
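
The loop above can be run as a small simulation to watch the suggestions settle. This is a sketch under assumptions that are not in the slides: the snapshot object is a Python list, each update and each scan is one atomic step, a random scheduler interleaves the procs, no proc crashes, and the original names are distinct.

```python
import itertools
import random

def wait_free_renaming(names, seed=0):
    """Simulate the renaming loop under a random schedule.

    Assumptions (not from the slides): update and scan are single atomic
    steps, no proc crashes, and the original names are distinct.
    """
    rng = random.Random(seed)
    n = len(names)
    A = [None] * n        # segment i: (original name, suggestion) or None
    new_name = {}

    def proc(i):
        x, s = names[i], 1
        while True:
            A[i] = (x, s)                                  # update
            yield
            view = [seg for seg in A if seg is not None]   # scan
            others = {t for (y, t) in view if y != x}
            if s in others:
                # rank of x among the original names revealed so far
                r = sorted(y for (y, _) in view).index(x) + 1
                free = (k for k in itertools.count(1) if k not in others)
                s = next(itertools.islice(free, r - 1, None))  # r-th free name
                yield
            else:
                new_name[x] = s                            # decide
                return

    running = {i: proc(i) for i in range(n)}
    while running:
        i = rng.choice(sorted(running))                    # scheduler picks a proc
        try:
            next(running[i])
        except StopIteration:
            del running[i]
    return new_name
```

Calling `wait_free_renaming([907, 15, 422, 388])` maps each original name to a distinct new name in {1, ..., 7}, that is, {1, ..., 2n - 1} for n = 4.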

28
Analysis of Renaming Algorithm
  • Uniqueness: Suppose in contradiction pi and pj
    choose the same new name, s.
  • Without loss of generality, pi's last scan before
    deciding s precedes pj's last scan before deciding s.
  • pi's last update before deciding suggests s, and it
    precedes pi's last scan.
  • So pj's last scan sees s as pi's suggestion, and pj
    doesn't decide s. Contradiction!

29
Analysis of Renaming Algorithm
  • New name space is {1, ..., 2n - 1}.
  • Why?
  • rank of a proc pi's original name is at most n
    (the largest one)
  • worst case is when each of the n - 1 other procs
    has suggested a different new name for itself,
    say 1, ..., n - 1.
  • Then pi suggests n + (n - 1) = 2n - 1.

30
Analysis of Renaming Algorithm
  • Termination: Suppose in contradiction some set T
    of nonfaulty procs never decide in some
    execution.
  • Consider the suffix α of the execution in which
  • each proc in T has already done at least one
    update and
  • only procs in T take steps (others have either
    already crashed or decided).

31
Analysis of Renaming Algorithm
  • Let F be the set of new names that are free (not
    suggested at the beginning of α by any proc not
    in T); the trying procs need to choose new
    names from this set.
  • Let z_1, z_2, ... be the names in F in order.
  • By the definition of α, no proc wakes up during α
    and reveals an additional original name, so all
    procs in T are working with the same set of
    original names during α.
  • Let pi be the proc whose original name has smallest
    rank (among this set of original names). Let r
    be this rank.

32
Analysis of Renaming Algorithm
  • Eventually procs other than pi stop suggesting z_r
    as a new name
  • After α starts, every scan indicates a set of
    free names that is no larger than F.
  • Every trying proc other than pi has a larger rank
    and thus continually suggests a new name for
    itself that is larger than z_r, once it does the
    first scan in α.

33
Analysis of Renaming Algorithm
  • Eventually pi does suggest z_r as its new name
  • By choice of z_r as the r-th smallest free new name,
    and the fact that eventually other trying procs stop
    suggesting z_1 through z_r, eventually pi sees z_r
    as the r-th smallest free name.
  • No other proc suggests z_r from then on, so pi's
    next scan lets it decide on z_r.
  • Contradicts assumption that pi is trying (i.e.,
    stuck).
  • So termination holds.

34
General Renaming
  • Suppose we know that at most f procs will fail,
    where f is not necessarily n - 1.
  • We can use the wait-free algorithm, but it is
    wasteful in the size of the new name space,
    2n - 1, if f < n - 1.
  • We can do better (if f < n - 1) with a slightly
    different algorithm
  • keep track in the snapshot object of whether you
    have decided
  • an undecided proc suggests a new name only if its
    original name is among the f + 1 lowest names of
    procs that have not yet decided.
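
The extra rule in the last bullet amounts to a filter applied to each scan. A minimal sketch, assuming a hypothetical segment layout `(original_name, suggestion, decided)` with `None` for an empty segment; neither the layout nor the function name comes from the slides.

```python
def may_suggest(x, view, f):
    """Return True if the proc with original name x may suggest a new name.

    `view` is a scan result: a list of None or
    (original_name, suggestion, decided) triples (hypothetical layout).
    """
    # original names of procs that have revealed themselves but not decided
    undecided = sorted(seg[0] for seg in view
                       if seg is not None and not seg[2])
    return x in undecided[:f + 1]      # among the f + 1 lowest such names
```

With f = 1 and undecided names 20, 40 and 70 in the scan, only the procs named 20 and 40 are allowed to suggest; the proc named 70 keeps scanning until one of them decides.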

35
k-Exclusion Problem
  • A fault-tolerant version of mutual exclusion.
  • Processors can fail by crashing, even in the
    critical section (stay there forever).
  • Allow up to k processors to be in the critical
    section simultaneously.
  • If fewer than k processors fail, then any nonfaulty
    processor that wishes to enter the critical
    section eventually does so.

36
k-Exclusion Algorithm
  • cf. paper by Afek et al. [5].

37
k-Assignment Problem
  • A specialization of k-Exclusion to include:
  • Uniqueness: Each proc in the critical section
    has a variable called slot, which is an integer
    between 1 and m. If pi and pj are in the C.S.
    concurrently, then they have different slots.
  • Models situation when there is a pool of
    identical resources, each of which must be used
    exclusively
  • k is number of procs that can be in the pool
    concurrently
  • m is the number of resources
  • To handle failures, m should be larger than k

38
k-Assignment Algorithm Schema
k-assignment entry section
k-exclusion entry section
renaming using m = 2k - 1 names
what about repeated invocations?
k-assignment exit section
39
k-Assignment Algorithm Schema
k-assignment entry section
k-exclusion entry section
k-assignment exit section
k-exclusion exit section