Title: Lectures 23-24 Distributed algorithms: Consensus and Broadcast
1. Lectures 23-24: Distributed algorithms: Consensus and Broadcast
- COMP 523 Advanced Algorithmic Techniques
- Lecturer: Dariusz Kowalski
2. Overview
- Previous lectures
  - Parallel machine model
  - Problem of finding maximum
  - Prefix computation
- These lectures
  - Distributed message-passing model
  - Consensus in synchronized setting
  - Broadcast in asynchronous setting
3. Distributed message-passing model
- Set of n processors with distinct IDs p1,...,pn
- In each step each processor can (depending on the algorithm)
  - send a message to any subset of other processors
  - receive incoming messages
  - perform local computation
- Computation can proceed either (depending on the adversary)
  - in synchronized rounds: in a round every processor performs three steps (local computation, sending, and receiving), e.g., (p1,p2,p3), (p1,p2,p3), (p1,p2,p3),...
  - in an asynchronous pattern: steps are done in some arbitrary order unknown to the processors, e.g., p1,p2,p2,p3,p2,p3,p2,p1,...
4. Fault-tolerance
- Failures in the system
  - Lack of synchrony: the unknown order of steps is generated by the adversary
  - Processor crashes: the adversary decides which processors crash and chooses the steps at which these events occur
  - Message loss (messages not properly sent or received): malicious processors/links are selected by the adversary
  - Byzantine failures: processors may cheat, e.g., behave in any of the ways described above, corrupt the content of messages, pretend to have a different ID, etc.
5. Analysis of distributed algorithms
- When designing an algorithm, our goal is to prove
  - Correctness: not obvious because of the lack of central information and because of failures
  - Termination: not obvious because of the lack of central control
  - Efficiency
    - Time
    - Work (total number of processor steps)
    - Number of messages sent
    - Total size of messages sent
6. Consensus in synchronous crash model
- Consensus
  - Each processor has its initial value
  - Goal: processors decide on the same value among the initial ones
- We require from the algorithm
  - Agreement: no two processors decide on different values
  - Termination: each processor eventually decides unless it fails
  - Validity: if all initial values are the same, then this value is the decision
7. Model for consensus problem
- We consider the model with crash failures (easier than others, e.g., Byzantine failures): a crashed processor stops all activity, and messages sent during the crash are delivered or lost arbitrarily (depending on the adversary)
- Asynchronous: impossible to solve even if only one processor can crash
- Synchronous: requires at least f+1 rounds if f processors crash
- Consensus can be viewed as a kind of maximum-finding problem: let us agree on the largest initial value (although it could be easier, since we could agree on any initial value)
8. Flooding algorithm for consensus
- f-resilient algorithm: an algorithm that solves the consensus problem if at most f crashes occur
- Flooding Algorithm
  - During each round 1 ≤ j ≤ f+1, each processor sends to all other processors all the initial values it has learnt so far
  - Decision of a processor: if the set of collected initial values is a singleton, then decide on this value; otherwise decide on the default value
9. Flooding algorithm - example
- 4 processors, f = 2 crashes, default = maximum
- Values known to each processor after each round (--- marks a crashed processor)

      Init   R1     R2     R3     Decision
  p1  1      ---    ---    ---    ---
  p2  0      0,1    ---    ---    ---
  p3  0      0      0,1    0,1    1
  p4  0      0      0      0,1    1
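The example above can be replayed with a small simulation. This is only a sketch under stated assumptions: the function name `flooding_consensus`, the `crashes` schedule format, and the choice of which processors each crashing processor still reaches (p1 reaches only p2, p2 reaches only p3) are illustrative and were inferred from the table, not stated on the slide.

```python
def flooding_consensus(initial, f, crashes, default=max):
    """Simulate flooding for f+1 synchronous rounds.

    initial: {pid: initial value}
    crashes: {pid: (round, recipients reached during the crash round)}
             (hypothetical format for the adversary's crash schedule)
    Returns (decisions of surviving processors, known sets after each round).
    """
    known = {p: {v} for p, v in initial.items()}
    alive = set(initial)
    history = []
    for rnd in range(1, f + 2):
        inbox = {p: set() for p in alive}
        crashing = set()
        for p in alive:
            if p in crashes and crashes[p][0] == rnd:
                targets = crashes[p][1]      # partial delivery, then crash
                crashing.add(p)
            else:
                targets = alive - {p}        # normal step: send to everyone
            for q in targets:
                if q in inbox:
                    inbox[q] |= known[p]
        alive -= crashing
        for p in alive:
            known[p] |= inbox[p]
        history.append({p: set(known[p]) for p in alive})
    decisions = {
        p: next(iter(known[p])) if len(known[p]) == 1 else default(known[p])
        for p in alive
    }
    return decisions, history


# Schedule inferred from the table: p1 crashes in round 1, p2 in round 2.
decisions, history = flooding_consensus(
    initial={1: 1, 2: 0, 3: 0, 4: 0},
    f=2,
    crashes={1: (1, {2}), 2: (2, {3})},
)
print(decisions)
```

With this schedule the value 1 travels along the chain p1, p2, p3, p4, one hop per round, which is why f+1 = 3 rounds are needed before p4 learns it.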
10. Analysis of Flooding algorithm
- Agreement: since at most f crashes occur during f+1 rounds, there is a round j (a clean round) in which no crash occurs. During this round all non-faulty processors exchange messages, hence their sets of collected values are the same after this round. These sets do not change after this round, and consequently all non-faulty processors decide on the same value
- Termination: after round f+1
- Validity: if all initial values are the same, the set of collected initial values is always a singleton, and the decision is on this value
- Message complexity: the total number of messages sent is O(f n^2)
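The message bound can be checked by direct counting. The following is a short worked calculation, assuming every processor sends one message to each other processor in every round and that f is at least 1:

```latex
\[
\underbrace{(f+1)}_{\text{rounds}} \cdot
\underbrace{n}_{\text{senders}} \cdot
\underbrace{(n-1)}_{\text{recipients per sender}}
\;\le\; (f+1)\,n^{2} \;=\; O(f\,n^{2}).
\]
```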
11. Decreasing message complexity
- Modification of the algorithm
  - A processor sends messages to all processors during the first round, and during a round j > 1 only if in the previous round it learnt a new initial value
- Termination and Validity remain the same
- Agreement: similar argument; the only difference is that the full message exchange may not happen in the clean round itself, but by the end of the clean round all known values have been sent: previously learnt values were sent before this round, and new ones are sent during this round
- Communication: there are at most n different values and each of them is sent as new at most n-1 times, hence O(n^2)
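The effect of the modification can be illustrated by counting messages in a failure-free run of both variants. This is a minimal sketch, not a proof of the bound: the function name `flood` and the flag `send_only_new` are illustrative, and the sketch assumes that in the modified variant a processor forwards only the values it learnt in the previous round (the slide does not say whether the whole set or only the new values are sent).

```python
def flood(initial, f, send_only_new):
    """Run f+1 failure-free rounds; return (decisions, number of messages)."""
    known = {p: {v} for p, v in initial.items()}
    # Values learnt in the previous round; initially each processor's own value,
    # so every processor sends in round 1 in both variants.
    fresh = {p: {v} for p, v in initial.items()}
    messages = 0
    for rnd in range(1, f + 2):
        inbox = {p: set() for p in initial}
        for p in initial:
            if send_only_new and not fresh[p]:
                continue                      # learnt nothing new: stay silent
            payload = fresh[p] if send_only_new else known[p]
            for q in initial:
                if q != p:
                    inbox[q] |= payload
                    messages += 1
        for p in initial:
            fresh[p] = inbox[p] - known[p]
            known[p] |= fresh[p]
    decisions = {
        p: next(iter(s)) if len(s) == 1 else max(s)   # default = maximum
        for p, s in known.items()
    }
    return decisions, messages


initial = {1: 1, 2: 0, 3: 0, 4: 0}
basic = flood(initial, f=2, send_only_new=False)
modified = flood(initial, f=2, send_only_new=True)
print(basic[1], modified[1])
```

In this run the modified variant falls silent in the third round because no processor learnt anything new in the second, while the basic variant keeps sending in every round.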
12. Broadcast in asynchronous model
- Communication problems
  - Broadcasting: one processor, called the source, has to inform all other processors about its initial value
  - Gossip: each processor has to inform all other processors about its initial value
  - Others
- Asynchronous model
  - Execution: a sequence (C0,e1,C1,e2,C2,...), where ei is an event (local computation, sending a message, receiving a message) and Ci is the configuration of the whole system (collection of states of all processors) resulting from configuration Ci-1 and event ei
  - Trace: the sequence of events (e1,e2,...) extracted from the execution
  - Enabled event: an event that may eventually happen, e.g., after the event of sending a message M from p to q, the event of receiving message M by q from p is enabled immediately
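The definitions of execution, trace, and enabled event can be made concrete with a tiny sketch. The tuple encoding of events and the helper names `trace` and `newly_enabled` are assumptions made for illustration only.

```python
def trace(execution):
    """Extract the events (e1, e2, ...) from an execution (C0, e1, C1, e2, C2, ...)."""
    return execution[1::2]


def newly_enabled(event):
    """Events enabled by `event`: sending M from p to q enables the matching receive."""
    kind, p, q, message = event
    return [("receive", p, q, message)] if kind == "send" else []


# Configurations are left abstract (just labels); events are (kind, p, q, M).
execution = [
    "C0", ("send", "p", "q", "M"),
    "C1", ("receive", "p", "q", "M"),
    "C2",
]
events = trace(execution)
print(events)
```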
13. Broadcast definitions
- Broadcasting: one processor, called the source, has to inform all other processors about its initial value
- Basic Broadcast (BB): no order of messages is required
- Source-Ordered Broadcast (SOB): if a processor sends message M1 before message M2, then every processor receives them in the same order (M1 before M2)
- Total-Ordered Broadcast (TOB): all orders of received messages are the same; in other words, if a processor receives message M1 before message M2, then every processor receives them in the same order
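The two ordering conditions can be stated as checks over receive logs. This is a sketch under the assumption that each message is unique per source; the log format and the function names `is_source_ordered` and `is_totally_ordered` are hypothetical.

```python
from itertools import combinations


def is_source_ordered(sent, received):
    """sent: {source: [messages in send order]}
    received: {processor: [(source, message) in receive order]}"""
    for log in received.values():
        for source, order in sent.items():
            positions = [order.index(m) for s, m in log if s == source]
            if positions != sorted(positions):
                return False
    return True


def is_totally_ordered(received):
    """Any two processors receive their common messages in the same order."""
    for log_a, log_b in combinations(received.values(), 2):
        common = set(log_a) & set(log_b)
        if [m for m in log_a if m in common] != [m for m in log_b if m in common]:
            return False
    return True


sent = {"p": ["M1", "M2"], "q": ["N1"]}
received = {
    "a": [("p", "M1"), ("q", "N1"), ("p", "M2")],
    "b": [("q", "N1"), ("p", "M1"), ("p", "M2")],
}
print(is_source_ordered(sent, received), is_totally_ordered(received))
```

In this example both processors see M1 before M2, so the logs are source-ordered, but they disagree on the relative order of M1 and N1, so they are not total-ordered.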
14. SO Broadcast on top of BB
- Assume we have an implementation of Basic Broadcast in the system. Using it we can implement Source-Ordered Broadcast as follows
- Each processor initializes tag t = 1
- sendSOB(p,M): processor p
  - Enables event sendBB(p,M,t) in BB-implementation
  - Increases tag t by 1
- receiveSOB(p,q,M)
  - If receiveBB(p,q,M,t) happens, then processor q enables receiveSOB(p,q,M) if there is no pending message M' from p with a smaller tag t' < t (i.e., every message from p with a smaller tag has already been received); otherwise it puts (M,t) into pending until there is no pending message from p with a smaller tag t' < t
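The tagging scheme above can be sketched as follows. The class names `SobSender` and `SobReceiver` and the callback `send_bb` are illustrative assumptions standing in for the BB implementation; "no pending message with a smaller tag" is realized here by tracking, per source, the smallest tag not yet delivered.

```python
class SobSender:
    """Sender side of processor p: tags messages 1, 2, 3, ... before handing them to BB."""

    def __init__(self, pid, send_bb):
        self.pid = pid
        self.tag = 1
        self.send_bb = send_bb        # hypothetical hook into the BB layer

    def send_sob(self, message):
        self.send_bb(self.pid, message, self.tag)
        self.tag += 1


class SobReceiver:
    """Receiver side of processor q: releases messages from each source in tag order."""

    def __init__(self):
        self.next_tag = {}            # source -> smallest tag not yet delivered
        self.pending = {}             # source -> {tag: message}
        self.delivered = []           # (source, message) in receiveSOB order

    def receive_bb(self, p, message, tag):
        self.pending.setdefault(p, {})[tag] = message
        expected = self.next_tag.get(p, 1)
        # Release every message whose predecessors have all been delivered.
        while expected in self.pending[p]:
            self.delivered.append((p, self.pending[p].pop(expected)))
            expected += 1
        self.next_tag[p] = expected


# BB gives no ordering guarantee, so deliver in reverse order as a worst case.
outbox = []
sender = SobSender("p", lambda p, m, t: outbox.append((p, m, t)))
for m in ["M1", "M2", "M3"]:
    sender.send_sob(m)

receiver = SobReceiver()
for p, m, t in reversed(outbox):
    receiver.receive_bb(p, m, t)
print(receiver.delivered)
```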
15. Analysis of Source Broadcast
- Each message sent is eventually received: it is received using BB, and since all messages with smaller tags must also be received eventually, they will all be enabled from pending
- Messages are source-ordered: a message with a bigger tag cannot be enabled before a message with a smaller one
16. TO Broadcast on top of SOB
- Assume we have an implementation of Source-Ordered Broadcast in the system. Using it we can implement Total-Ordered Broadcast as follows
- Each processor initializes tags Tq = 1 for every q
- sendTOB(p,M): processor p
  - Enables event sendSOB(p,M,Tp) in SOB-implementation
  - Increases tag Tp by 1
17. TO Broadcast on top of SOB cont.
- If receiveSOB(p,q,M,t) happens in processor q then
  - Processor q adds triple (p,M,t) to pending
  - Tp = t
  - If t > Tq then
    - Tq = t
    - Enable sendSOB(q,update,Tq)
- If receiveSOB(p,q,update,t) happens in processor q then
  - Tp = t
- If triple (p,M,t) is pending in processor q, (t,p) is the smallest possible (lexicographically), and t ≤ Tp' for every p', then processor q
  - Enables event receiveTOB(p,q,M)
  - Removes triple (p,M,t) from pending
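The tag-based ordering can be tried out in a small simulation. This is a sketch under several assumptions that go beyond the slides and are not a line-by-line transcription: tags start at 0 and a sender increments its tag before sending (so an update carrying tag t is never followed by a message with the same tag from that processor), a sender puts its own message into its own pending set, and SOB is modeled by one FIFO queue per (source, destination) pair. The names `Node` and `run` are illustrative.

```python
import random
from collections import deque


class Node:
    def __init__(self, pid, n, net):
        self.pid, self.n, self.net = pid, n, net
        self.ts = [0] * n        # ts[k]: latest tag heard from k (own tag at ts[pid])
        self.pending = []        # triples (tag, sender, payload)
        self.delivered = []      # (sender, payload) in receiveTOB order

    def _send_sob(self, kind, payload, tag):
        for dst in range(self.n):
            if dst != self.pid:
                self.net[(self.pid, dst)].append((kind, payload, tag))

    def send_tob(self, payload):
        self.ts[self.pid] += 1                       # assumption: increment first
        tag = self.ts[self.pid]
        self.pending.append((tag, self.pid, payload))
        self._send_sob("msg", payload, tag)
        self._try_deliver()

    def receive_sob(self, src, kind, payload, tag):
        self.ts[src] = tag
        if kind == "msg":
            self.pending.append((tag, src, payload))
            if tag > self.ts[self.pid]:
                self.ts[self.pid] = tag
                self._send_sob("update", None, tag)
        self._try_deliver()

    def _try_deliver(self):
        # Deliver the smallest (tag, sender) once every processor's tag has reached it.
        while self.pending:
            tag, src, payload = min(self.pending, key=lambda e: (e[0], e[1]))
            if tag > min(self.ts):
                break
            self.pending.remove((tag, src, payload))
            self.delivered.append((src, payload))


def run(n, sends, seed):
    """Interleave the given sends with message deliveries in a seeded random order."""
    rng = random.Random(seed)
    net = {(s, d): deque() for s in range(n) for d in range(n) if s != d}
    nodes = [Node(i, n, net) for i in range(n)]
    todo = deque(sends)
    while True:
        ready = [link for link, queue in net.items() if queue]
        if todo and (not ready or rng.random() < 0.3):
            pid, payload = todo.popleft()
            nodes[pid].send_tob(payload)
        elif ready:
            src, dst = rng.choice(ready)
            kind, payload, tag = net[(src, dst)].popleft()
            nodes[dst].receive_sob(src, kind, payload, tag)
        else:
            return [node.delivered for node in nodes]


orders = run(3, [(0, "a"), (1, "b"), (2, "c"), (0, "d"), (1, "e")], seed=1)
print(orders[0] == orders[1] == orders[2])
```

Running `run` with different seeds changes the interleaving, which gives a quick way to look for executions in which two processors would disagree on the delivery order.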
18. Total-Ordered Broadcast - Analysis
- Each message sent is eventually received
  - induction on the length of the execution
- Messages are total-ordered
  - proof by considering cases
    - If two messages are pending at the same time, then the one with the smaller tag (or the smaller ID if tags are equal) is accepted first
    - If a message with a smaller pair (tag,id) is put into pending after the later one has been accepted, then we get a contradiction
19. Conclusions
- Distributed models
  - Message-passing
  - Synchronous/asynchronous
  - Fault-tolerance
- Distributed problems and algorithms
  - Consensus in synchronous crash setting
  - Ordered Broadcast in asynchronous setting
20. Exercises
- Prove in detail that the TOB algorithm is correct (each message is delivered and the total-order condition is satisfied)