Title: Distributed System
1. Distributed System
- Collection of communicating/interacting entities
- Communication
  - through shared memory
  - by exchanging messages
- Entity
  - node, site, processor, ...
  - autonomous (has local storage and processing capabilities)
  - can communicate with other entities
  - has a local alarm clock
2. Entity Behavior
- Each entity can be in only a finite set of states.
  - more precisely, each entity has a status register, which can take only a finite number of values; the remaining local storage need not be constant
  - the status register loosely corresponds to an IC
- Each entity is strictly reactive: it performs actions only as a response to events.
- The behavior B(x) (protocol, program, ...) of entity x is a set of rules of the form
  - State × Event → Action
3. Events
- Event
  - receiving a message
  - ringing of the local alarm clock
  - spontaneous (external) impulse
    - only one per entity per execution
    - corresponds to a wake-up: the user starting the program, initiating activity
4. Actions
- Action
  - a finite, indivisible sequence of operations
    - internal/local computation
    - changing local state and storage
    - arming the alarm clock
    - sending messages
5. Entity Behavior II
- The behavior of each entity is
  - deterministic (unambiguous)
  - fully specified
    - for each (status, event) pair there is a corresponding action defined (see the sketch below)
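To make the (status, event) → action model concrete, here is a minimal Python sketch (the statuses, events and names are illustrative only, not part of the lecture): a behavior is just a table with exactly one action per (status, event) pair.

def wake_up(entity):
    entity["outbox"].append("I")       # e.g. send some information to the neighbours
    entity["status"] = "DONE"

def nil(entity):
    pass                               # empty rule: do nothing

RULES = {                              # the behavior B(x): one action per (status, event) pair
    ("SLEEPING", "wake-up"): wake_up,
    ("DONE", "wake-up"): nil,
}

def handle(entity, event):
    RULES[(entity["status"], event)](entity)   # deterministic and fully specified

x = {"status": "SLEEPING", "outbox": []}
handle(x, "wake-up")
assert x["status"] == "DONE" and x["outbox"] == ["I"]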
6. System Behavior
- Collection of the behaviors of its entities
  - B = { B(x) : x ∈ V }
- A system behavior is homogeneous if all the entities have the same behavior
  - ∀ x, y ∈ V: B(x) = B(y)
- Observation: every system can be made homogeneous
  - big switch at the beginning to choose the appropriate behavior (sketched below)
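A minimal sketch (hypothetical roles and rules, not from the slides) of the "big switch" observation: every entity runs literally the same program, and the initial branch on its role selects the appropriate sub-behavior.

def handle(role, status, event):
    # One common behavior for every entity; the initial branch on role (part of the
    # entity's local input, not of the behavior itself) is the "big switch".
    if role == "INITIATOR":
        return "send I to all neighbours" if event == "wake-up" else "nil"
    return "forward I" if (status == "SLEEPING" and event == "receive I") else "nil"

print(handle("INITIATOR", "SLEEPING", "wake-up"))   # send I to all neighbours
print(handle("OTHER", "SLEEPING", "receive I"))     # forward I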
7. Termination
- Implicit
  - there are no more messages in transit; all entities have finished their actions and have either terminated or are waiting for messages
- Explicit
  - each entity has terminated and would not process any incoming messages
- Discussion
  - it is easier/cheaper to write a protocol with implicit termination; however, in practice we need explicit termination, as the implicitly terminated entities still consume computing resources
  - implicit termination is a global property, while explicit termination is local (see the check sketched below)
  - there are protocols for converting implicit termination to explicit one; we will discuss them later
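A small sketch (assumed simulator setting, not from the slides) of why implicit termination is global: deciding it requires looking at every link and every entity at once, which no single entity can do locally.

def implicitly_terminated(statuses, in_transit):
    # statuses: entity -> current status; in_transit: messages not yet delivered
    return len(in_transit) == 0 and all(
        s in ("TERMINATED", "WAITING") for s in statuses.values()
    )   # a property of the whole system, not checkable by any single entity

print(implicitly_terminated({"x": "TERMINATED", "y": "WAITING"}, []))      # True
print(implicitly_terminated({"x": "TERMINATED", "y": "WAITING"}, ["I"]))   # False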
8. Communication Topology
- Ni(x)
  - the in-neighbours of x
  - the entities which can directly send a message to x
- No(x)
  - the out-neighbours of x
  - the entities to which x can directly send a message
- Ni(x) and No(x) define a directed graph G = (V, E)
  - the network, or communication topology
  - V - the set of entities/nodes/vertices
  - E - the directional communication links/edges
9. Communication Topology II
- We usually assume
  - ∀ x: Ni(x) = No(x), denoted N(x)
    - bidirectional links
    - yields an undirected communication graph
- Implicit assumption
  - N(x) does not change with time
10. Basic Axioms
- Each entity can distinguish among its in-(out-)neighbours
- In the absence of failures, communication delays are finite
11. Restrictions
- Communication restrictions
  - FIFO links (no message overtaking on a link)
  - bidirectional links
- Reliability restrictions
  - reliable communication (every message will be delivered uncorrupted in finite time)
  - detectable link/node faults
  - restricted types of faults (crash, omission, corruption)
12. Restrictions II
- Timing restrictions
  - there is a known upper bound on the message delivery delay
  - each message delivery takes 1 time unit
  - all local clocks are synchronized
- Topological restrictions
  - the network is connected
13. Structural Knowledge
- knowledge available to all entities
  - can be seen as a restriction
- Topological knowledge
  - knowing the number of entities
  - knowing the communication topology (e.g. G is a mesh, hypercube, ...)
  - having a (labelled) map of G
- Input data knowledge
  - all input data are distinct
  - there are k distinct input values
- System knowledge
  - there is a unique entity in status "leader"
  - there is a unique initiator
14. Complexity Measures
- Message complexity
  - worst-case number of exchanged messages
  - there are usually many possible executions, although the protocol is deterministic, because of unpredictable message delays/spontaneous wake-ups
  - our most important complexity measure
- Bit complexity
  - worst-case number of exchanged bits
  - measures communication overhead more precisely
- The following hybrid complexity is often used
  - number of messages of size O(log n), where n is the number of entities
15. Time Complexity
- Problem
  - there is no universally good notion of time in asynchronous systems
  - message delays (enqueuing, transmission, dequeuing) can be arbitrarily large
  - in order to get a fully realistic estimate of the real-life time complexity, one has to work with the real distributions of link delays
  - unfortunately, this is way too unwieldy; moreover, these distributions are usually not known
- Note
  - we assume that the time for local computation is negligible with respect to communication delays
16. Time Complexity II
- Bounded delay time complexity
  - maximum time taken by a computation, assuming each message is delivered in at most one time unit (message delay is a positive real number and can be arbitrarily small, but at most 1)
  - gets rid of arbitrarily large message delays by normalizing everything with respect to the longest delay encountered
- Causal time complexity
  - the longest causal chain of messages encountered during a computation
- Ideal time complexity
  - maximum time taken by a computation, assuming each message is delivered in exactly one time unit
17. Time Complexity - Discussion
- bounded delay and causal time complexity consider all possible executions, while ideal time complexity considers only a few, ideal ones
- bounded delay is more appropriate/realistic when the delays are reasonably distributed around the average; causal time complexity is usually more appropriate when there are few long delays
18. Time-Event Diagram
- captures a specific execution
(figure: time-event diagram of the entities x, y, z, w plotted against time)
19. Example: Broadcasting
- Problem
  - each node must receive information I
- Restrictions
  - single initiator with the information I
  - connected network
  - no failures
  - bidirectional links
- Main idea
  - upon receiving the information, send it to all your neighbours
20. Example: Broadcasting II
- States
  - initiator, sleeping, done
  - done is a terminal state
- Protocol for node x
  - initiator × wake-up → send I to N(x), status := done
  - sleeping × wake-up → (nil)
  - sleeping × I → send I to N(x) \ sender, status := done
  - done × anything → (nil)
- Convention we will use
  - group rules by status
  - omit the empty rules
21. Example: Broadcasting III
S = { INITIATOR, SLEEPING, DONE }   // states
I = { INITIATOR, SLEEPING }         // initial states
T = { DONE }                        // terminal states

Algorithm for node x:

INITIATOR
  upon wake-up:
    send I to N(x)
    become(DONE)

SLEEPING
  upon receiving I from sender:
    send I to N(x) \ sender
    become(DONE)
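The rules above translate almost line by line into a small simulation. The Python sketch below (the function name and the test graph are illustrative, not from the lecture) delivers the messages in transit in an arbitrary order and counts them.

from collections import deque

def flood(neighbours, initiator):
    # neighbours: dict node -> set of adjacent nodes (bidirectional links)
    status = {v: "SLEEPING" for v in neighbours}
    status[initiator] = "INITIATOR"
    messages = 0

    # INITIATOR, upon wake-up: send I to N(x), become(DONE)
    in_transit = deque((initiator, y) for y in neighbours[initiator])
    messages += len(neighbours[initiator])
    status[initiator] = "DONE"

    while in_transit:
        sender, x = in_transit.popleft()            # message I arrives at x from sender
        if status[x] == "SLEEPING":                 # SLEEPING, upon receiving I
            targets = neighbours[x] - {sender}      # send I to N(x) \ sender
            in_transit.extend((x, y) for y in targets)
            messages += len(targets)
            status[x] = "DONE"                      # become(DONE)
        # DONE: incoming messages are ignored

    return status, messages

# n = 4 nodes, m = 5 links: expect 2m - (n-1) = 7 messages
adj = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 4}, 4: {1, 3}}
status, msgs = flood(adj, 1)
assert all(s == "DONE" for s in status.values()) and msgs == 7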
22. Example: Broadcasting Complexity
- Message complexity Cm
  - there are at most 2 messages on each link
  - a link carrying I to a node for the first time carries only 1 message
  - let m be the number of links and n the number of nodes
  - total complexity is 2m - (n-1)
- Bit complexity Cb = Cm · |I|
  - each message carries only I
- Time complexity T
  - bounded delay and ideal complexity
    - max d(x, y) ≤ n-1 (the eccentricity of the initiator x in G)
  - causal time complexity
    - n-1
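As a quick check of the message complexity formula (example not on the slides): on a ring of 5 nodes, m = 5, so Cm = 2·5 - (5-1) = 6; the 4 links over which I arrives at a node for the first time carry 1 message each, and the single remaining link carries 2 (one in each direction).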
23. Lower Bound on Broadcasting Complexity
- Can we use fewer than m messages?
  - NO! (if we know nothing about the topology)
- Proof by contradiction
  - assume there is an edge e = (x, y) in G which was not used
  - construct a graph G' such that the algorithm will fail in G'
  - since e was never used, no entity can distinguish the execution in G' from the one in G, so the new node z never receives I
(figure: G = (V, E) containing the unused edge e = (x, y), and G' = (V ∪ {z}, (E - e) ∪ {(x,z), (y,z)}), where z is a new node replacing e)
24. Flooding
- The above broadcasting algorithm is called flooding
- Broadcasting can be performed more efficiently
  - if the topology is known and nice
  - if a spanning tree is given
    - flood the spanning tree; cost O(n) (see the sketch below)
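A minimal sketch (assumed representation: a spanning tree rooted at the initiator, each node knowing its tree children) of broadcasting over a given spanning tree with exactly n - 1 messages.

def tree_broadcast(children, root):
    # children: dict node -> list of its children in a spanning tree rooted at the initiator
    messages, stack = 0, [root]
    while stack:
        x = stack.pop()
        for c in children.get(x, []):
            messages += 1            # x forwards I over the tree link to c
            stack.append(c)
    return messages

# 5 nodes -> exactly n - 1 = 4 messages
assert tree_broadcast({1: [2, 3], 3: [4, 5]}, 1) == 4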
25. Broadcasting in oriented 2D Mesh
S = { INITIATOR, SLEEPING, DONE }
I = { INITIATOR, SLEEPING }
T = { DONE }
M = { east, west, north, south }    // messages

Algorithm for node x:

INITIATOR
  upon wake-up:
    if you have an east neighbour, send east to it
    if you have a west neighbour, send west to it
    if you have a north neighbour, send north to it
    if you have a south neighbour, send south to it
    become(DONE)
26. Broadcasting in oriented 2D Mesh II
SLEEPING
  upon receiving message east:
    if you have an east neighbour, send east to it
    become(DONE)
  upon receiving message west:
    if you have a west neighbour, send west to it
    become(DONE)
  upon receiving message north:
    if you have an east neighbour, send east to it
    if you have a west neighbour, send west to it
    if you have a north neighbour, send north to it
    become(DONE)
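The mesh protocol can be checked with the following Python sketch (the grid representation and names are my own; the rule for a south message is not shown above and is assumed symmetric to the north rule). It confirms that every node of an h × w mesh is reached with exactly h·w - 1 messages.

def mesh_broadcast(h, w, initiator):
    # Nodes are (row, col) with 0 <= row < h, 0 <= col < w; north = row + 1, east = col + 1.
    reached = {initiator}
    in_transit = []                   # (destination, direction the message travels)
    messages = 0

    def send(node, direction):
        nonlocal messages
        dr, dc = {"north": (1, 0), "south": (-1, 0), "east": (0, 1), "west": (0, -1)}[direction]
        t = (node[0] + dr, node[1] + dc)
        if 0 <= t[0] < h and 0 <= t[1] < w:           # only if such a neighbour exists
            messages += 1
            in_transit.append((t, direction))

    for d in ("east", "west", "north", "south"):       # INITIATOR: at most one message per direction
        send(initiator, d)

    while in_transit:
        node, d = in_transit.pop()
        reached.add(node)
        if d in ("east", "west"):
            send(node, d)                              # east/west messages keep their direction
        else:                                          # north (and, by assumed symmetry, south)
            send(node, "east")
            send(node, "west")
            send(node, d)

    return reached, messages

reached, msgs = mesh_broadcast(4, 5, (1, 2))
assert len(reached) == 4 * 5 and msgs == 4 * 5 - 1     # all 20 nodes reached with n - 1 messages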
27. Broadcasting in oriented Hypercube
INITIATOR
  spontaneously:
    send I to all neighbours

SLEEPING
  upon receiving message I over link l:
    send I over links 1, 2, ..., l-1

Messages: n-1    Time: log n
(figure: the 3-dimensional hypercube with nodes labelled 000, 001, ..., 111)
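A sketch (again with assumed names) of the oriented-hypercube broadcast: node labels are d-bit integers and link l flips bit l-1. It checks the two claims above: n - 1 messages and log n (= d) time units.

def hypercube_broadcast(d):
    initiator = 0
    received = {initiator}
    messages = time = 0
    frontier = [(initiator, d + 1)]    # (node, link it received over); initiator may use links 1..d
    while frontier:
        sent = []
        for node, l in frontier:
            for link in range(1, l):                   # forward only over links 1 .. l-1
                sent.append((node ^ (1 << (link - 1)), link))
        if sent:                                       # one synchronous step = one time unit
            time += 1
            messages += len(sent)
            received.update(v for v, _ in sent)
        frontier = sent
    return len(received), messages, time

n = 2 ** 3
assert hypercube_broadcast(3) == (n, n - 1, 3)    # all 8 nodes, n - 1 = 7 messages, log2(n) = 3 time units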
28. Broadcasting in unoriented compact chordal rings
- Unoriented compact k-chordal ring
  - node i is connected to i+1, i+2, ..., i+k
  - that means also to i-1, i-2, ..., i-k
  - unoriented: a node does not know which port leads where
- INITIATOR or LEADER
  - send "visited" to all your neighbours
  - wait for "old" and "new" replies, until you get a reply from each neighbour
  - send "you are leader" to an arbitrary node from which "new" was received
    - if there is no such node, initiate termination
    - else become(ACTIVE)
29. Broadcasting in unoriented compact chordal rings
SLEEPING
  upon receiving message "visited" over link h:
    send reply "new" over h
    become(ACTIVE)
  upon receiving message "you are leader" over link h:
    become(LEADER)

ACTIVE
  upon receiving message "visited" over link h:
    send reply "old" over h
  upon receiving message "you are leader" over link h:
    become(LEADER)
30. Broadcasting in unoriented compact chordal rings
- in two rounds (moves of the leader) the leader moves at least k positions (and at most 2k), therefore there are at most 2n/k rounds
- the message cost of one round is 4k messages, so the overall message complexity is 8n (see the simulation sketch below)
- termination is implicit; to make it explicit, we can
  - build a spanning tree: an edge leads to a son if a "new" reply was received over this edge
  - broadcast the termination ("initiate termination" statement) message by flooding this spanning tree
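A compact simulation sketch of the leader-walk broadcast above (naming and the concrete instance are assumptions; nodes are 0..n-1 and leadership is passed to an arbitrary fresh node). It checks that every node gets reached and that the message count stays around 8n.

def chordal_ring_broadcast(n, k):
    visited = [False] * n
    visited[0] = True                                # node 0: initiator / first leader
    leader, messages = 0, 0
    while True:
        neighbours = sorted({(leader + d) % n for d in range(-k, k + 1) if d != 0})
        messages += 2 * len(neighbours)              # "visited" out + "old"/"new" replies back
        fresh = [v for v in neighbours if not visited[v]]
        for v in neighbours:
            visited[v] = True                        # every reached node is ACTIVE now
        if not fresh:
            return sum(visited), messages            # no "new" reply: initiate termination
        leader = fresh[0]                            # "you are leader" over a "new" edge
        messages += 1

n, k = 60, 5
covered, msgs = chordal_ring_broadcast(n, k)
assert covered == n and msgs <= 8 * n + 4 * k        # everyone reached, roughly 8n messages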
31. References and other results
- All mentioned broadcasting algorithms are folklore
- Some related, more difficult results
  - broadcasting can also be performed in an unoriented 2D mesh/torus using fewer than m (cca 2n) messages
    - 2n + o(n) algorithm and 25n/24 - o(n) lower bound
      - K. Diks, E. Kranakis, and A. Pelc. Broadcasting in unlabeled tori. Parallel Processing Letters, 8:177-188, 1998.
    - 10n/7 + o(n) algorithm and 8n/7 - o(n) lower bound
      - Stefan Dobrev, Peter Ruzicka. Broadcasting on Anonymous Unoriented Tori. In Proc. of WG 1998, LNCS 1517, Springer-Verlag, 50-62, 1998.
    - the upper and lower bounds still do not match
32. References and other results
- Some related, more difficult results
  - broadcasting can also be efficiently performed in unoriented hypercubes
    - Krzysztof Diks, Stefan Dobrev, Evangelos Kranakis, Andrzej Pelc, Peter Ruzicka. Broadcasting in Unlabeled Hypercubes with a Linear Number of Messages. Information Processing Letters.
  - the same can be achieved even with a linear number of communicated bits, and in optimal time (exactly log n)
    - S. Dobrev, P. Ruzicka, and G. Tel. Time and bit optimal broadcasting in anonymous unoriented hypercubes. In Proc. of the 5th International Colloquium on Structural Information and Communication Complexity, Carleton Press, pages 173-187, Amalfi, 1998.