CS542 Topics in Distributed Systems

1
CS542 Topics in Distributed Systems
Diganta Goswami
2
Algorithms to Find Global States

Why?
(Distributed) garbage collection: think of multiple
processes sharing and referencing objects
(Distributed) deadlock detection, termination:
think of database transactions
Global states are most useful for detecting stable
predicates: once true, always stays true (unless
you do something about it)
e.g., once a deadlock, always stays a deadlock
What?
Global state = states of all processes + states of
all communication channels
Capture the instantaneous state of each process
And the instantaneous state of each communication
channel, i.e., messages in transit on the
channels
How?
We'll see in this lecture!

3
Obvious First Solution

Synchronize clocks of all processes
Ask all processes to record their states at known
time t
Problems?
Time synchronization is possible only approximately
(but distributed banking applications cannot
tolerate approximations)
Does not record the state of messages in the
channels
Again, synchronization is not required: causality
is enough!

4
Two Processes and Their Initial States
5
Execution of the Processes
6
Cuts

Cut = time frontier, one at each process
$f \in$ cut $C$ iff $f$ is to the left of the frontier $C$

7
Consistent Cuts

$f \in$ cut $C$ iff $f$ is to the left of the frontier $C$
A cut $C$ is consistent if and only if
$\forall e \in C$ (if $f \rightarrow e$ then $f \in C$)
A global state S is consistent if and only if it
corresponds to a consistent cut
A consistent cut = a global snapshot

Here $\rightarrow$ is Lamport's happens-before relation
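
The consistency condition can be checked mechanically. The following is a minimal sketch, assuming events are plain strings and happens-before is supplied as a set of direct (f, e) pairs; the event names are hypothetical and not taken from the slides. Checking direct pairs is enough, because a cut that is closed under each direct edge is also closed under their transitive closure.

```python
def is_consistent_cut(cut, happens_before):
    """True if every f with f -> e and e in the cut is itself in the cut."""
    return all(f in cut for (f, e) in happens_before if e in cut)

# Hypothetical run: p1 sends a message at e1_1, p2 receives it at e2_1.
happens_before = {
    ("e1_0", "e1_1"),  # program order at p1
    ("e2_0", "e2_1"),  # program order at p2
    ("e1_1", "e2_1"),  # send happens before receive
}

consistent = {"e1_0", "e1_1", "e2_0", "e2_1"}
inconsistent = {"e1_0", "e2_0", "e2_1"}  # includes the receive but not the send

print(is_consistent_cut(consistent, happens_before))
print(is_consistent_cut(inconsistent, happens_before))
```

The second cut is rejected because it contains a receive event whose matching send lies to the right of the frontier.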
8
The Snapshot Algorithm

Problem: Record a set of process and channel
states such that the combination is a global
snapshot/consistent cut.
System Model
There is a uni-directional communication channel
between each ordered process pair ($P_j \rightarrow P_i$ and
$P_i \rightarrow P_j$)
Communication channels are FIFO-ordered
No failure, all messages arrive intact, exactly
once
Any process may initiate the snapshot (by sending
a special message called Marker)
Snapshot does not require application to stop
sending messages, does not interfere with normal
execution
Each process is able to record its state and the
state of its incoming channels (no central
collection)

9
The Snapshot Algorithm (2)

1. Marker sending rule for initiator process P0
   After P0 has recorded its own state
      for each outgoing channel C, send a marker
      message on C
2. Marker receiving rule for a process Pk
   on receipt of a marker over channel C
      if Pk has not yet received a marker
         record Pk's own state
         record the state of C as empty
         for each outgoing channel C', send a marker on C'
         turn on recording of messages over other incoming
         channels
      else
         record the state of C as all the messages
         received over C since Pk saved its own state
         stop recording state of C
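
The two rules can be tried out on a toy system. This is a minimal single-threaded sketch, not the authors' implementation. It assumes two processes whose state is an integer balance, application messages that are transfer amounts, and one FIFO queue per ordered process pair. The names `Process`, `send`, `deliver`, and `MARKER` are hypothetical. The sketch also assumes the initiator records all of its incoming channels from the moment it records its own state, which the rules above leave implicit.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical sentinel standing in for the marker message

class Process:
    def __init__(self, pid, peers, state):
        self.pid, self.peers, self.state = pid, peers, state
        self.recorded_state = None  # own state captured by the snapshot
        self.channel_state = {}     # sender id -> messages recorded on that channel
        self.recording = set()      # incoming channels still being recorded

    def _send_markers(self, net):
        for q in self.peers:
            net[(self.pid, q)].append(MARKER)

    def start_snapshot(self, net):
        # Initiator: record own state, send markers, record every incoming channel.
        self.recorded_state = self.state
        self._send_markers(net)
        for q in self.peers:
            self.channel_state[q] = []
            self.recording.add(q)

    def receive(self, src, msg, net):
        if msg != MARKER:
            self.state += msg                        # application message: a transfer
            if src in self.recording:
                self.channel_state[src].append(msg)  # was in transit at snapshot time
        elif self.recorded_state is None:            # first marker seen
            self.recorded_state = self.state
            self.channel_state[src] = []             # channel src -> self is empty
            self._send_markers(net)
            for q in self.peers:
                if q != src:
                    self.channel_state[q] = []
                    self.recording.add(q)
        else:                                        # later marker: stop recording
            self.recording.discard(src)

def send(net, procs, src, dst, amount):
    procs[src].state -= amount
    net[(src, dst)].append(amount)

def deliver(net, procs, src, dst):
    procs[dst].receive(src, net[(src, dst)].popleft(), net)

procs = {1: Process(1, [2], 10), 2: Process(2, [1], 20)}
net = {(1, 2): deque(), (2, 1): deque()}  # one FIFO queue per ordered pair

send(net, procs, 1, 2, 5)     # p1 -> p2: 5
send(net, procs, 2, 1, 3)     # p2 -> p1: 3, still in transit below
procs[1].start_snapshot(net)  # p1 initiates
deliver(net, procs, 1, 2)     # p2 receives 5
deliver(net, procs, 1, 2)     # p2 receives the marker
deliver(net, procs, 2, 1)     # p1 receives 3 while recording
deliver(net, procs, 2, 1)     # p1 receives the marker

snapshot_total = sum(
    p.recorded_state + sum(sum(msgs) for msgs in p.channel_state.values())
    for p in procs.values()
)
```

In this run the recorded process states plus the recorded in-transit message add up to the initial total of 30, which is the kind of invariant a consistent snapshot preserves.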

10
Chandy and Lamport's Snapshot Algorithm
Marker receiving rule for process pi
   On pi's receipt of a marker message over channel c:
      if (pi has not yet recorded its state) it
         records its process state now;
         records the state of c as the empty set;
         turns on recording of messages arriving over
         other incoming channels;
      else
         pi records the state of c as the set of messages
         it has received over c since it saved its state.
      end if
Marker sending rule for process pi
   After pi has recorded its state, for each
   outgoing channel c:
      pi sends one marker message over c (before it
      sends any other message over c).
11
Snapshot Example

[Figure: space-time diagram of processes P1, P2, and P3, with events e10, e13, e20, e23, e30 and labels a, b]
12
Provable Assertion: the Chandy-Lamport algorithm
determines a consistent cut

Let $e_i$ and $e_j$ be events occurring at $p_i$ and $p_j$,
respectively, such that $e_i \rightarrow e_j$
The snapshot algorithm ensures that
if $e_j$ is in the cut then $e_i$ is also in the cut.
That is, if $e_j \rightarrow \langle p_j \text{ records its state} \rangle$, then it must be
true that $e_i \rightarrow \langle p_i \text{ records its state} \rangle$.
By contradiction, suppose $\langle p_i \text{ records its state} \rangle \rightarrow e_i$
Consider the path of app messages (through other
processes) that go from $e_i$ to $e_j$
Due to FIFO ordering, markers on each link in the
above path precede regular app messages
Thus, since $\langle p_i \text{ records its state} \rangle \rightarrow e_i$, it must
be true that $p_j$ received a marker (and so recorded
its state) before $e_j$
Thus $e_j$ is not in the cut $\Rightarrow$ contradiction

13
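Formally Speaking: Process Histories

For a process $P_i$, where events $e_i^0, e_i^1, \ldots$
occur
history($P_i$) $= h_i = \langle e_i^0, e_i^1, \ldots \rangle$
prefix history($P_i^k$) $= h_i^k = \langle e_i^0, e_i^1, \ldots, e_i^k \rangle$
$S_i^k$: $P_i$'s state immediately after the $k$th event
For a set of processes $P_1, \ldots, P_i, \ldots$
global history $H = \bigcup_i (h_i)$
global state $S = \bigcup_i (S_i^{k_i}) \cup$ channels
a cut $C \subseteq H = h_1^{c_1} \cup h_2^{c_2} \cup \ldots \cup h_n^{c_n}$
the frontier of $C = \{e_i^{c_i},\ i = 1, 2, \ldots, n\}$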
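
These definitions translate directly into data. Below is a small sketch, assuming each history is a Python list of event names and a cut is described by a dict `c` giving the index $c_i$ of the last included event per process. The process and event names are hypothetical.

```python
# Hypothetical histories h_i for two processes; event names are illustrative only.
histories = {
    "P1": ["e1_0", "e1_1", "e1_2"],
    "P2": ["e2_0", "e2_1"],
}

def cut(histories, c):
    """Union of the prefix histories h_i^{c_i}: events 0..c_i of each process."""
    return {e for p, h in histories.items() for e in h[: c[p] + 1]}

def frontier(histories, c):
    """The last event of each prefix, e_i^{c_i}."""
    return {p: histories[p][c[p]] for p in histories}

c = {"P1": 1, "P2": 0}
print(sorted(cut(histories, c)))
print(frontier(histories, c))
```

Whether such a cut is consistent depends on the happens-before relation between the events, which this sketch does not model.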

14
Global States useful for detecting Global
Predicates

A cut is consistent if and only if it does not
violate causality
A Run is a total ordering of events in H that is
consistent with each $h_i$'s ordering
A Linearization is a run consistent with the
happens-before ($\rightarrow$) relation in H (history of all
events).
Linearizations pass through consistent global
states.
A global state $S_k$ is reachable from global state
$S_i$ if there is a linearization, L, that passes
through $S_i$ and then through $S_k$.
The distributed system evolves as a series of
transitions between global states $S_0, S_1, \ldots$

15
Reachability between states in the snapshot
algorithm
16
Distributed debugging

Examine the problem of recording a system's
global state so that we may make useful
statements about whether a transitory state (as
opposed to a stable state) occurred in an actual
execution
This is what we require, in general, when
debugging a distributed system
e.g., is $|x_i - x_j| < \delta$, where $x_i$ is a variable in
process $P_i$?

17
Distributed debugging

Chandy and Lamport's algorithm collects state in
a distributed fashion
The processes in the system can send the state
they gather to a monitor process for collection
Algorithm [Marzullo and Neiger, 91]: The
observed processes send their states to a process
called a monitor, which assembles globally
consistent states from what it receives
The monitor lies outside the system, observing its
execution

18
Distributed debugging

Goal is to determine cases when a given global
state predicate $\phi$ was definitely True at some
point in the execution we observed, and cases
when it was possibly True
"Possibly" because we may extract a consistent
global state S from an executing system and find
that $\phi(S)$ is True.
No single observation of a consistent global
state allows us to conclude whether a non-stable
predicate ever evaluated to True in the actual
execution

19
Distributed debugging

Possibly $\phi$: There is a consistent global state S
through which a linearization of H passes such
that $\phi(S)$ is True
Definitely $\phi$: For all linearizations L of H,
there is a consistent global state S through
which L passes such that $\phi(S)$ is True

20
Distributed debugging

We now describe
How the process states are collected
How the monitor extracts consistent global states
How the monitor evaluates possibly $\phi$ and
definitely $\phi$ in both asynchronous and synchronous
systems

21
Distributed debugging

The observed processes $p_i$ ($i = 1, 2, \ldots, N$) send
their initial state to the monitor process initially,
and thereafter from time to time, in state
messages
No need to send state except initially and when
it changes
Global state predicate may depend only on certain
parts of the process states; hence need only
send relevant state
Need only send state at times when the predicate
may become True or cease to be True
The monitor process records the state messages
from process $p_i$ in a separate queue $Q_i$, for each
$i = 1, 2, \ldots, N$

22
Distributed debugging

In order that the monitor can distinguish
consistent global states from inconsistent global
states, the observed processes enclose their
vector clock values with their state messages
Each queue Qi is kept ordered in sending order
(can be established by examining the i-th
component of the vector clock)

23
Distributed debugging

Let $S = (s_1, s_2, \ldots, s_N)$ be a global state drawn
from the state messages that the monitor has
received. Let $V(s_i)$ be the vector clock of the
state $s_i$ received from $p_i$
S is a consistent global state iff
$V(s_i)[i] \geq V(s_j)[i]$ for $i, j = 1, 2, \ldots, N$
That is, the number of $p_i$'s events known at $p_j$ when
it sent $s_j$ is no more than the number of events that
have occurred at $p_i$ when it sent $s_i$.
Hence, if one process's state depends upon
another, then the global state also encompasses
the state upon which it depends
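
The monitor's consistency test is a pairwise comparison of vector timestamps. Here is a minimal sketch, assuming each timestamp is a Python list and processes are indexed from 0 rather than 1. The function name and the sample timestamps are hypothetical.

```python
def is_consistent(vclocks):
    """vclocks[i] is V(s_i), the vector timestamp attached to state s_i.

    The global state is consistent iff V(s_i)[i] >= V(s_j)[i] for all i, j.
    """
    n = len(vclocks)
    return all(vclocks[i][i] >= vclocks[j][i] for i in range(n) for j in range(n))

# p2 has seen two events of p1, and p1's chosen state also reflects two events.
print(is_consistent([[2, 0], [2, 1]]))

# p2 has seen two events of p1, but p1's chosen state reflects only one.
print(is_consistent([[1, 0], [2, 1]]))
```

The second combination is rejected because it would include the effect of a message at p2 without including the event at p1 that caused it.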

24
Distributed debugging

The monitor process may establish whether a
given global state is consistent, using the
vector timestamps sent by the observed processes
It can construct a lattice of consistent global
states corresponding to the execution of the
processes; the lattice captures the relation of
reachability between consistent global states
The nodes denote global states, and the edges
denote possible transitions between these states

25
Vector timestamps and variable values
26
The lattice of global states for the execution
in the previous figure
27
Distributed debugging

A linearization traverses the lattice from any
global state to any global state reachable from
it on the next level; that is, in each step some
process experiences one event. For example, $S_{22}$ is
reachable from $S_{20}$, but $S_{22}$ is not reachable from
$S_{30}$.
The lattice shows all linearizations
corresponding to a history
A monitor process can now evaluate possibly $\phi$ and
definitely $\phi$

28
Distributed debugging

To evaluate possibly $\phi$, the monitor process
starts at the initial state and steps through all
consistent states reachable from that point,
evaluating $\phi$ at each stage. It stops when $\phi$
evaluates to True
To evaluate definitely $\phi$, the monitor process
must attempt to find a set of states through
which all linearizations must pass, and at each
of which $\phi$ evaluates to True
Note that the state $S'$, obtained from $S$ by replacing
$s_i$ with $p_i$'s next state $s'_i$, is reachable from $S$ iff
$V(s_j)[j] \geq V(s'_i)[j]$ for $j = 1, 2, \ldots, N$, $j \neq i$
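
The level-by-level evaluation can be sketched as follows. This is an illustrative version under stated assumptions, not the algorithm from the slide figure. It assumes the monitor already holds a complete, finite queue per process, that each queue entry is a hypothetical `(vector_clock, value)` pair, and that the predicate `phi` takes one value per process. A global state is a tuple of queue indices.

```python
def consistent(queues, s):
    """Vector-clock test V(s_i)[i] >= V(s_j)[i] for all i, j (0-based indices)."""
    v = [queues[i][k][0] for i, k in enumerate(s)]
    n = len(s)
    return all(v[i][i] >= v[j][i] for i in range(n) for j in range(n))

def successors(queues, s):
    """Consistent global states exactly one event further along than s."""
    for i in range(len(s)):
        if s[i] + 1 < len(queues[i]):
            t = s[:i] + (s[i] + 1,) + s[i + 1:]
            if consistent(queues, t):
                yield t

def values(queues, s):
    return tuple(queues[i][k][1] for i, k in enumerate(s))

def possibly(queues, phi):
    level = {tuple(0 for _ in queues)}  # initial global state
    while level:
        if any(phi(*values(queues, s)) for s in level):
            return True
        level = {t for s in level for t in successors(queues, s)}
    return False

def definitely(queues, phi):
    start = tuple(0 for _ in queues)
    final = tuple(len(q) - 1 for q in queues)
    # Track only states reachable without phi ever having held.
    level = set() if phi(*values(queues, start)) else {start}
    while level:
        if final in level:
            return False  # some linearization ends without phi holding
        level = {t for s in level for t in successors(queues, s)
                 if not phi(*values(queues, t))}
    return True

# Hypothetical run: p1's event sends a message that p2's event receives.
q1 = [([0, 0], 0), ([1, 0], 1)]
q2 = [([0, 0], 0), ([1, 1], 1)]
x_greater = lambda x, y: x > y

print(possibly([q1, q2], x_greater), definitely([q1, q2], x_greater))
```

Here the message forces p1's event to come first, so every linearization passes through the state with x = 1 and y = 0. If the two events were independent, a linearization could update y first and the predicate would be possibly but not definitely True.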

29
Algorithms to evaluate possibly $\phi$ and
definitely $\phi$
30
Global State Predicates

A global-state predicate is a function from the
set of global states to {true, false}, e.g.,
deadlock, termination
A global state $S_0$ satisfies liveness property P
iff
liveness($P(S_0)$) $\equiv \forall L \in$ linearizations from $S_0$, L
passes through an $S_L$ such that $P(S_L)$ = true
Ex: P(S) = the computation will terminate
A global state $S_0$ satisfies safety property
P iff
safety($P(S_0)$) $\equiv \forall S$ reachable from $S_0$, P(S) =
false
Ex: P(S) = S has a deadlock
Global states are often useful for detecting a stable
global-state predicate: one that, once it
becomes true, remains true in subsequent
global states, e.g., an object O is orphaned, or
deadlock
A stable predicate may be a safety or liveness
predicate
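
The safety and stability definitions can be checked by brute force on a small explicit state graph. This is a minimal sketch, assuming the reachable global states and their transitions are given as a dict from state name to successor list. The graph, the state names, and the `deadlocked` set are hypothetical.

```python
def reachable(graph, s0):
    """All states reachable from s0, including s0 itself."""
    seen, stack = {s0}, [s0]
    while stack:
        for t in graph[stack.pop()]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def safety(graph, s0, p):
    """safety(P(S0)): P is false in every state reachable from S0."""
    return not any(p(s) for s in reachable(graph, s0))

def is_stable(graph, s0, p):
    """P is stable if, once true in a state, it is true in every successor."""
    return all(p(t) for s in reachable(graph, s0) if p(s) for t in graph[s])

# Hypothetical transition graph over four global states.
graph = {"S0": ["S1", "S2"], "S1": ["S3"], "S2": ["S3"], "S3": []}
deadlocked = {"S2", "S3"}
has_deadlock = lambda s: s in deadlocked

print(safety(graph, "S0", has_deadlock))     # deadlock is reachable
print(is_stable(graph, "S0", has_deadlock))  # once deadlocked, stays deadlocked
```

A predicate that holds in S1 but not in its successor S3 would fail the stability check, which is why a single snapshot cannot settle non-stable predicates.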

31
Liveness versus Safety

Can be confusing, but terms are very important
Liveness = guarantee that something good will
happen, eventually
Guarantee of termination is a liveness property
Guarantee that at least one of the athletes in
the 100m final will win gold is liveness
A criminal will eventually be jailed
Completeness in failure detectors
Safety = guarantee that something bad will never
happen
Deadlock avoidance algorithms provide safety
A peace treaty between two nations provides
safety
An innocent person will never be jailed
Accuracy in failure detectors
Can be difficult to satisfy both liveness and
safety!
