Logical Time and Global States

Provided by: CIT788

1
Logical Time and Global States

Logical Clocks and Logical Time
Lamport's Logical Clock and Vector Clock
Global States and Termination
Chandy and Lamport's Snapshot Algorithm

2
Logical Time and Logical Clock

Logical time vs. absolute time (from UTC)
Obtaining the absolute time requires frequent clock synchronization, which incurs heavy overhead: a short synchronization period => many synchronization messages
For many systems, it is sufficient that a group of related machines in the system agree on the same logical time, e.g., for event ordering
In many (non-real-time) applications, it is not essential that the logical time agrees with the absolute time announced by the broadcast stations
Also, if two processes are unrelated, the times at which their events occur are unrelated
Event ordering
If two events occur at the same process pi, they occur in the order in which pi observes them, e.g., e1 then e2
If a message is sent between two processes, the event of sending the message occurs before the event of receiving it, i.e., send before receive
This gives a partial ordering, NOT a total ordering; it is also called causal ordering

3
Logical Time and Logical Clock

Lamport causal ordering: x ->P y means x happened before y in process P
Happened-before (HB), written -> (precedence relationship)
HB1: If x ->P y for some process P, then x -> y
HB2: For any message m, send(m) -> receive(m)
HB3: If x, y, and z are events with x -> y and y -> z, then x -> z (transitivity)
E.g., a -> c and b -> d
How about the event order between a and e?
Nothing can be said
They can be done in any order or even in parallel
a || e (a and e are concurrent, i.e., the order is only partial)

[Figure: events a to f at processes p1, p2 and p3, with messages m1 and m2]
4
Logical Time and Logical Clock

Logical clocks are used to capture the happened-before ordering
If e happened before e', its logical time is smaller: e -> e' => L(e) < L(e')
Here L(e) is the logical clock value at the time event e occurs
A logical clock need not bear any particular relationship to a physical clock, but it must be monotonic (i.e., increasing)
Each process/computing unit just keeps its own (local) logical clock Lp to timestamp events
The timestamp is monotonic and the initial value may be zero (or any integer)
What is the maximum bound of a logical clock integer?
Does it need to be reset after reaching the maximum value?
Notation: Lp(a) and Lp(b) are the timestamps of events a and b at process p; L(b) is the timestamp of event b at whatever process it occurred
If a happens before b in the same process, Lp(a) < Lp(b)
If a and b are the sending and receiving of the same message, respectively, L(a) < L(b)
For all distinct events a and b, L(a) ≠ L(b)

5
Logical Time and Logical Clock

Logical clock update and transmission rules
Logical clock rule 1
Lp is incremented before each event is issued at p: Lp := Lp + 1
Logical clock rule 2
When p sends a message m, it piggybacks on m the value t = Lp
On receiving (m, t), a process q computes Lq := max(Lq, t) and then applies rule 1 before timestamping the event receive(m)
Why increment the value by 1 instead of a larger number or even a negative number?
All clocks run at the same rate: each event increases the value by one
Could they run at different rates? E.g., one clock advances by 1 before each event but others advance by 10 before each event
Could they run at a variable rate, sometimes faster and sometimes slower?
A process consists of a sequence of events => a sequence of states
After finishing an event, the process enters a state
Each process state is associated with a logical clock value (a timestamp)
e -> e' => L(e) < L(e'), but the converse is not always true: L(e) < L(e') => ???
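
The two clock rules above can be sketched in a few lines of Python. This is only an illustrative sketch: the class name LamportClock and its method names are made up for this example and do not come from the slides.

```python
class LamportClock:
    """Logical clock of a single process (hypothetical helper class)."""

    def __init__(self, start=0):
        self.time = start

    def tick(self):
        # Rule 1: increment before timestamping any event at this process.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the returned value t is piggybacked on the message.
        return self.tick()

    def receive(self, t):
        # Rule 2: take max(local, received), then apply rule 1
        # before timestamping the receive event.
        self.time = max(self.time, t)
        return self.tick()


p, q = LamportClock(), LamportClock()
a = p.tick()       # local event at p
b = p.send()       # p sends message m
c = q.receive(b)   # q receives m
print(a, b, c)     # 1 2 3
```

Because receive() first takes the maximum and then increments, the receive event always gets a larger timestamp than the matching send, which is exactly the property e -> e' => L(e) < L(e').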

6
Lamport's Logical Clocks
Note that there is no increment after receiving the messages. Is it a problem? What are the rates of the logical clocks?
From Tanenbaum

Three processes, each with its own local clock
Note that the logical clocks run at different rates. Between two events, the clock must tick at least once

7
Lamport's Logical Clocks

Lamport's algorithm corrects the clocks
Lamport's algorithm can only achieve a partial ordering
Unrelated events are unordered. What are unrelated events?
What are related events?

From Tanenbaum
8
Lamport's Logical Clocks
From Tanenbaum
The positioning of Lamport's logical clocks in a distributed system (middleware)
9
Total Order and Logical Clock

Partial order to total order (changing a set of partial orders into a total order)
Total order: e1 -> e2 -> ... -> en (all pairs of events are ordered)
Assign a unique timestamp to each event (following Lamport's algorithm in timestamp assignment, e -> e' => TS(e) < TS(e'))
For any two events (even unrelated ones), you can determine their order from the timestamps assigned
Why? E.g., to compare their timestamps for data synchronization
Totally ordered logical clocks (how? By adding the process id)
For pairs of distinct events, we take the process id into account when setting the timestamps
If a is an event occurring at pa with local timestamp Ta, and b is an event occurring at pb with local timestamp Tb
Define the global logical timestamps for those events as (Ta, pa) and (Tb, pb), respectively
(Ta, pa) < (Tb, pb) if and only if either Ta < Tb, or Ta = Tb and pa < pb
Note that this method just serializes the ordering of a set of events. Event a may not really execute before event b in real time
It gives each event a unique timestamp (total order)
Either TS(a) > TS(b) or TS(a) < TS(b); two distinct events are never left unordered
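
The tie-breaking rule can be illustrated with Python tuples, which already compare lexicographically (first component first, then the second). The event names, times and process ids below are invented for the example.

```python
def global_timestamp(local_time, pid):
    # (T, p) pairs compare by T first and by process id only on a tie,
    # which matches the rule "Ta < Tb, or Ta = Tb and pa < pb".
    return (local_time, pid)


# (event name, Lamport time, process id): hypothetical sample data
events = [("a", 3, 2), ("b", 3, 1), ("c", 1, 3)]
ordered = sorted(events, key=lambda ev: global_timestamp(ev[1], ev[2]))
print([name for name, _, _ in ordered])   # ['c', 'b', 'a']
```

Events a and b carry the same Lamport time 3, so only the process id decides that b is placed before a. That placement is a serialization choice and says nothing about which event ran first in real time.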

10
Example: Totally-Ordered Multicasting

Multicast: a message is sent to multiple receivers
Totally-ordered multicast: all multicast messages are delivered in the same order to each receiver
For example, to improve query performance, a bank may place copies of an account database in two different cities, say A and B
A customer in B wants to add $100 to his account, which currently contains $1,000 (update 1)
At about the same time, a bank employee in A initiates an update by which the customer's account is to be increased by 1% interest (update 2)
Both updates should be carried out at both copies (in A and B) of the database (no locking or synchronization)
If update 2 is performed before update 1 in A, the A database records $1,110
If update 1 is performed before update 2 in B, the B database records $1,111
An inconsistency occurs if the two updates are not performed in the same order at the two sites
Solution: use Lamport's algorithm to assign logical times and implement totally-ordered multicast (for update messages), so that the update operations are performed in the same order at each copy. How?

11
Example: Totally-Ordered Multicasting

Updating a replicated database and leaving it in an inconsistent state.

How can we ensure that the two updates are performed in the same order at each site?
12
Example: Totally-Ordered Multicasting

Each update generates two updates, one for each copy of the record
There are four combinations of execution orders at the two sites

Execution order I (copies differ)
City A: Update 2 => 1010, then Update 1 => 1110
City B: Update 1 => 1100, then Update 2 => 1111

Execution order II (copies differ)
City A: Update 1 => 1100, then Update 2 => 1111
City B: Update 2 => 1010, then Update 1 => 1110

Execution order III (copies agree)
City A: Update 2 => 1010, then Update 1 => 1110
City B: Update 2 => 1010, then Update 1 => 1110

Execution order IV (copies agree)
City A: Update 1 => 1100, then Update 2 => 1111
City B: Update 1 => 1100, then Update 2 => 1111
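
The arithmetic behind the four execution orders can be checked with a short script. The function names are illustrative only, and the balance is kept in whole dollars so that integer arithmetic is exact.

```python
from itertools import permutations


def deposit(balance):
    # Update 1: add 100 to the account.
    return balance + 100


def interest(balance):
    # Update 2: add 1% interest (integer dollars, so 1000 -> 1010).
    return balance * 101 // 100


def apply_all(updates, balance=1000):
    # Apply the updates to one replica in the given order.
    for update in updates:
        balance = update(balance)
    return balance


for order in permutations([deposit, interest]):
    print([u.__name__ for u in order], apply_all(order))
# ['deposit', 'interest'] 1111
# ['interest', 'deposit'] 1110
```

The two updates do not commute, so the replicas agree only when both sites apply them in the same order.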

13
Totally-Ordered Multicasting Using Logical Time

For a group of processes multicasting messages to each other, we assume
Each message is timestamped with the current logical time of its sender
The sender is also a receiver of its own message
Messages from the same sender are received in the order they were sent, and no messages are lost
When a process receives a message, it puts it into a local queue, ordered by timestamp
The receiver multicasts an acknowledgement to the other processes. The acknowledgement is timestamped according to Lamport's algorithm, so its timestamp is larger than that of the original message

14
Totally-Ordered Multicasting Using Logical Time

A process can deliver a queued message to the application it is running only when the message is at the head of the queue and has been acknowledged by every other process
Thus, all the processes will eventually have the same copy of the local queue, ordered by Lamport timestamps
Lamport's algorithm (with process ids as tie-breakers) ensures that no two messages have the same timestamp, and the timestamps reflect a consistent global order of the events, e -> e' => TS(e) < TS(e')
Therefore, all messages are delivered in the same order everywhere. That is, we have established totally-ordered multicasting
Problems: the delay in applying updates and the higher communication overhead
How to solve the problem of lost messages?
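
The protocol can be tried out in a small single-threaded simulation. This is a sketch under stated assumptions, not a production design: the Process and Network classes, the packet layout and the random scheduler are all invented for illustration. Channels are reliable and FIFO per sender/receiver pair, every process also acknowledges to itself, and ties between equal timestamps are broken by process id.

```python
import heapq
import random
from collections import deque


class Process:
    def __init__(self, pid, n):
        self.pid, self.n = pid, n
        self.clock = 0
        self.queue = []       # heap of ((timestamp, sender), payload)
        self.acks = {}        # (timestamp, sender) -> set of acknowledging pids
        self.delivered = []   # payloads handed to the application, in order

    def multicast(self, payload, net):
        self.clock += 1
        msg_id = (self.clock, self.pid)
        net.broadcast(self.pid, ("msg", self.clock, self.pid, msg_id, payload))

    def on_receive(self, packet, net):
        kind, ts, src, msg_id, payload = packet
        self.clock = max(self.clock, ts) + 1       # Lamport receive rule
        if kind == "msg":
            heapq.heappush(self.queue, (msg_id, payload))
            self.clock += 1                        # the ack is a new send event
            net.broadcast(self.pid, ("ack", self.clock, self.pid, msg_id, None))
        else:
            self.acks.setdefault(msg_id, set()).add(src)
        # Deliver while the head of the queue has been acknowledged by all n.
        while self.queue and len(self.acks.get(self.queue[0][0], ())) == self.n:
            _, ready = heapq.heappop(self.queue)
            self.delivered.append(ready)


class Network:
    def __init__(self, n, seed=0):
        self.rng = random.Random(seed)
        self.procs = [Process(p, n) for p in range(n)]
        # One FIFO channel per (sender, receiver) pair, including self-delivery.
        self.chan = {(s, d): deque() for s in range(n) for d in range(n)}

    def broadcast(self, src, packet):
        for dst in range(len(self.procs)):
            self.chan[(src, dst)].append(packet)

    def run(self):
        while True:
            busy = [key for key, q in self.chan.items() if q]
            if not busy:
                return
            src, dst = self.rng.choice(busy)       # arbitrary interleaving
            self.procs[dst].on_receive(self.chan[(src, dst)].popleft(), self)


def demo(seed):
    net = Network(2, seed)
    net.procs[1].multicast("update 1: +100", net)   # issued at site B
    net.procs[0].multicast("update 2: +1%", net)    # issued at site A
    net.run()
    return [p.delivered for p in net.procs]


first, second = demo(0)
print(first == second)   # True
```

Both multicasts carry Lamport time 1, so the process id decides the order, and whatever interleaving the scheduler picks, both sites apply the updates in that same order. The FIFO assumption matters here: an acknowledgement can never overtake an earlier message from the same sender.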

15
Totally-Ordered Multicasting Using Logical Time

Site B
Receive Update 1
Generate M1 and send it to site A and itself
Receive Ack 1 from A
Receive M2 containing Update 2
Generate Ack 2 and send it to site A
Compare the timestamp of M2 with that of Update 1
Process Update 1
Site A
Receive Update 2
Receive M1 containing Update 1
Generate Ack 1 and send it to site B
Generate M2 and send it to site B and itself

Local queue (shown for each site): Update 1, Ack 1 (A), Update 2
16
Vector Clock

Shortcoming of Lamport's algorithm: from L(e) < L(e') we cannot conclude e -> e'
Using a unique (total order) timestamp from Lamport's algorithm cannot solve this problem. Why?
In the previous example, we serialized a set of events, and the resulting sequence may differ from their execution order in absolute time
Causality can be captured by vector timestamps (vector clocks)
If L(e) < L(e') then e -> e'
How to achieve this?
What are the differences in implications between
(1) If L(e) < L(e'), then e -> e'
(2) If e -> e', then L(e) < L(e')
With (1), the system can determine the order of events by checking their timestamps. Note that in some cases the events are unordered
(2) only says how to assign timestamps to events based on their order
A vector clock for a system with N processes is an array of N integers, one array per process
Each process pi keeps its own vector clock Vi to timestamp its local events
Processes piggyback vector timestamps on the messages they send

17
Vector Clock

Rules
VC1: Initially, Vi[j] = 0 for i, j = 1, 2, ..., N
VC2: Just before pi timestamps an event, it sets Vi[i] := Vi[i] + 1
VC3: pi includes the value t = Vi in every message it sends
VC4: When pi receives a timestamp t in a message, it sets Vi[j] := max(Vi[j], t[j]) for j = 1, 2, ..., N (merge operation)
Two properties
For a vector clock Vi, Vi[i] is the number of events that pi has timestamped (why?)
Vi[j] (j ≠ i) is the number of events that have occurred at pj that pi has recorded (why?)
The timestamp vt of a message m tells the receiver how many events in other processes have preceded m and on which m may causally depend
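
Rules VC1 to VC4 can be sketched as a small Python class. The class and method names are made up for illustration, and processes are indexed from 0 rather than 1. The scenario below is an assumption chosen for the example: a and b occur at the first process (b sends m1), c and d at the second (c receives m1, d sends m2), e and f at the third (f receives m2).

```python
class VectorClock:
    """Vector clock of process `pid` in a system of `n` processes (sketch)."""

    def __init__(self, pid, n):
        self.pid = pid
        self.v = [0] * n                  # VC1: all entries start at 0

    def tick(self):
        self.v[self.pid] += 1             # VC2: increment own entry
        return tuple(self.v)              # timestamp of the event

    def send(self):
        return self.tick()                # VC3: piggyback the whole vector

    def receive(self, t):
        # VC4: component-wise maximum (merge), then VC2 for the receive event.
        self.v = [max(mine, theirs) for mine, theirs in zip(self.v, t)]
        return self.tick()


p1, p2, p3 = (VectorClock(i, 3) for i in range(3))
a = p1.tick()          # (1, 0, 0)
b = p1.send()          # (2, 0, 0), carried by m1
c = p2.receive(b)      # (2, 1, 0)
d = p2.send()          # (2, 2, 0), carried by m2
e = p3.tick()          # (0, 0, 1)
f = p3.receive(d)      # (2, 2, 2)
print(a, b, c, d, e, f)
```

In f, the first two entries record that two events at each of the other processes causally precede it, which is the second property listed above.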

18
Vector Clock

V = V' iff V[j] = V'[j] for j = 1, 2, ..., N
V <= V' iff V[j] <= V'[j] for j = 1, 2, ..., N
V < V' iff V <= V' and V ≠ V'
e -> e' => V(e) < V(e')
V(e) < V(e') => e -> e'
Compare events a with f
Compare events c with e
What are the costs and benefits compared with Lamport's logical clock?
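
The comparison rules translate directly into code. The sample timestamps are assumptions for illustration: they are what the vector clock rules produce if a occurs at p1, c at p2 after receiving a message from p1, e at p3 with no prior communication, and f at p3 after receiving a message sent by p2.

```python
def leq(v, w):
    # V <= V' iff every component of V is <= the matching component of V'.
    return all(x <= y for x, y in zip(v, w))


def less(v, w):
    # V < V' iff V <= V' and V != V'.
    return leq(v, w) and v != w


def concurrent(v, w):
    # Neither timestamp is smaller. Distinct events never share a vector
    # timestamp, so for distinct events this means "causally unrelated".
    return not less(v, w) and not less(w, v)


a, c, e, f = (1, 0, 0), (2, 1, 0), (0, 0, 1), (2, 2, 2)   # assumed timestamps
print(less(a, f))         # True:  a -> f
print(concurrent(c, e))   # True:  c || e
```

A pair of Lamport timestamps could not give the second answer: scalar values are always comparable, so they cannot show that c and e are unordered.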

19
Global State

Event: S => (e) => S'. An event changes the state of a process
To get the states of a process, record and timestamp the state of the process after each event has been served
How to get the state of a distributed system (distributed processes)?
Collect the states of all the processes in the system. How?
Not so simple, due to communication delay and changing process states
What are the purposes of getting the global state of a distributed system?
Examples: detecting the termination of a distributed computation, garbage collection, verifying program correctness, deadlock detection, etc.
Garbage collection: an object may be collected as garbage when no process (and no message in transit) refers to it
Deadlock: two or more processes are blocked, each waiting for another
Termination: when to terminate a process? When it is inactive and will not become active again. Note that an inactive process may still be waiting for a message

20
Detecting Global Properties
21
Global State

Global state of a distributed system: the local state of each process, plus the messages that are currently in transit (the state is distributed)
How to obtain the global state of a distributed system?
The collection takes time and cannot be done instantaneously. Why?
A distributed snapshot reflects a (consistent global) state in which the distributed system might have been (at a particular point in time?)
What is a snapshot? (The state at a particular point in time)
But the snapshots are distributed
Cut: a graphical representation of a global state, as shown in the next slide
What is the implication of a cut? A distributed snapshot
Inconsistent state: an incorrect state
E.g., a snapshot contains the receipt of a message but not the sending of that message
What is the implication of an inconsistent state?
An incorrect state may produce incorrect results

22
Global State

Problems of obtaining a global state
Lack of a global clock (what is the state to be collected from each site to form the global state/global snapshot?) If you have a global clock,
Transmission delay vs. always-changing states of processes
We consider two types of events
Internal events of a process, e.g., logic operations and computations
Communication events: sending and receiving of messages
History(pi) = hi = <e1,i, e2,i, ...>
A prefix of the history: hi,k = <e1,i, e2,i, ..., ek,i>
sk,i is the state of process pi immediately after its kth event, reached from the initial state: s0,i -> ... -> sk,i
s0,i is the initial state of pi. Each event creates a new state for the process: ek,i => sk,i
Global history H = h1 U h2 U ... U hn, the union of all process histories

23
Global State and Consistent Cut

A consistent cut
An inconsistent cut

24
Global State and Consistent Cut

How to collect the states/distributed snapshot from distributed processes? Follow a cut
A cut of the system's execution is a subset of its global history H that is a union of prefixes of process histories: C = h1,c1 U h2,c2 U ... U hn,cn
The state si in the global state S corresponding to C is that of pi immediately after the last event of pi included in the cut, ei,ci
The set of events {ei,ci : i = 1, 2, ..., N} is called the frontier of the cut C
A cut C is consistent if, for each event it contains, it also contains all the events that happened-before that event: e in C and f -> e => f in C
A consistent global state is one that corresponds to a consistent cut. Execution is a series of transitions S0 -> S1 -> S2 -> ..., where each transition represents one event occurring at one of the processes in the system
A run is a total ordering of all the events in a global history that is consistent with each local history's ordering
A linearization (consistent run) is an ordering of the events in a global history that is consistent with the happened-before relation -> on H
S' is reachable from S if there is a linearization that passes through S and then S' (from state S to S'. What is the implication if S' is not reachable from S?)
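
The consistency condition on a cut can be checked mechanically. In this sketch the events and the happened-before pairs are invented: process 1 has events e1_1 and e1_2, process 2 has e2_1 and e2_2, and e1_2 sends a message that e2_1 receives. Listing only the direct predecessor pairs is enough, because a set that is closed under direct predecessors is also closed under the transitive relation.

```python
def is_consistent(cut, happened_before):
    """cut: set of events; happened_before: set of (f, e) pairs meaning f -> e."""
    # For every event e in the cut, each f with f -> e must also be in the cut.
    return all(f in cut for (f, e) in happened_before if e in cut)


hb = {
    ("e1_1", "e1_2"),   # local order at process 1
    ("e2_1", "e2_2"),   # local order at process 2
    ("e1_2", "e2_1"),   # send at process 1 -> receive at process 2
}

print(is_consistent({"e1_1", "e1_2"}, hb))           # True
print(is_consistent({"e1_1", "e1_2", "e2_1"}, hb))   # True
print(is_consistent({"e1_1", "e2_1"}, hb))           # False
```

The last cut is inconsistent because it contains the receive event e2_1 without the matching send event e1_2.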

25
Global State and Consistent Cut
L1: e1,0 e2,0 e1,1 e1,2 e1,3 e2,1 e2,2
L2: e1,0 e1,1 e2,0 e2,1 e1,2 e2,2 e1,3
[Figure: the two linearizations L1 and L2]
S' is reachable from S if there is a linearization that passes through S and then S'
26
Global State Predicates

A global state predicate is a function that maps the set of global states of processes in the system to {True, False}
Stability: once the system enters a state in which the predicate is True, it remains True in all future states reachable from that state
E.g., deadlock and garbage collection
Safety: there is an undesirable property α (e.g., being deadlocked) that is a predicate of the system's global state. Let S0 be the original state of the system. Safety with respect to α is the assertion that α evaluates to False for all states S reachable from S0
Liveness: there is a desirable property β. Liveness with respect to β is the property that, for any linearization L starting in the state S0, β evaluates to True for some state SL reachable from S0

27
Chandy and Lamport's Snapshot Algorithm

Goal: to record a set of process and channel states for a set of processes pi such that, even though the combination of recorded states may never have occurred at the same time, the recorded global state is consistent
What is a channel state? The messages that are in transit on the channel
Assumptions
Communication among the processes is reliable and messages are delivered in order
No process or communication failures
Channels are unidirectional and provide FIFO transmission
Any process may initiate a global snapshot at any time
The graph of processes and channels is strongly connected
The processes may continue execution while the snapshot takes place
Incoming channels for process pi are the channels on which other processes send messages to pi
Outgoing channels for process pi are the channels on which pi sends messages to other processes
Each process records its own state and, for each incoming channel, the set of messages sent to it over that channel
28
Chandy and Lamport's Snapshot Algorithm

Any initiating process, say P, may start by recording its own local state; it then sends a marker along each of its outgoing channels
When a process Q receives a marker on an incoming channel C,
If Q has not saved its local state, it first saves the state, then sends a marker along each of its own outgoing channels
If Q has already recorded its local state, it records the state of channel C: the sequence of messages that Q received on C after it recorded its local state and before it received the marker
When a process has received a marker along each of its incoming channels, and processed each one, its recorded local state and the state of each channel are collected and sent to the initiating process P
Because any process can initiate the algorithm, several snapshots can be constructed at the same time. To distinguish the different snapshots, a marker can be tagged with the identifier (and even a version number) of the process that initiated the snapshot

29
Chandy and Lamport's Snapshot Algorithm
Marker receiving rule for process pi
  On pi's receipt of a marker message over channel c:
    if (pi has not yet recorded its state) it
      records its process state now;
      records the state of c as the empty set;
      turns on recording of messages arriving over other incoming channels;
    else
      pi records the state of c as the set of messages it has received over c since it saved its state.
    end if
Marker sending rule for process pi
  After pi has recorded its state, for each outgoing channel c:
    pi sends one marker message over c (before it sends any other message over c).
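
The two marker rules can be exercised in a small single-threaded sketch. Everything here is illustrative rather than taken from the slides: the Process class, the money-transfer workload and the hand-scripted delivery order are assumptions, and the final step of sending the recorded states to the initiator is left out.

```python
from collections import deque

MARKER = "MARKER"


class Process:
    def __init__(self, name, balance):
        self.name, self.balance = name, balance
        self.inbox = {}             # sender name -> FIFO incoming channel
        self.peers = {}             # receiver name -> Process (outgoing channels)
        self.recorded_state = None
        self.channel_state = {}     # sender name -> messages recorded on that channel
        self.recording = set()      # incoming channels still being recorded

    def connect(self, other):
        # Create one unidirectional FIFO channel from self to other.
        self.peers[other.name] = other
        other.inbox[self.name] = deque()

    def send(self, to, amount):
        self.balance -= amount
        self.peers[to].inbox[self.name].append(amount)

    def record_and_send_markers(self):
        # Record own state, start recording every incoming channel, then
        # apply the marker sending rule on every outgoing channel.
        self.recorded_state = self.balance
        self.recording = set(self.inbox)
        self.channel_state = {src: [] for src in self.inbox}
        for peer in self.peers.values():
            peer.inbox[self.name].append(MARKER)

    def receive(self, src):
        msg = self.inbox[src].popleft()
        if msg == MARKER:
            if self.recorded_state is None:
                self.record_and_send_markers()
            # Channel src is now final: empty if this was the first marker,
            # otherwise whatever was recorded since the state was saved.
            self.recording.discard(src)
        else:
            self.balance += msg
            if src in self.recording:
                self.channel_state[src].append(msg)


p1, p2 = Process("p1", 100), Process("p2", 50)
p1.connect(p2)
p2.connect(p1)
p1.send("p2", 10)               # in transit on p1 -> p2
p2.send("p1", 20)               # in transit on p2 -> p1
p1.record_and_send_markers()    # p1 initiates the snapshot
p1.receive("p2")                # 20 arrives after p1 recorded: channel state
p2.receive("p1")                # 10 arrives before p2's first marker
p2.receive("p1")                # marker: p2 records its state
p1.receive("p2")                # marker: p1 stops recording p2 -> p1

total = (p1.recorded_state + p2.recorded_state
         + sum(p1.channel_state["p2"]) + sum(p2.channel_state["p1"]))
print(p1.recorded_state, p2.recorded_state, p1.channel_state, total)
# 90 40 {'p2': [20]} 150
```

The recorded total of 150 equals the money in the system at the start, even though no single instant had p1 at 90, p2 at 40 and exactly 20 in transit. The recorded global state is consistent without having necessarily occurred.
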
30
Global State

Organization of a process and channels for a
distributed snapshot

31
Global State

Process Q receives a marker for the first time and records its local state
Q records all incoming messages
Q receives a marker on an incoming channel and finishes recording the state of that channel

32
Chandy and Lamport's Snapshot Algorithm

The biggest problem in getting a distributed snapshot is how to collect the states of other processes and the messages in transit
No message may be recorded as received unless its sending is also included in the distributed snapshot
The marker acts like a cut on the state of a process
All other processes follow the same cut sequence (cut after the first cut) to collect their states
The marker sending rule obligates processes to send a marker after they have recorded their state
The marker receiving rule obligates a process that has not recorded its state to do so
If a process that has already saved its state receives a marker, it records the state of the channel as the set of messages it received on that channel since it saved its state
What is the importance of reliable communication and in-order transmission?

33
Two processes and their initial states
P2 has already received an order for five
widgets, which it will shortly dispatch to P1
34
The execution of the processes
35
Chandy and Lamport's Snapshot Algorithm

Process P1 records its state in the global state S0, when P1's state is <$1000, 0>
Following the marker sending rule, P1 emits a marker message over its outgoing channel c2 before it sends the next application-level message (order 10, $100) over channel c2. Global state S1
Before P2 receives the marker, it emits an application message (five widgets) over c1 in response to P1's previous order. Global state S2
Process P1 receives P2's message (five widgets), and P2 receives the marker. Following the marker receiving rule, P2 records its state as <$50, 1995> and that of channel c2 as the empty sequence. Following the marker sending rule, it sends a marker message over c1
When P1 receives P2's marker message, it records the state of channel c1 as the single message (five widgets) that it received after it first recorded its state. Global state S3
The final recorded state: P1 <$1000, 0>, P2 <$50, 1995>, c1 <(five widgets)>, c2 <>