Title: Logical Time

1
Logical Time

M. Liu

2
Introduction

The concept of logical time has its origin in a
seminal paper by Leslie Lamport, "Time, Clocks,
and the Ordering of Events in a Distributed
System," Communications of the ACM, July 1978.
The topic remains of interest: a recent paper,
"Capturing Causality in Distributed Systems" by
Raynal and Singhal, appeared in Computer (see
handout).

3
Application of Logical Time

Logical time in visualizations produced by
parallel computations.
Bank system algorithm.
"Efficient Solutions to the Replicated Log and
Dictionary Problems" by Wuu and Bernstein.

4
Background - 1 (source: Raynal and Singhal)

A distributed computation consists of a set of
processes that cooperate and compete to achieve a
common goal. These processes do not share a
common global memory and communicate solely by
passing messages over a communication network.

5
Background - 2 (source: Raynal and Singhal)

In a distributed system, a process's actions are
modeled as three types of events: internal,
message send, and message receive.
An internal event affects only the process at
which it occurs, and the events at a process are
linearly ordered by their order of occurrence.
Send and receive events signify the flow of
information between processes and establish
causal dependency from the sender process to the
receiver process.

6
Background - 3 (source: Raynal and Singhal)

The execution of a distributed application
results in a set of distributed events produced
by its processes.
The causal precedence relation induces a partial
order on the events of a distributed computation.

7
Background - 4 (source: Raynal and Singhal)

Causality among events, more formally the
causal precedence relation, is a powerful concept
for reasoning, analyzing, and drawing inferences
about a distributed computation. Knowledge of the
causal precedence relation between processes
helps programmers, designers, and the system
itself solve a variety of problems in distributed
computing.

8
Background - 5 (source: Raynal and Singhal)

The notion of time is basic to capturing the
causality between events. Distributed systems
have no built-in physical time and can only
approximate it. However, in a distributed
computation, both the progress and the
interaction between processes occur in spurts.
Consequently, logical clocks can be used to
accurately capture the causality relation between
events.
This article presents a general framework of a
system of logical clocks in distributed systems
and discusses three methods--scalar, vector, and
matrix--for implementing logical time in these
systems.

9
Notations

A distributed program is composed of a set of n
independent and asynchronous processes $p_1, p_2, \ldots,
p_i, \ldots, p_n$. These processes do not share a global
clock.
Each process can execute an event spontaneously;
when sending a message, it does not have to wait
for the delivery to be complete.
The execution of each process $p_i$ produces a
sequence of events $e_i^0, e_i^1, \ldots, e_i^x, e_i^{x+1}, \ldots$
The set of events produced by $p_i$ has a total
order determined by the sequencing of the events:
$e_i^x \to e_i^{x+1}$
We say that $e_i^x$ happens before $e_i^{x+1}$.
The happens-before relation $\to$ is transitive:
$e_i^x \to e_i^y$ for all $x < y$.

10
Notations - 2

Events that occur at two concurrent
processes are generally unrelated, except for
those that are causally related as follows:
for every message m exchanged between two
processes $P_i$ and $P_j$, we have $e_i^x = send(m)$,
$e_j^y = receive(m)$, and
$e_i^x \to e_j^y$
Events in a distributed execution are partially
ordered:
Local events are totally ordered.
Causally related events are ordered.
All other events are unordered.
For any two events $e_1$ and $e_2$ in a distributed
execution, either
(i) $e_1 \to e_2$, (ii) $e_2 \to e_1$, or (iii) $e_1 \parallel e_2$ (that is,
$e_1$ and $e_2$ are concurrent).
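
The partial order described above can be sketched in Python. This is a minimal illustration, not part of the original slides: the representation of an event as a (process, index) tuple and the helper names happened_before and concurrent are assumptions made for the example.

```python
from itertools import product


def happened_before(events, messages):
    """Return the set of pairs (a, b) such that a -> b.

    events:   dict mapping a process id to its number of events
    messages: list of (send_event, receive_event) pairs, where each
              event is a (process, index) tuple
    """
    nodes = [(p, x) for p, count in events.items() for x in range(count)]
    # Direct edges: consecutive local events, plus send -> receive.
    hb = {((p, x), (p, x + 1))
          for p, count in events.items() for x in range(count - 1)}
    hb |= set(messages)
    # Transitive closure (Warshall's algorithm; k is the outermost loop).
    for k, a, b in product(nodes, repeat=3):
        if (a, k) in hb and (k, b) in hb:
            hb.add((a, b))
    return hb


def concurrent(hb, a, b):
    """Two distinct events are concurrent if neither precedes the other."""
    return a != b and (a, b) not in hb and (b, a) not in hb


# Two processes with three events each; the second event of process 1
# sends a message that process 2 receives as its third event.
hb = happened_before({1: 3, 2: 3}, [((1, 1), (2, 2))])
print(((1, 0), (2, 2)) in hb)          # True: (1,0) -> (1,1) -> (2,2)
print(concurrent(hb, (1, 2), (2, 0)))  # True: no causal path either way
```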

11
Which of these events are related by $\to$? Which ones
are concurrent?

12
Clock conditions

In a system of logical clocks, every
participating process has a logical clock that is
advanced according to a protocol.
Every event is assigned a timestamp in such a
manner that the clock consistency condition is
satisfied:
if $e_1 \to e_2$ then $C(e_1) < C(e_2)$
where $C(e_i)$ is the timestamp assigned to
event $e_i$.
If the protocol satisfies the following condition
as well, then the clock is said to be strongly
consistent:
if $C(e_1) < C(e_2)$ then $e_1 \to e_2$

13
A logical clock implementation - the Lamport
Clock

R1: Before executing an event (send, receive, or
internal), $p_i$ executes the following:
$C_i := C_i + d$ ($d > 0$, usually $d = 1$)
R2: Each message carries the clock value of its
sender at sending time. When $p_i$ receives a
message with the timestamp $C_{msg}$, it executes the
following:
1. $C_i := \max(C_i, C_{msg})$
2. Execute R1.
3. Deliver the message.
The logical clock at any process is
monotonically increasing.
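
Rules R1 and R2 can be sketched as a small Python class. The class and method names (LamportClock, tick, send, receive) are assumptions chosen for illustration; message delivery itself is left out.

```python
class LamportClock:
    """Scalar logical clock following rules R1 and R2."""

    def __init__(self, d=1):
        self.time = 0   # C_i
        self.d = d      # increment, d > 0

    def tick(self):
        """R1: advance the clock before an event; return its timestamp."""
        self.time += self.d
        return self.time

    def send(self):
        """A send is an event; the returned value travels with the message."""
        return self.tick()

    def receive(self, c_msg):
        """R2: take the maximum with the message timestamp, then apply R1."""
        self.time = max(self.time, c_msg)
        return self.tick()


p1, p2 = LamportClock(), LamportClock()
p1.tick()                 # internal event on p1, C = 1
stamp = p1.send()         # send event on p1, C = 2
p2.tick()                 # internal event on p2, C = 1
print(p2.receive(stamp))  # 3, that is max(1, 2) + 1
```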

14
Fill in the logical clock values

15
Correctness of the Lamport Clock

Does the Lamport clock satisfy the clock
consistency condition?
Does the Lamport clock satisfy the strong clock
consistency condition?
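
One way to explore the second question is a small experiment. The sketch below is an illustration only (the function name lamport_event is an assumption): it timestamps events on two processes that never exchange messages, so every event on one is concurrent with every event on the other.

```python
def lamport_event(clock, c_msg=None, d=1):
    """Return the clock value after one event (R1, preceded by R2 on receive)."""
    if c_msg is not None:
        clock = max(clock, c_msg)
    return clock + d


# p1 and p2 never communicate, so a and b are concurrent.
a = lamport_event(0)                 # first event on p1
b = lamport_event(lamport_event(0))  # second event on p2
print(a, b)   # 1 2
print(a < b)  # True, although a did not happen before b
```

Here C(a) < C(b) holds for two concurrent events, so a smaller scalar timestamp does not by itself establish causal precedence.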

16
Logical Clock Protocols

The Lamport Clock is an example of a logical
clock protocol. There are others.
The Lamport Clock is a scalar clock: it uses a
single integer to represent the clock value.

17
Lamport clock paper

PODC Influential Paper Award 2000,
http://www.podc.org/influential/2000.html
"Time, Clocks, and the Ordering of Events in a
Distributed System" by Leslie Lamport, obtainable
from the ACM Digital Library.

18
An application of scalar logical time: bank
system algorithm

See bank system algorithm slides

19
Vector Logical Clock

Developed by several people independently.
Each $P_i$ of n participating processes maintains an
integer vector (array) of size n,
$vt_i[1..n]$, where $vt_i[i]$ is the local logical
clock of $P_i$, and
$vt_i[j]$ represents $P_i$'s latest knowledge of $P_j$'s
local time.

20
Vector clock protocol

At process $P_i$:
Before executing an event, $P_i$ updates its local
logical time as follows:
$vt_i[i] := vt_i[i] + d$ ($d > 0$)
Each sender process piggybacks a message m with
its vector clock value at sending time. Upon
receiving such a message (m, vt), $P_i$ updates its
vector clock as follows:
for $1 \le k \le n$: $vt_i[k] := \max(vt_i[k], vt[k])$
$vt_i[i] := vt_i[i] + d$ ($d > 0$)
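
The protocol above can be sketched in Python as follows. The class name VectorClock and its methods are assumptions for illustration, and process indices are 0-based here rather than 1-based as on the slide.

```python
class VectorClock:
    """Vector logical clock for process i out of n processes."""

    def __init__(self, n, i, d=1):
        self.vt = [0] * n   # vt[k]: latest knowledge of process k's time
        self.i = i          # index of the owning process (0-based)
        self.d = d

    def tick(self):
        """Advance the local component before an event; return a copy."""
        self.vt[self.i] += self.d
        return list(self.vt)

    def send(self):
        """A send is an event; the returned vector is piggybacked on m."""
        return self.tick()

    def receive(self, vt_msg):
        """Component-wise maximum with the message vector, then tick."""
        self.vt = [max(a, b) for a, b in zip(self.vt, vt_msg)]
        return self.tick()


p0, p1, p2 = (VectorClock(3, i) for i in range(3))
m = p0.send()         # p0: [1, 0, 0]
p1.tick()             # p1: [0, 1, 0]
print(p1.receive(m))  # [1, 2, 0]
print(p2.tick())      # [0, 0, 1]
```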

21
Vector clock

The system of vector clocks is strongly
consistent.
Every event is assigned a timestamp in such a
manner that the clock consistency condition is
satisfied:
if $e_1 \to e_2$ then $vt(e_1) < vt(e_2)$, using vector
comparison,
where $vt(e_i)$ is the timestamp assigned to
event $e_i$.
The protocol satisfies the following condition
as well, which makes the clock strongly
consistent:
if $vt(e_1) < vt(e_2)$ then $e_1 \to e_2$, using vector
comparison

22
Vector comparison

Given two vectors V1 and V2, both of size n:
$V1 < V2$ if $V1[i] \le V2[i]$ for $i = 1, \ldots, n$,
and there exists some k, $1 \le k \le n$, such that
$V1[k] < V2[k]$
Example: V1 = (1, 2, 3, 4), V2 = (2, 3, 4, 5)
$V1 < V2$
Example: V1 = (1, 2, 3, 4), V2 = (2, 2, 4, 4)
$V1 < V2$ (equal components are allowed as long as at least one is strictly smaller)
Example: V1 = (1, 2, 3, 4), V2 = (2, 3, 4, 1)
$V1 \not< V2$
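
A minimal Python sketch of vector comparison follows. It assumes the first condition is read as "less than or equal in every component", which is the usual definition for vector clocks; the function names vec_less and vec_concurrent are hypothetical.

```python
def vec_less(v1, v2):
    """v1 < v2: every component <=, and at least one strictly <."""
    assert len(v1) == len(v2)
    return (all(a <= b for a, b in zip(v1, v2))
            and any(a < b for a, b in zip(v1, v2)))


def vec_concurrent(v1, v2):
    """Distinct timestamps that are ordered in neither direction."""
    return v1 != v2 and not vec_less(v1, v2) and not vec_less(v2, v1)


print(vec_less([1, 2, 3, 4], [2, 3, 4, 5]))        # True
print(vec_less([1, 2, 3, 4], [2, 3, 4, 1]))        # False
print(vec_concurrent([1, 2, 3, 4], [2, 3, 4, 1]))  # True
```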

23
Vector clock

Because vector clocks are strongly consistent, we
can use them to determine whether two events are
causally related by comparing their vector time
stamps, using vector comparison.

24
Matrix Time

Proposed by Michael and Fischer in 1982.
A process $P_i$ maintains a matrix
$mt_i[1..n, 1..n]$ where
$mt_i[i, i]$ denotes the logical clock of $P_i$;
$mt_i[i, j]$ denotes the latest knowledge that $P_i$
has about the local clock, $mt_j[j, j]$, of $P_j$ (row i
is the vector clock of $P_i$);
$mt_i[j, k]$ represents what $P_i$ knows about the
latest knowledge that $P_j$ has about the local
logical clock $mt_k[k, k]$ of $P_k$.

25
Matrix Time Protocol

At process $P_i$:
Before executing an event, $P_i$ updates its local
logical time as follows:
$mt_i[i, i] := mt_i[i, i] + d$ ($d > 0$)
Each sender process piggybacks a message m with
its matrix clock value at sending time. Upon
receiving such a message (m, mt) from $P_j$, $P_i$
updates its matrix clock as follows:
1. for $1 \le k \le n$: $mt_i[i, k] := \max(mt_i[i, k], mt[j, k])$
2. for $1 \le k \le n$,
for $1 \le q \le n$:
$mt_i[k, q] := \max(mt_i[k, q], mt[k, q])$
3. $mt_i[i, i] := mt_i[i, i] + d$ ($d > 0$)
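
The three receive steps can be sketched in Python as below. The class name MatrixClock and its methods are assumptions for illustration, and indices are 0-based.

```python
class MatrixClock:
    """Matrix logical clock for process i out of n processes."""

    def __init__(self, n, i, d=1):
        self.n, self.i, self.d = n, i, d
        self.mt = [[0] * n for _ in range(n)]

    def tick(self):
        """Advance mt[i][i] before an event; return a copy of the matrix."""
        self.mt[self.i][self.i] += self.d
        return [row[:] for row in self.mt]

    def send(self):
        """A send is an event; the returned matrix is piggybacked on m."""
        return self.tick()

    def receive(self, mt_msg, j):
        """Apply steps 1 to 3 for a message from process j."""
        i, n = self.i, self.n
        # Step 1: merge the sender's own row (its vector clock) into row i.
        for k in range(n):
            self.mt[i][k] = max(self.mt[i][k], mt_msg[j][k])
        # Step 2: element-wise maximum over the whole matrix.
        for k in range(n):
            for q in range(n):
                self.mt[k][q] = max(self.mt[k][q], mt_msg[k][q])
        # Step 3: the receive itself is an event.
        return self.tick()


p0, p1 = MatrixClock(2, 0), MatrixClock(2, 1)
m = p0.send()            # p0: [[1, 0], [0, 0]]
print(p1.receive(m, 0))  # [[1, 0], [1, 1]]
```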

26
matrix clock consistency

The system of matrix clocks is strongly
consistent.
Every event is assigned a timestamp in such a
manner that the clock consistency condition is
satisfied:
if $e_1 \to e_2$ then $mt(e_1) < mt(e_2)$, using matrix
comparison,
where $mt(e_i)$ is the timestamp assigned to
event $e_i$.
The protocol satisfies the following condition
as well, which makes the clock strongly
consistent:
if $mt(e_1) < mt(e_2)$ then $e_1 \to e_2$, using matrix
comparison

27
Matrix comparison

Given two matrices M1 and M2, both of size n by
n:
$M1 < M2$ if $M1[i, j] \le M2[i, j]$
for $i = 1, \ldots, n$ and $j = 1, \ldots, n$,
and there exist some k, $1 \le k \le n$, and some p,
$1 \le p \le n$, such that $M1[k, p] < M2[k, p]$
Because matrix clocks are strongly consistent, we
can use them to determine whether two events are
causally related by comparing their matrix
timestamps.

28
An application of matrix time: Wuu and Bernstein
paper

The dictionary problem: a dictionary is
replicated among multiple nodes. Each node
maintains its own view of the dictionary and
performs operations on it independently.
The network may be unreliable.
The dictionary data must be consistent among the
nodes.
Serializability (using locking) is the database
approach to address such a problem.
The paper (as did other papers preceding it)
describes an algorithm which does not require
serializability.

29
Wuu and Bernstein protocol

A replicated log is used to achieve mutual
consistency of replicated data in an unreliable
network.
The log contains records of invocations of
operations which access a data object.
Each node updates its local copy of the data
object by performing the operations contained in
its local copy of the log.
The operations are commutative so that the order
in which operations are performed does not affect
the final state of the data.

30
The problem environment

n nodes N1, N2, ..., Nn are connected over a
network.
Each node maintains a data dictionary V: a set
of words {s1, s2, ..., sn}, stored in stable
storage impervious to crashes.
Vi denotes the local view of the dictionary at
Ni.
Two types of operations may be issued by any node
to perform on the dictionary:
insert(x)
delete(x)
delete(x) can be invoked at Ni only if x is in Vi;
note that the operation may be issued by
multiple nodes.
insert(x) can only be issued by one node.

31
The problem environment - 2

The unique event which inserts x is denoted $e_x$.
An event which deletes x is called an x-delete
event.
If V(e) is the dictionary view at a node after
the occurrence of event e, then x is in V(e) iff
$e_x \to e$ and there does not exist an x-delete
event g such that $g \to e$.

32
The log

Each node maintains a log of events L and a
distributed algorithm is employed to keep the
dictionary views up to date.
An event is recorded in the log as a
record/object containing these fields: operation,
time, nodeID. For example:
(add a, 3, 2) if Node 2 issued add a at its
local time 3.
The event record describing event e is denoted
eR.
eR.node is the node that issued the event, eR.op
is the operation, and eR.time is the time at
which the operation was issued.

33
The log

Nodes exchange messages containing appropriate
portions of the individually maintained log in
order to achieve data consistency.
L(e) denotes the contents of the log at a node
immediately after the event e completes.
The log problem:
(p1) $f \to e$ iff fR is in L(e)

34
A trivial solution

Each node i that generates an event e adds a
record for the event, eR, to its local log Li.
Each time the node sends a message, it includes
its log Li in the message.
Upon receiving a message, a node j looks at the
log enclosed in the message, and applies the
event in each record to its dictionary view Vj.
The logs are maintained indefinitely. If a node
j is cut off from the network due to failures,
its dictionary view may fall behind other nodes,
but as soon as the network is repaired and
messages can be sent to node j again, then the
events logged by other nodes will be made known
to j eventually.
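
A minimal Python sketch of the trivial solution follows. The class name TrivialNode, the tuple layout of a log record, and the use of a local counter for the time field are assumptions made for illustration. The view is rebuilt as "inserted minus deleted", which relies on the problem's rule that each x is inserted by only one node.

```python
class TrivialNode:
    """Node that ships its entire log with every message."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.time = 0     # local counter used to stamp events
        self.log = set()  # Li: records (op, x, time, node)
        self.view = set() # Vi

    def _rebuild_view(self):
        inserted = {x for op, x, _, _ in self.log if op == "insert"}
        deleted = {x for op, x, _, _ in self.log if op == "delete"}
        self.view = inserted - deleted

    def issue(self, op, x):
        """Record a local insert or delete and update the view."""
        self.time += 1
        self.log.add((op, x, self.time, self.node_id))
        self._rebuild_view()

    def send(self):
        """The whole log travels with every message."""
        return set(self.log)

    def receive(self, log):
        """Merge the received log and recompute the view from scratch."""
        self.log |= log
        self._rebuild_view()


n1, n2 = TrivialNode(1), TrivialNode(2)
n1.issue("insert", "a")
n1.issue("insert", "b")
n2.receive(n1.send())
n2.issue("delete", "a")
n1.receive(n2.send())
print(sorted(n1.view), sorted(n2.view))  # ['b'] ['b']
```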

35
Trivial solution

The trivial solution
is fault-tolerant;
satisfies the log problem and the dictionary
problem.
The log maintained by each node i, Li, grows
without bound, which has these ramifications:
The entire log is sent with each message:
excessive communication costs.
A new view of the dictionary is repeatedly
computed based on the log received in each
message: excessive computational costs.
The entire log is stored at each node: excessive
storage costs.

36
Wuu and Bernstein's improved solution

Uses matrix time to purge event records that have
already been seen by all participants.
Each node i maintains a matrix clock Ti.
When i receives a log which contains a record eR for
event e, initiated by node eR.node, it
determines whether node k has already seen this
record using this predicate (boolean function):
boolean hasrec(Ti, eR, k)
return (Ti[k, eR.node] >= eR.time)
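
The predicate can be written in Python as shown below. This is a sketch under stated assumptions: the comparison is taken to be "greater than or equal", the EventRecord type is a hypothetical stand-in for the record eR, and node ids are 0-based.

```python
from collections import namedtuple

# Hypothetical record type with the three fields named on the log slide.
EventRecord = namedtuple("EventRecord", ["op", "time", "node"])


def hasrec(T, eR, k):
    """True if, according to matrix clock T, node k has seen record eR."""
    return T[k][eR.node] >= eR.time


# Matrix clock of node 0 in a 3-node system.
T0 = [[4, 2, 0],
      [3, 2, 0],
      [0, 0, 0]]
eR = EventRecord(op="insert a", time=3, node=0)
print(hasrec(T0, eR, 1))  # True:  T0[1][0] = 3 >= 3
print(hasrec(T0, eR, 2))  # False: T0[2][0] = 0 < 3
```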

37
Wuu and Bernstein's improved solution, pp. 236-7

Kept at each node are:
Vi: the dictionary view, e.g., {a, b, c}
Pli: a partial log of events
Initialization:
Vi = Pli = {} // set both empty,
set matrix clock to all 0

38
Wuu and Bernstein's improved solution, pp. 236-7

When node i issues insert(x)
Update matrix clock
Add the event record to the partial log Pli
Add x to Vi
When node i issues delete(x)
Update matrix clock
Add the event record to the partial log Pli
delete x from Vi

39
Wuu and Bernstein's improved solution, pp. 236-7

When node i sends to node k:
Create a subset NP of the partial log Pli,
consisting of those entries eR such that
hasrec(Ti, eR, k) returns false.
Send NP and Ti to node k.

40
Wuu and Bernstein's improved solution, pp. 236-7

When node i receives from node k:
Extract from the log received a subset, NE,
consisting of those entries eR such that
hasrec(Ti, eR, i) returns false.
These entries have not already been seen by i.
Update the dictionary view Vi based on NE.
Update the matrix clock Ti.
Add to the partial log Pli (note: not NE) those
records in the log received such that
hasrec(Ti, eR, j) returns false for at least
one j.
Such a record has not been seen by at least one
other node.
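
The issue, send, and receive steps from the last few slides can be put together in one Python sketch. It is an illustration under several assumptions, not the paper's code: the names Node and EventRecord are hypothetical, node ids are 0-based, the matrix clock is advanced only when a node issues an operation, hasrec uses "greater than or equal", and records already known to every node are also dropped from the existing partial log on receipt.

```python
from collections import namedtuple

EventRecord = namedtuple("EventRecord", ["op", "x", "time", "node"])


def hasrec(T, eR, k):
    """True if, according to matrix clock T, node k has seen record eR."""
    return T[k][eR.node] >= eR.time


class Node:
    def __init__(self, i, n):
        self.i, self.n = i, n
        self.V = set()                        # dictionary view Vi
        self.PL = set()                       # partial log Pli
        self.T = [[0] * n for _ in range(n)]  # matrix clock Ti

    def _issue(self, op, x):
        self.T[self.i][self.i] += 1           # update matrix clock
        self.PL.add(EventRecord(op, x, self.T[self.i][self.i], self.i))

    def insert(self, x):
        self._issue("insert", x)
        self.V.add(x)

    def delete(self, x):
        assert x in self.V, "delete(x) requires x to be in the local view"
        self._issue("delete", x)
        self.V.discard(x)

    def send(self, k):
        """Return (NP, Ti): the records node k may not have, plus the clock."""
        NP = {eR for eR in self.PL if not hasrec(self.T, eR, k)}
        return NP, [row[:] for row in self.T]

    def receive(self, k, NP, Tk):
        # NE: received records that this node has not seen yet.
        NE = {eR for eR in NP if not hasrec(self.T, eR, self.i)}
        inserted = {eR.x for eR in NE if eR.op == "insert"}
        deleted = {eR.x for eR in NE if eR.op == "delete"}
        self.V = (self.V | inserted) - deleted
        # Update the matrix clock from the sender's clock.
        for r in range(self.n):
            self.T[self.i][r] = max(self.T[self.i][r], Tk[k][r])
        for r in range(self.n):
            for s in range(self.n):
                self.T[r][s] = max(self.T[r][s], Tk[r][s])
        # Keep only records that at least one node may still be missing.
        self.PL = {eR for eR in self.PL | NP
                   if any(not hasrec(self.T, eR, j) for j in range(self.n))}


a, b = Node(0, 2), Node(1, 2)
a.insert("x")
a.insert("y")
b.receive(0, *a.send(1))
b.delete("x")
a.receive(1, *b.send(0))
print(sorted(a.V), sorted(b.V))  # ['y'] ['y']
print(len(a.PL), len(b.PL))      # 0 1
```

In this run node a can purge all three records because its matrix clock shows that both nodes have seen them, while node b still holds its delete record because it has not yet learned that a received it.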

41
Wuu and Bernstein's improved solution, pp. 236-7

The size of the log sent with each message is
minimized based on the matrix clock.
The number of log entries based on which the
local dictionary view is updated is minimized,
again based on the matrix clock.
The algorithm will allow each log record to be
maintained by at least one node, so that
eventually that knowledge will be propagated to a
recovered node.
