1
Logical Time
  • M. Liu

2
Introduction
  • The concept of logical time has its origin in a
    seminal paper by Leslie Lamport: "Time, Clocks,
    and the Ordering of Events in a Distributed
    System," Communications of the ACM, July 1978.
  • The topic remains of interest: a recent paper,
    "Capturing Causality in Distributed Systems" by
    Raynal and Singhal, appeared in Computer (see
    handout).

3
Application of Logical Time
  • Logical Time in Visualizations Produced by
    Parallel Computations
  • Bank system algorithm.
  • Efficient solutions to the Replicated Log and
    Dictionary problems by Wuu and Bernstein.

4
Background 1 (source: Raynal and Singhal)
  • A distributed computation consists of a set of
    processes that cooperate and compete to achieve a
    common goal. These processes do not share a
    common global memory and communicate solely by
    passing messages over a communication network.

5
Background 2 (source: Raynal and Singhal)
  • In a distributed system, a process's actions are
    modeled as three types of events: internal,
    message send, and message receive.
  • An internal event affects only the process at
    which it occurs, and the events at a process are
    linearly ordered by their order of occurrence.
  • Send and receive events signify the flow of
    information between processes and establish
    causal dependency from the sender process to the
    receiver process.

6
Background 3 (source: Raynal and Singhal)
  • The execution of a distributed application
    results in a set of distributed events produced
    by the processes.
  • The causal precedence relation induces a partial
    order on the events of a distributed computation.

7
Background 4 (source: Raynal and Singhal)
  • Causality among events, more formally the
    causal precedence relation, is a powerful concept
    for reasoning, analyzing, and drawing inferences
    about a distributed computation. Knowledge of the
    causal precedence relation between processes
    helps programmers, designers, and the system
    itself solve a variety of problems in distributed
    computing.

8
Background 5 (source: Raynal and Singhal)
  • The notion of time is basic to capturing the
    causality between events. Distributed systems
    have no built-in physical time and can only
    approximate it. However, in a distributed
    computation, both the progress and the
    interaction between processes occur in spurts.
    Consequently, logical clocks can be used to
    accurately capture the causality relation between
    events.
  • This article presents a general framework of a
    system of logical clocks in distributed systems
    and discusses three methods--scalar, vector, and
    matrix--for implementing logical time in these
    systems.

9
Notations
  • A distributed program is composed of a set of n
    independent and asynchronous processes p1, p2,
    ..., pi, ..., pn. These processes do not share a
    global clock.
  • Each process can execute an event spontaneously;
    when sending a message, it does not have to wait
    for the delivery to be complete.
  • The execution of each process pi produces a
    sequence of events ei0, ei1, ..., eix, eix+1, ...
  • The set of events produced by pi has a total
    order determined by the sequencing of the events:
    eix → eix+1.
  • We say that eix happens before eix+1.
  • The happens-before relation → is transitive:
    eix → eiy for all x < y.

10
Notations - 2
  • Events that occur at two concurrent processes are
    generally unrelated, except for those that are
    causally related as follows:
  • for every message m exchanged between two
    processes Pi and Pj, we have eix = send(m),
    ejy = receive(m), and
  • eix → ejy.
  • Events in a distributed execution are partially
    ordered:
  • Local events are totally ordered.
  • Causally related events are ordered.
  • All other events are unordered.
  • For any two events e1 and e2 in a distributed
    execution, either
  • (i) e1 → e2, (ii) e2 → e1, or (iii) e1 || e2
    (that is, e1 and e2 are concurrent).

11
Which of these events are → related? Which ones
are concurrent?
12
Clock conditions
  • In a system of logical clocks, every
    participating process has a logical clock that is
    advanced according to a protocol.
  • Every event is assigned a timestamp in such a
    manner that it satisfies the clock consistency
    condition:
  • if e1 → e2 then C(e1) < C(e2),
  • where C(ei) is the timestamp assigned to
    event ei.
  • If the protocol satisfies the following condition
    as well, then the clock is said to be strongly
    consistent:
  • if C(e1) < C(e2) then e1 → e2.

13
A logical clock implementation - the Lamport
Clock
  • R1: Before executing an event (send, receive, or
    internal), pi executes the following:
  • Ci = Ci + d (d > 0, usually d = 1)
  • R2: Each message carries the clock value of its
    sender at sending time. When pi receives a
    message with timestamp Cmsg, it executes the
    following:
  • Ci = max(Ci, Cmsg)
  • Execute R1.
  • Deliver the message.
  • The logical clock at any process is
    monotonically increasing.
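  • A minimal sketch of rules R1 and R2 in Python
    (the class name and the default d = 1 are
    illustrative, not from the slides):

    class LamportClock:
        """Scalar logical clock following rules R1 and R2 above."""

        def __init__(self, d=1):
            self.c = 0              # current clock value
            self.d = d              # increment, d > 0 (usually 1)

        def tick(self):
            """R1: advance the clock before a send, receive, or internal event."""
            self.c += self.d
            return self.c

        def on_send(self):
            """Timestamp to piggyback on an outgoing message."""
            return self.tick()

        def on_receive(self, c_msg):
            """R2: take max(Ci, Cmsg), then execute R1, then deliver."""
            self.c = max(self.c, c_msg)
            return self.tick()

    # Example: p1 sends a message to p2.
    p1, p2 = LamportClock(), LamportClock()
    ts = p1.on_send()           # p1's clock becomes 1
    print(p2.on_receive(ts))    # p2's clock becomes max(0, 1) + 1 = 2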

14
Fill in the logical clock values
15
Correctness of the Lamport Clock
  • Does the Lamport clock satisfy the clock
    consistency condition?
  • Does the Lamport clock satisfy the strong clock
    consistency condition?

16
Logical Clock Protocols
  • The Lamport Clock is an example of a logical
    clock protocol. There are others.
  • The Lamport Clock is a scalar clock: it uses a
    single integer to represent the clock value.

17
Lamport clock paper
  • PODC Influential Paper Award 2000:
    http://www.podc.org/influential/2000.html
  • Time, clocks, and the ordering of events in a
    distributed system by Leslie Lamport, obtainable
    from the ACM Digital Library.

18
An application of scalar logical time: the bank
system algorithm
  • See bank system algorithm slides

19
Vector Logical Clock
  • Developed by several people independently.
  • Each Pi of the n participating processes
    maintains an integer vector (array) of size n,
  • vti[1..n], where vti[i] is the local logical
    clock of Pi,
  • and vti[j] represents Pi's latest knowledge of
    Pj's local time.

20
Vector clock protocol
  • At process Pi:
  • Before executing an event, Pi updates its local
    logical time as follows:
  • vti[i] = vti[i] + d (d > 0)
  • Each sender process piggybacks its vector clock
    value on a message m at sending time. Upon
    receiving such a message (m, vt), Pi updates its
    vector clock as follows:
  • For 1 <= k <= n: vti[k] = max(vti[k], vt[k])
  • vti[i] = vti[i] + d (d > 0)
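  • A minimal Python sketch of the two update rules
    above, assuming the n processes are numbered
    0..n-1 and d = 1 (the class and method names are
    illustrative):

    class VectorClock:
        """Vector clock for process i of n, following the protocol above."""

        def __init__(self, i, n):
            self.i = i
            self.vt = [0] * n   # vt[i] is the local clock; vt[j] is knowledge of Pj

        def tick(self):
            """Increment the local component before executing an event."""
            self.vt[self.i] += 1

        def on_send(self):
            """Piggyback a copy of the vector clock on the outgoing message."""
            self.tick()
            return list(self.vt)

        def on_receive(self, vt_msg):
            """Componentwise max with the received vector, then increment."""
            self.vt = [max(a, b) for a, b in zip(self.vt, vt_msg)]
            self.tick()

    # Example with three processes: P0 sends to P1.
    p0, p1 = VectorClock(0, 3), VectorClock(1, 3)
    p1.on_receive(p0.on_send())
    print(p0.vt, p1.vt)         # [1, 0, 0] [1, 1, 0]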

21
Vector clock
  • The system of vector clocks is strongly
    consistent.
  • Every event is assigned a timestamp in such a
    manner that it satisfies the clock consistency
    condition:
  • if e1 → e2 then vt(e1) < vt(e2), using vector
    comparison,
  • where vt(ei) is the timestamp assigned to
    event ei.
  • If the protocol satisfies the following condition
    as well, then the clock is said to be strongly
    consistent:
  • if vt(e1) < vt(e2) then e1 → e2, using vector
    comparison.

22
Vector comparison
  • Given two vectors V1 and V2, both of size n:
  • V1 < V2 if V1[i] < V2[i] for i = 1, ..., n,
  • and there exists some k, 1 <= k <= n, such that
    V1[k] < V2[k].
  • Example: V1 = [1, 2, 3, 4], V2 = [2, 3, 4, 5]:
  • V1 < V2
  • Example: V1 = [1, 2, 3, 4], V2 = [2, 2, 4, 4]:
  • V1 (not) < V2
  • Example: V1 = [1, 2, 3, 4], V2 = [2, 3, 4, 1]:
  • V1 (not) < V2
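  • A small Python helper following the comparison
    rule exactly as stated on this slide (strict
    inequality in every component); note that the
    more common definition in the literature uses <=
    componentwise with at least one strictly smaller
    component:

    def vector_less(v1, v2):
        """V1 < V2 per the rule above: every component strictly smaller."""
        return len(v1) == len(v2) and all(a < b for a, b in zip(v1, v2))

    assert vector_less([1, 2, 3, 4], [2, 3, 4, 5])        # V1 < V2
    assert not vector_less([1, 2, 3, 4], [2, 2, 4, 4])    # V1 (not) < V2
    assert not vector_less([1, 2, 3, 4], [2, 3, 4, 1])    # V1 (not) < V2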

23
Vector clock
  • Because vector clocks are strongly consistent, we
    can use them to determine whether two events are
    causally related by comparing their vector time
    stamps, using vector comparison.

24
Matrix Time
  • Proposed by Michael and Fischer in 1982.
  • A process Pi maintains a matrix
  • mti[1..n, 1..n] where
  • mti[i, i] denotes the logical clock of Pi,
  • mti[i, j] denotes the latest knowledge that Pi
    has about the local clock, mtj[j, j], of Pj (row
    i is the vector clock of Pi), and
  • mti[j, k] represents what Pi knows about the
    latest knowledge that Pj has about the local
    logical clock mtk[k, k] of Pk.

25
Matrix Time Protocol
  • At process Pi:
  • 1. Before executing an event, Pi updates its
    local logical time as follows:
  • mti[i, i] = mti[i, i] + d (d > 0)
  • 2. Each sender process piggybacks its matrix
    clock value on a message m at sending time. Upon
    receiving such a message (m, mt) from Pj, Pi
    updates its matrix clock as follows:
  • for 1 <= k <= n: mti[i, k] = max(mti[i, k],
    mt[j, k])
  • for 1 <= k <= n:
  •   for 1 <= q <= n:
  •     mti[k, q] = max(mti[k, q], mt[k, q])
  • 3. mti[i, i] = mti[i, i] + d (d > 0)
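  • A minimal Python sketch of steps 1-3 above, again
    assuming processes numbered 0..n-1 and d = 1
    (names are illustrative):

    class MatrixClock:
        """Matrix clock for process i of n, following the protocol above."""

        def __init__(self, i, n):
            self.i, self.n = i, n
            self.mt = [[0] * n for _ in range(n)]   # mt[i][i] is Pi's own clock

        def tick(self):
            """Step 1: increment the local clock before executing an event."""
            self.mt[self.i][self.i] += 1

        def on_send(self):
            """Piggyback a copy of the whole matrix on the outgoing message."""
            self.tick()
            return [row[:] for row in self.mt]

        def on_receive(self, mt_msg, j):
            """Step 2: merge the matrix received from Pj; step 3: increment."""
            i, n = self.i, self.n
            for k in range(n):                      # row i absorbs Pj's row
                self.mt[i][k] = max(self.mt[i][k], mt_msg[j][k])
            for k in range(n):                      # componentwise max of the rest
                for q in range(n):
                    self.mt[k][q] = max(self.mt[k][q], mt_msg[k][q])
            self.tick()

    # Example: P0 sends to P1 (n = 3).
    p0, p1 = MatrixClock(0, 3), MatrixClock(1, 3)
    p1.on_receive(p0.on_send(), j=0)
    print(p1.mt[1])                                 # P1's vector clock row: [1, 1, 0]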

26
Matrix clock consistency
  • The system of matrix clocks is strongly
    consistent.
  • Every event is assigned a timestamp in such a
    manner that it satisfies the clock consistency
    condition:
  • if e1 → e2 then mt(e1) < mt(e2), using matrix
    comparison,
  • where mt(ei) is the timestamp assigned to
    event ei.
  • If the protocol satisfies the following condition
    as well, then the clock is said to be strongly
    consistent:
  • if mt(e1) < mt(e2) then e1 → e2, using matrix
    comparison.

27
Matrix comparison
  • Given two matrices M1 and M2, both of size n by
    n:
  • M1 < M2 if M1[i, j] < M2[i, j]
  • for i = 1, ..., n, j = 1, ..., n,
  • and there exist some k, 1 <= k <= n, and some p,
    1 <= p <= n, such that M1[k, p] < M2[k, p].
  • Because matrix clocks are strongly consistent, we
    can use them to determine whether two events are
    causally related by comparing their matrix
    timestamps.

28
An application of matrix time Wuu and Bernstein
paper
  • The dictionary problem: a dictionary is
    replicated among multiple nodes. Each node
    maintains its own view of the dictionary by
    performing operations on it independently.
  • The network may be unreliable.
  • The dictionary data must be consistent among the
    nodes.
  • Serializability (using locking) is the database
    approach to address such a problem.
  • The paper (as did other papers preceding it)
    describes an algorithm which does not require
    serializability.

29
Wuu and Bernstein protocol
  • A replicated log is used to achieve mutual
    consistency of replicated data in an unreliable
    network.
  • The log contains records of invocations of
    operations which access a data object.
  • Each node updates its local copy of the data
    object by performing the operations contained in
    its local copy of the log.
  • The operations are commutative so that the order
    in which operations are performed does not affect
    the final state of the data.

30
The problem environment
  • n nodes N1, N2, ..., Nn are connected over a
    network.
  • Each node maintains a dictionary V, a set of
    words {s1, s2, ..., sn}, stored in stable storage
    impervious to crashes.
  • Vi denotes the local view of the dictionary at
    Ni.
  • Two types of operations may be issued by any node
    to perform on the dictionary:
  • insert(x)
  • delete(x)
  • delete(x) can be invoked at Ni only if x is in
    Vi; note that the operation may be issued by
    multiple nodes.
  • insert(x) can be issued by only one node.

31
The problem environment - 2
  • The unique event which inserts x is denoted ex.
  • An event which deletes x is called an x-delete
    event.
  • If V(e) is the dictionary view at a node after
    the occurrence of event e, then x is in V(e) iff
    ex → e and there does not exist an x-delete
    event g such that g → e.

32
The log
  • Each node maintains a log of events L, and a
    distributed algorithm is employed to keep the
    dictionary views up to date.
  • An event is recorded in the log as a
    record/object containing these fields: operation,
    time, nodeID. For example:
  • (add a, 3, 2) if node 2 issued "add a" at its
    local time 3.
  • The event record describing event e is denoted
    eR:
  • eR.node is the node that issued the event, eR.op
    is the operation, and eR.time is the time at
    which the operation was issued.

33
The log
  • Nodes exchange messages containing appropriate
    portions of their individually maintained logs in
    order to achieve data consistency.
  • L(e) denotes the contents of the log at a node
    immediately after event e completes.
  • The log problem:
  • (P1) f → e iff fR is in L(e)

34
A trivial solution
  • Each node i that generates an event e adds a
    record for the event, eR, to its local log Li.
  • Each time the node sends a message, it includes
    its log Li in the message.
  • Upon receiving a message, a node j looks at the
    log enclosed in the message and applies the
    event in each record to its dictionary view Vj.
  • The logs are maintained indefinitely. If a node
    j is cut off from the network due to failures,
    its dictionary view may fall behind those of
    other nodes, but as soon as the network is
    repaired and messages can be sent to node j
    again, the events logged by other nodes will
    eventually be made known to j.

35
Trivial solution - 2
  • The trivial solution
  • is fault-tolerant, and
  • satisfies the log problem and the dictionary
    problem.
  • However, the log maintained by each node i, Li,
    grows unboundedly, which has these ramifications:
  • The entire log is sent with each message:
    excessive communication costs.
  • A new view of the dictionary is repeatedly
    computed based on the log received in each
    message: excessive computational costs.
  • The entire log is stored at each node: excessive
    storage costs.

36
Wuu and Bernstein's improved solution
  • Uses matrix time to purge event records that have
    already been seen by all participants.
  • Each node i maintains a matrix clock Ti.
  • When i receives a log which contains a record eR
    for an event e, initiated by node eR.node, it
    determines whether node k has already seen this
    record by this predicate (boolean function):
  • boolean hasrec(Ti, eR, k)
    return (Ti[k, eR.node] >= eR.time)
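  • A direct Python rendering of this predicate,
    assuming Ti is indexed as Ti[k][node] and that >=
    is the intended comparison; the record fields
    follow slide 32:

    from collections import namedtuple

    # Event record: operation, issuing node's local time, issuing node.
    EventRecord = namedtuple("EventRecord", ["op", "time", "node"])

    def hasrec(Ti, eR, k):
        """True if, according to matrix clock Ti, node k has already seen eR."""
        return Ti[k][eR.node] >= eR.time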

37
Wuu and Bernstein's improved solution, pp. 236-7
  • Kept at each node are:
  • Vi, the dictionary view, e.g. {a, b, c}
  • PLi, a partial log of events
  • Initialization:
  • Vi = PLi = {} // both start out empty
  • set the matrix clock to all 0s

38
Wuu and Bernstein's improved solution, pp. 236-7
  • When node i issues insert(x):
  • Update the matrix clock.
  • Add the event record to the partial log PLi.
  • Add x to Vi.
  • When node i issues delete(x):
  • Update the matrix clock.
  • Add the event record to the partial log PLi.
  • Delete x from Vi.

39
Wuu and Bernstein's improved solution, pp. 236-7
  • When node i sends to node k:
  • Create a subset NP of the partial log PLi,
    consisting of those entries eR such that
  • hasrec(Ti, eR, k) returns false.
  • Send NP and Ti to node k.
  • (A combined sketch of the send and receive steps
    follows the next slide.)

40
Wuu and Bernstein's improved solution, pp. 236-7
  • When node i receives from node k:
  • Extract from the received log a subset NE
    consisting of those entries eR such that
  • hasrec(Ti, eR, i) returns false.
  • These entries have not already been seen by i.
  • Update the dictionary view Vi based on NE.
  • Update the matrix clock Ti.
  • Add to the partial log PLi (note: not NE) those
    records eR in the received log such that
  • hasrec(Ti, eR, j) returns false for at least
    one j.
  • Such a record has not yet been seen by at least
    one other node.
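  • A Python sketch of the send and receive steps
    from the last two slides, reusing hasrec and
    EventRecord from above; the encoding of the op
    field as an ("insert", word) or ("delete", word)
    pair and the matrix clock merge (taken from the
    protocol on slide 25) are assumptions, not the
    paper's exact code:

    def prepare_message(Ti, PLi, k):
        """Node i sends to node k: ship only the records k has not yet seen."""
        NP = [eR for eR in PLi if not hasrec(Ti, eR, k)]
        return NP, [row[:] for row in Ti]       # send NP and a copy of Ti

    def receive_message(i, k, Ti, PLi, Vi, NP, Tk):
        """Node i receives (NP, Tk) from node k."""
        n = len(Ti)
        # NE: records node i has not already seen; apply them to the view Vi.
        for eR in NP:
            if not hasrec(Ti, eR, i):
                op, word = eR.op
                if op == "insert":
                    Vi.add(word)
                else:
                    Vi.discard(word)
        # Update the matrix clock Ti: sender's row into row i, then
        # componentwise maximum, as in the matrix clock protocol.
        for q in range(n):
            Ti[i][q] = max(Ti[i][q], Tk[k][q])
        for r in range(n):
            for q in range(n):
                Ti[r][q] = max(Ti[r][q], Tk[r][q])
        # Add received records that at least one node may still be missing.
        for eR in NP:
            if eR not in PLi and any(not hasrec(Ti, eR, j) for j in range(n)):
                PLi.append(eR)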

41
Wuu and Bernstein's improved solution, pp. 236-7
  • The size of the log sent with each message is
    minimized, based on the matrix clock.
  • The number of log entries used to update the
    local dictionary view is minimized, again based
    on the matrix clock.
  • The algorithm allows each log record to be
    maintained by at least one node, so that
    eventually that knowledge will be propagated to a
    recovered node.