Time and Global States
Transcript of presentation slides (November 2, 2005)
1
Time and Global States
  • ECEN5053 Software Engineering of Distributed
    Systems
  • University of Colorado, Boulder

2
Topics
  • Clock synchronization
  • Logical clocks
  • Global State

3
How processes can synchronize
  • Multiple processes must be able to cooperate in
    granting each other temporary exclusive access to
    a resource
  • Also, multiple processes may need to agree on the
    ordering of events, such as whether message m1
    from process P was sent before or after message
    m2 from process Q.

4
Centralized system
  • Time is unambiguous
  • If a process wants to know the time, it makes a
    system call and finds out
  • If process A asks for the time and gets it and
    then process B asks for the time and gets it, the
    time that B was told will be later than the time
    that A was told.
  • Simple, no?

5
Physical Clocks
  • Physical computer clocks are not clocks; they are
    timers
  • A quartz crystal oscillates at a well-defined
    frequency that depends on its physical properties
  • Two registers: a counter and a holding register
  • Each oscillation decrements the counter by one
  • When counter reaches zero, generates an interrupt
    and the counter is reloaded from the holding
    register
  • Each interrupt is called a clock tick
  • Interrupt service procedure adds 1 to time stored
    in memory so the software clock is kept up to date

6
The one and the many
  • What if the clock is off by a little?
  • All processes on single machine use the same
    clock so they will still be internally consistent
  • What matters is relative time
  • Impossible to guarantee that crystals in
    different computers run at exactly the same
    frequency
  • Gradually software clocks get out of synch --
    skew
  • A program that expects time to be independent of
    the machine on which it is run ... fails

7
Hey buddy, can you spare me a second?
  • To provide UTC (Coordinated Universal Time) to
    those who need precise time, NIST operates a short
    wave radio station WWV from Fort Collins, CO
  • WWV broadcasts a short pulse at the start of each
    second
  • There are stations in other countries plus
    satellites
  • Using either short wave or satellite services
    requires an accurate knowledge of the relative
    position of the sender and receiver. Why?

8
To WWV or not to WWV
  • If one computer has a WWV receiver, the goal is
    keeping all the others synchronized to it.
  • If no machines have WWV receivers, each machine
    keeps track of its own time
  • Goal -- keep all machines together as well as
    possible
  • There are many algorithms

9
Underlying model for synchronization models
  • Each machine has a timer that interrupts H times
    a second
  • Interrupt handler adds 1 to a software clock that
    keeps track of the number of ticks since some
    agreed-upon time in the past
  • Call the value of the clock C
  • Notationally, when UTC time is t, the value of
    the clock on machine p is Cp(t)
  • In a perfect world, Cp(t) = t for all p and all t

10
Back to reality
  • Theoretically, a timer with H = 60 should generate
    216,000 ticks per hour
  • Relative error is about 10^-5, meaning a
    particular machine gets a value in the range
    215,998 to 216,002
  • There is a constant called the maximum drift rate;
    a timer within specification drifts from real time
    by at most that rate
  • If two clocks are drifting in opposite directions,
    at a time delta-t after they were synchronized
    they may be as much as twice the max drift rate
    times delta-t apart
  • To differ by no more than delta, clocks must be
    resynchronized every delta/(2 x max-drift-rate)
    seconds

11
Cristian's algorithm
  • Well suited to one machine with a WWV receiver
    and a goal to have all other machines stay
    synchronized with it.
  • Call the one with the WWV receiver the time
    server
  • Periodically, each machine sends a message to the
    time server asking for the current time
  • The time server responds with C_UTC as fast as it
    can
  • 1st approximation: the requester sets its clock to
    C_UTC
  • What's wrong with that?

12
Big Trouble
  • Major problem
  • Time really should never run backward -- why?
  • If the sender's clock was fast, C_UTC will be
    smaller than the sender's current value of C
  • Change must be introduced gradually
  • If timer generates 100 interrupts/second, each
    interrupt adds 10 ms to the time
  • To slow down, ISR adds only 9 ms until correct
  • To speed up, add 11 ms at each interrupt

13
Little Trouble
  • Minor problem
  • Takes a nonzero amount of time for the time
    server's reply to get back to the sender
  • Delay may be large and vary with network load
  • Cristian attempts to measure the send and receive
    times, subtract, divide by 2, and add this to the
    received C_UTC
  • Better: also subtract the length of the time
    server's ISR, I, and incoming message processing
    time: (T1 - T0 - I)/2
  • To improve accuracy, measure several and average
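The estimate above can be sketched in a few lines; this is a minimal sketch in which `request_server_time` is a hypothetical RPC hook standing in for the real message exchange:

```python
import time

def cristian_time(request_server_time, server_processing_time=0.0):
    """Estimate the current time per Cristian's algorithm.
    request_server_time() is a hypothetical call returning the time
    server's C_UTC; server_processing_time is the server's ISR and
    message-processing time I, if known."""
    t0 = time.monotonic()              # send timestamp T0
    c_utc = request_server_time()      # server's clock value
    t1 = time.monotonic()              # receive timestamp T1
    # Assume the one-way delay is (T1 - T0 - I) / 2 and add it to C_UTC.
    return c_utc + (t1 - t0 - server_processing_time) / 2
```

In practice, as the slide suggests, several round trips would be measured and the results averaged.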

14
If no WWV Receiver
  • Berkeley UNIX algorithm
  • The time server (actually time daemon) is active,
    not passive
  • It polls every machine and asks what time it is
  • Based on answers, it computes an average time and
    tells all machines to adjust their clocks to the
    new time
  • The time daemon's time is set manually by the
    operator periodically
  • Centralized algorithm though the time daemon does
    not have a WWV receiver
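The Berkeley daemon's averaging step can be sketched as follows (a minimal sketch; function name and the clock values in the example are illustrative):

```python
def berkeley_adjustments(daemon_time, polled_times):
    """The daemon averages its own clock with the clocks reported by
    the polled machines, then tells each machine (daemon listed first)
    how much to adjust its clock by."""
    clocks = [daemon_time] + list(polled_times)
    average = sum(clocks) / len(clocks)
    return [average - c for c in clocks]

# Daemon at minute 180 (3:00), polled machines at 170 (2:50) and
# 205 (3:25): the average is 185 (3:05), so the adjustments are
# +5, +15, and -20 minutes respectively.
print(berkeley_adjustments(180, [170, 205]))
```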

15
Decentralized synchronization
  • Cristian and Berkeley UNIX are centralized
    algorithms with the usual downside. What?
  • There are several decentralized algorithms, for
    example
  • Divide time into fixed length resynchronization
    intervals
  • At the beginning of each interval, every machine
    broadcasts its current time
  • Each starts a local timer to collect all
    broadcasts arriving during a certain interval
  • Algorithm to compute a new time based on some/all

16
Internet Synchronization
  • New hardware and software technology in the past
    few years make it possible to keep millions of
    clocks synchronized to within a few ms of UTC
  • New algorithms using these synchronized clocks
    are beginning to appear
  • Synchronized clocks can be used
  • to achieve cache consistency
  • to use time-out tickets in distributed system
    authentication
  • to handle commitment in atomic transactions

17
Logical Clocks
  • See also notes from 3 weeks ago
  • For many purposes, it is sufficient that machines
    agree on the same time even if it is not the
    right time
  • Internal consistency of the clocks matters
  • Clock synchronization is possible but does not
    have to be absolute
  • If 2 processes do not interact, their clocks need
    not be synchronized; the lack of synch would not
    be seen
  • What is important is that all processes agree on
    the order in which events occur

18
Lamport timestamps
  • a happens-before b means that all processes agree
    that first event a occurs, then afterward, event
    b occurs
  • We write "a happens-before b" as a -> b
  • If a occurs before b in the same process, we say
    a -> b is true
  • If event a is the sending of a message and event b
    is the receipt of that message in another process,
    a -> b is also true, because a message cannot be
    received until after it is sent
  • happens-before is transitive

19
Ya caint say
  • If x and y happen in different processes that do
    not exchange messages, then
  • we cannot say x -> y
  • we cannot say y -> x
  • nothing can be said about when the events
    happened or which event happened first
  • we call these events concurrent

20
Invent time
  • Need a way of measuring time so that for every
    event a we can assign a time C(a) on which all
    processes agree.
  • Such that, if a -> b, then C(a) < C(b)
  • If a and b are two events in the same process and
    a happens before b, then C(a) < C(b)
  • If a is the sending of a msg by one process and b
    is the receiving of that msg by another, then
    C(a) and C(b) must be assigned so that everyone
    agrees on the values of C(a) and C(b), with
    C(a) < C(b)
  • Corrections to C can only be made by addition,
    never subtraction so that the clock time always
    goes forward

21
If a msg leaves at time N, it arrives at N+1 or later
  • Each message carries the time according to its
    sender's clock
  • When it arrives, if the receiver's clock shows a
    value prior to the time the message was sent, the
    receiver fast-forwards its clock to be 1 more
    than the sending time
  • Between every two events the clock must tick at
    least once
  • If a process sends or receives 2 messages in
    quick succession, it must advance its clock by
    (at least) 1 tick in between
  • No 2 events ever occur at exactly the same time
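The clock rules above can be sketched as a small class (a minimal sketch; the class and method names are illustrative):

```python
class LamportClock:
    """Minimal Lamport logical clock implementing the rules above."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event: between every two events the clock must
        advance at least once."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; the returned value is the timestamp
        carried by the outgoing message."""
        return self.tick()

    def receive(self, msg_time):
        """If the message's timestamp is ahead of (or equal to) the
        local clock, fast-forward to 1 more than the sending time."""
        self.time = max(self.time, msg_time) + 1
        return self.time
```

To satisfy the slide's "no 2 events ever occur at exactly the same time", implementations typically break ties by appending the process id to the timestamp.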

22
Totally-ordered Multicast
  • Consider a bank with replicated data in San
    Francisco and New York City.
  • A customer in SF wants to add $100 to the
    account, which holds $1,000
  • Meanwhile, a bank employee in NY initiates an
    update by which the customer's account will be
    increased with 1% interest
  • Due to communication delays, the instructions
    could arrive at the replicated sites in different
    orders with differing final answers
  • Should have been performed at both sites in same
    order

23
Using Lamport timestamps to get totally ordered
multicast
  • Consider group of processes multicasting messages
    to each other
  • Each message is timestamped with the current
    (logical) time of its sender
  • Conceptually, a multicast msg is also sent to its
    own sender
  • We assume msgs from the same sender are received
    in the order they were sent and that no messages
    were lost

24
totally ordered multicast (cont.)
  • When a process receives a message, it goes into a
    local queue ordered according to its timestamp
  • The receiver multicasts an acknowledgement
  • Using Lamport's algorithm for adjusting local
    clocks, the timestamp of the received msg is
    lower than the timestamp of the acknowledgement
  • All processes will eventually have the same copy
    of the local queue because each msg is multicast,
    plus acks
  • We assumed msgs are delivered in the order sent
    by sender

25

totally ordered multicast (cont. more)
  • Each process inserts a received msg in its local
    queue according to the timestamp in that msg.
  • Lamport's clocks ensure no two messages have the
    same timestamp
  • Also, the timestamps reflect a consistent global
    ordering of events
  • A process delivers a queued msg to the
    application it is running when that message is at
    the head of the queue and has been acknowledged
    by each other process
  • When the msg is removed from the queue, its
    associated acks are removed as well
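The delivery rule above can be sketched as a check on the head of the local queue (a minimal sketch; the queue layout and ack bookkeeping are assumptions, not from the slides):

```python
import heapq

def try_deliver(queue, acks, processes):
    """Deliver the head msg only if every process has acked it.
    queue: heap of (timestamp, sender_id, payload) tuples, so the
    lowest timestamp (ties broken by sender id) is at the head;
    acks: dict mapping (timestamp, sender_id) -> set of ackers;
    processes: the set of all process ids in the group."""
    if not queue:
        return None
    ts, sender, payload = queue[0]
    if acks.get((ts, sender), set()) >= processes:  # acked by everyone
        heapq.heappop(queue)                        # remove the msg
        acks.pop((ts, sender), None)                # drop its acks too
        return payload
    return None
```

Breaking ties by sender id in the tuple ordering mirrors the unique Lamport timestamps assumed on the previous slide.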

26
Vector Timestamps
  • With Lamport timestamps, nothing can be said
    about the relationship between a and b simply by
    comparing their timestamps C(a) and C(b).
  • Just because C(a) < C(b) doesn't mean a happened
    before b (remember concurrent events)
  • Consider network news where processes post
    articles and react to posted articles
  • Postings are multicast to all members
  • Want reactions delivered after associated postings

27
Will totally-ordered multicasting work?
  • That scheme does not mean that if msg B is
    delivered after msg A, B is a reaction to msg A.
    They may be completely independent.
  • What's missing?
  • If causal relationships are maintained within a
    group of processes, then receipt of a reaction to
    an article should always follow the receipt of
    the article.
  • If two items are independent, their order of
    delivery should not matter at all

28
Vector Timestamps capture causality
  • VT(a) < VT(b) means event a causally precedes
    event b.
  • Let each process Pi maintain a vector Vi such that
  • Vi[i] is the number of events that have occurred
    so far at Pi
  • If Vi[j] = k, then Pi knows that k events have
    occurred at Pj
  • We increment Vi[i] at the occurrence of each new
    event that happens at process Pi
  • Piggyback vectors on msgs that are sent. When
    Pi sends msg m, it sends its current vector along
    as a timestamp vt.

29
  • Receiver thus knows the number of events that
    have occurred at Pi
  • Receiver is also told how many events at other
    processes have taken place before Pi sent message
    m.
  • timestamp vt of m tells the receiver how many
    events in other processes have preceded m and on
    which m may causally depend
  • When Pj receives m, it adjusts its own vector by
    setting each entry Vj[k] to max{Vj[k], vt[k]}
  • The vector now reflects the number of msgs that Pj
    must receive to have at least seen the same msgs
    that preceded the sending of m
  • Vj[i] is incremented by 1, representing the event
    of receiving msg m as the next message from Pi
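The receive-side update above, following the slide's rule (entrywise max, then increment the sender's entry), can be sketched as:

```python
def vc_receive(Vj, vt, i):
    """Pj's vector update on receiving msg m from Pi with timestamp
    vt: take the entrywise max of Vj and vt, then count the receive
    as the next event from Pi by incrementing entry i."""
    merged = [max(a, b) for a, b in zip(Vj, vt)]
    merged[i] += 1
    return merged

# Pj at [2, 1, 0] receives m from P1 stamped [1, 3, 0]:
# entrywise max gives [2, 3, 0], then entry 1 becomes 4.
print(vc_receive([2, 1, 0], [1, 3, 0], 1))
```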

30
When are messages delivered?
  • Vector timestamps are used to deliver msgs when
    no causality constraints are violated.
  • When process Pi posts an article, it multicasts
    that article as a msg a with timestamp vt(a) set
    equal to Vi.
  • When another process Pj receives a, it will have
    adjusted its own vector such that
    Vj[i] > vt(a)[i]
  • Now suppose Pj posts a reaction by multicasting
    msg r with timestamp vt(r) equal to Vj. Then
    vt(r)[i] > vt(a)[i].
  • Both msg a and msg r will arrive at Pk in some
    order

31
  • When receiving r, Pk inspects timestamp vt(r) and
    decides to postpone delivery until all msgs that
    causally precede r have been received as well.
  • In particular, r is delivered only if the
    following conditions are met:
  • vt(r)[j] = Vk[j] + 1
  • vt(r)[i] <= Vk[i] for all i not equal to j
  • The first says r is the next msg Pk was expecting
    from Pj
  • The second says Pk has seen no msg not seen by Pj
    when it sent r. In particular, Pk has already
    seen message a.
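The two delivery conditions above translate directly into a predicate (a minimal sketch; the function name is illustrative):

```python
def causally_deliverable(vt_r, Vk, j):
    """Pk (with vector Vk) may deliver msg r from Pj iff:
    (1) r is the next msg Pk expects from Pj, and
    (2) Pk has already seen every msg Pj had seen when it sent r."""
    return (vt_r[j] == Vk[j] + 1 and
            all(vt_r[i] <= Vk[i]
                for i in range(len(vt_r)) if i != j))

# Article a from P0 is stamped [1, 0, 0]; P1's reaction r is stamped
# [1, 1, 0]. A process Pk with Vk = [0, 0, 0] must postpone r, but
# after delivering a (Vk = [1, 0, 0]) it may deliver r.
print(causally_deliverable([1, 1, 0], [0, 0, 0], 1))  # postponed
print(causally_deliverable([1, 1, 0], [1, 0, 0], 1))  # deliverable
```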

32
Controversy
  • There has been some debate about
  • whether support for totally-ordered and
    causally-ordered multicasting should be provided
    as part of the message-communication layer or
  • whether applications should handle ordering
  • The comm layer doesn't know what a msg contains,
    only its potential causality
  • 2 msgs from same sender will always be marked as
    causally related even if they are not
  • Application developer may not want to think about
    it

33
Global State
34
Global state of a distributed system
  • Local state of each process
  • The messages that are currently in transit (sent
    but not received)

35
Who cares, globally speaking?
  • When it is known that local computations have
    stopped and that there are no more messages in
    transit, the system has obviously entered a state
    in which no more progress can be made.
  • deadlocked?
  • correctly terminated?

36
How to record the global state
  • Distributed snapshot
  • reflects a state in which the distributed system
    might have been
  • reflects a consistent global state
  • If we have recorded that process P has received a
    msg from another process Q, then we should also
    have recorded that process Q had actually sent
    the msg
  • The reverse condition (Q has sent a msg that P
    has not yet received) is allowed.

37
Cut!
  • A cut represents the last event that has been
    recorded for each of several processes.
  • All recorded msg receipts have a corresponding
    recorded send event
  • An inconsistent cut would have a receipt of a msg
    but no corresponding send event

38
The algorithm (Chandy-Lamport)
  • Assume the distributed system can be represented
    as a collection of processes connected to each
    other through uni-directional point-to-point
    communication channels.
  • Any process, call it P, may initiate the
    algorithm.
  • P records its own local state
  • It sends a marker along each of its outgoing
    channels, indicating that the receiver should
    participate in recording the global state
  • ...

39
Chandy-Lamport algorithm, cont.
  • When process Q receives the marker through an
    incoming channel C, its action depends on whether
    or not it has already saved its local state
  • If it has not
  • it first records its local state and also sends a
    marker along its own outgoing channels
  • If it has
  • the marker on channel C is an indicator that Q
    should record the state of the channel, namely,
    the sequence of messages received by Q since the
    last time it recorded its own local state and
    before it received the marker.
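One process's part in the algorithm, as described on the last two slides, can be sketched as a small state machine (a minimal sketch; the class, field, and callback names are illustrative, and `send_marker` stands in for putting a marker on every outgoing channel):

```python
class SnapshotProcess:
    """One process's role in a Chandy-Lamport snapshot."""

    def __init__(self, local_state, incoming, send_marker):
        self.local_state = local_state    # application state
        self.incoming = incoming          # names of incoming channels
        self.send_marker = send_marker    # hook: marker on each
                                          # outgoing channel
        self.recorded_state = None        # local snapshot, once taken
        self.channel_state = {}           # channel -> msgs in transit
        self.recording = set()            # channels awaiting a marker

    def on_marker(self, channel):
        if self.recorded_state is None:
            # First marker seen: record local state, start recording
            # every incoming channel, and forward markers.
            self.recorded_state = self.local_state
            self.recording = set(self.incoming)
            self.channel_state = {c: [] for c in self.incoming}
            self.send_marker()
        # The channel this marker arrived on is now fully recorded.
        self.recording.discard(channel)

    def on_message(self, channel, msg):
        # Msgs arriving after we recorded our state but before the
        # marker on this channel were "in transit" at snapshot time.
        if self.recorded_state is not None and channel in self.recording:
            self.channel_state[channel].append(msg)

    def done(self):
        """Finished once a marker has arrived on every incoming channel."""
        return self.recorded_state is not None and not self.recording
```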

40
Chandy-Lamport algorithm, cont.
  • A process has finished its part of the algorithm
    when it has received a marker along each of its
    incoming channels and processed each one.
  • Its recorded local state as well as the state it
    recorded for each incoming channel, can be
    collected and sent to the process that initiated
    the snapshot
  • The initiator can subsequently analyze the
    current state
  • Meanwhile, the distributed system as a whole can
    continue to run normally

41
Photo album
  • Because any process can initiate the algorithm,
    the construction of several snapshots may be in
    progress at the same time
  • A marker is tagged with the identifier and
    possibly also a version number of the process
    that initiated the snapshot
  • Only after a process has received that marker
    through each of its incoming channels can it
    finish its part in the construction of the
    marker's associated snapshot

42
Application of a snapshot
43
Termination Detection
  • If a process Q receives the marker requesting a
    snapshot for the first time,
  • considers the process that sent that marker as
    its predecessor
  • When Q completes its part of the snapshot, it
    sends its predecessor a DONE msg.
  • By recursion, when the initiator of the
    distributed snapshot has received a DONE msg from
    all of its successors, it knows the snapshot has
    been completely taken

44
What if msgs still in transit?
  • A snapshot may show a global state in which msgs
    are still in transit
  • Suppose a process records that it had received
    msgs along one of its incoming channels
  • between the point where it had recorded its local
    state
  • and the point where it received the marker
    through that channel
  • Cannot conclude the distributed computation is
    completed
  • Termination requires a snapshot in which all
    channels are empty

45
Modified algorithm
  • When a process Q finishes its part of a snapshot,
    it either returns DONE or CONTINUE to its
    predecessor
  • A DONE msg is returned only when
  • All of Q's successors have returned a DONE msg
  • Q has not received any msg between the point it
    recorded its own local state and the point it had
    received the marker along each of its incoming
    channels
  • In all other cases, Q sends a CONTINUE msg to its
    predecessor

46
Modified algorithm, continued
  • The original initiator of the snapshot will
    either receive at least one CONTINUE or only DONE
    msgs from its successors
  • When only DONE messages are received, it is known
    that no regular msgs are in transit
  • Conclusion? The computation has terminated.
  • If a CONTINUE appears, P initiates another
    snapshot and continues to do so until only DONE
    msgs are returned.
  • (There are lots of other algorithms, too.)