Time and Global States
Transcript of presentation slides (November 2, 2005)
1
Time and Global States
  • ECEN5053 Software Engineering of Distributed
    Systems
  • University of Colorado, Boulder

2
Topics
  • Clock synchronization
  • Logical clocks
  • Global State

3
How processes can synchronize
  • Multiple processes must be able to cooperate in
    granting each other temporary exclusive access to
    a resource
  • Also, multiple processes may need to agree on the
    ordering of events, such as whether message m1
    from process P was sent before or after message
    m2 from process Q.

4
Centralized system
  • Time is unambiguous
  • If a process wants to know the time, it makes a
    system call and finds out
  • If process A asks for the time and gets it and
    then process B asks for the time and gets it, the
    time that B was told will be later than the time
    that A was told.
  • Simple, no?

5
Physical Clocks
  • Physical computer clocks are not clocks; they are
    timers
  • A quartz crystal oscillates at a well-defined
    frequency that depends on its physical properties
  • Two registers: a counter and a holding register
  • Each oscillation decrements the counter by one
  • When counter reaches zero, generates an interrupt
    and the counter is reloaded from the holding
    register
  • Each interrupt is called a clock tick
  • Interrupt service procedure adds 1 to time stored
    in memory so the software clock is kept up to date

6
The one and the many
  • What if the clock is off by a little?
  • All processes on single machine use the same
    clock so they will still be internally consistent
  • What matters is relative time
  • Impossible to guarantee that crystals in
    different computers run at exactly the same
    frequency
  • Gradually software clocks get out of synch --
    skew
  • A program that expects time to be independent of
    the machine on which it is run ... fails

7
Hey buddy, can you spare me a second?
  • To provide UTC (Coordinated Universal Time) to
    those who need precise time, NIST operates a short
    wave radio station WWV from Fort Collins, CO
  • WWV broadcasts a short pulse at the start of each
    second
  • There are stations in other countries plus
    satellites
  • Using either short wave or satellite services
    requires an accurate knowledge of the relative
    position of the sender and receiver. Why?

8
To WWV or not to WWV
  • If one computer has a WWV receiver, the goal is
    keeping all the others synchronized to it.
  • If no machines have WWV receivers, each machine
    keeps track of its own time
  • Goal -- keep all machines together as well as
    possible
  • There are many algorithms

9
Underlying model for synchronization models
  • Each machine has a timer that interrupts H times
    a second
  • Interrupt handler adds 1 to a software clock that
    keeps track of the number of ticks since some
    agreed-upon time in the past
  • Call the value of the clock C
  • Notationally, when UTC time is t, the value of
    the clock on machine p is Cp(t)
  • In a perfect world, Cp(t) = t for all p and all t

10
Back to reality
  • Theoretically, a timer with H = 60 should generate
    216,000 ticks per hour
  • Relative error is about 10^-5, meaning a
    particular machine gets a value in the range
    215,998 to 216,002
  • There is a constant called the maximum drift rate;
    a timer within specification drifts from real time
    by at most that rate
  • If two clocks are drifting in opposite directions,
    at a time delta-t after they were synchronized
    they may be as much as twice the max drift rate
    times delta-t apart
  • To differ by no more than delta, clocks must be
    resynchronized every delta/(2 x max-drift-rate)
    seconds

11
Cristian's algorithm
  • Well suited to one machine with a WWV receiver
    and a goal to have all other machines stay
    synchronized with it.
  • Call the one with the WWV receiver the time
    server
  • Periodically, each machine sends a message to the
    time server asking for the current time
  • The time server responds with C_UTC as fast as it
    can
  • 1st approximation: the requester sets its clock to
    C_UTC
  • What's wrong with that?

12
Big Trouble
  • Major problem
  • Time really should never run backward -- why?
  • If the sender's clock was fast, C_UTC will be
    smaller than the sender's current value of C
  • Change must be introduced gradually
  • If timer generates 100 interrupts/second, each
    interrupt adds 10 ms to the time
  • To slow down, ISR adds only 9 ms until correct
  • To speed up, add 11 ms at each interrupt

13
Little Trouble
  • Minor problem
  • Takes a nonzero amount of time for the time
    server's reply to get back to the sender
  • Delay may be large and vary with network load
  • Cristian attempts to measure the send and receive
    times, subtract, divide by 2, and add this to the
    received C_UTC
  • Better: also subtract the length of the time
    server's ISR, I, and incoming message processing
    time: (T1 - T0 - I)/2
  • To improve accuracy, measure several and average
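The estimate above can be sketched in a few lines; this is a minimal sketch in which `request_server_time` is a hypothetical RPC hook standing in for the real message exchange:

```python
import time

def cristian_time(request_server_time, server_processing_time=0.0):
    """Estimate the current time per Cristian's algorithm.
    request_server_time() is a hypothetical call returning the time
    server's C_UTC; server_processing_time is the server's ISR and
    message-processing time I, if known."""
    t0 = time.monotonic()              # send timestamp T0
    c_utc = request_server_time()      # server's clock value
    t1 = time.monotonic()              # receive timestamp T1
    # Assume the one-way delay is (T1 - T0 - I) / 2 and add it to C_UTC.
    return c_utc + (t1 - t0 - server_processing_time) / 2
```

In practice, as the slide suggests, several round trips would be measured and the results averaged.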

14
If no WWV Receiver
  • Berkeley UNIX algorithm
  • The time server (actually time daemon) is active,
    not passive
  • It polls every machine and asks what time it is
  • Based on answers, it computes an average time and
    tells all machines to adjust their clocks to the
    new time
  • The time daemon's time is set manually by the
    operator periodically
  • Centralized algorithm though the time daemon does
    not have a WWV receiver
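The Berkeley daemon's averaging step can be sketched as follows (a minimal sketch; function name and the clock values in the example are illustrative):

```python
def berkeley_adjustments(daemon_time, polled_times):
    """The daemon averages its own clock with the clocks reported by
    the polled machines, then tells each machine (daemon listed first)
    how much to adjust its clock by."""
    clocks = [daemon_time] + list(polled_times)
    average = sum(clocks) / len(clocks)
    return [average - c for c in clocks]

# Daemon at minute 180 (3:00), polled machines at 170 (2:50) and
# 205 (3:25): the average is 185 (3:05), so the adjustments are
# +5, +15, and -20 minutes respectively.
print(berkeley_adjustments(180, [170, 205]))
```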

15
Decentralized synchronization
  • Cristian and Berkeley UNIX are centralized
    algorithms with the usual downside. What?
  • There are several decentralized algorithms, for
    example
  • Divide time into fixed length resynchronization
    intervals
  • At the beginning of each interval, every machine
    broadcasts its current time
  • Each starts a local timer to collect all
    broadcasts arriving during a certain interval
  • Algorithm to compute a new time based on some/all

16
Internet Synchronization
  • New hardware and software technology in the past
    few years make it possible to keep millions of
    clocks synchronized to within a few ms of UTC
  • New algorithms using these synchronized clocks
    are beginning to appear
  • Synchronized clocks can be used
  • to achieve cache consistency
  • to use time-out tickets in distributed system
    authentication
  • to handle commitment in atomic transactions

17
Logical Clocks
  • See also notes from 3 weeks ago
  • For many purposes, it is sufficient that machines
    agree on the same time even if it is not the
    right time
  • Internal consistency of the clocks matters
  • Clock synchronization is possible but does not
    have to be absolute
  • If 2 processes do not interact, their clocks need
    not be synchronized; the lack of synch would not
    be seen
  • What is important is that all processes agree on
    the order in which events occur

18
Lamport timestamps
  • a happens-before b means that all processes agree
    that first event a occurs, then afterward, event
    b occurs
  • We write "a happens-before b" as a -> b
  • If a occurs before b in the same process, we say
    a -> b is true
  • If event a is the sending of a message and event b
    is the receipt of that message in another process,
    a -> b is also true, because a message cannot be
    received until after it is sent
  • happens-before is transitive

19
Ya caint say
  • If x and y happen in different processes that do
    not exchange messages, then
  • we cannot say x -> y
  • we cannot say y -> x
  • nothing can be said about when the events
    happened or which event happened first
  • we call these events concurrent

20
Invent time
  • Need a way of measuring time so that for every
    event a we can assign a time C(a) on which all
    processes agree.
  • Such that, if a -> b, then C(a) < C(b)
  • If a and b are two events in the same process and
    a happens before b, then C(a) < C(b)
  • If a is the sending of a msg by one process and b
    is the receiving of that msg by another, then
    C(a) and C(b) must be assigned so that everyone
    agrees on the values of C(a) and C(b), with
    C(a) < C(b)
  • Corrections to C can only be made by addition,
    never subtraction so that the clock time always
    goes forward

21
If a msg leaves at time N, it arrives at N+1 or later
  • Each message carries the time according to its
    sender's clock
  • When it arrives, if the receiver's clock shows a
    value prior to the time the message was sent, the
    receiver fast-forwards its clock to be 1 more
    than the sending time
  • Between every two events the clock must tick at
    least once
  • If a process sends or receives 2 messages in
    quick succession, it must advance its clock by
    (at least) 1 tick in between
  • No 2 events ever occur at exactly the same time
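The clock rules above can be sketched as a small class (a minimal sketch; the class and method names are illustrative):

```python
class LamportClock:
    """Minimal Lamport logical clock implementing the rules above."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event: between every two events the clock must
        advance at least once."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; the returned value is the timestamp
        carried by the outgoing message."""
        return self.tick()

    def receive(self, msg_time):
        """If the message's timestamp is ahead of (or equal to) the
        local clock, fast-forward to 1 more than the sending time."""
        self.time = max(self.time, msg_time) + 1
        return self.time
```

To satisfy the slide's "no 2 events ever occur at exactly the same time", implementations typically break ties by appending the process id to the timestamp.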

22
Totally-ordered Multicast
  • Consider a bank with replicated data in San
    Francisco and New York City.
  • A customer in SF wants to add $100 to the
    account, which holds $1,000
  • Meanwhile, a bank employee in NY initiates an
    update by which the customer's account will be
    increased with 1% interest
  • Due to communication delays, the instructions
    could arrive at the replicated sites in different
    orders with differing final answers
  • Should have been performed at both sites in same
    order

23
Using Lamport timestamps to get totally ordered
multicast
  • Consider group of processes multicasting messages
    to each other
  • Each message is timestamped with the current
    (logical) time of its sender
  • Conceptually, a multicast msg is also sent to its
    own sender
  • We assume msgs from the same sender are received
    in the order they were sent and that no messages
    were lost

24
totally ordered multicast (cont.)
  • When a process receives a message, it goes into a
    local queue ordered according to its timestamp
  • The receiver multicasts an acknowledgement
  • Using Lamport's algorithm for adjusting local
    clocks, the timestamp of the received msg is
    lower than the timestamp of the acknowledgement
  • All processes will eventually have the same copy
    of the local queue because each msg is multicast,
    plus acks
  • We assumed msgs are delivered in the order sent
    by sender

25

totally ordered multicast (cont. more)
  • Each process inserts a received msg in its local
    queue according to the timestamp in that msg.
  • Lamport's clocks ensure no two messages have the
    same timestamp
  • Also, the timestamps reflect a consistent global
    ordering of events
  • A process delivers a queued msg to the
    application it is running when that message is at
    the head of the queue and has been acknowledged
    by each other process
  • When the msg is removed from the queue, its
    associated acks are removed as well
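The delivery rule above can be sketched as a check on the head of the local queue (a minimal sketch; the queue layout and ack bookkeeping are assumptions, not from the slides):

```python
import heapq

def try_deliver(queue, acks, processes):
    """Deliver the head msg only if every process has acked it.
    queue: heap of (timestamp, sender_id, payload) tuples, so the
    lowest timestamp (ties broken by sender id) is at the head;
    acks: dict mapping (timestamp, sender_id) -> set of ackers;
    processes: the set of all process ids in the group."""
    if not queue:
        return None
    ts, sender, payload = queue[0]
    if acks.get((ts, sender), set()) >= processes:  # acked by everyone
        heapq.heappop(queue)                        # remove the msg
        acks.pop((ts, sender), None)                # drop its acks too
        return payload
    return None
```

Breaking ties by sender id in the tuple ordering mirrors the unique Lamport timestamps assumed on the previous slide.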

26
Vector Timestamps
  • With Lamport timestamps, nothing can be said
    about the relationship between a and b simply by
    comparing their timestamps C(a) and C(b).
  • Just because C(a) < C(b) doesn't mean a happened
    before b (remember concurrent events)
  • Consider network news where processes post
    articles and react to posted articles
  • Postings are multicast to all members
  • Want reactions delivered after associated postings

27
Will totally-ordered multicasting work?
  • That scheme does not mean that if msg B is
    delivered after msg A, B is a reaction to msg A.
    They may be completely independent.
  • What's missing?
  • If causal relationships are maintained within a
    group of processes, then receipt of a reaction to
    an article should always follow the receipt of
    the article.
  • If two items are independent, their order of
    delivery should not matter at all

28
Vector Timestamps capture causality
  • VT(a) < VT(b) means event a causally precedes
    event b.
  • Let each process Pi maintain a vector Vi such that
  • Vi[i] is the number of events that have occurred
    so far at Pi
  • If Vi[j] = k, then Pi knows that k events have
    occurred at Pj
  • We increment Vi[i] at the occurrence of each new
    event that happens at process Pi
  • Piggyback vectors on msgs that are sent. When
    Pi sends msg m, it sends its current vector along
    as a timestamp vt.

29
  • Receiver thus knows the number of events that
    have occurred at Pi
  • Receiver is also told how many events at other
    processes have taken place before Pi sent message
    m.
  • timestamp vt of m tells the receiver how many
    events in other processes have preceded m and on
    which m may causally depend
  • When Pj receives m, it adjusts its own vector by
    setting each entry Vj[k] to max{Vj[k], vt[k]}
  • The vector now reflects the number of msgs that Pj
    must receive to have at least seen the same msgs
    that preceded the sending of m
  • Vj[i] is incremented by 1, representing the event
    of receiving msg m as the next message from Pi
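The receive-side update above, following the slide's rule (entrywise max, then increment the sender's entry), can be sketched as:

```python
def vc_receive(Vj, vt, i):
    """Pj's vector update on receiving msg m from Pi with timestamp
    vt: take the entrywise max of Vj and vt, then count the receive
    as the next event from Pi by incrementing entry i."""
    merged = [max(a, b) for a, b in zip(Vj, vt)]
    merged[i] += 1
    return merged

# Pj at [2, 1, 0] receives m from P1 stamped [1, 3, 0]:
# entrywise max gives [2, 3, 0], then entry 1 becomes 4.
print(vc_receive([2, 1, 0], [1, 3, 0], 1))
```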

30
When are messages delivered?
  • Vector timestamps are used to deliver msgs when
    no causality constraints are violated.
  • When process Pi posts an article, it multicasts
    that article as a msg a with timestamp vt(a) set
    equal to Vi.
  • When another process Pj receives a, it will have
    adjusted its own vector such that
    Vj[i] > vt(a)[i]
  • Now suppose Pj posts a reaction by multicasting
    msg r with timestamp vt(r) equal to Vj. Then
    vt(r)[i] > vt(a)[i].
  • Both msg a and msg r will arrive at Pk in some
    order

31
  • When receiving r, Pk inspects timestamp vt(r) and
    decides to postpone delivery until all msgs that
    causally precede r have been received as well.
  • In particular, r is delivered only if the
    following conditions are met:
  • vt(r)[j] = Vk[j] + 1
  • vt(r)[i] <= Vk[i] for all i not equal to j
  • The first says r is the next msg Pk was expecting
    from Pj
  • The second says Pk has seen no msg not seen by Pj
    when it sent r. In particular, Pk has already
    seen message a.
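The two delivery conditions above translate directly into a predicate (a minimal sketch; the function name is illustrative):

```python
def causally_deliverable(vt_r, Vk, j):
    """Pk (with vector Vk) may deliver msg r from Pj iff:
    (1) r is the next msg Pk expects from Pj, and
    (2) Pk has already seen every msg Pj had seen when it sent r."""
    return (vt_r[j] == Vk[j] + 1 and
            all(vt_r[i] <= Vk[i]
                for i in range(len(vt_r)) if i != j))

# Article a from P0 is stamped [1, 0, 0]; P1's reaction r is stamped
# [1, 1, 0]. A process Pk with Vk = [0, 0, 0] must postpone r, but
# after delivering a (Vk = [1, 0, 0]) it may deliver r.
print(causally_deliverable([1, 1, 0], [0, 0, 0], 1))  # postponed
print(causally_deliverable([1, 1, 0], [1, 0, 0], 1))  # deliverable
```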

32
Controversy
  • There has been some debate about
  • whether support for totally-ordered and
    causally-ordered multicasting should be provided
    as part of the message-communication layer or
  • whether applications should handle ordering
  • The comm layer doesn't know what a msg contains,
    only its potential causality
  • 2 msgs from same sender will always be marked as
    causally related even if they are not
  • Application developer may not want to think about
    it

33
Global State
34
Global state of a distributed system
  • Local state of each process
  • The messages that are currently in transit (sent
    but not received)

35
Who cares, globally speaking?
  • When it is known that local computations have
    stopped and that there are no more messages in
    transit, the system has obviously entered a state
    in which no more progress can be made.
  • deadlocked?
  • correctly terminated?

36
How to record the global state
  • Distributed snapshot
  • reflects a state in which the distributed system
    might have been
  • reflects a consistent global state
  • If we have recorded that process P has received a
    msg from another process Q, then we should also
    have recorded that process Q had actually sent
    the msg
  • The reverse condition (Q has sent a msg that P
    has not yet received) is allowed.

37
Cut!
  • A cut represents the last event that has been
    recorded for each of several processes.
  • All recorded msg receipts have a corresponding
    recorded send event
  • An inconsistent cut would have a receipt of a msg
    but no corresponding send event

38
The algorithm (Chandy-Lamport)
  • Assume the distributed system can be represented
    as a collection of processes connected to each
    other through uni-directional point-to-point
    communication channels.
  • Any process, call it P, may initiate the
    algorithm.
  • P records its own local state
  • It sends a marker along each of its outgoing
    channels, indicating that the receiver should
    participate in recording the global state
  • ...

39
Chandy-Lamport algorithm, cont.
  • When process Q receives the marker through an
    incoming channel C, its action depends on whether
    or not it has already saved its local state
  • If it has not
  • it first records its local state and also sends a
    marker along its own outgoing channels
  • If it has
  • the marker on channel C is an indicator that Q
    should record the state of the channel, namely,
    the sequence of messages received by Q since the
    last time it recorded its own local state and
    before it received the marker.
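One process's part in the algorithm, as described on the last two slides, can be sketched as a small state machine (a minimal sketch; the class, field, and callback names are illustrative, and `send_marker` stands in for putting a marker on every outgoing channel):

```python
class SnapshotProcess:
    """One process's role in a Chandy-Lamport snapshot."""

    def __init__(self, local_state, incoming, send_marker):
        self.local_state = local_state    # application state
        self.incoming = incoming          # names of incoming channels
        self.send_marker = send_marker    # hook: marker on each
                                          # outgoing channel
        self.recorded_state = None        # local snapshot, once taken
        self.channel_state = {}           # channel -> msgs in transit
        self.recording = set()            # channels awaiting a marker

    def on_marker(self, channel):
        if self.recorded_state is None:
            # First marker seen: record local state, start recording
            # every incoming channel, and forward markers.
            self.recorded_state = self.local_state
            self.recording = set(self.incoming)
            self.channel_state = {c: [] for c in self.incoming}
            self.send_marker()
        # The channel this marker arrived on is now fully recorded.
        self.recording.discard(channel)

    def on_message(self, channel, msg):
        # Msgs arriving after we recorded our state but before the
        # marker on this channel were "in transit" at snapshot time.
        if self.recorded_state is not None and channel in self.recording:
            self.channel_state[channel].append(msg)

    def done(self):
        """Finished once a marker has arrived on every incoming channel."""
        return self.recorded_state is not None and not self.recording
```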

40
Chandy-Lamport algorithm, cont.
  • A process has finished its part of the algorithm
    when it has received a marker along each of its
    incoming channels and processed each one.
  • Its recorded local state as well as the state it
    recorded for each incoming channel, can be
    collected and sent to the process that initiated
    the snapshot
  • The initiator can subsequently analyze the
    current state
  • Meanwhile, the distributed system as a whole can
    continue to run normally

41
Photo album
  • Because any process can initiate the algorithm,
    the construction of several snapshots may be in
    progress at the same time
  • A marker is tagged with the identifier and
    possibly also a version number of the process
    that initiated the snapshot
  • Only after a process has received that marker
    through each of its incoming channels can it
    finish its part in the construction of the
    marker's associated snapshot

42
Application of a snapshot
43
Termination Detection
  • If a process Q receives the marker requesting a
    snapshot for the first time,
  • considers the process that sent that marker as
    its predecessor
  • When Q completes its part of the snapshot, it
    sends its predecessor a DONE msg.
  • By recursion, when the initiator of the
    distributed snapshot has received a DONE msg from
    all of its successors, it knows the snapshot has
    been completely taken

44
What if msgs still in transit?
  • A snapshot may show a global state in which msgs
    are still in transit
  • Suppose a process records that it had received
    msgs along one of its incoming channels
  • between the point where it had recorded its local
    state
  • and the point where it received the marker
    through that channel
  • Cannot conclude the distributed computation is
    completed
  • Termination requires a snapshot in which all
    channels are empty

45
Modified algorithm
  • When a process Q finishes its part of a snapshot,
    it either returns DONE or CONTINUE to its
    predecessor
  • A DONE msg is returned only when
  • All of Q's successors have returned a DONE msg
  • Q has not received any msg between the point it
    recorded its own local state and the point it had
    received the marker along each of its incoming
    channels
  • In all other cases, Q sends a CONTINUE msg to its
    predecessor

46
Modified algorithm, continued
  • The original initiator of the snapshot will
    either receive at least one CONTINUE or only DONE
    msgs from its successors
  • When only DONE messages are received, it is known
    that no regular msgs are in transit
  • Conclusion? The computation has terminated.
  • If a CONTINUE appears, P initiates another
    snapshot and continues to do so until only DONE
    msgs are returned.
  • (There are lots of other algorithms, too.)