CS556: Distributed Systems - PowerPoint PPT Presentation

1
CS-556 Distributed Systems
Synchronization (I)
  • Manolis Marazakis
  • maraz_at_csd.uoc.gr

2
The issue of Time in distributed systems
  • A quantity that we often have to measure
    accurately
  • necessary to synchronize a node's clock with an
    authoritative external source of time
  • E.g. timestamps for electronic transactions
  • at both merchants' & banks' computers
  • auditing
  • An important theoretical construct in
    understanding how distributed executions unfold
  • Algorithms for several problems depend upon clock
    synchronization
  • timestamp-based serialization of transactions for
    consistent updates of distributed data
  • Kerberos authentication protocol
  • elimination of duplicate updates

3
Clock Synchronization
  • When each machine has its own clock, an event
    that occurred after another event may
    nevertheless be assigned an earlier time.

4
Fundamental limits
The notion of physical time is problematic in
distributed systems: there are limits on our
ability to timestamp events at different nodes
accurately enough to know the order in
which any pair of events occurred, or whether
they occurred simultaneously.
5
History of process $p_i$
  • $e \to_i e'$
  • total ordering of events at process $p_i$
  • Assuming that the process executes on a single
    processor
  • $\mathit{history}(p_i) = h_i = \langle e_i^0, e_i^1, e_i^2, \ldots \rangle$
  • series of events that take place within $p_i$
  • $H_i(t)$: hardware clock value (driven by an oscillator)
  • $C_i(t)$: software clock value (generated by the OS)
  • $C_i(t) = a H_i(t) + b$
  • E.g. nanoseconds elapsed at time $t$ since a reference
    time
  • Clock resolution: period between updates of $C_i(t)$
  • limit on determining the order of events

6
Clock skew & drift
  • Skew: instantaneous difference between the readings
    of two clocks
  • Drift: different rates of counting time
  • physical variations of underlying oscillators
  • variance with temperature
  • Even extremely small differences accumulate over
    a large number of oscillations
  • leading to an observable difference in the counters
  • Drift rate: difference in reading between a clock
    and a nominal perfect clock per unit of time
    measured by the reference clock
  • $10^{-6}$ seconds/sec for quartz crystals
  • $10^{-7}$ to $10^{-8}$ seconds/sec for high-precision quartz
    crystals
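
The drift rates above translate directly into how often clocks must be re-synchronized. The following is a small arithmetic sketch, not part of the original slides; the function name and the 1 ms target are illustrative assumptions.

```python
SECONDS_PER_DAY = 86_400

def seconds_until_skew(drift_rate: float, max_skew: float) -> float:
    """Reference time (s) that passes before a clock with the given drift rate
    (seconds of error per second) accumulates max_skew seconds of error."""
    return max_skew / drift_rate

QUARTZ_DRIFT = 1e-6  # seconds/sec, the figure quoted for ordinary quartz crystals

# One quartz clock accumulates about 1 s of error in 10^6 s (roughly 11.6 days).
days_per_second_of_error = seconds_until_skew(QUARTZ_DRIFT, 1.0) / SECONDS_PER_DAY

# Two such clocks drifting in opposite directions diverge at up to twice the
# drift rate, so keeping them within 1 ms of each other (an assumed target)
# needs a re-sync at least every ~500 s.
resync_interval = seconds_until_skew(2 * QUARTZ_DRIFT, 0.001)

print(f"{days_per_second_of_error:.1f} days per second of error")
print(f"re-sync every {resync_interval:.0f} s for 1 ms mutual skew")
```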

7
UTC: Coordinated Universal Time
  • Atomic oscillators
  • drift rate $10^{-13}$ seconds/second
  • International Atomic Time (since 1967)
  • 1 standard second = 9,192,631,770 periods of
    transition for Cs-133
  • Astronomical Time: years, seconds, ...
  • UTC: a leap second is occasionally inserted, or more
    rarely deleted, to keep in step with Astronomical
    Time
  • time signals broadcast from land-based radio
    stations (WWV) and satellites (GPS)
  • accuracy: 0.1-10 millisec (land-based), 1
    microsec (GPS)

8
Synchronization of physical clocks
  • $D$: synchronization bound ($D > 0$)
  • $S$: source of UTC time, $t \in I$ (an interval of real time)
  • External synchronization
  • $|S(t) - C_i(t)| < D$
  • Clocks are accurate within the bound $D$
  • Internal synchronization
  • $|C_i(t) - C_j(t)| < D$
  • Clocks agree within the bound $D$
  • external sync (bound $D$) $\Rightarrow$ internal sync (bound $2D$)
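
The two definitions can be checked mechanically for a set of clock readings taken at the same instant. This is a minimal sketch; the function names and the sample readings are assumptions made for illustration.

```python
from itertools import combinations

def externally_synchronized(source: float, clocks: list[float], d: float) -> bool:
    """True if every clock is within d of the UTC source: |S(t) - C_i(t)| < d."""
    return all(abs(source - c) < d for c in clocks)

def internally_synchronized(clocks: list[float], d: float) -> bool:
    """True if every pair of clocks agrees within d: |C_i(t) - C_j(t)| < d."""
    return all(abs(ci - cj) < d for ci, cj in combinations(clocks, 2))

utc = 100.000                         # hypothetical reading of the source S
readings = [99.996, 100.004, 100.001]  # hypothetical readings of C_1..C_3

print(externally_synchronized(utc, readings, 0.005))  # each clock within 5 ms of S
print(internally_synchronized(readings, 0.005))       # but two clocks are 8 ms apart
print(internally_synchronized(readings, 0.010))       # they do agree within 2 * 5 ms
```

Two clocks that are each within D of the source can sit on opposite sides of it, which is why external synchronization only guarantees internal agreement within 2D.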

9
Correctness of clocks
  • Hardware correctness
  • $(1 - \rho)(t' - t) \le H(t') - H(t) \le (1 + \rho)(t' - t)$,
    for real times $t' > t$ and drift bound $\rho$
  • There can be no jumps in the value of H/W clocks
  • Monotonicity
  • $t' > t \Rightarrow C(t') > C(t)$
  • A clock only ever advances
  • Even if a clock is running fast, we only need to
    change the rate at which updates are made to the time
    given to apps
  • can be achieved in software: $C_i(t) = a H_i(t) + b$
  • Hybrid
  • monotonicity + drift rate bounded between sync.
    points (where clock value can jump ahead)
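
One way to keep a software clock monotonic while correcting it is to change its rate for a while instead of stepping it. The sketch below assumes the linear form C(t) = a·H(t) + b; the function name `slew` and the idea of spreading the correction over a fixed `period` of hardware time are illustrative choices, not something the slides prescribe.

```python
def slew(a: float, b: float, h: float, delta: float, period: float) -> tuple[float, float]:
    """Return new (a, b) for C = a*H + b so that the correction `delta`
    (negative when the clock is fast) is absorbed gradually over `period`
    units of hardware time, starting at hardware reading `h`."""
    a_new = a + delta / period
    if a_new <= 0:
        # The clock would stop or run backwards: monotonicity would be violated.
        raise ValueError("slew period too short to stay monotonic")
    # Choose b so that the clock value is continuous at the switch-over point h.
    b_new = (a - a_new) * h + b
    return a_new, b_new

# Hypothetical case: the clock is 2 s fast at hardware reading 1000.
a2, b2 = slew(a=1.0, b=0.0, h=1000.0, delta=-2.0, period=100.0)
print(a2 * 1000.0 + b2)  # still ~1000.0 at the switch-over: no jump
print(a2 * 1100.0 + b2)  # ~1098.0 after the period: the 2 s have been absorbed
```

After the period has elapsed, the rate would normally be restored to its usual value with `b` again chosen for continuity.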

10
Synchronous systems
  • P1 sends its local clock value t to P2
  • P2 can set its clock value to $(t + T_{\mathrm{transmit}})$
  • $T_{\mathrm{transmit}}$ can be variable or unknown
  • resource competition between processes
  • network congestion
  • $u = (\mathit{max} - \mathit{min})$
  • uncertainty in $T_{\mathrm{transmit}}$
  • a skew of up to $u$ is obtained if P2 sets its clock to
    $(t + \mathit{min})$ or $(t + \mathit{max})$
  • If P2 sets its clock value to $t + (\mathit{max} + \mathit{min})/2$,
    then skew $\le u/2$
  • Optimal bound for $N$ processes: $u(1 - 1/N)$

In asynchronous systems: $T_{\mathrm{transmit}} = \mathit{min} + x$,
where $x \ge 0$. Only the distribution of $x$ may be
measurable, for a given installation.
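
The midpoint rule for a synchronous system can be written out as a few lines of arithmetic. This is a sketch under the assumption that the bounds min and max on the transmission time are known; the function names and the sample numbers are hypothetical.

```python
def midpoint_sync(t: float, t_min: float, t_max: float) -> tuple[float, float]:
    """Return (value P2 sets its clock to, worst-case skew) when P2 receives
    P1's clock value t and the transmission time lies in [t_min, t_max]."""
    u = t_max - t_min                  # uncertainty in the transmission time
    return t + (t_min + t_max) / 2, u / 2

def optimal_bound(u: float, n: int) -> float:
    """Best achievable skew bound for n processes, as quoted above: u(1 - 1/N)."""
    return u * (1 - 1 / n)

# Hypothetical numbers: transmission takes between 2 ms and 6 ms.
setting, bound = midpoint_sync(10.0, 0.002, 0.006)
print(setting, bound)           # about 10.004 and 0.002
print(optimal_bound(0.004, 4))  # about 0.003 for four processes
```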
11
Clock Synchronization Algorithms
  • The relation between clock time and UTC when
    clocks tick at different rates.

12
Time servers: Cristian's algorithm
  • Time server S has a receiver of UTC signals
  • $T_{\mathrm{round}}$: total round-trip time
  • $t$: time value in message $m_t$
  • estimate: $(t + T_{\mathrm{round}}/2)$
13
Cristian's Algorithm
  • Getting the current time from a time server.

14
Limitations of Cristian's algorithm
  • Variability in estimate of $T_{\mathrm{round}}$
  • can be reduced by repeated requests to S, taking
    the minimum value of $T_{\mathrm{round}}$
  • Single point of failure
  • group of synchronized time servers
  • multicast request, use only the 1st reply obtained
  • Faulty clocks
  • $f$ faulty clocks, $N$ servers
  • $N > 3f$, for the correct clocks to achieve
    agreement
  • Malicious interference
  • Protection by authentication techniques
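
The first remedy, repeating the request and keeping the exchange with the smallest round-trip time, might look like the sketch below. The `Sample` record and its field names are assumptions; each sample is turned into an offset relative to the local clock so that samples taken at different moments remain comparable.

```python
from typing import NamedTuple

class Sample(NamedTuple):
    received_at: float  # local clock value when the reply arrived
    t: float            # server time carried in the reply
    t_round: float      # measured round-trip time for this exchange

def best_offset(samples: list[Sample]) -> float:
    """Offset to add to the local clock, taken from the sample with the
    smallest round-trip time (the least uncertain one)."""
    best = min(samples, key=lambda s: s.t_round)
    return (best.t + best.t_round / 2) - best.received_at

# Three hypothetical exchanges; the second has the shortest round trip.
samples = [
    Sample(received_at=10.0, t=15.020, t_round=0.080),
    Sample(received_at=11.0, t=16.004, t_round=0.010),
    Sample(received_at=12.0, t=17.030, t_round=0.050),
]
offset = best_offset(samples)
print(offset)  # about 5.009
```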

15
The Berkeley algorithm (I)
  • Gusella & Zatti (1989)
  • Co-ordinator (master) periodically polls slaves
  • estimates each slave's local clock (based on RTT)
  • averages the values obtained (incl. its own clock
    value)
  • ignores any occasional readings with RTT higher
    than a maximum
  • Slaves are notified of the adjustment required
  • This amount can be positive or negative
  • Sending the updated current time would introduce
    further uncertainty, due to message transmit
    delay
  • Elimination of faulty clocks
  • averaging over clocks that do not differ from one
    another more than a specified amount
  • Election of new master, in case of failure
  • no guarantee for election to complete in bounded
    time

16
The Berkeley Algorithm (II)
  • The time daemon asks all the other machines for
    their clock values
  • The machines answer
  • The time daemon tells everyone how to adjust
    their clock
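
The averaging and adjustment steps can be sketched as below. Polling and RTT correction are left out: the input is assumed to be the master's estimate of each clock, in minutes. Picking "the largest group of clocks that lie within `max_diff` of one another" is one possible reading of the fault-elimination rule, and the machine names are hypothetical.

```python
def berkeley_adjustments(readings: dict[str, float], max_diff: float) -> dict[str, float]:
    """Return the adjustment (positive or negative) each machine should apply.
    The average is taken over the largest group of clock values that differ
    from one another by at most max_diff."""
    values = sorted(readings.values())
    best: list[float] = []
    lo = 0
    for hi in range(len(values)):
        # Shrink the window until all values in it are within max_diff.
        while values[hi] - values[lo] > max_diff:
            lo += 1
        if hi - lo + 1 > len(best):
            best = values[lo:hi + 1]
    average = sum(best) / len(best)
    # Send the required adjustment, not the new time, to avoid adding
    # transmission-delay uncertainty.
    return {name: average - value for name, value in readings.items()}

# Clock values in minutes since midnight: daemon 3:00, others 3:25 and 2:50.
adjustments = berkeley_adjustments({"daemon": 180.0, "m1": 205.0, "m2": 170.0}, max_diff=60.0)
print(adjustments)  # {'daemon': 5.0, 'm1': -20.0, 'm2': 15.0}

# A wildly wrong clock is excluded from the average but still told how to adjust.
with_faulty = berkeley_adjustments(
    {"daemon": 180.0, "m1": 205.0, "m2": 170.0, "m3": 600.0}, max_diff=60.0
)
```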

17
Averaging algorithms
  • Divide time into fixed-length re-synchronization
    intervals $[T_0 + iR, T_0 + (i+1)R]$
  • At the beginning of an interval, each machine
    broadcasts the current time according to its
    clock
  • and starts a local timer to collect all
    incoming broadcasts during a time interval S
  • When the broadcasts have been received, a new
    time value is computed
  • Average
  • Average after discarding the m lowest and the m
    highest values
  • tolerate up to m faulty machines
  • May also correct each value based on estimate of
    propagation time from the source machine
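
The variant that discards the m lowest and m highest values amounts to a trimmed mean. A minimal sketch, with a hypothetical function name and sample readings:

```python
def trimmed_average(times: list[float], m: int) -> float:
    """Average the collected clock values after discarding the m lowest and
    the m highest, so that up to m faulty machines cannot skew the result."""
    if len(times) <= 2 * m:
        raise ValueError("need more than 2*m readings")
    kept = sorted(times)[m:len(times) - m]
    return sum(kept) / len(kept)

# Five hypothetical broadcasts, one of them from a faulty clock (500.0).
new_time = trimmed_average([100.0, 101.0, 99.0, 500.0, 100.0], m=1)
print(new_time)  # about 100.33: the 500.0 and 99.0 readings are discarded
```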

18
NTP: an Internet-scale time protocol
  • Statistical filtering of timing data
  • discrimination based on quality of data from
    different servers
  • Re-configurable inter-server connections
  • logical hierarchy
  • Scalable for both clients & servers
  • Clients can re-sync. frequently to offset drift
  • Authentication of trusted servers
  • and also validation of return addresses

Sync. accuracy: 10s of milliseconds over
Internet paths, 1 millisecond on LANs
19
NTP synchronization subnets
  • Primary servers (connected directly to a UTC source)
    occupy stratum 1
  • Higher stratum $\Rightarrow$ server more liable to be less
    accurate
  • Node $\to$ root RTT as a quality criterion
  • 3 modes of synchronization
  • multicast: acceptable for high-speed LAN
  • procedure-call: similar to Cristian's algorithm
  • symmetric: between a pair of servers
  • All modes rely on UDP messages.

20
Message pairs between NTP peers (I)
  • Each message contains the local times when the
    previous message was sent & received, and the
    local time when the current message was sent.
  • There can be a non-negligible delay between the
    arrival of one message & the dispatch of the next.
  • Messages may be lost

Offset $o_i$: estimate of the actual offset between
two clocks, as computed from a pair of
messages. Delay $d_i$: total transmission time for
the message pair.
21
Message pairs between NTP peers (II)
$T_{i-2} = T_{i-3} + t + o$, where $o$ is the true
offset and $t$, $t'$ are the transmission times of the two messages
$T_i = T_{i-1} + t' - o$
$d_i = t + t' = T_{i-2} - T_{i-3} + T_i - T_{i-1}$
$o = o_i + (t' - t)/2$
$o_i = (T_{i-2} - T_{i-3} + T_{i-1} - T_i)/2$
$o_i - d_i/2 \le o \le o_i + d_i/2$
Delay $d_i$ is a measure of the accuracy of the
estimate of offset.
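
The offset and delay formulas can be checked numerically. In this sketch the four timestamps are passed in chronological order; the function name, parameter names and sample values (true offset 5 s, transmission times 30 ms and 10 ms) are assumptions chosen for illustration.

```python
def ntp_offset_delay(t_i3: float, t_i2: float, t_i1: float, t_i: float) -> tuple[float, float]:
    """Compute (o_i, d_i) from one message pair.
    t_i3: first message sent      (sender's clock)
    t_i2: first message received  (peer's clock)
    t_i1: reply sent              (peer's clock)
    t_i : reply received          (sender's clock)"""
    offset = ((t_i2 - t_i3) + (t_i1 - t_i)) / 2
    delay = (t_i2 - t_i3) + (t_i - t_i1)
    return offset, delay

# Peer clock is 5 s ahead (o = 5); t = 0.03 s out, t' = 0.01 s back.
o_i, d_i = ntp_offset_delay(100.0, 105.03, 105.13, 100.14)
print(o_i, d_i)                                 # about 5.01 and 0.04
print(o_i - d_i / 2 <= 5.0 <= o_i + d_i / 2)    # the true offset lies within the bound
```

The estimate is off by (t' - t)/2 = -0.01 s here, which is why a small delay d_i signals a trustworthy offset.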
22
NTP data filtering & peer selection
  • Retain the 8 most recent $\langle o_i, d_i \rangle$ pairs
  • compute filter dispersion metric
  • higher values $\Rightarrow$ less reliable data
  • The estimate of offset with min. delay is chosen
  • Examine values from several peers
  • look for relatively unreliable values
  • May switch the peer used primarily for sync.
  • Peers with low stratum are more favored
  • closer to primary time sources
  • Also favored are peers with lowest sync.
    dispersion
  • sum of filter dispersions between peer & root of
    sync. subnet
  • May modify local clock update frequency wrt
    observed drift rate
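
The first filtering step (keep the 8 most recent pairs, use the offset whose delay is smallest) can be sketched with a bounded queue. The class name `OffsetFilter` is hypothetical, and the dispersion metric and peer selection are deliberately left out.

```python
from collections import deque

class OffsetFilter:
    """Keeps the most recent <o_i, d_i> pairs for one peer."""

    def __init__(self, size: int = 8):
        self.samples: deque[tuple[float, float]] = deque(maxlen=size)

    def add(self, offset: float, delay: float) -> None:
        # When the deque is full, the oldest pair is dropped automatically.
        self.samples.append((offset, delay))

    def best_offset(self) -> float:
        """Offset of the retained sample with the minimum delay."""
        return min(self.samples, key=lambda s: s[1])[0]

# Ten hypothetical samples; offsets are just 0..9 so the choice is easy to see.
delays = [0.001, 0.05, 0.04, 0.03, 0.02, 0.06, 0.07, 0.08, 0.09, 0.10]
peer_filter = OffsetFilter()
for i, d in enumerate(delays):
    peer_filter.add(float(i), d)

chosen = peer_filter.best_offset()
print(chosen)  # 4.0: sample 0 had the lowest delay overall but is too old to be kept
```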

23
Lamport's notion of logical time
  • For many purposes, it is sufficient that all
    machines agree on the same time
  • Emphasis on internal consistency
  • If two processes do not interact, lack of
    synchronization will not be observable
  • and thus will not cause problems
  • Ordering of events is needed to avoid ambiguities

24
Lamport Timestamps
  • 3 processes, each with its own clock. The clocks
    run at different rates.
  • Lamport's algorithm corrects the clocks.

25
Space-Time diagram representation of a
distributed computation
26
The happened-before relation
  • We cannot synchronize clocks perfectly across a
    distributed system
  • cannot use physical time to find out event order
  • Lamport, 1978: happened-before partial order ($\to$)
  • (potential) causal ordering
  • $e \to_i e'$, for process $P_i$ $\Rightarrow$ $e \to e'$
  • $send(m) \to receive(m)$, for any message $m$
  • $e \to e'$ and $e' \to e''$ $\Rightarrow$ $e \to e''$
  • concurrent events: a // b
  • occur at different processes, with no chain of
    messages intervening between them

27
Totally-Ordered Multicasting
  • Updating a replicated database and leaving it in an
    inconsistent state.
  • Solution via multicast
  • Each msg is multicast, with timestamp = current
    (logical) time
  • Recipient ACKs each message (via multicast)
  • Each process puts received messages in its local
    queue, sorted according to the timestamp
  • A process only delivers a msg when it is at the
    head of the queue and it has been ACKed by all
    processes
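
The queueing and delivery rule can be sketched for a single replica as below. The network is not modelled: the caller feeds in messages and ACKs in arrival order and is assumed to respect reliable FIFO channels. Further assumptions of this sketch: every process, including the sender, ACKs each message, and ties between equal timestamps are broken by process ID. The class name and the two updates are hypothetical.

```python
import heapq

class Replica:
    def __init__(self, n_processes: int):
        self.n = n_processes
        self.queue: list[tuple[int, int, str]] = []      # heap of (timestamp, sender, payload)
        self.acks: dict[tuple[int, int], set[int]] = {}  # (timestamp, sender) -> ackers
        self.delivered: list[str] = []

    def on_message(self, timestamp: int, sender: int, payload: str) -> None:
        heapq.heappush(self.queue, (timestamp, sender, payload))
        self._try_deliver()

    def on_ack(self, timestamp: int, sender: int, acker: int) -> None:
        self.acks.setdefault((timestamp, sender), set()).add(acker)
        self._try_deliver()

    def _try_deliver(self) -> None:
        # Deliver only from the head of the queue, and only once all have ACKed.
        while self.queue:
            timestamp, sender, payload = self.queue[0]
            if len(self.acks.get((timestamp, sender), ())) < self.n:
                break
            heapq.heappop(self.queue)
            self.delivered.append(payload)

# Two processes (0 and 1) multicast concurrently, both with logical time 1.
r0, r1 = Replica(2), Replica(2)

# Arrival order at process 0: its own update first.
r0.on_message(1, 0, "deposit"); r0.on_ack(1, 0, 0)
r0.on_message(1, 1, "interest"); r0.on_ack(1, 1, 0)
r0.on_ack(1, 0, 1); r0.on_ack(1, 1, 1)

# Arrival order at process 1: the other update first.
r1.on_message(1, 1, "interest"); r1.on_ack(1, 1, 1)
r1.on_message(1, 0, "deposit"); r1.on_ack(1, 0, 1)
r1.on_ack(1, 1, 0); r1.on_ack(1, 0, 0)

print(r0.delivered, r1.delivered)  # both: ['deposit', 'interest']
```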

28
Lamport's Logical Clocks (I)
  • Per-process monotonically increasing counters
  • $L_i := L_i + 1$, before each event is recorded at $P_i$
  • Clock value, $t$, is piggy-backed with messages
  • Upon receiving $\langle m, t \rangle$, $P_j$ updates its clock
  • $L_j := \max(L_j, t)$, then $L_j := L_j + 1$
  • Total order by taking into account process ID
  • $(T_i, i) < (T_j, j)$ iff ($T_i < T_j$ or ($T_i = T_j$ and
    $i < j$))
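
These update rules fit in a few lines. The class below is a minimal sketch; the three-process run is a hypothetical scenario in which p1 sends to p2 and p2 then sends to p3, chosen to show a pair of concurrent events whose timestamps are nevertheless ordered.

```python
class LamportClock:
    def __init__(self, pid: int):
        self.pid = pid
        self.time = 0

    def tick(self) -> int:
        """Local event: L_i := L_i + 1."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned value is piggy-backed on the message."""
        return self.tick()

    def receive(self, t: int) -> int:
        """L_j := max(L_j, t), then L_j := L_j + 1 for the receive event."""
        self.time = max(self.time, t)
        return self.tick()

p1, p2, p3 = LamportClock(1), LamportClock(2), LamportClock(3)
a = p1.tick()        # 1
b = p1.send()        # 2, message to p2
c = p2.receive(b)    # 3
d = p2.send()        # 4, message to p3
e = p3.tick()        # 1, concurrent with a and b
f = p3.receive(d)    # 5

# Total order: compare (timestamp, process ID) pairs; Python tuples compare this way.
total_order = sorted([(a, 1), (b, 1), (c, 2), (d, 2), (e, 3), (f, 3)])
print(total_order)   # [(1, 1), (1, 3), (2, 1), (3, 2), (4, 2), (5, 3)]
```

Here L(b) = 2 > L(e) = 1 although no message chain links b and e, so a larger timestamp does not by itself imply happened-before.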

29
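Lamport's Logical Clocks (II)
[Figure: space-time diagram against physical time for
processes p1 (events a, b), p2 (events c, d) and
p3 (events e, f), with messages m1 and m2]
L(b) > L(e), but b // e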
30
FIFO delivery & causal delivery
31
Hidden channels
The $\to$ relation captures the flow of data
intervening between events, but data can flow in ways
other than message passing!
a: pipe rupture, detected by sensor 1; b:
pressure drop, detected by sensor 2.
The pipe acts as a comm. channel.
Controller (P3) increases heat (to increase
pressure), then receives notification of the rupture.
32
Vector Clocks
  • Mattern, 1989; Fidge, 1991
  • clock: vector of $N$ numbers (one per process)
  • $V_i[i] := V_i[i] + 1$, before $P_i$ timestamps an
    event
  • Clock vector is piggybacked with messages
  • When $P_i$ receives $\langle m, t \rangle$
  • $V_i[j] := \max(t[j], V_i[j])$, for $j = 1, \ldots, N$
  • $V_i[j]$, $j \ne i$: number of events that have occurred at $P_j$
    and have a (potential) effect on $P_i$
  • $V_i[i]$: number of events that $P_i$ has timestamped

$e \to e' \Leftrightarrow V(e) < V(e')$
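
The vector-clock rules and the comparison V(e) < V(e') can be sketched as follows. Process IDs are 0-based here for list indexing, and the run (p1 sends to p2, p2 sends to p3) is a hypothetical scenario; the helper names are assumptions.

```python
class VectorClock:
    def __init__(self, pid: int, n: int):
        self.pid = pid
        self.v = [0] * n

    def tick(self) -> tuple[int, ...]:
        """V_i[i] := V_i[i] + 1 before timestamping a local or send event."""
        self.v[self.pid] += 1
        return tuple(self.v)

    def receive(self, t: tuple[int, ...]) -> tuple[int, ...]:
        """Merge component-wise with the piggybacked vector, then timestamp
        the receive event itself."""
        self.v = [max(own, other) for own, other in zip(self.v, t)]
        return self.tick()

def happened_before(v: tuple[int, ...], w: tuple[int, ...]) -> bool:
    """V < W: every component is <= and the vectors differ."""
    return all(x <= y for x, y in zip(v, w)) and v != w

def concurrent(v: tuple[int, ...], w: tuple[int, ...]) -> bool:
    """Timestamps of two distinct events, neither of which precedes the other."""
    return not happened_before(v, w) and not happened_before(w, v)

p1, p2, p3 = VectorClock(0, 3), VectorClock(1, 3), VectorClock(2, 3)
a = p1.tick()        # (1, 0, 0)
b = p1.tick()        # (2, 0, 0), sent to p2
c = p2.receive(b)    # (2, 1, 0)
d = p2.tick()        # (2, 2, 0), sent to p3
e = p3.tick()        # (0, 0, 1)
f = p3.receive(d)    # (2, 2, 2)

print(happened_before(a, f))  # True: a chain of messages leads from a to f
print(concurrent(b, e))       # True: unlike Lamport timestamps, this is detectable
```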