Reliable Distributed Systems presentation

Title: Reliable Distributed Systems

1
Reliable Distributed Systems

Real Time Systems

2
Topics for this lecture

Adding clocks to distributed systems
Also, to non-distributed ones
We'll just touch on real-time operating systems
and scheduling: an active area until recently
Recent machines are so fast that the importance of
this topic is now reduced
Using time in broadcast protocols
Comparison of CASD with gossip

3
Many systems need real-time

Air traffic control talks about when planes
will be at various locations
Doctors talk about how patients responded after
drug was given, or change therapy after some
amount of time
Process control software runs factory floors by
coordinating what machine tools do, and when

4
Many systems need real-time

Video and multi-media systems need isochronous
communication protocols that coordinate video,
voice, and other data sources
Telecommunications systems must guarantee
real-time response despite failures, for example
when switching telephone calls

5
Real time in real systems

These real systems combine the need for logical
consistency and coordination (for control) with
the need for physical coordination and timely
behavior
Issue becomes one of integrating real-time tools
and support with logical tools and support such
as we have already considered
Not real-time or logical time, but want both for
different aspects of a single application!

6
Clock Synchronization and Synchronous Systems

Up to now, we restricted attention to logical
notions of time: "a happens before b"
But recall that we also touched on real clocks
Two views of real-time
Supporting clocks that programs can consult as
needed
Making direct use of real-time inside protocols

7
Clock Synchronization

Topic was extremely active during early 1980s
Best known algorithms include the one in OSF/1
UNIX (based on one by Marzullo at Xerox), the
optimally accurate algorithm of Srikanth and
Toueg, and the probabilistic algorithm of
Cristian
Introduction of Global Positioning System is
eliminating the need for this form of
synchronization

8
Clock synchronization protocols

Would like to think of network as having a single
clock that all processes can consult
But need to implement this using many local
clocks that can
Initially be set incorrectly
Drift over the course of time
Clock synchronization tries to keep clocks close
to true real-time and minimize their tendency to
drift

9
Precision and Accuracy

Accuracy measures local clocks relative to an
external source of accurate real-time. Accurate
clocks are close to real-time
Precision measures local clocks relative to each
other. Precise clocks are close to each other
Skew is the numerical limit on the maximum
distance that correct clocks can drift apart.
E.g. could say the maximum skew is 1 sec for
some system
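
The two measures can be computed directly from a set of clock readings. The sketch below is illustrative only: the function names and sample readings are hypothetical, and it assumes all clocks are read at the same instant and that the true time is known.

```python
def accuracy(readings, true_time):
    """Largest distance between any local clock and the external reference."""
    return max(abs(r - true_time) for r in readings)


def precision(readings):
    """Largest distance between any two local clocks."""
    return max(readings) - min(readings)


# Hypothetical readings (in seconds), all taken at the same instant.
readings = [100.5, 100.75, 101.0]
true_time = 100.0

print(accuracy(readings, true_time))  # worst offset from real time
print(precision(readings))            # worst offset between two clocks
```

With these sample values the clocks are closer to each other than they are to real time, so the system is more precise than it is accurate.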

10
How clock synchronization used to work

Periodically, all processors would run a clock
sync protocol, for example by broadcasting the
reading from their clocks
Each receives a set of values from the others
(sets may differ due to faults!)
Algorithm would pick a synchronized value from
the set; analysis was used to prove properties of
the clocks
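
The slide does not say how the synchronized value is picked. One common choice, used here purely as an assumption, is a trimmed mean: discard the f lowest and f highest readings (any of which might come from faulty clocks) and average the rest. The function name and sample values are hypothetical, and this is not claimed to be the rule used by any of the algorithms named earlier.

```python
def synchronized_value(readings, max_faulty):
    """Trimmed mean: drop the max_faulty lowest and highest readings."""
    if len(readings) <= 2 * max_faulty:
        raise ValueError("need more than 2 * max_faulty readings")
    ordered = sorted(readings)
    kept = ordered[max_faulty:len(ordered) - max_faulty]
    return sum(kept) / len(kept)


# Hypothetical readings; 5000 comes from a clock that was set incorrectly.
readings = [1000, 1002, 998, 5000]
print(synchronized_value(readings, max_faulty=1))
```

Because the extreme values are discarded, a single wildly wrong clock cannot drag the result away from the readings of the correct clocks.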

11
Global Positioning System

Satellite system launched by military in early
1990s, became public and inexpensive
Can think of satellites as broadcasting the time
Small radio receiver picks up signals from three
satellites and triangulates to determine position
Same computation also yields extremely accurate
clock (accurate to a few milliseconds)

12
Clock synchronization with GPS

Put two GPS receivers (or more) on a network
Periodically, receivers broadcast the true time
Other machines only need to adjust their clocks
to overcome propagation delays for clock sync
messages through the network!
Well matched to the a-posteriori clock
synchronization approach

13
Basic idea

GPS receiver broadcasts "the time is now 10:00"
on a broadcast network (Ethernet)
Receivers note the time when they receive the
message ("10:01", "9:58", ...) and reply with
these values
GPS receiver subtracts the median value
Differences (+1, -2, ...) now give the drift of
the clock of the destination relative to the
median clock
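
A minimal sketch of the median step, assuming each machine replies with the local clock value at which it received the broadcast. The machine names and readings (in arbitrary clock ticks) are hypothetical.

```python
from statistics import median


def offsets_from_median(reported):
    """Map each machine to (its reported reading - median reading)."""
    mid = median(reported.values())
    return {name: value - mid for name, value in reported.items()}


# Hypothetical replies to one time broadcast, in arbitrary clock ticks.
reported = {"a": 1001, "b": 998, "c": 1000}
print(offsets_from_median(reported))
```

A positive offset means the machine's clock is ahead of the median clock and a negative one means it is behind.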

14
A-posteriori method, adjustment stage

Now we distribute these drift values back to the
processors, which compensate for the rate of
drift over the time period to the next
synchronization
Can show that this yields clocks that are optimal
both in accuracy and precision
A processor with a broken clock has a good chance
of discovering it during synchronization
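
The compensation step can be sketched as follows, under the simplifying assumption that each clock drifts at a constant rate between synchronizations. The function names and numbers are hypothetical.

```python
def drift_rate(offset, interval):
    """Time gained (+) or lost (-) per unit of reference time.

    offset:   local clock minus reference clock, measured at this sync
    interval: reference time elapsed since the clocks last agreed
    """
    return offset / interval


def compensate(local_elapsed, rate):
    """Estimate true elapsed time from elapsed time on the drifting clock."""
    return local_elapsed / (1.0 + rate)


# Hypothetical numbers: the clock gained 2 sec over a 1000 sec interval.
rate = drift_rate(offset=2.0, interval=1000.0)

# 501 sec later by the local clock, roughly 500 sec have really passed.
print(compensate(501.0, rate))
```

A clock whose measured rate is far outside the expected range is a candidate for the "broken clock" case mentioned above.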

15
Using real-time

One option is to use a real-time operating
system, clock synchronization algorithm, and to
design protocols that exploit time
Example: the MARS system uses pairs of redundant
processors to perform actions fault-tolerantly
and meet deadlines. Has been applied in process
control systems. (Another example: Delta-4)

16
Features of real-time operating systems

The O/S itself tends to be rather simple
Big black boxes behave unpredictably
They are structured in terms of tasks
A task is more or less a thread
But typically come with expected runtime,
deadlines, priorities, interruptability, etc
User decomposes application into task-like
component parts and then expresses goals in a
form that RTOS can handle
Widely used on things like medical devices
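
To make the task abstraction concrete, here is a small sketch of a task descriptor carrying the attributes listed above, together with one well-known selection policy (earliest deadline first). The field names, the policy choice, and the sample tasks are all assumptions for illustration and do not describe any particular RTOS.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    expected_runtime_ms: int
    deadline_ms: int          # absolute deadline
    priority: int             # larger means more important
    interruptible: bool = True


def next_task(ready, now_ms):
    """Pick the feasible ready task with the earliest deadline.

    A task is feasible if it can still finish before its deadline.
    Ties on deadline are broken in favor of higher priority.
    """
    feasible = [t for t in ready
                if now_ms + t.expected_runtime_ms <= t.deadline_ms]
    if not feasible:
        return None
    return min(feasible, key=lambda t: (t.deadline_ms, -t.priority))


# Hypothetical tasks for a medical device.
ready = [
    Task("pump", expected_runtime_ms=5, deadline_ms=20, priority=1),
    Task("alarm", expected_runtime_ms=2, deadline_ms=10, priority=5),
    Task("log", expected_runtime_ms=50, deadline_ms=40, priority=0),
]
print(next_task(ready, now_ms=0).name)
```

The "log" task is skipped because it can no longer meet its deadline, which is the kind of decision the declared runtimes and deadlines make possible.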

17
RTOS can be beneficial

Lockheed Martin ATL timed CORBA method
invocations
Variation in response time was huge with a normal
Linux OS
When using a Timesys RTOS the variability is
eliminated!

18
Real-time broadcast protocols

Can also implement broadcast protocols that make
direct use of temporal information
Examples
Broadcast that is delivered at same time by all
correct processes (plus or minus the clock skew)
Distributed shared memory that is updated within
a known maximum delay
Group of processes that can perform periodic
actions

19
A real-time broadcast
[Figure: process timelines with times t, ta, and tb marked]

Message is sent at time t by p0. Later both p0
and p1 fail. But message is still delivered
atomically, after a bounded delay, and within a
bounded interval of time (at non-faulty processes)
20
A real-time distributed shared memory
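[Figure: process timelines with times t, ta, and tb marked; p0's update is labeled "set x=3" and the other processes observe "x=3"]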
At time t p0 updates a variable in a distributed
shared memory. All correct processes observe the
new value after a bounded delay, and within a
bounded interval of time.
21
Periodic process group: Marzullo
Periodically, all members of a group take some
action. Idea is to accomplish this with minimal
communication
22
The CASD protocols

Also known as the Δ-T protocols
Developed by Cristian and others at IBM, was
intended for use in the (ultimately, failed) FAA
project
Goal is to implement a timed atomic broadcast
tolerant of Byzantine failures

23
Basic idea of the CASD protocols

Assumes use of clock synchronization
Sender timestamps message
Recipients forward the message using a flooding
technique (each echoes the message to others)
Wait until all correct processors have a copy,
then deliver in unison (up to limits of the clock
skew)
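
The steps above can be sketched as a toy simulation. It is a simplification, not the published protocol: it assumes every hop takes exactly the same time, perfectly synchronized clocks, crash failures only, and no message loss. The function name and parameters are hypothetical.

```python
def casd_broadcast(n, send_time, delta, hop_delay, first_hop, crashed):
    """Return {process: delivery time} for processes that deliver.

    first_hop: processes the sender reached before (possibly) failing
    crashed:   processes that never echo or deliver
    """
    # Time at which each process first receives a copy.
    received = {p: send_time + hop_delay
                for p in first_hop if p not in crashed}
    frontier = list(received)
    while frontier:
        reached = []
        for p in frontier:
            # p echoes the message to everyone who lacks a copy.
            for q in range(n):
                if q not in crashed and q not in received:
                    received[q] = received[p] + hop_delay
                    reached.append(q)
        frontier = reached
    # Deliver in unison at timestamp + delta; late copies are discarded.
    deadline = send_time + delta
    return {p: deadline for p, t in received.items() if t <= deadline}


# p0 sends at time 100, reaches only p1, then crashes.
print(casd_broadcast(n=4, send_time=100, delta=50, hop_delay=10,
                     first_hop=[1], crashed={0}))
```

With delta=50 all three surviving processes deliver at time 150. Shrinking delta to 15 in this toy model lets only p1 deliver, because the echoed copies arrive after the delivery time.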

24
CASD picture
[Figure: process timelines with times t, ta, and tb marked]

p0, p1 fail. Messages are lost when echoed by
p2, p3
25
Idea of CASD

Assume known limits on number of processes that
fail during protocol, number of messages lost
Using these and the temporal assumptions, deduce
worst-case scenario
Now know that if we wait long enough, all (or no)
correct processes will have the message
Then schedule delivery using original time plus a
delay computed from the worst-case assumptions
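
As a rough illustration of how such a delay might be derived, the sketch below assumes that each tolerated failure (a crashed process or a lost message) can waste at most one hop, and that clocks differ by at most a known skew. This is a deliberately simplified bound with hypothetical names and numbers, not the formula from the CASD papers.

```python
def worst_case_delay(max_failures, hop_bound_ms, clock_skew_ms):
    """Simplified bound: one useful hop, plus one wasted hop per failure,
    plus the clock skew between sender and receiver."""
    return (max_failures + 1) * hop_bound_ms + clock_skew_ms


def delivery_time(timestamp_ms, max_failures, hop_bound_ms, clock_skew_ms):
    """Scheduled delivery time: original timestamp plus the computed delay."""
    return timestamp_ms + worst_case_delay(
        max_failures, hop_bound_ms, clock_skew_ms)


# Hypothetical parameters: up to 4 failures, 10 ms per hop, 5 ms skew.
print(worst_case_delay(4, 10, 5))
print(delivery_time(1000, 4, 10, 5))
```

The delay grows with every failure the designer chooses to tolerate, which is why conservative assumptions lead to slow delivery.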

26
The problems with CASD

In the usual case, nothing goes wrong, hence the
delay can be very conservative
Even if things do go wrong, is it right to assume
that if a message needs between 0 and δ ms to make
one hop, it needs [0, n*δ] ms to make n hops?
How realistic is it to bound the number of
failures expected during a run?

27
CASD in a more typical run
[Figure: process timelines with times t, ta, and tb marked]

28
... leading developers to employ more aggressive
parameter settings
[Figure: process timelines with times t, ta, and tb marked]

29
CASD with over-aggressive parameter settings
starts to malfunction
[Figure: process timelines with times t, ta, and tb marked]

all processes look incorrect (red) from time to
time
30
CASD mile high

When run slowly, the protocol is like a real-time
version of abcast
When run quickly, the protocol starts to give
probabilistic behavior:
If I am correct (and there is no way to know!)
then I am guaranteed the properties of the
protocol, but if not, I may deliver the wrong
messages

31
How to repair CASD in this case?

Gopal and Toueg developed an extension, but it
slows the basic CASD protocol down, so it
wouldn't be useful in the case where we want
speed and also real-time guarantees
Can argue that the best we can hope to do is to
superimpose a process group mechanism over CASD
(Verissimo and Almeida are looking at this).

32
Why worry?

CASD can be used to implement a distributed
shared memory (delta-common storage)
But when this is done, the memory consistency
properties will be those of the CASD protocol
itself
If CASD protocol delivers different sets of
messages to different processes, memory will
become inconsistent

33
Why worry?

In fact, we have seen that CASD can do just this,
if the parameters are set aggressively
Moreover, the problem is not detectable either by
technically faulty processes or correct ones
Thus, DSM can become inconsistent and we lack any
obvious way to get it back into a consistent state

34
Using CASD in real environments

Would probably need to set the parameters close
to the range where CASD can malfunction, but
rarely
Hence would need to add a self-stabilization
algorithm to restore consistent state of memory
after it becomes inconsistent
Problem has not been treated in papers on CASD
pbcast protocol does this

35
Using CASD in real environments

Once we build the CASD mechanism how would we use
it?
Could implement a shared memory
Or could use it to implement a real-time state
machine replication scheme for processes
US air traffic project adopted latter approach
But stumbled on many complexities

36
Using CASD in real environments

Pipelined computation
Transformed computation

37
Issues?

Could be quite slow if we use conservative
parameter settings
But with aggressive settings, either process
could be deemed faulty by the protocol
If so, it might become inconsistent
Protocol guarantees don't apply
No obvious mechanism to reconcile states within
the pair
Method was used by IBM in a failed effort to
build a new US Air Traffic Control system

38
Similar to MARS

Research system done in Austria by Hermann Kopetz
Basic idea is that everything happens twice
Receiver can suppress duplicates but is
guaranteed of at least one copy of each message
Used to overcome faults without loss of real-time
guarantees
MARS is used in the BMW but gets close to a
hardware fault-tolerance scheme

39
Many more issues.

What if a process starts to lag?
What if applications aren't strictly
deterministic?
How should such a system be managed?
How can a process be restarted?
If not, the system eventually shuts down!
How to measure the timing behavior of components,
including the network

40
FAA experience?

It became too hard to work all of this out
Then they tried a transactional approach, also
had limited success
Finally, they gave up!
$6B was lost
A major fiasco, ATC is still a mess

41
Totem approach

Start with extended virtual synchrony model
Analysis used to prove real-time delivery
properties
Enables them to guarantee delivery within about
100-200ms on a standard broadcast LAN
Contrast with our 85us latency for Horus!

42
Tradeoffs between consistency, time

Notice that as we push CASD to run faster we lose
consistency
Contrast with our virtual synchrony protocols:
they run as fast as they can (often, much faster
than CASD when it is not malfunctioning) but
don't guarantee real-time delivery

43
A puzzle

Suppose that experiments show that 99.99% of
Horus or Ensemble messages are delivered in 85us
+/- 10us for some known maximum load
Also have a theory that shows that 100% of Totem
messages are delivered in about 150ms for
reasonable assumptions
And have the CASD protocols which work well with
Δ around 250ms for similar LANs

44
A puzzle

Question: is there really a difference between
these forms of guarantees?
We saw that CASD is ultimately probabilistic.
Since Totem makes assumptions, it is also,
ultimately, probabilistic
But the experimentally observed behavior of Horus
is also probabilistic
... so why isn't Horus a real-time system?

45
What does real-time mean?

To the real-time community?
A system that provably achieves its deadlines
under stated assumptions
Often achieved using delays!
To the pragmatic community?
The system is fast enough to accomplish our goals
Experimentally, it never seems to lag behind or
screw up

46
Some real-time issues

Scheduling
Given goals, how should tasks be scheduled?
Periodic, a-periodic and completely ad-hoc tasks
What should we do if a system misses its goals?
How can we make components highly predictable in
terms of their real-time performance profile?
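
For the periodic case, one classical answer to the scheduling question is the Liu and Layland utilization test for rate-monotonic scheduling, sketched below. It assumes independent, preemptible periodic tasks on a single processor with deadlines equal to periods, and it is a sufficient test only: a task set that fails it may still be schedulable. The function names and task sets are hypothetical.

```python
def rm_utilization_bound(n):
    """Liu and Layland bound for n periodic tasks: n * (2^(1/n) - 1)."""
    return n * (2 ** (1 / n) - 1)


def rm_schedulable(tasks):
    """tasks: list of (worst_case_runtime, period) pairs.

    Returns True if total utilization is within the bound, which
    guarantees all deadlines are met under rate-monotonic priorities.
    """
    utilization = sum(runtime / period for runtime, period in tasks)
    return utilization <= rm_utilization_bound(len(tasks))


# Hypothetical task sets, as (runtime, period) in milliseconds.
light = [(1, 4), (1, 5), (2, 10)]   # utilization 0.65
heavy = [(2, 4), (2, 5), (2, 10)]   # utilization 1.10
print(rm_schedulable(light), rm_schedulable(heavy))
```

Aperiodic and ad-hoc tasks are not covered by this test and need a separate treatment.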

47
Real-time today

Slow transition
Older, special purpose operating systems and
components, carefully hand-crafted for
predictability
Newer systems are simply so fast (and can be
dedicated to task) that what used to be hard is
now easy
In effect, we no longer need to worry about
real-time, in many cases, because our goals are
so easily satisfied!
