Title: PLATO: Predictive Latency-Aware Total Ordering
1. PLATO: Predictive Latency-Aware Total Ordering
- Mahesh Balakrishnan
- Ken Birman
- Amar Phanishayee
2. Total Ordering
- a.k.a. Atomic Broadcast
- Delivering messages to a set of nodes in the same order (illustrated below):
  - Messages arrive at nodes in different orders
  - Nodes agree on a single delivery order
  - Messages are delivered at nodes in the agreed order
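To make the property concrete, here is a tiny Python illustration (the message and node names are hypothetical): arrival orders differ across nodes, yet every node delivers the single agreed order.

    # Hypothetical illustration of the total-ordering property:
    # arrival order may differ per node, delivery order may not.
    arrivals = {
        "node1": ["m2", "m1", "m3"],   # the network reordered m1 and m2 here
        "node2": ["m1", "m2", "m3"],
    }
    agreed_order = ["m1", "m2", "m3"]  # the single order the nodes agree on

    # Every node delivers in the agreed order, whatever the arrival order.
    deliveries = {node: list(agreed_order) for node in arrivals}
    assert all(seq == agreed_order for seq in deliveries.values())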
3. Modern Datacenters
- Applications
  - E-tailers, Finance, Aerospace
  - Service-Oriented Architectures, Publish-Subscribe, Distributed Objects, Event Notification
  - Totally Ordered Multicast!
- Hardware
  - Fast high-capacity networks
  - Failure-prone commodity nodes
4. Total Ordering in a Datacenter
(Figure: clients apply totally ordered updates to a Replicated Service.)
- Totally Ordered Multicast is used to consistently update Replicated Services
- Latency of Multicast → System Consistency
- Requirement: order multicasts consistently, rapidly, robustly
5. Multicast Wishlist
- Low Latency!
- High (stable) throughput
- Minimal, proactive overheads
- Leverage hardware properties
  - HW Multicast/Broadcast is fast, unreliable
- Handle varying data rates
  - Datacenter workloads have sharp spikes and extended troughs!
6. State-of-the-Art
- Traditional Protocols
  - Conservative: latency-overhead tradeoff
  - Example: Fixed Sequencer (simple, works well; a sketch follows this list)
- Optimistic Total Ordering
  - Deliver optimistically, roll back if incorrect
  - Why this works: no out-of-order arrival in LANs
- Optimistic total ordering for datacenters?
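As a reference point, here is a minimal Python sketch of a fixed-sequencer scheme (class and field names are hypothetical, not from the paper): a designated node stamps each message with a sequence number, and receivers deliver strictly in stamp order.

    # Minimal fixed-sequencer sketch (hypothetical names). A designated
    # sequencer node stamps each message ID with a sequence number and
    # multicasts the stamp; receivers deliver in stamp order.
    import heapq

    class Sequencer:
        def __init__(self):
            self.next_seq = 0

        def order(self, msg_id):
            stamp = (self.next_seq, msg_id)  # multicast this to all receivers
            self.next_seq += 1
            return stamp

    class Receiver:
        def __init__(self):
            self.next_expected = 0
            self.holdback = []  # min-heap keyed on sequence number

        def on_stamp(self, seq, msg_id):
            heapq.heappush(self.holdback, (seq, msg_id))
            deliverable = []
            while self.holdback and self.holdback[0][0] == self.next_expected:
                deliverable.append(heapq.heappop(self.holdback)[1])
                self.next_expected += 1
            return deliverable  # message IDs now safe to deliver, in order

Every message waits for at least one extra network hop through the sequencer before delivery; that built-in wait is the latency the rest of the deck attacks.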
7. PLATO: Predictive Ordering
- In a datacenter, broadcast/multicast occurs almost instantaneously
- Most of the time, messages arrive in the same order at all nodes
- Some of the time, messages arrive in different orders at different nodes
- Can we predict out-of-order arrival?
8. Reasons for Disorder: Swaps
- Typical datacenter diameter: 50-500 microseconds
- Out-of-order arrival can occur when the inter-send interval between two messages is smaller than the diameter of the network (expressed in code below)
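In code form, the condition reads as follows (the function and the numbers are illustrative, using a diameter inside the 50-500 µs range quoted above):

    # Hypothetical check of the swap condition stated above: a later
    # multicast can overtake an earlier one at some receiver only if it
    # is sent before the earlier one has crossed the network.
    def swap_possible(intersend_us: float, diameter_us: float) -> bool:
        return intersend_us < diameter_us

    print(swap_possible(100, 200))   # True: sends 100 µs apart, 200 µs diameter
    print(swap_possible(1000, 200))  # False: the first send has already settled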
9. Reasons for Disorder: Loss
- Datacenter networks are over-provisioned
  - Loss almost never occurs inside the network
- Datacenter nodes are cheap
  - Loss occurs due to end-host buffer overflows caused by CPU contention
10. Emulab Testbed (Utah)
11. Cornell Testbed
12. Disorder (Emulab)
- The percentage of swaps and losses goes up with the data rate
- At 2800 packets per second, 2% of all packet pairs are swapped and 0.5% of packets are lost
13. Disorder
14. Predicting Disorder
- Predictor: inter-arrival time of consecutive packets into user-space (a sketch follows)
- Why?
  - Swaps: simultaneous multicasts → low inter-arrival time
  - Loss: kernel buffer overflow → a sequence of low inter-arrival times
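A minimal Python sketch of this predictor (the class and parameter names are hypothetical; the 128 µs default anticipates the measurement on the next slide):

    class DisorderPredictor:
        """Flags a packet when it arrives too soon after its predecessor."""

        def __init__(self, delta_us=128):
            self.delta_us = delta_us      # suspicion threshold, microseconds
            self.last_arrival_us = None

        def suspicious(self, arrival_us):
            # A small gap hints at simultaneous multicasts (swap risk) or
            # a filling kernel buffer (loss risk).
            prev = self.last_arrival_us
            self.last_arrival_us = arrival_us
            return prev is not None and (arrival_us - prev) < self.delta_us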
15. Predicting Disorder
- 95% of swaps and 14% of all pairs are within 128 µsecs
(Figure: inter-arrival time distributions of swapped pairs vs. all pairs; Cornell datacenter, 400 multicasts/sec.)
16. Predicting Disorder
17. PLATO Design
- Heuristic: if two packets arrive within Δ µsecs, possibility of disorder
- PLATO = Heuristic + Lazy Fixed Sequencer (dispatch sketched below)
  - Heuristic works → near-zero latency
  - Heuristic fails → fixed-sequencer latency
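A sketch of the resulting dispatch (helper names such as hold_for_sequencer are hypothetical): suspected packets wait for the sequencer's order, everything else is delivered optimistically at once.

    # Hypothetical glue between the heuristic and the lazy sequencer.
    def on_arrival(pkt, predictor, layer):
        if predictor.suspicious(pkt.arrival_us):
            layer.hold_for_sequencer(pkt)  # heuristic fires: sequencer latency
        else:
            layer.optdeliver_now(pkt)      # common case: near-zero latency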
18PLATO Design
API optdeliver, confirm, revoke Ordering
Layer Pending Queue Packets suspected to be
out-of-order, or queued behind suspected
packets Suspicious Queue Packets optdelivered
to the application, not yet confirmed
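A structural sketch of this layer in Python (only optdeliver, confirm, and revoke come from the slide; every other name is an assumed shape): the sequencer's lazy ordering stream either confirms optimistic deliveries or revokes them and redelivers in the agreed order.

    from collections import deque

    class OrderingLayer:
        """Hypothetical sketch of the queues behind the PLATO API."""

        def __init__(self, app):
            self.app = app
            self.pending = deque()     # suspected, or queued behind suspects
            self.suspicious = deque()  # optdelivered, awaiting confirmation
            self.by_id = {}

        def on_packet(self, pkt, flagged):
            self.by_id[pkt.msg_id] = pkt
            if flagged or self.pending:
                self.pending.append(pkt)   # wait for the sequencer's order
            else:
                self.suspicious.append(pkt)
                self.app.optdeliver(pkt)   # optimistic, near-zero latency

        def on_sequencer_order(self, msg_id):
            # The authoritative order arrives lazily from the sequencer.
            pkt = self.by_id.pop(msg_id, None)
            if self.suspicious and self.suspicious[0] is pkt:
                self.app.confirm(self.suspicious.popleft())  # guess was right
                return
            while self.suspicious:         # guess was wrong: roll back
                self.app.revoke(self.suspicious.pop())
            if pkt is not None:
                if pkt in self.pending:
                    self.pending.remove(pkt)
                self.app.optdeliver(pkt)   # redeliver in the agreed order
                self.app.confirm(pkt)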
19. PLATO Design
20. Performance
(Figure: delivery latency, Fixed Sequencer vs. PLATO.)
- At small values of Δ: very low latency of delivery, but more rollbacks
21. Performance
- Latency of both Fixed Sequencer and PLATO decreases as throughput increases
22. Performance
- Traffic spike: PLATO is insensitive to data rate, while Fixed Sequencer depends on data rate
23. Performance
- Latency is as good as with static Δ parameterization
- Δ is varied adaptively in reaction to rollbacks (one possible rule is sketched below)
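One way such adaptation could look (the multiplicative-increase/additive-decrease rule here is an illustrative guess, not the paper's rule): widen Δ after a rollback to be more conservative, and shrink it slowly while deliveries keep confirming.

    class AdaptiveDelta:
        """Hypothetical controller for the suspicion threshold (µs)."""

        def __init__(self, delta_us=128.0, lo=16.0, hi=2048.0):
            self.delta_us, self.lo, self.hi = delta_us, lo, hi

        def on_rollback(self):
            # Mispredicted order: suspect more packet pairs next time.
            self.delta_us = min(self.delta_us * 2, self.hi)

        def on_confirm(self):
            # Guesses keep confirming: relax toward lower latency.
            self.delta_us = max(self.delta_us - 1.0, self.lo)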
24. Conclusion
- First optimistic total order protocol that predicts out-of-order delivery
- Slashes ordering latency in datacenter settings
- Stable at varying loads
- Ordering layer of a time-critical protocol stack for Datacenters