Exercising DHash with Distributed Backup

1
Exercising DHash with Distributed Backup
  • Robert Morris
  • Frans Kaashoek, David Karger, Russ Cox, Frank
    Dabek, Emil Sit, Josh Cates, Jacob Strauss, James
    Robertson
  • http://pdos.lcs.mit.edu/chord
  • MIT Laboratory for Computer Science

2
Backup Characteristics
  • Full dump only once ever
  • Then just incremental snapshots
  • Dump reads a lot, to verify old blocks
  • Data caching isn't very useful
  • Mostly bulk data movement
  • Archived snapshots are on-line
  • So interactive read also matters

3
What do we deserve?
  • Throughput
  • Same as one TCP? N TCPs?
  • Use window of RPCs to cover latency
  • Lookup latency: very cheap w/ proximity
  • Read latency (w/ k copies): max / 2k
  • Write latency: max
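
As a rough illustration of why a window of RPCs covers latency: with W block-sized RPCs kept in flight, throughput is about W × block size / round-trip time. The 8192-byte block and 100 ms round trip used below are assumed values for the example, not figures from the talk, and both function names are hypothetical.

```python
import math


def windowed_throughput(window, block_bytes, rtt_seconds):
    """Approximate bytes/second with `window` one-block RPCs always in flight.

    Ignores loss, bandwidth limits, and server load; latency is the only cost.
    """
    return window * block_bytes / rtt_seconds


def window_needed(target_bytes_per_sec, block_bytes, rtt_seconds):
    """Smallest whole window that sustains the target rate under the same model."""
    return math.ceil(target_bytes_per_sec * rtt_seconds / block_bytes)


if __name__ == "__main__":
    # Assumed example values: 8192-byte blocks, 100 ms round trip.
    for w in (1, 4, 16):
        print(w, windowed_throughput(w, 8192, 0.1))
    print(window_needed(1_000_000, 8192, 0.1))
```

Under this model a single outstanding RPC is limited to one block per round trip, which is why the window has to grow with latency to keep a path busy.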

4
Initial Performance
  • Bulk read: 50 kbytes/second
  • Bulk write: 10 kbytes/second
  • 90 nodes, PlanetLab + RON
  • 5 replicas of each block

5
What went wrong?
  • Forced to talk to the predecessor
  • Fetch from replica closest to predecessor
  • Replicas are large, so puts were slow
  • Unified congestion window
  • Mimicked single TCP increase/decrease
  • Worst-case timeouts, no fast retransmit
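
For readers unfamiliar with the "single TCP increase/decrease" behavior mentioned above, here is a minimal sketch of a generic additive-increase, multiplicative-decrease (AIMD) window shared by all outstanding RPCs. It is not the DHash implementation; the class name, method names, and constants are assumptions chosen for illustration.

```python
class AimdWindow:
    """Generic AIMD window counted in outstanding RPCs (illustrative only)."""

    def __init__(self, initial=1.0, minimum=1.0):
        self.size = initial
        self.minimum = minimum

    def on_success(self):
        # Additive increase: roughly +1 RPC per window's worth of replies.
        self.size += 1.0 / self.size

    def on_timeout(self):
        # Multiplicative decrease: halve the window, but never below the floor.
        self.size = max(self.minimum, self.size / 2.0)

    def may_send(self, outstanding):
        """True if another RPC fits in the current window."""
        return outstanding < int(self.size)


if __name__ == "__main__":
    w = AimdWindow(initial=4.0)
    w.on_success()
    print(w.size)
    w.on_timeout()
    print(w.size)
```

Because one window is shared across every destination, a timeout to any single slow node halves the sending rate to all of them.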

6
Old Chord/DHash
[Diagram labels: originator; (key); predecessor, returns successor list; replicas on successors; steps numbered 1 to 4]

7
Solutions
  • 7-of-14 coding w/ IDA, not replication
  • Maintain proximity to originator
  • Avoid predecessor, or any particular node
  • Fetch data from nodes close to originator
  • Predict better RPC timeouts
  • Key tool: synthetic coordinates
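
To make the storage trade-off concrete, the sketch below compares 5 full replicas with a 7-of-14 code. It relies only on the general property of an m-of-n information dispersal code that each fragment is about 1/m of the block; the 8192-byte block size and the function names are assumptions for the example.

```python
import math


def replication_cost(block_bytes, replicas):
    """Storage and loss tolerance when whole copies of a block are stored."""
    return {
        "stored_bytes": block_bytes * replicas,
        "tolerated_losses": replicas - 1,
    }


def erasure_cost(block_bytes, needed, total):
    """Storage and loss tolerance for a needed-of-total code.

    Each fragment is about block_bytes / needed (rounded up for padding).
    """
    fragment = math.ceil(block_bytes / needed)
    return {
        "fragment_bytes": fragment,
        "stored_bytes": fragment * total,
        "tolerated_losses": total - needed,
    }


if __name__ == "__main__":
    print(replication_cost(8192, 5))
    print(erasure_cost(8192, 7, 14))
```

With these assumed numbers the coded block takes roughly 2x the original size rather than 5x, each put moves a fragment about one seventh the size of a replica, and a reader can pick any 7 of the 14 holders.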

8
Vivaldi Synthetic Coordinates
  • Each node estimates its own position
  • Position: (x, y) synthetic coordinates
  • x and y units are time (milliseconds)
  • Distance predicts network latency
  • Key point: predict w/o pinging first
  • Like GNP, but no landmarks
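
A minimal sketch of the prediction step, assuming two-dimensional coordinates in milliseconds as described above: the predicted latency between two nodes is the Euclidean distance between their coordinates. The function names and sample coordinates are hypothetical.

```python
import math


def predicted_latency_ms(a, b):
    """Predicted latency between coordinates a and b, each an (x, y) in ms."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def relative_error(predicted_ms, measured_ms):
    """How far a prediction is from a later measurement, as a fraction."""
    return abs(predicted_ms - measured_ms) / measured_ms


if __name__ == "__main__":
    me = (0.0, 0.0)
    other = (3.0, 4.0)
    print(predicted_latency_ms(me, other))  # 5.0
```

No packet is sent to the other node: the prediction needs only its advertised coordinates.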

9
Vivaldi synthetic coordinates
  • Each node starts with a random incorrect position

[Figure: nodes at initial coordinates (2,3), (1,2), (0,1), (3,0)]

10
Vivaldi synthetic coordinates
  • Each node starts with a random incorrect position
  • Each node pings a few other nodes to measure
    network latency (distance)

[Figure: nodes A and B; measured latencies labeled 2 ms, 1 ms, 1 ms, 2 ms]

11
Vivaldi synthetic coordinates
  • Each node starts with a random incorrect position
  • Each node pings a few other nodes to measure
    network latency (distance)
  • Each node moves to cause measured distances to
    match coordinates

[Figure: distances between nodes labeled 2, 1, 1, 2]

12
Vivaldi synthetic coordinates
  • Each node starts with a random incorrect position
  • Each node pings a few other nodes to measure
    network latency (distance)
  • Each node moves to cause measured distances to
    match coordinates

[Figure: distances between nodes labeled 2, 1, 1, 2]

13
Vivaldi in action
  • Execution on 86 PlanetLab Internet hosts
  • Each host only pings a few other random hosts
  • Most hosts find useful coordinates after a few
    dozen pings
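
The ping-and-move loop can be sketched as a simple spring relaxation: after each measurement a node moves a fraction of the error along the line to the node it pinged. This is a simplified fixed-step version under assumed parameters (step size, sample count, a made-up set of "true" positions that generate the latencies); it is not the code that ran on PlanetLab, and all names are hypothetical.

```python
import math
import random


def distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def vivaldi_step(xi, xj, rtt_ms, step=0.25, rng=random):
    """Move xi so that its distance to xj is closer to the measured rtt_ms."""
    dx, dy = xi[0] - xj[0], xi[1] - xj[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        # Coincident nodes: pick an arbitrary direction to separate them.
        angle = rng.uniform(0.0, 2.0 * math.pi)
        dx, dy, dist = math.cos(angle), math.sin(angle), 1.0
    error = rtt_ms - dist  # positive: too close in coordinate space
    return (xi[0] + step * error * dx / dist,
            xi[1] + step * error * dy / dist)


def simulate(true_positions, rounds=200, samples=3, seed=0):
    """Start from random coordinates; each node pings a few random others."""
    rng = random.Random(seed)
    n = len(true_positions)
    coords = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(n)]
    for _ in range(rounds):
        for i in range(n):
            others = [k for k in range(n) if k != i]
            for j in rng.sample(others, samples):
                rtt = distance(true_positions[i], true_positions[j])
                coords[i] = vivaldi_step(coords[i], coords[j], rtt, rng=rng)
    return coords


def mean_abs_error(true_positions, coords):
    """Average |predicted - actual| latency over all node pairs."""
    n = len(coords)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(abs(distance(coords[i], coords[j])
                   - distance(true_positions[i], true_positions[j]))
               for i, j in pairs) / len(pairs)


if __name__ == "__main__":
    truth = [(0, 0), (10, 0), (10, 10), (0, 10), (5, 5), (20, 5)]
    print(mean_abs_error(truth, simulate(truth, rounds=0)))
    print(mean_abs_error(truth, simulate(truth, rounds=200)))
```

Only pairwise distances are fitted, so the resulting layout may be rotated, shifted, or mirrored relative to the positions that generated the latencies.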

14
Vivaldi predicts latency well
15
New lookup/fetch
[Figure: nodes A through G; steps numbered 1 to 3; C's successor list]
  • API: lookup(k, m) returns first m successors of k
  • Stops when successor list overlaps m
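
The result that lookup(k, m) is meant to return can be sketched with a global view of the ring. This computes only the answer (the first m node IDs at or after k, wrapping around), not the hop-by-hop routing that finds it; the integer IDs and the extra `ring_ids` parameter are assumptions for the example.

```python
from bisect import bisect_left


def lookup(ring_ids, k, m):
    """Return the first m node IDs at or clockwise after key k.

    ring_ids is every node ID on the ring; a real node knows only part of it.
    """
    ids = sorted(ring_ids)
    start = bisect_left(ids, k)  # index of the first ID >= k (may be len(ids))
    count = min(m, len(ids))
    return [ids[(start + i) % len(ids)] for i in range(count)]


if __name__ == "__main__":
    ring = [10, 20, 30, 40, 50]
    print(lookup(ring, 25, 3))  # [30, 40, 50]
    print(lookup(ring, 45, 3))  # [50, 10, 20]
```

Returning m successors rather than one is what lets the originator choose among several holders of a block instead of depending on any particular node.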

16
Vivaldi helps decrease DHash fetch times
  • 77 PlanetLab + 18 RON nodes
  • Fundamental limit about 100 ms?
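
One way coordinates can shorten fetches, sketched under assumptions: given the coordinates of every node holding a fragment, the originator ranks them by predicted latency and contacts only the nearest ones it needs (for example 7 of 14). The holder names, coordinates, and function name below are made up for illustration.

```python
import math


def nearest_holders(my_coord, holders, needed):
    """Pick the `needed` holders with the lowest predicted latency.

    holders maps a node name to its (x, y) synthetic coordinate in ms.
    """
    def predicted(name):
        x, y = holders[name]
        return math.hypot(x - my_coord[0], y - my_coord[1])

    return sorted(holders, key=predicted)[:needed]


if __name__ == "__main__":
    me = (0.0, 0.0)
    holders = {"a": (1.0, 0.0), "b": (50.0, 0.0),
               "c": (0.0, 2.0), "d": (30.0, 40.0)}
    print(nearest_holders(me, holders, 2))  # ['a', 'c']
```

The fetch then completes when the slowest of the chosen holders replies, so its latency is set by the farthest node actually contacted rather than the farthest holder overall.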

17
DHash throughput
[Figure: throughput (kbytes/second); labels: Berkeley, Mazu, NYU, Australia]

18
Conclusions
  • Unclear what fair throughput is vs TCP
  • Main latency costs:
  • Data is not likely to be close
  • Last steps of lookup are expensive
  • Need latency predictions to as-yet unknown nodes
  • Synthetic coordinates are useful in many ways

http://pdos.lcs.mit.edu/chord

19
Vivaldi vs. network coordinates
  • Vivaldi's coordinates match geography well
  • over-sea distances shrink (faster than over-land)
  • orientation of Australia and Europe is wrong