Title: Symphony: Distributed Hashing in a Small World
1 Symphony: Distributed Hashing in a Small World
- Gurmeet Singh Manku
- Mayank Bawa
- Prabhakar Raghavan
Presented by Satpreet Singh
2 Motivation
- GOAL: To maintain a large DHT over a WAN
- DESIRED CHARACTERISTICS
- Scalability (work for a range of network sizes)
- Stability (handle churn)
- Performance (provide low-latency lookups and low maintenance costs under churn)
- Flexibility (provide design knobs, preferably at run time)
- Simplicity (easy to understand, code, deploy)
- SOLUTION: Symphony (fuses ideas from Chord and Kleinberg's Small World greedy routing algorithm/result)
3 Features/Advantages of Symphony
- 1. Low state maintenance (consequence of low degree)
- Fewer pings/keep-alives, less (ambient) control traffic
- Distributed locking and coordination overhead over smaller sets of nodes
- Smaller bootstrapping time when a node joins
- Smaller recovery time when a node leaves
- 2. Smooth out-degree vs. latency tradeoff
- Only protocol that offers this tuning knob, even at run time!
- Out-degree is not fixed, either at run time or as a function of network size.
- 3. Flexibility and support for heterogeneity
- Different nodes can have different numbers of links
- 4. Fault tolerance
- Only short links are bolstered. No backups for long links
4 Architecture Overview
- Establish the keyspace as [0, 1) (wraps around a ring, like in Chord)
- Every node manages the subrange from its own id to the next clockwise node's id (≈ equi-sized)
- Objects hash to an m-bit hash key K, managed by the node whose subrange contains the real number $K/2^m$
- 2 short links: one with each immediate neighbor
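The key-to-manager mapping above can be sketched in a few lines. This is a minimal illustration, assuming a global, sorted list of node ids is available (a real node only knows its neighbors); the names `key_to_point` and `manager_of` are hypothetical and not from the slides.

```python
from bisect import bisect_right


def key_to_point(key: int, m: int) -> float:
    """Map an m-bit hash key K to the real number K / 2^m in [0, 1)."""
    return key / (1 << m)


def manager_of(point: float, node_ids: list[float]) -> float:
    """Return the id of the node whose segment [id, next clockwise id) holds point.

    Assumption: node_ids is sorted ascending and non-empty.
    """
    idx = bisect_right(node_ids, point) - 1
    # idx == -1 means point lies before the smallest id, so it falls in the
    # segment of the largest id, which wraps around the ring past 1.0.
    # Python's negative indexing (node_ids[-1]) handles that case directly.
    return node_ids[idx]


ring = [0.1, 0.4, 0.7]
print(manager_of(key_to_point(128, 8), ring))  # key 128 of 8 bits maps to 0.5
```

With ids 0.1, 0.4 and 0.7, the point 0.5 falls in the segment starting at 0.4, while a point such as 0.05 wraps around to the node at 0.7.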
5 Architecture Overview
- k (≥ 1) long links (uni- or bidirectional)
- For each long link, the node draws a random number x from a probability distribution function (PDF)
- It contacts the manager of x using a routing protocol
- It establishes a link if the manager has fewer than 2k incoming links; if not, it resamples the PDF
- The PDF is a type of harmonic distribution (hence the name Symphony)
- $p_n(x) = 1/(x \ln n)$ for $x \in [1/n, 1]$
- $p_n(x) = 0$ otherwise
- The PDF depends on n, which is estimated using an Estimation Protocol
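One way to draw from the harmonic PDF $1/(x \ln n)$ on $[1/n, 1]$ is inverse-CDF sampling: the CDF is $F(x) = 1 + \ln x / \ln n$, so a uniform $u$ maps to $x = n^{u-1}$. The sketch below assumes n > 1 and uses hypothetical function names; it is an illustration, not the authors' implementation.

```python
import math
import random


def harmonic_pdf(x: float, n: int) -> float:
    """Density 1 / (x ln n) on [1/n, 1], and 0 elsewhere. Assumes n > 1."""
    if 1.0 / n <= x <= 1.0:
        return 1.0 / (x * math.log(n))
    return 0.0


def draw_harmonic(n: int, rng: random.Random) -> float:
    """Inverse-CDF sampling: F(x) = 1 + ln(x)/ln(n), hence x = n ** (u - 1)."""
    u = rng.random()  # uniform in [0, 1)
    return n ** (u - 1.0)


rng = random.Random(0)
samples = [draw_harmonic(1024, rng) for _ in range(5)]
print(samples)  # each value lies between 1/1024 and 1
```

Because the draw is uniform on a logarithmic scale, half of the samples fall below $n^{-1/2}$, which is why most long links are short hops and only a few span a large fraction of the ring.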
6 Network Size Estimation Protocol
- Goal: to estimate n, the current total number of nodes in the DHT
- So if
- S is any set of s distinct nodes, and
- $X_s$ is the sum of the segment lengths managed by them,
- then the estimate of n is $s/X_s$
- All s nodes update their estimate
- Experimentally, s = 3 was found to be good enough
- So simply use the node and its two immediate neighbors
- Fact: the impact of increasing s on avg. latency is insignificant
Figure: x = length of arc, 1/x = estimate of n
(Idea from Viceroy)
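The s = 3 estimate can be sketched as follows. The code assumes a sorted list of all node ids (at least three) purely for illustration; in the protocol each node would learn the two neighboring segment lengths by message exchange. The function names are hypothetical.

```python
def segment_length(node_id: float, successor_id: float) -> float:
    """Clockwise length of the segment [node_id, successor_id) on the unit ring."""
    return (successor_id - node_id) % 1.0


def estimate_n(sorted_ids: list[float], i: int) -> float:
    """Estimate n as s / X_s with s = 3: node i and its two immediate neighbors.

    Assumption: sorted_ids is ascending and holds at least 3 distinct ids.
    """
    size = len(sorted_ids)
    members = [(i - 1) % size, i, (i + 1) % size]
    # X_s: total length of the segments managed by the three nodes.
    x_s = sum(
        segment_length(sorted_ids[j], sorted_ids[(j + 1) % size]) for j in members
    )
    return len(members) / x_s


print(estimate_n([0.0, 0.25, 0.5, 0.75], 0))  # evenly spaced ring of 4 nodes
```

On an evenly spaced ring the estimate is exact; with random ids it fluctuates around the true n, which the slides report has little effect on average latency.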
7 Routing Protocol(s)
- Kleinberg's Small World result: A message can be routed to any node by greedy routing in $O(\log^2 n)$ hops, in a construction where each node has one link to each of its 4 directional neighbors and a single long-distance link to a node chosen from a suitable PDF.
- To look up a hash key $x \in [0, 1)$, contact the manager of x
- Unidirectional Routing Protocol: A node forwards a lookup for x along the (short or long) link that minimizes the clockwise distance to x
- Bidirectional Routing Protocol: A node forwards a lookup for x along the (short or long) link that minimizes the absolute distance to x
- In both cases, the expected path length in an n-node network with k = O(1) links is $O(\frac{1}{k} \log^2 n)$ hops.
- Bidirectional routing and 1-Lookahead reduce latency by about 30% and 40%, respectively
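The two greedy forwarding rules differ only in the distance they minimize. The following sketch shows a single forwarding decision; the `next_hop` helper and its stopping rule (return `None` when no link is closer than the current node) are assumptions made for illustration.

```python
from typing import Optional


def clockwise_distance(a: float, b: float) -> float:
    """Distance travelled going clockwise from a to b on the unit ring."""
    return (b - a) % 1.0


def absolute_distance(a: float, b: float) -> float:
    """Shorter of the clockwise and counter-clockwise distances."""
    d = clockwise_distance(a, b)
    return min(d, 1.0 - d)


def next_hop(current: float, links: list[float], target: float,
             bidirectional: bool = False) -> Optional[float]:
    """Pick the (short or long) link that gets closest to target, greedily."""
    dist = absolute_distance if bidirectional else clockwise_distance
    best = min(links, key=lambda link: dist(link, target))
    # Assumed stopping rule: stay put if no link improves on the current node.
    return best if dist(best, target) < dist(current, target) else None


links = [0.1, 0.9, 0.5]  # successor, predecessor, one long link
print(next_hop(0.0, links, 0.85))                      # unidirectional choice
print(next_hop(0.0, links, 0.85, bidirectional=True))  # bidirectional choice
```

For the target 0.85, unidirectional routing forwards to 0.5 because it cannot overshoot, while bidirectional routing forwards to 0.9, which is closer in absolute terms.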
8 Join/Leave Protocols
- JOIN
- The new node chooses its id x from [0, 1) uniformly at random
- Using the routing protocol, it identifies the current manager of x
- It then runs the estimation protocol using s = 3
- x then uses $p_n$ to establish its long-distance links
- Cost: k links × $O(\frac{1}{k} \log^2 n)$ messages per link = $O(\log^2 n)$ messages
- LEAVE
- All outgoing and incoming links to x's long-distance neighbors are snapped
- Other nodes whose outgoing links to x have just been broken reinstate those links with other nodes
- The immediate neighbors of x establish short links between themselves
- The successor of x initiates the estimation protocol over s = 3 neighbors
- Cost: $O(\log^2 n)$ messages
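The JOIN steps can be put together in a small simulation sketch. It makes several simplifying assumptions that are not in the slides: the ring is a global sorted list of ids (so the manager lookup uses global knowledge instead of routed messages), each long link targets the point at clockwise distance d from the new node, the 2k incoming-link limit is ignored, and the number of sampling attempts is capped. The `join` function is hypothetical.

```python
import bisect
import random


def join(ring: list[float], k: int, rng: random.Random) -> tuple[float, list[float]]:
    """Add a node to the sorted id list `ring`; return (its id, its long links).

    Assumption: ring already holds at least 3 nodes.
    """
    x = rng.random()            # id chosen uniformly at random from [0, 1)
    bisect.insort(ring, x)      # short links are implied by the ring order
    i, size = ring.index(x), len(ring)

    # Estimation protocol with s = 3: segments of predecessor, self, successor.
    members = [(i - 1) % size, i, (i + 1) % size]
    x_s = sum((ring[(j + 1) % size] - ring[j]) % 1.0 for j in members)
    n_est = max(2.0, 3 / x_s)

    long_links: list[float] = []
    for _ in range(100 * k):    # assumed cap so that sampling always terminates
        if len(long_links) == k:
            break
        d = n_est ** (rng.random() - 1.0)   # harmonic draw in [1/n_est, 1)
        point = (x + d) % 1.0
        manager = ring[bisect.bisect_right(ring, point) - 1]
        if manager != x and manager not in long_links:
            long_links.append(manager)      # otherwise resample
    return x, long_links


rng = random.Random(42)
ring = sorted(rng.random() for _ in range(100))
new_id, links = join(ring, k=4, rng=rng)
print(new_id, links)
```

In the real protocol each of the k lookups is routed, which is where the $O(\frac{1}{k} \log^2 n)$ messages per link in the cost line come from.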
9 Re-linking Protocols etc.
- RE-LINKING
- $n_x$ = x's current estimate of n
- $n_x^{link}$ = x's estimate when its long-distance links were last established
- When $n_x$ and $n_x^{link}$ differ ⇒ stale estimate
- Re-link only when $n_x / n_x^{link}$ is not in the range [0.5, 2]
- Re-linking gains are marginal and the cost is high: $O(\log^2 n)$ messages
- LOOKAHEAD
- A node can maintain a list of its neighbors' neighbors
- Improves the choice of neighbor for routing queries
- No extra messages: piggyback on the keep-alives of the TCP link
- Cost: $O(k^2)$ space. The number of long links remains unchanged
- FAULT TOLERANCE
- Deletion of short links is more detrimental, as it leads to node isolation
- Make f copies of a node's content on the next f clockwise nodes
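Two of the rules above are simple enough to express directly: the re-linking test on the ratio of estimates, and the choice of the f clockwise successors that hold backup copies. The sketch assumes the range [0.5, 2] is inclusive and, for the replica helper, a global sorted list of ids; both function names are hypothetical.

```python
def should_relink(n_current: float, n_at_link_time: float) -> bool:
    """Re-link only when the ratio of estimates leaves the range [0.5, 2]."""
    ratio = n_current / n_at_link_time
    return not (0.5 <= ratio <= 2.0)


def replica_holders(sorted_ids: list[float], owner_index: int, f: int) -> list[float]:
    """Ids of the next f clockwise nodes, which keep copies of the owner's content."""
    size = len(sorted_ids)
    f = min(f, size - 1)  # cannot replicate onto more nodes than exist
    return [sorted_ids[(owner_index + j) % size] for j in range(1, f + 1)]


print(should_relink(250, 100))                        # estimate grew 2.5x
print(replica_holders([0.1, 0.4, 0.7, 0.9], 3, f=2))  # wraps around the ring
```

The wide tolerance band reflects the point made on the slide: re-linking costs $O(\log^2 n)$ messages and yields only marginal gains, so it is worth triggering only after the network size has changed substantially.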
10 Experimental Data
- SETUP
- Large DHT: $2^5$ to $2^{15}$ nodes in the simulated network
- Four kinds of test networks
- Static, Expanding, Expanding-Relink, Dynamic
Estimation Protocol Performance: the estimate improves when log(n) neighbors are used, but the impact on avg. latency is minimal (later)
11 Experimental Data
Routing Protocol Performance: Increasing the number of links beyond 2 has marginal benefits. Bidirectional routing is good (30% reduction in latency)
12 Experimental Data
- Lookahead Performance
- 1-Lookahead reduces avg. latency by 40% for small values of k.
- Also, it does not entail an increase in the number of long links per node.
- Neighbor lists are exchanged periodically, piggybacked on normal routing traffic or keep-alives
13 Experimental Data
- Fault-tolerance motivation
- Left: On deleting a random set of links (short and long), the fraction of successful lookups drops quite quickly; deletion of short links causes node isolation quickly
- Right: The impact of removing only long links is not as severe (only avg. latency goes up)
- Thus, only fortify short links. Make f copies of content in the clockwise direction.
14 Comparisons and Conclusions
- Conclusions
- For large DHTs ($2^5$ to $2^{15}$ nodes), Symphony outperforms others
- Avg. TCP links: 10; latency is about 8 hops
- Lower costs of Join/Leave compared to Chord etc.
- Number of neighbors not fixed at the outset. No backup links
15 Wrapping up
- Symphony...
- is a simple protocol for managing large DHTs
- supports a dynamic network of hosts with relatively short lifetimes
- scales well
- has low lookup latency
- has low maintenance cost
- requires few neighbors per node
- supports heterogeneity in nodes (run-time knobs)
- provides flexibility in design
Questions ?