Symphony: Distributed Hashing in a Small World


1
Symphony: Distributed Hashing in a Small World
  • Gurmeet Singh Manku
  • Mayank Bawa
  • Prabhakar Raghavan

Presented by Satpreet Singh
2
Motivation
  • GOAL: To maintain a large DHT over a WAN
  • DESIRED CHARACTERISTICS:
  • Scalability (work for a range of network sizes)
  • Stability (handle churn)
  • Performance (provide low-latency lookups and low
    maintenance costs under churn)
  • Flexibility (provide design knobs, preferably
    tunable at run time)
  • Simplicity (easy to understand, code, deploy)
  • SOLUTION: Symphony (fuses ideas from Chord and
    Kleinberg's Small World greedy routing
    algorithm/result)

3
Features/Advantages of Symphony
  • 1. Low state maintenance (consequence of low degree)
  • Fewer pings/keep-alives, less (ambient) control
    traffic
  • Distributed locking and coordination overhead
    confined to smaller sets of nodes
  • Shorter bootstrapping time when a node joins
  • Shorter recovery time when a node leaves
  • 2. Smooth out-degree vs. latency tradeoff
  • Only protocol that offers this tuning knob,
    even at run time
  • Out-degree is not fixed, either at run time or as a
    function of network size
  • 3. Flexibility and support for heterogeneity
  • Different nodes can have different numbers of
    links
  • 4. Fault tolerance
  • Only short links are bolstered; no backups for
    long links

4
Architecture Overview
  • Establish the keyspace as [0, 1) (wraps around a
    ring, as in Chord)
  • Every node manages the subrange from its own id to
    the next clockwise node's id ( equi-sized)
  • Objects hash to an m-bit hash key K, managed by
    the node whose subrange contains the real number K/2^m
  • 2 short links: one with each immediate neighbor
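
The key-to-manager mapping above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the helper names `key_to_point` and `manager_of`, the use of SHA-1, and the default m = 32 are all assumptions made for the example.

```python
import bisect
import hashlib


def key_to_point(obj: bytes, m: int = 32) -> float:
    """Hash an object to an m-bit key K and return K / 2**m, a point in [0, 1).

    SHA-1 and m = 32 are illustrative choices, not taken from the slides.
    """
    digest = hashlib.sha1(obj).digest()              # 160-bit digest
    k = int.from_bytes(digest, "big") >> (160 - m)   # keep the top m bits
    return k / 2**m


def manager_of(point: float, node_ids: list[float]) -> float:
    """Return the id of the node whose subrange [own id, next clockwise id)
    contains the given point on the unit ring."""
    ids = sorted(node_ids)
    # Index of the last id <= point. If the point lies before the smallest id,
    # the index is -1, which wraps around to the largest id on the ring.
    i = bisect.bisect_right(ids, point) - 1
    return ids[i]
```

For example, with node ids 0.1, 0.4 and 0.8, the point 0.5 falls in the subrange of node 0.4, while the point 0.05 wraps around to node 0.8.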

5
Architecture Overview
  • k (≥ 1) long links (uni-/bidirectional); for each one, a node
  • Draws a random number x from a probability
    distribution function (PDF)
  • Contacts the manager of x using a Routing Protocol
  • Establishes a link if the manager has fewer than
    2k incoming links; if not, it resamples the PDF
  • The PDF is a type of harmonic distribution (hence
    the name Symphony)
  • P_n(x) = 1/(x ln n) for x ∈ [1/n, 1],
    and 0 otherwise
  • n in the PDF is estimated using an Estimation Protocol
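
One way to draw from the harmonic PDF above is inverse-transform sampling: the CDF on [1/n, 1] is 1 + ln x / ln n, so a uniform u in [0, 1) maps to x = n^(u - 1). The sketch below assumes this sampling method (the slides do not say how the draw is implemented), and the function names are hypothetical.

```python
import math
import random


def harmonic_pdf(x: float, n: int) -> float:
    """P_n(x) = 1 / (x ln n) on [1/n, 1], and 0 elsewhere."""
    if 1.0 / n <= x <= 1.0:
        return 1.0 / (x * math.log(n))
    return 0.0


def draw_long_link_distance(n: int, rng: random.Random) -> float:
    """Draw x from P_n by inverting its CDF, F(x) = 1 + ln(x) / ln(n).

    Solving F(x) = u for a uniform u in [0, 1) gives x = n ** (u - 1),
    which always lies in [1/n, 1).
    """
    u = rng.random()
    return n ** (u - 1.0)
```

A node with id p would then route to the manager of (p + x) mod 1 to set up the link; that step is left out here because it depends on the routing protocol.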

6
Network Size Estimation Protocol
  • Goal: to estimate n, the current total number
    of nodes in the DHT
  • If
  • S is any set of s distinct nodes, and
  • X_s is the sum of segment lengths managed by them,
  • then the estimate is n ≈ s/X_s
  • All s nodes update their estimate
  • Experimentally, s = 3 was found good enough,
  • so simply use a node and its two immediate
    neighbors
  • Fact: the impact of increasing s on avg. latency is
    insignificant

x = length of arc; 1/x = estimate of n
(Idea from Viceroy)
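
The estimate itself is a one-line computation. A minimal sketch, assuming each node already knows the segment lengths of the s nodes involved (the function name `estimate_n` is made up for this example):

```python
def estimate_n(segment_lengths: list[float]) -> float:
    """Estimate the network size as s / X_s, where s is the number of sampled
    nodes and X_s is the total length of the segments they manage."""
    s = len(segment_lengths)
    x_s = sum(segment_lengths)
    if s == 0 or x_s <= 0.0:
        raise ValueError("need at least one node with a positive segment length")
    return s / x_s
```

With s = 3, a node would pass its own segment length plus those of its two immediate neighbors. Segments that are shorter than average push the estimate up, and longer ones push it down.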
7
Routing Protocol(s)
  • Kleinberg's Small World result: a message can be
    routed to any node by greedy routing in O(log² n)
    hops, in a construction where each node has one
    link to each of its 4 directional neighbors and
    a single long-distance link to a node chosen from
    a suitable PDF
  • To look up hash key x ∈ [0, 1), contact the
    manager of x
  • Unidirectional Routing Protocol: a node forwards a
    lookup for x along the (short or long) link that
    minimizes the clockwise distance to x
  • Bidirectional Routing Protocol: a node forwards a
    lookup for x along the (short or long) link that
    minimizes the absolute distance to x
  • In both cases, the expected path length in an
    n-node network with k = O(1) links is
    O((1/k) log² n) hops
  • Bidirectional routing and 1-Lookahead reduce
    latency by 30% and 40% respectively
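
The two greedy forwarding rules can be sketched as below. This is an illustrative simulation, not the protocol implementation: it assumes the whole link table is available in one dictionary (`links`, mapping each node id to the ids it links to), and it stops when no link gets closer to the target.

```python
def clockwise_distance(a: float, b: float) -> float:
    """Distance travelled going clockwise from a to b on the unit ring."""
    return (b - a) % 1.0


def absolute_distance(a: float, b: float) -> float:
    """Shorter of the two ways round the ring between a and b."""
    d = clockwise_distance(a, b)
    return min(d, 1.0 - d)


def route(start: float, target: float, links: dict[float, list[float]],
          bidirectional: bool = False) -> list[float]:
    """Greedily forward a lookup for `target`; return the list of nodes visited.

    Unidirectional mode minimizes clockwise distance to the target, so the
    walk ends at the target's manager as long as every node links to its
    clockwise successor. Bidirectional mode minimizes absolute distance and
    ends at the node nearest the target (the manager or its successor).
    """
    dist = absolute_distance if bidirectional else clockwise_distance
    path = [start]
    current = start
    while True:
        best = min(links[current], key=lambda nbr: dist(nbr, target))
        if dist(best, target) >= dist(current, target):
            return path  # no link improves on the current node
        current = best
        path.append(current)
```

Each hop strictly reduces the distance to the target, so the loop always terminates on a finite ring.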
8
Join/Leave Protocols
  • JOIN
  • The new node chooses its id x from [0, 1)
    uniformly at random
  • Using the routing protocol, it identifies the
    current manager of x
  • It then runs the estimation protocol using s = 3
  • x then uses P_n, with n set to its own estimate, to
    establish its long-distance links
  • Cost: k links × O((1/k) log² n) messages each =
    O(log² n) messages
  • LEAVE
  • All outgoing and incoming links to x's long-distance
    neighbors are snapped
  • Other nodes whose outgoing links to x were just
    broken reinstate those links with other nodes
  • The immediate neighbors of x establish
    short links between themselves
  • The successor of x initiates the estimation protocol
    over s = 3 neighbors
  • Cost: O(log² n) messages

9
Re-linking Protocols etc
  • RE-LINKING
  • n_x = x's current estimate of n
  • n_x^link = x's estimate when its long-distance links
    were last established
  • When n_x and n_x^link differ → stale estimate
  • Re-link only when n_x / n_x^link is not in the
    range [0.5, 2]
  • Re-linking gains are marginal and the cost is high:
    O(log² n) messages
  • LOOKAHEAD
  • A node can maintain a list of its neighbors' neighbors
  • Improves the choice of neighbor for routing queries
  • No extra messages: piggyback on keep-alives of
    the TCP link
  • Cost: O(k²) space; the number of long links remains
    unchanged
  • FAULT TOLERANCE
  • Deletion of short links is more detrimental, as it
    leads to node isolation
  • Make f copies of a node's content in the f next
    clockwise nodes
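
The re-linking rule from this slide reduces to a ratio test. A minimal sketch, with a hypothetical function name and the [0.5, 2] bounds taken from the slide:

```python
def should_relink(n_current: float, n_at_link_time: float) -> bool:
    """Return True when the current estimate of n has drifted outside
    [0.5, 2] times the estimate used when the long links were last set up."""
    ratio = n_current / n_at_link_time
    return not (0.5 <= ratio <= 2.0)
```

Under this rule a node that linked at an estimate of 1000 keeps its long links until its estimate falls below 500 or rises above 2000, which keeps the O(log² n) re-linking cost infrequent.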

10
Experimental Data
  • SETUP
  • Large DHT: 2^5 to 2^15 nodes simulated
  • Four kinds of test networks:
  • Static, Expanding, Expanding-Relink, and Dynamic

Estimation Protocol Performance: the estimate improves
with log(n) neighbors, but the impact on avg. latency
is minimal (shown later)
11
Experimental Data
Routing Protocol Performance: Increasing the number of
links beyond 2 has marginal benefits. Bidirectional
routing is good (30% reduction in latency)
12
Experimental Data
  • Lookahead Performance
  • 1-Lookahead reduces avg. latency by 40% for
    small values of k
  • It does not entail an increase in the number
    of long links per node
  • Neighbor lists are exchanged periodically,
    piggybacked on normal routing traffic or
    keep-alives

13
Experimental Data
  • Fault-tolerance motivation
  • Left: On deleting a random set of links (short
    and long), the fraction of successful lookups drops
    quickly, because deletion of short links soon
    causes node isolation
  • Right: The impact of removing only long links is not
    as severe (only avg. latency goes up)
  • Thus, fortify only short links: make f copies of
    content in the clockwise direction

14
Comparisons & Conclusions
  • Conclusions
  • For large DHTs (2^5 to 2^15 nodes), Symphony
    outperforms others
  • Avg. TCP links 10; latency is about 8 hops
  • Lower costs of Join/Leave compared to Chord etc.
  • Number of neighbors not fixed at outset; no
    backup links

15
Wrapping up
  • Symphony...
  • is a simple protocol for managing large DHTs
  • supports a dynamic network of hosts with
    relatively short lifetimes
  • scales well
  • has low lookup latency
  • has low maintenance cost
  • requires few neighbors per node
  • supports heterogeneity in nodes (run-time knobs)
  • provides flexibility in design

Questions?