1
Tightly Structured Peer to Peer Systems
  • A Scalable Content-Addressable Network (CAN)
  • Chord: A Scalable Peer-to-Peer Lookup Service
    for Internet Applications
  • Surendra Kumar Singhi

2
Background
  • Napster - Centralized index
  • Problem: doesn't scale well
  • Gnutella - Decentralized, but uses flooding
  • Problem: too much network traffic; even if better
    search algorithms like iterative deepening search
    (IDS) or Directed BFS are used, resource
    utilization is still poor

3
Solution? How to do Better Search?
  • Use hash tables.
  • The file name becomes a key.
  • The key maps to a value: the location of the
    file (see the sketch below).
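A minimal sketch of the idea (file names and locations are hypothetical), with a plain dict standing in for the distributed table:

```python
import hashlib

def key_for(filename: str) -> int:
    # Hash the file name into a fixed-size integer key.
    return int(hashlib.sha1(filename.encode()).hexdigest(), 16)

# A toy, single-machine stand-in for the distributed table:
# key -> value, where the value is the (hypothetical) file location.
table = {key_for("song.mp3"): "node-42.example.net:/files/song.mp3"}

print(table[key_for("song.mp3")])  # look up by file name alone
```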

4
Content Addressable Network
  • Basic Design
  • Routing
  • Construction
  • Departure, recovery and maintenance
  • Design Improvements
  • Multi-dimensional coordinate space
  • Realities
  • Overloading coordinate zones
  • Topologically Sensitive CAN construction
  • Multiple Hash functions
  • Uniform Partitioning
  • Caching and Replication

5
CAN Basic Design
  • A virtual d-dimensional coordinate space on a
    d-torus.
  • The coordinate space is partitioned among the
    nodes; each node owns its own zone.
  • A (K,V) pair is stored at the node whose zone
    contains the point P onto which the key K is
    deterministically hashed (sketch below).
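A sketch of the key-to-point mapping, assuming one hash evaluation per dimension (the paper only requires some uniform hash of keys onto the space):

```python
import hashlib

def key_to_point(key: str, d: int = 2) -> tuple:
    # Map a key deterministically onto a point of the unit d-torus by
    # hashing it once per dimension (an assumed scheme).
    coords = []
    for i in range(d):
        h = hashlib.sha1(f"{key}:{i}".encode()).hexdigest()
        coords.append(int(h, 16) / 16**40)  # 40 hex digits -> [0, 1)
    return tuple(coords)

print(key_to_point("song.mp3"))
```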

6
Routing in CAN
  • Each node keeps a routing table containing the
    IP address and virtual coordinate zone of each of
    its neighbors.
  • A node greedily forwards a message to the
    neighbor whose coordinates are closest to the
    destination point (sketched below).
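A sketch of one greedy forwarding step, assuming zones are summarized by their center points:

```python
def torus_distance(p, q):
    # Distance on the unit torus: per-axis wrap-around difference,
    # aggregated Euclidean-style (one common choice, assumed here).
    return sum(min(abs(a - b), 1 - abs(a - b)) ** 2
               for a, b in zip(p, q)) ** 0.5

def next_hop(neighbors, dest):
    # neighbors: {node_name: zone-center coordinates} - a simplified,
    # hypothetical routing-table representation.
    return min(neighbors, key=lambda n: torus_distance(neighbors[n], dest))

nbrs = {"A": (0.1, 0.9), "B": (0.6, 0.4)}
print(next_hop(nbrs, (0.7, 0.3)))  # "B" is closer to the destination
```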

7
CAN Construction
  • Find a node already in the CAN (e.g., through a
    bootstrap node).
  • Randomly choose a point P and route a join
    request to the node whose zone contains P.

8
  • The node whose zone contains P splits that zone
    in half, keeping one half and assigning the other
    to the new node.
  • The (K,V) pairs from the handed-over half are
    transferred to the new node.
  • The new node learns the IP addresses of its
    neighbors; the previous occupant updates its own
    neighbor set.
  • Finally, both nodes inform their neighbors of
    this reallocation of space (split sketch below).
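A sketch of the zone split, assuming axis-aligned zones and a split along the longest axis (the paper's actual rule cycles through dimensions in a fixed order):

```python
def split_zone(zone, point):
    # zone: ((lo, hi) per dimension) - a hypothetical axis-aligned
    # representation of a CAN zone. Split in half along the longest
    # axis; the joining node takes the half containing `point`.
    axis = max(range(len(zone)), key=lambda i: zone[i][1] - zone[i][0])
    lo, hi = zone[axis]
    mid = (lo + hi) / 2
    lower, upper = list(zone), list(zone)
    lower[axis], upper[axis] = (lo, mid), (mid, hi)
    if point[axis] < mid:
        return tuple(upper), tuple(lower)   # (kept, given to joiner)
    return tuple(lower), tuple(upper)

kept, given = split_zone(((0.0, 1.0), (0.0, 1.0)), (0.7, 0.2))
print(kept, given)  # kept=((0.0,0.5),...), given=((0.5,1.0),...)
```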

9
Node departure, recovery and maintenance
  • On departure, a node explicitly hands over its
    zone and the associated (K,V) database to one of
    its neighbors.
  • On failure, an immediate-takeover algorithm lets
    a neighbor claim the zone; the failed node's
    (key, value) pairs are lost.
  • Nodes that insert (K,V) pairs into the CAN
    therefore periodically refresh those entries.
  • Nodes send periodic update messages to their
    neighbors so that failures are detected.
  • A background zone-reassignment algorithm keeps
    the space from fragmenting over time.

10
Design Improvements
  • Goals: reduce the routing path length or the
    per-CAN-hop latency.
  • Tradeoff: improved routing performance and
    system robustness on one hand versus increased
    system complexity and per-node state on the
    other.

11
Multi-dimensional coordinate space
  • Increasing the number of dimensions reduces the
    routing path length.
  • For n nodes and d dimensions, the path length
    scales as O(d·n^(1/d)) hops (worked example
    below).
  • Fault tolerance also improves, since each node
    has more neighbors to route around.
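A quick back-of-the-envelope check of this scaling for roughly a million nodes, using the (d/4)·n^(1/d) average from the CAN paper's analysis:

```python
n = 2**20  # about a million nodes
for d in (2, 3, 5, 10):
    # Average CAN path length is (d/4) * n**(1/d) hops; the
    # O(d * n**(1/d)) shape is what matters here.
    print(d, round((d / 4) * n ** (1 / d), 1))
# -> 2 512.0, 3 76.2, 5 20.0, 10 10.0
```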

12
Realities - multiple coordinate spaces
  • Each node owns one zone per reality.
  • The hash table is replicated on every reality.
  • A node can reach distant portions of the space in
    one hop by switching realities, reducing the
    average path length.
  • Data availability and fault tolerance improve.

13
Overloading coordinate zone
  • Allow multiple nodes to share the same zone.
  • A node asks its neighbors for their peer lists
    and retains the peer with the lowest RTT as its
    neighbor for that zone.
  • The contents of the hash table may be divided or
    replicated among the peers of a zone.

14
Multiple hash functions
  • Use k different hash functions to map a single
    key onto k points in the coordinate space.
  • A query can then be sent to the k nodes in
    parallel, reducing average query latency (sketch
    below).
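A sketch deriving the k hash functions by salting a single hash with an index (an assumed construction):

```python
import hashlib

def points_for(key: str, k: int = 3, d: int = 2):
    # Derive k "different hash functions" by salting one hash with
    # an index j, yielding k points in the d-dimensional space.
    pts = []
    for j in range(k):
        coords = []
        for i in range(d):
            h = hashlib.sha1(f"{j}:{key}:{i}".encode()).hexdigest()
            coords.append(int(h, 16) / 16**40)
        pts.append(tuple(coords))
    return pts  # query the owners of all k points in parallel

print(points_for("song.mp3"))
```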

15
Topologically Sensitive Construction of CAN Network
  • A well-known set of machines (landmarks) is
    assumed.
  • Every node measures its RTT to each landmark and
    sorts the landmarks in order of increasing RTT.
  • A node joins the CAN at the portion of the
    coordinate space associated with its landmark
    ordering, so topologically close nodes land in
    nearby zones (sketch below).
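A sketch of the landmark ordering; the landmark names and RTT values are hypothetical:

```python
def landmark_bin(rtts: dict) -> tuple:
    # rtts: measured round-trip times to the well-known landmarks.
    # The ordering itself names one of m! bins of the coordinate
    # space in which the node should join.
    return tuple(sorted(rtts, key=rtts.get))

print(landmark_bin({"l1": 12.0, "l2": 85.0, "l3": 40.0}))
# -> ('l1', 'l3', 'l2'): join in the space portion for this ordering
```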

16
More Uniform Partitioning
  • The node receiving the join message also knows
    the zone coordinates of its neighbors.
  • Among itself and those neighbors, the zone with
    the largest volume is split.
  • Is it true load balancing? (Equal zone volume
    does not imply equal request load.)

17
Caching and Replication
  • Each CAN node maintains a cache of the data keys
    it has recently accessed.
  • Before forwarding a request for a data key, a
    node checks whether the requested key is in its
    own cache.
  • When a node finds itself overloaded with requests
    for a particular data key, it can replicate that
    key at its neighbors.
  • Cached and replicated keys must carry a TTL field
    and eventually expire from the cache.

18
Design Summary
19
(No Transcript)
20
Few Observations
  • Very good results; the design scales well.
  • Denial-of-service attacks?
  • The topologically sensitive construction is not
    satisfactory.
  • Keyword searching?

21
Chord: Peer-to-Peer Lookup Service
  • Base Protocol
  • Consistent Hashing
  • Key Location
  • Node Joining the Network
  • Design Improvements
  • Simulation and Experimental Results
  • Load Balancing
  • Path Length
  • Node Failures

22
Base Chord Protocol
  • Built on consistent hashing.
  • Chord improves the scalability of consistent
    hashing: each node needs routing information
    about only a few other nodes.

23
Consistent Hashing
  • Each node and key is assigned an m-bit identifier
    using a hash function (e.g., SHA-1).
  • The size of the identifier space is 2^m, arranged
    as a circle of identifiers modulo 2^m.
  • Key k is assigned to the first node whose
    identifier is equal to or follows k in the
    identifier space: successor(k) (sketch below).
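A sketch of successor-based key assignment on a small identifier circle (m = 16 here for readability; Chord uses SHA-1, so m = 160 in the paper):

```python
import hashlib
from bisect import bisect_left

m = 16  # identifier bits (small for illustration)

def ident(name: str) -> int:
    # m-bit identifier derived from a SHA-1 hash.
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % 2**m

def successor(key_id: int, node_ids: list) -> int:
    # First node whose id is equal to or follows key_id on the ring.
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]  # wrap around the circle

nodes = sorted(ident(f"node-{i}") for i in range(8))
print(successor(ident("song.mp3"), nodes))
```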

24
Scalable Key Location
  • A very simple algorithm would be for each node to
    know only its successor, but that doesn't scale
    (lookups take O(N) hops).
  • Instead, each node maintains a finger table.
  • The ith entry in the table of node n contains the
    identity of the first node s that succeeds n by
    at least 2^(i-1) on the identifier circle (sketch
    below).
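Continuing the previous sketch, the finger table is just m successor lookups at power-of-two offsets:

```python
def finger_table(n: int, node_ids: list, m: int = 16):
    # finger[i] = successor((n + 2^(i-1)) mod 2^m) for i = 1..m,
    # reusing successor() from the consistent-hashing sketch above.
    return [successor((n + 2**(i - 1)) % 2**m, node_ids)
            for i in range(1, m + 1)]

print(finger_table(nodes[0], nodes))  # m entries; later ones jump far
```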

25
  • If a node does not know the successor of a key k,
    it forwards the query to the node in its finger
    table whose identifier most immediately precedes
    k; each such hop at least halves the remaining
    distance to k.
  • With high probability, the number of nodes that
    must be contacted to find a successor in an
    N-node network is O(log N) (lookup sketch below).
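A simplified in-memory lookup sketch, continuing the previous snippets (a real node would make remote calls where this reads dictionaries):

```python
def between(x, a, b):
    # True if identifier x lies strictly between a and b on the ring.
    return (a < x < b) if a < b else (x > a or x < b)

def find_successor(n, key_id, succ, fingers):
    # succ: {node: its successor}; fingers: {node: its finger list}.
    while not (between(key_id, n, succ[n]) or key_id == succ[n]):
        # Closest preceding finger: the finger between n and key_id
        # that is nearest to key_id (largest clockwise jump from n).
        cands = [f for f in fingers[n] if between(f, n, key_id)]
        n = max(cands, key=lambda f: (f - n) % 2**m) if cands else succ[n]
    return succ[n]
```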

26
Node Joins the Network
  • A new node n learns the identity of an existing
    Chord node n' by some external means.
  • Node n initializes its predecessor and fingers by
    asking n' to look them up.
  • Node n is then entered into the finger tables of
    existing nodes.
  • Finally, responsibility is moved to n for all
    keys of which n is now the successor (join sketch
    below).
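A deliberately minimal join sketch on top of the previous snippets; a real join also fills the full finger table via lookups and transfers the keys n now owns:

```python
def join(n, nprime, succ, pred, fingers):
    # The new node n asks the known node n' (nprime) for n's
    # successor and adopts it; everything else is refined later.
    succ[n] = find_successor(nprime, n, succ, fingers)
    pred[n] = None
    fingers[n] = [succ[n]]  # placeholder; built out by later lookups
```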

27
(No Transcript)
28
Improvements to the Algorithm
  • A stabilization protocol keeps each node's
    successor pointer up to date.
  • When node n runs stabilize, it asks its successor
    for that successor's predecessor p, and decides
    whether p should be n's successor instead (sketch
    below).
  • When a node n fails, nodes whose finger tables
    include n must find n's successor.
  • Each node therefore maintains a successor list of
    its r nearest successors on the Chord ring.
  • If a node notices that its successor has failed,
    it replaces it with the first live entry in its
    successor list.
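A sketch of the stabilize/notify pair, reusing between() from the lookup sketch; dictionaries stand in for each node's remote state:

```python
def stabilize(n, succ, pred):
    # Run periodically by every node.
    p = pred[succ[n]]              # ask our successor for its predecessor
    if p is not None and between(p, n, succ[n]):
        succ[n] = p                # a newly joined node slid in between
    notify(n, succ[n], pred)

def notify(n, s, pred):
    # n tells s: "I might be your predecessor"; s updates if so.
    if pred[s] is None or between(n, pred[s], s):
        pred[s] = n
```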

29
Simulation and Experimental Results
  • Load Balancing

30
Virtual Nodes
31
Path Length
32
Simultaneous Node Failures
33
Experimental Results
34
Few Observations
  • Correctness and performance are theoretically
    proven, even in the face of concurrent node
    arrivals and departures.
  • Unfortunately, the overlay has no connection with
    the physical network topology.
  • A few malicious or buggy Chord participants could
    present an incorrect view of the ring.

35
Questions?