Title: Tightly Structured Peer-to-Peer Systems
1. Tightly Structured Peer-to-Peer Systems
- A Scalable Content-Addressable Network (CAN)
- Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications
- Surendra Kumar Singhi
2. Background
- Napster: centralized index.
- Problem: does not scale well.
- Gnutella: decentralized, but relies on flooding.
- Problem: too much network traffic; even if better search algorithms such as IDS or Directed BFS are used, resource utilization is still poor.
3. Solution? How to Do Better Search?
- Use hash tables.
- The file name becomes a key.
- The key maps to a value: the location of the file (a minimal sketch follows below).
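A minimal sketch of the hash-table idea in Python (illustrative only: the index here is a local dict, and the file name and address are made-up examples, not anything from either paper):

import hashlib

def key_for(file_name: str) -> int:
    # Hash the file name into a fixed-size numeric key.
    digest = hashlib.sha1(file_name.encode()).hexdigest()
    return int(digest, 16)

index = {}                                   # key -> list of file locations
index.setdefault(key_for("song.mp3"), []).append("10.0.0.5:6346")

def lookup(file_name: str):
    # Return the stored locations for a file name, if any.
    return index.get(key_for(file_name), [])

print(lookup("song.mp3"))                    # ['10.0.0.5:6346']

The structured P2P systems below distribute exactly this kind of index across many nodes.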
4. Content Addressable Network
- Basic Design
- Routing
- Construction
- Departure, recovery and maintenance
- Design Improvements
- Multi-dimensional coordinate space
- Realities
- Overloading coordinate zones
- Topologically Sensitive CAN construction
- Multiple Hash functions
- Uniform Partitioning
- Caching and Replication
5. CAN Basic Design
- A virtual d-dimensional coordinate space on a d-torus.
- The coordinate space is partitioned among the nodes; each node owns its own zone.
- A (K,V) pair is stored at the node whose zone contains the point P that key K maps to in the coordinate space (see the sketch below).
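A minimal sketch of mapping a key onto a point in the coordinate space, assuming a 2-dimensional unit torus and SHA-1 as the hash; CAN itself only requires some deterministic, uniform mapping of keys to points:

import hashlib

D = 2  # number of dimensions (assumed)

def key_to_point(key: str):
    # Derive one coordinate in [0, 1) per dimension from independently salted hashes.
    point = []
    for i in range(D):
        h = hashlib.sha1(f"{i}:{key}".encode()).digest()
        point.append(int.from_bytes(h[:8], "big") / 2**64)
    return tuple(point)

print(key_to_point("song.mp3"))   # e.g. (0.42..., 0.87...)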
6. Routing in CAN
- Each node keeps a routing table containing the IP address and virtual coordinate zone of each of its neighbors.
- A node greedily forwards a message to the neighbor whose coordinates are closest to the destination coordinates (see the sketch below).
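A minimal sketch of one greedy routing step, assuming a 2-d unit torus and that each neighbor is represented by the center of its zone; the message transport itself is omitted:

D = 2

def torus_distance(a, b):
    # Per-dimension wrap-around distance, combined as Euclidean distance.
    total = 0.0
    for x, y in zip(a, b):
        d = abs(x - y)
        d = min(d, 1.0 - d)          # wrap around the torus
        total += d * d
    return total ** 0.5

def next_hop(neighbors, destination):
    # neighbors: dict mapping neighbor address -> center point of its zone.
    # Forward to the neighbor whose zone center is closest to the destination.
    return min(neighbors, key=lambda n: torus_distance(neighbors[n], destination))

neighbors = {"10.0.0.5": (0.1, 0.8), "10.0.0.7": (0.6, 0.4)}
print(next_hop(neighbors, (0.7, 0.3)))   # 10.0.0.7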
7. CAN Construction
- Find a node already in the CAN.
- Randomly choose a point P and send a join request toward it.
8. CAN Construction (contd.)
- The node whose zone contains P splits its zone in half.
- The (K,V) pairs from the halved zone are handed over to the new node.
- The new node learns the IP addresses of its neighbors; the previous occupant also updates its neighbor set.
- Finally, the neighbors are informed of this reallocation of space (see the split sketch below).
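A minimal sketch of the zone split, assuming zones are per-dimension [lo, hi) intervals; splitting along the longest dimension and giving the joining node the half containing its chosen point P are simplifying assumptions, and the key handover and neighbor updates are left out:

def split_zone(zone, join_point):
    # zone: list of (lo, hi) intervals, one per dimension.
    dim = max(range(len(zone)), key=lambda i: zone[i][1] - zone[i][0])
    lo, hi = zone[dim]
    mid = (lo + hi) / 2.0
    lower = list(zone); lower[dim] = (lo, mid)
    upper = list(zone); upper[dim] = (mid, hi)
    new_half = upper if join_point[dim] >= mid else lower   # assumption: new node gets P's half
    old_half = lower if new_half is upper else upper
    return old_half, new_half

old_zone = [(0.0, 1.0), (0.0, 0.5)]
print(split_zone(old_zone, (0.8, 0.2)))
# ([(0.0, 0.5), (0.0, 0.5)], [(0.5, 1.0), (0.0, 0.5)])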
9. Node departure, recovery and maintenance
- On a clean departure, a node explicitly hands over its zone and the associated (K,V) database to one of its neighbors.
- On failure, the immediate-takeover algorithm lets a neighbor claim the zone; the (key, value) pairs stored there are lost.
- Nodes that entered (K,V) pairs into the CAN therefore periodically refresh those entries.
- Nodes send periodic update messages to their neighbors.
- A background zone-reassignment algorithm keeps the space from becoming fragmented.
10. Design Improvements
- Goal: reduce the routing path length or the per-CAN-hop latency.
- Tradeoff: improved routing performance and system robustness on one hand versus system complexity and increased per-node state on the other.
11. Multi-dimensional coordinate space
- Increasing the number of dimensions reduces the routing path length.
- For n nodes and d dimensions the path length scales as O(d · n^(1/d)) (see the short computation below).
- Fault tolerance also improves, since each node has more neighbors to route around failures.
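To get a feel for the O(d · n^(1/d)) scaling, a short back-of-the-envelope computation (illustrative only; constant factors are ignored):

# Compare d * n**(1/d) for a fixed network size to see why adding
# dimensions shortens CAN routes.
n = 1_000_000
for d in (2, 3, 4, 6, 10):
    print(d, round(d * n ** (1 / d), 1))
# d=2 gives ~2000 on this scale, d=10 gives ~40: higher dimensions, shorter paths.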
12. Realities: multiple coordinate spaces
- Each node owns one zone per reality.
- The hash table is replicated in every reality.
- A node can reach distant portions of the space in one hop by switching realities, which reduces the average path length.
- Data availability and fault tolerance improve.
13. Overloading coordinate zones
- Allow multiple nodes to share the same zone.
- A node asks the nodes in neighboring zones for their lists of peers and retains the peer with the lowest RTT as its neighbor.
- The contents of the hash table may be divided or replicated among the peers sharing a zone.
14. Multiple hash functions
- Use k different hash functions to map a single key onto k points in the coordinate space.
- A query can then be sent to the k owning nodes in parallel, reducing the average query latency (see the sketch below).
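A minimal sketch of deriving the k points for one key by salting the hash; the parallel queries themselves are omitted, and the values of d and k are arbitrary choices for illustration:

import hashlib

D, K = 2, 3   # dimensions and number of hash functions (assumed values)

def key_to_points(key: str):
    # One point per hash function: salt the hash with the function index.
    points = []
    for k in range(K):
        coords = []
        for i in range(D):
            h = hashlib.sha1(f"{k}:{i}:{key}".encode()).digest()
            coords.append(int.from_bytes(h[:8], "big") / 2**64)
        points.append(tuple(coords))
    return points

for p in key_to_points("song.mp3"):
    print(p)              # three distinct points, hence three candidate owners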
15. Topologically Sensitive Construction of the CAN Network
- A well-known set of machines (landmarks) is used.
- Every node measures its RTT to each landmark and orders the landmarks by increasing RTT.
- The node joins the CAN in the portion of the coordinate space associated with its landmark ordering (see the binning sketch below).
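A minimal sketch of the landmark-binning idea, with made-up landmark names and placeholder RTT values; a real node would measure the RTTs itself:

from itertools import permutations

LANDMARKS = ["l1.example.org", "l2.example.org", "l3.example.org"]
BINS = list(permutations(range(len(LANDMARKS))))   # 3! = 6 possible orderings

def landmark_bin(rtts):
    # rtts: measured RTT (ms) to each landmark, in LANDMARKS order.
    # Nodes with the same ordering land in the same portion of the space.
    ordering = tuple(sorted(range(len(rtts)), key=lambda i: rtts[i]))
    return BINS.index(ordering)     # index of the coordinate-space portion

print(landmark_bin([42.0, 17.5, 88.3]))   # 2, the bin for ordering (1, 0, 2)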
16. More Uniform Partitioning
- The node receiving the join message also knows the zone coordinates of its neighbors.
- The zone with the largest volume among them is the one that is split.
- Is this true load balancing? (It equalizes zone volume, not necessarily request load.)
17. Caching and Replication
- A CAN node maintains a cache of the data keys it has recently accessed.
- Before forwarding a request for a data key, it checks whether the requested key is in its own cache.
- When a node finds itself overloaded with requests for a particular data key, it can replicate that key at its neighbors.
- Cached and replicated keys must carry a TTL field and should eventually expire from the cache (see the cache sketch below).
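A minimal sketch of a TTL-based cache sitting in front of the forwarding step; the TTL value and the forward() callback are assumptions for illustration:

import time

TTL_SECONDS = 300.0
cache = {}          # key -> (value, expiry_time)

def cached_lookup(key, forward):
    # Serve from cache if present and fresh; otherwise forward the request
    # toward the key's owner and remember the answer.
    entry = cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]
    value = forward(key)                      # route toward the owner node
    cache[key] = (value, time.time() + TTL_SECONDS)
    return value

print(cached_lookup("song.mp3", lambda k: "10.0.0.5:6346"))  # forwarded once
print(cached_lookup("song.mp3", lambda k: "unused"))         # served from cache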
18. Design Summary
19. (No transcript)
20. A Few Observations
- Very good results; the design scales well.
- What about denial-of-service attacks?
- The topologically sensitive construction is not entirely satisfactory.
- What about keyword searching?
21. Chord Peer-to-Peer Lookup Service
- Base Protocol
- Consistent Hashing
- Key Location
- Node joining the network.
- Design Improvements
- Simulation Experimental Results
- Load Balancing
- Path Length
- Node Failures
22. Base Chord Protocol
- Uses consistent hashing.
- Improves the scalability of consistent hashing: a node keeps routing state for only O(log N) other nodes instead of all of them.
23. Consistent Hashing
- Each node and each key is assigned an m-bit identifier using a hash function.
- The size of the identifier space is 2^m.
- Key k is assigned to the first node whose identifier is equal to or follows k in the identifier space; that node is the successor of k (see the sketch below).
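A minimal sketch of consistent hashing on the identifier circle, using m = 16 for readability (Chord itself uses SHA-1 with a much larger m); the node names are made up:

import hashlib

M = 16
SPACE = 2 ** M

def chord_id(name: str) -> int:
    # m-bit identifier for a node name or a key.
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % SPACE

def successor(key_id: int, node_ids):
    # First node identifier equal to or following key_id on the circle.
    nodes = sorted(node_ids)
    for n in nodes:
        if n >= key_id:
            return n
    return nodes[0]                      # wrap around the circle

nodes = [chord_id(f"node{i}") for i in range(8)]
k = chord_id("song.mp3")
print(k, "->", successor(k, nodes))      # key id and the node responsible for it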
24. Scalable Key Location
- A very simple algorithm would be for each node to know only its successor, but then a lookup may have to traverse all N nodes, which does not scale.
- Instead, each node maintains a finger table.
- The ith entry in the table of node n contains the identity of the first node s that succeeds n by at least 2^(i-1) on the identifier circle.
25. Scalable Key Location (contd.)
- If a node does not know the successor of a key k, it looks in its finger table for the node whose identifier most immediately precedes k and asks that node.
- With high probability, the number of nodes that must be contacted to find a successor in an N-node network is O(log N) (see the lookup sketch below).
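A minimal sketch of finger-table lookup with m = 16, repeating the chord_id/successor helpers from the previous sketch so it runs on its own; all "remote" calls are simulated with in-memory data, so this only illustrates the routing logic, not the real RPC protocol:

import hashlib

M = 16
SPACE = 2 ** M

def chord_id(name: str) -> int:
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % SPACE

def successor(i: int, node_ids):
    nodes = sorted(node_ids)
    for n in nodes:
        if n >= i % SPACE:
            return n
    return nodes[0]

def in_interval(x, a, b):
    # True if x lies in the half-open circular interval (a, b].
    if a < b:
        return a < x <= b
    return x > a or x <= b

def build_fingers(n, node_ids):
    # ith finger of n: successor of (n + 2**(i-1)) mod 2**m, for i = 1..m.
    return [successor((n + 2 ** (i - 1)) % SPACE, node_ids) for i in range(1, M + 1)]

def find_successor(n, key_id, fingers, node_ids, hops=0):
    succ = successor(n + 1, node_ids)                 # n's immediate successor
    if in_interval(key_id, n, succ):
        return succ, hops
    for f in reversed(fingers[n]):                    # closest preceding finger
        if in_interval(f, n, key_id) and f != key_id:
            return find_successor(f, key_id, fingers, node_ids, hops + 1)
    return succ, hops

node_ids = sorted(chord_id(f"node{i}") for i in range(32))
fingers = {n: build_fingers(n, node_ids) for n in node_ids}
key = chord_id("song.mp3")
print(find_successor(node_ids[0], key, fingers, node_ids))   # (owner id, hop count)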
26. A Node Joins the Network
- The new node n learns the identity of an existing Chord node n' by some external mechanism.
- Node n learns its predecessor and its fingers by asking n' to look them up.
- Node n needs to be entered into the finger tables of existing nodes.
- Responsibility is moved to n for all the keys for which n is now the successor.
27. (No transcript)
28. Improvements to the Algorithm
- A stabilization protocol keeps each node's successor pointer up to date.
- When a node n runs stabilize, it asks its successor for the successor's predecessor p and decides whether p should be n's successor instead.
- When a node n fails, nodes whose finger tables include n must find n's successor.
- Each node maintains a successor list of its r nearest successors on the Chord ring.
- If a node notices that its successor has failed, it replaces it with the first live entry in its successor list (see the stabilize sketch below).
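A minimal sketch of the stabilize/notify idea, with nodes as in-memory objects and small integer identifiers; the three-node example and the fixed number of rounds are illustrative, and timers, failures and finger maintenance are left out:

def between(x, a, b):
    # True if x lies in the open circular interval (a, b).
    if a < b:
        return a < x < b
    return x > a or x < b

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def stabilize(self):
        # Ask the successor for its predecessor; adopt it if it sits between
        # us and our current successor, then notify the successor about us.
        p = self.successor.predecessor
        if p is not None and between(p.id, self.id, self.successor.id):
            self.successor = p
        self.successor.notify(self)

    def notify(self, candidate):
        # candidate thinks it might be our predecessor.
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

# Node 42 joins a ring of {10, 80} knowing only node 80; repeated stabilize
# rounds repair the successor/predecessor pointers.
a, b, c = Node(10), Node(80), Node(42)
a.successor, b.successor = b, a
a.predecessor, b.predecessor = b, a
c.successor = b
for _ in range(3):
    for node in (a, b, c):
        node.stabilize()
print(a.successor.id, c.successor.id, b.predecessor.id)   # 42 80 42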
29. Simulation Experimental Results
30. Virtual Nodes
31. Path Length
32. Simultaneous Node Failures
33. Experimental Results
34. A Few Observations
- Correctness and performance are proved theoretically, even in the face of concurrent node arrivals and departures.
- Unfortunately, the overlay has no connection with the underlying physical topology.
- A few malicious or buggy Chord participants could present an incorrect view of the ring.
35. Questions?