Search and Replication in Unstructured Peer-to-Peer Networks - PowerPoint PPT Presentation

Slides: 29

Provided by: antoniade

1
Search and Replication in Unstructured
Peer-to-Peer Networks

Pei Cao, Christine Lv, Edith Cohen, Kai Li and
Scott Shenker
ICS 2002

2
Outline

Brief survey of P2P architectures
Evaluation Methodology
Search Methods
Replication
Conclusions

3
Peer-to-Peer Networks

Peers are connected by an overlay network.
Users cooperate to share files (e.g., music,
videos, etc.)
Dynamic: nodes join or leave frequently

4
P2P Network Architectures I

Centralized
Use of a central directory server (CDS)
Peers query the CDS to find other peers that
hold the desired object
Pros: very efficient
Cons: scales poorly,
single point of failure

5
P2P Network Architectures II

Decentralized: no central directory server
Structured
P2P network topology is tightly controlled
Files are placed at specified locations
Unstructured
No control over network topology or file placement

6
P2P Network Architectures III

Decentralized but Structured
Loosely structured
Placement of files is based on hints
Tightly structured
The structure of the P2P network and the
file placement are precisely specified
Use of a distributed hash table
Pros: efficient satisfaction of queries,
good scaling
Cons: no proof that it works

7
P2P Network Architectures IV

Decentralized and Unstructured
Placement of files is not based on topology
knowledge
Finding files:
a node queries its neighbors (usually by flooding)
Pros: extremely resilient to network changes
Cons: extremely unscalable,
generates large loads

8
Evaluation Methodology I

Terminology
Network topology:
the graph formed by the nodes in the network at a given instant
Query distribution:
frequency of lookups for each file
Replication distribution:
percentage of nodes that hold a particular file

9
Evaluation Methodology II

Network Topologies
Power-Law Random Graph (PLRG)
Node degree: max 1746, median 1, average 4.46
Normal Random Graph (Random)
Average and median node degree are both 4
Gnutella graph (Gnutella)
Oct 2000 snapshot
Node degree: max 136, median 2, average 5.5
Two-dimensional Grid
100 x 100 = 10000 nodes

10
Evaluation Methodology III

Object query distribution $q_i$
Uniform
Zipf-like
Object replication density distribution $r_i$
Uniform
Proportional: $r_i \propto q_i$
Square-root: $r_i \propto \sqrt{q_i}$
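
The three replication distributions can be compared numerically. The following is a minimal Python sketch, not code from the presentation: the function names, the Zipf exponent `alpha`, and the total copy budget are illustrative assumptions, and copy counts are left fractional.

```python
import math

def zipf_like(m, alpha=1.0):
    """Query rates q_i proportional to 1 / i**alpha, normalized to sum to 1.

    alpha is an assumed exponent; the slides only say "Zipf-like".
    """
    weights = [1.0 / (i ** alpha) for i in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def replication(q, total_copies, strategy):
    """Split total_copies among objects; returns fractional copy counts r_i."""
    if strategy == "uniform":
        weights = [1.0] * len(q)
    elif strategy == "proportional":
        weights = list(q)
    elif strategy == "square-root":
        weights = [math.sqrt(x) for x in q]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    total = sum(weights)
    return [total_copies * w / total for w in weights]

# Hypothetical setup: 100 objects, 1000 copies to distribute.
q = zipf_like(100, alpha=1.2)
r_uniform = replication(q, 1000, "uniform")
r_proportional = replication(q, 1000, "proportional")
r_square_root = replication(q, 1000, "square-root")
```

For a popular object, the square-root allocation gives fewer copies than proportional but more than uniform; for a rare object the order is reversed.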

11
Evaluation Methodology IV

Metrics
User aspects
Pr(success)
Number of hops
Load aspects
Average number of messages per node
Number of nodes visited
Peak number of messages

12
Limitation of Flooding I

Gnutella uses a TTL (time-to-live) to limit the number of hops a query travels
Problem: it is hard to choose the TTL
For objects that are widely present in the
network, small TTLs suffice
For objects that are rare in the network, large
TTLs are necessary
The number of query messages grows exponentially as
the TTL grows
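
The trade-off in choosing a TTL can be seen by counting messages in a small simulation. This is a minimal Python sketch under stated assumptions: the graph is an adjacency dictionary, a node forwards a query only the first time it sees it, and it never sends the query back to the neighbor it came from. The `flood` function and the 4-node ring are illustrative, not taken from Gnutella.

```python
from collections import deque

def flood(graph, source, ttl):
    """Flood a query from source for at most ttl hops.

    Returns (nodes_reached, messages_sent). Duplicate messages are
    counted as sent but are not forwarded again by the receiver.
    """
    seen = {source}
    queue = deque([(source, None, ttl)])
    messages = 0
    while queue:
        node, parent, remaining = queue.popleft()
        if remaining == 0:
            continue  # TTL exhausted, do not forward
        for neighbor in graph[node]:
            if neighbor == parent:
                continue  # never send back to the sender
            messages += 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, node, remaining - 1))
    return len(seen), messages

# Hypothetical 4-node ring: 0 - 1 - 2 - 3 - 0
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
for ttl in (1, 2, 3):
    reached, messages = flood(ring, 0, ttl)
    duplicates = messages - (reached - 1)
    print(ttl, reached, messages, duplicates)
```

Once every node has been reached, each additional hop of TTL only adds duplicate messages.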

13
Limitation of Flooding II

A node may receive the same message more than once
Need for duplicate-detection mechanisms
Even so, duplication increases as the TTL increases in
flooding

14
Limitation of Flooding Conclusion

Flooding increases per-node overhead
Need for more scalable search methods
Expanding Ring
Random Walks

15
Expanding Ring

Adaptively adjust the TTL
Multiple floods: start with TTL = 1 and increment the TTL
by 2 each time until the search succeeds

Still produces duplicate messages
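
The expanding-ring procedure can be sketched as a loop around a depth-limited flood. This is an illustrative Python sketch, not the authors' simulator: `flood_search`, `expanding_ring`, the `max_ttl` cutoff, and the path graph are assumptions, and for simplicity each node forwards to all of its neighbors, including the one it heard from.

```python
def flood_search(graph, source, holders, ttl):
    """One flood of depth ttl. Returns (found, messages_sent)."""
    seen = {source}
    frontier = [source]
    messages = 0
    found = source in holders
    for _ in range(ttl):
        next_frontier = []
        for node in frontier:
            for neighbor in graph[node]:
                messages += 1  # duplicates are counted as well
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
                    if neighbor in holders:
                        found = True
        frontier = next_frontier
    return found, messages

def expanding_ring(graph, source, holders, max_ttl=9, step=2):
    """Repeat floods with TTL = 1, 3, 5, ... until the object is found.

    Returns (ttl_that_succeeded or None, total_messages over all floods).
    """
    total = 0
    ttl = 1
    while ttl <= max_ttl:
        found, messages = flood_search(graph, source, holders, ttl)
        total += messages
        if found:
            return ttl, total
        ttl += step
    return None, total

# Hypothetical path graph: 0 - 1 - 2 - 3 - 4, object held by node 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
result = expanding_ring(path, 0, {3})
```

Each new flood restarts from the source, so nodes close to the originator process the same query again; this is the repeated work the slide refers to.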
16
Random Walk

Simple random walk
Takes too long to find anything
Multiple-walker random walk
K walkers, each walking T steps, visit as
many nodes as 1 walker walking KT steps
More messages, hence more overhead
When to terminate the search
TTL
Checking: walkers check back with the query originator once
every C steps
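
A multiple-walker random walk that uses both termination rules can be sketched as follows. This is a simplified Python model, not the presenters' implementation: the function name, the default values of `k`, `check_every` and `ttl`, and the assumption that a hit is known to the originator immediately are all illustrative.

```python
import random

def k_walker_search(graph, source, holders, k=4, check_every=4, ttl=64, seed=0):
    """Simulate k random walkers started at source.

    A walker stops when it lands on a node holding the object, when a
    periodic check with the originator shows the query is already
    satisfied, or when the TTL runs out.
    Returns (satisfied, walk_messages, check_messages).
    """
    rng = random.Random(seed)
    satisfied = source in holders
    if satisfied:
        return True, 0, 0
    positions = [source] * k
    active = [True] * k
    walk_messages = 0
    check_messages = 0
    for step in range(1, ttl + 1):
        if not any(active):
            break
        for i in range(k):
            if not active[i]:
                continue
            positions[i] = rng.choice(graph[positions[i]])
            walk_messages += 1
            if positions[i] in holders:
                satisfied = True   # assumed to be reported at once
                active[i] = False
            elif step % check_every == 0:
                check_messages += 1  # walker contacts the originator
                if satisfied:
                    active[i] = False
    return satisfied, walk_messages, check_messages

# Hypothetical two-node graph where node 1 holds the object.
pair = {0: [1], 1: [0]}
outcome = k_walker_search(pair, 0, {1}, k=3)
```

A smaller `check_every` stops the remaining walkers sooner after a hit but costs more check messages; a larger value does the opposite.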

17
Search Traffic Comparison
18
Search Delay Comparison
19
Lessons Learned about Search Methods

Key: cover the right number of nodes as quickly
as possible and with as little overhead as
possible
Pay attention to:
Adaptive termination
Minimizing message duplication
Small expansion in each step

20
Replication

In unstructured P2P systems, search success is
essentially about coverage: visiting enough nodes
to find the object, so replication density matters
Goal: minimize the average search size (number of
probes until the query is satisfied)
Theoretical optimum: copy everything everywhere
But node storage is limited

21
Replication Strategies

Uniform Replication
$p_i = 1/m$
Simple: resources are divided equally
Proportional Replication
$p_i = q_i$
Fair: resources per item are proportional to demand
Reflects current P2P practices

22
Square-Root Replication

$p_i$ is proportional to $\sqrt{q_i}$
Lies in between Uniform and Proportional
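
The three strategies can be compared on average search size with a short calculation. This Python sketch relies on a modeling assumption that is not stated on the slides: probes hit random nodes, so the expected number of probes for object i is 1 / (rho * p_i), where p_i is the fraction of all stored copies given to object i and rho is a hypothetical per-node storage factor. The names `normalize` and `average_search_size`, the object count m = 50, and the Zipf-like query rates are illustrative.

```python
import math

def normalize(weights):
    """Scale weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def average_search_size(q, p, rho=1.0):
    """Query-weighted mean of the assumed per-object search size 1/(rho*p_i)."""
    return sum(qi / (rho * pi) for qi, pi in zip(q, p))

m = 50  # assumed number of distinct objects
q = normalize([1.0 / i for i in range(1, m + 1)])  # Zipf-like query rates

allocations = {
    "uniform": [1.0 / m] * m,
    "proportional": list(q),
    "square-root": normalize([math.sqrt(qi) for qi in q]),
}
sizes = {name: average_search_size(q, p) for name, p in allocations.items()}
```

Under this model, Uniform and Proportional give the same average search size (m / rho), while the square-root allocation gives a smaller value, (sum of sqrt(q_i))^2 / rho.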

23
Achieving Square-Root Replication I

Assume that each query keeps track of the number
of probes needed
Store an object at a number of nodes that is
proportional to the number of probes
Two implementations:
Path replication: store the object along the path
of a successful walk
Random replication: store the object randomly
among the nodes visited by the agents
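
The two implementations differ only in which nodes receive the new copies. The following Python sketch illustrates that difference under simplifying assumptions: node storage is modeled as an unbounded set per node (no capacity limit or eviction), and the function names and arguments are hypothetical.

```python
import random

def path_replicate(stores, obj, path):
    """Path replication: copy obj to every node on the successful walk."""
    for node in path:
        stores.setdefault(node, set()).add(obj)
    return len(path)

def random_replicate(stores, obj, visited, count, rng):
    """Random replication: copy obj to `count` nodes chosen at random
    among all nodes visited by the walkers."""
    candidates = sorted(set(visited))
    chosen = rng.sample(candidates, min(count, len(candidates)))
    for node in chosen:
        stores.setdefault(node, set()).add(obj)
    return chosen

# Hypothetical walk: the query probed nodes 1, 4 and 6 before succeeding.
stores_path = {}
copies = path_replicate(stores_path, "file-a", [1, 4, 6])

# Same number of copies, spread over everything the walkers visited.
stores_random = {}
chosen = random_replicate(
    stores_random, "file-a", [5, 2, 9, 2, 7], copies, random.Random(1)
)
```

In both cases the number of new copies equals the number of probes on the successful walk, which is the property the slide relies on.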

24
Achieving Square-Root Replication II
25
Evaluation of Replication Methods I