GIA: Making Gnutella-like P2P Systems Scalable

1
GIA: Making Gnutella-like P2P Systems Scalable

Yatin Chawathe
Sylvia Ratnasamy, Scott Shenker, Nick Lanham, Lee Breslau

2
The Peer-to-peer Phenomenon

Internet-scale distributed system
Distributed file-sharing applications
E.g., Napster, Gnutella, KaZaA
File sharing is the dominant P2P app
Mass-market
Mostly music, some video, software

3
The Problem

Potentially millions of users
Wide range of heterogeneity
Large transient user population
Existing search solutions cannot scale
Flooding-based solutions limit capacity
Distributed Hash Tables (DHTs) not necessarily appropriate

4
Our Solution: GIA

Scalable Gnutella-like P2P system
Design principles
Explicitly account for node heterogeneity
Query load proportional to node capacity
Results
GIA outperforms Gnutella by 3 to 5 orders of magnitude

5
Outline

Existing approaches
GIA: Scalable Gnutella
Results: Simulations, Experiments
Conclusion

6
Gnutella

Distributed search and download
Unstructured ad-hoc topology
Peers connect to random nodes
Random search
Flood queries across network
Scaling problems
As network grows, search overhead increases
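
The scaling problem above can be illustrated with a minimal sketch of TTL-scoped flooding. The graph format, the `flood` function, and the TTL values are assumptions made for illustration; this is not Gnutella's wire protocol.

```python
from collections import deque

def flood(graph, start, ttl):
    """Flood a query from `start`; return (peers reached, messages sent).

    `graph` maps each peer to a list of its neighbors (hypothetical format).
    A peer that sees the query for the first time forwards it to every
    neighbor except the one it came from, until the TTL runs out.
    """
    seen = {start}
    messages = 0
    queue = deque([(start, None, ttl)])
    while queue:
        node, parent, hops_left = queue.popleft()
        if hops_left == 0:
            continue
        for nbr in graph[node]:
            if nbr == parent:
                continue
            messages += 1            # every forwarded copy costs a message
            if nbr not in seen:      # duplicate copies are dropped on arrival
                seen.add(nbr)
                queue.append((nbr, node, hops_left - 1))
    return seen, messages

if __name__ == "__main__":
    graph = {
        "P1": ["P2", "P3"],
        "P2": ["P1", "P3"],
        "P3": ["P1", "P2", "P4"],
        "P4": ["P3"],
    }
    reached, messages = flood(graph, "P1", ttl=2)
    print(sorted(reached), messages)
```

Even in this four-peer example some messages are duplicates that reach peers already holding the query; the message count grows with the number of links, not with the number of useful answers.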

[Figure: peers P1 to P6 in an ad-hoc topology. The query "who has madonna" is flooded across the network; P2 has madonna-ray-of-light.mp3 and P4 has madonna-american-life.mp3.]

7
Distributed Hash Tables (DHTs)

Structured solution
Given a filename, find its location
Can DHTs do file sharing?
Probably, but with lots of extra work: caching, keyword searching
Do we need DHTs?
Not necessarily: DHTs are great at finding rare files, but most queries are for popular files
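
A minimal sketch of the "given a filename, find its location" primitive, assuming a ring-style key space in which each key belongs to the first node ID at or after it. The functions `key_of` and `responsible_node`, the 16-bit ring, and the use of SHA-1 are illustrative assumptions, not the design of any particular DHT.

```python
import hashlib
from bisect import bisect_left

def key_of(name, bits=16):
    """Hash a string to an integer key on a ring of size 2**bits."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)

def responsible_node(node_ids, key):
    """Return the first node ID at or after `key`, wrapping around the ring."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key)
    return ring[i % len(ring)]

if __name__ == "__main__":
    nodes = [10, 200, 4000, 30000, 61000]
    for name in ("madonna-ray-of-light.mp3", "madonna"):
        k = key_of(name)
        print(name, "-> key", k, "-> node", responsible_node(nodes, k))
```

The lookup is exact-match: the full filename and the keyword "madonna" hash to unrelated keys, which is one reason keyword searching counts as extra work on top of a DHT.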

8
Other Solutions

Supernodes (KaZaA)
Classify nodes as low- or high-capacity
Only pushes the problem to a bigger scale
Random walks (Lv et al.)
Forwarding is blind
Queries can get stuck in overloaded nodes
Biased random walks (Adamic et al.)
Right idea, but exacerbates the overloaded-node problem

9
Outline

Existing approaches
GIA: Scalable Gnutella
Results: Simulations, Experiments
Conclusion

10
GIA: 10,000-foot view

Unstructured, but take node capacity into account
High-capacity nodes have room for more queries, so send most queries to them
Will work only if high-capacity nodes:
Have correspondingly more answers, and
Are easily reachable from other nodes

11
GIA Design

Make high-capacity nodes easily reachable
Dynamic topology adaptation
Make high-capacity nodes have more answers
One-hop replication
Search efficiently
Biased random walks
Prevent overloaded nodes
Active flow control
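
Two of the four components, one-hop replication and biased random walks, can be sketched together. This is a simplified illustration under stated assumptions: the graph and file maps are hypothetical, the walk always picks the highest-capacity unvisited neighbor, and topology adaptation and flow control are left out.

```python
def build_one_hop_index(graph, files):
    """For each node, index its own files plus those of its direct neighbors."""
    index = {}
    for node, nbrs in graph.items():
        entries = {}
        for holder in [node] + list(nbrs):
            for name in files.get(holder, ()):
                entries.setdefault(name, set()).add(holder)
        index[node] = entries
    return index

def biased_walk(graph, capacity, index, start, wanted, max_hops=10):
    """Walk toward high-capacity nodes; return (holders or None, path taken)."""
    node, path, visited = start, [start], {start}
    for _ in range(max_hops):
        holders = index[node].get(wanted)
        if holders:
            return holders, path
        candidates = [n for n in graph[node] if n not in visited]
        if not candidates:
            break
        node = max(candidates, key=lambda n: capacity[n])  # bias by capacity
        visited.add(node)
        path.append(node)
    return index[node].get(wanted), path

if __name__ == "__main__":
    graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"],
             "D": ["B", "E"], "E": ["D"]}
    capacity = {"A": 1, "B": 100, "C": 10, "D": 1000, "E": 1}
    files = {"E": ["song.mp3"]}
    index = build_one_hop_index(graph, files)
    print(biased_walk(graph, capacity, index, "A", "song.mp3"))
```

In this example the query is answered at D, one hop before the node that actually stores the file, because D indexes its neighbor E's content.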

12
Dynamic Topology Adaptation

Make high-capacity nodes have high degree (i.e., more neighbors)
Per-node level of satisfaction, S
S = 0: no neighbors; S = 1: enough neighbors
Function of:
Node's capacity
Neighbors' capacities
Neighbors' degrees
Their age
When S << 1, look for neighbors aggressively
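
The slide does not give a formula for S, so the following is only one plausible form, assuming each neighbor contributes its capacity divided by its degree and the sum is normalized by the node's own capacity. The neighbor limit, the exponential interval rule, its constants, and the omission of neighbor age are all assumptions made for this sketch.

```python
def satisfaction(own_capacity, neighbors, max_neighbors=8):
    """Return a satisfaction level S in [0, 1].

    `neighbors` is a list of (capacity, degree) pairs, one per neighbor.
    Each neighbor contributes capacity / degree, i.e. the share of its
    capacity this node can expect to use (an assumed form; age is ignored).
    """
    if not neighbors:
        return 0.0
    if len(neighbors) >= max_neighbors:
        return 1.0
    total = sum(cap / deg for cap, deg in neighbors)
    return min(1.0, total / own_capacity)

def adaptation_interval(s, max_interval=10.0, aggressiveness=256.0):
    """Seconds to wait before the next neighbor search; short when S is near 0."""
    return max_interval * aggressiveness ** -(1.0 - s)

if __name__ == "__main__":
    for nbrs in ([], [(100, 50)], [(1000, 10)]):
        s = satisfaction(10, nbrs)
        print(nbrs, "S =", s, "interval =", adaptation_interval(s))
```

With these assumptions a node with no neighbors retries almost immediately, while a satisfied node waits the full interval, which matches the "look for neighbors aggressively when S << 1" behavior.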

13
Active Flow Control

Accept queries based on capacity
Actively allocate tokens to neighbors
Send a query to a neighbor only if we have received a token from it
Incentives for advertising true capacity
High-capacity neighbors get more tokens
Allocate tokens with weighted fair queuing
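
A minimal sketch of token-based flow control under stated assumptions: a node splits a per-round budget of query tokens among its neighbors in proportion to their advertised (integer) capacities. The proportional split with largest remainders is a simplification standing in for weighted fair queuing, and `allocate_tokens` and `TokenBucket` are hypothetical names.

```python
def allocate_tokens(budget, advertised):
    """Split `budget` tokens among neighbors by advertised integer capacity."""
    total = sum(advertised.values())
    if total == 0 or budget <= 0:
        return {n: 0 for n in advertised}
    grant = {n: budget * c // total for n, c in advertised.items()}
    leftover = budget - sum(grant.values())
    # Hand the remaining tokens to the neighbors with the largest remainders.
    by_remainder = sorted(advertised,
                          key=lambda n: (budget * advertised[n]) % total,
                          reverse=True)
    for n in by_remainder[:leftover]:
        grant[n] += 1
    return grant

class TokenBucket:
    """Tokens received from one neighbor; sending a query spends one token."""

    def __init__(self):
        self.tokens = 0

    def receive(self, count):
        self.tokens += count

    def try_send(self):
        """Return True and spend a token if one is available, else False."""
        if self.tokens == 0:
            return False
        self.tokens -= 1
        return True

if __name__ == "__main__":
    print(allocate_tokens(10, {"a": 1, "b": 3, "c": 6}))
```

Because the split follows advertised capacity, a neighbor that understates its capacity receives fewer tokens and can send fewer queries, which is the incentive the slide refers to.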

14
Practical Considerations

Query resilience to node death
Periodic keep-alive messages
Query responses are implicit keep-alives
Determining node capacity
Function of bandwidth and age of node
Finding rare items
Bifurcate the random walk every 10 hops

15
Outline

Existing approaches
GIA: Scalable Gnutella
Results: Simulations, Experiments
Conclusion

16
Simulation Results

Compare four systems
FLOOD: TTL-scoped flooding, random topologies
RWRT: random walks, random topologies
SUPER: supernode-based search
GIA: search using the GIA protocol suite
Metric
Collapse point: aggregate throughput that the system can sustain

17
Questions

What is the relative performance of the four algorithms?
Which of the GIA components matters the most?
How does the system behave in the face of transient nodes?

18
System Performance
19
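Factor Analysis

Algorithm        Collapse point
RWRT             0.0005
RWRT + OHR       0.005
RWRT + BIAS      0.0015
RWRT + TADAPT    0.001
RWRT + FLWCTL    0.0006

Algorithm        Collapse point
GIA              7
GIA - OHR        0.004
GIA - BIAS       6
GIA - TADAPT     0.2
GIA - FLWCTL     2

OHR: one-hop replication; BIAS: biased random walks; TADAPT: topology adaptation; FLWCTL: flow control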
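
The arithmetic behind the factor analysis can be worked through in a few lines. This sketch assumes the entries are read as RWRT plus one component and GIA minus one component (an assumed reading of the labels); the collapse-point values are copied from the slide, and the function names are hypothetical.

```python
import math

# Collapse points copied from the slide.
RWRT = 0.0005
RWRT_PLUS = {"OHR": 0.005, "BIAS": 0.0015, "TADAPT": 0.001, "FLWCTL": 0.0006}
GIA = 7
GIA_MINUS = {"OHR": 0.004, "BIAS": 6, "TADAPT": 0.2, "FLWCTL": 2}

def factor_table():
    """Return {component: (gain when added to RWRT, loss when removed from GIA)}."""
    return {
        name: (RWRT_PLUS[name] / RWRT, GIA / GIA_MINUS[name])
        for name in RWRT_PLUS
    }

def orders_of_magnitude(ratio):
    """Express a ratio as a base-10 exponent."""
    return math.log10(ratio)

if __name__ == "__main__":
    for name, (gain, loss) in factor_table().items():
        print(f"{name:7s} added to RWRT: x{gain:.1f}   removed from GIA: /{loss:.1f}")
    print(f"GIA vs RWRT: {orders_of_magnitude(GIA / RWRT):.1f} orders of magnitude")
```

Under this reading, no single component improves RWRT by more than a factor of 10, while removing one-hop replication or topology adaptation from GIA costs most of its advantage. The full GIA collapse point is about 14,000 times that of RWRT, a little over four orders of magnitude.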
20
Transient Behavior
[Figure: plot with legend entries "Static SUPER" and "Static RWRT (1 repl)"]
21
Deployment