Title: Gnutella2: A Better Gnutella?
1Gnutella2 A Better Gnutella?
- COMP 5204 Data Networks
- Julie Thorpe
- School of Computer Science
- Carleton University
2Introduction
- Gnutella and Gnutella2 are P2P application
protocols. - Gnutella has an interesting history essentially
a reverse engineered Beta version. - Changes governed by Gnutella Developers Forum
(GDF). - Gnutella2 is a completely different protocol than
Gnutella, claiming it's what Gnutella should have
been. - The GDF reject this claim, and refuse to call
Gnutella2 by its name they instead call it
Mike's Protocol. - Unclear whether Gnutella2 better than Gnutella.
- My project goal is to compare these protocols to
determine which is theoretically better.
3Presentation Outline
- Gnutella review
- Gnutella's problems
- Gnutella2 review
- Comparison
- Network architecture
- Searching algorithms
- Cooperation incentives
- Security
- Concluding remarks
4Gnutella Review (1)
- Purely decentralized, simple protocol for file
sharing. - Runs over TCP/IP connections.
- New node enters Gnutella network by connecting to
a known server (sends Ping), and when server
responds (sends Pong) they are now connected
peers. - Learns of other nodes if server forwards Ping to
its peers (and gets a Pong back in response).
5Gnutella Review (2)
- Has no way to advertise files.
- Peers find files by flooding with query
requests (stop after TTL hops). - Query responses are routed back along the same
path as the request arrived upon. - Addressed by GUIDs.
6Gnutella's Problems
- Scalability.
- Performance.
- Lack of cooperation incentives.
- Abuse by servents (Gnutella client/server
program).
7Gnutella's Problems - Scalability
- For a node to be reached by a query, it (and all
other nodes on the path to it) must be forwarded
the request! - The reach is determined by n ( connections to
other hosts) and TTL ( hops each request is
permitted to take) - Assumption nodes all have the same n and TTL.
- Ritter 1 estimated that for reasonable
parameters, to achieve a reach of 106 nodes,
Gnutella nodes must have a bandwidth between 19.2
and 64 Gbps!
8Gnutella's Problems - Performance
- Messages are forwarded through many other peers
on the network. - Connection speeds are effectively restricted to
the bandwidth of the slowest peer along the
route. - If A and B have high-speed connections, and C has
a modem connection, the download rate between A
and B is limited to C's speed.
9Gnutella's Problems Lack of Cooperation
Incentives
- One study found that over 70 of users shared no
files, and 50 of all responses are returned by
the top 1 2 . - Implications of free-riders on Gnutella
- Increased search horizon (farthest set of hosts
reachable by a search request, directly related
to its TTL). - The top 1 that is providing most files reaches
connection saturation. - This generally a difficult problem to solve for
purely decentralized systems.
10Gnutella's Problems Abuse by Servents
- Since it's a protocol, implementations can
implement the Gnutella protocol as they please
(in theory). - Servents can act selfishly to improve their
performance - Increasing TTL to increase search horizon
(generating geometrically higher messages). - Frequent re-querying (generating more messages,
degrading network).
11Gnutella2
- GDF recommended (in version 0.6) ultrapeers to
improve Gnutella's performance and scalability. - Gnutella2 enforces a variation of this to improve
performance and scalability. - Decentralized, 2-tier hierarchy of peers (leaf
nodes and hubs). - Other important differences will come out in
comparison.
12Comparison Gnutella vs. Gnutella2
- Network architecture
- Searching algorithms
- Cooperation incentives
- Security
13Gnutella's Network Architecture
- Recall it is completely decentralized.
14Gnutella2's Network Architecture
- Decentralized, 2-tier.
- This architecture is recommended for Gnutella in
v0.6. - New node enters by connecting to a known hub
(almost identical to Gnutella's handshake). - Hubs typically accept 300-500 leaves, and connect
to 5-30 other hubs. - Leaves typically connect to 3 hubs.
15Comparison Network Architecture
- Gnutella's purely decentralized is much simpler,
but not as scalable. - No real difference between Gnutella's v0.6
ultrapeer structure and Gnutella2's hub
structure. - Either of these strategies should reduce
searching traffic as explained in searching
algorithms.
16Gnutella's Searching Algorithm
- Recall that using the purely decentralized
version, packets are flooded throughout the
network. - If the v0.6 ultrapeer recommendation is
implemented, searching is optimized using Query
hash tables(QHTs). - A QHT is maintained by each node, and describes
the content it is sharing. - An ultrapeer maintains an aggregate of its leaf's
QHTs and its own QHT. - Searches are performed by forwarding a query to
an ultrapeer, who checks its aggregate QHT for a
match. - If there is a match, the query is forwarded to
the appropriate leaf, otherwise the query is
forwarded to neighbouring ultrapeers by
flooding.
17Gnutella2's Searching Algorithm
- Ultrapeers are called hubs.
- Uses a QHT like Gnutella, but if a hub cannot
match a query to its aggregate QHT, it checks a
set of caches - Each hub maintains a cached copy of each
neighbouring hub's aggregate QHT. - Upon a search miss, a hub will try to match the
query against its cached copies of its neighbours
QHTs. - If the query matches, it will forward the query
once, and the node that receives the query
processes it and directly sends the result back
to the client. - If no match is made, the searching client will
continue at another untried hub.
18Comparison Searching Algorithms
- Both Gnutella with Ultrapeers and Gnutella2
significantly reduce the number of messages. - Should increase performance and scalability.
- Gnutella2's method further reduces the number of
messages sent for a query - Less query request messages due to caching
neighbour's QHTs. - Less response messages since Gnutella2's method
allows the responding node to directly contact
the requesting node, rather than sending the
message back through the path to get there.
19Comparison Cooperation Incentives
- Neither Gnutella nor Gnutella2 specify
cooperation incentives. - Implementations often will not allow connections
to network unless you share something, and by
default make downloads shared. - The problem of cooperation incentives in a
decentralized environment is interesting, since
nodes can avoid connecting to those that have a
profile of their behaviour.
20Gnutella's Security
- Gnutella's query messages are routed through
peers, and the query does not contain the
querying node's IP address, but a Globally Unique
Identifier (GUID). - Provides anonymity by masking requester's
identity. - Denial-of-service (DOS) attacks are possible by
flooding the network with many requests with a
fake GUID. - Another node could be similarly DOS'ed if a GUID
for one of their request GUIDs is known. - Response IP addresses could be spoofed and
malicious content provided.
21Gnutella2's Security
- Gnutella2 does not use GUIDs for queries
- Sends the response directly back to the
requesting node. - The QHTs do not contain information about the
content stored on a neighbouring node, providing
privacy. - Queries make use of query keys (to verify the
query return address is that of the original
sender). - Prevents malicious users from sending out queries
for the purpose of flooding the network with
spoofed requests. - Search clients only permitted to query a hub
after obtaining a query key, which are unique
to each (hub, search client return address) from
it to include in the transmission.
22Comparison - Security
- Both Gnutella with ultrapeers and Gnutella2
provide privacy through their caching. - Both Gnutella and Gnutella2 are suceptible to
spoofed response IP addresses. - For Gnutella2
- Gnutella does not provide authentication of nodes
for querying, thus it is susceptible to request
flooding attacks. - Gnutella ultrapeers cannot block certain hosts
(do not have query keys or unique request
addresses). - For Gnutella
- Gnutella2 does not provide anonymous queries.
23Concluding Remarks
- Gnutella has some serious flaws (scalability,
performance, lack of cooperation incentives and
servent abuse). - Gnutella2 solves all but cooperation incentives.
- Gnutella with ultrapeers solves scalability and
performance, but the searching algorithm and
caching is less sophisticated. - Gnutella2 has many more features outside of this
comparison, primarily being more extensible (yet
specific) to support applications other than file
sharing. - Although they are different protocols, Gnutella2
is in essence, an improved version of Gnutella
with ultrapeers.
24Open Problems
- Is it possible to create useful cooperation
incentives in a large, truly distributed
environment like Gnutella where peers may
reconnect to different hubs upon connection?
25References
- Jordan Ritter, Why Gnutella Can't Scale. No,
Really., http//www.tch.org/gnutella.htm. - Eytan Adar and Bernardo A. Huberman,
Free-Riding on Gnutella, http//www.firstmond
ay.org/issues/issue5_10/adar/index.html. - Farhad Manjoo, Gnutella Bandwidth Bandits,
Aug. 8, 2002. - RFC-Gnutella 0.6, http//rfc-gnutella.sourceforg
e.net/developer/testing/index.html. - Anurag Singla and Christopher Rohrs,
Ultrapeers Another Step Towards Gnutella
Scalability, Version 1.0, http//rfc-gnutella.so
urceforge.net/src/Ultrapeers_1.0.html - Gnutella vs. Gnutella2, Part 1,
http//www.mp3newswire.net/stories/2003/gnutella.h
tml. - The Gnutella2 Developers Network.
http//www.gnutella2.com. - LimeWire Network Improvements,
http//www.limewire.com/developer/net_improvements
.html - P2P Networking Technologies. URL
http//ntrg.cs.tcd.ie/undergrad/4ba2.02-03/Intro.h
tml.