Title: Peer to Peer Computing
1Peer to Peer Computing
- From http://www.cs.rpi.edu/courses/fall02/netprog/
2What is Peer-to-Peer?
- A model of communication where every node in the
network acts alike.
- As opposed to the client-server model, where one
node provides services and other nodes use the
services.
3Advantages of P2P Computing
- No central point of failure
- E.g., the Internet and the Web do not have a
central point of failure.
- Most Internet and Web services use the
client-server model (e.g., HTTP), so a specific
service does have a central point of failure.
- Scalability
- Since every peer is alike, it is possible to add
more peers to the system and scale to larger
networks.
4Disadvantages of P2P Computing
- Decentralized coordination
- How to keep global state consistent?
- Need for distributed coherency protocols.
- Not all nodes are created equal.
- Differences in computing power and bandwidth have
an impact on overall performance.
- Programmability
- Harder, as a corollary of decentralized coordination.
5P2P Computing Applications
- File sharing
- Process sharing
- Collaborative environments
6P2P File Sharing Applications
- Improves data availability
- Replication to compensate for failures.
- E.g., Napster, Gnutella (guh-noo-tell-ahh),
Freenet, KaZaA (kuz-zaah).
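The availability claim above can be made concrete with a small calculation. This is a minimal sketch, assuming each replica is online independently with the same probability; the function name `availability` and the sample probability 0.3 are illustrative and do not come from the slides.

```python
def availability(p_online: float, replicas: int) -> float:
    """Probability that at least one of `replicas` copies is reachable.

    Assumes every copy is online independently with probability p_online.
    """
    if not 0.0 <= p_online <= 1.0:
        raise ValueError("p_online must be in [0, 1]")
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    # The file is unavailable only if every single replica is offline.
    return 1.0 - (1.0 - p_online) ** replicas


if __name__ == "__main__":
    # Peers that are online only 30% of the time (an assumed figure).
    for k in (1, 2, 5, 10):
        print(k, round(availability(0.3, k), 4))
```

Under the independence assumption, availability rises quickly with the number of replicas even when individual peers are unreliable. Correlated failures (for example, many peers going offline at night) would make the real figure lower.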
7P2P Process Sharing Applications
- For large-scale computations
- Data analysis, data mining, scientific computing
- E.g., SETI@home (search for extraterrestrial
intelligence), Folding@home, distributed.net,
World-Wide Computer
8P2P Collaborative Applications
- For remote real-time human collaboration.
- Instant messaging, virtual meetings, shared
whiteboards, teleconferencing, tele-presence.
- E.g., talk, IRC, ICQ, AOL Messenger, Yahoo!
Messenger, Jabber, MS NetMeeting, NCSA Habanero,
games
9P2P Technical Challenges
- Peer identification
- Routing protocols
- Network topologies
- Peer discovery
- Communication/coordination protocols
- Quality of service
- Security
- Fine-grained resource management
10P2P Topologies
- Centralized
- Ring
- Hierarchical
- Decentralized
- Hybrid
11Centralized Topology
Napster: central indexing and searching service
12Ring Topology
13Hierarchical Topology
14Decentralized Topology
Gnutella, Freenet, OceanStore
15Hybrid Topology: Centralized + Ring
16Hybrid Topology: Centralized + Decentralized
17Evaluating topologies
- Manageability
- How hard is it to keep working?
- Information coherence
- How authoritative is info? (Auditing,
non-repudiation)
- Extensibility
- How easy is it to grow?
- Fault tolerance
- How well can it handle failures?
18Evaluating topologies
- Resistance to legal or political intervention
- How hard is it to shut down? (Can be good or bad)
- Security
- How hard is it to subvert?
- Scalability
- How big can it grow?
19Decentralized
- Manageable: very difficult, many owners
- Coherent: difficult, unreliable peers
- Extensible: anyone can join in!
- Fault tolerant: redundancy
- Secure: difficult, open research
- Lawsuit-proof: no one to sue
- Scalable: in theory, yes; in practice, no
20Centralized + Decentralized
- Manageable: same as decentralized
- Coherent: better than decentralized
- Extensible: anyone can still join!
- Fault tolerant: plenty of redundancy
- Secure: same as decentralized
- Lawsuit-proof: still no one to sue
- Scalable: looking very hopeful
Best architecture for P2P networks?
21Napster
- Started the P2P revolution.
- Central indexing and searching service.
- Files are downloaded directly between peers
(point-to-point).
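The split between a central index and direct peer-to-peer downloads can be sketched as follows. This is a minimal illustration under stated assumptions, not Napster's actual protocol: the class `CentralIndex`, its method names, and the peer addresses are all hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central server: it stores who has which file, never the files."""

    def __init__(self):
        self._holders = defaultdict(set)  # filename -> set of peer addresses

    def register(self, peer, filenames):
        """A peer announces the files it is willing to share."""
        for name in filenames:
            self._holders[name].add(peer)

    def unregister(self, peer):
        """Remove a peer that has gone offline."""
        for holders in self._holders.values():
            holders.discard(peer)

    def search(self, keyword):
        """Return {filename: [peers]} for names containing the keyword."""
        keyword = keyword.lower()
        return {
            name: sorted(peers)
            for name, peers in self._holders.items()
            if keyword in name.lower() and peers
        }


if __name__ == "__main__":
    index = CentralIndex()
    index.register("10.0.0.1:6699", ["blue_song.mp3", "talk.mp3"])
    index.register("10.0.0.2:6699", ["blue_song.mp3"])
    # The searcher would now contact one of the listed peers directly
    # to download; the index server is not involved in the transfer.
    print(index.search("blue"))
```

The sketch also shows the weakness of this topology: if the single `CentralIndex` instance is unavailable, no peer can search, even though the files themselves are still spread across the peers.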
22Gnutella
- Peer-to-peer indexing and searching service.
- Peer-to-peer point-to-point file downloading
using HTTP.
- A Gnutella node needs a server (or a set of
servers) to start up; gnutellahosts.com
provides a service with reliable initial
connection points.
- But this introduces a new single point of failure!
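Decentralized searching of this kind is usually described as flooding a query to neighbors with a hop limit. The sketch below is a simplified in-memory model under that assumption, not the Gnutella wire protocol; `flood_search`, the `graph` and `files` dictionaries, and the TTL values are illustrative names and data.

```python
from collections import deque


def flood_search(graph, files, start, wanted, ttl):
    """Flood a query from `start`, forwarding it at most `ttl` hops.

    graph: {peer: [neighbor peers]}, files: {peer: set of filenames}.
    Returns (peers holding `wanted`, number of query messages sent).
    """
    seen = {start}
    hits = set()
    messages = 0
    frontier = deque([(start, None, ttl)])  # (peer, sender, hops remaining)
    while frontier:
        node, sender, remaining = frontier.popleft()
        if wanted in files.get(node, ()):
            hits.add(node)
        if remaining == 0:
            continue  # TTL exhausted: do not forward any further
        for neighbor in graph.get(node, ()):
            if neighbor == sender:
                continue  # never send the query back where it came from
            messages += 1  # every forwarded copy costs a message
            if neighbor not in seen:  # duplicate copies are dropped on arrival
                seen.add(neighbor)
                frontier.append((neighbor, node, remaining - 1))
    return hits, messages


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"],
             "d": ["b", "c", "e"], "e": ["d"]}
    files = {"b": {"x"}, "e": {"x"}}
    for ttl in (1, 2, 3):
        print(ttl, flood_search(graph, files, "a", "x", ttl))
```

A larger TTL finds more copies but sends more messages, including duplicates that arrive over different paths. That trade-off is one reason the slides rate decentralized scalability as fine in theory but not in practice.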
23Freenet
- Peer-to-peer indexing and searching service.
- Peer-to-peer file downloading.
- Files served use the same route as searches (not
point-to-point).
- Provides for anonymity (freedom of speech).
- Communications are encrypted and are
"routed through" other nodes to make it extremely
difficult to determine who is requesting the
information and what its content is.
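The idea that a file travels back along the search route, rather than directly between requester and holder, can be sketched as follows. This is a deliberately simplified model: it omits encryption, tries neighbors in list order (Freenet itself picks the next hop by key closeness), and caches the data at every node on the return path. The function `freenet_request` and the data layout are hypothetical.

```python
def freenet_request(stores, neighbors, start, key, hops_to_live):
    """Search for `key` one neighbor at a time; return (data, path).

    stores: {node: {key: data}}, neighbors: {node: [neighbor nodes]}.
    On success the data is copied into every node on the return path.
    """
    visited = set()

    def visit(node, htl):
        visited.add(node)
        if key in stores[node]:
            return stores[node][key], [node]
        if htl == 0:
            return None, []
        # The request is handed to one neighbor at a time, not flooded.
        for nxt in neighbors[node]:
            if nxt in visited:
                continue
            data, path = visit(nxt, htl - 1)
            if data is not None:
                # The reply retraces the request path; each hop keeps a copy,
                # so a node only ever talks to its immediate neighbors.
                stores[node][key] = data
                return data, [node] + path
        return None, []

    return visit(start, hops_to_live)


if __name__ == "__main__":
    stores = {"a": {}, "b": {}, "c": {"doc": b"hello"}}
    neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    print(freenet_request(stores, neighbors, "a", "doc", 3))
    print(sorted(n for n, s in stores.items() if "doc" in s))
```

Because node `a` receives the file from `b`, it cannot tell whether `b` was the original holder or just a relay, which is the intuition behind the anonymity property listed above.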
24KaZaA/Morpheus
- Hybrid indexing/searching model
- Not centralized like Napster, not decentralized
like Gnutella.
- Peer-to-peer file downloading using HTTP.
- SmartStream: automatic download resumption (for
incomplete files).
- FastStream: fast downloads (distributes the
download task over a list of peers).
- SuperNodes: elected dynamically if they have
sufficient bandwidth and processing power (hybrid
topology model).
- A central server keeps user registrations, logs
usage, and helps bootstrap peer discovery.
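Two of the ideas above, promoting capable peers to SuperNodes and spreading one download over several peers, can be illustrated with a short sketch. The slides do not give the actual election rule or chunking scheme, so the thresholds, the `(bandwidth, cpu)` tuples, and the function names `elect_supernodes` and `split_download` are all assumptions made for illustration.

```python
def elect_supernodes(peers, min_bandwidth_kbps, min_cpu_score):
    """Return the peers that meet both (assumed) capability thresholds.

    peers: {name: (bandwidth_kbps, cpu_score)}.
    """
    return sorted(
        name
        for name, (bandwidth, cpu) in peers.items()
        if bandwidth >= min_bandwidth_kbps and cpu >= min_cpu_score
    )


def split_download(file_size, sources):
    """Divide a file into contiguous byte ranges, one per source peer.

    Returns [(source, first_byte, last_byte)] with inclusive bounds, the
    same convention used by HTTP range requests.
    """
    if not sources:
        raise ValueError("at least one source peer is required")
    base, extra = divmod(file_size, len(sources))
    plan, start = [], 0
    for i, source in enumerate(sources):
        length = base + (1 if i < extra else 0)  # spread the remainder
        if length == 0:
            continue  # more sources than bytes: leave this one idle
        plan.append((source, start, start + length - 1))
        start += length
    return plan


if __name__ == "__main__":
    peers = {"p1": (56, 2), "p2": (1500, 8), "p3": (10000, 9)}
    print(elect_supernodes(peers, min_bandwidth_kbps=1000, min_cpu_score=5))
    print(split_download(10, ["p1", "p2", "p3"]))
```

An equal split is the simplest choice; a more realistic scheme would give faster peers larger ranges and reassign a range when a peer disconnects, which is where download resumption comes in.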
25References
- Nelson Minar's articles at
- http://www.openp2p.com/pub/a/p2p/2001/12/14/topologies_one.html
- http://www.openp2p.com/pub/a/p2p/2002/01/08/p2p_topologies_pt2.html
26Grid vs. P2P?
- Refer to
- I. Foster and A. Iamnitchi. On death, taxes, and
the convergence of peer-to-peer and grid
computing. In Proceedings of the 2nd
International Workshop on Peer-to-Peer Systems
(IPTPS '03), 2003.
27Towards High Performance Peer-to-Peer Content and
Resource Sharing Systems
28Question No. 1
- Which type of P2P application does the paper
address?
- File sharing
- Process sharing
- Collaborative environments
29Question No. 2
- What is the goal of the paper?
- Ensure that users can access all available content
efficiently.
- Fair load distribution to ensure load balancing,
considering the heterogeneity of the resources
contributed by the different peers.
- Low user-request response times.
30Question No. 3
- What are the challenges addressed by the paper in
achieving the goal?
- Autonomous nodes → complex distributed
coordination algorithms
- Scalability in the number of nodes and documents
- Heterogeneity in content contributions,
processing and storage capacities
- Dynamism in content popularity, nodes, and content
31Question No. 4
- What are the assumptions made by the paper?
- Content has an initially static and known
popularity, with document popularities following
the Zipf distribution: the popularity of any
document is roughly inversely proportional to its
rank in the popularity table.
- $f_i = K / i^{a}$, where $f_i$ is the popularity of
the document of rank $i$, and $K$ and $a$ are constants.
- Who controls the content stored at a node?
- The node's user? (free-rider)
- The Max-Fair algorithm?
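The Zipf assumption above can be turned into numbers with a few lines of code. This is a minimal sketch: the function name `zipf_popularities` is illustrative, and choosing K so that the popularities sum to 1 is an assumption made here for convenience, not something stated in the slides.

```python
def zipf_popularities(n_docs, a=1.0):
    """Return popularities f_1..f_n with f_i = K / i**a.

    K is chosen (an assumption of this sketch) so that the values sum to 1
    and can be read as request probabilities.
    """
    if n_docs < 1:
        raise ValueError("n_docs must be at least 1")
    weights = [1.0 / (rank ** a) for rank in range(1, n_docs + 1)]
    k = 1.0 / sum(weights)
    return [k * w for w in weights]


if __name__ == "__main__":
    pops = zipf_popularities(1000, a=1.0)
    # Share of all requests that go to the 10 most popular documents.
    print(round(sum(pops[:10]), 3))
```

With a = 1 the document of rank 2 is requested half as often as the document of rank 1. The skew this produces is why a node holding a few popular documents can be overloaded while most nodes sit idle.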
32Question No. 5
- How does a query work?
- Load balancing
- intra-cluster
- inter-cluster
- Associating the document categories with clusters
of nodes, in a manner that ensures a fair
distribution of the document-category
popularities to the clusters of nodes
- Fairness index
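One way to make "fair distribution" measurable is sketched below. The slides do not define the fairness index, so this example assumes Jain's fairness index, a widely used definition, and pairs it with a simple greedy assignment. The greedy loop is an illustration only and should not be read as the paper's Max-Fair algorithm; `jain_fairness`, `greedy_assign`, and the sample capacities are hypothetical.

```python
def jain_fairness(loads):
    """Jain's index: (sum x)^2 / (n * sum x^2); 1.0 means perfectly even."""
    total = sum(loads)
    squares = sum(x * x for x in loads)
    if squares == 0:
        return 1.0  # no load anywhere counts as perfectly fair
    return total * total / (len(loads) * squares)


def greedy_assign(category_popularity, cluster_capacity):
    """Place each category, most popular first, on the cluster that yields
    the highest fairness of capacity-normalized loads.

    Returns (assignment {category: cluster}, load {cluster: popularity}).
    """
    load = {cluster: 0.0 for cluster in cluster_capacity}
    assignment = {}
    ranked = sorted(category_popularity.items(), key=lambda kv: -kv[1])
    for category, popularity in ranked:

        def fairness_if_placed(candidate):
            return jain_fairness([
                (load[c] + (popularity if c == candidate else 0.0))
                / cluster_capacity[c]
                for c in load
            ])

        best = max(load, key=fairness_if_placed)
        load[best] += popularity
        assignment[category] = best
    return assignment, load


if __name__ == "__main__":
    popularity = {"news": 0.5, "music": 0.3, "video": 0.2}
    capacity = {"cluster_x": 1.0, "cluster_y": 1.0}
    print(greedy_assign(popularity, capacity))
```

Dividing each cluster's load by its capacity is how the sketch accounts for heterogeneous resources: a cluster with twice the capacity is expected to carry twice the popularity before it counts as equally loaded.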
33Max-Fair Algorithm
- How are clusters initially decided?
- What leads to a node belonging to more than one
cluster?
34Discussions
- Does the paper address the challenges well and
achieve its goal?
- Fair load distribution to ensure load balancing,
considering the heterogeneity of the resources
contributed by the different peers
- Low user-request response times
- What are the deficiencies of the proposed strategy?
35A Course Project
- Research the state of the art of load balancing in
peer-to-peer computing.
- Propose an improved strategy over existing
mechanisms by
- removing one deficiency, or
- dropping one simplifying assumption.
- Evaluate the proposed strategy by simulation