P2P Search - PowerPoint PPT Presentation

1 / 41
About This Presentation
Title:

P2P Search

Description:

Napster. MP3 file sharing with a centralized catalog. Peers hold files. Napster Inc's servers hold catalog. File transfer is P2P, using a proprietary protocol ... – PowerPoint PPT presentation

Number of Views:47
Avg rating:3.0/5.0
Slides: 42
Provided by: longwoo
Category:
Tags: p2p | napster | search

less

Transcript and Presenter's Notes

Title: P2P Search


1
P2P Search
  • COP6731
  • Advanced Database Systems

2
P2P Computing
  • Powerful personal computer
  • Share computing resources
  • P2P Computing
  • Advantages
  • Shared infrastructure costs
  • Highly scalable
  • No SPOF
  • censorship-resistance

3
P2P Search Techniques
  • Centralized P2P systems
  • e.g. Napster, SETI_at_home
  • Decentralized unstructured P2P systems
  • e.g. Gnutella
  • Hybrid - partially decentralized
  • e.g., Freenet
  • Structured P2P systems
  • DHT systems (CAN/Chord/Pastry/Tapestry)
  • Skip-list based systems

4
Napster
  • MP3 file sharing with a centralized catalog
  • Peers hold files
  • Napster Incs servers hold catalog
  • File transfer is P2P, using a proprietary protocol

5
Napster Publish a File
  • Users upload their IP address and music titles
    they wish to share

Central Napster server
(xyz.mp3, 192.1.2.3)
192.1.2.3
6
Napster Query for a File
  • Users search for peers to download desired files

xyz.mp3 ?
192.1.2.3
192.1.2.3
Central Napster server
7
Napster Transfer Requested File
  • File transfer is P2P, using a proprietary protocol

xyz.mp3 ?
192.1.2.3
Central Napster server
8
Disadvantage of Centralized Directory
  • Performance bottleneck
  • Single point of failure
  • Can we do it without a directory ?

9
Gnutella
  • No catalog
  • Pings network to locate Gnutella peers
  • File requests are broadcast to peers
  • Flooding or breadth-first research
  • When provider is located, the file is transferred
    via HTTP

10
Gnutella Issue a Request
xyz.mp3 ?
11
Gnutella Flood the Request
12
Gnutella Reply with the File
xyz.mp3
13
Gnutella - Disadvantages
  • Network flooding - unnecessary network traffic
  • Using TTL - some files might not be found
  • Alternatively,
  • using ultranodes (or supernodes)
  • using depth-first search, i.e., Freenet

14
Morpheus, Kazaa
Supernode Layer
15
Using Ultranodes
  • Queries flood only the network of ultranodes
  • Other peer nodes shielded from query traffic
  • Combine the benefits of centralized and
    decentralized search
  • Take advantage of the heterogeneity in peer
    capabilities

16
Freenet - Depth-First Search
17
Freenet File not Found
  • The requested file not found due to a poor
    routing decision made at peer D
  • In this case, query backs out of the dead-end,
    and tries another peer in depth-first manner

18
Structured P2P Systems
  • DHT-based
  • Chord / Pastry / Tapestry hash-based into
    single dimensional space
  • CAN hash-based into multi-dimensional space
  • P-grid hash-based into virtual binary search
    tree
  • Skip-list based
  • Skipgraph / SkipNet
  • Index Tree-based
  • BATON

19
DHT Design Goals
  • An overlay network with
  • Flexible mapping of keys to physical nodes
  • Data Independence
  • Small network diameter
  • Small degree (fan-out)
  • Local routing decisions
  • Robustness to churn
  • Routing flexibility
  • Proximity
  • A storage or memory mechanism with
  • No guarantees on persistence
  • Maintenance via soft state

20
Metrics
  • Searching/Lookup
  • Number of hops in searching
  • Number of messages
  • Database related metrics
  • Total disk I/O
  • Response Time
  • Accuracy
  • Maintenance
  • Number of hops
  • Number of messages

21
How to Bound Search Space ?
Work on placement!
Network
22
Basic Idea - Hashing
P2P Network
Publish (H(y))
Join (H(x))
Object y
Peer x
H(y)
H(x)
Peer nodes also have hash keys in the same hash
space
Objects have hash keys
y
x
Hash key
Place object to the peer with closest hash keys
23
Viewed as a Distributed Hash Table
0
2128-1
Hash table
Peer nodes
Each is responsible for a range of the hash
table, according to the peer hash key Objects
are placed in the peer with the closest key
Note that peers are Internet edges
24
How to Find an Object?
0
2128-1
Hash table
Peer node
Want to keep only a few entries!
one hop to find the object
Simplest idea Everyone knows everyone else!
25
Using Distributed Hash Table (DHT)
  • A peer only needs to know its logical neighbors
  • Search based on multihop routing

0
2128-1
Hash table
Peer node
26
DHT in action
27
DHT in action
28
DHT in action
Operation take key as input route messages to
node holding key
29
DHT in action put()
insert(K1,V1)
Operation take key as input route messages to
node holding key
30
DHT in action put()
insert(K1,V1)
Operation take key as input route messages to
node holding key
31
DHT in action put()
(K1,V1)
Operation take key as input route messages to
node holding key
32
DHT in action get()
retrieve (K1)
Operation take key as input route messages to
node holding key
33
DHT in action
retrieve (K1)
34
CAN Content Addressable Network
  • Each peer is responsible for one zone, i.e.,
    stores all (key, value) pairs of the zone
  • Each peer knows the neighbors of its zone
  • Random assignment of peers to zones at startup
  • Dimensional-ordered multihop routing

35
CAN Object Publishing
I
node Ipublish(K,V)
36
CAN Object Publishing
x a
I
node Ipublish(K,V)
(1) a hx(K)
37
CAN Object Publishing
x a
I
node Ipublish(K,V)
(1) a hx(K) b hy(K)
y b
38
CAN Object Publishing
I
node Ipublish(K,V)
J
(1) a hx(K) b hy(K)
(2) route (K,V) - J
39
CAN Object Publishing
I
node Ipublish(K,V)
J
(1) a hx(K) b hy(K)
(K,V)
(2) route (K,V) - J (3) J stores (K,V)
40
CAN Object Retrieval
node Iretrieve(K)
J
(1) a hx(K) b hy(K)
(K,V)
(2) route retrieve(K) to J that is
in charge of (a,b)
I
41
Some Research Topics
  • Content-based Image Retrieval in P2P
  • Location Management in P2P
  • Security Considerations for DHT
  • P2P Backup
  • Wireless P2P
Write a Comment
User Comments (0)
About PowerShow.com