Efficient Content Location Using Interest-based Locality in Peer-to-Peer Systems

1
Efficient Content Location Using Interest-based
Locality in Peer-to-Peer Systems
  • Presented by Lin Wing Kai

2
Outline
  • Background
  • Design of Interest-based Locality
  • Simulation of Interest-based Locality
  • Enhancement of Interest-based Locality
  • Understanding the scheme

3
Background
  • Three types of P2P systems:
  • Centralized P2P: Napster
  • Decentralized unstructured: Gnutella
  • Decentralized structured: Distributed Hash Table
    (DHT)

4
Background
  • In Gnutella, peers are connected randomly, and
    searching is done by flooding.
  • Keyword search is supported.

Example of searching for an MP3 file in the Gnutella
network. The query is flooded across the network.

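
The flooding search described above can be sketched as a breadth-first traversal bounded by a time-to-live (TTL). This is a minimal illustration, not Gnutella's wire protocol: the adjacency-dict graph, the `holders` set, and the name `flood_search` are assumptions made for the example, and every forwarded copy of the query is counted as one message.

```python
from collections import deque


def flood_search(graph, holders, source, ttl):
    """Flood a query from `source`, forwarding it at most `ttl` hops.

    graph:   dict mapping each peer to a list of its neighbors (hypothetical)
    holders: set of peers that store the requested file (hypothetical)
    Returns (set of reached peers holding the file, number of messages sent).
    """
    visited = {source}
    frontier = deque([(source, 0)])
    hits = set()
    messages = 0
    while frontier:
        peer, hops = frontier.popleft()
        if peer in holders and peer != source:
            hits.add(peer)
        if hops == ttl:
            continue  # TTL exhausted: this peer does not forward the query
        for neighbor in graph[peer]:
            messages += 1  # duplicates are still sent, so they still cost load
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return hits, messages
```

Even in a five-peer graph the message count grows quickly with the TTL, which is the load problem that shortcuts are meant to reduce.
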
5
Background
  • DHT (Chord)
  • Given a key, Chord maps the key to the node
    responsible for it.
  • Each node needs to maintain O(log N) routing
    information.
  • Each query uses O(log N) messages.
  • Key search means searching by exact name.
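
The key-to-node mapping can be illustrated with a small consistent-hashing sketch. It shows only the mapping step (a key belongs to the first node at or after its identifier on the ring), not Chord's O(log N) finger-table routing. The 16-bit identifier space `M` and the helper names are assumptions chosen to keep the example readable.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits; an assumption for readability


def chord_id(name):
    """Hash an exact name onto the identifier ring [0, 2**M)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** M)


def successor(node_ids, key_id):
    """Return the first node id >= key_id, wrapping around the ring.

    node_ids must be a sorted list of node identifiers.
    """
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]


def node_for(node_ids, name):
    """Map an exact file name to the node responsible for it."""
    return successor(node_ids, chord_id(name))
```

Because the whole name is hashed, a lookup only succeeds for the exact name that was stored, which is why a DHT does not support keyword search directly.
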

6
Outline
  • Background
  • Design of Interest-based Locality
  • Simulation of Interest-based Locality
  • Enhancement of Interest-based Locality
  • Understanding the scheme

7
Interest-based Locality
  • Peers that have similar interests will share
    similar content.

8
Architecture
  • Shortcuts are modular.
  • Shortcuts are performance enhancement hints.

9
Creation of shortcuts
  • The peer uses the underlying topology (e.g.
    Gnutella) for the first few searches.
  • One of the peers that returned results is
    selected at random and added to the shortcut list.
  • Shortcuts in the list are ordered by a metric,
    e.g. success rate or path latency.
  • Subsequent queries go through the shortcut list
    first.
  • If that fails, the lookup falls back to the
    underlying topology.
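
The steps above can be sketched as a small class. This is an illustration under stated assumptions, not the authors' implementation: the class name `ShortcutPeer`, the `flood_lookup` and `has_item` callables, and the use of a success count as the ordering metric are all hypothetical choices made for the example.

```python
import random


class ShortcutPeer:
    """A peer that tries its shortcut list before the underlying topology."""

    def __init__(self, flood_lookup, rng=None):
        # flood_lookup(item) -> list of peers holding item; it stands in for
        # a search over the underlying topology (hypothetical interface).
        self.flood_lookup = flood_lookup
        self.rng = rng or random.Random()
        self.successes = {}  # shortcut peer -> number of queries it answered

    def ranked_shortcuts(self):
        """Shortcuts ordered by the metric (here: success count, descending)."""
        return sorted(self.successes, key=self.successes.get, reverse=True)

    def lookup(self, item, has_item):
        """has_item(peer, item) -> bool asks one shortcut peer directly."""
        for peer in self.ranked_shortcuts():
            if has_item(peer, item):
                self.successes[peer] += 1
                return peer, "shortcut"
        providers = self.flood_lookup(item)  # fall back to the topology
        if not providers:
            return None, "miss"
        chosen = self.rng.choice(providers)  # pick one responder at random
        self.successes.setdefault(chosen, 0)  # add it as a new shortcut
        return chosen, "flood"
```

A real deployment would also bound the list size and could rank by path latency instead; both are left out to keep the sketch short.
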

10
Outline
  • Background
  • Design of Interest-based Locality
  • Simulation of Interest-based Locality
  • Enhancement of Interest-based Locality
  • Understanding the scheme

11
Performance Evaluation
  • Performance metrics:
  • success rate
  • load characteristics (number of query packets each
    peer processes in the system)
  • query scope (the fraction of peers involved in each
    query)
  • minimum reply path length
  • additional state kept in each node
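
As a rough illustration of how three of these metrics could be computed from a simulation log, the sketch below assumes a hypothetical record format: each query is a dict with a `via_shortcut` flag and a `contacted` list of the peers that processed it. The exact definitions used in the original evaluation may differ; for instance, load is counted here per peer over the whole log rather than per unit of time.

```python
from collections import Counter


def summarize(queries, total_peers):
    """Return (success rate, mean query scope, mean load per peer).

    queries: list of dicts with keys
        'via_shortcut' (bool)  - was the query resolved through a shortcut?
        'contacted'    (list)  - peers that processed the query
    """
    n = len(queries)
    # Success rate: fraction of queries resolved through shortcuts.
    success_rate = sum(q["via_shortcut"] for q in queries) / n
    # Query scope: fraction of all peers involved, averaged over queries.
    scope = sum(len(set(q["contacted"])) for q in queries) / (n * total_peers)
    # Load: query packets processed, averaged over all peers.
    load = Counter(p for q in queries for p in q["contacted"])
    mean_load = sum(load.values()) / total_peers
    return success_rate, scope, mean_load
```
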

12
Methodology: query workload
  • Create query traces from real application
    traffic:
  • Boeing firewall proxies
  • Microsoft firewall proxies
  • Passively collected web traffic between CMU and
    the Internet
  • Passively collected typical P2P traffic (Kazaa,
    Gnutella)
  • Use exact matching rather than keyword matching
    in the simulation.
  • "song.mp3" and "my artist song.mp3" are
    treated as different files.
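
The difference between the two matching rules can be shown in a few lines. The whitespace tokenization in `keyword_match` is an assumption for illustration only; real keyword search in Kazaa or Gnutella clients is not specified here.

```python
def exact_match(query, filename):
    """Match only when the whole name is identical (used in the simulation)."""
    return query == filename


def keyword_match(query, filename):
    """Match when every query word appears among the file name's words."""
    words = set(filename.lower().split())
    return all(term in words for term in query.lower().split())
```

Under exact matching, a query for "song.mp3" does not find "my artist song.mp3", although a keyword search would.
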

13
Methodology: underlying peer topology
  • Based on the Gnutella connectivity graph in 2001,
    with 95% of nodes about 7 hops away.
  • The search TTL is set to 7.
  • For each kind of traffic (Boeing, Microsoft,
    etc.), run 8 simulations of 1 hour each.

14
Methodology: storage and replication modeling
(web)
  • The first peer to request a web page is modeled
    as the first node holding that page.
  • Subsequent searches from other peers find the
    page at this peer and replicate it.
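
The web replication model above can be replayed over a request trace as follows. The `(peer, url)` trace format and the function name are assumptions for the sketch.

```python
def replay_web_trace(requests):
    """Replay (peer, url) requests in time order.

    Returns (holders, lookups):
      holders: dict url -> set of peers holding the page at the end
      lookups: list of (peer, url, sorted providers at request time)
    """
    holders = {}
    lookups = []
    for peer, url in requests:
        if url not in holders:
            # The first requester is modeled as the node that holds the page.
            holders[url] = {peer}
            continue
        providers = sorted(holders[url] - {peer})
        lookups.append((peer, url, providers))
        holders[url].add(peer)  # the requester now keeps a replica
    return holders, lookups
```

Each later request therefore has at least as many potential providers as the one before it.
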

15
Methodology: storage and replication modeling
(P2P)
  • If the collected traffic trace shows a file being
    downloaded at time t0, the file is assumed to have
    been available for download before t0 as well.
  • However, if a file is not downloaded during the
    sampled trace, there is no information to
    indicate that the file exists.

16
Simulation Results: success rate

17
Simulation Results: load, scope and path length
-- Query load for the Boeing and Microsoft traffic
-- Query scope is about 0.3% for the shortcut scheme,
whereas in Gnutella it is about 100%.
-- Average path length for the traces

18
Outline
  • Background
  • Design of Interest-based Locality
  • Simulation of Interest-based Locality
  • Enhancement of Interest-based Locality
  • Understanding the scheme

19
Increase Number of Shortcuts

20
Using Shortcuts' Shortcuts
  • Idea

Add the shortcuts' shortcuts.
Performance gain of 7% on average.
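
One way to read this idea: when none of a peer's own shortcuts holds the item, it asks the peers on its shortcuts' shortcut lists before falling back to the underlying topology. The sketch below assumes that reading; the function name, the `shortcut_lists` mapping, and the `has_item` callable are hypothetical.

```python
def lookup_two_level(item, my_shortcuts, shortcut_lists, has_item):
    """Try own shortcuts, then the shortcuts of those shortcuts.

    my_shortcuts:   ordered list of this peer's shortcut peers
    shortcut_lists: dict peer -> that peer's own ordered shortcut list
    has_item:       callable (peer, item) -> bool
    Returns (peer, level), with level 1 or 2, or (None, 0) when both levels
    fail and the caller must fall back to the underlying topology.
    """
    for peer in my_shortcuts:
        if has_item(peer, item):
            return peer, 1
    asked = set(my_shortcuts)
    for peer in my_shortcuts:
        for second in shortcut_lists.get(peer, []):
            if second in asked:
                continue  # do not query the same peer twice
            asked.add(second)
            if has_item(second, item):
                return second, 2
    return None, 0
```
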
21
Outline
  • Background
  • Design of Interest-based Locality
  • Simulation
  • Enhancement of Interest-based Locality
  • Understanding the scheme

22
Interest-based Structures
  • When the shortcuts are viewed as an undirected
    graph:
  • In the first 10 minutes, there are many connected
    components, each containing only a few peers.
  • At the end of the simulation, there are few
    connected components, each containing several
    hundred peers. Each component is well connected.
  • The clustering coefficient is about 0.6 to 0.7,
    which is higher than that of the Web graph.
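
For readers unfamiliar with the two graph measures used above, here is a minimal sketch on an undirected graph stored as a dict of neighbor sets (an assumed representation). The clustering coefficient is computed as the average of the local coefficients, with nodes of degree below 2 contributing 0; other conventions exist, and the slides do not say which one was used.

```python
from itertools import combinations


def connected_components(adj):
    """Return the connected components of an undirected graph as sets."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


def clustering_coefficient(adj):
    """Average local clustering coefficient over all nodes."""
    total = 0.0
    for node, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            continue  # fewer than two neighbors: contributes 0
        # Count neighbor pairs that are themselves connected.
        links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)
```

A value of 0.6 to 0.7 means that most pairs of a peer's shortcut neighbors are also linked to each other.
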

23
Web Objects Locality
  • A web page contains several web objects, so
    locality should exist among these objects.

24
Locality Across Publishers
  • Same-publisher shortcuts exhibit low interest
    locality; a peer may actually be interested in
    content from different publishers.

Same-publisher shortcuts are shortcuts that were
originally created when accessing content from the
same publisher as the current request.

25
Sensitivity of Shortcuts
  • Run interest-based shortcuts over a DHT (Chord)
    instead of Gnutella.

Query load is reduced by a factor of 2 to 4. Query
scope is reduced from 7/N to 1.5/N.

26
Conclusion
  • Interest-based shortcuts are modular
    performance-enhancement hints layered over an
    existing P2P topology.
  • Shortcuts are shown to improve search
    efficiency.
  • Shortcuts form clusters within a P2P topology,
    and the clusters are well connected.