Semantic Overlay Networks for P2P Systems

1
Semantic Overlay Networks for P2P Systems
  • Arturo Crespo and Hector Garcia-Molina

2
Motivation and Related Work
  • Partition the P2P network into several thematic
    networks
  • Queries for a given kind of content will not reach
    nodes that hold no such content
  • Flood smaller networks with a smaller TTL (or get
    more results with the same TTL)
  • Edutella, HyperCuP: peers with similar content
    connect to the same super-peer
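
To make the flooding and TTL point concrete, here is a minimal sketch of TTL-limited flooding over an adjacency map. The function name, the graph representation, and the rule that a node does not send the query back to the peer it came from are assumptions made for illustration; they are not taken from the slides.

```python
from collections import deque

def flood(graph, start, ttl):
    """Flood a query from `start` for at most `ttl` hops.

    graph: dict mapping each node to a list of its neighbors (hypothetical input format).
    Returns (set of nodes reached, number of messages sent).
    """
    reached = {start}
    messages = 0
    # Each frontier entry is (node, peer the query came from, hops left).
    frontier = deque([(start, None, ttl)])
    while frontier:
        node, sender, hops_left = frontier.popleft()
        if hops_left == 0:
            continue  # TTL expired: this node does not forward the query
        for neighbor in graph[node]:
            if neighbor == sender:
                continue  # assumption: never echo the query back to its sender
            messages += 1  # every forwarded copy costs one message
            if neighbor not in reached:
                reached.add(neighbor)
                frontier.append((neighbor, node, hops_left - 1))
    return reached, messages

# A four-node chain a - b - c - d, flooded from "a" with TTL 2.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
reached, messages = flood(chain, "a", 2)
```

A smaller overlay means the same TTL covers a larger share of the relevant nodes, which is the trade-off the bullets above describe.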

3
Node partitioning
  • When does a node belong to SON A?
  • When it contains a piece of type A
  • When it contains more than x pieces of type A
  • Fewer nodes per SON => more results sooner
  • Fewer SONs per node => fewer connections
  • As in DBFS, coverage problem
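
The two membership rules on this slide can be sketched as one function with a threshold x. The function name and the input format (a list with one type label per document) are hypothetical; x = 0 reproduces the "contains a piece of type A" rule.

```python
from collections import Counter

def sons_for_node(doc_types, x=0):
    """Return the SONs a node joins: every type with more than x documents.

    doc_types: one type label per document stored at the node (assumed format).
    """
    counts = Counter(doc_types)
    return {doc_type for doc_type, n in counts.items() if n > x}

library = ["rock", "rock", "rock", "jazz"]
conservative = sons_for_node(library)        # join on any single document
stricter = sons_for_node(library, x=1)       # require more than one document
```

Raising x leaves the node in fewer SONs (fewer connections) at the cost of coverage: its single jazz document is no longer reachable through the jazz SON.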

4
SON Classification
  • Classification must provide
  • Load-balance
  • Each category has similar number of nodes
  • Each node belongs to a small number of categories
  • Easy and accurate way to classify a document
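
One way to check the first two requirements against real data is to count nodes per category and categories per node. This is only a sketch; the function name and the membership format (node mapped to its set of categories) are assumptions.

```python
from collections import Counter

def classification_stats(membership):
    """Summarize how well a classification balances load.

    membership: dict mapping node id -> set of categories it belongs to (assumed format).
    Returns (nodes per category, mean number of categories per node).
    """
    nodes_per_category = Counter()
    for categories in membership.values():
        nodes_per_category.update(categories)
    mean_categories = (
        sum(len(c) for c in membership.values()) / len(membership)
        if membership else 0.0
    )
    return dict(nodes_per_category), mean_categories

membership = {
    "n1": {"rock"},
    "n2": {"rock", "jazz"},
    "n3": {"jazz"},
}
per_category, mean_categories = classification_stats(membership)
```

Similar counts across categories indicate balanced SONs, and a low mean indicates that each node maintains few connections.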

5
Statistics
  • Classification from All Music Guide based on
    music style (26 styles without substyles, 255
    with substyles)
  • Traces from 1800 Napster nodes
  • Classifications by decade and tone give worse
    results

6
Statistics of styles
  • A node belongs to a style if it has at least 1
    document of that style
  • 24% of the nodes belong to one style
  • 90% of the nodes belong to up to eight
  • 14 styles with 200 to 400 nodes
  • 2 styles with 1000 to 1200 nodes
  • 1 style with 1600 to 1800 nodes

7
Statistics of substyles
  • 18% of the nodes belong to one substyle
  • 90% of the nodes belong to up to 30 styles
  • 87% of substyles have fewer than 400 nodes

8
Statistics of document classification
  • Classification based on the All Music Guide
    database
  • Reasons for failure
  • Filename format
  • Not all documents are songs
  • Misspellings
  • Database not complete
  • 25% of documents classified incorrectly
  • 4% of nodes classified incorrectly

9
Layered SONs
  • Styles and substyles
  • Minimum required percentage (or number) of
    documents of type A to join SON A
  • May not be able to join a substyle but may join a
    style
  • Nodes belonging to a substyle do not join the
    parent style
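
The join rules on this slide can be sketched as follows. This is a simplified reading, not the authors' algorithm: it assumes the threshold is a share of the node's documents, applies the same threshold at both levels, and uses a hypothetical (style, substyle) pair per document.

```python
from collections import Counter

def layered_sons(docs, min_share):
    """Decide which style and substyle SONs a node joins.

    docs: list of (style, substyle) pairs, one per document (assumed format).
    min_share: minimum fraction of the node's documents needed to join a SON.
    Returns a set of (style, substyle) SONs; (style, None) means the style-level SON.
    """
    total = len(docs)
    if total == 0:
        return set()
    substyle_counts = Counter(docs)
    style_counts = Counter(style for style, _ in docs)

    joined = set()
    covered_styles = set()
    # First try the substyle level.
    for (style, substyle), n in substyle_counts.items():
        if n / total >= min_share:
            joined.add((style, substyle))
            covered_styles.add(style)
    # Fall back to the parent style only if no substyle of it was joined.
    for style, n in style_counts.items():
        if style not in covered_styles and n / total >= min_share:
            joined.add((style, None))
    return joined

docs = [("Rock", "Punk")] * 2 + [("Rock", "Grunge")] * 2 + [("Jazz", "Bop")] * 6
sons = layered_sons(docs, min_share=0.3)
```

In this example neither Rock substyle reaches 30% on its own, but Rock as a whole does, so the node joins the Rock style SON and the Jazz/Bop substyle SON.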

10
Searching
  • A document is classified to belong to (sub)style
    A
  • Search all substyles of style A SONs
  • Search (sub)style A
  • Search a higher level SON
  • Until we get enough results
  • (How do we locate each SON?)
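
The "until we get enough results" loop can be sketched as below. The order in which SONs are visited is left to the caller, because the slide does not fix it for every case; the function name, the `query_son` callback, and the SON labels are all hypothetical.

```python
def layered_search(son_order, query_son, wanted):
    """Query SONs one at a time until at least `wanted` distinct results arrive.

    son_order: SON identifiers, most specific first (order chosen by the caller).
    query_son: callable returning the list of matching documents in one SON.
    """
    results = []
    for son in son_order:
        for hit in query_son(son):
            if hit not in results:  # drop duplicates found in several SONs
                results.append(hit)
        if len(results) >= wanted:
            break  # enough results: do not escalate to broader SONs
    return results

# Toy data standing in for real SON queries.
hits_by_son = {"Rock/Punk": ["a"], "Rock": ["a", "b"], "All": ["c"]}
visited = []

def query_son(son):
    visited.append(son)
    return hits_by_son[son]

found = layered_search(["Rock/Punk", "Rock", "All"], query_son, wanted=2)
```

Here the search stops after the Rock SON and never reaches the broadest one, which is where the message savings come from.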

11
Results
  • Acyclic graphs, to measure the effect of messages
    sent to nodes without matching content
  • Messages needed to get half the documents that
    match a query
  • Layered SONs: 461 msgs
  • Gnutella: 1731 msgs
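
As a quick check on what the two message counts imply, the arithmetic below uses only the 461 and 1731 figures from this slide; the variable names are illustrative.

```python
layered_son_msgs = 461
gnutella_msgs = 1731

# How many times more messages Gnutella needs for the same recall.
ratio = gnutella_msgs / layered_son_msgs

# Fraction of messages saved by Layered SONs relative to Gnutella.
reduction = 1 - layered_son_msgs / gnutella_msgs

print(f"Gnutella sends {ratio:.2f}x the messages; Layered SONs save {reduction:.0%}")
```

That is roughly 3.75 times fewer messages, or about a 73% reduction, to retrieve half of the matching documents.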