Taxonomy Caching: A Scalable Low-Cost Mechanism for Indexing Remote Contents in Peer-to-Peer Systems - PowerPoint PPT Presentation

Athens University of Economics and Business. Athens, Greece. June 28, 2006. ICPS'2006 ... Mobile devices high storage capacity & wireless support ... – PowerPoint PPT presentation

1
Taxonomy Caching: A Scalable Low-Cost Mechanism
for Indexing Remote Contents in Peer-to-Peer
Systems

Kjetil Nørvåg
Norwegian University of Science and Technology
Trondheim, Norway

Christos Doulkeridis and Michalis Vazirgiannis
Athens University of Economics and Business
Athens, Greece

2
Outline

Motivation and example application
Taxonomies and taxonomy-based querying
Taxonomy-based query routing
Taxonomy caching architecture and maintenance
Experimental results
Summary and further work

3
Motivation

Mobile devices: high storage capacity and wireless
support
Contain multimedia documents that can be shared
Possibly other data/services
Temperature or other environmental data
Important challenge: find the files and services!
Problem
Dynamic contents, location, and visibility
Limited bandwidth
→ Centralized indexing/search engines not
applicable
→ P2P network search

4
Example application: MobiShare

Devices share resources by hosting web services
Device connected to a CAS
CASs are connected to each other P2P
More details in Valavanis et al., Web
Intelligence 2003

5
Outline of basic idea

1) Describe contents according to taxonomy
2) Taxonomy info cached at remote peers
3) Use cached knowledge to route queries to
appropriate peers
Why?
1) Should reduce latency
2) Increase recall with same cost

6
Resource description

Taxonomy-based resource description
Also applicable for audio/video
More than one taxonomy might exist in system
Resource description: taxonomy ID and set of
categories
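
A resource description of this kind can be pictured as a small record. The following is only a sketch: the class name, field names, and the example category strings are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceDescription:
    """Hypothetical record: a taxonomy ID plus the categories of one resource."""
    taxonomy_id: str
    categories: frozenset

    def in_category(self, category: str) -> bool:
        # True if the resource is described by the given category.
        return category in self.categories


# Example values are made up for illustration.
photo = ResourceDescription(
    taxonomy_id="dmoz",
    categories=frozenset({"Arts/Photography", "Recreation/Travel"}),
)
```

Keeping the taxonomy ID in the record reflects the point that more than one taxonomy might exist in the system.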

7
Taxonomy-based querying

Query
1) Request for all resources belonging to
category Cj
or
2) Request for all resources belonging to
category Cj and satisfying some additional
property
Example properties: text contents, metadata
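
The two query forms can be sketched as one function with an optional extra predicate. This is an illustration under assumptions: resources are modeled as plain dictionaries, and the names `run_query`, `library`, and the metadata field `year` are hypothetical.

```python
def run_query(resources, category, extra=None):
    """Return resources in `category`, optionally filtered by an extra property."""
    # Form 1: all resources belonging to the category.
    hits = [r for r in resources if category in r["categories"]]
    # Form 2: additionally require some property (e.g. on metadata).
    if extra is not None:
        hits = [r for r in hits if extra(r)]
    return hits


# Made-up local resources for illustration.
library = [
    {"name": "sunset.jpg", "categories": {"Arts/Photography"}, "metadata": {"year": 2005}},
    {"name": "harbor.jpg", "categories": {"Arts/Photography"}, "metadata": {"year": 2006}},
    {"name": "temps.csv", "categories": {"Science/Environment"}, "metadata": {"year": 2006}},
]

all_photos = run_query(library, "Arts/Photography")
photos_2006 = run_query(library, "Arts/Photography",
                        extra=lambda r: r["metadata"]["year"] == 2006)
```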

8
Searching in unstructured P2P networks

Basic search technique: local execution of query,
then forwarding if TTL > 0
Naïve flooding (all neighbors)
Normalized flooding (only K neighbors)
Random walks: only one random neighbor, but W
walks initiated
Problem: only a limited number of peers can be
searched (query horizon)
Possible improvements
Routing indices
Summary indexing (Bloom filters etc.)
Result caching
However: still limited scalability and coverage
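
The query horizon can be made concrete with a small simulation of TTL-limited flooding. This is a sketch, not the simulator used in the experiments: the function name, the adjacency-dict graph format, and the assumption that a peer processes a query only once are all choices made here for illustration.

```python
import random


def flood(graph, origin, ttl, k=None, rng=None):
    """Return the set of peers that execute the query.

    k=None models naive flooding (all neighbors); an integer k models
    normalized flooding (at most k randomly chosen neighbors per peer).
    """
    rng = rng or random.Random(0)
    reached = {origin}          # the origin executes the query locally
    frontier = [origin]
    while ttl > 0 and frontier:
        next_frontier = []
        for peer in frontier:
            neighbors = list(graph[peer])
            if k is not None and len(neighbors) > k:
                neighbors = rng.sample(neighbors, k)
            for n in neighbors:
                if n not in reached:   # assumed: duplicates are dropped
                    reached.add(n)
                    next_frontier.append(n)
        frontier = next_frontier
        ttl -= 1
    return reached


# A chain a - b - c - d: with TTL 2 the query never reaches d.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
horizon = flood(chain, "a", ttl=2)
```

Peer d holds whatever it holds, but it lies beyond the horizon of a query issued at a with TTL 2, which is the limitation the slide points out.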

9
Taxonomy caching

Basic idea
Maintain taxonomic information about remote contents in a
taxonomy cache (TCache)
Mapping from taxonomic concept to set of peers
Advantages
Cheaper to maintain than full-text index
More applicable to multimedia data
More robust with respect to changes in contents
Used to improve query routing
→ Higher recall and reduced latency
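
At its simplest, the mapping from taxonomic concept to set of peers is a dictionary of sets. The sketch below assumes that reading; the class and method names are hypothetical, and the real TCache stores more per entry than shown here.

```python
from collections import defaultdict


class TCache:
    """Minimal sketch: taxonomic category -> set of peers known to hold matches."""

    def __init__(self):
        self._peers = defaultdict(set)

    def add(self, category, peer_id):
        # Record that peer_id holds resources in this category.
        self._peers[category].add(peer_id)

    def lookup(self, category):
        # Return a copy so callers cannot mutate the cache by accident.
        return set(self._peers.get(category, ()))


cache = TCache()
cache.add("Arts/Photography", "peer-17")
cache.add("Arts/Photography", "peer-42")
```

Only category names and peer identifiers are stored, which is why such a cache is cheaper to maintain than a full-text index.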

10
Query routing using taxonomy cache (TCache)

Basis: one of the traditional routing strategies
Query forward peers: PF
Starting point: PF = neighbors PN = {PN1, ..., PNn}
Lookup in TCache: Lookup(category)
→ PC = {PC1, ..., PCm}
PF = PN ∪ PC
Query forwarded to (subset of) PF
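
The candidate set PF can be built as a plain union of the neighbor list and the TCache lookup result. In this sketch the function name is hypothetical, and excluding the peer the query arrived from is an assumption made here.

```python
def forward_set(neighbors, cached_peers, previous=None):
    """Build PF = PN ∪ PC, preserving order and skipping the previous hop."""
    pf = []
    for peer in list(neighbors) + list(cached_peers):
        if peer != previous and peer not in pf:
            pf.append(peer)
    return pf


# "b" is both a neighbor and a cached peer; "a" sent us the query.
pf = forward_set(["a", "b"], ["b", "z"], previous="a")
```

Which subset of PF actually receives the query depends on the forwarding alternative in use.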

11
Query forwarding alternatives (1)

Query forward peers: PF
Number of neighbors (excl. previous): Nn
Number of matches from lookup: Nc
Ranking of peers in PC
Based on number of resources within a category
High number of resources: considered experts
TCB
Highest ranked in PC + the Nn neighbors in
{PN1, ..., PNn}
Forwarding to peer in PC called jump
Jump can be to peer beyond query horizon!
TCA
If Nc ≥ Nn: forward to Nn highest ranked peers in
PC
If Nc < Nn: forward to all Nc peers in PC
+ (Nn-Nc) randomly selected neighbors
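
The TCA rule can be sketched as follows. The function names are hypothetical, and two details are assumptions not stated on the slide: ties in the ranking are broken by peer name, and the random fill-in neighbors are drawn from neighbors not already in PC.

```python
import random


def rank_peers(resource_counts):
    """Rank peers by number of resources in the category, highest first."""
    return sorted(resource_counts, key=lambda p: (-resource_counts[p], p))


def forward_tca(ranked_pc, neighbors, rng=None):
    """TCA sketch: fan-out stays at Nn, cached experts are preferred."""
    rng = rng or random.Random(0)
    nn, nc = len(neighbors), len(ranked_pc)
    if nc >= nn:
        # Enough cached peers: take the Nn highest ranked.
        return list(ranked_pc[:nn])
    # Otherwise all Nc cached peers plus (Nn - Nc) random neighbors.
    candidates = [p for p in neighbors if p not in ranked_pc]
    extra = rng.sample(candidates, min(nn - nc, len(candidates)))
    return list(ranked_pc) + extra


ranked = rank_peers({"x": 40, "y": 12, "z": 25})   # made-up counts
targets = forward_tca(ranked, ["a", "b"])
```

Because the fan-out never exceeds Nn, the message cost per hop matches the underlying strategy while the targets are chosen more selectively.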

12
Query forwarding alternatives (2)

TCCN
If Nc ≥ Nn: forward to all Nc peers in PC
If Nc < Nn: forward to all Nc peers in PC
+ (Nn-Nc) neighbors
TCDN
If Nc ≥ Nn: forward to Nn/2 highest ranked peers
in PC + random selection of Nn/2 other peers in
PC
If Nc < Nn: forward to all Nc peers in PC
+ (Nn-Nc) neighbors
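
A sketch of TCCN and TCDN, side by side. The function names are hypothetical. Assumptions made here and not taken from the slides: Nn/2 is rounded down for odd Nn, and the (Nn-Nc) fill-in neighbors are simply the first neighbors not already in PC.

```python
import random


def _fill_with_neighbors(ranked_pc, neighbors):
    # Shared case Nc < Nn: all cached peers plus (Nn - Nc) neighbors.
    missing = len(neighbors) - len(ranked_pc)
    extra = [p for p in neighbors if p not in ranked_pc][:missing]
    return list(ranked_pc) + extra


def forward_tccn(ranked_pc, neighbors):
    """TCCN sketch: with enough cached peers, forward to all of them."""
    if len(ranked_pc) >= len(neighbors):
        return list(ranked_pc)
    return _fill_with_neighbors(ranked_pc, neighbors)


def forward_tcdn(ranked_pc, neighbors, rng=None):
    """TCDN sketch: half top-ranked experts, half random other cached peers."""
    rng = rng or random.Random(0)
    nn = len(neighbors)
    if len(ranked_pc) >= nn:
        half = nn // 2                      # assumed rounding for odd Nn
        top = list(ranked_pc[:half])
        rest = list(ranked_pc[half:])
        return top + rng.sample(rest, min(half, len(rest)))
    return _fill_with_neighbors(ranked_pc, neighbors)


pc = ["w", "x", "y", "z"]                   # already ranked, best first
tccn_targets = forward_tccn(pc, ["a", "b"])
tcdn_targets = forward_tcdn(pc, ["a", "b"])
```

The random half in TCDN spreads load away from the top experts and gives lower-ranked peers a chance to be revalidated.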

13
Distributing taxonomic information

Basic mechanism: piggyback matching category with
query result
Result returned through original path, possibly
involving jumps
Makes revalidation of contents in intermediate
TCaches possible
Coverage will be gradually extended (beyond query
horizon)
Lazy distribution by gossiping also possible

14
TCache architecture and maintenance

Aim: provide efficient mapping C → {PC1, ..., PCm}
For each category: peers, number of resources, and TTL
TTL
Regularly decremented
Reset to start value at revalidation
Caching policy: aggressive vs. selective
Compacting techniques: peer upgrade and non-expert
pruning
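
The TTL handling can be sketched as below. The class name, the start value, and the flat entry layout are hypothetical. Dropping an entry once its TTL reaches zero is an assumption: the slide only says the TTL is decremented regularly and reset at revalidation.

```python
class TTLTCache:
    """Sketch of a TCache whose entries carry a resource count and a TTL."""

    def __init__(self, start_ttl=3):
        self.start_ttl = start_ttl
        self.entries = {}   # (category, peer) -> [resource_count, ttl]

    def revalidate(self, category, peer, resource_count):
        # Insert or refresh an entry; the TTL is reset to its start value.
        self.entries[(category, peer)] = [resource_count, self.start_ttl]

    def tick(self):
        # Regular decrement; expired entries are dropped (assumed policy).
        for key in list(self.entries):
            self.entries[key][1] -= 1
            if self.entries[key][1] <= 0:
                del self.entries[key]

    def lookup(self, category):
        # Peers for the category, ranked by resource count (highest first).
        hits = [(p, e[0]) for (c, p), e in self.entries.items() if c == category]
        return [p for p, _ in sorted(hits, key=lambda h: (-h[1], h[0]))]


tc = TTLTCache(start_ttl=2)
tc.revalidate("Arts", "p1", 5)
tc.revalidate("Arts", "p2", 9)
```

An entry that keeps being revalidated by piggybacked results stays in the cache; one that is never confirmed again ages out.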

15
Experimental setup

Simulations
Excerpts of DMOZ taxonomy
Synthetic network topologies
Resource allocation: 80/20 rule
Queries are taxonomic categories
A number of peers have role as querying peers
Measured: contacted peers, messages, recall, and
latency
In this presentation: results using flooding and
TCDN query routing

16
Improvements in recall
TTL   NM (F)   NM (TC)   Recall (F)   Recall (TC)
  1      7.8       7.0       0.0022        0.0019
  3    166.7     166.0       0.0117        0.0149
  5    524.7     523.9       0.0282        0.0717
  7   1058.6    1057.7       0.0506        0.1835
  9   1721.0    1719.6       0.0773        0.2930
 11   2566.3    2566.0       0.1104        0.4012
 13   3536.5    3535.8       0.1477        0.4891
 15   4560.2    4558.7       0.1864        0.5755
17
Primary reason for improvement: more intelligent
query forwarding

TTL   NC (F)   NC (TC)   Recall (F)   Recall (TC)
  1      7.8       6.7       0.0022        0.0019
  3     45.3      53.4       0.0117        0.0149
  5    110.6     158.0       0.0282        0.0717
  7    199.9     346.8       0.0506        0.1835
  9    305.6     583.1       0.0773        0.2930
 11    437.7     840.3       0.1104        0.4012
 13    586.7    1120.6       0.1477        0.4891
 15    741.6    1372.4       0.1864        0.5755
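
The relative gains can be read off the two tables with a little arithmetic. The sketch below copies two rows (TTL 7 and TTL 15) from the tables; reading NM as the number of messages, NC as the number of contacted peers, F as flooding and TC as TCache-based routing is an assumption based on the experimental setup slide.

```python
# TTL -> (NM_F, NM_TC, NC_F, NC_TC, recall_F, recall_TC), copied from the tables.
RESULTS = {
    7: (1058.6, 1057.7, 199.9, 346.8, 0.0506, 0.1835),
    15: (4560.2, 4558.7, 741.6, 1372.4, 0.1864, 0.5755),
}


def ratios(ttl):
    """Return TC/F ratios for messages, contacted peers, and recall."""
    nm_f, nm_tc, nc_f, nc_tc, rec_f, rec_tc = RESULTS[ttl]
    return {
        "messages": nm_tc / nm_f,
        "contacted": nc_tc / nc_f,
        "recall": rec_tc / rec_f,
    }


for ttl in sorted(RESULTS):
    r = ratios(ttl)
    print(f"TTL={ttl}: messages x{r['messages']:.2f}, "
          f"contacted x{r['contacted']:.2f}, recall x{r['recall']:.2f}")
```

For these two rows the message count is essentially unchanged, while recall is roughly 3.6 times higher at TTL 7 and roughly 3.1 times higher at TTL 15, with fewer than twice as many distinct peers contacted.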
18
Improvement and scalability
19
Latency reduction

TCache results in very fast retrieval of first
results
Finding all results: approximately similar
performance, because flooding is used in both techniques

20
Summary and further work

Presented motivation and context
Taxonomy-based querying and query routing
TCache architecture and maintenance
Experimental results proving our claims
Future/ongoing work:
Employing the techniques for XML/XPath querying
in P2P context (to appear at IEEE P2P 2006)
Integration of different taxonomies
