Supporting Ranked Search in Parallel Search Cluster Networks

1
Supporting Ranked Search in Parallel Search Cluster Networks

Fang Xiong, Qiong Luo, Dyce Jing Zhao
{xfang, luo, zhaojing}@cs.ust.hk
Hong Kong University of Science and Technology

2
Introduction

Environment: P2P
Unstructured, super-peer, Parallel Search Cluster Network (PSCN)
Task: search
Data object ID
Filename
Content: ranked keyword search
Previous work on ranked search in P2P
PlanetP in the unstructured P2P network
Shen et al. in the super-peer network

3
Both are unstructured P2P networks
FSL: Forwarding Search Link
NIL: Non-forwarding Index Link
FIL: Forwarding Index Link
4
The Process of Ranked Search in a PSCN

Indexing time:
Build the local indexes
Transmit the local indexes across clusters through NILs
Querying time:
Forward the query within a cluster through FSLs
Collect the Local Aggregate Information (LAI) and merge it into the Global Aggregate Information (GAI)
The document-level index vs. the peer-level index:
With the document-level index, additional steps include calculating the Local Ranking (LR) and merging it into the Global Ranking (GR)
With the peer-level index, additional steps include calculating the Local Peer Ranking (LPR), merging it into the Global Peer Ranking (GPR), calculating the Local Document Ranking (LDR), and merging it into the Global Document Ranking (GDR)
Merge the locally ranked query results into globally ranked ones and return all or the top-K of them to the user
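
The query-time steps for the document-level index can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the presenters' implementation: the slides do not say what the LAI contains or which scoring function is used, so the sketch assumes the LAI is a document count plus per-term document frequencies and that documents are scored with a simple TF-IDF weight. All function names and the sample peers are hypothetical, and the peers are simulated in a single process rather than over FSLs.

```python
import heapq
import itertools
import math
from collections import Counter


def local_aggregate(docs, query_terms):
    """LAI of one peer (assumed content): document count and per-term document frequency."""
    df = Counter()
    for terms in docs.values():
        present = set(terms)
        for term in query_terms:
            if term in present:
                df[term] += 1
    return len(docs), df


def merge_aggregates(lais):
    """GAI: sum the document counts and document frequencies of all LAIs."""
    total_docs = sum(count for count, _ in lais)
    df = Counter()
    for _, local_df in lais:
        df.update(local_df)
    return total_docs, df


def local_ranking(docs, query_terms, gai):
    """LR: score this peer's documents with global statistics (assumed TF-IDF weight)."""
    total_docs, df = gai
    scored = []
    for doc_id, terms in docs.items():
        tf = Counter(terms)
        score = sum(
            tf[t] * math.log(1 + total_docs / df[t])
            for t in query_terms
            if df[t] > 0
        )
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties broken by document ID for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored


def global_ranking(local_rankings, k=None):
    """GR: merge the already-sorted LRs and keep all results or only the top K."""
    merged = heapq.merge(*local_rankings, key=lambda pair: (-pair[0], pair[1]))
    return list(merged) if k is None else list(itertools.islice(merged, k))


# Hypothetical sample data: two peers, each holding tokenized documents.
peers = {
    "p1": {"d1": ["ranked", "search", "p2p"], "d2": ["cluster", "network"]},
    "p2": {"d3": ["ranked", "ranked", "search"], "d4": ["storage"]},
}
query = ["ranked", "search"]

gai = merge_aggregates([local_aggregate(docs, query) for docs in peers.values()])
lrs = [local_ranking(docs, query, gai) for docs in peers.values()]
ranking = global_ranking(lrs)
top1 = global_ranking(lrs, k=1)
print([doc_id for _, doc_id in ranking])
```

Because every peer sorts its own LR before the merge, the merging peer only performs a k-way merge of sorted lists, which keeps most of the work local to the peers holding the documents.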

Average processing time spent on each step, using the document-level index
The majority of processing time is spent on local processing
This suggests that it is necessary to distribute the search workload evenly over multiple peers
Average processing time spent on each step, using the peer-level index

Average processing time in three overlays, using the document-level index
The processing time in the unstructured network is much larger than in the other two
The processing time in the super-peer network is about 30% larger than that in the PSCN
The processing time is slightly higher with the document-level index than with the peer-level index
Average processing time in three overlays, using the peer-level index

7
22% to 31% less
41% to 47% less
4.5 to 7.7 times higher
22% to 25% lower
8
Summary

The majority of processing time is spent on local processing. Therefore, it is beneficial to distribute the search workload over peers; otherwise, the bottleneck will be at the super-peers in a super-peer network or at the querying peer in an unstructured network.
The processing time and the storage cost per peer in a PSCN are the lowest among the three overlays. The downsides of a PSCN are the flooding communication within a cluster and the index replication cost across clusters. The super-peer network wins on network bandwidth usage and total storage cost.
Compared with document-level indexes, peer-level indexes save 70% of the processing time, 30% of the network bandwidth usage, and 30% of the storage space, with a slight decrease in precision.
