

Title: Supporting Ranked Search in Parallel Search Cluster Networks


1
Supporting Ranked Search in Parallel Search
Cluster Networks
  • Fang Xiong, Qiong Luo, Dyce Jing Zhao
  • {xfang, luo, zhaojing}@cs.ust.hk
  • Hong Kong University of Science and Technology

2
Introduction
  • Environment: P2P
  • Unstructured, super-peer, Parallel Search Cluster
    Network (PSCN)
  • Task: search
  • Data object ID
  • filename
  • Content: ranked keyword search
  • Previous work on ranked search in P2P
  • PlanetP in the unstructured P2P network
  • Shen et al. in the super-peer network

3
Both are unstructured P2P networks
FSL: Forwarding Search Link; NIL: Non-forwarding
Index Link; FIL: Forwarding Index Link

4
The Process of Ranked Search in a PSCN
  • Indexing time
  • Build the local indexes
  • Transmit the local indexes across clusters
    through NILs
  • Querying time
  • Forward the query within a cluster through FSLs
  • Collect the Local Aggregate Information (LAI),
    and merge into the Global Aggregate Information
    (GAI)
  • The document-level index vs. the peer-level index
  • In the case of using the document-level index,
    additional steps include calculating the Local
    Ranking (LR), and merging into the Global Ranking
    (GR)
  • In the case of using the peer-level index,
    additional steps include calculating the Local
    Peer Ranking (LPR), merging into the Global Peer
    Ranking (GPR), calculating the Local Document
    Ranking (LDR) and merging into the Global
    Document Ranking (GDR)
  • Merge the locally ranked query results into
    globally ranked ones and return all or top-K of
    them to the user
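
The document-level steps above (LAI, GAI, LR, GR) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the slides do not say what the aggregate information contains or which scoring function is used, so the sketch assumes the LAI is a document count plus per-term document frequencies and that documents are scored with a simple TF-IDF weight. All function names and the sample data are hypothetical.

```python
import heapq
import math
from collections import Counter


def local_aggregate(docs, terms):
    """LAI (assumed form): local document count and per-term document frequency."""
    df = Counter()
    for words in docs.values():
        present = set(words)
        for t in terms:
            if t in present:
                df[t] += 1
    return len(docs), df


def merge_aggregates(lais):
    """GAI: sum the document counts and document frequencies of all LAIs."""
    total = 0
    df = Counter()
    for n, local_df in lais:
        total += n
        df.update(local_df)
    return total, df


def local_ranking(docs, terms, gai, k):
    """LR: score local documents with TF-IDF (an assumption) using global statistics."""
    total, df = gai
    scored = []
    for doc_id, words in docs.items():
        tf = Counter(words)
        score = sum(tf[t] * math.log(1 + total / df[t]) for t in terms if df[t])
        if score > 0:
            scored.append((score, doc_id))
    return heapq.nlargest(k, scored)


def global_ranking(lrs, k):
    """GR: merge the per-peer top-k lists into one global top-k list."""
    return heapq.nlargest(k, (item for lr in lrs for item in lr))


# Hypothetical cluster of two peers, each holding a small local index.
peers = {
    "p1": {"d1": ["ranked", "search", "search"], "d2": ["cluster"]},
    "p2": {"d3": ["search", "network"], "d4": ["peer", "index"]},
}
terms = ["search"]
gai = merge_aggregates([local_aggregate(d, terms) for d in peers.values()])
lrs = [local_ranking(d, terms, gai, k=2) for d in peers.values()]
top = global_ranking(lrs, k=2)
```

Each peer only ships its top-k list, so the final merge stays cheap. With the peer-level index, the same pattern would run twice: once to rank peers (LPR into GPR) and once to rank documents on the selected peers (LDR into GDR).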

5
  • Average processing time spent on each step, using
    the document-level index
  • The majority of processing time is spent on
    local processing
  • This suggests that it is necessary to distribute
    the search workload evenly over multiple peers
  • Average processing time spent on each step, using
    the peer-level index

6
  • Average processing time in three overlays, using
    the document-level index
  • The processing time in the unstructured network
    is much longer than in the other two
  • The processing time in the super-peer network is
    about 30% longer than that in the PSCN
  • The processing time is slightly higher with the
    document-level index than with the peer-level
    index
  • Average processing time in three overlays, using
    the peer-level index

7
22% to 31% less
41% to 47% less
4.5 to 7.7 times higher
22% to 25% lower

8
Summary
  • The majority of processing time is spent on local
    processing. Therefore, it is beneficial to
    distribute the search workload over peers;
    otherwise, the bottleneck will be at the
    super-peers in a super-peer network or at the
    querying peer in an unstructured network.
  • The processing time and the storage cost per peer
    in a PSCN are the lowest among the three overlays.
  • The downsides of a PSCN are the flooding
    communication within a cluster and the index
    replication cost across clusters. The super-peer
    network wins on network bandwidth usage and
    total storage cost.
  • Compared with document-level indexes, peer-level
    indexes save 70% of the processing time, 30% of
    the network bandwidth usage and 30% of the
    storage space, with a slight decrease in
    precision.