SKYPEER: Efficient Subspace Skyline Computation over Distributed Data

Transcript and Presenter's Notes

Title: SKYPEER: Efficient Subspace Skyline Computation over Distributed Data


1
  • SKYPEER: Efficient Subspace Skyline Computation
    over Distributed Data
  • Akrivi Vlachou, Christos Doulkeridis, Yannis
    Kotidis
  • Department of Informatics
  • Athens University of Economics and Business
  • ICDE 2007
  • 69721040 ???

2
Outline
  • Introduction
  • Skyline Computation in P2P Networks
  • SKYPEER Algorithm
  • Experimental Evaluation
  • Conclusion

3
Introduction
  • The skyline query is used to find the set of
    non-dominated data points in a multi-dimensional
    dataset, and most previous work has assumed a
    centralized setting
  • This assumption is hardly applicable to
    large-scale P2P systems.
  • Relying on a super-peer architecture, we propose a
    threshold-based algorithm called SKYPEER, along
    with its variants
  • For efficient subspace skyline processing, we
    extend the notion of domination by defining the
    extended skyline set
  • Skyline queries help users make intelligent
    decisions over complex data
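
The dominance test behind a skyline query can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm from the presentation: it assumes smaller values are better on every dimension, and the names `dominates`, `skyline`, and the hotel data are hypothetical, chosen only for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]

def dominates(p: Point, q: Point) -> bool:
    """p dominates q: no worse on every dimension, strictly better on at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points: Sequence[Point]) -> List[Point]:
    """Quadratic-time reference version: keep each point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (price, distance-to-beach) pairs; lower is better on both.
hotels = [(50, 8), (80, 2), (60, 5), (90, 9), (70, 6)]
print(skyline(hotels))  # [(50, 8), (80, 2), (60, 5)]
```

The two remaining hotels are dropped because (50, 8) dominates (90, 9) and (60, 5) dominates (70, 6).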

4
Skyline Computation in P2P Networks
  • In general, super-peers maintain information
    about the peers they have been assigned, so that
    at query time, they can process a query without
    having to contact all peers.
  • In this work we assume that the super-peer
    topology is predefined and we focus on the
    optimization of interactions among super-peers
    and peers.
  • Preliminaries and Definitions
  • The querying peer Pinit is henceforth referred to as
    the initiator of a query
  • In the simplest approach, all peers send their local
    datasets to Pinit, where a centralized skyline
    algorithm is executed

5
Skyline Computation in P2P Networks
6
Skyline Computation in P2P Networks
  • Locally evaluate as many parts of the query as
    possible.
  • Each super-peer needs to collect from the
    associated peers only the skyline points of all
    subspaces.
  • Given the locally stored extended skyline, each
    super-peer individually processes a subspace
    skyline request and transmits the results to the
    query initiator.
  • This approach is considered the baseline and is
    henceforth referred to as naive.
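
The naive baseline described above can be sketched as follows. This is an illustration under stated assumptions: smaller values are better, each super-peer's stored points are given as a plain list, and `naive_query`, `subspace_skyline`, and the `stores` data are hypothetical names, not taken from the presentation.

```python
def dominates_in(p, q, U):
    """p dominates q when only the dimensions listed in U are considered."""
    return all(p[i] <= q[i] for i in U) and any(p[i] < q[i] for i in U)

def subspace_skyline(points, U):
    """Points not dominated by any other point in subspace U."""
    return [p for p in points if not any(dominates_in(q, p, U) for q in points)]

def naive_query(stores, U):
    """Every super-peer answers from its own store; the initiator merges all answers."""
    received = []
    for store in stores.values():
        received.extend(subspace_skyline(store, U))  # sent to the initiator
    return subspace_skyline(received, U), len(received)

# Hypothetical 3-dimensional points held by two super-peers.
stores = {
    "SP_A": [(1, 9, 4), (5, 5, 5), (9, 1, 6)],
    "SP_B": [(2, 8, 1), (6, 6, 2), (8, 2, 9)],
}
result, transferred = naive_query(stores, U=(0, 1))
print(result)       # [(1, 9, 4), (5, 5, 5), (9, 1, 6), (2, 8, 1), (8, 2, 9)]
print(transferred)  # 6
```

Six points travel to the initiator although only five belong to the final answer: (6, 6, 2) is a local skyline point at SP_B but is dominated by (5, 5, 5) from SP_A.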

7
(No Transcript)
8
Skyline Computation in P2P Networks
  • Extended Skyline
  • We adjust the dominance definition to compute all
    necessary values simultaneously during the
    skyline calculation.
  • Definition 1.
  • For any dimension set U, where $U \subseteq D$, p
    ext-dominates q if on each dimension $d_i \in U$,
    $p_i < q_i$. The ext-skyline (ext-$SKY_U$) is the set of all
    points that are not ext-dominated by any other point.
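
Definition 1 can be tried out on a small dataset. The sketch below is illustrative only: it assumes smaller values are better, uses hypothetical function names, and compares ordinary dominance with ext-dominance to show why the stricter test matters for subspace queries.

```python
from itertools import combinations

def dominates(p, q, U):
    """Ordinary dominance in subspace U: <= everywhere, < somewhere."""
    return all(p[i] <= q[i] for i in U) and any(p[i] < q[i] for i in U)

def ext_dominates(p, q, U):
    """Ext-dominance in subspace U: strictly smaller on every dimension of U."""
    return all(p[i] < q[i] for i in U)

def skyline(points, U):
    return [p for p in points if not any(dominates(q, p, U) for q in points)]

def ext_skyline(points, U):
    return [p for p in points if not any(ext_dominates(q, p, U) for q in points)]

points = [(1, 5), (1, 7), (3, 3), (4, 4)]  # hypothetical 2-dimensional data
D = (0, 1)

full_ext = ext_skyline(points, D)
print(full_ext)               # [(1, 5), (1, 7), (3, 3)]
print(skyline(points, D))     # [(1, 5), (3, 3)]
print(skyline(points, (0,)))  # [(1, 5), (1, 7)]

# Every subspace skyline is contained in the full-space ext-skyline.
subspaces = [U for r in range(1, len(D) + 1) for U in combinations(D, r)]
assert all(set(skyline(points, U)) <= set(full_ext) for U in subspaces)
```

The point (1, 7) is not in the full-space skyline, yet it is a skyline point in the subspace made of dimension 0 alone. The ext-skyline keeps it, which is why storing the ext-skyline is enough to answer any subspace query.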

9
Skyline Computation in P2P Networks
10
SKYPEER Algorithm
  • Mapping
  • Each d-dimensional point p is transformed to a
    one-dimensional value f(p) based on the formula
    [formula not captured in the transcript]
  • Observation 5.
  • Let psky be a skyline point in a subspace U. A
    point p for which the following inequality holds
    cannot be a skyline point in subspace U.
    [inequality not captured in the transcript]
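
The transcript does not reproduce the mapping or the inequality, so the sketch below rests on an assumption: f(p) is taken to be the smallest coordinate of p, and the threshold derived from a skyline point psky is its largest coordinate within U. With these choices, f(p) > threshold means psky is strictly smaller than p on every dimension of U, so p can be pruned. All names are hypothetical, and smaller values are assumed to be better.

```python
def dominates_in(p, q, U):
    return all(p[i] <= q[i] for i in U) and any(p[i] < q[i] for i in U)

def f(p):
    """ASSUMED mapping: the minimum coordinate over all dimensions."""
    return min(p)

def dist(p, U):
    """ASSUMED threshold contribution of p: its maximum coordinate within U."""
    return max(p[i] for i in U)

def threshold_scan(points, U):
    """Scan in ascending f order and stop once f(p) exceeds the threshold t."""
    sky, t, scanned = [], float("inf"), 0
    for p in sorted(points, key=f):
        if f(p) > t:
            break  # every remaining point has an even larger f value
        scanned += 1
        if any(dominates_in(s, p, U) for s in sky):
            continue
        sky = [s for s in sky if not dominates_in(p, s, U)] + [p]
        t = min(t, dist(p, U))
    return sky, scanned

points = [(1, 9, 4), (2, 2, 3), (5, 5, 5), (6, 7, 8), (9, 1, 6), (7, 8, 9)]
sky, scanned = threshold_scan(points, U=(0, 1))
print(sky)      # [(1, 9, 4), (9, 1, 6), (2, 2, 3)]
print(scanned)  # 3
```

Once (2, 2, 3) is found the threshold drops to 2, and the remaining three points are discarded without a single dominance test.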

11
SKYPEER Algorithm
12
SKYPEER Algorithm
  • Merging
  • Each peer executes a local subspace skyline
    computation
  • The initiator peer computes the overall subspace
    skyline result by merging the local results

13
SKYPEER Algorithm
  • Optimization
  • 1. Threshold propagation
  • (i) Fixed Threshold
  • Pinit calculates its threshold t for q(U, t)
    and forwards the threshold value to all
    super-peers.
  • (ii) Refined Threshold
  • Pinit calculates its threshold and sends it to
    its neighboring super-peers. These do not
    forward it immediately to other super-peers;
    they first compute their own subspace
    skyline, calculate the new threshold t, and then
    forward q(U, t).
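
The difference between the two propagation strategies can be sketched on a chain of three super-peers. This is a simplified illustration, not the presented algorithm: it assumes f(p) is the minimum coordinate and the threshold is the maximum coordinate of a skyline point within U (neither formula appears in the transcript), it models the network as a plain list, and all names are hypothetical.

```python
def dominates_in(p, q, U):
    return all(p[i] <= q[i] for i in U) and any(p[i] < q[i] for i in U)

def subspace_skyline(points, U):
    return [p for p in points if not any(dominates_in(q, p, U) for q in points)]

def f(p):
    return min(p)  # ASSUMED mapping

def dist(p, U):
    return max(p[i] for i in U)  # ASSUMED threshold contribution

def local_answer(store, U, t):
    """Skyline of the points surviving threshold t, plus the tightened threshold."""
    sky = subspace_skyline([p for p in store if f(p) <= t], U)
    return sky, min([t] + [dist(s, U) for s in sky])

def propagate(chain, U, refined):
    """chain[0] is the initiator's super-peer; returns result sizes per super-peer."""
    t, sizes = float("inf"), []
    for hop, store in enumerate(chain):
        sky, new_t = local_answer(store, U, t)
        sizes.append(len(sky))
        if hop == 0 or refined:  # fixed: only the initiator sets t
            t = new_t
    return sizes

chain = [
    [(4, 6, 5), (6, 4, 7)],
    [(1, 2, 9), (7, 7, 7)],
    [(3, 9, 3), (9, 3, 4)],
]
print(propagate(chain, U=(0, 1), refined=False))  # [2, 1, 2]
print(propagate(chain, U=(0, 1), refined=True))   # [2, 1, 0]
```

With the fixed threshold (6) the last super-peer still returns two points. With the refined threshold, the second super-peer lowers t to 2 after finding (1, 2, 9), and the last super-peer returns nothing.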

14
SKYPEER Algorithm
  • 2. Merging strategy
  • (i) Fixed Merging
  • All super-peers forward their computed
    subspace skyline back to Pinit, and Pinit is
    responsible for merging the results and computing
    the resulting subspace skyline for q(U).
  • (ii) Progressive Merging
  • Each super-peer merges the results it receives
    with its locally computed subspace skyline,
    before sending the results back to the
    super-peer from which it originally received
    the query.
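
The two merging strategies can be compared with a short sketch. It is illustrative only: the super-peer network is modeled as a hypothetical tree rooted at the initiator's super-peer, smaller values are assumed to be better, and the names `fixed_merging` and `progressive_merging` are chosen for this example.

```python
def dominates_in(p, q, U):
    return all(p[i] <= q[i] for i in U) and any(p[i] < q[i] for i in U)

def subspace_skyline(points, U):
    return [p for p in points if not any(dominates_in(q, p, U) for q in points)]

def fixed_merging(stores, U):
    """Every super-peer ships its local skyline to the initiator, which merges once."""
    received = [p for store in stores.values() for p in subspace_skyline(store, U)]
    return subspace_skyline(received, U), len(received)

def progressive_merging(node, tree, stores, U):
    """Each super-peer merges its children's results with its own before replying."""
    partial = subspace_skyline(stores[node], U)
    for child in tree.get(node, []):
        partial = subspace_skyline(
            partial + progressive_merging(child, tree, stores, U), U
        )
    return partial

# Hypothetical topology: the query travels SP0 -> SP1 -> SP2.
tree = {"SP0": ["SP1"], "SP1": ["SP2"]}
stores = {
    "SP0": [(4, 6), (6, 4)],
    "SP1": [(1, 2), (7, 7)],
    "SP2": [(3, 9), (9, 3)],
}
U = (0, 1)
print(fixed_merging(stores, U))                     # ([(1, 2)], 5)
print(progressive_merging("SP1", tree, stores, U))  # [(1, 2)]
print(progressive_merging("SP0", tree, stores, U))  # [(1, 2)]
```

Both strategies return the same skyline. Under fixed merging five points reach the initiator's super-peer, while under progressive merging SP1 has already discarded SP2's points and passes on a single point.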

15
Experimental Evaluation
16
Experimental Evaluation
17
Conclusion
  • The 4 SKYPEER variants are more efficient than the
    naive approach
  • Subspace?