On Computing Top-t Most Influential Spatial Sites

VLDB 2005, Trondheim, Norway.

1
On Computing Top-t Most Influential Spatial Sites
  • Tian Xia, Donghui Zhang, Evangelos Kanoulas, Yang Du
  • Northeastern University
  • Boston, USA

2
Outline
  • Problem Definition
  • Related Work
  • The New Metric minExistDNN
  • Data Structures and Algorithm
  • Experimental Results
  • Conclusions

3
Problem Definition
  • Given:
  • a set of sites S,
  • a set of weighted objects O,
  • a spatial region Q,
  • an integer t.
  • Top-t most influential sites query:
  • find the t sites in Q with the largest influences.
  • Influence of a site s = total weight of the objects
    that consider s as their nearest site.
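
The definition above can be made concrete with a brute-force sketch. This is not the algorithm proposed in the talk; it only restates the problem. The function name `top_t_influential`, the dictionary and list layouts, and the `in_region` predicate are assumptions made for illustration.

```python
from collections import defaultdict
from math import dist


def top_t_influential(sites, objects, in_region, t):
    """Brute-force top-t query.

    sites:     {site_name: (x, y)}
    objects:   list of ((x, y), weight)
    in_region: predicate telling whether a location lies in Q
    """
    influence = defaultdict(float)
    for location, weight in objects:
        # An object's nearest site is chosen among ALL sites,
        # including sites that lie outside Q.
        nearest = min(sites, key=lambda s: dist(sites[s], location))
        influence[nearest] += weight
    # Only sites inside Q are candidates for the answer.
    candidates = [s for s in sites if in_region(sites[s])]
    candidates.sort(key=lambda s: influence[s], reverse=True)
    return [(s, influence[s]) for s in candidates[:t]]
```

Note that a site outside Q can still "steal" objects from sites inside Q, which is why the nearest-site search ignores the region.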

4
Motivation
  • Which supermarket in Boston is the most
    influential among residential buildings?
  • Sites: supermarkets
  • Objects: residential buildings
  • Weight: number of people in a building
  • Query region: Boston
  • Which wireless station in Boston is the most
    influential among mobile users?

5
Example
  • Suppose all objects have weight 1, Q is the
    whole space, and t = 1.
  • The most influential site is s1, with influence
    3.

6
Example
[Figure: sites s1 to s4 and objects o1 to o6]
  • Now Q is the shaded rectangle and t = 2.
  • Top-2 most influential sites: s4 and s2.

7
Outline
  • Problem Definition
  • Related Work
  • The New Metric minExistDNN
  • Data Structures and Algorithm
  • Experimental Results
  • Conclusions

8
Related Work
  • Bi-chromatic RNN query considers two datasets,
    sites and objects.
  • The RNNs of a site s ∈ S are the objects that
    consider s as the nearest site.

9
Related Work
  • Solutions to the RNN query based on
    pre-computation [KM00, YL01].

10
Related Work
  • Solution to the RNN query based on the Voronoi
    diagram [SRAE01].
  • Compute the Voronoi cell of s: the region enclosing
    the locations closer to s than to any other
    site.
  • Query the object R-tree using the Voronoi cell.

11
Related Work [SRAE01]
[Figure: sites s1 to s4 and objects o1 to o6]

12
Our Problem vs. RNN Query
  • RNN query:
  • A single site as the input.
  • Interested in the actual set of RNNs.
  • Top-t most influential sites query:
  • A spatial region as the input.
  • Interested in the aggregate weight of the RNNs.

13
Straightforward Solution 1
  • For each site, pre-compute its influence.
  • At query time, find the sites in Q and return the
    t sites with max influences.
  • Drawback 1: Costly maintenance upon updates.
  • Drawback 2: Binds a set of sites closely to a
    set of objects.

14
Straightforward Solution 2
  • An extension of the Voronoi diagram based
    solution to the RNN query.
  • Find all sites in Q.
  • For each such site, find its RNNs by using the
    Voronoi cell, and compute its influence.
  • Return the t sites with max influences.

15
Straightforward Solution 2
  • Drawback 1: All sites in Q need to be retrieved
    from the leaf nodes.
  • Drawback 2: The object R-tree and the site R-tree
    are browsed multiple times.
  • For each site in Q, browse the site R-tree to
    compute the Voronoi Cell.
  • For each such Voronoi Cell, browse the object
    R-tree to compute the influence.

16
Features of Our Solution
  • Systematically browse both trees once.
  • Pruning techniques are provided based on a new
    metric, minExistDNN.
  • No need to compute the influences for all sites
    in Q, or even to locate all sites in Q.

17
Outline
  • Problem Definition
  • Related Work
  • The New Metric minExistDNN
  • Data Structures and Algorithm
  • Experimental Results
  • Conclusions

18
Motivation
  • Intuitively, if some object in Oi may consider
    some site in Sj as an NN, Oi affects Sj.
  • To estimate the influences of all sites in a site
    MBR Sj, we need to know whether an object MBR Oi
    will affect Sj.

19
maxDist: A Loose Estimation
  • If maxDist(O1, S1) < minDist(O1, S2), O1 does not
    affect S2.
  • Why is this not good enough?
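
A minimal sketch of the loose test, assuming it compares maxDist(O1, S1) against minDist(O1, S2) and that MBRs are axis-aligned rectangles stored as `(xmin, ymin, xmax, ymax)` tuples. The function names are hypothetical. The test is safe because every object in O1 is within maxDist(O1, S1) of every site in S1, but it is loose because maxDist is usually far larger than needed.

```python
from math import hypot


def min_dist(a, b):
    """Smallest distance between any point of a and any point of b."""
    dx = max(b[0] - a[2], a[0] - b[2], 0.0)
    dy = max(b[1] - a[3], a[1] - b[3], 0.0)
    return hypot(dx, dy)


def max_dist(a, b):
    """Largest distance between any point of a and any point of b."""
    dx = max(a[2] - b[0], b[2] - a[0])
    dy = max(a[3] - b[1], b[3] - a[1])
    return hypot(dx, dy)


def loosely_pruned(o, s1, s2):
    """True if object MBR o cannot affect site MBR s2 because of s1."""
    return max_dist(o, s1) < min_dist(o, s2)
```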

20
minMaxDist: A Tight Estimation?
  • An object o does not affect S2 if there exists
    S1 such that
  • minMaxDist(o, S1) < minDist(o, S2).

21
minMaxDist: A Tight Estimation?
  • Not true for an object MBR O1.

22
A Tight Estimation?
  • A metric m(O1, S1) should
  • guarantee that each location in O1 is within
    distance m(O1, S1) of a site in S1,
  • and be the smallest distance with this property.

23
New Metric: minExistDNN_S1(O1)
  • Definition: minExistDNN_S1(O1) =
    max { minMaxDist(l, S1) : location l ∈ O1 }
  • O1 does not affect S2 if there exists S1 such that
    minExistDNN_S1(O1) < minDist(O1, S2).
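
The definition can be illustrated numerically by sampling locations in O1 and taking the largest minMaxDist. This is only a sketch of the definition, not the exact calculation given later in the talk: a sampled maximum can underestimate the true value, so it should not be used for actual pruning. Rectangles are assumed to be `(xmin, ymin, xmax, ymax)` tuples, the function names are hypothetical, and the 2D minMaxDist is computed as the distance to the second-closest corner, as a later slide states.

```python
from itertools import product
from math import dist


def min_max_dist(p, rect):
    """2D minMaxDist from point p to rect: distance to the
    second-closest corner of rect."""
    xmin, ymin, xmax, ymax = rect
    corners = product((xmin, xmax), (ymin, ymax))
    return sorted(dist(p, c) for c in corners)[1]


def min_exist_dnn_sampled(o_rect, s_rect, steps=50):
    """Approximate max over locations l in o_rect of
    minMaxDist(l, s_rect), using a regular grid of samples."""
    xmin, ymin, xmax, ymax = o_rect
    xs = [xmin + (xmax - xmin) * i / steps for i in range(steps + 1)]
    ys = [ymin + (ymax - ymin) * j / steps for j in range(steps + 1)]
    return max(min_max_dist((x, y), s_rect) for x in xs for y in ys)
```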

24
Examples of minExistDNN_S1(O1)
  • How to calculate it?

25
Calculating minExistDNN_S1(O1)
  • Step 1: Space partitioning.
  • Every location l in the same partition is
    associated with the same second-closest corner of
    S1; the distance to that corner is minMaxDist(l, S1).

26
Space Partitioning
  • O1 is divided into multiple sub-regions, one in
    each partition.

27
Calculating minExistDNN_S1(O1)
  • Step 2: Choose up to 8 locations on the border
    of O1 and compute their minMaxDists to S1.
  • minExistDNN is the largest of these.
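
Both steps rely on minMaxDist from a single location to S1. The sketch below computes it in 2D in two ways, using the usual per-dimension "nearer face in one dimension, farther face in the other" formula and using the second-closest corner, so the equivalence claimed in Step 1 can be checked. The function names and the `(xmin, ymin, xmax, ymax)` rectangle layout are assumptions; the choice of the up-to-8 border locations is not reproduced here because the transcript does not spell it out.

```python
from itertools import product
from math import dist


def min_max_dist(p, rect):
    """minMaxDist via the per-dimension formula (2D case)."""
    xmin, ymin, xmax, ymax = rect
    lo, hi = (xmin, ymin), (xmax, ymax)
    # near[k]: the face of rect closer to p in dimension k;
    # far[k]:  the face of rect farther from p in dimension k.
    near = [lo[k] if p[k] <= (lo[k] + hi[k]) / 2 else hi[k] for k in range(2)]
    far = [lo[k] if p[k] >= (lo[k] + hi[k]) / 2 else hi[k] for k in range(2)]
    return min(dist(p, (near[0], far[1])), dist(p, (far[0], near[1])))


def second_closest_corner_dist(p, rect):
    """Distance from p to the second-closest corner of rect."""
    xmin, ymin, xmax, ymax = rect
    corners = product((xmin, xmax), (ymin, ymax))
    return sorted(dist(p, c) for c in corners)[1]
```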

28
Outline
  • Problem Definition
  • Related Work
  • The New Metric minExistDNN
  • Data Structures and Algorithm
  • Experimental Results
  • Conclusions

29
Data Structure
  • Two R-trees: S of sites, O of objects.
  • Three queues:
  • queueSIN: entries of S inside Q.
  • queueSOUT: entries of S outside Q.
  • queueO: entries of O.

30
Data Structure
  • queueSIN
  • queueO
  • queueSOUT

[Figure: example entries S1, S2, S3 and O1]

31
maxInfluence and minInfluence
  • For each entry Sj in queueSIN:
  • maxInfluence: total weight of entries in queueO
    that affect Sj.
  • minInfluence: total weight of entries in queueO
    that ONLY affect Sj, divided by the number of
    objects in Sj.
  • queueSIN is sorted in decreasing order of
    maxInfluence.
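
A minimal sketch of the two bounds, assuming the "affects" relation has already been decided elsewhere and is passed in as a set of `(object_entry, site_entry)` pairs. The function name, the dictionary layouts, and the reading of "the number of objects in Sj" as the count of data entries stored under Sj are all assumptions made for illustration.

```python
def influence_bounds(site_entries, object_entries, affects):
    """Return {site_entry: (maxInfluence, minInfluence)}.

    site_entries:   {site_entry_id: count of data entries under it}
    object_entries: {object_entry_id: total weight}
    affects:        set of (object_entry_id, site_entry_id) pairs,
                    covering entries both inside and outside Q
    """
    bounds = {}
    for s, count in site_entries.items():
        max_inf = 0.0
        only_s = 0.0
        for o, weight in object_entries.items():
            if (o, s) not in affects:
                continue
            max_inf += weight
            # Which site entries does this object entry affect at all?
            affected = {sj for (oi, sj) in affects if oi == o}
            if affected == {s}:
                only_s += weight
        bounds[s] = (max_inf, only_s / count)
    return bounds


def sort_queue(bounds):
    """Order site entries by decreasing maxInfluence."""
    return sorted(bounds, key=lambda s: bounds[s][0], reverse=True)
```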

32
Algorithm Overview
  • Expand an entry from one of the three queues.
  • Remove the entry from the queue.
  • Retrieve the referenced node, and insert the
    (unpruned) entries into the same queue.
  • Update maxInfluence and minInfluence if
    necessary.
  • If the top-t entries in queueSIN are sites, with
    minInfluences ≥ maxInfluences of all remaining
    entries, return.
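
The stopping condition in the last bullet can be sketched as follows, assuming it reads "minInfluence of each top-t site ≥ maxInfluence of every remaining entry" and that the queue is already sorted by decreasing maxInfluence. The `Entry` class and the `can_stop` name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    is_site: bool          # True for an actual site, False for an R-tree node
    min_influence: float
    max_influence: float


def can_stop(queue_s_in, t):
    """queue_s_in must be sorted by decreasing max_influence."""
    top, rest = queue_s_in[:t], queue_s_in[t:]
    if not all(e.is_site for e in top):
        return False
    # No remaining entry may be able to overtake any of the top-t sites.
    threshold = max((e.max_influence for e in rest), default=float("-inf"))
    return all(e.min_influence >= threshold for e in top)
```

For an actual site the two bounds coincide once its influence is known exactly, so the check then compares exact influences against upper bounds of everything left in the queue.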

33
Example
  • queueSIN: S1
  • queueO: O1
  • queueSOUT: S3
  • queueSIN: S5, S7
  • queueO: O6
  • queueSOUT: S9

[Figure: query region Q with site and object entries]
  • S6 is not affected by O1, prune S6.
  • O5 does not affect S5 and S7, prune O5.

34
A Pruning Case
  • Expand S1.
  • S2 is pruned because minExistDNN_S3(O1) < minDist(S2, O1).

35
Choosing an Entry to Expand
  • Expand top entries in queueSIN.
  • Expand the most important Oi.
  • Importance of Oi: determined by the number of
    entries it affects and by area(Oi).
  • Expand Sj that contains the most important Oi.

36
Choosing an Entry to Expand
  • Estimate the probability of pruning Oi using some
    Sj in queueSOUT.
  • After expanding S2, O1 is likely not to affect S1.

37
Outline
  • Problem Definition
  • Related Work
  • The New Metric minExistDNN
  • Data Structures and Algorithm
  • Experimental Results
  • Conclusions

38
Experimental Setup
  • Data sets:
  • 24,493 populated places in North America
  • 9,203 cultural landmarks in North America
  • R-tree page size: 1 KB.
  • LRU buffer: 128 disk pages.
  • t = 4.
  • Compared with the solution using the Voronoi diagram.

39
Selected Experimental Results
  • sites : objects = 1 : 2.5

40
Selected Experimental Results
  • sites : objects = 2.5 : 1

41
Outline
  • Problem Definition
  • Related Work
  • The New Metric minExistDNN
  • Data Structures and Algorithm
  • Experimental Results
  • Conclusions

42
Conclusions
  • We addressed a new problem: the top-t most
    influential sites query.
  • We proposed a new metric, minExistDNN. It can be
    used to prune the search space in NN/RNN-related
    problems.
  • We designed an algorithm that systematically
    browses both R-trees once.
  • Experiments showed more than an order of
    magnitude improvement over the Voronoi-diagram-based
    solution.

43
Thank you!
  • Q & A