Title: Continuous Reverse Nearest Neighbor Monitoring
1Continuous Reverse Nearest Neighbor Monitoring
- Authors info
- Tian Xia and Donghui Zhang
- College of Computer and Information Science
- Northeastern University
- Presenter
- Kamiru, U
2Outline
- Background
- Problem Definition
- Related Work
- Solution
- Straightforward solution
- Incremental solution
- Experiments
3Background
- Advances in hardware enable a new kind of data management application that monitors continuous processes
- Large amounts of state samples are obtained via sensors (data streams) and stored in a database
- So updates are very frequent in these kinds of applications
- Frequent updates are also a problem when monitoring continuous spatial data and queries
4Traditional Spatial Queries
- Many existing algorithms answer different kinds of spatial queries based on the R-tree, such as
- range queries
- k nearest neighbors (kNN)
- k closest pairs (kCP)
- reverse nearest neighbors (RNN)
5Traditional Spatial Queries (Cont)
- Most of them are efficient only for static objects and queries
- Monitoring moving objects or queries requires high-cost operations, such as deletion, insertion, and update, to maintain the data in the R-tree
- Many variations have been derived from the R-tree to monitor moving objects
- TPR-tree [Saltenis 00]
- STAR-tree [Procopiuc 02]
- REXP-tree [Saltenis 02]
- FUR-tree [Lee 03]
6Reverse Nearest Neighbors (RNN)
- RNN definition
- An object o is a reverse nearest neighbor of a query point q if there does not exist another object o' such that dist(o, o') < dist(o, q)
- Example
- o1 is q's RNN
- (Figure: query point q with objects o0, o1, o2 and distances d0, d1, d2)
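The definition above can be checked directly with a brute-force scan. The following is a minimal sketch, assuming 2-dimensional points stored in a dictionary; the function names and the sample coordinates are illustrative and do not come from the slides.

```python
import math

def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def rnn_brute_force(q, objects):
    """Return the ids of all objects that have q as their nearest neighbor.

    objects maps an id to an (x, y) position. Quadratic cost: every object
    is compared with every other object.
    """
    result = []
    for oid, o in objects.items():
        d_q = dist(o, q)
        # o is an RNN of q unless another object is strictly closer to o than q
        closer_exists = any(
            dist(o, p) < d_q for pid, p in objects.items() if pid != oid
        )
        if not closer_exists:
            result.append(oid)
    return sorted(result)

# Hypothetical positions chosen so that only o1 has q as its nearest neighbor
objects = {"o0": (4.0, 0.0), "o1": (1.0, 0.0), "o2": (5.0, 0.5)}
print(rnn_brute_force((0.0, 0.0), objects))  # prints ['o1']
```

This scan is only a reference point: it recomputes everything from scratch, which is what continuous monitoring tries to avoid.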
7Reverse Nearest Neighbors (RNN) (Cont)
- Previous work on finding RNNs focused either on the static query [Stanoi 00, Tao 04] or on the predictive query [Benetis 02]
- The predictive query assumes that the trajectory information is known
- It uses the trajectory-based TPR-tree to predict the result, but the TPR-tree is too expensive to maintain for the CRNN query
- Unlike the static RNN query, the CRNN query requires updating the result set efficiently to reflect the recent motion of objects and queries
- So these methods are inefficient or inapplicable for the CRNN monitoring problem
8Problem Definition
- Given a set of objects O and a query set Q, all
being static or moving, the CRNN query monitors
the exact reverse nearest neighbors (RNN) of each
query point over time
9Application of CRNN
- One example of an application of CRNN is the battlefield, where a soldier registers a CRNN query to monitor the other soldiers who might need help from him.
- (Figure: Soldiers A, B, and C at Time 0 and Time 1. Speech bubbles: "Help" and "To all, I'm going to help Soldier A, because he is my RNN.")
- Assume that Soldier B and C have registered Soldier A on their CRNN list
10Related Work
- Stanoi et al. [Stanoi 00] proposed a method (SAE) that divides the space centered at the query q into six equal partitions of 60°
- For a given 2-dimensional dataset, RNN(q) returns at most six data points for any query point q [Smid 97, Korn 99]
- In higher dimensions, the number of data points that satisfy RNN(q) is still bounded by a constant
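The six-way partitioning can be sketched as a function that maps a point to the index of the 60° partition it falls in. This is a minimal sketch: the numbering (S0 starting at the positive x-axis and increasing counter-clockwise) and the name `sector_index` are assumptions, not taken from the slides.

```python
import math

def sector_index(q, p, num_sectors=6):
    """Index of the partition around q that contains p.

    Assumed numbering: partition 0 starts at the positive x-axis and
    indices grow counter-clockwise.
    """
    # atan2 returns an angle in (-pi, pi]; the modulo maps it to [0, 2*pi)
    angle = math.atan2(p[1] - q[1], p[0] - q[0]) % (2 * math.pi)
    width = 2 * math.pi / num_sectors
    # min() guards against a rounding result that equals exactly 2*pi
    return min(int(angle // width), num_sectors - 1)

q = (0.0, 0.0)
for p in [(1.0, 0.1), (0.0, 1.0), (-1.0, 0.1), (-1.0, -0.1), (0.0, -1.0), (1.0, -0.1)]:
    print(p, "-> S%d" % sector_index(q, p))
```

Points lying exactly on a partition boundary are assigned to one of the two neighboring partitions; the sketch does not treat them specially.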
11SAE
- SAE's filter-refinement framework
- Finds the constrained NN of q in each of the six regions as the candidates
- For each candidate, it performs an NN search to see whether the candidate really considers q as its NN (filtering out the false positives)
- (Figure: the space around q divided into six 60° regions S0 to S5, with candidates cand1 and cand5 and their nearest neighbors nn_cand1 and nn_cand5)
- The candidates for q's RNNs are o1, o2, o4, o5, o6, o7
- The RNN of q is o7
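The filter-refinement idea can be illustrated with a small sketch. It assumes 2-dimensional points in a dictionary and replaces the index-based constrained NN and NN searches of SAE with plain linear scans, so it shows the logic rather than the performance. All function names are illustrative, and ties on partition boundaries are ignored.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def sector_index(q, p):
    """Index (0..5) of the 60-degree partition around q containing p (assumed numbering)."""
    angle = math.atan2(p[1] - q[1], p[0] - q[0]) % (2 * math.pi)
    return min(int(angle // (math.pi / 3)), 5)

def sae_rnn(q, objects):
    """Filter-refinement RNN computation using linear scans."""
    # Filter step: keep the object nearest to q in each partition
    cands = {}
    for oid, o in objects.items():
        s = sector_index(q, o)
        if s not in cands or dist(q, o) < dist(q, objects[cands[s]]):
            cands[s] = oid
    # Refinement step: a candidate is an RNN only if no other object
    # is strictly closer to it than q
    result = []
    for oid in cands.values():
        o = objects[oid]
        d_q = dist(o, q)
        if all(dist(o, p) >= d_q for pid, p in objects.items() if pid != oid):
            result.append(oid)
    return sorted(result)
```

At most six candidates survive the filter step, so the refinement step runs at most six NN searches regardless of the data size.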
12Continuous Spatial Queries Monitoring
- The Continuous Nearest Neighbor (CNN) query was recently studied in [Xiong 05], [Yu 05], and [Mouratidis 05], which proposed three methods (denoted SEA-CNN, YPK-CNN, and CPM-CNN, respectively)
- All of them use a monitoring region (on a grid) for each query point to handle the updates
- The paper uses the conceptual space partitioning of CPM-CNN to monitor the update region
13CPM-CNN
- CPM-CNN partitions the space into a grid and organizes the cells into conceptual rectangles
- Each rectangle is identified by its
- Direction (Up, Down, Left or Right)
- Level (i.e. the number of rectangles between q and itself)
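One way to picture the conceptual rectangles is as four strips that together form the ring of cells at a given distance from the query cell. The sketch below assumes a pinwheel layout in which each strip owns exactly one corner of the ring; the precise orientation, the function name, and the (column, row) cell addressing are assumptions made for illustration, and grid boundaries are ignored.

```python
def conceptual_rectangle(cq, direction, level):
    """Cells (col, row) of the rectangle with the given direction and level.

    cq is the cell containing the query. Level 0 rectangles are adjacent
    to cq. Assumed pinwheel layout: each rectangle covers one side of the
    ring plus one of its corners.
    """
    i, j = cq
    r = level + 1  # distance (in cells) of the ring from the query cell
    if direction == "U":
        return [(c, j + r) for c in range(i - r, i + r)]
    if direction == "R":
        return [(i + r, row) for row in range(j - r + 1, j + r + 1)]
    if direction == "D":
        return [(c, j - r) for c in range(i - r + 1, i + r + 1)]
    if direction == "L":
        return [(i - r, row) for row in range(j - r, j + r)]
    raise ValueError("unknown direction: %r" % direction)

# The four level-1 rectangles around cell (5, 5)
for d in "URDL":
    print(d, conceptual_rectangle((5, 5), d, 1))
```

Because the four rectangles of a level tile the ring without overlap, visiting rectangles level by level reaches every cell exactly once.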
14Frequent updates R-tree (FUR-tree)
- Most R-tree variants (TPR-tree, STAR-tree, REXP-tree) process updates as combinations of separate top-down deletion and insertion operations
- Top-down update is inherently inefficient
- In an R-tree, objects are stored in the leaves of the tree
- The root is the starting point of updates
- So the FUR-tree proposes a new concept for updating the R-tree: a bottom-up approach
15FUR-tree (Cont)
- The bottom-up approach accesses the leaf that holds an object's entry directly
- It requires a secondary index on object IDs (a hash table)
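The role of the secondary index can be sketched with a toy structure. This is not the FUR-tree itself: leaves have fixed bounding rectangles, the tree above them is replaced by a flat list, and the class and method names are hypothetical. It only illustrates how a hash table from object ID to leaf lets an update skip the descent from the root when the object stays inside its leaf's rectangle.

```python
class Leaf:
    def __init__(self, mbr):
        self.mbr = mbr        # (xmin, ymin, xmax, ymax), fixed in this sketch
        self.entries = {}     # object id -> (x, y)

def contains(mbr, p):
    return mbr[0] <= p[0] <= mbr[2] and mbr[1] <= p[1] <= mbr[3]

class BottomUpIndex:
    def __init__(self, leaves):
        self.leaves = leaves
        self.leaf_of = {}     # secondary index: object id -> leaf

    def _find_leaf_top_down(self, p):
        # Stand-in for a root-to-leaf descent in a real R-tree
        for leaf in self.leaves:
            if contains(leaf.mbr, p):
                return leaf
        raise ValueError("point outside the indexed space")

    def insert(self, oid, p):
        leaf = self._find_leaf_top_down(p)
        leaf.entries[oid] = p
        self.leaf_of[oid] = leaf

    def update(self, oid, p):
        leaf = self.leaf_of[oid]  # direct access through the hash table
        if contains(leaf.mbr, p):
            leaf.entries[oid] = p  # cheap case: update in place
            return "bottom-up"
        # Fallback: delete from the old leaf and re-insert from the top
        del leaf.entries[oid]
        del self.leaf_of[oid]
        self.insert(oid, p)
        return "top-down"
```

When objects move by small steps, most updates fall into the in-place case, which is why the approach suits frequent updates.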
16Straightforward solution
- The straightforward solution indexes the objects using an FUR-tree and computes the RNNs of every query point at each time stamp using TPL
- The TPL method [Tao 04] is currently the best approach for computing RNNs in the static case
- The FUR-tree is optimized for frequent updates of objects
17CRNN Framework
- CRNN considers two situations when monitoring the RNNs
- Query updates
- When an existing query q moves to a new location, CRNN treats the update as
- deleting q at the old location
- re-computing q at the new location
- Although it is not the most elegant way to handle a moving query, re-computing it is more efficient than updating the old query result [Mouratidis 05]
- Object updates
- Uses pie-regions and circ-regions to monitor object updates
- Proposes two optimizations
- Lazy-update
- Partial-insert
18CRNN Query Initialization
- If the popped element e is a rectangle
- Push the next-level rectangle of the same direction into H (heap), and insert each cell c in e into H
- If e is a cell, for each object o in e
- For all cand_i ≠ null
- update nn_cand_i and d(nn_cand_i, cand_i) if necessary
- cand_j is the nearest candidate to o
- If o is in S_k
- update cand_k and dnn_k = d(q, cand_k) if necessary
- update nn_cand_k and d(nn_cand_k, cand_k), where nn_cand_k is either q or cand_j, whichever is closer
- (Figure: the partitions S0 to S5 and the processing steps (1), (2), ..., (n))
19CRNN Query Initialization (Cont)
- When we pop cell C2,5
- cand_j = null because all cand_i = null
- Set
- cand_1 = o7
- nn_cand_1 = q
- S_1 is marked as CHECKED
- After we find the candidate cand_i in each S_i
- For all nn_cand_i = q,
- perform an NN search on cand_i
- update nn_cand_i and d(nn_cand_i, cand_i)
- Output cand_i if nn_cand_i = q
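The per-object bookkeeping of the initialization can be sketched as follows. This is a simplified illustration under stated assumptions: the grid, the heap H, and the early termination once all partitions are checked are replaced by a plain loop over all objects, and the final NN search is a linear scan. The names `init_crnn`, `sector_index`, and the sentinel `Q` are hypothetical.

```python
import math

Q = object()  # sentinel meaning "nn_cand is the query point itself"

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def sector_index(q, p):
    """Index (0..5) of the 60-degree partition around q containing p (assumed numbering)."""
    angle = math.atan2(p[1] - q[1], p[0] - q[0]) % (2 * math.pi)
    return min(int(angle // (math.pi / 3)), 5)

def init_crnn(q, objects):
    cand = [None] * 6     # id of the constrained NN of q in each partition
    nn_cand = [None] * 6  # Q, or id of an object closer to the candidate than q
    for oid, o in objects.items():
        # o may be closer to an existing candidate than that candidate's nn_cand
        for i in range(6):
            if cand[i] is None:
                continue
            c = objects[cand[i]]
            current = q if nn_cand[i] is Q else objects[nn_cand[i]]
            if dist(o, c) < dist(current, c):
                nn_cand[i] = oid
        # o may become the candidate of its own partition
        k = sector_index(q, o)
        if cand[k] is None or dist(q, o) < dist(q, objects[cand[k]]):
            known = [c for c in cand if c is not None]
            j = min(known, key=lambda c: dist(objects[c], o), default=None)
            cand[k] = oid
            # nn_cand is q or the nearest known candidate, whichever is closer
            if j is not None and dist(objects[j], o) < dist(q, o):
                nn_cand[k] = j
            else:
                nn_cand[k] = Q
    # Verify every candidate that still believes q is its NN
    result = []
    for i in range(6):
        if cand[i] is None or nn_cand[i] is not Q:
            continue
        c = objects[cand[i]]
        d_q = dist(c, q)
        if all(dist(c, p) >= d_q for pid, p in objects.items() if pid != cand[i]):
            result.append(cand[i])
    return sorted(result)
```

A candidate whose nn_cand is some object rather than q already has a witness that is closer than q, so it can be rejected without an NN search; only the remaining candidates need verification.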
20Monitoring region of CRNN query
- To enable incremental processing, it is necessary to maintain a monitoring region for the continuous query that
- guarantees the query results are unaffected as long as no update happens inside the region
- A straightforward proposal might consider the union of circles, one for each RNN object
- The center is the RNN object
- The radius is its distance to the query point q
- But it does not work: an object that is not an RNN can become one (for example, when its own nearest neighbor moves away) without any update inside these circles
- (Figure: query point q with objects o1 to o5)
21Monitoring region of CRNN query (Cont)
- Pie-region
- Given a query point q, the space is divided into 6 partitions (S_i)
- The pie-region in S_i is a pie (circular sector) centered at q with the constrained NN in S_i on its perimeter
- Circ-region
- The circ-region in S_i is a circle centered at the candidate in S_i with either q or an object closer than q on its perimeter
- (Figure: partitions S0 to S5 around q, with candidates cand0 and cand1 and their nearest neighbors nn_cand0 and nn_cand1)
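The two region types reduce to simple geometric membership tests, sketched below. The function names and the sample coordinates are illustrative assumptions; the partition numbering (S0 starting at the positive x-axis, counter-clockwise) is also assumed.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def sector_index(q, p):
    """Index (0..5) of the 60-degree partition around q containing p (assumed numbering)."""
    angle = math.atan2(p[1] - q[1], p[0] - q[0]) % (2 * math.pi)
    return min(int(angle // (math.pi / 3)), 5)

def in_pie_region(q, i, dnn_i, p):
    """True if p lies in the pie-region of partition i.

    dnn_i is the distance from q to the candidate of partition i.
    """
    return sector_index(q, p) == i and dist(q, p) <= dnn_i

def in_circ_region(cand_pos, radius, p):
    """True if p lies in the circle centered at the candidate.

    radius is the distance from the candidate to its nn_cand
    (q, or an object closer to the candidate than q).
    """
    return dist(cand_pos, p) <= radius

q = (0.0, 0.0)
cand0 = (2.0, 0.5)         # hypothetical candidate of partition 0
dnn0 = dist(q, cand0)
radius0 = dist(cand0, q)   # here nn_cand is q itself
print(in_pie_region(q, 0, dnn0, (1.0, 0.2)))       # inside the pie
print(in_circ_region(cand0, radius0, (3.0, 0.5)))  # inside the circle
```

An update needs attention only if its old or new position passes one of these tests for some query; all other updates leave the result unchanged.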
22Handling Updates in Pie-regions
- The pie-region information is stored in each cell
- Updating the pie-region
- Some object(s) (o4, o8) move into a pie-region (S_i)
- Set cand_i = o and dnn_i = dist(o, q)
- Candidate(s) (o4) leave a pie-region (S_i)
- Perform a constrained NN search in S_i to determine the new pie-region
- Candidate(s) (o6, o7) move within the same pie-region (S_i)
- Update dnn_i
- Finally, use updateCand (which performs the NN search) to update the circ-region
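The three pie-region cases can be sketched for a single partition as follows. This is a simplified illustration: the constrained NN search is a linear scan rather than a grid search, the follow-up circ-region update (updateCand) is not modeled, and the function name and signature are assumptions.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def sector_index(q, p):
    """Index (0..5) of the 60-degree partition around q containing p (assumed numbering)."""
    angle = math.atan2(p[1] - q[1], p[0] - q[0]) % (2 * math.pi)
    return min(int(angle // (math.pi / 3)), 5)

def update_pie_region(q, objects, i, cand, dnn, oid, new_pos):
    """Apply the move of object oid and return the new (cand, dnn) of partition i.

    cand is None and dnn is math.inf when the partition is empty.
    The objects dictionary is updated in place.
    """
    objects[oid] = new_pos
    inside = sector_index(q, new_pos) == i and dist(q, new_pos) <= dnn
    if oid != cand:
        if inside:
            # Case 1: an object moves into the pie-region and becomes the candidate
            return oid, dist(q, new_pos)
        return cand, dnn  # the update does not touch this pie-region
    if inside:
        # Case 3: the candidate moves within its pie-region
        return cand, dist(q, new_pos)
    # Case 2: the candidate leaves the pie-region: constrained NN search
    best = min(
        (o for o in objects if sector_index(q, objects[o]) == i),
        key=lambda o: dist(q, objects[o]),
        default=None,
    )
    return best, (dist(q, objects[best]) if best is not None else math.inf)
```

Only the second case requires a search; the other two are constant-time adjustments, which is the benefit of keeping the pie-region as the monitoring region.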
23Handling Updates in Circ-regions
- We cannot store circ-regions by associating them with every cell they intersect, because that is expensive for the following reasons
- A circ-region does not always change incrementally
- A circ-region may change frequently
- The paper uses an FUR-tree to store the circ-region that corresponds to each candidate
24Handling Updates in Circ-regions (Cont)
- The FUR-tree maintains the following
- each candidate and the radius of its circ-region are stored in a leaf
- it also stores the max radius over all candidates in each sub-tree
- each candidate also stores the queries it belongs to
- The hash table stores
- the set of nn_cand_i
- the pointers to their corresponding candidates in the leaves
25Optimization
- Lazy-update
- The NN search is performed only when the enlarged circ-region covers q
- Partial-insert
- The FUR-tree stores only the candidates whose circ-region radii are larger than a threshold
- Every other candidate cand and its corresponding nn_cand are stored in a hash table
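The lazy-update rule can be sketched for the case where a candidate's nn_cand moves away from it. This is one reading of the rule under stated assumptions: the circ-region is simply enlarged to the new distance, and an NN search (a linear scan here) runs only if the enlarged circle would reach q. The function name and return convention are hypothetical, and partial-insert is not modeled.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def on_nn_cand_moved(q, objects, cand_id, nn_id, nn_new_pos):
    """Handle a move of the object that defines cand's circ-region.

    Returns (radius, searched): the new circ-region radius and whether
    an NN search was needed. The objects dictionary is updated in place.
    """
    objects[nn_id] = nn_new_pos
    c = objects[cand_id]
    d_q = dist(c, q)
    enlarged = dist(c, nn_new_pos)
    if enlarged < d_q:
        # The enlarged circle does not cover q: nn_cand is still closer
        # to the candidate than q, so the candidate is still not an RNN
        return enlarged, False
    # The enlarged circle covers q: only now run the NN search
    nearest = min(
        (dist(c, p) for pid, p in objects.items() if pid != cand_id),
        default=math.inf,
    )
    # The radius never exceeds the distance to q; if it equals it, q is the NN
    return min(nearest, d_q), True

q = (0.0, 0.0)
objects = {"cand": (4.0, 0.0), "nn": (4.0, 1.0), "other": (6.0, 0.0)}
print(on_nn_cand_moved(q, objects, "cand", "nn", (4.0, 2.0)))   # no search needed
print(on_nn_cand_moved(q, objects, "cand", "nn", (4.0, 10.0)))  # search needed
```

The enlarged circle may be larger than the true circ-region, which makes the monitoring region conservative but avoids a search on every small move.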
26Comparison with the straightforward solution
27Varying the data size
28Varying the percentage of moving data per time stamp
29References
- [Saltenis 00] S. Saltenis, C.S. Jensen, S.T. Leutenegger, and M.A. Lopez. Indexing the Positions of Continuously Moving Objects. In Proc. of ACM SIGMOD, 2000.
- [Procopiuc 02] C. Procopiuc, P. Agarwal, and S. Har-Peled. STAR-Tree: An Efficient Self-Adjusting Index for Moving Objects. In Proc. of ICDE (poster), 2002.
- [Saltenis 02] S. Saltenis and C.S. Jensen. Indexing of Moving Objects for Location-Based Services. In Proc. of ICDE, 2002.
- [Lee 03] Mong-Li Lee, Wynne Hsu, Christian S. Jensen, Bin Cui, and Keng Lik Teo. Supporting frequent updates in R-trees: A bottom-up approach. In VLDB, pages 608-619, 2003.
- [Stanoi 00] Ioana Stanoi, Divyakant Agrawal, and Amr El Abbadi. Reverse nearest neighbor queries for dynamic databases. In ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pages 44-53, 2000.
- [Tao 04] Yufei Tao, Dimitris Papadias, and Xiang Lian. Reverse kNN search in arbitrary dimensionality. In VLDB, pages 744-755, 2004.
- [Benetis 02] Rimantas Benetis, Christian S. Jensen, Gytis Karciauskas, and Simonas Saltenis. Nearest neighbor and reverse nearest neighbor queries for moving objects. In IDEAS, pages 44-53, 2002.
- [Smid 97] M. Smid. Closest point problems in computational geometry. In Handbook on Computational Geometry, Elsevier Science Publishing, 1997.
- [Korn 99] F. Korn and S. Muthukrishnan. Influence sets based on reverse nearest neighbor queries. Technical report, AT&T Labs Research, http://www.research.att.com/resources/trs/, 1999.
- [Mouratidis 05] Kyriakos Mouratidis, Dimitris Papadias, and Marios Hadjieleftheriou. Conceptual partitioning: An efficient method for continuous nearest neighbor monitoring. In SIGMOD Conference, pages 634-645, 2005.
30References (Cont)
- [Xiong 05] Xiaopeng Xiong, Mohamed F. Mokbel, and Walid G. Aref. SEA-CNN: Scalable processing of continuous k-nearest neighbor queries in spatio-temporal databases. In ICDE, pages 643-654, 2005.
- [Yu 05] Xiaohui Yu, Ken Q. Pu, and Nick Koudas. Monitoring k-nearest neighbor queries over moving objects. In ICDE, pages 631-642, 2005.
31The END
- Thank you for your attendance
32Appendix A
- Properties of monitoring region
- The region usually has a regular shape
- The region only contains the result objects
- The region does not rely on the distances between
objects
33Appendix B
- (Figure: query point q, a 60° partition, objects 1, 2, 3, 4, and k, and distances d, d1, d2, d3)
- d1 = d: o1 and o4 are RNNs of q
- d2 < d: only o2 is an RNN of q
- d3 > dk > d: only o4 is an RNN of q