Location-based Spatial Queries - PowerPoint PPT Presentation

Slides: 59

Provided by: sree71

Learn more at: https://crystal.uta.edu

Transcript and Presenter's Notes

1
Location-based Spatial Queries

ACM SIGMOD 2003
Jun Zhang, Manli Zhu, Dimitris Papadias, Yufei
Tao, Dik Lun Lee
Department of Computer Science
Hong Kong University of Science and Technology
and
Carnegie Mellon University
Presented By
Sreepraveen Veeramachaneni

2
Outline

Introduction
General techniques in spatial query processing
Motivation
Background
Location-based nearest neighbor search
Location-based window queries
Experiments
Summary
References

3
Introduction

The paper proposes an approach that enables
mobile clients to determine the validity of
previous queries based on their current location.
Two types of spatial queries are discussed:
1. Window Queries
2. Nearest Neighbor Queries

4
Techniques In use

Spatial databases have been extensively studied
during the last two decades and several spatial
access methods have been proposed.
The most popular one is the R-tree and its
variations, such as the R*-tree.
R-trees can be viewed as multi-dimensional
extensions of B-trees.

5
R-tree

Assuming a capacity of 3 entries per node:
Points that are close in space are clustered in
the same leaf node, represented as an MBR
(minimum bounding rectangle). Nodes
are then recursively grouped together following
the same principle until the top level, which
consists of a single root.

6
R-tree (Contd.)

This technique is used to answer window queries.
Another important type of spatial information
processing is the nearest neighbor query, which
retrieves the data point that is closest to a
query point.

7
Branch and Bound Algorithm

Roussopoulos et al. proposed a branch and bound
algorithm that searches the R-tree in a
depth-first manner.
Starting from the root, all entries are sorted
according to their minimum distance (mindist)
from the query point, and the entry with the
smallest value is visited first.
The process is repeated recursively until the
leaf level, where the first potential nearest
neighbor is found.
During backtracking to upper levels, the
algorithm only visits entries whose mindist is
smaller than the distance of the nearest neighbor
already found.
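
The search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node layout (a dict with "mbr", "points" and "children") and the helper names are hypothetical, and the tiny tree is built by hand instead of by a real R-tree insertion algorithm.

```python
import math

def mindist(q, mbr):
    """Smallest possible distance from point q to rectangle (xlo, ylo, xhi, yhi)."""
    dx = max(mbr[0] - q[0], 0.0, q[0] - mbr[2])
    dy = max(mbr[1] - q[1], 0.0, q[1] - mbr[3])
    return math.hypot(dx, dy)

def bounding_box(rects):
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))

def leaf(points):
    # Hypothetical toy node layout: an MBR plus either points or children.
    return {"mbr": bounding_box([(x, y, x, y) for x, y in points]),
            "points": points, "children": []}

def inner(children):
    return {"mbr": bounding_box([c["mbr"] for c in children]),
            "points": [], "children": children}

def nearest_neighbor(node, q, best=(math.inf, None)):
    """Depth-first branch and bound; returns (distance, point)."""
    for p in node["points"]:  # leaf level: compare the data points
        d = math.hypot(p[0] - q[0], p[1] - q[1])
        if d < best[0]:
            best = (d, p)
    # Visit entries in ascending mindist order; prune those that cannot win.
    for child in sorted(node["children"], key=lambda c: mindist(q, c["mbr"])):
        if mindist(q, child["mbr"]) >= best[0]:
            break  # every remaining entry is at least this far away
        best = nearest_neighbor(child, q, best)
    return best

# Three leaves with a capacity of 3 entries, grouped under one root.
root = inner([leaf([(1, 1), (2, 3), (2, 1)]),
              leaf([(8, 8), (9, 7)]),
              leaf([(5, 5), (6, 4)])])
distance, point = nearest_neighbor(root, (6, 6))
```

For the query point (6, 6), the leaf holding (5, 5) is visited first and the other two leaves are pruned, because their mindist exceeds the distance already found.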

10
Traditional Scenario

The traditional scenario in spatial databases
assumes that:
Queries are static, and
Each query returns a single output and terminates.

11
Where Is Your Nearest Restaurant?

Traditional nearest neighbor search in spatial
databases considers static query points.

12
What if You Move?

Getting only the nearest neighbor is inadequate.
When will it expire?

The conventional approach to obtaining up-to-date
information is to pose a new query to the server
after each position update, which can lead to high
network overhead and extra processing effort.
Moreover, due to the high mobility of the user, the
result may be invalidated immediately as the user's
position changes.

15
First Spatial Query Processing Technique for
Mobile Computing (Zheng, Lee, SSTD 2001)

The first technique pre-computes the Voronoi
diagram of the dataset and stores it in an R-tree.
Voronoi diagram: The Voronoi diagram of a
collection of geometric objects is a partition of
space into cells, each of which consists of the
points closer to one particular object than to
any other.

When a nearest neighbor query arrives at the
server, the Voronoi diagram is used to compute
the nearest neighbor efficiently.

In addition to the result, the server sends back
to the client the validity time T of the result,
which is a conservative approximation assuming
that the query's speed is below a maximum value.
Problem: It is difficult to estimate the query
speed. A high value results in a very short T,
and a low value results in false misses.
The method only deals with single nearest
neighbor queries; retrieving k neighbors would
require order-k Voronoi diagrams, which are
complicated and incur a large space overhead.

18
k-Nearest Neighbor Query (Song, Roussopoulos,
SSTD 2001)

Song and Roussopoulos proposed a technique that
does not assume Voronoi diagrams and can be used
for any number of neighbors.
When a k nearest neighbor query q arrives, the
server computes and returns to the client a
number m > k of neighbors.

19
Implementation

Let dist(k) and dist(m) be the distances of the
kth and mth nearest neighbors from the query point
q.
If the client re-issues the query at a new
location q', the new k nearest neighbors will be
among the m objects of the first query, provided
that
2 · dist(q, q') ≤ dist(m) − dist(k)
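
As a rough illustration of how a client might apply this condition, the sketch below re-ranks the cached m neighbors when the test passes and signals that a new server query is needed otherwise. The function name `knn_from_cache` and the sample coordinates are made up for this example.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def knn_from_cache(q_old, q_new, cached, k):
    """cached holds the m > k neighbors returned for q_old.

    Returns the k nearest neighbors of q_new if the cache is guaranteed
    to contain them, otherwise None (a new query must be sent).
    """
    by_old = sorted(cached, key=lambda p: dist(q_old, p))
    dist_k = dist(q_old, by_old[k - 1])   # distance of the kth neighbor
    dist_m = dist(q_old, by_old[-1])      # distance of the mth neighbor
    if 2 * dist(q_old, q_new) <= dist_m - dist_k:
        return sorted(cached, key=lambda p: dist(q_new, p))[:k]
    return None

cached = [(1, 0), (0, 2), (3, 0), (0, 6)]              # m = 4 results for q = (0, 0)
near = knn_from_cache((0, 0), (1.5, 0), cached, k=2)   # small move: cache suffices
far = knn_from_cache((0, 0), (3, 0), cached, k=2)      # large move: cache cannot be trusted
```

Here dist(2) = 2 and dist(4) = 6, so the cache remains usable as long as the client has moved at most 2 units away from q.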

20
Example

The figure shows an example of a 2-nearest
neighbor query at location q, where the server
returns four results o, a, b and c (the nearest
neighbors are o and a).
When the client moves to location q', the two
NN are o and b.
If 2 · dist(q, q') ≤ dist(4) − dist(2), the client
can determine this by computing the new distances
(with respect to q') of the four objects, without
having to issue a new query to the server.

21
Problems

An obvious problem of this approach lies in
obtaining a proper value of m.
A high value will increase the network overhead
and the storage requirements at the client, while
a low value may be useless (if it does not reduce
the number of queries).
m depends on factors such as data distribution and
query frequency, which are difficult to estimate.

22
Time Parameterized Nearest Neighbor (TP NN) (Tao
and Papadias, SIGMOD 2002)

Given a query moving with steady velocity, return
all nearest neighbor results (up to a future
timestamp), i.e., the output is a set of tuples
<Ri, Ti>, where Ri is the set of nearest
neighbors during the future interval Ti.
For this situation, the concept of time
parameterized queries can be applied to both
window queries and nearest neighbor queries.
When a server receives a request from a client,
it executes a TP query and returns <R, T, C>, where
R is the set of objects satisfying the
corresponding spatial query (current result), T
is the validity time of R, and C is the result
change at T.
From the set of objects in R and the set of
objects in C that will cause the changes, the
client can incrementally compute the next result.

23
TP window Query

Consider a client that is moving east with speed
one and issues a window query.
The server returns <b, 1, -b>, meaning that
object b currently intersects the query window,
but after 1 time unit it will stop doing so and
therefore b should be removed from the result.

24
Influence of an Object

The result of a spatial query changes in the future
because some objects influence its correctness.
If an object (e.g., b) satisfies the query at the
current time, it may influence the result when it
no longer satisfies it in the future (at time 1).
An object not currently in the result (e.g., d)
may influence the query when it becomes part of
the result (at time 2).
Some objects, such as a and c, may never change
the result, so their influence time is set to ∞.
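
A minimal sketch of how influence times could be computed for a moving rectangular window with point objects is given below. It is a brute-force illustration, not the R-tree based TP algorithm of Tao and Papadias; the function names and the coordinates of a, b, c and d are assumptions chosen so that b leaves at time 1 and d enters at time 2.

```python
import math

def axis_interval(lo, hi, v):
    """Times t with lo <= v * t <= hi; an empty interval is (inf, -inf)."""
    if v == 0:
        return (-math.inf, math.inf) if lo <= 0 <= hi else (math.inf, -math.inf)
    t1, t2 = lo / v, hi / v
    return (min(t1, t2), max(t1, t2))

def influence_time(p, focus, half_w, half_h, velocity):
    """Time at which point p enters or leaves the moving window (inf if never)."""
    # The window covers p at time t iff |p - (focus + velocity * t)| stays
    # within the half extents on both axes.
    ix = axis_interval(p[0] - focus[0] - half_w, p[0] - focus[0] + half_w, velocity[0])
    iy = axis_interval(p[1] - focus[1] - half_h, p[1] - focus[1] + half_h, velocity[1])
    t_in, t_out = max(ix[0], iy[0]), min(ix[1], iy[1])
    if t_in > t_out or t_out < 0:
        return math.inf                    # never covered from now on
    return t_out if t_in <= 0 else t_in    # leaves at t_out, or enters at t_in

def tp_window_query(objects, focus, half_w, half_h, velocity):
    """Returns (R, T, C): current result, validity time, objects changing at T."""
    times = {n: influence_time(p, focus, half_w, half_h, velocity)
             for n, p in objects.items()}
    result = {n for n, p in objects.items()
              if abs(p[0] - focus[0]) <= half_w and abs(p[1] - focus[1]) <= half_h}
    expiry = min(times.values())
    changes = {n for n, t in times.items() if t == expiry} if expiry < math.inf else set()
    return result, expiry, changes

objects = {"a": (0, 5), "b": (-1, 0), "c": (-5, 0), "d": (4, 0.5)}
# A 4 x 2 window centered at the origin, moving east with speed one.
R, T, C = tp_window_query(objects, (0, 0), 2, 1, (1, 0))
```

With these made-up positions, the result is R = {b}, T = 1 and C = {b}, while a and c receive an influence time of infinity.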

25
Time Parameterized Nearest Neighbor (TP NN) (Tao
and Papadias, SIGMOD 2002)

Returns:
The nearest neighbor R of the current query
location
The expiry time T of R (given the query's
movement)
The change C of the result at T

Result Ri, T2, Cj
26
Problem with the techniques discussed so far

All the techniques we discussed for mobile
computing presuppose that the future locations of
clients can be calculated from their current
movements (i.e., the velocity of the client is
known and constant during the lifespan of the
query).
But in many applications query velocities are
continuously updated as the users change their
speed or direction of movement.
Motivated by this, the authors introduce a
technique where, instead of time, the validity of
the result is determined by the user's location in
space.

28
Location-Based Nearest Neighbors

Assumptions
We assume that there exists a spatial index
(e.g., R-trees) for data objects, but no
specialized structures (e.g., Voronoi diagrams)
for nearest neighbor search.

29
Getting You Covered by the Nearest Restaurant

Some users (say, a tourist walking casually)
cannot specify their heading directions clearly.

30
Validity Region of the Result

In addition to the nearest restaurant, we also
return the validity region of this restaurant.
Another query is issued to retrieve the new
nearest restaurant, only if the user moves out of
this region.

31
Influence Points

Points that determine the influence region.

32
Influence Points

Keeping the influence points avoids the
in-polygon check.
The user only needs to check if her/his location
is closer to any yellow point than a.
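
The client-side check can be sketched as follows, assuming the client stores only the current nearest neighbor and its influence points. The function name and the sample coordinates are hypothetical.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def nn_still_valid(location, nearest, influence_points):
    """True while no influence point is closer to the user than the cached NN."""
    d_nn = dist(location, nearest)
    return all(d_nn <= dist(location, p) for p in influence_points)

nearest = (0, 0)
influence_points = [(4, 0), (0, 4), (-4, 0), (0, -4)]
inside = nn_still_valid((1, 1), nearest, influence_points)    # still in the validity region
outside = nn_still_valid((3, 0), nearest, influence_points)   # (4, 0) is now closer
```

Each check costs one distance computation per influence point, so the polygon of the validity region never has to be stored or tested on the client.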

33
Validity Region A Closer Look

The validity region of q is the Voronoi Cell (VC)
of o.

34
Pre-Compute the Voronoi Diagram?

Bad idea!
To answer kNN for a specific value of k, an order-k
Voronoi diagram is necessary.
If we want to answer NN, 2NN, ..., 20NN, then 20
sets of Voronoi diagrams are necessary.
Huge space!
Poor support for data updates.
Our solution: compute the cell on the fly.
Use a single R-tree
Support all values of k

35
Relationship with Time Parameterized NN

If q moves towards l, then its nearest restaurant
will change to point a at position q'.
The corresponding TP query returns (i) o,
(ii) a, and (iii) q'.

36
Algorithm

Step 1: Find the current NN
Step 2: Use TP NN queries to tighten the
validity region progressively

37
Algorithm

Step 2: Use TP NN queries to tighten the
validity region progressively

The algorithm issues a total of 2·Sinf TP NN
queries, where Sinf is the number of influence
points.
This algorithm generalizes to computing order-k
Voronoi cells for arbitrary values of k (see the
paper for details).
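
The two steps can be illustrated with the sketch below. It is a simplified reading of the procedure, not the authors' implementation: `tp_nn` is a brute-force stand-in for an R-tree based TP NN query, the region starts as the whole (bounded) data space, and all function names and sample points are assumptions. A vertex is confirmed when a TP NN query from q towards it reports no change of the nearest neighbor; otherwise the reported object becomes an influence point and the region is clipped by the corresponding perpendicular bisector.

```python
def gap(x, o, p):
    """Squared distance to p minus squared distance to o (>= 0 when o is at least as close)."""
    return ((x[0] - p[0]) ** 2 + (x[1] - p[1]) ** 2
            - (x[0] - o[0]) ** 2 - (x[1] - o[1]) ** 2)

def tp_nn(q, v, o, candidates, eps=1e-9):
    """First candidate that becomes closer than o on the way from q to v, or None."""
    first, first_t = None, 1.0 - eps
    for p in candidates:
        g0, g1 = gap(q, o, p), gap(v, o, p)
        if g1 >= 0:
            continue                      # o is still at least as close at v
        t = g0 / (g0 - g1)                # gap is linear along the segment
        if t < first_t:
            first, first_t = p, t
    return first

def clip(polygon, o, p):
    """Keep the part of a convex polygon that is at least as close to o as to p."""
    out = []
    for i, a in enumerate(polygon):
        b = polygon[(i + 1) % len(polygon)]
        ga, gb = gap(a, o, p), gap(b, o, p)
        if ga >= 0:
            out.append(a)
        if (ga > 0 > gb) or (ga < 0 < gb):
            t = ga / (ga - gb)
            out.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
    return out

def validity_region(q, objects, space):
    # Step 1: find the current NN (brute force here, an R-tree search in practice).
    o = min(objects, key=lambda p: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)
    xlo, ylo, xhi, yhi = space
    region = [(xlo, ylo), (xhi, ylo), (xhi, yhi), (xlo, yhi)]
    influence, confirmed, queries = [], set(), 0
    # Step 2: tighten the region until every vertex is confirmed.
    while True:
        pending = [v for v in region if v not in confirmed]
        if not pending:
            return o, region, influence, queries
        v = pending[0]
        rest = [p for p in objects if p != o and p not in influence]
        queries += 1
        a = tp_nn(q, v, o, rest)
        if a is None:
            confirmed.add(v)              # o is the NN all the way to v
        else:
            influence.append(a)           # new influence point: clip by its bisector
            region = clip(region, o, a)

objects = [(0, 0), (4, 0), (0, 4), (-4, 0), (0, -4), (10, 10)]
o, region, influence, queries = validity_region((1, 0.5), objects, (-10, -10, 10, 10))
```

In this toy run the region shrinks to the square with corners (±2, ±2), the four points at distance 4 become influence points, and 8 TP NN queries are issued: one per influence point plus one per confirmed vertex.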

38
Extensions to k NN queries

The above method can be easily applied to k
nearest neighbor queries, where the validity
region is the maximal area around the query
in which every point has the same set of k
nearest neighbors.

40
Location-based Window Queries: Find All Close
Restaurants

Some users would consider more restaurants in
their vicinity.
The validity region here is such that, as long as
the user stays in this region, the query result
does not change.

41
Location-based Window Queries

The focus f of the window query q is the centroid
of the query window.
The validity region V(q) of a query q is the
maximal area around the query focus (i.e., f ∈
V(q)) where the query result R(q) does not change.
The points that satisfy q are called inner
objects, and those outside the query window outer
objects.

42
Location-based Window Queries

The Minkowski region of each point (e.g., a) is a
rectangle (ra) identical to the query window
whose centroid lies on the corresponding point
(a).
As long as the query focus stays inside ra, the
query result contains object a.
The intersection of the inner Minkowski regions
corresponds to the inner validity region.
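
As a small illustration, the inner validity region can be obtained by intersecting axis-aligned rectangles. The sketch assumes point objects and a window given by its width and height; the function names and coordinates are made up, and the effect of outer objects is deliberately left out.

```python
def minkowski_region(p, width, height):
    """Rectangle of the query window's size centered on point p: (xlo, ylo, xhi, yhi)."""
    return (p[0] - width / 2, p[1] - height / 2, p[0] + width / 2, p[1] + height / 2)

def inner_validity_region(inner_objects, width, height):
    """Intersection of the Minkowski regions of all inner objects, or None if empty."""
    rects = [minkowski_region(p, width, height) for p in inner_objects]
    xlo = max(r[0] for r in rects)
    ylo = max(r[1] for r in rects)
    xhi = min(r[2] for r in rects)
    yhi = min(r[3] for r in rects)
    if xlo > xhi or ylo > yhi:
        return None
    return (xlo, ylo, xhi, yhi)

# A 4 x 2 query window whose current result contains two points.
region = inner_validity_region([(1, 0), (-1, 0.5)], width=4, height=2)
```

While the query focus stays inside the returned rectangle, both points remain covered by the window.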

43
The Validity Region of Window Queries

If the user location is at the boundary of the
validity region, the corresponding query window's
boundary will cross some data point.

47
The Influence Points

In addition to the query result {a, b, c}, the
user is also aware of 2 inner influence points
(a, b) and 2 outer influence points (d, e).
The original result is invalidated if the query
window
no longer covers some inner influence point, or
covers any outer influence point.
The user does not need to store the actual
boundary of the validity region.
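
A client-side validity test along these lines might look as follows. The function names and the coordinates of the influence points are hypothetical.

```python
def covers(focus, width, height, p):
    """True if the window centered at focus contains point p."""
    return abs(p[0] - focus[0]) <= width / 2 and abs(p[1] - focus[1]) <= height / 2

def result_still_valid(focus, width, height, inner_influence, outer_influence):
    """The cached result holds while all inner and no outer influence points are covered."""
    keeps_inner = all(covers(focus, width, height, p) for p in inner_influence)
    avoids_outer = not any(covers(focus, width, height, p) for p in outer_influence)
    return keeps_inner and avoids_outer

inner_influence = [(1, 0), (-1, 0.5)]    # e.g. a and b
outer_influence = [(3, 0), (0, -2)]      # e.g. d and e
ok = result_still_valid((0.5, 0), 4, 2, inner_influence, outer_influence)
lost_inner = result_still_valid((1.5, 0), 4, 2, inner_influence, outer_influence)
gained_outer = result_still_valid((1.0, 0), 4, 2, inner_influence, outer_influence)
```

Only the influence points are consulted, which is why the boundary of the validity region itself does not need to be transmitted.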

48
Retrieving the Influence Points

First get the query result {a, b, c} (a traditional
window query).
Then retrieve the influence points using Time
Parameterized Window Queries (see paper).

50
Experiments

Datasets:
GR (23K, data space 800 km × 800 km)
NA (569K, data space 7000 km × 7000 km)
Disk page size: 4K bytes
Index: R-tree
Queries:
LB kNN: parameter k
LB WQ: parameter query length
Each workload consists of 200 queries with the
same parameters, distributed uniformly in the data
space.

51
Experiments

The area of the validity region drops linearly
with cardinality, since the number of Voronoi cells
increases (while the area of the data space remains
constant).
Under all settings, the average number of edges
in a Voronoi cell is 6 for uniform datasets, which
is equal to the number of influence objects.

52
Experiment 1: Number of Influence Points for LB
kNN

The number of influence objects decreases to 4
for k > 10. This is because, for k > 1, an influence
object may contribute more than one edge (since
it can form a perpendicular bisector with any of
the k nearest neighbors of the query), while the
total number of edges remains around 6.

53
Experiment 2: Query Cost for LB kNN

The figure shows the number of node
accesses as a function of cardinality for k = 1.
The number of node accesses for TP NN queries is
about 12 times that of the regular nearest
neighbor query because, on average, we need 6 TP NN
queries to retrieve the influence objects and
another 6 queries to confirm the vertices of the
validity region.

54
Experiment 2: Query Cost for LB kNN with a Buffer

As we can see, with an LRU buffer equal to 10%
of the R-tree size, the actual cost of TP NN
queries is reduced significantly, since all the
queries access similar parts of the data space.
Thus, given a relatively small buffer, the
overhead imposed by location-based NN queries is
not significant.

55
Experiment 3: Number of Influence Points for LB WQ

56
Experiment 4: Query Cost for LB WQ

57
Conclusion

Location-based queries retrieve the validity
regions for the query results.
We considered kNN and window queries.
Future work
Apply the concept of validity region to other
types of queries (e.g., range queries).
Study the incremental computation of the query
result (i.e., what happens after the user exits
the validity region?)

58
References

Song, Z., Roussopoulos, N. K-Nearest Neighbor
Search for Moving Query Point. SSTD 2001.
Tao, Y., Papadias, D. Time Parameterized Queries
in Spatio-Temporal Databases. SIGMOD 2002.
Zheng, B., Lee, D. Semantic Caching in
Location-Dependent Query Processing. SSTD 2001.
Roussopoulos, N., Kelley, S., Vincent, F. Nearest
Neighbor Queries. SIGMOD 1995.
