1
Uncertain Spatiotemporal Databases
  • Yufei Tao
  • Chinese University of Hong Kong

2
Spatial databases
  • Query processing: retrieval of stationary
    locations.
  • Range search
  • Nearest neighbor retrieval
  • Object location changes are captured by explicit
    updates to the database.

3
Retrospect
  • First (?) SIGMOD/VLDB/ICDE/PODS paper about query
    processing on spatiotemporal data
  • G. Kollios, D. Gunopulos, and V. Tsotras. On
    Indexing Mobile Objects. PODS, 1999.
  • 2006 - 1999 = 7 years.

4
The current literature
  • Historical retrieval
  • A direct (but non-trivial) extension of the
    traditional temporal databases
  • The database stores the historical location of
    each object at every single past timestamp.

5
The current literature (cont.)
  • Find all vehicles that were in the 1km vicinity
    of the crime scene at 7pm on Nov 1 2006.
  • Find the 10 vehicles that were closest to the
    crime scene at 7pm on Nov 1 2006.
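
The two example queries above can be sketched as a range search and a k-nearest-neighbor search over one historical snapshot. This is a minimal linear-scan illustration, not an index structure; the `history` table, its timestamp strings, and the function names are all hypothetical.

```python
import math

# Hypothetical history table: (vehicle_id, timestamp) -> (x, y) in km,
# with the crime scene placed at the origin.
history = {
    ("v1", "2006-11-01T19:00"): (0.2, 0.3),
    ("v2", "2006-11-01T19:00"): (0.9, 0.8),
    ("v3", "2006-11-01T19:00"): (3.0, 4.0),
    ("v1", "2006-11-01T19:10"): (5.0, 5.0),
}

def snapshot(history, t):
    """Locations of all objects recorded at timestamp t."""
    return {oid: loc for (oid, ts), loc in history.items() if ts == t}

def range_query(history, t, center, radius):
    """Ids of objects within `radius` of `center` at timestamp t."""
    snap = snapshot(history, t)
    return sorted(oid for oid, loc in snap.items()
                  if math.dist(loc, center) <= radius)

def knn_query(history, t, center, k):
    """Ids of the k objects closest to `center` at timestamp t."""
    snap = snapshot(history, t)
    return sorted(snap, key=lambda oid: math.dist(snap[oid], center))[:k]
```

A real system would answer both queries through a spatiotemporal index rather than a scan; the sketch only pins down the query semantics.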

6
The current literature (cont.)
  • Future prediction
  • A novel area specific to spatiotemporal databases
  • The database stores the current location and
    velocity of each object.
  • The objective is to retrieve objects expected to
    satisfy certain predicates in the future.

7
The current literature (cont.)
Current movements
Locations 1 timestamp later
  • Find the aircraft expected to appear in the
    airspace of Hong Kong in 10 minutes.
  • Find the aircraft nearest to UA801 in 10
    minutes.
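
A minimal sketch of future prediction, assuming each object moves linearly with its stored velocity (the usual assumption in this line of work, but an assumption here). The `MovingObject` class, its fields, and the query functions are hypothetical names for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class MovingObject:
    oid: str
    x: float   # current location
    y: float
    vx: float  # current velocity, in distance units per minute
    vy: float

    def location_at(self, dt):
        """Predicted location dt minutes from now (linear motion assumed)."""
        return (self.x + self.vx * dt, self.y + self.vy * dt)

def future_range(objects, rect, dt):
    """Ids of objects predicted to be inside rect = (xlo, ylo, xhi, yhi) after dt."""
    xlo, ylo, xhi, yhi = rect
    result = []
    for o in objects:
        x, y = o.location_at(dt)
        if xlo <= x <= xhi and ylo <= y <= yhi:
            result.append(o.oid)
    return result

def future_nn(objects, target_id, dt):
    """Id of the object predicted to be nearest to the target after dt."""
    target = next(o for o in objects if o.oid == target_id)
    t_loc = target.location_at(dt)
    others = [o for o in objects if o.oid != target_id]
    return min(others, key=lambda o: math.dist(o.location_at(dt), t_loc)).oid
```

Setting `dt = 0` reduces both functions to queries on current locations only, in which case the velocities play no role.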

8
The current literature (cont.)
Current movements
  • Future prediction captures the conventional
    spatial databases as a special case.
  • Queries concern objects' current locations only.
  • Hence, object velocities do not matter.

9
The current literature (cont.)
  • Indexing
  • HR-, MVR-, 3D R-trees, ...
  • TPR-, TPR*-, STRIPES, Bx-, Bdual-trees, ...
  • Range search
  • Nearest neighbor
  • Time parameterized NN
  • Continuous NN
  • Location based NN
  • Selectivity estimation
  • Histograms
  • Sampling
  • Stream processing
  • Fact: it is getting harder and harder to impress
    the database community.

10
A new twist?
  • All the above work assumes that the database has
    the exact location of each object.
  • This is rarely possible.

11
Uncertainty
An object's location is described by a
probability density function.
12
Probabilistic range search
Find the clients that are currently in the town
center with at least 50% probability.
13
Qualification probability
Example: uniform pdf
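
For the uniform case, the qualification probability reduces to a ratio of areas. The sketch below assumes a rectangular uncertainty region and a rectangular query region, both given as `(xlo, ylo, xhi, yhi)`; the slide's own figure is not in the transcript, so this shape and the function names are assumptions.

```python
def area(rect):
    """Area of rect = (xlo, ylo, xhi, yhi); zero if the rectangle is empty."""
    xlo, ylo, xhi, yhi = rect
    return max(0.0, xhi - xlo) * max(0.0, yhi - ylo)

def intersection(a, b):
    """Intersection rectangle of a and b (possibly empty)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))

def uniform_qualification_probability(ur, rq):
    """P(object in rq) when the object is uniformly distributed over ur."""
    return area(intersection(ur, rq)) / area(ur)
```

For example, an object uniform over `(0, 0, 10, 10)` falls in the query region `(5, 5, 20, 20)` with probability 25 / 100 = 0.25, so it would not qualify under a 50% threshold.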
14
Qualification probability (cont.)
  • Probabilistic modeling in practice

15
Qualification probability (cont.)
The qualification probability must be calculated numerically.
Calculation time of an appearance probability in
2D space: 1.3 ms
Time for a random I/O access: 10 ms
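
One common way to do the numerical calculation is Monte Carlo sampling; the talk does not say which method was timed, so treat this as one possible sketch. The `sample_location` callback and the truncated Gaussian example pdf are hypothetical.

```python
import random

def estimate_qualification_probability(sample_location, rq, n=20000, seed=0):
    """Monte Carlo estimate of P(object lies in rq).

    sample_location(rng) draws one (x, y) location from the object's pdf;
    rq is the query rectangle (xlo, ylo, xhi, yhi).
    """
    rng = random.Random(seed)
    xlo, ylo, xhi, yhi = rq
    hits = 0
    for _ in range(n):
        x, y = sample_location(rng)
        if xlo <= x <= xhi and ylo <= y <= yhi:
            hits += 1
    return hits / n

def gaussian_in_disk(rng, center=(5.0, 5.0), sigma=2.0, radius=5.0):
    """Example pdf: Gaussian around center, truncated to a disk by rejection."""
    while True:
        x = rng.gauss(center[0], sigma)
        y = rng.gauss(center[1], sigma)
        if (x - center[0]) ** 2 + (y - center[1]) ** 2 <= radius ** 2:
            return (x, y)
```

With the figures quoted above, 10 / 1.3 ≈ 7.7, so roughly eight probability calculations cost as much as one random I/O. CPU time therefore cannot be ignored when counting query cost.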
16
Exciting opportunities
  • Uncertain range search
  • Reynold et al. VLDB 04, Tao et al. VLDB 05
  • Uncertain nearest neighbor search
  • Reynold et al. TKDE 04
  • Uncertain selectivity estimation
  • No work
  • Uncertain stream processing
  • No work
  • All existing work considers only uncertain
    stationary objects.
  • Location updates need to be reported to the
    database through explicit updates.
  • Combined with future prediction or historical
    retrieval.
  • No existing work.

17
In this talk
  • Goal: motivate query processing on uncertain
    spatiotemporal objects
  • Uncertain range search on stationary objects
  • If time permits
  • Indexing of historical uncertain data

18
Nonfuzzy queries
  • Find all objects that appear in a search region
    rq with a probability ≥ tq.

19
Fuzzy queries
  • Find all objects whose distances to an uncertain
    object q are ≤ εq with a probability ≥ tq.

Example: q = the cab with suspects; objects =
police cars
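
A minimal Monte Carlo sketch of the fuzzy qualification probability, assuming the locations of the object and of q are independent and that each pdf can be sampled. The distance threshold from the slide is written `eq` in the code; the sampler helpers are hypothetical.

```python
import math
import random

def fuzzy_qualification_probability(sample_o, sample_q, eq, n=20000, seed=0):
    """Estimate P(dist(o, q) <= eq) for two independent uncertain objects.

    sample_o(rng) and sample_q(rng) each draw one (x, y) location.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        if math.dist(sample_o(rng), sample_q(rng)) <= eq:
            hits += 1
    return hits / n

def uniform_in_rect(rect):
    """Sampler for a pdf that is uniform over rect = (xlo, ylo, xhi, yhi)."""
    xlo, ylo, xhi, yhi = rect
    return lambda rng: (rng.uniform(xlo, xhi), rng.uniform(ylo, yhi))

def fuzzy_query(objects, sample_q, eq, tq):
    """Ids of objects whose estimated probability reaches the threshold tq.

    objects maps an id to its sampler.
    """
    return sorted(oid for oid, sampler in objects.items()
                  if fuzzy_qualification_probability(sampler, sample_q, eq) >= tq)
```

In the cab example, `sample_q` would draw from the cab's pdf and `objects` would hold one sampler per police car.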
20
Goal
  • Support any pdf.
  • Convex/concave uncertainty regions
  • Minimize the number of page accesses
  • Minimize the number of qualification probability
    calculations.
  • Minimize the total cost (I/O + CPU)

21
Review: MBR and why
  • Minimum bounding rectangle (MBR)

22
Main Idea
  • For each object, pre-compute some auxiliary
    information that can be used to
  • efficiently decide whether an object appears in a
    region with at least a certain probability
  • without calculating its actual appearance
    probability.
  • Auxiliary information: probabilistically
    constrained rectangle (PCR)

23
Probabilistically Constrained Rectangle (PCR)
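
The slide's figure is not in the transcript. The sketch below assumes the definition from Tao et al. VLDB 05: for 0 ≤ c ≤ 0.5, o.pcr(c) is the rectangle such that, along each dimension, o lies below the lower face with probability c and above the upper face with probability c (so o.pcr(0) is the MBR). Estimating the faces from sample quantiles is an illustration choice, not the paper's procedure.

```python
def pcr_from_samples(samples, c):
    """Approximate o.pcr(c) from (x, y) samples of the object's pdf.

    Returns (xlo, ylo, xhi, yhi), taking the c-quantile and the
    (1 - c)-quantile of the samples along each dimension.
    """
    if not 0.0 <= c <= 0.5:
        raise ValueError("c must lie in [0, 0.5]")
    n = len(samples)
    lo_idx = round(c * (n - 1))          # rank of the lower face
    hi_idx = round((1.0 - c) * (n - 1))  # rank of the upper face
    xs = sorted(x for x, _ in samples)
    ys = sorted(y for _, y in samples)
    return (xs[lo_idx], ys[lo_idx], xs[hi_idx], ys[hi_idx])
```

Larger values of c give smaller rectangles, and at c = 0.5 the rectangle degenerates to the per-dimension medians.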
24
PCR in pruning
Probability threshold tq = 0.8.
25
PCR in pruning
Probability threshold tq = 0.2.
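
The two thresholds on these slides correspond to two pruning cases. The sketch assumes the PCR definition and pruning rules of Tao et al. VLDB 05: o.pcr(c) leaves probability c beyond each face; for tq > 0.5 an object can be pruned if rq does not fully contain o.pcr(1 - tq), and for tq ≤ 0.5 it can be pruned if rq does not intersect o.pcr(tq). The `pcr` callback, which returns the exact rectangle for any c, is a hypothetical simplification, since only a few PCRs per object can be stored in practice.

```python
def contains(outer, inner):
    """True if rectangle outer fully contains rectangle inner."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])

def intersects(a, b):
    """True if rectangles a and b share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def can_prune(pcr, rq, tq):
    """Sufficient test that the object fails the query (rq, tq).

    pcr(c) returns the object's PCR as (xlo, ylo, xhi, yhi).
    A False result means "undecided", not "qualifies".
    """
    if tq > 0.5:
        return not contains(rq, pcr(1.0 - tq))
    return not intersects(rq, pcr(tq))

def uniform_pcr(c):
    """Exact PCR of an object uniformly distributed over (0, 0, 10, 10)."""
    return (10.0 * c, 10.0 * c, 10.0 * (1.0 - c), 10.0 * (1.0 - c))
```

For the uniform object and tq = 0.8, the region `(0, 0, 7, 10)` fails to contain o.pcr(0.2), so the object is pruned without computing its appearance probability, which is 0.7.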
26
PCR in validating
tq = 0.9
tq = 0.8
tq = 0.1
o.pcr(0.1)
27
PCR in validating (cont.)
tq = 0.6
tq = 0.8
tq = 0.7
o.pcr(0.1)
28
Still down the road
  • Only a finite number of PCRs can be stored for
    each object. Which ones to store?
  • How to index these PCRs to reduce query cost?
  • Cost model analysis?
  • Fuzzy range search (repeat all the above issues)

29
In this talk
  • Goal: motivate query processing on uncertain
    spatiotemporal objects
  • Uncertain range search on stationary objects
  • If time permits
  • Indexing of historical uncertain data

30
Historical uncertain spatiotemporal databases
  • Application
  • A database stores the locations of each vehicle
    on a 10-minute basis in the whole past year.
  • Possible queries
  • Find all vehicles that appeared in the town
    center sometime during 6pm-7pm with a probability
    of at least 50%.
  • Find all vehicles that appeared in the town
    center during the entire period of 6pm-7pm with a
    probability of at least 50%.
  • Nearest neighbor search counterparts
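
To see how the "sometime" and "entire period" variants differ, the sketch below combines per-snapshot appearance probabilities. It assumes, loudly, that a vehicle's locations at different snapshots are independent and that only the recorded 10-minute snapshots matter. Neither holds for real trajectories, so this is an illustration of the semantics rather than a method from the talk.

```python
import math

def prob_always(per_snapshot_probs):
    """P(in region at every snapshot), assuming independent snapshots."""
    return math.prod(per_snapshot_probs)

def prob_sometime(per_snapshot_probs):
    """P(in region at one or more snapshots), assuming independent snapshots."""
    return 1.0 - math.prod(1.0 - p for p in per_snapshot_probs)

def qualifies(per_snapshot_probs, tq, mode):
    """Apply the threshold tq to the 'sometime' or 'always' probability."""
    if mode == "sometime":
        return prob_sometime(per_snapshot_probs) >= tq
    if mode == "always":
        return prob_always(per_snapshot_probs) >= tq
    raise ValueError("mode must be 'sometime' or 'always'")
```

With two snapshots at probability 0.5 each, the vehicle qualifies for the "sometime" query (0.75 ≥ 0.5) but not for the "entire period" query (0.25 < 0.5).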

31
Existing Uncertainty Models
  • The linear model

32
Existing Uncertainty Models (cont.)
  • The ellipse model [Pfoser et al. SSTD 99]

33
Existing Uncertainty Models (cont.)
  • The cylinder model [Trajcevski et al. TODS 04]

34
A general modeling
  • Each object o is associated with an o.pdf(x, t),
    which captures the probability that o is at
    location x at any timestamp t in history.

Conceptually, one pdf like this at every past
timestamp.
35
Query semantics [Trajcevski et al. TODS 04]
  • Based on the cylinder model

possibly sometime
possibly always
36
Query semantics (cont.)
  • Based on the cylinder model

always possibly
definitely always
37
Query semantics (cont.)
  • Based on the cylinder model

definitely sometime
sometime definitely
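
Four of the six semantics can be sketched directly, assuming the cylinder model places the object, at each instant, somewhere in a disk of radius r around its expected location. The sketch checks only a finite list of sampled instants, so it approximates the continuous-time predicates. "Possibly always" and "definitely sometime" quantify over whole motion curves and are left out. All names are hypothetical.

```python
import math

def disk_intersects_rect(center, r, rect):
    """True if the disk touches rect = (xlo, ylo, xhi, yhi)."""
    cx, cy = center
    dx = max(rect[0] - cx, 0.0, cx - rect[2])
    dy = max(rect[1] - cy, 0.0, cy - rect[3])
    return math.hypot(dx, dy) <= r

def disk_inside_rect(center, r, rect):
    """True if the whole disk lies inside rect."""
    cx, cy = center
    return (rect[0] + r <= cx <= rect[2] - r
            and rect[1] + r <= cy <= rect[3] - r)

def possibly_sometime(centers, r, rect):
    """At some sampled instant the object may be in rect."""
    return any(disk_intersects_rect(c, r, rect) for c in centers)

def always_possibly(centers, r, rect):
    """At every sampled instant the object may be in rect."""
    return all(disk_intersects_rect(c, r, rect) for c in centers)

def sometime_definitely(centers, r, rect):
    """At some sampled instant the object is certainly in rect."""
    return any(disk_inside_rect(c, r, rect) for c in centers)

def definitely_always(centers, r, rect):
    """At every sampled instant the object is certainly in rect."""
    return all(disk_inside_rect(c, r, rect) for c in centers)
```

Here `centers` holds the expected locations at the sampled instants, for example obtained by interpolating between recorded positions.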
38
Query processing
Conceptually, one pdf like this at every past
timestamp.
  • Difficulties
  • There are infinitely many timestamps in the
    past. How to compute the pdfs at all of them?
  • How to index them?
  • Can PCRs help?
  • Algorithms for supporting all the previous
    semantics based on the index structure?

39
Conclusions
  • Uncertain spatiotemporal query processing
  • Data
  • Stationary, historical, and moving
  • Queries
  • Range search, NN, selectivity estimation, ...
  • New query types: fuzzy range search, ...
  • Combinations of the above two.