Spatial Access Methods - PowerPoint PPT Presentation

Description:

Spatial Access Methods & Query Processing. Matei Lunca Geographic Information Analysis 2004 ... GIS questions such as 'buffer around river' 4. Extending RDBMS for GIS/GIA ...

1
Spatial Access Methods & Query Processing
  • Matei Lunca, Geographic Information Analysis,
    2004
  • Richardson & Van Oosterom - Advances in Spatial
    Data Handling

2
Contents
  • Extending RDBMS for GIS/GIA
  • Trees
  • Query types
  • The curse of dimensionality
  • Approximate matches

3
Geographic Information Retrieval
  • Spatial Access Methods
  • Algorithms for storing and retrieving spatial
    data (up to 3-D) that are strongly interrelated and
    therefore cannot be stored in ordinary structures
    such as B-Trees
  • Query Processing
  • Data structures and database searches in this
    context
  • GIS questions such as "buffer around a river"

4
Extending RDBMS for GIS/GIA
  • In GIS, objects are organized by location and
    extent in space
  • Because spatial objects can be arbitrarily
    complex, access methods need simple 2D
    approximations such as minimum bounding rectangles
  • Curse of dimensionality!
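
The minimum bounding rectangle (MBR) approximation mentioned above can be sketched as follows. This is a minimal illustration, not taken from the slides: the names `mbr`, `rects_intersect`, `river_bend` and `window` are hypothetical, and rectangles are assumed to be axis-aligned `(xmin, ymin, xmax, ymax)` tuples.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def mbr(points: Iterable[Point]) -> Rect:
    """Return the minimum bounding rectangle of a 2D point sequence."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))


def rects_intersect(a: Rect, b: Rect) -> bool:
    """True if two axis-aligned rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Hypothetical data: an L-shaped polygon and a query window.
river_bend = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
box = mbr(river_bend)
window = (2.0, 2.0, 3.0, 3.0)
print(box, rects_intersect(box, window))
```

The window lies in the empty notch of the L shape, yet its rectangle overlaps the MBR. The approximation is cheap to test but can report candidates that the exact geometry later rejects.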

5
Requirements of spatial access methods
  • Dynamic
  • Random access and queries must be supported
  • Space efficient
  • Complex spatial data often cannot be partitioned
    because of relations between objects, so data
    blocks may be large and may not fit in memory
  • Efficiency independent of operators and data
    distribution
  • Usable across multiple databases storing
    different types of data that must be joined
  • Compatible with concurrency

6
Practical requirements
  • Costs of computing and communicating data
  • Minimize external access costs (I/O)
  • Indexing: trees
  • Pointers at leaves/nodes
  • Searching goes down the tree
  • Fast for range queries
  • Hashing: address buckets
  • No ordering needed

7
Challenges in Indexing
  • Most databases support:
  • B-Trees
  • Hash tables
  • Few databases support:
  • R-Trees
  • Region quadtrees
  • Why is implementation so difficult?
  • Integration with the query optimizer
  • Providing query operators that utilize the index
  • Cost model (efficiency known before
    implementation)
  • Concurrency control and recovery techniques

8
Space Driven VS Data Driven
  • Space Driven Trees
  • Decomposition independent from data insertion
    order
  • Region quadtree
  • Data Driven Trees
  • Space decomposed based on input data
  • Point quadtree
  • K-D Tree
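
The difference between the two families can be sketched in a few lines. This is an illustrative simplification under stated assumptions: points lie in the unit square, `quadrant_path` and `build_kd` are hypothetical names, and the k-d tree is bulk-built with median splits rather than by incremental insertion.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def quadrant_path(p: Point, depth: int) -> str:
    """Region-quadtree cell of p in the unit square, as quadrant digits.

    Space-driven: every split is at the fixed midpoint of the current cell.
    """
    x0, y0, size = 0.0, 0.0, 1.0
    path = ""
    for _ in range(depth):
        size /= 2
        right = p[0] >= x0 + size
        upper = p[1] >= y0 + size
        path += str(2 * upper + right)  # 0=SW, 1=SE, 2=NW, 3=NE
        x0 += size * right
        y0 += size * upper
    return path


def build_kd(points: List[Point], axis: int = 0) -> Optional[dict]:
    """Data-driven: each split is placed at the median of the points present."""
    if not points:
        return None
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {
        "point": pts[mid],
        "axis": axis,
        "left": build_kd(pts[:mid], 1 - axis),
        "right": build_kd(pts[mid + 1:], 1 - axis),
    }


# Hypothetical sample points.
pts = [(0.1, 0.2), (0.15, 0.8), (0.2, 0.5)]
print([quadrant_path(p, 2) for p in pts])
print(build_kd(pts)["point"])
```

The quadtree cell of a point is fixed by the space alone, whereas the k-d tree's root split moves whenever the data set changes.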

9
Space/Data Driven Structures
  • Space-driven structures: grids
  • Twin grid file
  • Shuffles points between the primary and secondary
    file to minimize the total size
  • Multilayer grid file
  • Uses two or more grid files, storing objects in
    the first grid file where no splitting across
    hyperplanes is needed
  • Data driven structures - R-Tree

10
Trees
  • X-Tree
  • TR-Tree
  • IQ-, PX- and MDX-Trees
  • PX-Tree
  • TV-Tree
  • VAM-Split Trees

11
Trees X-Tree
  • Adapts R-Trees to high dimensional data
  • Overlap-free split based on split history
  • R-/R*-Trees lead to high overlap in high
    dimensions, which diminishes the advantages of
    hierarchical partitioning
  • When a split would lead to an unbalanced
    directory, the X-Tree omits the split and the node
    becomes a supernode
  • Supernodes are nodes enlarged to a multiple of
    the block size; they are scanned linearly, which
    avoids splits that would result in an inefficient
    structure

12
Trees X-Tree (2)
  • Dynamically use overlap-minimizing splits
  • Supernodes accessed sequentially if no good split
    decision found for a directory node

13
Trees TR-Tree
  • Improved R-Tree
  • Represents the exact geometry of spatial
    attributes
  • Reduces memory operations
  • Stores the components of one decomposed object
  • Internal node:
  • Pointer to child node
  • Minimum bounding rectangle of the trapezoids in
    the child
  • Leaf node:
  • Trapezoids

14
Trees TR-Tree (2)
  • Representation of Bavaria

15
Trees IQ-, PX- MDX-Trees
  • IQ-Tree
  • Index structure for query processing in
    high-dimensional data spaces
  • Compresses data to improve query processing
  • PX-Tree Multi-Disc X-Tree
  • Parallel access method
  • Short response time, high query throughput

16
Trees TV-Tree
  • R-Tree-like, with varying-length feature vectors
  • Telescope vector
  • Divides attributes into:
  • Those common to all subtree items
  • Those used for branching
  • Those ignored
  • Knowledge about the behaviour of single
    attributes (their selectivity) is necessary

17
Trees VAM-Split Trees
  • VAM-Split R-Tree
  • VAM-Split KD-Tree
  • Static index structures
  • All objects must be available when index is
    created
  • Splits are performed on the dimension with
    maximum variance
  • Built in memory before permanently stored on disk
  • Size limited to the amount of (virtual) memory
    available
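
A single split step in the spirit of the VAM-Split trees might look like the sketch below. It is a simplification under stated assumptions: all points are already in memory (the index is static), the split position is the plain median rather than a block-aligned approximation of it, and `vam_split` and `data` are hypothetical names.

```python
from statistics import pvariance
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def vam_split(points: List[Vector]) -> Tuple[int, List[Vector], List[Vector]]:
    """Split on the dimension of maximum variance, at the median position."""
    dims = len(points[0])
    # Pick the dimension along which the data is spread out the most.
    split_dim = max(range(dims), key=lambda d: pvariance([p[d] for p in points]))
    ordered = sorted(points, key=lambda p: p[split_dim])
    mid = len(ordered) // 2
    return split_dim, ordered[:mid], ordered[mid:]


# Hypothetical data: almost no spread in dimension 0, large spread in dimension 1.
data = [(1.0, 10.0), (1.2, 50.0), (0.9, 30.0), (1.1, 90.0)]
dim, low, high = vam_split(data)
print(dim, low, high)
```

Applying the step recursively to `low` and `high` would yield the full tree, which is why every object has to be available when the index is created.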

18
Other Trees
  • The Cell Tree
  • Levels of data split by arbitrary hyperplanes
  • Concave objects decomposed into convex pieces,
    which are indexed in every cell that they overlap
  • The K-D Tree
  • Levels of data are split along different
    dimensions into non-overlapping cells
  • Objects indexed in all cells they intersect

19
Other Trees (2)
  • Generalized BD Tree
  • Stores objects as hierarchy of minimum bounding
    boxes
  • The P-Tree
  • Hyperplanes split space hierarchically by
    polytopes
  • multidimensional boxes with nonrectangular
    sides
  • R-Tree special case in which all polytopes are
    boxes
  • R-files
  • Divide space into hierarchy of nested boxes in
    which objects are indexed in lowest cell which
    contains them

20
Cost Models
  • Curse of dimensionality: performance
    deteriorates
  • Cost model for query processing in
    high-dimensional data spaces, for careful
    optimization of the parameters of an index
  • Data space quantization
  • Data compression - VA File, IQ Tree
  • Reduce I/O by representing attributes in fewer
    bits
  • Page size
  • Dimension assignment
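
Representing attributes in fewer bits can be illustrated with a simple quantizer. This sketch assumes equal-width cells over a known value range, which is a simplification and not necessarily how the VA File or IQ Tree choose their cell boundaries. The names `quantize` and `approximate` are hypothetical.

```python
from typing import Sequence, Tuple


def quantize(value: float, bits: int, lo: float = 0.0, hi: float = 1.0) -> int:
    """Map a value in [lo, hi] to one of 2**bits equal-width cells."""
    cells = 1 << bits
    idx = int((value - lo) / (hi - lo) * cells)
    return min(max(idx, 0), cells - 1)  # clamp so value == hi lands in the last cell


def approximate(vector: Sequence[float], bits: int) -> Tuple[int, ...]:
    """Replace every coordinate by its cell number."""
    return tuple(quantize(v, bits) for v in vector)


# Hypothetical 3-dimensional point, 2 bits per dimension.
approx = approximate((0.1, 0.5, 0.99), bits=2)
print(approx)
```

With 2 bits per dimension the approximation of this point needs 6 bits instead of three full floating-point values, at the price of having to check the exact vector for every candidate that survives the scan.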

21
High-dimensional data spaces & massive data sets
  • Exotic data, cardinality/dimensionality
  • Terabyte, petabyte
  • Common problem: overfitting the data
  • Common challenge: fitting a model/pattern robustly
  • Compression, statistics, stochastic analysis,
    discrete mathematics, harmonic analysis
  • Complexity and noisiness lead to constructing
    statistical/fuzzy models

22
The Pyramid-Technique
  • Maps data from D-dimensional space to 1D so
    B-Trees can be used to manage data
  • Data space is divided into 2·D pyramids (two per
    dimension)
  • Pyramids are partitioned into data pages of the
    B-Tree
  • No inverse transformation needed because the
    D-dimensional point is stored together with its key
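
The mapping from a D-dimensional point to a one-dimensional key can be sketched as follows. The sketch assumes the data space is normalized to the unit hypercube with the pyramid tips meeting at the centre (0.5, ..., 0.5); the function name `pyramid_value` is hypothetical.

```python
from typing import Sequence


def pyramid_value(v: Sequence[float]) -> float:
    """Map a point of the unit hypercube to (pyramid number + height)."""
    d = len(v)
    # Dimension in which the point is farthest from the centre 0.5.
    j_max = max(range(d), key=lambda j: abs(0.5 - v[j]))
    # Pyramids 0..d-1 lie on the low side, d..2d-1 on the high side.
    i = j_max if v[j_max] < 0.5 else j_max + d
    height = abs(0.5 - v[j_max])  # distance from the centre, between 0 and 0.5
    return i + height


# Hypothetical 2-dimensional points (4 pyramids in total).
print(pyramid_value((0.2, 0.6)))  # pyramid 0
print(pyramid_value((0.5, 0.9)))  # pyramid 3
```

Because the height never exceeds 0.5, keys of different pyramids cannot collide, so the value can serve directly as a B-Tree key.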

23
The Pyramid-Technique (2)
  • Complex queries
  • Pyramid value is calculated from the query input
  • The tree is queried with this value
  • Result: the D-dimensional points sharing that
    pyramid value, which must be scanned for the
    search item
  • Efficient query processing only in < 8 dimensions

24
Query processing
  • Direct vs. indirect spatial search
  • Direct: locating objects in a geographical area
  • Indirect: queries based on non-spatial
    attributes
  • Show the geography that complies with non-spatial
    requirements

25
Query processing steps
  • Query input
  • Filter step
  • Spatial index
  • Candidate set
  • Refinement step
  • Load spatial extent
  • Test spatial extent
  • Hits/false drops
  • Query result output
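
The filter and refinement steps listed above can be sketched for a point query. This is a minimal illustration with hypothetical names and data: a linear scan over bounding rectangles stands in for the spatial index, and the refinement uses a standard ray-casting point-in-polygon test.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def mbr(poly: Polygon) -> Tuple[float, float, float, float]:
    xs, ys = zip(*poly)
    return (min(xs), min(ys), max(xs), max(ys))


def point_in_polygon(p: Point, poly: Polygon) -> bool:
    """Ray casting: count polygon edges crossed by a horizontal ray from p."""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def point_query(p: Point, objects: Dict[str, Polygon]):
    # Filter step: cheap test on approximations yields the candidate set.
    candidates = []
    for name, poly in objects.items():
        xmin, ymin, xmax, ymax = mbr(poly)
        if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax:
            candidates.append(name)
    # Refinement step: load and test the exact spatial extent.
    hits = [n for n in candidates if point_in_polygon(p, objects[n])]
    false_drops = [n for n in candidates if n not in hits]
    return hits, false_drops


# Hypothetical objects.
objects = {
    "l_shape": [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)],
    "square": [(2, 2), (5, 2), (5, 5), (2, 5)],
    "far_away": [(10, 10), (11, 10), (11, 11), (10, 11)],
}
hits, false_drops = point_query((2.5, 2.5), objects)
print(hits, false_drops)
```

Only the candidates pay the cost of the exact geometry test; objects rejected by the filter are never loaded.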

26
Graphical Query Example
27
Graphical Query Example
28
Query types
  • Point query/point-in-polygon query
  • Parameter: coordinates
  • Which objects exist at these coordinates?
  • Window/range query
  • Parameter: region defined by coordinates
  • Which objects are located in this region?
  • Distance and buffer zone queries
  • Parameters: buffer object and distance
  • Which objects lie within the given distance of
    the buffer object?

29
Query types (2)
  • Path queries (network structure required)
  • Parameters: network locations
  • What is the shortest route from A to B?
  • Join and range queries
  • Spatial objects and relationships
  • Spatial predicates: points, windows, buffers,
    paths
  • Overlaying the roads and waterworks GIS layers
    and displaying the result according to relative
    height (river, bridge, aqueduct) is a spatial join
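
A path query such as "shortest route from A to B" needs a network structure. The sketch below uses Dijkstra's algorithm, which the slides do not name; it is chosen here only as a familiar example. The graph `roads` and the function `shortest_route` are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, float]]]  # node -> [(neighbour, edge length)]


def shortest_route(graph: Graph, start: str, goal: str) -> Tuple[float, List[str]]:
    """Dijkstra's algorithm; returns (total length, node sequence)."""
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in settled:
            continue
        settled.add(node)
        for neighbour, length in graph.get(node, []):
            if neighbour not in settled:
                heapq.heappush(queue, (dist + length, neighbour, path + [neighbour]))
    return float("inf"), []  # goal not reachable


# Hypothetical road network with one-way edges.
roads = {"A": [("C", 2.0), ("B", 7.0)], "C": [("B", 3.0)], "B": []}
print(shortest_route(roads, "A", "B"))
```

The detour via C (length 5) beats the direct edge (length 7), which only becomes visible once the network topology is available.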

30
Query types (3)
  • Feature approach: feature vectors
  • Neighborhood search
  • Spatial-Query-by-Sketch
  • Multimedia (2D) search instead of alphanumeric

31
Spatial-Query-by-Sketch Sketcho 1.1b
32
Spatial-Query-by-Sketch Sketcho 1.1b
33
Spatial-Query-by-Sketch Sketcho 1.1b
34
Spatial-Query-by-Sketch Sketcho 1.1b
35
Similarity search
  • Approximate the surface by parametric functions
  • Assign the appropriate class to the query object
  • Section coding: each polygon's circumcircle is
    decomposed into sectors and normalized
  • Similarity: distance between feature vectors

36
Similarity search (2)
  • Shape Histograms (feature vectors!)
  • Bins: a complete, disjoint partition of space
    into cells
  • Shell model
  • Concentric uniform shells around the center
  • Independent of rotation around the center
  • Sector model
  • Sectors distributed uniformly over the surface
    (Voronoi)
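
The shell model can be sketched as a histogram of distances from the object's center. Assumptions in this illustration: the object is given as a set of 3D sample points, the center is their centroid, shells have equal thickness up to a given `radius`, and all names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]


def shell_histogram(points: Sequence[Point3], shells: int, radius: float) -> List[int]:
    """Count points per concentric shell of equal thickness around the centroid."""
    n = len(points)
    center = tuple(sum(p[k] for p in points) / n for k in range(3))
    hist = [0] * shells
    for p in points:
        r = math.dist(p, center)
        # Points at or beyond the outer radius fall into the last shell.
        hist[min(int(r / radius * shells), shells - 1)] += 1
    return hist


# Hypothetical object and the same object rotated 90 degrees about the z axis.
shape = [(1, 0, 0), (-1, 0, 0), (0, 3, 0), (0, -3, 0), (0, 0, 2), (0, 0, -2)]
rotated = [(-y, x, z) for x, y, z in shape]
print(shell_histogram(shape, shells=4, radius=4.0))
print(shell_histogram(rotated, shells=4, radius=4.0))
```

Both calls give the same feature vector, since rotation around the center does not change any distance to it. Two such histograms can then be compared with an ordinary vector distance.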

37
Shape Histograms
38
Special Query Types
  • Spatial continuous queries
  • In dynamic environments continuous polling is
    necessary, because otherwise query results
    become meaningless
  • Returns the result, its expiry time given the
    current motion vector, and the change that causes
    expiration
  • Spatio-temporal queries
  • Spatiotemporal Database Systems (STDBS) track and
    present data about moving objects, e.g. from GPS
  • Probabilistic models are also available that
    attempt to predict future values in order to give
    faster responses
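
The expiry time of a continuous query result can be sketched for the simplest case. Assumptions: the object is a point that currently lies inside a rectangular query window and moves with a constant velocity (its motion vector); `expiry_time` is a hypothetical name.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def expiry_time(pos: Vec, vel: Vec, window: Rect) -> float:
    """Time until a point inside `window` leaves it at constant velocity."""
    xmin, ymin, xmax, ymax = window
    t_exit = math.inf  # a stationary point never leaves
    for p, v, lo, hi in ((pos[0], vel[0], xmin, xmax), (pos[1], vel[1], ymin, ymax)):
        if v > 0:
            t_exit = min(t_exit, (hi - p) / v)
        elif v < 0:
            t_exit = min(t_exit, (lo - p) / v)
    return t_exit


# Hypothetical moving object inside a 10 x 10 window.
print(expiry_time((2.0, 2.0), (1.0, -0.5), (0.0, 0.0, 10.0, 10.0)))
```

Until that time the stored result stays valid without polling; afterwards the object crosses the lower edge and the result has to be updated.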

39
Query pre-processing
  • Pre-optimize index structure
  • With specific knowledge: if we use a TIN for
    river network studies, valleys are more important
    and could be stored at high nodes in the tree
  • Avoid characteristic areas: don't store the exact
    geometry of a chasm, only a no-go designation

40
Query processing strategies
  • Parallel searches (nice split)
  • In varying data structures
  • Shape-based strategy
  • Models the direction region
  • Converts processing of direction predicates into
    processing of topological operations between open
    shapes and closed geometry objects
  • Eliminates computation related to world boundary

41
Approximate Search/Match
42
Screenshots - LTRMP
43
Main points
  • Defining/representing spatial context
  • Space Driven VS Data Driven
  • Each application has its own technique
  • Approximate/Fuzzy approach
  • Tree
  • Hashing
  • 3D histogram