The Impact of Global Clustering on Spatial Database Systems

Provided by: islabK

1
The Impact of Global Clustering on Spatial
Database Systems
  • VLDB Conference 1994
  • Thomas Brinkhoff, Hans-Peter Kriegel
  • 2000/5/16

2
Contents
  • Introduction
  • Queries in Spatial Database Systems
  • The Storage of Spatial Objects
  • The Cluster Organization
  • Evaluation

3
Introduction(1/3)
  • Characteristics of Spatial Databases
  • Manage high numbers of objects
  • A high variation in their complexity
  • Need selective spatial access
  • Spatial Access Methods
  • Pages holding spatially adjacent objects are
    arbitrarily distributed over the disk

4
Introduction(2/3)
  • Global Clustering
  • A set of data pages representing spatially
    adjacent objects is stored on consecutive pages
    of the disk.
  • Problem
  • A global reorganization of all objects in the
    database is not reasonable in a dynamic
    environment

5
Introduction(3/3)
  • Goals of this paper
  • Obtain an evaluation of the importance of several
    techniques for global clustering
  • The impacts of global clustering on spatial joins
    have not been investigated

6
Queries in Spatial DB Systems
  • Point Query
  • Window Query
  • Spatial Join
  • A ⋈ B
  • Intersection join
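
The three query types can be illustrated on minimum bounding rectangles (MBRs). The following is a minimal sketch and is not taken from the presentation: the `Rect` type, the function names, and the nested-loop join are assumptions chosen for brevity, and a real system would evaluate these queries through a spatial access method rather than by scanning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (hypothetical MBR type for this sketch)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains_point(self, x, y):
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax

    def intersects(self, other):
        # Two rectangles intersect if they overlap on both axes.
        return (self.xmin <= other.xmax and other.xmin <= self.xmax and
                self.ymin <= other.ymax and other.ymin <= self.ymax)


def point_query(rects, x, y):
    """All rectangles that contain the point (x, y)."""
    return [r for r in rects if r.contains_point(x, y)]


def window_query(rects, window):
    """All rectangles that intersect the query window."""
    return [r for r in rects if r.intersects(window)]


def intersection_join(a, b):
    """Index pairs (i, j) such that a[i] intersects b[j] (naive nested loop)."""
    return [(i, j)
            for i, ra in enumerate(a)
            for j, rb in enumerate(b)
            if ra.intersects(rb)]
```

The join on MBRs only yields candidate pairs; the exact geometries would still have to be tested afterwards.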

7
  • Introduction
  • Queries in Spatial Database Systems
  • The Storage of Spatial Objects
  • Spatial Access Method
  • Clustering
  • Organization Model
  • The Cluster Organization
  • Evaluation

8
Spatial Access Method(1/2)
  • Access Methods
  • Organize a dynamic set of objects in secondary
    storage
  • B-trees and linear hashing are not suitable
  • SAM
  • Store spatial objects that are close to each
    other in data space close to each other on the
    data pages

9
Spatial Access Method(2/2)
10
Clustering(1/2)
  • Access time
  • Seek time + latency time + transfer time
  • Goal
  • Minimize the number of seek operations and the
    rotational delay in order to reduce access cost.

11
Clustering(2/2)
  • Internal clustering
  • Local clustering
  • Global clustering
  (Figure labels: Object, Page, O1 to O6)
12
Organization Models(1/3)-Secondary organization
  • Primary index for approximation
  • Second index for spatial objects

13
Organization Models(2/3)-Primary organization
  • The exact representations of the objects are
    stored on the data pages.

14
Organization Models(3/3)-for Global Clustering
  • Combine sets of pages into larger storage units,
    called cluster units

15
  • Introduction
  • Queries in Spatial Database Systems
  • The Storage of Spatial Objects
  • The Cluster Organization
  • Requirements
  • SAM: R*-tree
  • Cluster Organization
  • Modification of the R*-tree
  • Evaluation

16
The Cluster Organization(1/5)
  • Requirements
  • SAM with high quality space partitioning scheme
  • Support insertion and deletion
  • The three query types (point query, window query,
    spatial join) should be supported efficiently
  • A maximum cluster size exists
  • A reasonable storage utilization
  • SAM
  • R*-tree
  • One of the most efficient variants of the R-tree

17
The Cluster Organization(2/5)-The Cluster
Organization
  • The Cluster Organization
  • Static definition of the size of a cluster unit
  • Store all objects whose MBRs are kept in the same
    data page together in one cluster unit.

18
The Cluster Organization(3/5)-The Cluster
Organization
  • Page size: 4 KB
  • Entry size: 46 bytes
  • Smax = 1.5 M Sobj
  • Storage utilization: 66%
  • Per cluster unit: 58 objects
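
The figure of 58 objects per cluster unit is consistent with the page parameters above, as the short calculation below suggests. It assumes that 4 KB means 4096 bytes, that the utilization value is a percentage, that a data page holds nothing but entries, and that one data page corresponds to one cluster unit; none of these assumptions is stated on the slide.

```python
PAGE_SIZE = 4 * 1024   # bytes; assumes "4 KB" means 4096 bytes
ENTRY_SIZE = 46        # bytes per entry, as given on the slide
UTILIZATION = 0.66     # storage utilization, read as 66 percent

# Maximum number of entries that fit into one data page.
max_entries = PAGE_SIZE // ENTRY_SIZE

# Average number of entries per page at the given utilization, rounded down.
avg_entries = int(max_entries * UTILIZATION)

print(max_entries, avg_entries)  # prints "89 58"
```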

19
The Cluster Organization(4/5)-Modification of
the R*-tree
  • Size of all objects in one cluster unit > the
    maximum cluster size Smax
  • Split the data page and the cluster unit
  • Splitting
  • No re-insertion
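
The overflow rule above can be sketched as follows. This is only an illustration under stated assumptions: `SpatialObject`, `ClusterUnit`, and `insert` are hypothetical names, and the split simply halves the objects along the x-axis as a stand-in for the split algorithm of the underlying R-tree variant, which the slide does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class SpatialObject:
    name: str
    x: float    # x-coordinate of the MBR centre (hypothetical field)
    size: int   # size of the exact representation in bytes


@dataclass
class ClusterUnit:
    objects: list = field(default_factory=list)

    def total_size(self):
        return sum(o.size for o in self.objects)


def insert(unit, obj, smax):
    """Insert obj into unit and return the resulting cluster unit(s)."""
    unit.objects.append(obj)
    if unit.total_size() <= smax:
        return [unit]
    # Overflow: split immediately, without re-inserting any objects.
    # Placeholder split: sort by x and cut in the middle (an assumption,
    # not the split strategy used in the paper).
    ordered = sorted(unit.objects, key=lambda o: o.x)
    half = len(ordered) // 2
    return [ClusterUnit(ordered[:half]), ClusterUnit(ordered[half:])]
```

In this simplified version a half may still exceed Smax if a single object is very large; a fuller implementation would have to split again or treat such objects separately.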

20
The Cluster Organization(5/5)-Processing
21
Evaluation (1/9)
  • TEST DATA
  • Average seek time: 9 msec
  • Average latency time: 6 msec
  • Average transfer time: 1 msec
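
These disk parameters can be plugged into the access-time model (seek time plus latency time plus transfer time) to see why consecutive storage pays off. The sketch below is a simplification: it assumes that the transfer time applies per page, that scattered pages each pay a full seek and rotational delay, and that a globally clustered run of pages pays them only once. The function names are illustrative.

```python
SEEK_MS = 9.0       # average seek time
LATENCY_MS = 6.0    # average rotational latency
TRANSFER_MS = 1.0   # average transfer time, assumed to be per page


def scattered_read_ms(n_pages):
    """Cost of reading n pages that are spread arbitrarily over the disk."""
    return n_pages * (SEEK_MS + LATENCY_MS + TRANSFER_MS)


def clustered_read_ms(n_pages):
    """Cost of reading n consecutive pages with a single request."""
    return SEEK_MS + LATENCY_MS + n_pages * TRANSFER_MS


for n in (1, 10, 50):
    print(n, scattered_read_ms(n), clustered_read_ms(n))
```

Under these assumptions the two costs are equal for a single page, and the clustered read becomes cheaper as soon as more than one page is requested.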

22
Evaluation (2/9)
  • Storage Utilization
  • Poor storage utilization of the cluster organization

23
Evaluation (3/9)
  • Buddy System
  • Each physical unit has the size Smax · 2^-i (i > 0)
  • Each cluster unit has the buddy of the smallest
    possible size
  • Cluster > buddy
  • → move into a bigger buddy
  • Split
  • → move into a smaller buddy
  • Restricted Sizes
  • (Smax, 0.5 Smax, 0.25 Smax)
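
Choosing the buddy for a cluster unit can be sketched as below. The function name `buddy_size` and the `levels` parameter are assumptions for illustration; the allowed sizes follow the restricted list on the slide (Smax, half of Smax, a quarter of Smax).

```python
def buddy_size(cluster_bytes, smax, levels=3):
    """Return the smallest allowed buddy size that can hold cluster_bytes.

    Allowed sizes are smax / 2**i for i = 0 .. levels - 1, which gives
    Smax, 0.5 Smax and 0.25 Smax for the default of three levels.
    """
    if cluster_bytes > smax:
        raise ValueError("cluster unit exceeds Smax and must be split first")
    sizes = [smax / 2 ** i for i in range(levels)]  # largest first
    for size in reversed(sizes):                    # try the smallest first
        if cluster_bytes <= size:
            return size
```

For example, with `smax = 1024` a cluster unit of 200 bytes gets a buddy of 256 bytes. Once it grows to 300 bytes it no longer fits and moves into the 512-byte buddy; after a split, the resulting smaller units can move back into smaller buddies.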

24
Evaluation (4/9)
  • Window Queries
  • Transfer the complete cluster unit
  • Transferring the complete unit is the handicap of
    the approach so far

25
Evaluation (5/9)
  • Geometric Threshold
  • Use the degree of overlap between the region of a
    cluster unit and the query window
  • Compare it with a threshold T
  • If degree < T: transfer page by page
  • If degree > T: transfer the complete cluster unit
  • The SLM-Technique
  • Reading requested and non-requested pages
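
The slide does not define the degree of overlap. One plausible reading, assumed here, is the fraction of the cluster unit's region that is covered by the query window. The sketch below computes that fraction for axis-aligned rectangles given as `(xmin, ymin, xmax, ymax)` tuples; the function name is illustrative.

```python
def overlap_degree(region, window):
    """Fraction of the region's area covered by the window (0.0 to 1.0).

    Both arguments are rectangles given as (xmin, ymin, xmax, ymax).
    """
    # Width and height of the intersection rectangle, if there is one.
    width = min(region[2], window[2]) - max(region[0], window[0])
    height = min(region[3], window[3]) - max(region[1], window[1])
    if width <= 0 or height <= 0:
        return 0.0  # no overlap, or the rectangles only touch
    region_area = (region[2] - region[0]) * (region[3] - region[1])
    return (width * height) / region_area
```

The returned value would then be compared with the threshold T to choose between the two transfer strategies.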

26
Evaluation (6/9)
  • Window Queries
  • The SLM-technique is the best choice

27
Evaluation (7/9)
  • Point Queries

28
Evaluation (8/9)
  • Spatial Join
  • (a) 86,094 intersecting pairs
  • (b) 1.2 million intersecting pairs

29
Evaluation (9/9)
  • Impact of Global Clustering on the performance of
    a Complete Spatial Join

30
Conclusion
  • Global clustering speeds up access to spatial
    objects for large window queries as well as for
    spatial joins.
  • With a buddy system, the cluster organization
    achieves good storage utilization.
  • The SLM technique is the best choice for window
    queries.
