Spatial DBMS Research - PowerPoint PPT Presentation

1 / 34
About This Presentation
Title:

Spatial DBMS Research

Description:

OTB Research Institute for Housing, Urban and Mobility Studies. Spatial DBMS issues ... Relational algebra (1) a relational languages proposed by Codd. implementable ... – PowerPoint PPT presentation

Number of Views:59
Avg rating:3.0/5.0
Slides: 35
Provided by: wilko1
Category:

less

Transcript and Presenter's Notes

Title: Spatial DBMS Research


1
Spatial DBMS Research
GISt lunch meeting
Wilko Quak
2
Overview
  • Introduction to DBMS Query Processing
  • Benchmarking a spatial DBMS
  • The GeoInfoNed project
  • MonetDB
  • Discussion

3
Introduction to DBMS query processing
  • Slides borrowed from Dr. Yang He

4
Query processing overview
  • Review relational algebra
  • Query processing
  • introduction
  • stages of query processing
  • query optimisation
  • relational algebra tree

5
Relational algebra (1)
  • a relational languages proposed by Codd
  • implementable
  • basis of high-level (SQL) query execution
  • a collection of simple, 'low-level operations
    used to manipulate relations
  • input is one or more relations
  • output is one relation

6
Relational algebra (2)
  • Relational operations
  • unary operators
  • Restrict (Select) ?
  • Project ?
  • binary operators
  • Cartesian product X
  • Union ?
  • Intersection ?
  • Difference -
  • Join
  • Divide ?

P
7
Example relations
  • e.g. two relations Student and Registration

Student ( SID, Name, Gender ) Registration ( SID,
CID, Mark )
Registration
Student
8
Queries examples (1)
  • e.g. Identify all male students
  • in SQL
  • in relational algebra
  • ? (GenderM) ( Student )

SELECT SID, Name, Gender FROM Student WHERE
Gender'M'
Select
9
Queries examples (2)
  • e.g. List students name and gender.
  • in SQL
  • In relational algebra
  • ? Name, Gender ( Student )

SELECT Name, Gender FROM Student
Project
10
Queries examples (3)
  • e.g. Show student ID, name, their course ID and
    marks
  • in SQL
  • in relational algebra

SELECT s.SID, Name, CID, Mark FROM Student s,
Registration r WHERE s.SID r.SID
or
Project
Natural Join
11
Queries in relational algebra
  • A user query may require several operations to be
    performed
  • relational algebra is a procedural language so
    query operations are evaluated in the order
    specified
  • a complex query can be executed in different
    ways, so an efficient one should be used as
    efficiency is an important DBMS requirement
    query optimisation

12
Query processing
  • Four stages involved in query processing
  • query decomposition or parsing
  • query optimization
  • code generation
  • runtime query execution

13
Query optimization (1)
  • refers to the activity of choosing an efficient
    execution strategy or plan for processing a query
  • rule-based and cost-based strategies
  • database statistics in system catalog used for
    cost estimation
  • is a prime objective of the query processing

14
Query optimization (3)
  • In a query processing, disk access takes most
    time
  • The main objective of the query optimisation is
    to minimize the number of disk accesses
  • Many DBMSs use heuristic rules for query
    optimization
  • e.g. Perform selection and projection
    operations as early as possible to reduce the
    cardinality of the relation and the subsequent
    process of that relation

15
Query processing an example
  • e.g. Show student ID, name, their course ID and
    marks
  • in SQL
  • it can be transformed into relational algebra
    query

SELECT s.SID, Name, CID, Mark FROM Student s,
Registration r WHERE s.SID r.SID
or
The first one is better much less disk access
than the second
16
Relational algebra query tree (2)
  • e.g.

( ?SID, Name(Student) ) ( ?
(Registration) )
SID,CID,Mark
17
Spatial Query processing
  • In spatial query processing the operator is a
    spatial operator, for the rest it is the same as
    non-spatial query processing
  • Spatial Select
  • Find all objects within given rectangle 99.99
  • Spatial Join (overlay in GIS terms)
  • Find all restaurants within national parks

18
DBMS Benchmarking
  • Categorization of DBMS usage
  • Implications for benchmarking
  • benchmark choices

19
Categories of DBMS users
  • Static usage
  • Predefined queries with changing parameters
  • Queries can be hand optimized
  • Dynamic usage (browsing)
  • Many different queries
  • Query optimizer is important
  • Access via object-relational mapping (e.g.
    Hibernate)
  • Not discussed here

All categories need different benchmarking
20
Benchmarking static DBMS usage
  • Notes
  • Critical factor is testing the query processor.
  • Query optimizer is not important
  • Benchmark
  • Make small set of simple queries that test one
    operation

21
Benchmarking dynamic DBMS usage
  • Notes
  • Critical factor is testing the query optimizer.
  • Very hard to get quality reproducible results.
  • It is very hard to assess the quality of the
    query optimizer but a small testset might give
    some insight

select city.name,river.name from city,river where
city.inhabitants gt X and distance(city.geometr
y,river.geometry) lt Y
22
Other benchmarking considerations
  • Functionality
  • Usability
  • update behaviour

23
GeoInfoNed RGI-232
24
GeoInfoNed -- What and Why
  • Build a spatially enabled DBMS because
  • A DBMS is at the core of many system. If you
    improve the core the whole system improves.
  • There is a need for an (open source)
    experimentation platform for Geo DBMS research.

25
Who
  • CWI Leading DBMS experts with MonetDB
  • TUDelft/OTB Knowledge of spatial processes
  • CycloMedia Huge dataset and interesting
    problems
  • RWS/AGI Large and diverse datasets and
    interesting problems

26
How
  • At CWI there is the MonetDB DBMS. First we will
    extend it with basic spatial types (According to
    OpenGIS).
  • Together with our Problem Holder partners we
    will find directions for more extensions.
  • MonetDB already has support for Image Data, XML
    storage and querying etc.

27
Example
  • Is there a relationship between traffic accidents
    and objects near the road?

?
!!
GeoInfoNed
28
MonetDB Introduction
  • Hardware trends
  • MonetDB design considerations
  • MonetDB architecture

Slides borrowed from CWI
29
Hardware Trends
50 p/year - cpu speed - mem size - mem
bandwidth - disk bandwidth
1 p/year - mem latency
10 p/year - disk latency
30
Latency is the enemy!
  • Commercial DBMS products (oracle, DB2, SQLserver)
    stem from OLTP roots
  • focus on minimizing random I/Os gt depend on
    latency!
  • MonetDB built for bulk access
  • optimize CPU and memory performance

Latency is one of the killing factors in Frisos
simplicial homology implementation
31
MonetDB design considerations
  • Multi-model database kernel support
  • Extensible data types, operators, accelerators
  • Database hot-set is memory resident
  • Simple data structures are better
  • Index management should be automatic
  • Do not replicate the operating system
  • Optimize when you know the situation
  • Cooperative transaction management

32
MonetDB product family
End-user application
XQuery
SQL
PHP
JDBC
Perl
ODBC
Python
C-mapi lib
Here a MATLAB interface and Franks life would be
easier
MAPI protocol
Monet kernels
33
MonetDB - Physical data organization
  • Binary Association Tables

34
Discussion
Write a Comment
User Comments (0)
About PowerShow.com