Title: Writing a DBMS buyer's guide
1. Writing a DBMS buyer's guide
Based on a presentation at FOSS4G 2007 on benchmarking
2. Overview
- Original idea: benchmarking
- Complications of benchmarking
- New idea: a buyer's guide
- What should be in this guide
3. Benchmark consideration: the "weird cases" department
Diagonal query geometry
4. Benchmark consideration: hot vs cold
5. Why bother with benchmarking?
- Stonebraker 2007
- Where to find dramatic differences in spatial DBMSs?
We define "dramatically outperform" to mean at least a factor of 10 advantage; then customers will be inclined to try the new architecture.
6. Where to expect dramatic differences?
- Linux vs Windows (no)
- Choice of DBMS (only in specific cases)
- Choice of file system (no)
- Functionality difference (yes)
- Choice of parameters (maybe)
7. Problems with testing
- DBMS vendors do not want published results
- Oracle explicitly forbids publishing benchmark results
- Hardware
- Moore's Law
- Release frequency of software
- Spatial testing cannot be done on synthetic data
- Too many parameters
Benchmark results are outdated before they are published.
8. Solution
- Don't spend our time on producing benchmark results
- Write a buyer's guide; for that we need a classification of users
- Let people do their own testing: tell them what and when to test, and help them with a test suite
9. Classification of spatial DBMS users
- Four classes
- Server builders: publish spatial data via a web server
- GIS users: load various datasets and perform complex analyses
- Data maintainers: maintain one core dataset
- Power users: all of the above and more
10. Class 1: Web server builders
- You do not really need a DBMS for this (you use a fraction of DBMS functionality)
- Only one query counts: find everything within a BBOX
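The one query that matters for this class can be sketched as follows. This is a minimal illustration, not a recommended setup: it uses Python's built-in `sqlite3` module with plain numeric columns instead of a spatial index, and the table name `features`, its columns, and the helper names are all hypothetical. It also interprets "within BBOX" as "bounding box intersects the query box", which is an assumption.

```python
import sqlite3

def make_db(features):
    """Create an in-memory table of feature bounding boxes (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE features (id INTEGER PRIMARY KEY, "
        "xmin REAL, ymin REAL, xmax REAL, ymax REAL)"
    )
    conn.executemany("INSERT INTO features VALUES (?, ?, ?, ?, ?)", features)
    return conn

def bbox_query(conn, xmin, ymin, xmax, ymax):
    """Return ids of features whose bounding box intersects the query BBOX."""
    rows = conn.execute(
        "SELECT id FROM features "
        "WHERE xmax >= ? AND xmin <= ? AND ymax >= ? AND ymin <= ? "
        "ORDER BY id",
        (xmin, xmax, ymin, ymax),
    )
    return [row[0] for row in rows]

# Three sample features given as (id, xmin, ymin, xmax, ymax).
conn = make_db([
    (1, 0.0, 0.0, 1.0, 1.0),
    (2, 5.0, 5.0, 6.0, 6.0),
    (3, 0.5, 0.5, 5.5, 5.5),
])
result = bbox_query(conn, 0.0, 0.0, 2.0, 2.0)
print(result)
```

A spatial DBMS would answer the same question through a spatial index and its own operators; the point here is only how little functionality the query needs.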
11. Class 2: GIS users
- Main interest is functionality
- Spend more time on loading data
- Need a good query optimiser
- Analysis
12. Class 3: Dataset maintainers
- Limited number of queries
- Transactions are an issue
- Clustering of data after updates is interesting
13. Class 4: Power users
- Do their own testing
- Need a platform to discuss their findings
14. Test suite proposal
- Very simple performance test script with few parameters
- BBOX query
- Fixed dataset (proposal: OpenStreetMap dataset)
- Configurable test suite
- Full suite that tests every corner of the DBMS
- For specialists only
15. Test 1: simple BBOX select
- Write a simple script that generates a lot of rectangle queries.
- Parameters
- DBMS size
- query box size
- experiment length
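A script of this kind might look like the sketch below. It is only an illustration under stated assumptions: "experiment length" is taken to mean the number of queries, the query boxes are squares placed uniformly at random inside a given extent, and `generate_boxes`, `run_experiment`, and the `execute` callback are hypothetical names. The callback is where a real database call would go; here it is a no-op so the script runs on its own.

```python
import random
import time

def generate_boxes(extent, box_size, n_queries, seed=42):
    """Yield n_queries square query boxes of side box_size inside extent."""
    xmin, ymin, xmax, ymax = extent
    rng = random.Random(seed)  # fixed seed so every run issues the same queries
    for _ in range(n_queries):
        x = rng.uniform(xmin, xmax - box_size)
        y = rng.uniform(ymin, ymax - box_size)
        yield (x, y, x + box_size, y + box_size)

def run_experiment(execute, boxes):
    """Time each query; execute is any callable that takes one box."""
    timings = []
    for box in boxes:
        start = time.perf_counter()
        execute(box)
        timings.append(time.perf_counter() - start)
    return timings

# Parameters: query box size and experiment length (number of queries).
extent = (0.0, 0.0, 100.0, 100.0)
boxes = list(generate_boxes(extent, box_size=5.0, n_queries=1000))

# Placeholder callback; replace with a function that sends the box to the DBMS.
timings = run_experiment(lambda box: None, boxes)
print(len(timings), "queries, mean seconds per query:", sum(timings) / len(timings))
```

Fixing the random seed keeps the query stream identical between runs, so differences in timing come from the system under test and not from the queries.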
16. Test 1: grow DBMS size
- Question: does query response time depend on DBMS size or on core memory?
- Experiment: run the same test on more and more copies of the same database.
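The shape of this experiment can be sketched as below, assuming a point table in SQLite as a stand-in for a spatial DBMS. The table name `pts`, the helper names, and the dataset are hypothetical. Because the sketch uses an in-memory database, it cannot show the effect of exceeding core memory; it only shows how the loop over database copies and the timing fit together.

```python
import random
import sqlite3
import time

def load_copies(points, copies):
    """Load the same point set `copies` times into a fresh in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pts (x REAL, y REAL)")
    for _ in range(copies):
        conn.executemany("INSERT INTO pts VALUES (?, ?)", points)
    conn.execute("CREATE INDEX pts_xy ON pts (x, y)")
    return conn

def time_queries(conn, boxes):
    """Run every BBOX count query; return (mean seconds per query, total hits)."""
    hits = 0
    start = time.perf_counter()
    for xmin, ymin, xmax, ymax in boxes:
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM pts "
            "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?",
            (xmin, xmax, ymin, ymax),
        ).fetchone()
        hits += n
    return (time.perf_counter() - start) / len(boxes), hits

rng = random.Random(1)
points = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(2000)]
corners = [(rng.uniform(0, 90), rng.uniform(0, 90)) for _ in range(50)]
boxes = [(x, y, x + 10, y + 10) for x, y in corners]

# Same queries, growing database: 1, 2, then 4 copies of the same data.
results = {}
for copies in (1, 2, 4):
    conn = load_copies(points, copies)
    results[copies] = time_queries(conn, boxes)
    conn.close()

for copies, (mean_seconds, hits) in results.items():
    print(copies, "copies:", hits, "hits,", mean_seconds, "s/query")
```

To address the question on the slide, the same loop would have to run against an on-disk database that grows past the size of main memory.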
17. Test 1 result: PostGIS vs MySQL
18. Test 2: Comprehensive test suite
- Create a set of killer polygons so that every line of source code will be touched by running operations.
- Test the query optimizer
- Test the join operator
- Must be done with skewed data
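One possible way to produce skewed test data is sketched below. The slides do not say how the skew should be generated, so this is an assumption: most points are drawn from narrow Gaussian clusters around a few hot spots, and the rest are spread uniformly. The function name `skewed_points` and all parameter values are hypothetical.

```python
import random

def skewed_points(n, extent, centers, spread, hot_fraction=0.9, seed=0):
    """Return n points: hot_fraction clustered near centers, the rest uniform."""
    xmin, ymin, xmax, ymax = extent
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        if rng.random() < hot_fraction:
            # Clustered point: Gaussian around a randomly chosen hot spot,
            # clamped so it stays inside the extent.
            cx, cy = rng.choice(centers)
            x = min(max(rng.gauss(cx, spread), xmin), xmax)
            y = min(max(rng.gauss(cy, spread), ymin), ymax)
        else:
            # Background point: uniform over the whole extent.
            x = rng.uniform(xmin, xmax)
            y = rng.uniform(ymin, ymax)
        points.append((x, y))
    return points

extent = (0.0, 0.0, 100.0, 100.0)
centers = [(20.0, 20.0), (80.0, 60.0)]
points = skewed_points(10000, extent, centers, spread=2.0)

# Count points that fall within 10 units (in x and y) of any hot spot.
near_hot = sum(
    1 for x, y in points
    if any(abs(x - cx) <= 10 and abs(y - cy) <= 10 for cx, cy in centers)
)
print(near_hot, "of", len(points), "points lie near a hot spot")
```

Data like this puts most rows in a small part of the extent, so an optimizer that assumes a uniform distribution is likely to misjudge selectivity for queries that hit the hot spots.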
19. What should be in the buyer's guide
- Performance is not an issue. What are the issues?
- Details of functionality (topology, coordinate transforms)
- Total cost of ownership (open-source vs proprietary)
- Configuration (faster disks or faster CPU)
- Ease of use (2 days of programming = A LOT OF HARDWARE)
- Use of standards (vendor lock-in, system integration)
Can we answer these questions?
20. Discussion