1
Database Optimisation and Database Benchmarking
  • David Nelson
  • April 2007

2
Contents
  • Introduction
  • Database Benchmarking
  • Wisconsin, TPC-A, TPC-B, TPC-C, AS3AP, OO7,
    BUCKY
  • Query Performance and Optimisation
  • Query Processing
  • Indexes
  • Summary and Further Reading

3
Introduction
  • Software and systems development projects include
    performance evaluation work, but it is sometimes not
    sufficient to prevent major performance problems
  • Tools available
  • Prototyping
  • Benchmarking
  • Query processing
  • It's a black art!

4
Database Benchmarking
  • A tool for comparing the performance of DBMSs
  • summarises relative performance in a single figure
  • usually measured in transactions per second (tps)
  • with a cost measure in terms of system cost over
    5 years
  • Two principal uses of benchmarks
  • providing comparative performance measures
  • a source of data and queries that represent
    experimental approximations to other problems
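  • As a hedged illustration (the figures are invented,
    purely to show how the two measures combine): a
    system priced at $500,000 over 5 years that sustains
    100 tps would be reported at 100 tps and
    $500,000 / 100 = $5,000 per tps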

5
Data Generation Approaches
  • Artificial
  • generation of data with entirely artificial
    properties
  • designed to investigate particular aspects of a
    system, e.g. joins
  • e.g. Wisconsin
  • Synthetic Workload
  • produce a simplified version of an application
  • use synthetically generated data with properties
    similar to those of a real system, e.g. a banking
    application
  • e.g. Transaction Processing Performance Council
    (TPC)
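For a flavour of artificial data generation, a minimal
sketch in Oracle-style SQL (the table and column names
are invented, loosely echoing Wisconsin conventions;
CONNECT BY is Oracle-specific):

    -- Generate 10,000 tuples with uniformly distributed attribute values
    CREATE TABLE testtup AS
    SELECT LEVEL - 1            AS unique1,  -- 0..9999, all distinct
           MOD(LEVEL - 1, 10)   AS ten,      -- uniform over 0..9
           MOD(LEVEL - 1, 100)  AS hundred   -- uniform over 0..99
    FROM   dual
    CONNECT BY LEVEL <= 10000;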

6
Benchmarks
  • Relational Benchmarks
  • Artificial: Wisconsin, AS3AP
  • Synthetic: TPC-C (successor of TPC-A and TPC-B)
  • OO Benchmarks
  • Synthetic: oo7 (successor of oo1)
  • Object-Relational Benchmarks
  • Synthetic: BUCKY
  • There are many other benchmarks; these are the
    best known

7
Wisconsin Benchmark
  • First systematic benchmark definition
  • compares particular features of DBMS rather than
    a simple overall performance metric
  • A single-user series of tests, comprising
  • selections and projections with varying
    selectivities on clustered, indexed and
    non-indexed attributes
  • joins of varying selectivities
  • aggregate functions (e.g. min, max, sum)
  • updates/deletions involving key/non-key attributes
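For illustration, a typical Wisconsin-style selection (a
sketch assuming the benchmark's TENKTUP1 relation of
10,000 tuples and its unique2 attribute; the 1%
selectivity is one of several the benchmark varies):

    -- 1% selection on an indexed attribute of the 10,000-tuple relation
    SELECT *
    FROM   tenktup1
    WHERE  unique2 BETWEEN 0 AND 99;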

8
Wisconsin Benchmark
  • Straightforward to implement
  • Scalable, e.g. to parallel architectures
  • Useful, readily understandable results
  • Criticised for its lack of highly skewed attribute
    distributions and its simple join queries

9
TPC-A and TPC-B
  • Measures performance of a simple banking
    transaction
  • from initiation at a terminal until response
    arrives back from server
  • benchmark encompasses time taken by server,
    network and other system components
  • terminals emulated using a negative-exponential
    transaction arrival distribution
  • TPC-B only measures performance of the database
    server
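  • As a hedged aside on the arrival distribution
    (inverse-transform sampling is the usual technique,
    though the specification's exact method is not
    reproduced here): inter-arrival times can be drawn
    as t = -ln(U) / λ, where U is uniform on (0,1] and
    λ is the target transaction arrival rate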

10
TPC-C
  • Based on an order entry application
  • Scope of system similar to TPC-A, but more
    complex, i.e. 9 tables rather than 5, and a
    larger range of queries
  • closer to typical database applications
  • more comprehensive test of DBMS features
  • TPC-A and TPC-B now obsolete

11
TPC-C Schema
(Schema diagram omitted; taken from the TPC-C Standard
Specification, available at www.tpc.org)
12
TPC-C
  • 5 transactions covering:
  • a new order
  • a payment
  • an order status enquiry
  • a delivery
  • a stock level enquiry
  • 10 terminals at each warehouse
  • All 5 transactions available at each terminal
  • terminals produce an equal number of New-Order and
    Payment transactions, plus one Delivery, one
    Order-Status and one Stock-Level transaction for
    every ten New-Order transactions
  • Metric:
  • the number of New-Order transactions executed per
    minute (tpmC)
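For flavour, a simplified sketch of the Stock-Level
transaction's core query (condensed from the TPC-C
specification; :w_id, :d_id, :next_o_id and :threshold
are host variables bound by the application):

    -- Count distinct recently ordered items whose stock is below a threshold
    SELECT COUNT(DISTINCT s_i_id)
    FROM   order_line, stock
    WHERE  ol_w_id = :w_id
      AND  ol_d_id = :d_id
      AND  ol_o_id >= :next_o_id - 20
      AND  ol_o_id <  :next_o_id
      AND  s_w_id  = :w_id
      AND  s_i_id  = ol_i_id
      AND  s_quantity < :threshold;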

13
Other TPC Benchmarks
  • TPC-H
  • Ad-hoc decision support environments
  • TPC-R (now obsolete)
  • Business reporting in decision support
    environments
  • TPC-W (now obsolete)
  • Transactional web benchmark for e-commerce
  • TPC-App
  • Application server and web services benchmark
  • Further info on
  • www.tpc.org

14
AS3AP
  • ANSI SQL Standard Scalable and Portable Benchmark
  • based on Wisconsin
  • a series of two tests
  • single-user: basic features of an RDBMS
  • multi-user: characteristics of DB workloads, e.g.
    parallel and main-memory DB applications
  • metric based on throughput, excluding terminal
    time etc.
  • queries are scalable

15
oo7
  • A benchmark for Object Database Systems
  • An improvement on previous oo1 benchmark
  • Examines the performance characteristics of
    different types of retrieval/traversal, object
    creation/deletion, updates and the query processor
  • A number of sample implementations are provided
  • Based on a complex parts hierarchy
  • Further info
  • ftp.cs.wisc.edu

16
oo7 Tests
  • Test 1
  • Raw traversal speed, traversal with updates,
    operations on manuals
  • Tests with/without full cache
  • Test 2
  • Exact matches, range searches, path lookup, scan,
    make, join
  • Test 3
  • Insert/update a group of composite parts

17
BUCKY
  • An Object-Relational Benchmark
  • Objective
  • To test the key features that put the object into
    object-relational database systems, as defined by
    Stonebraker:
  • inheritance, complex objects, ADTs
  • not triggers
  • Based on a university database schema
  • Exists as an object-relational and a relational
    schema
  • can compare performance tradeoffs between using
    object aspects of DBMS compared to purely
    relational

18
BUCKY Schema
(Schema diagram omitted; taken from The BUCKY
Object-Relational Benchmark, Carey et al.)
19
BUCKY Queries
  • Aim is to test various object queries,
    involving:
  • 1. row types with inheritance
  • 2. inter-object references
  • 3. set-valued attributes
  • 4. methods of row objects
  • 5. ADT attributes and their methods
  • Two BUCKY performance metrics
  • O-R Efficiency Index, for comparing O-R and
    relational implementations
  • O-R Power Rating, for comparing O-R systems
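As an illustrative sketch only (Oracle-style
object-relational syntax with invented names; not the
actual BUCKY workload), a query navigating an
inter-object reference via a REF attribute:

    -- Implicitly dereference the advisor REF from student to professor
    SELECT s.name, s.advisor.name
    FROM   students s
    WHERE  s.advisor.dept = 'Computer Science';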

20
Query Performance
  • Good query performance is necessary for
    acceptable overall performance of an RDBMS
  • Various ways in which this can be achieved
  • De-normalisation of data to reduce joins
  • Creating indexes on frequently retrieved
    attributes
  • Clustering tables to reduce the number of disk
    reads
  • Automatic optimisation of queries
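As a hedged sketch of de-normalisation (using the
hypothetical student/course tables of the example on
the next slides; Oracle-style DDL):

    -- Copy course_name into student so a common lookup avoids the join;
    -- the trade-off is redundant data that must be kept in sync on update
    ALTER TABLE student ADD (course_name VARCHAR2(50));
    UPDATE student s
    SET    s.course_name = (SELECT c.course_name
                            FROM   course c
                            WHERE  c.course_id = s.course_id);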

21
Query Performance
  • Automatic query optimisation can dramatically
    improve query execution time
  • e.g. Consider the simple SQL query:
  • select s.student_no, s.student_name,
    c.course_name
  • from student s, course c
  • where s.course_id = c.course_id
  • and s.age > 25
  • This query executes faster if the selections and
    projections are performed before the join

22
Example
  • Suppose 1000 students, of which only 100 are over
    the age of 25, and 50 courses
  • Alternative 1: join first
  • read the 1000 students, reading all 50 courses
    1000 times (once for each student), and construct
    an intermediate table of 1000 records (which may
    be too large to fit in memory)
  • restrict the result to those over the age of 25
    (100 rows at most)
  • project the result over the required attributes

23
Example
  • Alternative 2: restrict first
  • read the 1000 tuples but restrict to those over
    the age of 25, returning an intermediate table of
    only 100 rows, which has a much better chance of
    fitting in main memory
  • join the result with the course table, again
    returning an intermediate table of only 100 rows
  • project the result over the required attributes
  • Obviously this version is BETTER!
  • Could be improved further by doing the projection
    before the join.
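  • To make the saving concrete (a rough tuple-read
    count under the naive nested-loop reading above,
    ignoring indexes and caching): alternative 1 reads
    1000 + 1000 × 50 = 51,000 tuples before
    restricting, whereas alternative 2 reads
    1000 + 100 × 50 = 6,000, nearly an order of
    magnitude fewer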

24
Query Processing Stages
  • Four stages in query processing
  • 1. Cast the query into internal form
  • normally tree based (relational algebra)
  • 2. Convert to canonical form
  • 3. Choose candidate low-level procedures
  • using indexes, clustering, etc.
  • 4. Generate query plans and choose and run
    the optimal query
  • based on cost formulas and database statistics
  • rule- or cost-based in Oracle
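As a hedged illustration of stage 4 in practice, Oracle
can be asked to record and display the plan it chooses
for the earlier student/course query:

    -- Record the chosen plan, then display it
    EXPLAIN PLAN FOR
      SELECT s.student_no, s.student_name, c.course_name
      FROM   student s, course c
      WHERE  s.course_id = c.course_id
      AND    s.age > 25;

    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);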

25
Query Cast into Internal Form
RESULT
 └─ PROJECT over student_no, student_name, course_name
     └─ RESTRICT where age > 25
         └─ JOIN over course_id
             ├─ S
             └─ C
26
Canonical Form
  • Canonical form
  • given a set Q of queries, and a notion of
    equivalence between two queries q1 and q2 in Q, a
    subset C of Q is a set of canonical forms for Q if
    and only if every query q in Q is equivalent to
    exactly one query c in C
  • the query c is the canonical form of the query q
  • Uses expression transformation rules

27
Expression Transformation Rules
  • Examples (not complete):
  • (A WHERE p1) WHERE p2 ≡ A WHERE p1 AND p2
  • (A PROJECT x,y) PROJECT y ≡ A PROJECT y
  • (A UNION B) PROJECT x ≡ (A PROJECT x) UNION
    (B PROJECT x)
  • (A JOIN B) PROJECT x ≡ (A PROJECT x1) JOIN
    (B PROJECT x2)
  • A JOIN B ≡ B JOIN A
  • (A JOIN B) JOIN C ≡ A JOIN (B JOIN C)
  • (A JOIN B) PROJECT x ≡ A PROJECT x,
    where x is a FK from B to A
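For instance, the first rule corresponds to the
following SQL equivalence (a sketch on the student
table from earlier; the course_id value is invented):

    -- (A WHERE p1) WHERE p2  ≡  A WHERE p1 AND p2
    SELECT * FROM (SELECT * FROM student WHERE age > 25) t
    WHERE  t.course_id = 7;
    -- is equivalent to, and typically rewritten as
    SELECT * FROM student
    WHERE  age > 25 AND course_id = 7;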

28
Choose Candidate Low-Level Procedures
  • Decide how to execute the query represented by
    the converted form
  • Take into consideration
  • Indexes
  • Other physical access paths
  • Distribution of data values
  • Clustering
  • Specify as a series of low-level operations
  • Each low level operation has a set of predefined
    implementation procedures

29
Generate Query Plans/Choose Cheapest
  • Each query plan from stage 3 has a cost formula,
    built from the cost formulas of its low-level
    procedures
  • Oracle supports
  • Rule Based
  • Rank queries according to algebra operations
  • 15 rules in Oracle
  • Cost Based
  • the plan ranked best by the rules may not in fact
    be the cheapest to execute, e.g. because of join
    order
  • need to gather statistics
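The cost-based optimiser needs those statistics
gathered explicitly, e.g. in Oracle (a minimal sketch;
STUDENT is the running example's table):

    -- Gather table, column and index statistics for the cost-based optimiser
    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(
        ownname => USER,
        tabname => 'STUDENT',
        cascade => TRUE);  -- also gather statistics on the table's indexes
    END;
    /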

30
Indexes
  • Indexes can be created on relations to increase
    search/access performance
  • Types of index
  • Primary
  • Clustering
  • Secondary
  • Can be sparse or dense
  • Most DBMSs use B-trees to hold indexes
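For example (a minimal sketch on the running student
table; index names are invented):

    -- Secondary B-tree index on a frequently searched attribute
    CREATE INDEX idx_student_age ON student (age);

    -- Composite index covering the join attribute plus a filter attribute
    CREATE INDEX idx_student_course_age ON student (course_id, age);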

31
Summary
  • Have looked at methods for performance
    optimisation
  • benchmarking is a tool for evaluating and
    comparing performance of real systems
  • the oo7 benchmark is to ODBMSs what Wisconsin is
    to RDBMSs
  • RDBMS systems normally include query optimisers
    to attempt to convert user queries into optimal
    queries
  • indexes and clustering can significantly improve
    query performance

32
Further Reading
  • Bitton, DeWitt and Turbyfill, Benchmarking
    Database Systems: A Systematic Approach, Proc. 9th
    VLDB, 1983.
  • Carey et al., The BUCKY Object-Relational
    Benchmark,
  • http://www.cs.wisc.edu/~naughton/bucky.html
  • Date, chapter on Query Optimisation
  • Connolly and Begg, the OODB chapter has
    information on benchmarking and indexes