Title: Autonomic DBMSs: System Tune Thyself!
1Autonomic DBMSsSystem Tune Thyself!
- Pat MartinDatabase Systems LaboratorySchool of
Computing
Supported by IBM, CITO and NSERC
2Outline of Talk
- The problem system complexity
- The solution autonomic computing systems
- Autonomic DBMSs
- Some current research - tuning multiple buffer
pools - Summary
3The Problem
- Computer systems continually expanded to achieve
greater functionality and efficiency - Expansion has led to a complexity crisis
- Systems are too complex to be managed effectively!
4Can you manage this?
5How about this?
6A Solution Autonomic Computing Systems
- Autonomic Computing Systems, like our nervous
system, manage themselves -
7Autonomic Computing System
- Aware of itself and its environment and acts
accordingly - Able to reconfigure itself under varying and
unpredictable conditions - Able to recover from events that cause it to
malfunction - Able to anticipate optimized resources needed to
perform a task - Able to protect itself
8Autonomic DBMS Project
- Goal is develop a DBMS that can automatically
- Recognize properties of its workload
- Monitor itself with minimal impact on
applications performance - Reallocate resources to improve performance
- Detect and diagnose performance problems
- Recognize and react to changes in its environment
and available resources
9Example Buffer Pool Tuning
- Automatically configure tablespaces to buffer
pools based on an analysis of the database and
the workload (BP Configuration Problem) - Dynamically adjust sizes of buffer pools to
minimize I/O costs for the database and workload
(BP Sizing Problem)
10Multiple Buffer Pools
logical access
physical write
physical read
index
item
warehouse
customer
11BP Configuration Problem
- Given a set of database objects and a workload,
determine a mapping of database objects to buffer
pools to maximize performance for the given
workload.
12Configuration Rules of Thumb
- Separate data and indexes
- Isolate a large data table
- Separate objects that are updated frequently and
objects that are primarily read - Put temporary tables in their own BP
- Separate small frequently accessed tables from
larger tables that are scanned - Isolate tables that are accessed frequently by
short updates
13BPConfig Approach
- Analyze logical page reference trace
- obtain trace of workload on default configuration
- derive access patterns for DB objects
- random, re-reference and sequential accesses
- Create characterization vectors
- type, access patterns, read/write info, size info
- Partition DB objects into buffer pools
- cluster based on characterization vectors
14Partitioning DB Objects
- Partition using k-means clustering algorithm
- Similarity measured by weighted Euclidean
distance - Considered different weighting schemes
- equal
- favour read/write
- favour access pattern
15Experiments
- Experimental environment
- IBM Netfinity 8500R 4 900 MHz PIII Xeon CPU, 16
GB RAM, 70 disks, Windows NT - TPC-C benchmark OLTP workload, 400 warehouse (40
GB) database - DB2 Version 7.1
- 100,000 4K pages for the buffer pools
16Experiments (cont.)
- Configuration schemes
- BPConfig, expert, default (1BP), random,
distributed (1 BP per DB object) - Evaluation criteria
- Weighted Response Time
- TPM
- Physical Reads
17Experiments (cont.)
- Properties of BPConfig configurations (3 buffer
pools) - separates index and data objects
- separates heavy access and light access objects
- WID tables isolated (equal and read/write
weightings)
18Experiments (cont.)
Equal Weight Read/Write AccessPattern Expert Random Default Dist
WRT 11.11 11.20 10.86 10.95 10.95 14.05 12.50
TPM 8129 8047 8331 8287 8287 6371 7159
PR 5.6 4.7 4.6 4.6 4.6 10.4 8.1
19BP Sizing Problem
- Given a workload, a set of buffer pools and a
fixed number of buffer pages, determine the
appropriate size of each buffer pool to maximize
performance for the given workload.
20Approaches to Sizing BPs Class-based
Optimization
- Specify performance goals for each transaction
class - Algorithm tries to satisfy goals
- Logical access cost proportional to physical
access cost - Physical access cost determined by buffer pool
miss rates
21Class-based Optimization (cont.)
- Collect performance data
- Choose target class
- Loop until goal metChoose target buffer
poolChoose source buffer poolReallocate pages - End
Ti with worstperformance
BP with greatestbenefit
BP with leastcost
22Class-based Optimization (cont.)
- Problems
- How do we select appropriate performance goals
for a class? - Some classes may be favoured over others
- Thrashing between buffer pool states is a
possibility
23Approaches to Sizing BPs System-based
Optimization
- BP sizes chosen to maximize system performance
metric, eg. throughput - Use a simple greedy algorithm
- Considered 2 cost functions
- Minimize hit rate
- Minimize data access time (physical reads dont
all cost the same!)
24System-based Optimization - Experiments
- Experimental environment
- IBM xSeries 240 PC Server 2 1 GHz PIII CPUs, 2
GB RAM, 22 disks, Windows NT - TPC-C benchmark
- DB2 Version 7.1
- 50,000 4K buffer pool pages
- 3 buffer pools configured with BPConfig
25Experiments (cont.)
DAT-Based HR-Based
BP Sizing lt25000, 4000, 21000gt lt19000, 5000, 26000gt
WHR 0.9308 0.9342
WcostLR 1.5375 1.5639
TPM 4493 4318
26(No Transcript)
27Other AutoDBA Projects
- Automatic diagnosis
- Automatic recognition of workload type
- Integration of BPConfig and sizing algorithm
- Automatic BP management in PostgreSQL
- Tools for DBMS capacity planning
28AutoDBA Project Members
- Queens
- Wendy Powley, Darcy Benoit, Said Elnaffar, Wenhu
Tian, Xiaoyi Xu, Xilin Cui, Ted Wasserman, Nailah
Ogeer - IBM
- Berni Schiefer, Sam Lightstone, Randy Horman,
Robin Van Boeschoten, Keri Romanufa, Calisto
Zuzarte