Title: Tuning Database Configuration Parameters with iTuned
Slide 1: Tuning Database Configuration Parameters with iTuned
- Vamsidhar Thummala
- Collaborators: Songyun Duan, Shivnath Babu
- Duke University
Slide 2: Performance Tuning of Database Systems
- Physical design tuning
  - Indexes [SIGMOD'98, VLDB'04]
  - Materialized views [SIGMOD'00, VLDB'04]
  - Partitioning [SIGMOD'82, SIGMOD'88, SIGMOD'89]
- Statistics tuning [ICDE'00, ICDE'07]
- SQL query tuning [VLDB'04]
- Configuration parameter (server parameter) tuning (this talk)
Slide 3: Database Configuration Parameters
- Parameters that control (illustrated below):
  - Memory distribution
    - shared_buffers, work_mem
  - I/O optimization
    - wal_buffers, checkpoint_segments, checkpoint_timeout, fsync
  - Parallelism
    - max_connections
  - Optimizer's cost model
    - effective_cache_size, random_page_cost, default_statistics_target, enable_indexscan
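
As a concrete illustration of what one configuration setting looks like, here is a minimal Python sketch of a parameter vector over the PostgreSQL parameters named above; the specific values are hypothetical examples for this sketch, not tuning recommendations.

```python
# One configuration setting represented as a parameter vector.
# The parameter names are PostgreSQL's; the values are hypothetical
# illustrations, not tuning advice.
example_setting = {
    # Memory distribution
    "shared_buffers": "256MB",
    "work_mem": "16MB",
    # I/O optimization
    "wal_buffers": "1MB",
    "checkpoint_segments": 16,
    "checkpoint_timeout": "5min",
    "fsync": "on",
    # Parallelism
    "max_connections": 100,
    # Optimizer's cost model
    "effective_cache_size": "768MB",
    "random_page_cost": 4.0,
    "default_statistics_target": 100,
    "enable_indexscan": "on",
}
```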
Slide 4: Need for Automated Configuration Parameter Tuning (1/2)
Slide 5: Need for Automated Configuration Parameter Tuning (2/2)
- Number of threads related to configuration parameter tuning vs. other topics on the PostgreSQL performance mailing list
- Recently, there has been some effort from the community to summarize the important parameters [PgCon'08]
Slide 6: Typical Approach: Trial and Error
Slide 7: Doing Experiments to Understand the Underlying Response Surface
- TPC-H: 4 GB database, 1 GB memory, Query 18
Slide 8: Challenges
- Large number of configuration parameters
  - Around 100 in total
  - 10-15 are important, depending on OLTP vs. OLAP
  - Brute force will not work: it results in an exponential number of experiments
- Parameters can have complex interactions
  - Sometimes non-monotonic and counterintuitive
  - This limits the one-parameter-at-a-time approach
- No holistic configuration tuning tools
  - Existing techniques focus on specific memory-related parameters or recommend default settings
Slide 9: Our Solution: iTuned
- A practical tool that uses planned experiments to tune configuration parameters
- An adaptive sampling algorithm to plan the sequence of experiments (Planner)
- A novel workbench for conducting experiments in enterprise systems (Executor)
- Features for scalability, such as sensitivity analysis and parallel experiments, to reduce:
  - The total number of experiments
  - The per-experiment running time/cost
Slide 10: Outline of the talk
- iTuned Planner
- iTuned Executor
- Evaluation
- Conclusion
Slide 11: Problem Abstraction
- Given:
  - A database D and workload W
  - A configuration parameter vector X = <x1, x2, ..., xm>
  - A cost budget R
- Goal: find a high-performance setting X* subject to the budget constraint
- Problem: the response surface y = f(X) is unknown
- Solution approach:
  - Conduct experiments to learn about the response surface
  - Each experiment has some cost and gives a sample <Xi, yi>
Slide 12: iTuned Planner
- Uses an adaptive sampling algorithm (sketched below)
- Stopping criterion: based on the cost budget R
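
A minimal sketch of the planner's adaptive sampling loop, assuming hypothetical helpers run_experiment, fit_gp, and argmax_eip passed in by the caller (these names are illustrative, not iTuned's actual API); the Gaussian Process and EIP internals are sketched on the slides that follow.

```python
# Sketch of the planner's adaptive sampling loop under the assumptions above.
def plan_experiments(initial_settings, run_experiment, fit_gp, argmax_eip,
                     budget_R, cost_per_experiment):
    # Bootstrap with a few initial experiments, then sample adaptively.
    samples = [(X, run_experiment(X)) for X in initial_settings]
    spent = cost_per_experiment * len(samples)
    # Stopping criterion: the cost budget R.
    while spent + cost_per_experiment <= budget_R:
        model = fit_gp(samples)               # model the response surface
        X_next = argmax_eip(model, samples)   # setting with highest EIP
        samples.append((X_next, run_experiment(X_next)))
        spent += cost_per_experiment
    # Best setting found: the one with the lowest measured running time.
    return min(samples, key=lambda s: s[1])
```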
Slide 13: Improvement of an Experiment
- The improvement IP(X) is defined as:
  - IP(X) = y(X*) - y(X) if y(X) < y(X*), where X* is the best setting found so far
  - IP(X) = 0 otherwise
- Issue: IP(X) is known only after y(X) is known, i.e., an experiment has to be conducted at X to measure y(X)
- We estimate IP(X) by calculating the Expected Improvement, EIP(X)
- To calculate EIP(X), we need to approximate the improvement at each configuration setting
  - This requires a probability density function of y(X) (an uncertainty estimate)
Slide 14: Conducting an Experiment at X_next using Expected Improvement
[Figure: 1D projection of the response surface and its EIP curve]
Slide 15: Generating the pdf through a Gaussian Process
- We estimate the performance as y(X) = g(X) + Z(X)
  - where g(X) is a regression model and Z(X) is the residual of the model, captured through a Gaussian Process
- The Gaussian Process Z(X) captures the uncertainty of the surface
  - It is specified by mean and covariance functions
  - We use a zero-mean Gaussian Process
- The covariance is a kernel function that inversely depends on the distance between two samples Xi and Xj
  - Residuals at nearby points exhibit higher correlation
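
A minimal numpy sketch of such a zero-mean Gaussian Process with a squared-exponential kernel (one common kernel that inversely depends on distance; iTuned's exact kernel and the regression term g(X) may differ).

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Covariance that inversely depends on the distance between settings,
    so residuals at nearby points exhibit higher correlation."""
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(X_obs, y_obs, X_new, noise=1e-6, length_scale=1.0):
    """Zero-mean GP: posterior mean and variance of y at each row of X_new."""
    K = rbf_kernel(X_obs, X_obs, length_scale) + noise * np.eye(len(X_obs))
    K_star = rbf_kernel(X_new, X_obs, length_scale)
    K_inv = np.linalg.inv(K)
    mu = K_star @ K_inv @ y_obs                              # posterior mean
    var = 1.0 - np.sum((K_star @ K_inv) * K_star, axis=1)    # prior variance is 1
    return mu, np.maximum(var, 0.0)
```

For example, `mu, var = gp_posterior(X_obs, y_obs, X_grid)` yields the per-setting mean and uncertainty estimate that the EIP computation on the next slide consumes.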
Slide 16: Calculating Expected Improvement using the Gaussian Process
- Lemma: the Gaussian Process models y(X) as a univariate Gaussian with mean μ(X) and variance σ²(X)
- Theorem: there exists a closed form for EIP(X)
  - See the paper for the proof and details
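
The paper's exact derivation is not reproduced here, but the standard closed form for expected improvement under a univariate Gaussian, which matches the lemma's setup, looks like this (a sketch, minimizing running time y):

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, y_best):
    """EIP(X) = E[max(y_best - y(X), 0)] when y(X) ~ N(mu, sigma^2).

    Standard closed form: (y_best - mu) * Phi(u) + sigma * phi(u),
    where u = (y_best - mu) / sigma, Phi/phi are the standard normal
    cdf/pdf, and lower y (running time) is better.
    """
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    safe_sigma = np.where(sigma > 0, sigma, 1.0)   # guard against division by zero
    u = (y_best - mu) / safe_sigma
    eip = (y_best - mu) * norm.cdf(u) + safe_sigma * norm.pdf(u)
    # With zero uncertainty, the improvement is the deterministic gain, if any.
    return np.where(sigma > 0, eip, np.maximum(y_best - mu, 0.0))
```

The planner's argmax_eip step would evaluate this over candidate settings, using μ(X) and σ(X) from the Gaussian Process, and experiment next at the maximizer.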
Slide 17: Tradeoff between Exploration and Exploitation
- Settings X with high EIP(X) are either:
  - Close to known good settings
    - Assists exploitation
  - In highly uncertain regions
    - Assists exploration
- The Gaussian Process tries to achieve a balance between exploration and exploitation
Slide 18: Outline of the talk
- iTuned Planner
- iTuned Executor
- Evaluation
- Conclusion
Slide 19: Goal of the Executor
- To conduct experiments:
  - Without impacting the production system
  - As close to real production runs as possible
- Traditional choices:
  - The production system itself
    - May impact running applications
  - A test system
    - Hard to replicate exact production settings
    - Requires manual set-up
Slide 20: iTuned Executor
- Exploits underutilized resources to conduct experiments
  - Production systems, standby systems, test systems, the cloud
- Design:
  - Mechanisms: home/garage containers, efficient snapshots of data
  - Policies: specified by admins (see the sketch below)
    - Example: if CPU, memory, and disk utilization have been below 20% for the past 10 minutes, then 70% of resources can be taken for experiments
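
A hypothetical sketch of how such a policy check might look; the function name and the sampling interface are assumptions for illustration, not iTuned's actual policy engine.

```python
def experiment_resource_share(utilization_samples, threshold=0.20, share=0.70):
    """Hypothetical policy check: `utilization_samples` is a list of
    (cpu, mem, disk) utilization fractions collected over the past
    10 minutes. If every sample stayed below the threshold, grant the
    configured share of resources to experiments; otherwise grant none."""
    if all(max(sample) < threshold for sample in utilization_samples):
        return share
    return 0.0
```

For example, `experiment_resource_share([(0.05, 0.10, 0.08), (0.04, 0.11, 0.07)])` returns 0.70 only when the machine stayed idle for the whole window.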
Slide 21: Example Mechanism: Set-up on a Standby System using ZFS, Solaris, and PITR
[Architecture diagram: the production environment (middle tier and database) ships write-ahead logs to a standby machine; copy-on-write ZFS snapshots carve out a standby-side experiment environment driven by a policy manager, an interface, an experiment planner/scheduler, and an engine]
Slide 22: Outline of the talk
- iTuned Planner
- iTuned Executor
- Evaluation
- Conclusion
Slide 23: Empirical Evaluation (1)
- Two database systems: PostgreSQL v8.2 and MySQL v5.0
- Cluster of machines with 2 GHz processors and 3 GB RAM
- Mixture of workloads:
  - OLAP: mixes of TPC-H queries
    - Varying queries, query types, and MPL (multiprogramming level)
    - Varying scale factors (SF = 1 to SF = 10)
  - OLTP: TPC-W and RUBiS
- Number of parameters varied: up to 30
Slide 24: Empirical Evaluation (2)
- Techniques compared:
  - Default parameter settings as shipped (D)
  - Manual rule-based tuning (M)
  - Smart Hill Climbing (S)
    - State-of-the-art technique
  - Brute-force search (B)
    - Runs many experiments to find an approximation to the optimal setting
  - iTuned (I)
- Evaluation metrics:
  - Quality: workload running time after tuning
  - Efficiency: time needed for tuning
Slide 25: Comparison of Tuning Quality
[Results figures: a simple workload with one TPC-H query (Q1) and a complex workload with a mix of TPC-H queries (Q1 + Q18)]
Slide 26: iTuned's Efficiency and Scalability
- Run experiments in parallel
- Abort low-utility experiments early (sketched below)
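
One plausible early-abort rule, sketched under assumptions: once an experiment's elapsed time exceeds the best completed running time, it can no longer yield an improvement, so it can be stopped. The process-handle interface (poll()/kill(), as in Python's subprocess module) and the exact rule are assumptions, not the paper's.

```python
import time

def run_with_early_abort(start_workload, best_time_so_far, poll_interval=1.0):
    """Abort an experiment once it can no longer beat the best-known setting.

    start_workload is a hypothetical helper that launches the workload and
    returns a process-like handle with poll()/kill().
    """
    start = time.time()
    proc = start_workload()
    while proc.poll() is None:                    # workload still running
        if time.time() - start > best_time_so_far:
            proc.kill()                           # low-utility: abort early
            return None
        time.sleep(poll_interval)
    return time.time() - start                    # measured running time
```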
Slide 27: iTuned's Sensitivity Analysis
- Identify important parameters quickly
- Use sensitivity analysis to reduce the number of experiments (sketched below)
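
A generic main-effect estimate of parameter sensitivity, sketched under assumptions (the paper's exact sensitivity-analysis formulation may differ): sweep one parameter across its range while averaging the fitted model's predictions over the remaining parameters, and rank parameters by the spread of the resulting curve. `predict_mean` is assumed to be a closure over the fitted GP, e.g., built from the Slide 15 sketch.

```python
import numpy as np

def sensitivity_scores(predict_mean, bounds, n_grid=20, n_background=100, seed=0):
    """Rank parameters by how much the predicted surface moves along each axis."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    background = rng.uniform(lo, hi, size=(n_background, len(lo)))
    scores = []
    for j in range(len(lo)):
        curve = []
        for v in np.linspace(lo[j], hi[j], n_grid):
            pts = background.copy()
            pts[:, j] = v                        # pin parameter j, vary the rest
            curve.append(predict_mean(pts).mean())
        scores.append(max(curve) - min(curve))   # spread = sensitivity of j
    return np.array(scores)                      # larger score = more important
```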
Slide 28: Related Work
- Parameter tuning
  - Focus on specific classes of parameters (mainly memory-related, such as buffer pools) [ACM TOS'08, VLDB'06]
  - Statistical approach for ranking parameters [SMDB'08]
  - Brute-force approach to experiment design
  - Tools like the DB2 Configuration Advisor and pg_tune recommend default settings
  - Adaptive approaches to sampling [SIGMETRICS'06]
- Work related to iTuned's executor
  - Oracle SQL Performance Analyzer [SIGMOD'09, ICDE'09]
  - Virtualization, snapshots, suspend-resume
Slide 29: Conclusion
- iTuned automates the tuning process by adaptively conducting experiments
- Our initial results are promising
- Future work:
  - Apply database-specific knowledge to keep the optimizer in the loop for end-to-end tuning
    - Query plan information
    - Workload compression
  - Experiments in the cloud
Slide 30: Questions?