Title: Automated Parameter Setting Based on Runtime Prediction
1. Automated Parameter Setting Based on Runtime Prediction
- Towards an Instance-Aware Problem Solver
Frank Hutter, Univ. of British Columbia, Vancouver, Canada
Youssef Hamadi, Microsoft Research, Cambridge, UK
2. Motivation (1): Why automated parameter setting?
- We want to use the best available heuristic for a problem
  - Strong domain-specific heuristics in tree search
  - Domain knowledge helps to pick good heuristics
  - But maybe you don't know the domain ahead of time ...
  - Local search parameters must be tuned
  - Performance depends crucially on the parameter setting
- New application/algorithm
  - Restart parameter tuning from scratch
  - Waste of time both for researchers and practitioners
- Comparability
  - Is algorithm A faster than algorithm B, or was it just tuned more carefully?
3. Motivation (2): Operational scenario
- A CP solver has to solve instances from a variety of domains
  - Domains not known a priori
- Solver should automatically use the best strategy for each instance
- Want to learn from the instances we solve
4. Overview
- Previous work on runtime prediction we build on [Leyton-Brown, Nudelman et al. '02, '04]
- Part I: Automated parameter setting based on runtime prediction
- Part II: Incremental learning for runtime prediction in a priori unknown domains
- Experiments
- Conclusions
5. Previous work on runtime prediction for algorithm selection
- General approach
  - Portfolio of algorithms
  - For each instance, choose the algorithm that promises to be fastest
- Examples
  - [Lobjois and Lemaître, AAAI '98]: CSP
    - Mostly propagations of different complexity
  - [Leyton-Brown et al., CP '02]: Combinatorial auctions
    - CPLEX + 2 other algorithms (which were thought uncompetitive)
  - [Nudelman et al., CP '04]: SAT
    - Many tree-search algorithms from the last SAT competition
    - On average considerably faster than each single algorithm
6. Runtime prediction: Basics (1 algorithm) [Leyton-Brown, Nudelman et al. '02, '04]
- Training (expensive): given a set of t instances z_1, ..., z_t
  - For each instance z_i:
    - Compute features x_i = (x_i1, ..., x_im)
    - Run the algorithm to get its runtime y_i
  - Collect (x_i, y_i) pairs
  - Learn a function f: X → ℝ (features → runtime) with y_i ≈ f(x_i)
- Test (cheap): given a new instance z_{t+1}
  - Compute features x_{t+1}
  - Predict runtime y_{t+1} = f(x_{t+1})
7. Runtime prediction: Linear regression [Leyton-Brown, Nudelman et al. '02, '04]
- The learned function f has to be linear in the features x_i = (x_i1, ..., x_im)
  - y_i ≈ f(x_i) = Σ_{j=1..m} x_ij · w_j = x_i · w
- The learning problem thus reduces to fitting the weights w = (w_1, ..., w_m) (see the sketch after this slide)
- To better capture the vast differences in runtime, estimate the logarithm of the runtime
  - e.g. y_i = 5 ↔ runtime is 10^5 sec
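For concreteness, here is a minimal sketch (not from the slides) of fitting the weights by least squares on log runtimes; the feature matrix and runtimes are made-up placeholder values.

```python
import numpy as np

# Hypothetical training data: 3 instances with 3 features each,
# plus their measured runtimes in seconds (placeholder values).
X = np.array([[1.0, 0.2, 4.3],      # features x_1
              [1.0, 0.8, 3.1],      # features x_2
              [1.0, 0.5, 5.0]])     # features x_3
runtimes = np.array([12.0, 950.0, 3.5])

# Work with log10 runtime, so that y = 5 corresponds to 10^5 sec.
y = np.log10(runtimes)

# Fit weights w with linear least squares: y_i ~ x_i . w
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predict the (log) runtime of a new instance from its features.
x_new = np.array([1.0, 0.4, 4.0])
log_runtime_pred = x_new @ w
print(f"predicted runtime: {10 ** log_runtime_pred:.1f} sec")
```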
8. Runtime prediction: Feature engineering [Leyton-Brown, Nudelman et al. '02, '04]
- Features can be computed quickly (in seconds)
  - Basic properties like #vars, #clauses, their ratio
  - Estimates of search space size
  - Linear programming bounds
  - Local search probes
- Linear functions are not very powerful
  - But you can use the same methodology to learn more complex functions
  - Let φ = (φ_1, ..., φ_q) be arbitrary combinations of the features x_1, ..., x_m (so-called basis functions)
  - Learn a linear function of the basis functions: f(φ) = φ · w
- Basis functions used in [Nudelman et al. '04] (expansion sketch after this slide)
  - Original features x_i
  - Pairwise products of features x_i · x_j
  - Only a subset of these (drop useless basis functions)
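A minimal sketch of the quadratic basis-function expansion described above (original features plus pairwise products); the feature-selection step that drops useless basis functions is omitted here.

```python
import numpy as np

def quadratic_basis(x):
    """Map raw features x to basis functions: the original features
    plus all pairwise products x_i * x_j (with i <= j)."""
    x = np.asarray(x, dtype=float)
    products = [x[i] * x[j] for i in range(len(x)) for j in range(i, len(x))]
    return np.concatenate([x, products])

# Example: 3 raw features -> 3 + 6 = 9 basis functions.
phi = quadratic_basis([2.0, 0.5, 4.3])
print(phi.shape)  # (9,)
```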
9. Algorithm selection based on runtime prediction [Leyton-Brown, Nudelman et al. '02, '04]
- Given n different algorithms A_1, ..., A_n
- Training (really expensive)
  - Learn n separate functions f_j: Φ → ℝ, j = 1...n
- Test (cheap)
  - Predict runtime y^j_{t+1} = f_j(φ_{t+1}) for each of the algorithms
  - Choose the algorithm A_j with minimal y^j_{t+1} (see the sketch after this slide)
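The selection step then reduces to evaluating each learned model and taking the argmin. This hedged sketch assumes per-algorithm weight vectors have already been fit (they are hypothetical here) and reuses the quadratic_basis helper from the previous sketch.

```python
import numpy as np

def select_algorithm(x_new, weight_vectors):
    """Predict the log runtime of each algorithm on the new instance
    and return the index of the algorithm with the smallest prediction."""
    phi = quadratic_basis(x_new)                      # basis functions of the new instance
    predictions = [phi @ w_j for w_j in weight_vectors]
    return int(np.argmin(predictions))

# Hypothetical usage with 3 already-learned per-algorithm weight vectors:
# best = select_algorithm(x_new, [w_A1, w_A2, w_A3])
```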
10. Overview
- Previous work on runtime prediction we build on [Leyton-Brown, Nudelman et al. '02, '04]
- Part I: Automated parameter setting based on runtime prediction
- Part II: Incremental learning for runtime prediction in a priori unknown domains
- Experiments
- Conclusions
11. Parameter setting based on runtime prediction
- Finding the best default parameter setting for a problem class
  - Generate special-purpose code [Minton '93]
  - Minimize estimated error [Kohavi & John '95]
  - Racing algorithm [Birattari et al. '02]
  - Local search [Hutter '04]
  - Experimental design [Adenso-Díaz & Laguna '05]
  - Decision trees [Srivastava & Mediratta '05]
- Runtime prediction for algorithm selection on a per-instance basis
  - Predict runtime for each algorithm and pick the best [Leyton-Brown, Nudelman et al. '02, '04]
- Runtime prediction for setting parameters on a per-instance basis (this work)
12. Naive application of runtime prediction for parameter setting
- Given one algorithm with n different parameter settings P_1, ..., P_n
- Training (too expensive)
  - Learn n separate functions f_j: Φ → ℝ, j = 1...n
- Test (fairly cheap)
  - Predict runtime y^j_{t+1} = f_j(φ_{t+1}) for each of the parameter settings
  - Run the algorithm with the setting P_j with minimal y^j_{t+1}
- If there are too many parameter configurations
  - Cannot run each parameter setting on each instance
  - Need to generalize (cf. human parameter tuning)
  - With separate functions there is no way to generalize
13. Generalization by parameter sharing
- Naive approach: n separate functions
  - Information on the runtime of setting i cannot inform predictions for setting j ≠ i
- Our approach: 1 single function
  - Information on the runtime of setting i can inform predictions for setting j ≠ i
14. Application of runtime prediction for parameter setting
- View the parameters as additional features and learn a single function
- Training (moderately expensive): given a set of instances z_1, ..., z_t
  - For each instance z_i:
    - Compute features x_i
    - Pick some parameter settings p_1, ..., p_n
    - Run the algorithm with settings p_1, ..., p_n to get runtimes y^1_i, ..., y^n_i
    - Basis functions φ^1_i, ..., φ^n_i include the parameter settings
    - Collect pairs (φ^j_i, y^j_i) (n data points per instance)
  - Learn only a single function g: Φ → ℝ
- Test (cheap): given a new instance z_{t+1}
  - Compute features x_{t+1}
  - Search over parameter settings p_j; to evaluate p_j, compute φ^j_{t+1} and check g(φ^j_{t+1})
  - Run with the best predicted parameter setting p* (see the sketch after this slide)
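A minimal sketch of this test-time search, assuming a single weight vector w for g has already been learned on joint (features, parameters) basis functions; the candidate grid and the joint basis-function mapping are illustrative assumptions, not the exact choices from the paper.

```python
import numpy as np
from itertools import product

def joint_basis(x, p):
    """Illustrative basis functions over instance features x and a
    parameter setting p: both raw vectors plus all cross products."""
    x, p = np.asarray(x, float), np.asarray(p, float)
    cross = np.outer(x, p).ravel()
    return np.concatenate([x, p, cross])

def best_setting(x_new, w, candidate_values):
    """Exhaustively search a grid of parameter settings and return the
    one with the smallest predicted (log) runtime under g(phi) = phi . w."""
    best_p, best_pred = None, np.inf
    for p in product(*candidate_values):
        pred = joint_basis(x_new, p) @ w
        if pred < best_pred:
            best_p, best_pred = p, pred
    return best_p

# Hypothetical usage, e.g. for two SAPS-like parameters (alpha, rho):
# p_star = best_setting(x_new, w, [[1.1, 1.2, 1.3], [0.5, 0.6, 0.7, 0.8]])
```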
15. Summary of automated parameter setting based on runtime prediction
- Learn a single function that maps features and parameter settings to runtime
- Given a new instance
  - Compute the features (they are fixed)
  - Search for the parameter setting that minimizes the predicted runtime for these features
16. Overview
- Previous work on runtime prediction we build on [Leyton-Brown, Nudelman et al. '02, '04]
- Part I: Automated parameter setting based on runtime prediction
- Part II: Incremental learning for runtime prediction in a priori unknown domains
- Experiments
- Conclusions
17. Problem setting: Incremental learning for multiple domains
18. Solution: Sequential Bayesian linear regression
- Update knowledge as new data arrives: maintain a probability distribution over the weights w
- Incremental (one (x_i, y_i) pair at a time)
  - Seamlessly integrates new data
- Optimal: yields the same result as a batch approach
- Efficient
  - Computation: 1 matrix inversion per update
  - Memory: can drop data once it has been integrated
- Robust
- Simple to implement (3 lines of Matlab)
- Provides estimates of the uncertainty in each prediction
19. What are uncertainty estimates?
20. Sequential Bayesian linear regression: Intuition
- Instead of predicting a single runtime y, use a probability distribution P(Y)
- The mean of P(Y) is exactly the prediction of the non-Bayesian approach, but we also get uncertainty estimates
[Figure: probability distribution P(Y) over log runtime Y, with the mean predicted runtime and the uncertainty of the prediction marked]
21. Sequential Bayesian linear regression: Technical
- Standard linear regression
  - Training: given training data Φ_{1:n}, y_{1:n}, fit the weights w such that y_{1:n} ≈ Φ_{1:n} w
  - Prediction: y_{n+1} = φ_{n+1} · w
- Bayesian linear regression
  - Training: given training data Φ_{1:n}, y_{1:n}, infer the probability distribution P(w | Φ_{1:n}, y_{1:n}) ∝ P(w) · Π_i P(y_i | φ_i, w)
  - Prediction: P(y_{n+1} | φ_{n+1}, Φ_{1:n}, y_{1:n}) = ∫ P(y_{n+1} | w, φ_{n+1}) P(w | Φ_{1:n}, y_{1:n}) dw
  - Knowledge about the weights: Gaussian N(μ_w, Σ_w) (closed-form formulas below)
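For completeness, with a Gaussian prior N(μ_0, Σ_0) on w and Gaussian observation noise of variance σ² (the noise model is not spelled out on the slide, so σ² is an assumption here), the standard posterior and predictive distributions are:

```latex
% Posterior over the weights after observing \Phi_{1:n}, y_{1:n}
\Sigma_n^{-1} = \Sigma_0^{-1} + \tfrac{1}{\sigma^2}\,\Phi_{1:n}^\top \Phi_{1:n},
\qquad
\mu_n = \Sigma_n \left( \Sigma_0^{-1}\mu_0 + \tfrac{1}{\sigma^2}\,\Phi_{1:n}^\top y_{1:n} \right)

% Predictive distribution for a new basis-function vector \phi_{n+1}
P(y_{n+1} \mid \phi_{n+1}, \Phi_{1:n}, y_{1:n})
= \mathcal{N}\!\left( \phi_{n+1}^\top \mu_n,\; \sigma^2 + \phi_{n+1}^\top \Sigma_n \,\phi_{n+1} \right)
```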
22. Sequential Bayesian linear regression: Visualized
- Start with a prior P(w) with very high uncertainty
- First data point (φ_1, y_1)
  - P(w | φ_1, y_1) ∝ P(w) · P(y_1 | φ_1, w) (update sketch after this slide)
[Figure: three curves over a weight w_i: the prior P(w_i), the likelihood P(y_1 | φ_1, w), and the posterior P(w_i | φ_1, y_1)]
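A small sketch of such a sequential update, under the same assumed Gaussian prior/noise model as above (mu, Sigma hold the current belief about w; sigma2 is the assumed noise variance). Each new (phi, y) pair is folded in with one rank-1 update and can then be discarded.

```python
import numpy as np

def bayes_update(mu, Sigma, phi, y, sigma2=1.0):
    """Sequential Bayesian linear regression update for one data point.
    Returns the new posterior mean and covariance of the weights."""
    phi = np.asarray(phi, float).reshape(-1, 1)
    Sigma_inv_new = np.linalg.inv(Sigma) + (phi @ phi.T) / sigma2
    Sigma_new = np.linalg.inv(Sigma_inv_new)
    mu_new = Sigma_new @ (np.linalg.inv(Sigma) @ mu + phi.flatten() * y / sigma2)
    return mu_new, Sigma_new

# Start from a vague prior over q weights and integrate points one by one.
q = 3
mu, Sigma = np.zeros(q), 100.0 * np.eye(q)
for phi_i, y_i in [(np.array([1.0, 0.2, 4.3]), 1.1),
                   (np.array([1.0, 0.8, 3.1]), 3.0)]:
    mu, Sigma = bayes_update(mu, Sigma, phi_i, y_i)

# Predictive mean and variance for a new basis-function vector.
phi_new = np.array([1.0, 0.5, 4.0])
pred_mean = phi_new @ mu
pred_var = 1.0 + phi_new @ Sigma @ phi_new   # sigma2 + phi^T Sigma phi
```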
23. Summary of incremental learning for runtime prediction
- Maintain a probability distribution over the weights
  - Start with a Gaussian prior, incrementally update it with more data
- Given the Gaussian weight distribution, the predictions are also Gaussian
  - We know how uncertain our predictions are
- For new domains, we will be very uncertain and only grow more confident after having seen a couple of data points
24. Overview
- Previous work on runtime prediction we build on [Leyton-Brown, Nudelman et al. '02, '04]
- Part I: Automated parameter setting based on runtime prediction
- Part II: Incremental learning for runtime prediction in a priori unknown domains
- Experiments
- Conclusions
25. Domain for our experiments
- SAT
  - Best-studied NP-hard problem
  - Good features already exist [Nudelman et al. '04]
  - Lots of benchmarks
- Stochastic Local Search (SLS)
  - Runtime prediction has never been done for SLS before
  - Parameter tuning is very important for SLS
  - Parameters are often continuous
- SAPS algorithm [Hutter, Tompkins, Hoos '02]
  - Still amongst the state of the art
  - Default setting not always best
  - Well, I also know it well ;-)
- But the approach is applicable to just about anything, whenever we can compute features!
26. Stochastic Local Search for SAT: Scaling and Probabilistic Smoothing (SAPS) [Hutter, Tompkins, Hoos '02]
- Clause-weighting algorithm for SAT, was state of the art in 2002
- Start with all clause weights set to 1
- Hill-climbing until you hit a local minimum
- In local minima (weight-update sketch after this slide)
  - Scaling: scale the weights of unsatisfied clauses, w_c ← α · w_c
  - Probabilistic smoothing: with probability P_smooth, smooth all clause weights, w_c ← ρ · w_c + (1-ρ) · average(w_c)
- Default parameter setting (α, ρ, P_smooth) = (1.3, 0.8, 0.05)
- P_smooth and ρ are very closely related
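A hedged sketch of just the local-minimum weight update described on this slide (scaling, then smoothing with probability P_smooth); the surrounding hill-climbing search and the clause representation are omitted, and unsat_clauses is a placeholder.

```python
import random

def saps_local_minimum_update(weights, unsat_clauses,
                              alpha=1.3, rho=0.8, p_smooth=0.05):
    """Update clause weights in a local minimum, as on this slide:
    scale the weights of unsatisfied clauses, then with probability
    P_smooth smooth all weights towards their average."""
    # Scaling: w_c <- alpha * w_c for every currently unsatisfied clause c
    for c in unsat_clauses:
        weights[c] *= alpha
    # Probabilistic smoothing: w_c <- rho * w_c + (1 - rho) * average weight
    if random.random() < p_smooth:
        avg = sum(weights) / len(weights)
        for c in range(len(weights)):
            weights[c] = rho * weights[c] + (1.0 - rho) * avg
    return weights

# Hypothetical usage: 5 clauses, clauses 1 and 3 currently unsatisfied.
w = saps_local_minimum_update([1.0] * 5, unsat_clauses=[1, 3])
```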
27. Benchmark instances
- Only satisfiable instances!
- SAT04rand: SAT '04 competition instances
- mix: a mix of many different domains from SATLIB (random, graph colouring, blocksworld, inductive inference, logistics, ...)
28. Adaptive parameter setting vs. SAPS default on SAT04rand
- Trained on mix and used to choose parameters for SAT04rand
  - ρ ∈ {0.5, 0.6, 0.7, 0.8}
  - α ∈ {1.1, 1.2, 1.3}
- For SAPS, steps ≈ time
- Adaptive variant on average 2.5 times faster than the default
- But the default is not strong here
29. Where uncertainty helps in practice: Qualitative differences in training & test set
- Trained on mix, tested on SAT04rand
[Figure: predicted vs. actual runtime with uncertainty estimates of the predictions; the diagonal marks optimal prediction]
30. Where uncertainty helps in practice (2): Zoomed in to predictions with low uncertainty
[Figure: same plot, restricted to predictions with low uncertainty; the diagonal marks optimal prediction]
31. Overview
- Previous work on runtime prediction we build on [Leyton-Brown, Nudelman et al. '02, '04]
- Part I: Automated parameter setting based on runtime prediction
- Part II: Incremental learning for runtime prediction in a priori unknown domains
- Experiments
- Conclusions
32. Conclusions
- Automated parameter tuning is needed and feasible
  - Algorithm experts waste their time on it
- A solver can automatically choose appropriate heuristics based on instance characteristics
- Such a solver could be used in practice
  - Learns incrementally from the instances it solves
  - Uncertainty estimates prevent catastrophic errors in estimates for new domains
33. Future work along these lines
- Increase predictive performance
  - Better features
  - More powerful ML algorithms
- Active learning
  - Run the most informative probes for new domains (needs the uncertainty estimates)
- Use uncertainty
  - Pick the algorithm with maximal probability of success (not the one with minimal expected runtime!)
- More domains
  - Tree-search algorithms
  - CP
34. Future work along related lines
- If there are no features
  - Local search in parameter space to find the best default parameter setting [Hutter '04]
- If we can change strategies while running the algorithm
  - Reinforcement learning for algorithm selection [Lagoudakis & Littman '00]
  - Low-knowledge algorithm control [Carchrae and Beck '05]
35. The End
- Thanks to
  - Youssef Hamadi
  - Kevin Leyton-Brown
  - Eugene Nudelman
  - You, for your attention :-)
36. Related work (1): Finding the best default parameters
- Find a single parameter setting that minimizes expected runtime for a whole class of problems
  - Generate special-purpose code [Minton '93]
  - Minimize estimated error [Kohavi & John '95]
  - Racing algorithm [Birattari et al. '02]
  - Local search [Hutter '04]
  - Experimental design [Adenso-Díaz & Laguna '05]
  - Decision trees [Srivastava & Mediratta '05]
37. Related work (2): Algorithm selection on a per-instance basis
- Examine an instance, choose the algorithm that will work well for it
  - Estimate the size of the DPLL search tree for each algorithm [Lobjois and Lemaître '98], [Sillito '00]
  - Predict runtime for each algorithm [Leyton-Brown, Nudelman et al. '02, '04]
38. Predictive accuracy
- Trained and validated on 100 uf100 instances, 1000 runs each
- Tested on 100 different uf100 instances, 1000 runs each
39. My research so far
- Stochastic Local Search
  - SAT (SAPS algorithm)
  - RNA Secondary Structure Design
  - Most Probable Explanation in Graphical Models
- Particle Filtering
  - Model-based diagnosis for Mars Rovers
- Automated Parameter Tuning
  - Already during my MSc, for tuning an ILS algorithm
  - Employing Machine Learning (this talk)