Title: Simulated annealing for convex optimization
1. Simulated annealing for convex optimization
- Adam Tauman Kalai, TTI-Chicago
- Santosh Vempala, MIT
2. Three points of this talk
- Design an efficient algorithm for a convex optimization problem
  - We get the current best (worst-case) bounds
- Analysis of simulated annealing showing provable efficiency
  - Better understand simulated annealing
- Simulated annealing is also a type of interior point algorithm
  - Rapid convergence to local/global min (we do not say anything about local vs. global min)
3. Outline
- The optimization problem
- Previous approaches
- Simulated annealing
- Results
  - Simulated annealing works fast
  - Geometric cooling schedule is optimal
  - Issues with shape/covariance
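4. The optimization problem
- Linear optimization ($f(x) = c \cdot x$) over a convex set $K$
  - $x^* = \arg\min_{x \in K} c \cdot x$
- Inputs:
  - $n$ = number of dimensions (large)
  - unit vector $c \in \mathbb{R}^n$
  - accuracy $\varepsilon > 0$
  - convex set $K \subseteq \mathbb{R}^n$
  - membership oracle: $K(x) = 1$ if $x \in K$, $0$ otherwise
  - starting point $x_0 \in K$
  - $K$ contains a radius-$r$ ball and is contained in a radius-$R$ ball
- Goal: output $x$ where $c \cdot x \le c \cdot x^* + \varepsilon$
[Figure: convex set $K$ with labels $r$, $R$, $x_0$, $c$, $x^*$]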
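To make the inputs concrete, here is a minimal sketch of a membership oracle and of the goal condition. The ball-shaped $K$ and the helper names `make_ball_oracle`, `objective`, and `is_eps_optimal` are illustrative assumptions, not part of the talk; any convex set with a 0/1 membership test would do.

```python
import math


def make_ball_oracle(center, radius):
    """Membership oracle K(x) for a Euclidean ball: 1 if x is in K, else 0.

    The ball is only a stand-in; the algorithm sees K through this oracle alone.
    """
    def K(x):
        dist = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, center)))
        return 1 if dist <= radius else 0
    return K


def objective(c, x):
    """Linear objective f(x) = c . x."""
    return sum(ci * xi for ci, xi in zip(c, x))


def is_eps_optimal(c, x, x_star, eps):
    """Check the goal condition c . x <= c . x* + eps."""
    return objective(c, x) <= objective(c, x_star) + eps
```

For the unit ball centered at the origin and a unit vector $c$, the minimizer is $x^* = -c$, which gives an easy case to check by hand.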
6. Previous approaches
- $\min_{x \in K} c \cdot x$, $c \in \mathbb{R}^n$, convex $K \subseteq \mathbb{R}^n$
- The ellipsoid method can solve this problem in $O^*(n^{10})$ membership queries
- $O^*(nS)$: Bertsimas-Vempala stochastic search
  - Uses a "uniform sample from convex set" subroutine
- We get $O^*(n^{1/2} S)$
- Given a good starting point, a random walk finds an almost uniformly random point in $K$ in $S = O^*(n^4)$ steps
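[Figure: convex set $K$ with sample points $x_1$, $x_2$, $x_3$, direction $c$, and the cut-off sections]
$O^*$ hides logarithmic factors: $O^*(n^{10}) = O(n^{10} \log^c(nR/(r\varepsilon\delta)))$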
7. $O^*(nS)$ algorithm [BV03]
- Elegant analysis
- Requires $\Omega(n)$ phases in the worst case
- In an $n$-dimensional cone, most of the mass is within $1/n$ of the top
- $\Rightarrow$ $\approx n$ phases to cut the height in half
[Figure: $n$-dim. cone with direction $c$]
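A short worked calculation behind the cone example. It rests on two simplifying assumptions that are not stated on the slide: heights are measured from the apex as a fraction $t \in [0, 1]$ of the full height (so the "top" is taken to be the wide end), and each phase keeps exactly half of the remaining volume, namely the half nearest the apex. The constants in the actual algorithm may differ.

```latex
% Fraction of the cone's volume below height fraction t (measured from the apex):
\[
  \frac{\operatorname{vol}(\text{cone up to } t)}{\operatorname{vol}(\text{cone})} = t^{n}
\]
% Mass within a 1/n fraction of the height from the wide end:
\[
  1 - \left(1 - \tfrac{1}{n}\right)^{n} \;\ge\; 1 - e^{-1} \approx 0.63
\]
% After k phases that each halve the volume, the remaining height fraction t_k satisfies
\[
  t_k^{\,n} = 2^{-k}
  \;\Longrightarrow\;
  t_k = 2^{-k/n},
  \qquad
  t_k = \tfrac{1}{2} \iff k = n
\]
```

Under these assumptions, halving the height takes $n$ volume-halving phases, which is the source of the linear dependence on $n$.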
8. Simulated annealing
- Goal: minimize $f(x)$ over a set $K$ (discrete or continuous)
- Approach: decreasing temperature, $0 < T < \infty$
  - Phase $i$: temperature $T_i = \alpha T_{i-1}$, $T_0$ large
  - Biased random walk
  - During phase $i$, the stationary distribution is $d\pi_i(x) \propto \exp(-f(x)/T_i)$
- Geometric cooling schedule ($\alpha < 1$)
[Figure: graph of $f$ with sample points $x$ at different temperatures; $T = \infty$: completely random; $T = 0$: global minimum]
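9. Simulated annealing alg. for our problem
- $T_0 = R$ (radius of containing ball)
- At temperature $T_i$, sample from density $d\pi_i(x) \propto \exp(-(c \cdot x)/T_i)$
  - Repeat hit-and-run random walk $S$ times:
    - At $x$, pick a random line $L$ passing through $x$
    - Pick a random $x'$ on $K \cap L$ with probability $\propto \exp(-(c \cdot x')/T_i)$
- $T_{i+1} = (1 - n^{-1/2}) T_i$
- Stop at $T_{\text{final}} = \varepsilon/n$
- Temperature is cut in half every $\approx n^{1/2}$ phases
[Figure: set $K$ with a line $L$ through the current point $x$ and the next point $x'$]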
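The steps on this slide can be sketched in a few dozen lines. This is a minimal illustration, not the authors' implementation. The assumptions are mine: the chord $K \cap L$ is located by bisection on the membership oracle (using the bound that no chord is longer than $2R$), directions are drawn uniformly (no covariance correction), and the number of walk steps per phase is a free parameter rather than the $S = O^*(n^4)$ of the analysis. All function names are hypothetical.

```python
import math
import random


def random_unit_vector(n, rng):
    """Uniform random direction on the unit sphere in R^n."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def chord_end(in_K, x, d, t_max, tol=1e-9):
    """Approximate the largest t in [0, t_max] with x + t*d in K.

    Bisection on the oracle; valid because K is convex and x is in K.
    The returned t always corresponds to a point that passed the oracle.
    """
    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_K([xi + mid * di for xi, di in zip(x, d)]):
            lo = mid
        else:
            hi = mid
    return lo


def sample_truncated_exp(a, b, s, rng):
    """Sample t in [a, b] with density proportional to exp(-s * t)."""
    length = b - a
    k = abs(s) * length
    if k < 1e-12:
        return a + rng.random() * length  # density is essentially flat
    u = rng.random()
    # Inverse CDF of an exponential truncated to [0, length], measured
    # from the end of the interval where the density is largest.
    v = min(length, -math.log1p(-u * (1.0 - math.exp(-k))) / abs(s))
    t = a + v if s > 0 else b - v
    return min(b, max(a, t))


def anneal(in_K, c, x0, R, eps, steps_per_phase, seed=0):
    """Simulated annealing with hit-and-run for min c.x over convex K."""
    rng = random.Random(seed)
    n = len(c)
    x = list(x0)
    T = R                      # T_0 = R
    T_final = eps / n          # stop at T_final = eps / n
    while True:
        for _ in range(steps_per_phase):
            d = random_unit_vector(n, rng)           # random line L through x
            hi = chord_end(in_K, x, d, 2.0 * R)
            lo = -chord_end(in_K, x, [-di for di in d], 2.0 * R)
            # Along L, exp(-(c.(x + t d))/T) is proportional to exp(-s t).
            s = sum(ci * di for ci, di in zip(c, d)) / T
            t = sample_truncated_exp(lo, hi, s, rng)
            x = [xi + t * di for xi, di in zip(x, d)]
        if T <= T_final:
            return x
        T = max(T_final, (1.0 - n ** -0.5) * T)      # geometric cooling
```

A call such as `anneal(in_cube, c, x0, R=1.2, eps=0.05, steps_per_phase=200)` on the unit cube in five dimensions, with `in_cube` a user-supplied oracle, runs a handful of phases. The output is random, so only loose statements about $c \cdot x$ are meaningful for a single run.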
10. Analysis
- Sampling at temperature $T_{\text{final}} = \varepsilon/n$ brings you within $\varepsilon$ of $\mathrm{opt} = c \cdot x^*$
- With a good starting point, after $S = O^*(n^4)$ steps, hit-and-run is located in $K$ according to density $d\pi_i(x) \propto \exp(-(c \cdot x)/T_i)$ (true for any log-concave density) [LV03]
- "Good start": technical condition
  - $d\pi_i(x)$ and $d\pi_{i-1}(x)$ must be close
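To see where the $\varepsilon/n$ in the first bullet comes from, it helps to work out one special case. The setup below is an assumption chosen for illustration, not the general proof: $K$ is a cone with its apex at the minimizer $x^*$ and its axis along $c$, tall enough that truncation can be ignored, and $y = c \cdot x - c \cdot x^*$ is the height above the apex.

```latex
% Cross-sections of the cone at height y have (n-1)-dimensional volume
% proportional to y^{n-1}, so under d\pi(x) \propto \exp(-(c \cdot x)/T):
\[
  \Pr[y \in dy] \;\propto\; y^{\,n-1} e^{-y/T}\, dy
\]
% This is a Gamma distribution with shape n and scale T, hence
\[
  \mathbb{E}[y]
  = \frac{\int_0^\infty y^{\,n} e^{-y/T}\, dy}{\int_0^\infty y^{\,n-1} e^{-y/T}\, dy}
  = \frac{\Gamma(n+1)\, T^{\,n+1}}{\Gamma(n)\, T^{\,n}}
  = nT
\]
% Setting T = \varepsilon / n gives
\[
  \mathbb{E}[c \cdot x] - c \cdot x^* = \varepsilon
\]
```

In this cone example the expected gap to the optimum is exactly $nT$, so a final temperature of $\varepsilon/n$ is what brings the expected objective within $\varepsilon$.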
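11. Uniform distribution over truncated cone has small std. dev.
[Figure: truncated cone with consecutive distributions $\pi_{i-1}$, $\pi_i$ and direction $c$]
$d\pi_i(x) \propto \exp(-(c \cdot x)/T)$ has much larger std. dev. (factor of $n^{1/2}$ larger)
[Figure: cone with consecutive distributions $\pi_{i-1}$, $\pi_i$]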
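The $n^{1/2}$ factor can be checked in closed form for a cone. The sketch below rests on assumptions of mine: the cone has its apex at height $y = 0$, the uniform distribution is truncated at height $h$, the Boltzmann distribution lives on the untruncated cone (so the height is Gamma-distributed with shape $n$ and scale $T$), and the two are compared at the temperature where their mean heights coincide. The function names are hypothetical.

```python
import math


def uniform_cone_std(n, h):
    """Std. dev. of the height y of a uniform point in an n-dim cone cut at
    height h: density n*y^(n-1)/h^n on [0, h], variance h^2*n/((n+1)^2*(n+2))."""
    return h * math.sqrt(n / ((n + 1) ** 2 * (n + 2)))


def boltzmann_cone_std(n, T):
    """Std. dev. of y under density proportional to y^(n-1)*exp(-y/T),
    i.e. Gamma(shape=n, scale=T), which has variance n*T^2."""
    return math.sqrt(n) * T


def matched_temperature(n, h):
    """Temperature at which the Boltzmann mean n*T equals the uniform mean
    n*h/(n+1)."""
    return h / (n + 1)


def std_ratio(n, h=1.0):
    """Ratio of Boltzmann to uniform std. dev. at matched means."""
    T = matched_temperature(n, h)
    return boltzmann_cone_std(n, T) / uniform_cone_std(n, h)
```

Working through the algebra, the ratio simplifies to $\sqrt{n + 2}$, in line with the slide's "factor of $n^{1/2}$ larger". The wider spread is what lets consecutive Boltzmann distributions overlap even when the temperature drops by a factor of $1 - n^{-1/2}$.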
12. Optimal distributions and schedule
- Cannot do better than $n^{1/2}$ phases
- Assumptions:
  - Using a sequence of probability densities $d\pi_i(x)$
  - $d\pi_i(x)$ is log-concave, i.e. $\log(d\pi_i(x))$ is concave
  - Variation distance $\|d\pi_i - d\pi_{i-1}\| \le 1 - 1/\mathrm{poly}(n)$
- Boltzmann distributions with geometric cooling schedule are worst-case optimal for this class of stochastic search strategies
13. Shape estimation and covariance
I lied
- To do a random walk, it's important to have an estimate of the shape of the object
- For isotropic shapes, can just step in a random direction
- For non-isotropic shapes:
  - Maintain a sample of $n$ points at all times
  - Use the covariance matrix of the current sample to bias direction selection
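One way to bias the direction by the sample covariance, sketched under my own assumptions rather than taken from the talk: a random Gaussian combination of the centered sample points is a Gaussian vector whose covariance is proportional to the sample covariance, so no explicit matrix is needed. The name `covariance_direction` is hypothetical. If the sample has too few points, the directions stay inside the span of the centered sample, so in practice the sample should span all $n$ dimensions.

```python
import math
import random


def covariance_direction(points, rng):
    """Random unit direction shaped by the sample covariance of `points`.

    Computes d = sum_j g_j * (p_j - mean) with independent standard normal
    g_j. This d is Gaussian with covariance proportional to the sample
    covariance, so long axes of the sample are proposed more often.
    """
    m = len(points)
    n = len(points[0])
    mean = [sum(p[k] for p in points) / m for k in range(n)]
    d = [0.0] * n
    for p in points:
        g = rng.gauss(0.0, 1.0)
        for k in range(n):
            d[k] += g * (p[k] - mean[k])
    norm = math.sqrt(sum(v * v for v in d))
    return [v / norm for v in d]
```

In a hit-and-run step this would replace the uniformly random direction: for a long, thin $K$, most proposed lines then run along the long axis instead of immediately leaving the set.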
14. Conclusions
- In addition to possibly helping avoid local optima, S.A. converges rapidly to local opt
- Simulated annealing is an interior point method
- Justification for Boltzmann distributions with geometric cooling schedule
- Future work: same analysis for convex functions
- Future work: understand how simulated annealing helps avoid local minima
- Reverse annealing used for volume estimation [LV04]