1
CHAPTER 8: ANNEALING-TYPE ALGORITHMS
Slides for Introduction to Stochastic Search and
Optimization (ISSO) by J. C. Spall
  • Organization of chapter in ISSO
  • Introduction to simulated annealing
  • Simulated annealing algorithm
    • Basic algorithm with noise-free loss measurements
    • With noisy loss measurements
  • Numerical examples
    • Traveling salesperson problem
    • Continuous problems with single and multiple minima
  • Annealing algorithms based on stochastic approximation with injected noise
  • Convergence theory

2
Background on Simulated Annealing
  • Continues in spirit of Chaps. 2, 6, and 7 in
    working with only loss measurements (no direct
    gradients)
  • Simulated annealing (SAN) based on analogies to
    cooling (annealing) of physical substances
  • Optimal θ analogous to minimum energy state
  • Primarily designed to be global optimization
    method
  • Based on probabilistic criterion for accepting
    increased loss value during search process
  • Metropolis criterion
  • Allows for temporary increase in loss as means of
    reaching global minimum
  • Some convergence theory possible (e.g., Hajek, 1988, for discrete θ,
    see p. 213 of ISSO; Sect. 8.6 of ISSO for continuous θ)

3
Metropolis Criterion
  • In iterative process, suppose we have current θ value θcurr and
    candidate new value θnew. Should we accept θnew if θnew is worse
    than θcurr (i.e., has higher loss value)?
  • Metropolis criterion (from famous 1953 paper of Metropolis et al.)
    gives probability of accepting new value:
        P(accept θnew) = exp{−[L(θnew) − L(θcurr)]/(cb T)},
    where cb is a constant and T is the temperature (set cb = 1
    without loss of generality)
  • Repeated application of Metropolis criterion (iteration to
    iteration) provides for convergence of SAN to global minimum θ*
  • Markov chain theory applies for discrete θ; stochastic
    approximation theory for continuous θ
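For concreteness, a minimal sketch of this acceptance test in Python (with cb = 1; the function name and interface are illustrative, not from ISSO):

```python
import math
import random

def metropolis_accept(loss_curr, loss_new, T):
    """Decide whether to accept the candidate point at temperature T."""
    if loss_new < loss_curr:
        return True  # always accept a strict improvement
    # Metropolis criterion: accept a worse point with probability
    # exp(-[L(new) - L(curr)] / T), which shrinks as T decreases
    return random.random() < math.exp(-(loss_new - loss_curr) / T)
```

At high T almost any uphill move is accepted (broad exploration); as T cools, uphill moves become rare and the search settles into a minimum.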

4
SAN Algorithm with Noise-Free Loss Measurements
  • Step 0 (initialization): Set initial temperature T and current
    parameter θcurr; determine L(θcurr).
  • Step 1 (candidate value): Randomly determine new value θnew and
    determine L(θnew).
  • Step 2 (compare L values): If L(θnew) < L(θcurr), accept θnew.
    Alternatively, if L(θnew) ≥ L(θcurr), accept θnew with probability
    given by Metropolis criterion (implemented via Monte Carlo
    sampling scheme); otherwise keep θcurr.
  • Step 3 (iterate at fixed temperature): Repeat steps 1 and 2 until
    T is changed.
  • Step 4 (decrease temperature): Lower T according to the annealing
    schedule and return to Step 1. Continue until effective
    convergence.
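A minimal sketch of Steps 0–4 in Python, assuming a continuous θ, Gaussian perturbations for generating candidates, and a geometric annealing schedule (these choices, and all parameter defaults, are illustrative assumptions, not prescribed by ISSO):

```python
import math
import random

def simulated_annealing(loss, theta0, T0=1.0, decay=0.9,
                        iters_per_T=100, n_temps=50, step=0.1):
    """Basic SAN loop with noise-free loss measurements (Steps 0-4)."""
    theta, L_curr = list(theta0), loss(theta0)        # Step 0
    T = T0
    for _ in range(n_temps):
        for _ in range(iters_per_T):                  # Step 3: fixed-T loop
            # Step 1: candidate from a random perturbation of theta
            theta_new = [t + step * random.gauss(0.0, 1.0) for t in theta]
            L_new = loss(theta_new)
            # Step 2: accept improvements outright; accept worse points
            # with the Metropolis probability exp(-(L_new - L_curr) / T)
            if L_new < L_curr or random.random() < math.exp(-(L_new - L_curr) / T):
                theta, L_curr = theta_new, L_new
        T *= decay                                    # Step 4: lower T
    return theta, L_curr
```

For example, simulated_annealing(lambda th: sum(t ** 2 for t in th), [2.0, -3.0]) drives the loss toward the global minimum at the origin.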

5
SAN Algorithm with Noisy Loss Measurements
  • As with random search (Chap. 2 of ISSO), standard SAN not designed
    for noisy measurements y = L(θ) + ε
  • However, SAN sometimes used with noisy
    measurements
  • Standard approach is to form average of loss
    measurements at each ? in search process
  • Alternative is to use threshold idea of Sect. 2.3
    of ISSO
  • Only accept new value if noisy loss value is
    sufficiently bigger or smaller than current noisy
    loss
  • Can use one-sided Chebyshev inequality to
    characterize likelihood of error at each
    iteration under general noise distribution
  • Very limited convergence theory for SAN with
    noisy measurements
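A minimal sketch of the averaging and threshold ideas, assuming noisy measurements of the form y = L(θ) + noise and a user-chosen tolerance tau (both helper names are hypothetical):

```python
def averaged_loss(noisy_loss, theta, n_avg=10):
    """Average n_avg repeated noisy measurements y = L(theta) + noise."""
    return sum(noisy_loss(theta) for _ in range(n_avg)) / n_avg

def threshold_accept(y_curr, y_new, tau):
    """Threshold idea (cf. Sect. 2.3 of ISSO): count theta_new as an
    improvement only if its measured loss beats the current measured
    loss by more than tau, so that small differences attributable to
    noise alone are ignored."""
    return y_new < y_curr - tau
```

By the one-sided Chebyshev inequality, zero-mean noise with variance σ² satisfies P(noise ≥ tau) ≤ σ²/(σ² + tau²), which bounds the probability of a false acceptance at each iteration without assuming a particular noise distribution.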

6
Traveling Salesperson Problem (TSP)
  • TSP is famous discrete optimization problem
  • Many successful uses of SAN with TSP
  • Basic problem is to find best way for salesperson
    to hit every city in territory once and only once
  • Setting arises in many problems of optimization
    on networks (communications, transportation,
    etc.)
  • If tour involves n cities, there are (n − 1)!/2 possible solutions
  • Extremely rapid growth in solution space as n increases
  • Problem is NP-hard
  • Perturbations in SAN steps based on three operations on the
    network: inversion, translation, and switching
  • Depicted below; a code sketch of these operations follows the
    figure

7
TSP Standard Search Operations Applied to
8-City Tour
Inversion reverses the order of section 2-3-4-5; translation removes
section 2-3-4-5 and places it between cities 6 and 7; switching
interchanges the order of cities 2 and 5.
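A minimal sketch of the three operations on a tour stored as a Python list of city labels (function names and index conventions are illustrative):

```python
def inversion(tour, i, j):
    """Reverse the order of the section tour[i:j+1]."""
    return tour[:i] + tour[i:j+1][::-1] + tour[j+1:]

def translation(tour, i, j, k):
    """Remove section tour[i:j+1] and reinsert it after position k
    of the remaining tour."""
    section, rest = tour[i:j+1], tour[:i] + tour[j+1:]
    return rest[:k+1] + section + rest[k+1:]

def switching(tour, i, j):
    """Interchange the cities at positions i and j."""
    new = list(tour)
    new[i], new[j] = new[j], new[i]
    return new

# An n-city tour has (n - 1)!/2 distinct solutions; for n = 8 that is 2520.
tour = [1, 2, 3, 4, 5, 6, 7, 8]
print(inversion(tour, 1, 4))       # [1, 5, 4, 3, 2, 6, 7, 8]
print(translation(tour, 1, 4, 1))  # [1, 6, 2, 3, 4, 5, 7, 8]
print(switching(tour, 1, 4))       # [1, 5, 3, 4, 2, 6, 7, 8]
```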
8
TSP (cont'd): Solution to Trivial 4-City Problem where Cost/Link =
Distance (Related to Exercise 8.5 in ISSO)
9
Some Numerical Results for SAN
  • Section 8.3 of ISSO reports on three examples for
    SAN
  • Small-scale TSP
  • Problem with no local minima other than global
    minimum
  • Problem with multiple local minima
  • All examples based on stepwise temperature decay
    in basic SAN steps above and noise-free loss
    measurements
  • All SAN runs require algorithm tuning to pick:
    • Initial T
    • Number of iterations at fixed T
    • Choice of decay factor 0 < λ < 1, representing amount of
      reduction in each temperature decay
    • Method for generating candidate θnew
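As one illustration of the decay choice, a sketch of a stepwise geometric schedule (the function and its defaults are assumptions, not the tuned values used in ISSO):

```python
def decay_schedule(T0, lam, n_stages):
    """Stepwise (geometric) temperature decay: T -> lam * T per stage."""
    T, temps = T0, []
    for _ in range(n_stages):
        temps.append(T)
        T *= lam          # 0 < lam < 1 controls the per-stage reduction
    return temps

# e.g., decay_schedule(10.0, 0.85, 4) -> [10.0, 8.5, 7.225, 6.14125]
```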
  • Brief descriptions follow on slides below.

10
Small-Scale TSP (Example 8.1 in ISSO)
  • 10-city tour (very small by industrial standards)
  • Know by enumeration that minimum cost of tour = 440
  • Randomly chose inversion, translation, or
    switching at each iteration
  • Tuning required to choose good probabilities of
    selecting these operators
  • 8 of 10 SAN runs find minimum cost tour
  • Sample mean cost of initial tour is 700; sample mean of final
    tour is 444
  • Essential to success is adequate use of inversion operator: 0 of
    10 SAN runs find optimal tour if probability of inversion is
    ≤ 0.50
  • SAN successfully used in much larger TSPs
  • E.g., seminal 1983 (!) Kirkpatrick et al. paper
    in Science considers TSP with 400 cities

11
Comparison of SAN and Two Random Search
Algorithms (Example 8.2 in ISSO)
  • Considered very simple p = 2 quartic loss seen earlier
  • Function has single global minimum and no other local minima
  • Table below gives sample mean terminal loss value, where initial
    loss = 4.00 and L(θ*) = 0
  • SAN performs well, but random search even better
    in this problem

12
Evaluation of SAN in Problem with Multiple Local
Minima (Example 8.3 in ISSO)
  • Many numerical studies in literature showing
    favorable results for SAN
  • Loss function in study of Brooks and Morgan (1995), with
    θ = [t1, t2]^T and Θ = [−1, 1] × [−1, 1]
  • Function has many local minima with a unique
    global minimum
  • Study compares quasi-Newton method and SAN
  • Apples vs. oranges (gradient-based vs.
    non-gradient-based)
  • 20% of quasi-Newton runs and 100% of SAN runs ended near θ*
    (random initial conditions)

13
Global Optimization via Annealing of Stochastic
Approximation
  • SAN not only way annealing used for global
    optimization
  • With appropriate annealing, stochastic
    approximation (SA) can be used in global
    optimization
  • Standard approach is to inject Gaussian noise into r.h.s. of SA
    recursion:
        θk+1 = θk − ak Gk(θk) + bk wk,    (*)
    where Gk is direct gradient measurement (Chap. 5) or gradient
    approximation (FDSA or SPSA), bk → 0 (the annealing), and
    wk ~ N(0, Ip×p)
  • Injected noise wk generated by Monte Carlo
  • Eqn. (*) has theoretical basis for formal convergence (Sect. 8.4
    of ISSO)
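A minimal sketch of recursion (*), with a two-sided finite-difference approximation standing in for Gk and illustrative (untuned) gain sequences; all names and constants here are assumptions for the sketch:

```python
import random

def fd_gradient(loss, theta, c):
    """Two-sided finite-difference approximation of the gradient."""
    g = []
    for i in range(len(theta)):
        up = list(theta); up[i] += c
        dn = list(theta); dn[i] -= c
        g.append((loss(up) - loss(dn)) / (2 * c))
    return g

def annealed_sa(loss, theta0, a=0.1, b=0.5, n_iters=10000):
    """SA recursion with injected Gaussian noise:
       theta_{k+1} = theta_k - a_k * G_k(theta_k) + b_k * w_k."""
    theta = list(theta0)
    for k in range(n_iters):
        ak = a / (k + 1)              # step-size (gain) sequence
        bk = b / ((k + 1) ** 0.5)     # annealing sequence decaying to 0
                                      # (theory requires a carefully
                                      # chosen rate; see Sect. 8.4 of ISSO)
        g = fd_gradient(loss, theta, c=0.01)
        theta = [t - ak * gi + bk * random.gauss(0.0, 1.0)
                 for t, gi in zip(theta, g)]
    return theta
```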

14
Global Optimization via Annealing of Stochastic
Approximation (contd)
  • Careful selection of ak and bk required to
    achieve global convergence
  • Stochastic rate of convergence is slow
    • when ak = a/(k+1)^α, α < 1
    • when ak = a/(k+1)
  • Above slow rates are price to be paid for global convergence
  • SPSA without injected randomness (i.e., bk = 0) is global
    optimizer under certain conditions
    • Much faster convergence rate (0 < β ≤ 2/3)

15
Ratio of Asymptotic Estimation Errors with and without Injected
Randomness (bk > 0 and bk = 0, resp.)
(Figure: ratio of estimation errors plotted against iterations k from
10^3 to 10^6; vertical axis runs from 10 to 70.)