1
CHAPTER 2 DIRECT METHODS FOR STOCHASTIC SEARCH
Slides for Introduction to Stochastic Search and
Optimization (ISSO) by J. C. Spall
  • Organization of chapter in ISSO
  • Introductory material
  • Random search methods
  • Attributes of random search
  • Blind random search (algorithm A)
  • Two localized random search methods (algorithms B
    and C)
  • Random search with noisy measurements
  • Nonlinear simplex (Nelder-Mead) algorithm
  • Noise-free and noisy measurements

2
Some Attributes of Direct Random Search with
Noise-Free Loss Measurements
  • Ease of programming
  • Use of only L values (vs. gradient values)
  • Avoid artful contrivance of more complex
    methods
  • Reasonable computational efficiency
  • Generality
  • Algorithms apply to virtually any function
  • Theoretical foundation
  • Performance guarantees, sometimes in finite
    samples
  • Global convergence in some cases

3
Algorithm A: Simple Random (Blind) Search
  • Step 0 (initialization) Choose an initial value
    of θ inside of Θ and call it θ̂_0. Set k = 0.
  • Step 1 (candidate value) Generate a new
    independent value θ_new(k+1) ∈ Θ, according to
    the chosen probability distribution. If
    L(θ_new(k+1)) < L(θ̂_k), set θ̂_(k+1) = θ_new(k+1);
    else take θ̂_(k+1) = θ̂_k.
  • Step 2 (return or stop) Stop if maximum number of
    L evaluations has been reached or user is
    otherwise satisfied with the current estimate for
    θ; else, return to step 1 with the new k set to
    the former k+1. (A Python sketch of these steps
    appears below.)
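Following these steps, here is a minimal Python sketch of Algorithm A, assuming a box-shaped Θ, a uniform sampling distribution, and a simple quadratic loss; these choices are illustrative and are not the specific problem from ISSO.

import numpy as np

def blind_random_search(loss, lower, upper, n_evals=1000, rng=None):
    """Algorithm A sketch: sample candidates uniformly over the box
    [lower, upper] and keep the best point seen so far."""
    rng = np.random.default_rng(rng)
    lower = np.asarray(lower, float)
    upper = np.asarray(upper, float)
    theta = rng.uniform(lower, upper)           # Step 0: initial value in Theta
    best_loss = loss(theta)
    for _ in range(n_evals):                    # Step 2: stop after a fixed budget
        theta_new = rng.uniform(lower, upper)   # Step 1: independent candidate
        loss_new = loss(theta_new)
        if loss_new < best_loss:                # accept only if the loss improves
            theta, best_loss = theta_new, loss_new
    return theta, best_loss

# Illustrative use: a quadratic loss whose minimum is at [1.0, 1.0], echoing
# the solution theta* = [1.0, 1.0]^T of Example 2.1 (the loss itself is not
# the one used in that example).
loss = lambda t: (t[0] - 1.0) ** 2 + (t[1] - 1.0) ** 2
theta_hat, best = blind_random_search(loss, [-2.0, -2.0], [3.0, 3.0], n_evals=2000)
print(theta_hat, best)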

4
First Several Iterations of Algorithm A on
Problem with Solution θ* = [1.0, 1.0]^T (Example
2.1 in ISSO)
5
Algorithm B: Localized Random Search
  • Step 0 (initialization) Choose an initial value
    of θ inside of Θ and call it θ̂_0. Set k = 0.
  • Step 1 (candidate value) Generate a random
    perturbation d_k. Check whether θ̂_k + d_k ∈ Θ. If
    not, generate a new d_k or move θ̂_k + d_k to the
    nearest valid point. Let θ_new(k+1) ∈ Θ be
    θ̂_k + d_k or the modified point.
  • Step 2 (check for improvement) If L(θ_new(k+1))
    < L(θ̂_k), set θ̂_(k+1) = θ_new(k+1); else take
    θ̂_(k+1) = θ̂_k.
  • Step 3 (return or stop) Stop if maximum number of
    L evaluations has been reached or if user is
    satisfied with current estimate; else, return to
    step 1 with new k set to former k+1. (A Python
    sketch of these steps appears below.)
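Below is a minimal Python sketch of Algorithm B, assuming Gaussian perturbations d_k and a box-shaped Θ handled by clipping to the nearest valid point; the perturbation scale and other parameter values are illustrative assumptions.

import numpy as np

def localized_random_search(loss, theta0, lower, upper, sigma=0.1,
                            n_evals=1000, rng=None):
    """Algorithm B sketch: perturb the current estimate by a random d_k, map
    it back into Theta by clipping (the 'nearest valid point' option), and
    accept the candidate only if the loss improves."""
    rng = np.random.default_rng(rng)
    theta = np.asarray(theta0, float)                  # Step 0: initial value in Theta
    best_loss = loss(theta)
    for _ in range(n_evals):                           # Step 3: fixed evaluation budget
        d_k = sigma * rng.standard_normal(theta.shape) # Step 1: random d_k
        theta_new = np.clip(theta + d_k, lower, upper) # nearest valid point in Theta
        loss_new = loss(theta_new)
        if loss_new < best_loss:                       # Step 2: check for improvement
            theta, best_loss = theta_new, loss_new
    return theta, best_loss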

6
Algorithm C: Enhanced Localized Random Search
  • Similar to algorithm B
  • Exploits knowledge of good/bad directions
  • If move in one direction produces decrease in
    loss, add bias to next iteration to continue
    algorithm moving in good direction
  • If move in one direction produces increase in
    loss, add bias to next iteration to move
    algorithm in opposite way
  • Slightly more complex implementation than
    algorithm B (see the sketch below)
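The sketch below adds a running bias term to the Algorithm B loop to illustrate the good/bad-direction idea. The particular bias-update coefficients (0.2, 0.4, 0.5) are an assumption chosen for illustration, loosely in the spirit of Solis-Wets-style updates, and are not the exact scheme given in ISSO.

import numpy as np

def enhanced_localized_search(loss, theta0, lower, upper, sigma=0.1,
                              n_evals=1000, rng=None):
    """Algorithm C sketch: like Algorithm B, but a bias term nudges the next
    perturbation toward directions that recently lowered the loss and away
    from directions that raised it."""
    rng = np.random.default_rng(rng)
    theta = np.asarray(theta0, float)
    bias = np.zeros_like(theta)                    # running bias on the search direction
    best_loss = loss(theta)
    for _ in range(n_evals):
        d_k = sigma * rng.standard_normal(theta.shape)
        step = bias + d_k                          # biased random perturbation
        theta_new = np.clip(theta + step, lower, upper)
        loss_new = loss(theta_new)
        if loss_new < best_loss:                   # good direction: keep pushing that way
            theta, best_loss = theta_new, loss_new
            bias = 0.2 * bias + 0.4 * step         # illustrative coefficients (assumption)
        else:                                      # bad direction: bias the opposite way
            bias = 0.5 * bias - 0.4 * step
    return theta, best_loss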

7
Formal Convergence of Random Search Algorithms
  • Well-known results on convergence of random
    search
  • Applies to convergence of θ and/or L values
  • Applies when noise-free L measurements used in
    algorithms
  • Algorithm A (blind random search) converges under
    very general conditions
  • Applies to continuous or discrete functions
  • Conditions for convergence of algorithms B and C
    somewhat more restrictive, but still quite
    general
  • ISSO presents theorem for continuous functions
  • Other convergence results exist
  • Convergence rate theory also exists: how fast
    does the algorithm converge?
  • Algorithm A generally slow in high-dimensional
    problems

8
Functions for Convergence (Parts (a) and (b)) and
Nonconvergence (Part (c)) of Blind Random Search
(a) Continuous L(θ): probability density for θ_new
is > 0 on Θ = [0, ∞)
(b) Discrete L(θ): discrete sampling for θ_new
with P(θ_new = i) > 0 for i = 0, 1, 2, ...
(c) Noncontinuous L(θ): probability density for
θ_new is > 0 on Θ = [0, ∞)
9
Example Comparison of Algorithms A, B, and C
  • Relatively simple p = 2 problem (Examples 2.3 and
    2.4 in ISSO)
  • Quartic loss function (plot on next slide)
  • One global solution; several local minima/maxima
  • Started all algorithms at common initial
    condition and compared based on common number of
    loss evaluations
  • Algorithm A needed no tuning
  • Algorithms B and C required trial runs to tune
    algorithm coefficients

10
Multimodal Quartic Loss Function for p = 2
Problem (Example 2.3 in ISSO)
11
Example 2.3 in ISSO (cont'd): Sample Means of
Terminal Values L(θ̂) in Multimodal Loss
Function (with Approximate 95% Confidence
Intervals)
12
Examples 2.3 and 2.4 in ISSO (cont'd): Typical
Adjusted Loss Values (L(θ̂)) and θ
Estimates in Multimodal Loss Function (One Run)
13
Random Search Algorithms with Noisy Loss Function
Measurements
  • Basic implementation of random search assumes
    perfect (noise-free) values of L
  • Some applications require use of noisy
    measurements y(θ) = L(θ) + noise
  • Simplest modification is to form average of y
    values at each iteration as approximation to L
  • Alternative modification is to set a threshold τ >
    0 for improvement before new value is accepted in
    algorithm
  • Thresholding in algorithm B with modified step 2
    (sketched below)
  • Step 2 (modified) If y(θ_new(k+1)) < y(θ̂_k) - τ,
    set θ̂_(k+1) = θ_new(k+1); else take
    θ̂_(k+1) = θ̂_k.
  • Very limited convergence theory with noisy
    measurements
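Here is a minimal Python sketch of Algorithm B with the modified (thresholded) step 2, assuming additive Gaussian measurement noise; the noise level, threshold value, and the choice to compare against the stored noisy measurement at the current estimate are illustrative assumptions.

import numpy as np

def localized_search_noisy(loss, theta0, lower, upper, sigma=0.1, tau=0.05,
                           noise_sd=0.1, n_evals=1000, rng=None):
    """Algorithm B sketch with noisy measurements y(theta) = L(theta) + noise
    and an acceptance threshold tau > 0 (modified step 2)."""
    rng = np.random.default_rng(rng)
    def y(t):                                          # noisy loss measurement
        return loss(t) + noise_sd * rng.standard_normal()
    theta = np.asarray(theta0, float)
    y_current = y(theta)
    for _ in range(n_evals):
        d_k = sigma * rng.standard_normal(theta.shape)
        theta_new = np.clip(theta + d_k, lower, upper)
        y_new = y(theta_new)
        if y_new < y_current - tau:                    # modified step 2: require margin tau
            theta, y_current = theta_new, y_new
    return theta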

14
Nonlinear Simplex (Nelder-Mead) Algorithm
  • Nonlinear simplex method is popular search method
    (e.g., fminsearch in MATLAB)
  • Simplex is convex hull of p + 1 points in ℝ^p
  • Convex hull is smallest convex set enclosing the
    p + 1 points
  • For p = 2, the convex hull is a triangle
  • For p = 3, the convex hull is a pyramid
  • Algorithm searches for θ* by moving the convex
    hull within Θ
  • If algorithm works properly, convex hull
    shrinks/collapses onto θ*
  • No injected randomness (contrast with algorithms
    A, B, and C), but allowance for noisy loss
    measurements
  • Frequently effective, but no general convergence
    theory and many numerical counterexamples to
    convergence
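For comparison with fminsearch, the sketch below calls SciPy's off-the-shelf Nelder-Mead implementation on an illustrative quadratic loss; the loss, starting point, and option values are assumptions for demonstration only.

import numpy as np
from scipy.optimize import minimize

loss = lambda t: (t[0] - 1.0) ** 2 + (t[1] - 1.0) ** 2   # illustrative loss, p = 2
result = minimize(loss, x0=np.array([0.0, 0.0]), method="Nelder-Mead",
                  options={"maxfev": 500, "xatol": 1e-8, "fatol": 1e-8})
print(result.x, result.fun)   # simplex collapses near the minimizer [1, 1]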

15
Steps of Nonlinear Simplex Algorithm
  • Step 0 (Initialization) Generate an initial set of
    p + 1 extreme points in ℝ^p, θ_i (i = 1, 2, ..., p
    + 1), the vertices of the initial simplex
  • Step 1 (Reflection) Identify where max, second
    highest, and min loss values occur; denote them
    by θ_max, θ_2max, and θ_min, respectively. Let
    θ_cent = centroid (mean) of all θ_i except for
    θ_max. Generate candidate vertex θ_refl by
    reflecting θ_max through θ_cent using θ_refl =
    (1 + α)θ_cent - αθ_max (α > 0).
  • Step 2a (Accept reflection) If L(θ_min) ≤
    L(θ_refl) < L(θ_2max), then θ_refl replaces θ_max;
    proceed to step 3; else go to step 2b.
  • Step 2b (Expansion) If L(θ_refl) < L(θ_min), then
    expand reflection using θ_exp = γθ_refl + (1 -
    γ)θ_cent, γ > 1; else go to step 2c. If L(θ_exp) <
    L(θ_refl), then θ_exp replaces θ_max; otherwise
    reject expansion and replace θ_max by θ_refl. Go
    to step 3.

16
Steps of Nonlinear Simplex Algorithm (cont'd)
  • Step 2c (Contraction) If L(θ_refl) ≥ L(θ_2max),
    then contract simplex: either case (i) L(θ_refl) <
    L(θ_max), or case (ii) L(θ_max) ≤ L(θ_refl).
    Contraction point is θ_cont = βθ_max/refl + (1 -
    β)θ_cent, 0 < β < 1, where θ_max/refl = θ_refl in
    case (i) and θ_max/refl = θ_max otherwise. In case
    (i), accept contraction if L(θ_cont) ≤ L(θ_refl);
    in case (ii), accept contraction if L(θ_cont) <
    L(θ_max). If accepted, replace θ_max by θ_cont and
    go to step 3; otherwise go to step 2d.
  • Step 2d (Shrink) If L(θ_cont) ≥ L(θ_max), shrink
    the entire simplex using a factor 0 < δ < 1,
    retaining only θ_min. Go to step 3.
  • Step 3 (Termination) Stop if convergence
    criterion or maximum number of function
    evaluations is met; else return to step 1.
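The code below is a compact Python sketch of Steps 0-3 above. The coefficient values α = 1, γ = 2, β = 0.5, δ = 0.5, the axis-offset construction of the initial simplex, and the evaluation-budget-only stopping rule are assumptions for illustration rather than choices prescribed by ISSO.

import numpy as np

def nelder_mead(loss, theta0, alpha=1.0, gamma=2.0, beta=0.5, delta=0.5,
                max_evals=500, init_step=0.5):
    """Sketch of the nonlinear simplex (Nelder-Mead) steps described above."""
    theta0 = np.asarray(theta0, float)
    p = theta0.size
    # Step 0: p + 1 vertices of the initial simplex (theta0 plus axis offsets)
    simplex = [theta0] + [theta0 + init_step * np.eye(p)[i] for i in range(p)]
    f = [loss(v) for v in simplex]
    evals = len(f)
    while evals < max_evals:                      # Step 3: evaluation budget only
        order = np.argsort(f)                     # vertices sorted: min ... 2max, max
        i_min, i_2max, i_max = order[0], order[-2], order[-1]
        cent = np.mean([simplex[i] for i in order[:-1]], axis=0)  # centroid w/o theta_max
        # Step 1: reflect the worst vertex through the centroid
        refl = (1 + alpha) * cent - alpha * simplex[i_max]
        f_refl = loss(refl); evals += 1
        if f[i_min] <= f_refl < f[i_2max]:        # Step 2a: accept reflection
            simplex[i_max], f[i_max] = refl, f_refl
        elif f_refl < f[i_min]:                   # Step 2b: try expansion
            exp_pt = gamma * refl + (1 - gamma) * cent
            f_exp = loss(exp_pt); evals += 1
            if f_exp < f_refl:
                simplex[i_max], f[i_max] = exp_pt, f_exp
            else:
                simplex[i_max], f[i_max] = refl, f_refl
        else:                                     # Step 2c: contraction
            case_i = f_refl < f[i_max]            # case (i) vs. case (ii)
            base = refl if case_i else simplex[i_max]
            cont = beta * base + (1 - beta) * cent
            f_cont = loss(cont); evals += 1
            accept = (f_cont <= f_refl) if case_i else (f_cont < f[i_max])
            if accept:
                simplex[i_max], f[i_max] = cont, f_cont
            else:                                 # Step 2d: shrink toward theta_min
                for i in range(p + 1):
                    if i != i_min:
                        simplex[i] = simplex[i_min] + delta * (simplex[i] - simplex[i_min])
                        f[i] = loss(simplex[i]); evals += 1
    i_best = int(np.argmin(f))
    return simplex[i_best], f[i_best]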

17
Illustration of Steps of Nonlinear Simplex
Algorithm with p = 2