CHAPTER 5 STOCHASTIC GRADIENT FORM OF STOCHASTIC APPROXIMATION

Slides: 11
Provided by: wayne89
Learn more at: https://www.jhuapl.edu
Transcript and Presenter's Notes

1
CHAPTER 5: STOCHASTIC GRADIENT FORM OF STOCHASTIC
APPROXIMATION
Slides for Introduction to Stochastic Search and
Optimization (ISSO) by J. C. Spall
  • Organization of chapter in ISSO
  • Stochastic gradient
  • Core algorithm
  • Basic principles
  • Nonlinear regression
  • Connections to LMS
  • Neural network training
  • Discrete event dynamic systems
  • Image processing

2
Stochastic Gradient Formulation
  • For differentiable L(θ), recall familiar set of p
    equations and p unknowns for use in finding a
    minimum θ*: g(θ) = ∂L/∂θ = 0
  • Above is special case of root-finding problem
  • Suppose cannot observe L(θ) and g(θ) except in
    presence of noise
  • Adaptive control (target tracking)
  • Simulation-based optimization
  • Etc.
  • Seek unbiased measurement of ∂L/∂θ for
    optimization

3
Stochastic Gradient Formulation (Contd)
  • Suppose L(θ) = E[Q(θ, V)]
  • V represents all random effects
  • Q(θ, V) represents observed cost (noisy
    measurement of L(θ))
  • Seek a representation where ∂Q/∂θ is an unbiased
    measurement of ∂L/∂θ
  • Not true when distribution function for V depends
    on θ
  • Above implies that desired representation is
      L(θ) = ∫ Q(θ, υ) pV(υ) dυ
    not
      L(θ) = ∫ Q(θ, υ) pV(υ | θ) dυ
    where pV(·) is density function for V
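
The effect of the chosen representation can be checked numerically. The sketch below assumes a hypothetical scalar problem (not taken from ISSO) with L(θ) = θ² + 1. Written as Q(θ, V) = V² with V ~ N(θ, 1), the density of V depends on θ and ∂Q/∂θ is zero for every sample. Written as Q(θ, W) = (θ + W)² with W ~ N(0, 1), the density is free of θ and the sample mean of ∂Q/∂θ should land close to ∂L/∂θ = 2θ.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


rng = random.Random(1)  # fixed seed so the run is repeatable
theta = 1.5             # point at which the gradient is examined
n_samples = 200_000

# True gradient of L(theta) = theta^2 + 1.
true_gradient = 2.0 * theta

# Representation A (hypothetical): Q(theta, V) = V^2 with V ~ N(theta, 1).
# theta enters only through the density of V, so dQ/dtheta = 0 per sample.
grad_q_dependent = mean([0.0 for _ in range(n_samples)])

# Representation B (hypothetical): Q(theta, W) = (theta + W)^2, W ~ N(0, 1).
# The density of W does not involve theta and dQ/dtheta = 2 (theta + W).
grad_q_independent = mean(
    [2.0 * (theta + rng.gauss(0.0, 1.0)) for _ in range(n_samples)]
)

print("true dL/dtheta          :", true_gradient)
print("mean dQ/dtheta, V dep.  :", grad_q_dependent)
print("mean dQ/dtheta, V indep.:", grad_q_independent)
```

Both representations describe the same L(θ); only the second gives a usable gradient measurement.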

4
Stochastic Gradient Measurement and Algorithm
  • When density pV(·) is independent of θ,
    ∂Q(θ, V)/∂θ is unbiased measurement of ∂L/∂θ
  • Above requires derivative-integral interchange in
    ∂L/∂θ = ∂E[Q(θ, V)]/∂θ = E[∂Q(θ, V)/∂θ] to be
    valid
  • Can use root-finding (Robbins-Monro) SA algorithm
    to attempt to find θ*
  • Unbiased measurement satisfies key convergence
    conditions of SA (Section 4.3 in ISSO)
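
A minimal sketch of the resulting recursion, θ̂(k+1) = θ̂(k) − a_k ∂Q(θ̂(k), V_k)/∂θ, is given below. Everything specific in it is an assumption for illustration: the function name `stochastic_gradient_sa`, the quadratic Q(θ, V) = ½‖θ − V‖² with V ~ N(μ, I) (so that θ* = μ), and the gain form a_k = a/(k + 1 + A)^α with α = 1.

```python
import random


def stochastic_gradient_sa(grad_q, theta0, sample_v, n_iter=5000,
                           a=1.0, big_a=10.0, alpha=1.0, seed=0):
    """Robbins-Monro recursion driven by noisy gradients dQ/dtheta.

    grad_q(theta, v) returns dQ(theta, v)/dtheta as a list;
    sample_v(rng) draws one realization of V.
    The gain a_k = a / (k + 1 + big_a)**alpha is an assumed form.
    """
    rng = random.Random(seed)
    theta = list(theta0)
    for k in range(n_iter):
        a_k = a / (k + 1 + big_a) ** alpha
        v = sample_v(rng)
        y = grad_q(theta, v)  # unbiased measurement of dL/dtheta
        theta = [t - a_k * y_i for t, y_i in zip(theta, y)]
    return theta


# Hypothetical test problem: Q(theta, V) = 0.5 * ||theta - V||^2,
# V ~ N(MU, I), so L(theta) is minimized at theta* = MU.
MU = [1.0, -2.0]


def sample_v(rng):
    return [rng.gauss(m, 1.0) for m in MU]


def grad_q(theta, v):
    return [t - v_i for t, v_i in zip(theta, v)]


theta_hat = stochastic_gradient_sa(grad_q, [0.0, 0.0], sample_v)
print("estimate of theta*:", theta_hat)
```

With α = 1 the gains satisfy Σ a_k = ∞ and Σ a_k² < ∞; other decay rates are possible and would change the finite-sample behavior.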

5
Stochastic Gradient Tendency to Move Iterate in
Correct Direction
6
Stochastic Gradient and LMS Connections
  • Recall basic linear model from Chapter 3:
    z_k = h_kᵀθ + v_k
  • Consider standard MSE loss
    L(θ) = ½ E[(z_k − h_kᵀθ)²]
  • Implies Q = ½ (z_k − h_kᵀθ)²
  • Recall basic LMS algorithm from Chapter 3
  • Hence LMS is direct application of stochastic
    gradient SA
  • Proposition 5.1 in ISSO shows how SA convergence
    theory applies to LMS
  • Implies convergence of LMS to θ*
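
The connection can be sketched in code. Assuming the linear model z = hᵀθ + v and the per-measurement cost Q = ½(z − hᵀθ)², the gradient is ∂Q/∂θ = h(hᵀθ − z), and an LMS-type update subtracts a gain times that quantity. The simulated data, the true parameter `THETA_TRUE`, and the constant gain a = 0.05 are all assumptions made for illustration; a constant gain leaves some residual jitter around θ*, whereas convergence results generally call for decaying gains.

```python
import random


def lms(data, theta0, a=0.05):
    """LMS-type recursion: theta <- theta - a * h * (h^T theta - z)."""
    theta = list(theta0)
    for h, z in data:
        # Prediction error h^T theta - z for the current measurement.
        err = sum(h_i * t_i for h_i, t_i in zip(h, theta)) - z
        # Step along the negative of dQ/dtheta = h * err.
        theta = [t_i - a * h_i * err for t_i, h_i in zip(theta, h)]
    return theta


# Hypothetical data generated from z = h^T theta_true + v.
rng = random.Random(0)
THETA_TRUE = [2.0, -1.0]
data = []
for _ in range(4000):
    h = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    z = sum(h_i * t_i for h_i, t_i in zip(h, THETA_TRUE)) + rng.gauss(0.0, 0.1)
    data.append((h, z))

theta_lms = lms(data, [0.0, 0.0])
print("LMS estimate:", theta_lms)
```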

7
Neural Networks
  • Neural networks (NNs) are general function
    approximators
  • Actual output z_k represented by a NN according to
    standard model z_k = h(θ, x_k) + v_k
  • h(θ, x_k) represents NN output for input x_k and
    weight values θ
  • v_k represents noise
  • Diagram of simple feedforward NN on next slide
  • Most popular training method is backpropagation
    (mean-squared-type loss function)
  • Backpropagation is a stochastic gradient
    recursion
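
As an illustration, the sketch below trains a small network with one stochastic gradient step per measurement, using Q = ½(h(θ, x) − z)² so that ∂Q/∂θ = (∂h/∂θ)(h(θ, x) − z). The architecture (one input, five tanh hidden units, one linear output, p = 16 weights), the target sin(x), the noise level, and the gain sequence are all assumptions chosen to keep the example short; they are not the network or settings from ISSO.

```python
import math
import random


def nn_output(theta, x):
    """Return (output, hidden activations) of a 1-input, 1-output tanh net."""
    w1, b1, w2, b2 = theta
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    return sum(v * act for v, act in zip(w2, hidden)) + b2, hidden


def backprop_step(theta, x, z, gain):
    """One stochastic gradient step on Q = 0.5 * (h(theta, x) - z)^2."""
    w1, b1, w2, b2 = theta
    out, hidden = nn_output(theta, x)
    err = out - z  # dQ/d(output)
    # Error propagated back through each tanh unit (tanh' = 1 - tanh^2).
    delta = [err * v * (1.0 - act * act) for v, act in zip(w2, hidden)]
    return (
        [w - gain * d * x for w, d in zip(w1, delta)],
        [b - gain * d for b, d in zip(b1, delta)],
        [v - gain * err * act for v, act in zip(w2, hidden)],
        b2 - gain * err,
    )


def grid_mse(theta, n_grid=41):
    """Mean squared error against the noise-free target on [-2, 2]."""
    xs = [-2.0 + 4.0 * i / (n_grid - 1) for i in range(n_grid)]
    return sum((nn_output(theta, x)[0] - math.sin(x)) ** 2 for x in xs) / n_grid


rng = random.Random(3)
n_hidden = 5
theta = (
    [rng.gauss(0.0, 0.5) for _ in range(n_hidden)],  # input-to-hidden weights
    [rng.gauss(0.0, 0.5) for _ in range(n_hidden)],  # hidden biases
    [rng.gauss(0.0, 0.5) for _ in range(n_hidden)],  # hidden-to-output weights
    0.0,                                             # output bias
)

mse_before = grid_mse(theta)
for k in range(20_000):
    x = rng.uniform(-2.0, 2.0)
    z = math.sin(x) + rng.gauss(0.0, 0.1)   # noisy measurement z_k
    gain = 0.5 / (k + 1 + 50) ** 0.602      # assumed decaying gain a_k
    theta = backprop_step(theta, x, z, gain)
mse_after = grid_mse(theta)

print("MSE before training:", mse_before)
print("MSE after training :", mse_after)
```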

8
Simple Feedforward Neural Network with p = 25
Weight Parameters
9
Discrete-Event Dynamic Systems
  • Many applications of stochastic gradient methods
    in simulation-based optimization
  • Discrete-event dynamic systems frequently modeled
    by simulation
  • Trajectories of process are piecewise constant
  • Derivative-integral interchange critical
  • Interchange not valid in many realistic systems
  • Interchange condition checked on case-by-case
    basis
  • Overall approach requires knowledge of inner
    workings of simulation
  • Needed to obtain ∂Q(θ, V)/∂θ
  • Chapters 14 and 15 of ISSO have extensive
    discussion of simulation-based optimization
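
A toy case shows why the interchange can fail when sample paths are piecewise constant. The example is hypothetical and not taken from ISSO: Q(θ, V) = 1 if V > θ and 0 otherwise, with V exponentially distributed with rate 1 (for instance, an indicator that a service time exceeds a threshold θ). Then L(θ) = exp(−θ) and ∂L/∂θ = −exp(−θ), yet ∂Q/∂θ = 0 wherever it exists, so E[∂Q/∂θ] ≠ ∂L/∂θ. A finite-difference estimate on the averaged output is included only as a reference value.

```python
import math
import random


def q(theta, v):
    """Piecewise-constant output: 1 if the sampled time exceeds theta."""
    return 1.0 if v > theta else 0.0


rng = random.Random(2)
theta = 1.0
c = 0.05          # half-width of the finite-difference interval
n = 400_000

samples = [rng.expovariate(1.0) for _ in range(n)]

# Sample-path derivative: Q is flat in theta except at the jump V = theta,
# so dQ/dtheta is 0 for (almost) every sample.
pathwise_gradient = 0.0

# Central finite difference of the averaged output, same samples both sides.
fd_gradient = sum(q(theta + c, v) - q(theta - c, v) for v in samples) / (2 * c * n)

true_gradient = -math.exp(-theta)

print("true dL/dtheta        :", true_gradient)
print("E[dQ/dtheta] (path)   :", pathwise_gradient)
print("finite-difference est.:", fd_gradient)
```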

10
Image Restoration
  • Aim is to recover true image subject to having
    recorded image corrupted by noise
  • Common to construct least-squares type problem
      min over s of ‖Z − H ∗ s‖²
    where Z is the recorded image and H ∗ s represents
    a convolution of the measurement process (H) and
    the true pixel-by-pixel image (s)
  • Can be solved by either batch linear regression
    methods or the LMS/RLS methods
  • Nonlinear measurements need full power of
    stochastic gradient method
  • Measurements modeled as Z = F(s, x, V)
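
For the linear case, a one-dimensional sketch shows how the convolution turns into a linear regression that an LMS-type recursion can process pixel by pixel. Each recorded pixel is treated as z_i = h_iᵀs + v_i, where the row h_i holds the blur weights around pixel i. The three-tap kernel, the box-shaped signal, the noise level, the zero padding at the edges, and the gain are all assumptions for illustration; the problem is ill-conditioned, so the limited number of sweeps also acts as a crude form of regularization.

```python
import random

KERNEL = [0.25, 0.5, 0.25]  # assumed symmetric blur H, centred on each pixel


def regressor_row(i, n):
    """Row h_i with (H * s)[i] = h_i . s, using zero padding at the edges."""
    row = [0.0] * n
    for offset, weight in zip((-1, 0, 1), KERNEL):
        j = i + offset
        if 0 <= j < n:
            row[j] = weight
    return row


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def distance(u, v):
    """Euclidean distance between two equal-length lists."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


rng = random.Random(4)
s_true = [0.0] * 8 + [1.0] * 8 + [0.0] * 8       # hypothetical true "image"
n = len(s_true)
rows = [regressor_row(i, n) for i in range(n)]
z = [dot(h, s_true) + rng.gauss(0.0, 0.01) for h in rows]  # blurred + noise

s_hat = [0.0] * n
error_before = distance(s_hat, s_true)
a = 1.0  # constant gain (assumed)
for sweep in range(50):
    for h, z_i in zip(rows, z):
        err = dot(h, s_hat) - z_i                 # residual at this pixel
        s_hat = [s_j - a * h_j * err for s_j, h_j in zip(s_hat, h)]
error_after = distance(s_hat, s_true)

print("distance to true image before:", error_before)
print("distance to true image after :", error_after)
```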