1
Least-Mean-Square Algorithm
  • CS/CMPE 537 Neural Networks

2
Linear Adaptive Filter
  • A linear adaptive filter (LAF) performs a linear
    transformation of a signal, adjusting its
    parameters so that a performance measure is
    minimized or maximized
  • The development of LAFs followed the work of
    Rosenblatt (perceptron) and other early neural
    network researchers
  • LAFs can be considered linear single-layer
    feedforward neural networks
  • The least-mean-square (LMS) algorithm is a popular
    learning algorithm for LAFs (and linear
    single-layer networks)
  • Wide applicability, e.g.
  • Signal processing
  • Control

3
Historical Note
  • Linear associative memory (early 1970s)
  • Function: memory by association
  • Type: linear single-layer feedforward network
  • Perceptron (late 1950s, early 1960s)
  • Function: pattern classification
  • Type: nonlinear single-layer feedforward network
  • Linear adaptive filter or Adaline (1960s)
  • Function: adaptive signal processing
  • Type: linear single-layer feedforward network

4
Spatial Filter
5
Wiener-Hopf Equations (1)
  • The goal is to find the optimum weights that
    minimize the difference between the system output
    y and some desired response d in the mean-square
    sense
  • System equations
  • y = \sum_{k=1}^{p} w_k x_k
  • e = d - y
  • Performance measure or cost function
  • J = 0.5 E[e^2], where E is the expectation operator
  • Find the optimum weights for which J is a minimum

6
Wiener-Hopf Equations (2)
  • Substituting e = d - y and simplifying (using
    e^2 = d^2 - 2dy + y^2)
  • J = 0.5 E[d^2] - E[\sum_{k=1}^{p} w_k x_k d]
    + 0.5 E[\sum_{j=1}^{p} \sum_{k=1}^{p} w_j w_k x_j x_k]
  • Noting that expectation is a linear operator and
    each w a constant
  • J = 0.5 E[d^2] - \sum_{k=1}^{p} w_k E[x_k d]
    + 0.5 \sum_{j=1}^{p} \sum_{k=1}^{p} w_j w_k E[x_j x_k]
  • Let
  • r_d = E[d^2], r_{dx}(k) = E[d x_k], r_x(j, k) = E[x_j x_k]
  • Then
  • J = 0.5 r_d - \sum_{k=1}^{p} w_k r_{dx}(k)
    + 0.5 \sum_{j=1}^{p} \sum_{k=1}^{p} w_j w_k r_x(j, k)
  • To find the optimum weights, set
  • \nabla_{w_k} J = \partial J / \partial w_k = 0, k = 1, 2, ..., p
  • where \nabla_{w_k} J = -r_{dx}(k) + \sum_{j=1}^{p} w_j r_x(j, k)

7
Wiener-Hopf Equations (3)
  • Let w_{ok} denote the optimum weights; then
  • \sum_{j=1}^{p} w_{oj} r_x(j, k) = r_{dx}(k), k = 1, 2, ..., p
  • This system of equations is known as the
    Wiener-Hopf equations. Its solution yields the
    optimum weights for the Wiener filter (spatial
    filter)
  • Solving the Wiener-Hopf equations requires
    inverting the autocorrelation matrix r_x(j, k),
    which can be computationally expensive; a direct
    solution is sketched below
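A minimal sketch of the direct Wiener-Hopf solution, assuming NumPy and a synthetic stationary environment (w_true, the sample count, and the noise level are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stationary environment: inputs x(n) in R^p and a desired
# response d(n) produced by an unknown linear system plus noise.
p = 4
w_true = rng.standard_normal(p)               # hypothetical "true" filter
X = rng.standard_normal((1000, p))            # row n is the input vector x(n)
d = X @ w_true + 0.1 * rng.standard_normal(1000)

# Estimate r_x(j, k) = E[x_j x_k] and r_dx(k) = E[d x_k] by sample averages
R_x = X.T @ X / len(X)                        # p x p autocorrelation matrix
r_dx = X.T @ d / len(X)                       # cross-correlation vector

# Wiener-Hopf: sum_j w_oj r_x(j, k) = r_dx(k), i.e. R_x w_o = r_dx
w_o = np.linalg.solve(R_x, r_dx)              # solve, don't invert explicitly
```

np.linalg.solve avoids forming the explicit inverse, but the cubic cost of solving the p x p system is the expense the slide refers to.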

8
Method of Steepest Descent (1)
9
Method of Steepest Descent (2)
  • Iteratively move in the direction of steepest
    descent (opposite the gradient direction) until
    the minimum is reached approximately
  • Let w_k(n) be the weight at iteration n. Then the
    gradient at iteration n is
  • \nabla_{w_k} J(n) = -r_{dx}(k) + \sum_{j=1}^{p} w_j(n) r_x(j, k)
  • The adjustment applied to w_k(n) at iteration n is
    given by
  • \Delta w_k(n) = w_k(n+1) - w_k(n) = -\eta \nabla_{w_k} J(n)
  • \eta = positive learning-rate parameter (a sketch
    of the iteration follows)
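A minimal steepest-descent sketch (illustrative, assuming NumPy; R_x and r_dx are the known environment statistics, e.g. as estimated above, and eta and n_iter are arbitrary choices):

```python
import numpy as np

def steepest_descent(R_x, r_dx, eta=0.05, n_iter=500):
    """Method of steepest descent for the quadratic cost J.

    Uses the exact gradient -r_dx(k) + sum_j w_j(n) r_x(j, k),
    so the environment statistics R_x and r_dx must be known.
    """
    w = np.zeros(len(r_dx))          # start from w(0) = 0
    for _ in range(n_iter):
        grad = -r_dx + R_x @ w       # gradient of J at iteration n
        w = w - eta * grad           # step opposite the gradient direction
    return w
```

For a sufficiently small eta this converges to the same w_o as the direct Wiener-Hopf solution.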

10
Method of Steepest Descent (3)
  • The cost function J(n) = 0.5 E[e^2(n)] is the
    ensemble average of the squared error at instant n,
    taken over a population of identical filters
  • An identical update rule can be derived when the
    cost function is J = 0.5 \sum_{i=1}^{n} e^2(i)
  • The method of steepest descent requires knowledge
    of the environment. Specifically, the terms
    r_dx(k) and r_x(j, k) must be known
  • What happens in an unknown environment?
  • Use estimates -> the least-mean-square algorithm

11
Least-Mean-Square Algorithm (1)
  • The LMS algorithm is based on instantaneous
    estimates of r_x(j, k) and r_dx(k)
  • r_x(j, k; n) = x_j(n) x_k(n)
  • r_dx(k; n) = x_k(n) d(n)
  • Substituting these estimates, the update rule
    becomes
  • w_k(n+1) = w_k(n) + \eta [x_k(n) d(n)
    - \sum_{j=1}^{p} w_j(n) x_j(n) x_k(n)]
  • w_k(n+1) = w_k(n) + \eta [d(n)
    - \sum_{j=1}^{p} w_j(n) x_j(n)] x_k(n)
  • w_k(n+1) = w_k(n) + \eta [d(n) - y(n)] x_k(n), k = 1, 2, ..., p
  • This is also known as the delta rule or the
    Widrow-Hoff rule (a sketch follows)
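A minimal LMS sketch implementing the delta rule above (assuming NumPy; X, d, and eta are as in the earlier sketches):

```python
import numpy as np

def lms(X, d, eta=0.01):
    """LMS (delta rule): w(n+1) = w(n) + eta * (d(n) - y(n)) * x(n).

    The true correlations are replaced by their instantaneous
    estimates, so no prior knowledge of the environment is needed.
    """
    n_samples, p = X.shape
    w = np.zeros(p)                  # w(0) = 0 suffices
    for n in range(n_samples):
        y = w @ X[n]                 # filter output y(n)
        e = d[n] - y                 # instantaneous error e(n)
        w = w + eta * e * X[n]       # per-sample stochastic update
    return w
```

Because each update uses only the current sample, LMS can keep adapting as the environment drifts, which is the tracking behavior in the comparison below.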

12
LMS Algorithm (2)
13
LMS vs. Method of Steepest Descent

| LMS | Steepest Descent |
|---|---|
| Can operate in an unknown environment (e.g., starting from w(0) = 0) | Cannot operate in an unknown environment (r_x and r_dx must be known) |
| Can operate in stationary and non-stationary environments (optimum seeking and tracking) | Can operate in a stationary environment only (no adaptation or tracking) |
| Minimizes the instantaneous squared error | Minimizes the mean-square error (or sum of squared errors) |
| Stochastic | Deterministic |
| Approximate | Exact |
14
Adaline (1)
15
Adaline (2)
  • Adaline (adaptive linear element) is an adaptive
    signal processing / pattern classification machine
    that uses the LMS algorithm. It was developed by
    Widrow and Hoff
  • Inputs x are either -1 or +1, the threshold is
    between 0 and 1, and the output is either -1 or +1
  • The LMS algorithm is used to determine the weights.
    Instead of the output y, the net input u is used in
    the error computation, i.e., e = d - u (because y
    is quantized in the Adaline); a sketch follows
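A minimal Adaline sketch (assuming NumPy and bipolar labels d in {-1, +1}; modeling the threshold as a trainable bias b is my own framing, not from the slides):

```python
import numpy as np

def adaline_train(X, d, eta=0.01, epochs=10):
    """Adaline: LMS applied to the net input u = w.x + b.

    The error is computed from u, not from the quantized
    output y = sign(u), which is used only for prediction.
    """
    n_samples, p = X.shape
    w, b = np.zeros(p), 0.0
    for _ in range(epochs):
        for n in range(n_samples):
            u = w @ X[n] + b         # net (pre-quantization) input
            e = d[n] - u             # e = d - u, not d - y
            w = w + eta * e * X[n]
            b = b + eta * e
    return w, b

def adaline_predict(X, w, b):
    return np.where(X @ w + b >= 0.0, 1, -1)   # quantized output in {-1, +1}
```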