1
Lectures 5 & 6: Least Squares Parameter Estimation
Dr Martin Brown, Room E1k, Control Systems Centre
Email: martin.brown@manchester.ac.uk
Telephone: 0161 306 4672
http://www.eee.manchester.ac.uk/intranet/pg/coursematerial/
2
L5&6 Resources
  • Core texts
  • Ljung, Chapters 4-7
  • Norton, Chapter 4
  • On-line, Chapters 4-5
  • In these two lectures, we're looking at basic
    discrete-time representations of linear,
    time-invariant plants and models, and seeing how
    their parameters can be estimated using the
    normal equations.
  • The key example is the first-order, linear,
    stable RC electrical circuit which we met last
    week, and which has an exponential response.

3
L5&6 Learning Objectives
  • L5: Linear models and quadratic performance
    criterion
  • ARX & ARMAX discrete-time, linear systems
  • Predictive models, regression and exemplar data
  • Residual signal
  • Performance criterion
  • L6: Normal equations, interpretation and
    properties
  • Quadratic cost functions
  • Derive the normal equations for parameter
    estimation
  • Examples
  • We're not too concerned with system dynamics
    today; we're concentrating on the general form of
    least squares parameter estimation

4
Introduction to Parametric System Identification
  • In a full, physical, linear model, the model's
    structure and coefficients can be determined from
    first principles
  • In most cases, we have to estimate/tune the
    parameters because of an incomplete understanding
    of the full system (unknown drag, )
  • We can use exemplar data (input/output examples),
    x(t), y(t), to estimate the unknown parameters
  • Initially assume that the structure is known
    (unrealistic, but ), and all that remains to be
    estimated are the parameter values.


[Block diagram: the input u(t) feeds both the Plant (parameters q,
with noise e(t), v(t) and disturbance w(t)) and the Model (estimated
parameters), producing the measured and predicted outputs y(t) and
ŷ(t) respectively]
5
Recursive Parameter Estimation Framework
[Block diagram: a Controller generates u(t), which drives both the
Plant (parameters q, with noise e(t), v(t) and disturbance w(t)) and
the Model (estimated parameters q(t-1)); the model's prediction is
subtracted from the measured output y(t) to form the error used to
update the estimate]
  • where
  • q, q(t-1) are the real and estimated parameter
    vectors, respectively
  • u(t) is the control input sequence
  • y(t), ŷ(t) are the real and estimated outputs,
    respectively
  • e(t) is a white noise sequence (output/measurement
    noise)
  • w(t) is the disturbance from measurable sources



6
Basic Assumptions in System Identification
  • It is assumed that the unobservable disturbances
    can be aggregated and represented by a single
    additive noise e(t). There may also be input
    noise. Generally, it is assumed to be zero-mean,
    Gaussian
  • The system is assumed to be linear with
    time-invariant parameters, so q is not
    time-varying. This is only approximately true
    within certain limits
  • The input signal u(t) is assumed exactly known.
    Often there is noise associated with
    reading/measuring it
  • The system noise e(t) is assumed to be
    uncorrelated with the input process u(t). This
    is unlikely to be true, for instance due to
    feedback of y(t)
  • The input signals need to be sufficiently
    exciting; they need to excite all relevant modes
    in the model for identification and testing

7
Discrete-Time Transfer Function Models
  • On this course, we're primarily concerned with
    discrete-time signals and systems.
  • Real-world physical, mechanical, electrical
    systems are continuous
  • Consider the CT resistor-capacitor circuit
  • So let q⁻¹ denote the backward shift operator,
    q⁻¹y(t) = y(t-1); then we have
  • NB we can use the c2d() Matlab function to go
    from the continuous-time (transfer function,
    state space) domain to the discrete-time,
    z-domain.
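As a sketch of the resulting difference equation: a forward-Euler discretisation of the CT circuit RC·dy/dt + y = u with sample time D gives y(t) = (1 − D/RC)·y(t-1) + (D/RC)·u(t-1), which matches the D/RC = 0.5 model used later on Slide 18. (Matlab's c2d with its default zero-order hold would give slightly different coefficients; this is purely illustrative.)

```python
# Forward-Euler discretisation of the RC circuit RC*dy/dt + y = u:
# y(t) = (1 - D/RC)*y(t-1) + (D/RC)*u(t-1), for sample time D.

def simulate_rc(ratio, T, y0=0.0):
    """Step response of y(t) = (1-ratio)*y(t-1) + ratio*u(t-1),
    with a unit step switched on at t = 1 (so u(0) = 0)."""
    y = [y0]
    u = [0.0] + [1.0] * T              # step input: u(0)=0, u(t)=1 for t>=1
    for t in range(1, T + 1):
        y.append((1 - ratio) * y[t - 1] + ratio * u[t - 1])
    return y

y = simulate_rc(0.5, 10)               # D/RC = 0.5, as on Slide 18
print(y[:4])                           # exponential rise towards 1
```

The response y(t) = 1 − 0.5^(t-1) for t ≥ 1 is the exponential step response referred to in the resources slide.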

8
Transfer Function/ARX DT LTI Model
  • The previous model is an example of an
    AutoRegressive with eXogenous input (ARX) model,
    which can be expressed more generally as
  • Some comments about the form of this model:
  • The degree of the polynomials determines the
    complexity of the system's response and the
    number of parameters that have to be estimated.
    The roots of A(q) determine system stability
  • a0 = 1, without loss of generality, so the model
    can be written as a predictive model for y(t) in
    terms of y(t-1), …, u(t-1), …
  • b0 = 0, as it is assumed that an input cannot
    instantly affect the output, and so there must be
    at least a delay of one time instant between u and
    y (assumes a fast enough sample time, relative to
    the system dynamics).
  • Typically e ~ N(0, σ²), independent and
    identically distributed
  • Close relationship between the q-shift and the
    z-transform
  • When n = 0, this produces a finite impulse response

9
Linear Regression
  • The ARX system's prediction model can be
    expressed as
  • Here the model's parameters can be written as
  • Treat the model as a deterministic system
  • This is natural if the error term is considered
    to be insignificant or difficult to guess
  • This denotes the model structure M (linear, time
    invariant, for example), and a particular model
    with a parameter value q is M(q).
  • This can be written as a linear regression
    structure
  • where
  • Parameter vector
  • Input vector
  • The term regression comes from the statistics
    literature and provides a powerful set of
    techniques for determining the parameters and
    interpreting the models. Need access to
    previous outputs y(t-1)

10
LTI DT ARMAX Model
  • A more general discrete-time, linear
    time-invariant model also includes Moving Average
    terms on the error/residual signal
  • Here, we describe the equation error term, e(t),
    as a moving average of white noise (non-iid
    measurement errors)
  • Simple example
  • y(t) = 0.5y(t-1) + 0.3y(t-2) + 1.2u(t-1) −
    0.3u(t-2) + 0.5e(t) + 0.5e(t-1)
  • This can be written as a pseudolinear regression
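The simple example above can be simulated directly as a difference equation. This is a sketch in stdlib Python; the `=` and `+` operators in the equation were lost in the slide text, so the reconstructed signs here are assumptions, and the noise level and input are arbitrary choices for illustration.

```python
import random

# Simulating the second-order ARMAX example:
# y(t) = 0.5 y(t-1) + 0.3 y(t-2) + 1.2 u(t-1) - 0.3 u(t-2)
#        + 0.5 e(t) + 0.5 e(t-1)       (signs reconstructed)

random.seed(0)
T = 50
u = [1.0] * T                                    # unit step input
e = [random.gauss(0.0, 0.1) for _ in range(T)]   # white noise sequence
y = [0.0, 0.0]                                   # system initially at rest
for t in range(2, T):
    y.append(0.5 * y[t - 1] + 0.3 * y[t - 2]
             + 1.2 * u[t - 1] - 0.3 * u[t - 2]
             + 0.5 * e[t] + 0.5 * e[t - 1])

# Steady-state gain under these coefficients:
# (1.2 - 0.3) / (1 - 0.5 - 0.3) = 4.5, plus coloured noise
print(y[-1])
```

The 0.5e(t) + 0.5e(t-1) term is the moving average of white noise: the equation error is correlated across adjacent samples, which is exactly what distinguishes ARMAX from ARX.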

11
Exemplar Training Data
  • To estimate the unknown parameters q, we need to
    collect some exemplar input-output data; system
    identification is then a process of estimating
    the parameter values that best fit the data.
  • The data is generated by a system of noisy ARX
    linear equations of the form
  • where
  • y is a column vector of measured plant outputs
    (T × 1)
  • X is a matrix of input regressors (T × (n+m))
  • q is the true parameter vector ((n+m) × 1)
  • e is the error vector (T × 1)
  • Each row of X represents a single input/output
    sample. Each column of X represents a
    time-delayed output or input.
  • Note that there is a burn-in period to measure
    the time-delayed outputs y(1), y(2), …, which are
    necessary to form the inputs to the time-delayed
    vector
  • y(1), …, y(t-n)

12
Example Data for 1st Order ARX Model
  • 1st order model representation
  • First order plant model (exponential decay) with
    no external disturbances; the measurement
    noise is additive (Slide 7)
  • Input vector, output signal and parameters
  • At time t, the 1st order DT model is represented
    as
  • Output: y(t)
  • Input: x(t) = [y(t-1) u(t-1)]
  • Parameters: q = [q1 q2]
  • Data
  • As there are two parameters, if the system is
    truly first order and there is no measurement
    noise on any of the signals, we just need two
    (linearly independent) samples to estimate q.
  • If there is measurement noise in y(t), we need to
    collect more data to reduce the effect of the
    random noise.
  • Store X = [y(1) u(1); y(2) u(2); y(3) u(3); …],
    y = [y(2); y(3); y(4); …]
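The storage step above can be sketched in 0-indexed Python (the slide uses 1-indexed Matlab conventions); `build_regressors` is a hypothetical helper, not from the course material:

```python
# Form the regression data X, y for the 1st-order ARX model
# y(t) = q1*y(t-1) + q2*u(t-1): each row of X is [y(t-1), u(t-1)]
# and the corresponding target is y(t).

def build_regressors(y, u):
    X = [[y[t - 1], u[t - 1]] for t in range(1, len(y))]
    targets = y[1:]
    return X, targets

y = [0.0, 0.0, 0.5, 0.75]          # step response with y(0)=0, u(0)=0
u = [0.0, 1.0, 1.0, 1.0]
X, yv = build_regressors(y, u)
print(X)                           # [[0.0, 0.0], [0.0, 1.0], [0.5, 1.0]]
print(yv)                          # [0.0, 0.5, 0.75]
```

The first row is the burn-in sample mentioned on the previous slide: it exists only to supply the time-delayed output needed by later rows.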

13
Prediction Residual Signal
  • The residual signal (measured − predicted) is
    defined as
  • and can be represented as
  • A simple regression interpretation is
    (each x represents an exemplar
    sample from a single-input,
    single-output system)

[Block diagram: the input x(t) feeds both the Plant (parameters q,
with noise e(t)) and the Model (estimated parameters); the model's
prediction is subtracted from the output measurement y(t) to give the
residual r(t). A scatter plot shows the exemplar samples (x) around
the fitted regression line]
14
Measures of Model Goodness
  • The model's response can be expressed as
  • ŷ(t) = xT(t)q
  • where q is the model's estimated parameter vector
    and x(t) is the input vector
  • If ŷ(t) = y(t), the model's response is correct for
    that single time sample. The residual
    r(t) = y(t) − ŷ(t) is zero. The residual's magnitude
    gives us an idea of the goodness of the
    parameter vector estimate for that data point.
  • For a set of measured outputs and predictions
    {y(t), ŷ(t)}, the size of the residual vector
    r = y − ŷ is an estimate of the parameter goodness
  • We can determine the size by looking at the norm
    of r.








15
Residual Norm Measures
  • A vector p-norm (of a vector r) is defined by
    ‖r‖p = (Σi |r(i)|^p)^(1/p)
  • The most common p-norm is the 2-norm,
    ‖r‖2 = √(Σi r(i)²)
  • The vector p-norm has the properties that
  • ‖r‖ ≥ 0
  • ‖r‖ = 0 iff r = 0
  • ‖kr‖ = |k| ‖r‖
  • ‖r1 + r2‖ ≤ ‖r1‖ + ‖r2‖
  • For the residual vector, the norm is only zero if
    all the residuals are zero. Otherwise, a small
    norm means that, on average, the individual
    residuals are small in magnitude.
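A minimal implementation of the p-norm, with a quick check of two of the listed properties:

```python
# Vector p-norm: ||r||_p = (sum_i |r_i|^p)^(1/p)

def p_norm(r, p=2):
    return sum(abs(ri) ** p for ri in r) ** (1.0 / p)

r = [3.0, -4.0]
print(p_norm(r, 1))    # 7.0  (sum of magnitudes)
print(p_norm(r, 2))    # 5.0  (Euclidean length)

# Properties: the norm of the zero vector is 0, and scaling by k
# scales the norm by |k|.
assert p_norm([0.0, 0.0]) == 0.0
assert abs(p_norm([-2 * x for x in r]) - 2 * p_norm(r)) < 1e-12
```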

16
Sum of Squared Residuals
  • The most common discrete-time performance index
    is the sum of squared residuals (2-norm squared)
  • For each data point, the model's output is
    compared against the plant's; the error is squared
    and summed over the remaining points.
  • Any non-zero value for any of the residuals
    will mean that the performance index is positive
  • The performance function f(q) is a function of
    the parameter values, because some parameter
    values will cause large residuals, and others will
    cause small residuals.
  • We want the parameter values that minimize f(q)
    (≥ 0).
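The SSE can be computed directly from exemplar data. A sketch, using noise-free data from the circuit model y(t) = 0.5y(t-1) + 0.5u(t-1) of Slide 18, so the true parameters give zero cost and any others give a positive one:

```python
# SSE performance index f(q): sum of squared residuals between the
# measured outputs y(t) and the model predictions x(t)'q.

def sse(theta, X, y):
    total = 0.0
    for x_t, y_t in zip(X, y):
        y_hat = sum(xi * qi for xi, qi in zip(x_t, theta))
        total += (y_t - y_hat) ** 2
    return total

# Noise-free step-response data: rows are [y(t-1), u(t-1)]
X = [[0.0, 1.0], [0.5, 1.0], [0.75, 1.0]]
y = [0.5, 0.75, 0.875]
print(sse([0.5, 0.5], X, y))    # 0.0 at the true parameters
print(sse([0.4, 0.4], X, y))    # positive everywhere else
```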

17
Relationship between Noise & Residual
  • The aim of parameter estimation is to estimate
    the values of q that minimize this performance
    index (sum squared residuals or errors, SSE).
  • When the model can predict the plant exactly
  • r(t) = e(t)
  • The residual signal is equal to the additive
    noise signal
  • Note that the SSE is often replaced by the mean
    squared error, MSE, defined by
  • MSE = SSE/T ≈ σ² (the variance of the additive
    noise signal)
  • This is the variance of the residual signal.
  • This simply represents the average squared
    error and ensures that the performance function
    does not depend on the amount of data
  • Example: when we have 1000 repeated trials (step
    responses) of 9 data points for the DT electric
    circuit, with additive noise N(0, 0.01)
  • MSE = ‖r‖2²/T = 0.0103 ≈ σ²
  • RMSE = 0.1015 ≈ σ

18
Example: DT RC Electrical Circuit
  • Consider the DT, first order, LTI representation
    of the RC circuit, which is an ARX model (Slides 7
    & 12)
  • Assume that D/RC = 0.5, then
  • y(t) = 0.5y(t-1) + 0.5u(t-1)
  • Here the system is initially at rest, y(0) = 0.
    Note that u here refers to a step signal which is
    switched on at t = 1 (u(0) = 0), rather than the
    control signal
  • Assume that 10 steps are taken; we collect 9 data
    points for system identification
  • >> X = [y(1:end-1) u(1:end-1)];
  • >> y1 = y(2:end);
  • Gaussian random noise of standard error 0.05 was
    also added to y1
  • >> y1e = y1 + 0.05*randn(size(y1));
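The data-generation commands above can be mirrored in stdlib Python, as a sketch. Note that Python's `random.seed` is not equivalent to Matlab's `randn('state', …)`, so the noise realisation (and hence the estimates) will differ from the slides:

```python
import random

# Simulate the noise-free circuit, build the regressor matrix X and
# target vector y1, then add Gaussian noise of std 0.05 to the outputs.

random.seed(123456)                 # any fixed seed for repeatability

T = 10
u = [0.0] + [1.0] * (T - 1)         # step switched on at t = 1
y = [0.0]                           # y(0) = 0, system at rest
for t in range(1, T):
    y.append(0.5 * y[t - 1] + 0.5 * u[t - 1])

X = [[y[t], u[t]] for t in range(T - 1)]    # X = [y(1:end-1) u(1:end-1)]
y1 = y[1:]                                  # y1 = y(2:end)
y1e = [yt + random.gauss(0.0, 0.05) for yt in y1]   # noisy outputs
print(len(X), len(y1e))             # 9 exemplars, as on the slide
```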

19
Example: Noisy Electric Circuit
NB randn('state', 123456)
  • Note here, we're cheating a bit by assuming the
    exact measurement y(t-1) is available at the
    model's input, but only the noisy measurement
    ye(t) is available at the model's output.
  • NB, in these notes, y() generally denotes the
    noisy output

20
Parameter Estimation
  • An important part of system identification is
    being able to estimate the parameters of a linear
    model, when a quadratic performance function is
    used to measure the model's goodness.
  • This produces the well-known normal equations for
    least squares estimation
  • This is a closed-form solution
  • Efficiently and robustly solved (in Matlab)
  • Permits a statistical interpretation
  • Can be solved recursively
  • Investigated over the next 3-4 lectures

21
Noise-free Parameter Determination
  • Parameter estimation works by assuming a
    plant/model structure, which is taken to be
    exactly known.
  • If there are n+m parameters in the model, we can
    collect n+m pieces of data (linearly independent,
    to ensure that the input/data matrix, X, is
    invertible)
  • Xq = y
  • and invert the matrix to find the exact parameter
    values
  • q = X⁻¹y
  • In Matlab, both of the following forms are
    equivalent
  • theta = inv(X)*y
  • theta = X\y
  • theta = [0.5; 0.5] (previous example)
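For the two-parameter, noise-free case, q = X⁻¹y can be written out with an explicit 2×2 inverse. This is a sketch; Matlab's `X\y` does the same job, more robustly, for any size:

```python
# Noise-free case with n+m = 2 parameters: two linearly independent
# samples make X square and invertible, so q = inv(X)*y exactly.

def solve_2x2(X, y):
    (a, b), (c, d) = X
    det = a * d - b * c             # non-zero iff the two samples are
    assert det != 0                 # linearly independent
    return [(d * y[0] - b * y[1]) / det,
            (a * y[1] - c * y[0]) / det]

# Two samples from y(t) = 0.5 y(t-1) + 0.5 u(t-1):
X = [[0.0, 1.0],                    # [y(1) u(1)]
     [0.5, 1.0]]                    # [y(2) u(2)]
y = [0.5, 0.75]                     # [y(2); y(3)]
print(solve_2x2(X, y))              # [0.5, 0.5], the true parameters
```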

22
Linear Model and Quadratic Performance
  • When the model is linear and the data is noisy
    (missing inputs, unmeasurable disturbances), the
    Sum Squared Error (SSE) performance index can be
    expressed as
  • This expression is quadratic in q. Typically
  • size(X,1) >> size(X,2)
  • It is of the form (for 2 inputs/parameters)
  • The equivalent system of linear equations Xq = y
    is inconsistent

23
Quadratic Matrix Representation
  • This can also be expressed in matrix form
  • The general form for a quadratic is
  • where

Hessian/covariance matrix
Cross-correlation vector
24
Normal Equations for a Linear Model
  • When the parameter vector is optimal
  • For a quadratic MSE with a linear model
  • At optimality
  • In Matlab, the normal equations are
  • thetaHat = inv(X'*X)*X'*y
  • thetaHat = pinv(X)*y
  • thetaHat = X\y
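A hand-rolled sketch of the normal equations for the two-parameter case, forming the Hessian X'X and the cross-correlation vector X'y explicitly and inverting the 2×2 Hessian by hand (in practice Matlab's `pinv` or backslash are numerically preferable):

```python
# Normal equations thetaHat = inv(X'X) X'y for a 2-parameter model.

def normal_equations(X, y):
    # Hessian/covariance matrix X'X (symmetric 2x2) ...
    h11 = sum(x[0] * x[0] for x in X)
    h12 = sum(x[0] * x[1] for x in X)
    h22 = sum(x[1] * x[1] for x in X)
    # ... and cross-correlation vector X'y
    g1 = sum(x[0] * yt for x, yt in zip(X, y))
    g2 = sum(x[1] * yt for x, yt in zip(X, y))
    det = h11 * h22 - h12 * h12
    return [(h22 * g1 - h12 * g2) / det,
            (h11 * g2 - h12 * g1) / det]

# Overdetermined, noise-free data from y(t) = 0.5 y(t-1) + 0.5 u(t-1):
X = [[0.0, 1.0], [0.5, 1.0], [0.75, 1.0]]
y = [0.5, 0.75, 0.875]
print(normal_equations(X, y))       # recovers [0.5, 0.5]
```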

25
Example 1: 2-Parameter Model
  • Data: 3 data points and 2 unknowns

Find Least Squares solution to
Form variance/covariance matrix and cross
correlation vector
Invert variance/covariance matrix
Least squares solution
26
Example 2: Electrical Circuit ARX Model
See Slides 7, 12, 18 & 19
  • 9 exemplars and 2 parameters. Additive
    measurement noise

Hessian (variance/covariance) matrix and
cross-correlation vector
Inverse Hessian matrix
Least squares solution
NB randn('state', 123456)
27
Investigation into the Performance Function
  • We can plot the performance index against
    different parameter values in a model
  • As shown earlier, f() is a quadratic function of
    q
  • It is centred at q̂, i.e. f(q̂) = min f(q)
  • The shape (contours) depends on the Hessian
    matrix X'X; this influences the ability to
    identify the plant. See next lectures



28
L5&6 Summary
  • ARX and ARMAX discrete-time linear models are
    widely used
  • System identification is being considered simply
    as parameter estimation
  • The residual vector is used to assess the quality
    of the model (parameter vector)
  • The sum squared error/residual (2-norm) is
    commonly used to measure the residual's size,
    because it can be interpreted as the noise
    variance and because it is analytically
    convenient
  • For a linear model, the SSE is a quadratic
    function of the parameters, which can be
    differentiated to estimate the optimal parameters
    via the normal equations

29
L5&6 Lab
  • Theory
  • Make sure you can derive the normal equations
    (Slides 22-24)
  • Matlab
  • Implement the DT RC circuit simulation (Slide 18),
    so you can perform a least squares parameter
    estimation given noisy data about the electrical
    circuit
  • Set the Gaussian random seed, as per Slide 26, and
    check your estimates are the same
  • Set a different seed and note that the optimal
    parameter values are different
  • Perform the step experiment 10, 100, 1000, …
    times and note that the estimated optimal
    parameter values tend towards the true values of
    [0.5, 0.5].
  • Load the data into the identification toolbox GUI
    and create a first order parametric model with
    model orders [1 1 1]. NB you do not need to
    remove the means from the data (why not?).
    Calculate the model and view the values of the
    parameters and the model fit, as well as checking
    the step response and validating the model.
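The repeated-trials exercise can be sketched in stdlib Python (the seed and trial counts are arbitrary; as on Slide 19, the regressors use the noise-free y(t-1), so the estimates average out to the true values):

```python
import random

# Run the noisy step experiment N times, average the least-squares
# estimates, and watch them approach the true [0.5, 0.5] as N grows.

random.seed(1)

def one_trial():
    T, noise = 10, 0.05
    u = [0.0] + [1.0] * (T - 1)
    y = [0.0]
    for t in range(1, T):
        y.append(0.5 * y[t - 1] + 0.5 * u[t - 1])
    ye = [yt + random.gauss(0.0, noise) for yt in y[1:]]  # noisy outputs
    X = [[y[t], u[t]] for t in range(T - 1)]  # noise-free regressors
    # 2x2 normal equations: thetaHat = inv(X'X) X'ye
    h11 = sum(x[0] ** 2 for x in X); h12 = sum(x[0] * x[1] for x in X)
    h22 = sum(x[1] ** 2 for x in X)
    g1 = sum(x[0] * yt for x, yt in zip(X, ye))
    g2 = sum(x[1] * yt for x, yt in zip(X, ye))
    det = h11 * h22 - h12 ** 2
    return [(h22 * g1 - h12 * g2) / det, (h11 * g2 - h12 * g1) / det]

for N in (10, 100, 1000):
    trials = [one_trial() for _ in range(N)]
    mean = [sum(t[0] for t in trials) / N, sum(t[1] for t in trials) / N]
    print(N, mean)                  # tends towards [0.5, 0.5]
```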