Title: Chapter 2 Minimum Variance Unbiased estimation
1Chapter 2 Minimum Variance Unbiased Estimation
2Introduction
- In this chapter we will begin our search for good estimators of unknown deterministic parameters.
- We will restrict our attention to estimators which on the average yield the true parameter value.
- Then, within this class of estimators the goal will be to find the one that exhibits the least variability.
- The estimator thus obtained will produce values close to the true value most of the time.
- The notion of a minimum variance unbiased estimator is examined within this chapter.
3Unbiased Estimators
- For an estimator to be unbiased we mean that on the average the estimator will yield the true value of the unknown parameter.
- Since the parameter value may in general be anywhere in the interval $a < \theta < b$, unbiasedness asserts that no matter what the true value of $\theta$, our estimator will yield it on the average:
$E(\hat{\theta}) = \theta, \quad a < \theta < b$ (2.1)
4Example 2.1 (1/2)
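- Consider the observations $x[n] = A + w[n]$, $n = 0, 1, \ldots, N-1$,
- where $A$ is the parameter to be estimated and $w[n]$ is WGN. The parameter $A$ can take on any value in the interval $-\infty < A < \infty$.
- The reasonable estimator for the average value of $x[n]$ is
$\hat{A} = \frac{1}{N} \sum_{n=0}^{N-1} x[n]$ (2.2)
- or the sample mean.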
5Example 2.1 (2/2)
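- Due to the linearity properties of the expectation operator,
$E(\hat{A}) = E\left[\frac{1}{N} \sum_{n=0}^{N-1} x[n]\right] = \frac{1}{N} \sum_{n=0}^{N-1} E(x[n]) = \frac{1}{N} \sum_{n=0}^{N-1} A = A$
- for all $A$. The sample mean estimator is therefore unbiased.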
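As a rough numerical check of the unbiasedness claim, the sketch below simulates the model $x[n] = A + w[n]$ with Gaussian noise and averages the sample mean over many trials. The function names, the noise standard deviation, the record length, the number of trials, and the seed are illustrative assumptions, not part of the original slides.

```python
import random
import statistics


def sample_mean(x):
    """Sample mean estimator of the DC level A."""
    return sum(x) / len(x)


def average_estimate(A, sigma=1.0, N=10, trials=5000, seed=0):
    """Average the sample mean over many independent data records.

    Each record is x[n] = A + w[n], n = 0..N-1, with w[n] zero-mean
    Gaussian noise of standard deviation sigma (assumed values).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        x = [A + rng.gauss(0.0, sigma) for _ in range(N)]
        estimates.append(sample_mean(x))
    return statistics.fmean(estimates)


if __name__ == "__main__":
    # The average of the estimates should sit close to A for every A tried.
    for A in (-3.0, 0.0, 2.5):
        print(A, round(average_estimate(A), 3))
```

A simulation can only suggest unbiasedness for the values of $A$ that were tried; the expectation argument is what establishes it for all $A$.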
6Unbiased Estimators
- The restriction that $E(\hat{\theta}) = \theta$ for all $\theta$ (2.3) is an important one.
- It is possible that (2.3) may hold for some values of $\theta$ and not others.
7Example 2.2
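- Consider again Example 2.1 but with the modified sample mean estimator $\check{A} = \frac{1}{2N} \sum_{n=0}^{N-1} x[n]$.
- Then, $E(\check{A}) = \frac{1}{2} A$, which equals $A$ if $A = 0$ and differs from $A$ if $A \neq 0$.
- It is seen that (2.3) holds for the modified estimator only for $A = 0$.
- Clearly, it is a biased estimator.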
8Unbiased Estimators
- That an estimator is unbiased does not necessarily mean that it is a good estimator.
- It only guarantees that on the average it will attain the true value.
- A persistent bias, on the other hand, will always result in a poor estimator.
- As an example, the unbiased property has an important implication when several estimators are combined. Suppose several estimates of the same parameter are available, $\{\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_n\}$. A reasonable procedure is to combine these estimates into a better one by averaging them to form $\hat{\theta} = \frac{1}{n} \sum_{i=1}^{n} \hat{\theta}_i$.
9Unbiased Estimators
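- Assuming the estimators are unbiased, with the same variance, and uncorrelated with each other, $E(\hat{\theta}) = \theta$
- and $\mathrm{var}(\hat{\theta}) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{var}(\hat{\theta}_i) = \frac{\mathrm{var}(\hat{\theta}_1)}{n}$,
- so that as more estimates are averaged, the variance will decrease.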
10Unbiased Estimators
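- However, if the estimators are biased, or $E(\hat{\theta}_i) = \theta + b(\theta)$, then $E(\hat{\theta}) = \frac{1}{n} \sum_{i=1}^{n} E(\hat{\theta}_i) = \theta + b(\theta)$,
- and no matter how many estimators are averaged, $\hat{\theta}$ will not converge to the true value.
- Note that, in general, $b(\theta) = E(\hat{\theta}) - \theta$ is defined as the bias of the estimator.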
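The effect of averaging can be illustrated with a small simulation. The sketch below assumes each individual estimate equals the true value plus a fixed bias plus independent zero-mean Gaussian error with a common variance; the function name, default values, and seed are hypothetical choices made only for illustration.

```python
import random
import statistics


def averaged_estimator_stats(theta, n, bias=0.0, sigma=1.0, trials=5000, seed=1):
    """Return (mean, variance) of the average of n uncorrelated estimates.

    Assumed model: estimate_i = theta + bias + e_i, with e_i independent
    zero-mean Gaussian errors of standard deviation sigma.
    """
    rng = random.Random(seed)
    averages = []
    for _ in range(trials):
        estimates = [theta + bias + rng.gauss(0.0, sigma) for _ in range(n)]
        averages.append(sum(estimates) / n)
    return statistics.fmean(averages), statistics.pvariance(averages)


if __name__ == "__main__":
    for n in (1, 4, 16):
        mean_u, var_u = averaged_estimator_stats(theta=1.0, n=n)
        mean_b, var_b = averaged_estimator_stats(theta=1.0, n=n, bias=0.5)
        # Unbiased case: mean stays near theta and variance shrinks like 1/n.
        # Biased case: variance also shrinks, but the mean stays near theta + bias.
        print(n, round(mean_u, 3), round(var_u, 4), round(mean_b, 3), round(var_b, 4))
```

In both cases the variance falls roughly as $1/n$, but only the unbiased average concentrates around the true value.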
12Minimum Variance Criterion
- In searching for optimal estimators we need to adopt some optimality criterion.
- A natural one is the mean square error (MSE), defined as $\mathrm{mse}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2]$.
- Unfortunately, adoption of this natural criterion leads to unrealizable estimators, ones that cannot be written solely as a function of the data.
13Minimum Variance Criterion
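- To understand the problem which arises we first rewrite the MSE as
$\mathrm{mse}(\hat{\theta}) = E\{[(\hat{\theta} - E(\hat{\theta})) + (E(\hat{\theta}) - \theta)]^2\} = \mathrm{var}(\hat{\theta}) + [E(\hat{\theta}) - \theta]^2 = \mathrm{var}(\hat{\theta}) + b^2(\theta)$ (2.6)
- which shows that the MSE is composed of errors due to the variance of the estimator as well as the bias.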
14Minimum Variance Criterion
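- As an example, for the problem in Example 2.1 consider the modified estimator $\check{A} = a \frac{1}{N} \sum_{n=0}^{N-1} x[n]$ for some constant $a$.
- We will attempt to find the $a$ which results in the minimum MSE.
- Since $E(\check{A}) = aA$ and $\mathrm{var}(\check{A}) = a^2 \sigma^2 / N$, we have $\mathrm{mse}(\check{A}) = \frac{a^2 \sigma^2}{N} + (a - 1)^2 A^2$.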
15Minimum Variance Criterion
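- Differentiating the MSE with respect to $a$ yields $\frac{d\,\mathrm{mse}(\check{A})}{da} = \frac{2a\sigma^2}{N} + 2(a - 1)A^2$,
- which upon setting to zero and solving yields the optimum value $a_{\mathrm{opt}} = \frac{A^2}{A^2 + \sigma^2/N}$.
- It is seen that the optimal value of $a$ depends upon the unknown parameter $A$. The estimator is therefore not realizable.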
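To make the dependence on $A$ concrete, the sketch below evaluates the MSE of the scaled sample mean, $a^2\sigma^2/N + (a-1)^2A^2$, and the scale factor that minimizes it, $A^2/(A^2 + \sigma^2/N)$. The function names and the numerical values of $A$, $\sigma^2$, and $N$ are illustrative assumptions.

```python
def mse_scaled_mean(a, A, sigma2, N):
    """MSE of the estimator a * (sample mean): variance plus squared bias."""
    variance = a * a * sigma2 / N
    bias = (a - 1.0) * A
    return variance + bias * bias


def optimal_scale(A, sigma2, N):
    """Scale factor minimizing the MSE; note that it requires the unknown A."""
    return A * A / (A * A + sigma2 / N)


if __name__ == "__main__":
    sigma2, N = 1.0, 10
    for A in (0.5, 1.0, 3.0):
        a_opt = optimal_scale(A, sigma2, N)
        # Compare the minimum MSE with the MSE of the unbiased choice a = 1.
        print(A, round(a_opt, 4),
              round(mse_scaled_mean(a_opt, A, sigma2, N), 4),
              round(mse_scaled_mean(1.0, A, sigma2, N), 4))
```

The optimal scale changes with every value of $A$, which is exactly why the minimum MSE estimator cannot be computed from the data alone.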
16Minimum Variance Criterion
- In retrospect the estimator depends upon $A$ since the bias term in (2.6) is a function of $A$.
- It would seem that any criterion which depends on the bias will lead to an unrealizable estimator.
- From a practical viewpoint the minimum MSE estimator needs to be abandoned.
17Minimum Variance Criterion
- An alternative approach is to constrain the bias to be zero and find the estimator which minimizes the variance.
- Such an estimator is termed the minimum variance unbiased (MVU) estimator.
- Note from (2.6) that the MSE of an unbiased estimator is just the variance.
- Minimizing the variance of an unbiased estimator also has the effect of concentrating the PDF of the estimation error about zero.
- The estimation error will therefore be less likely to be large.
18Existence of the Minimum Variance Unbiased
Estimator
- The question arises as to whether an MVU estimator exists, i.e., an unbiased estimator with minimum variance for all $\theta$.
19Example 2.3 (1/3)
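- Assume that we have two independent observations $x[0]$ and $x[1]$ with PDF $x[0] \sim \mathcal{N}(\theta, 1)$ and $x[1] \sim \mathcal{N}(\theta, 1)$ if $\theta \ge 0$, $x[1] \sim \mathcal{N}(\theta, 2)$ if $\theta < 0$.
- The two estimators $\hat{\theta}_1 = \frac{1}{2}(x[0] + x[1])$ and $\hat{\theta}_2 = \frac{2}{3}x[0] + \frac{1}{3}x[1]$
- can easily be shown to be unbiased.
20Example 2.3 (2/3)
- To compute the variances we have that $\mathrm{var}(\hat{\theta}_1) = \frac{1}{4}[\mathrm{var}(x[0]) + \mathrm{var}(x[1])]$ and $\mathrm{var}(\hat{\theta}_2) = \frac{4}{9}\mathrm{var}(x[0]) + \frac{1}{9}\mathrm{var}(x[1])$,
- so that $\mathrm{var}(\hat{\theta}_1) = 18/36$ if $\theta \ge 0$ and $27/36$ if $\theta < 0$,
- and $\mathrm{var}(\hat{\theta}_2) = 20/36$ if $\theta \ge 0$ and $24/36$ if $\theta < 0$.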
- Clearly, between these two estimators no MVU estimator exists, since neither has the smaller variance for every value of $\theta$.
- No single estimator can have a variance uniformly less than or equal to the minima.
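The comparison in Example 2.3 can be reproduced with exact arithmetic. The sketch below assumes the usual form of this example: $x[0]$ has variance 1, $x[1]$ has variance 1 when $\theta \ge 0$ and 2 when $\theta < 0$, and the two estimators use the weights $(1/2, 1/2)$ and $(2/3, 1/3)$. These values, like the function names, are assumptions made for illustration because the slide equations did not survive.

```python
from fractions import Fraction


def linear_estimator_variance(weights, variances):
    """Variance of sum_i w_i * x_i for independent observations x_i."""
    return sum(w * w * v for w, v in zip(weights, variances))


# Assumed weights of the two unbiased estimators (each pair sums to one).
W1 = (Fraction(1, 2), Fraction(1, 2))
W2 = (Fraction(2, 3), Fraction(1, 3))


def variances_for(theta):
    """Return (var of estimator 1, var of estimator 2) for a given theta."""
    # Assumed observation variances: x[1] is noisier when theta is negative.
    var_x = (Fraction(1), Fraction(1) if theta >= 0 else Fraction(2))
    return (linear_estimator_variance(W1, var_x),
            linear_estimator_variance(W2, var_x))


if __name__ == "__main__":
    for theta in (1.0, -1.0):
        v1, v2 = variances_for(theta)
        better = "estimator 1" if v1 < v2 else "estimator 2"
        print(theta, v1, v2, better)
```

Under these assumptions the better estimator switches with the sign of $\theta$, so neither one is uniformly best.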
22Finding the Minimum Variance Unbiased Estimator
- Even if an MVU estimator exists, we may not be able to find it.
- In the next few chapters we shall discuss several possible approaches. They are:
- Determine the Cramer-Rao lower bound (CRLB) and check to see if some estimator satisfies it (Chapters 3 and 4).
- Apply the Rao-Blackwell-Lehmann-Scheffe (RBLS) theorem (Chapter 5).
- Further restrict the class of estimators to be not only unbiased but also linear. Then, find the minimum variance estimator within this restricted class (Chapter 6).
23Finding the Minimum Variance Unbiased Estimator
- The CRLB allows us to determine that for any unbiased estimator the variance must be greater than or equal to a given value.
- If an estimator exists whose variance equals the CRLB for each value of $\theta$, then it must be the MVU estimator.
24Extension to a Vector Parameter
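- If $\boldsymbol{\theta} = [\theta_1\ \theta_2\ \ldots\ \theta_p]^T$ is a vector of unknown parameters, then we say that an estimator $\hat{\boldsymbol{\theta}} = [\hat{\theta}_1\ \hat{\theta}_2\ \ldots\ \hat{\theta}_p]^T$ is unbiased if
$E(\hat{\theta}_i) = \theta_i, \quad a_i < \theta_i < b_i$ (2.7)
- for $i = 1, 2, \ldots, p$.
- By defining $E(\hat{\boldsymbol{\theta}}) = [E(\hat{\theta}_1)\ E(\hat{\theta}_2)\ \ldots\ E(\hat{\theta}_p)]^T$,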
25Extension to a Vector Parameter
- We can equivalently define an unbiased estimator to have the property $E(\hat{\boldsymbol{\theta}}) = \boldsymbol{\theta}$
- for every $\boldsymbol{\theta}$ contained within the space defined in (2.7).
- An MVU estimator has the additional property that $\mathrm{var}(\hat{\theta}_i)$ for $i = 1, 2, \ldots, p$ is minimum among all unbiased estimators.
26Chapter 3 Cramer-Rao Lower Bound
27Introduction
- A lower bound on the variance of any unbiased estimator lets us assert that an estimator whose variance attains the bound is the MVU estimator.
- Although many such variance bounds exist [McAulay and Hofstetter 1971, Kendall and Stuart 1979, Seidman 1970, Ziv and Zakai 1969], the Cramer-Rao lower bound (CRLB) is the easiest to determine.
283.3 Estimator Accuracy Considerations
- Consider the hidden factors that determine how well we can estimate a parameter.
- The more the PDF is influenced by the unknown parameter, the better we should be able to estimate it.
- Example 3.1 - PDF dependence on unknown parameter: Suppose a single sample is observed as $x[0] = A + w[0]$, where $w[0] \sim \mathcal{N}(0, \sigma^2)$, and it is desired to estimate $A$.
293.3 Estimator Accuracy Considerations
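- Example 3.1 (cont.): A good unbiased estimator is $\hat{A} = x[0]$. The variance is $\sigma^2$, so the estimator accuracy improves as $\sigma^2$ decreases. To see why, compare the PDFs $p_i(x[0]; A) = \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left[-\frac{1}{2\sigma_i^2}(x[0] - A)^2\right]$, $i = 1, 2$, for two noise variances $\sigma_1^2 < \sigma_2^2$, viewed as functions of $A$ for a fixed $x[0]$.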
303.3 Estimator Accuracy Considerations
- Example 3.1 (cont.): The PDF with the larger noise variance shows a much weaker dependence on $A$.
313.3 Estimator Accuracy Considerations
- The sharpness of the likelihood function determines how accurately we can estimate the unknown parameter.
323.3 Estimator Accuracy Considerations
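- For this example, $\ln p(x[0]; A) = -\ln\sqrt{2\pi\sigma^2} - \frac{1}{2\sigma^2}(x[0] - A)^2$ and $-\frac{\partial^2 \ln p(x[0]; A)}{\partial A^2} = \frac{1}{\sigma^2}$; the second derivative does not depend on $x[0]$.
- In general, a more appropriate measure of curvature is $-E\left[\frac{\partial^2 \ln p(x[0]; A)}{\partial A^2}\right]$.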
333.3 Estimator Accuracy Considerations
- This quantity measures the average curvature of the log-likelihood function.
- The expectation is taken with respect to $p(x[0]; A)$, resulting in a function of $A$ only.
- The larger the quantity, the smaller the variance of the estimator.
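The curvature idea can be checked numerically for the single-sample Gaussian case, where the log-likelihood is quadratic in $A$ and its negative second derivative should equal $1/\sigma^2$. The sketch below uses a central finite difference; the function names, the step size, and the sample values are illustrative assumptions.

```python
import math


def log_likelihood(A, x0, sigma2):
    """Log of the Gaussian PDF of one sample x0 = A + w, w ~ N(0, sigma2)."""
    return -0.5 * math.log(2.0 * math.pi * sigma2) - (x0 - A) ** 2 / (2.0 * sigma2)


def curvature(A, x0, sigma2, h=1e-2):
    """Negative second derivative in A, via a central finite difference."""
    second = (log_likelihood(A + h, x0, sigma2)
              - 2.0 * log_likelihood(A, x0, sigma2)
              + log_likelihood(A - h, x0, sigma2)) / (h * h)
    return -second


if __name__ == "__main__":
    x0 = 3.0  # assumed observed value
    for sigma2 in (1.0 / 9.0, 1.0):
        # A smaller noise variance gives a sharper log-likelihood.
        print(sigma2, curvature(A=2.0, x0=x0, sigma2=sigma2), 1.0 / sigma2)
```

Because the curvature here does not depend on the observed sample, taking the expectation changes nothing in this example; in general it does.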
343.4 Cramer-Rao Lower Bound
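- Theorem 3.1 (CRLB, Scalar Parameter): It is assumed that the PDF $p(\mathbf{x}; \theta)$ satisfies the regularity condition $E\left[\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta}\right] = 0$ for all $\theta$. Then, the variance of any unbiased estimator $\hat{\theta}$ must satisfy $\mathrm{var}(\hat{\theta}) \ge \frac{1}{-E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right]}$.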
353.4 Cramer-Rao Lower Bound
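- Theorem 3.1 (cont.): Furthermore, an unbiased estimator attains the bound for all $\theta$ if and only if $\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta} = I(\theta)(g(\mathbf{x}) - \theta)$ for some functions $g$ and $I$. That estimator, which is the MVU estimator, is $\hat{\theta} = g(\mathbf{x})$, and the minimum variance is $1/I(\theta)$.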
363.4 Cramer-Rao Lower Bound
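- Prove that when the CRLB is attained, $\mathrm{var}(\hat{\theta}) = 1/I(\theta)$. Proof: Because the CRLB is attained, $\mathrm{var}(\hat{\theta}) = \frac{1}{-E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right]}$ and $\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta} = I(\theta)(\hat{\theta} - \theta)$.
373.4 Cramer-Rao Lower Bound
- Proof (cont.): Differentiating the latter, we get $\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2} = \frac{\partial I(\theta)}{\partial \theta}(\hat{\theta} - \theta) - I(\theta)$, and then, taking the negative expected value, $-E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right] = -\frac{\partial I(\theta)}{\partial \theta}[E(\hat{\theta}) - \theta] + I(\theta) = I(\theta)$. Finally, $\mathrm{var}(\hat{\theta}) = 1/I(\theta)$.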
383.4 Cramer-Rao Lower Bound
393.4 Cramer-Rao Lower Bound
- Example 3.2 CRLB for Example 3.1
403.4 Cramer-Rao Lower Bound
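- Example 3.3: DC Level in White Gaussian Noise. Consider the multiple observations $x[n] = A + w[n]$, $n = 0, 1, \ldots, N-1$, where $w[n]$ is WGN with variance $\sigma^2$. The PDF is $p(\mathbf{x}; A) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2} \sum_{n=0}^{N-1} (x[n] - A)^2\right]$.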
413.4 Cramer-Rao Lower Bound
423.4 Cramer-Rao Lower Bound
- Example 3.3 (cont.): We see that the sample mean estimator attains the bound, $\mathrm{var}(\hat{A}) \ge \sigma^2/N$, and must therefore be the MVU estimator.
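A short Monte Carlo run can illustrate that the sample mean sits on the bound $\sigma^2/N$ for a DC level in white Gaussian noise. This is only a sketch: the function names, parameter values, trial count, and seed are assumptions, and a simulation approximates the variance rather than proving attainment.

```python
import random
import statistics


def crlb_dc_level(sigma2, N):
    """CRLB for the DC level A from N samples in WGN of variance sigma2."""
    return sigma2 / N


def sample_mean_variance(A, sigma2, N, trials=20000, seed=2):
    """Empirical variance of the sample mean over many simulated records."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    estimates = [
        sum(A + rng.gauss(0.0, sigma) for _ in range(N)) / N
        for _ in range(trials)
    ]
    return statistics.pvariance(estimates)


if __name__ == "__main__":
    A, sigma2, N = 1.0, 4.0, 10
    print("CRLB:", crlb_dc_level(sigma2, N))
    print("empirical variance:", round(sample_mean_variance(A, sigma2, N), 4))
```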
433.4 Cramer-Rao Lower Bound
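- Example 3.4: Phase Estimator. Consider a sinusoid embedded in WGN, $x[n] = A\cos(2\pi f_0 n + \phi) + w[n]$, $n = 0, 1, \ldots, N-1$. $A$ and $f_0$ are assumed known, and we wish to estimate the phase $\phi$.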
443.4 Cramer-Rao Lower Bound
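- Example 3.4 (cont.): So we get, approximately and for $f_0$ not near 0 or 1/2, $\mathrm{var}(\hat{\phi}) \ge \frac{2\sigma^2}{NA^2}$.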
453.4 Cramer-Rao Lower Bound
- Example 3.4 (cont.): In this example the condition for the bound to be attained is not satisfied. Hence, a phase estimator which is unbiased and attains the CRLB does not exist.
- But an MVU estimator may still exist.
463.4 Cramer-Rao Lower Bound
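- Efficiency vs. minimum variance: an estimator which is unbiased and attains the CRLB is said to be efficient. An MVU estimator may or may not be efficient.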
473.4 Cramer-Rao Lower Bound
- Fisher information properties
- Nonnegative
- Additive for independent observations
483.4 Cramer-Rao Lower Bound
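- The latter property leads to the result that the CRLB for $N$ IID observations is $1/N$ times that for one observation.
- For completely dependent samples, for example $x[0] = x[1] = \cdots = x[N-1]$, the Fisher information of the whole record equals that of a single sample, so additional observations carry no information and the CRLB does not decrease with $N$.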
493.5 General CRLB for Signals in White Gaussian
Noise
503.5 General CRLB for Signals in White Gaussian
Noise
513.5 General CRLB for Signals in White Gaussian
Noise
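- Example 3.5: Sinusoidal Frequency Estimation. Assume $s[n; f_0] = A\cos(2\pi f_0 n + \phi)$, $0 < f_0 < 1/2$, where $A$ and the phase $\phi$ are known. So we get the CRLB $\mathrm{var}(\hat{f}_0) \ge \frac{\sigma^2}{A^2 \sum_{n=0}^{N-1} [2\pi n \sin(2\pi f_0 n + \phi)]^2}$.
- If $f_0 \to 0$ with $\phi = 0$, the CRLB goes to infinity, since a slight change in frequency then barely alters the signal.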
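The frequency bound can be evaluated directly from the general result for a signal in white Gaussian noise, $\mathrm{var}(\hat{\theta}) \ge \sigma^2 / \sum_n (\partial s[n;\theta]/\partial\theta)^2$, with $\partial s/\partial f_0 = -2\pi n A \sin(2\pi f_0 n + \phi)$. The sketch below is a minimal illustration; the function name and the default values of $A$, $\phi$, $\sigma^2$, and $N$ are assumptions.

```python
import math


def crlb_frequency(f0, A=1.0, phi=0.0, sigma2=1.0, N=10):
    """CRLB on the variance of an unbiased estimator of f0 (cycles/sample).

    Uses sigma2 / sum_n (d s[n] / d f0)^2 for s[n] = A cos(2 pi f0 n + phi).
    """
    sensitivity = sum(
        (2.0 * math.pi * n * A * math.sin(2.0 * math.pi * f0 * n + phi)) ** 2
        for n in range(N)
    )
    return sigma2 / sensitivity


if __name__ == "__main__":
    # With phi = 0 the bound grows without limit as f0 approaches zero.
    for f0 in (0.001, 0.01, 0.1, 0.25):
        print(f0, crlb_frequency(f0))
```

The function divides by zero at exactly $f_0 = 0$ with $\phi = 0$, which is the numerical counterpart of the bound going to infinity.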
523.5 General CRLB for Signals in White Gaussian
Noise
533.6 Transformation of Parameters
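- Often, the parameter we wish to estimate is a function of some more fundamental parameter.
- In Example 3.3, we may wish to estimate $A^2$. Knowing the CRLB for $A$, we can easily obtain it for $A^2$.
- As shown in Appendix 3A, if it is desired to estimate $\alpha = g(\theta)$, then the CRLB is $\mathrm{var}(\hat{\alpha}) \ge \frac{(\partial g/\partial \theta)^2}{-E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right]}$.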
543.6 Transformation of Parameters
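- For the present example this becomes $\alpha = g(A) = A^2$ and $\mathrm{var}(\widehat{A^2}) \ge \frac{(2A)^2}{N/\sigma^2} = \frac{4A^2\sigma^2}{N}$.
- In Example 3.3, the sample mean estimator $\bar{x}$ was efficient for $A$. It might be supposed that $\bar{x}^2$ is efficient for $A^2$.
- But actually, $\bar{x}^2$ is not even an unbiased estimator!
- Proof: Since $\bar{x} \sim \mathcal{N}(A, \sigma^2/N)$, $E(\bar{x}^2) = E^2(\bar{x}) + \mathrm{var}(\bar{x}) = A^2 + \frac{\sigma^2}{N} \neq A^2$.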
553.6 Transformation of Parameters
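- The efficiency of an estimator is destroyed by a nonlinear transformation.
- But the efficiency is maintained for linear (actually affine) transformations.
- Proof: Assume that an efficient estimator for $\theta$ exists and is given by $\hat{\theta}$. It is desired to estimate $g(\theta) = a\theta + b$.
- We choose $\widehat{g(\theta)} = g(\hat{\theta}) = a\hat{\theta} + b$. Then $E(a\hat{\theta} + b) = aE(\hat{\theta}) + b = a\theta + b = g(\theta)$,
- so that $\widehat{g(\theta)}$ is unbiased.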
563.6 Transformation of Parameters
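- The CRLB for $g(\theta)$ is $\mathrm{var}(\widehat{g(\theta)}) \ge \frac{(\partial g/\partial \theta)^2}{I(\theta)} = a^2\,\mathrm{var}(\hat{\theta})$, since $\hat{\theta}$ is efficient.
- But $\mathrm{var}[g(\hat{\theta})] = \mathrm{var}(a\hat{\theta} + b) = a^2\,\mathrm{var}(\hat{\theta})$, so that the CRLB is achieved.
- So, the efficiency is maintained for linear transformations.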
573.6 Transformation of Parameters
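- The efficiency is approximately maintained over nonlinear transformations if the data record is large enough.
- Example: estimating $A^2$ by $\bar{x}^2$. Although $\bar{x}^2$ is biased, we note that $E(\bar{x}^2) = A^2 + \sigma^2/N \to A^2$, so $\bar{x}^2$ is asymptotically unbiased, or unbiased as $N \to \infty$.
- Since $\bar{x} \sim \mathcal{N}(A, \sigma^2/N)$, we can evaluate the variance $\mathrm{var}(\bar{x}^2) = E(\bar{x}^4) - E^2(\bar{x}^2)$.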
583.6 Transformation of Parameters
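- Using the result that if $\xi \sim \mathcal{N}(\mu, \sigma^2)$, then $E(\xi^2) = \mu^2 + \sigma^2$ and $E(\xi^4) = \mu^4 + 6\mu^2\sigma^2 + 3\sigma^4$,
- therefore $\mathrm{var}(\xi^2) = E(\xi^4) - E^2(\xi^2) = 4\mu^2\sigma^2 + 2\sigma^4$.
- For our problem, we have $\mathrm{var}(\bar{x}^2) = \frac{4A^2\sigma^2}{N} + \frac{2\sigma^4}{N^2}$.
- Hence, as $N \to \infty$, the variance approaches the CRLB $4A^2\sigma^2/N$, and $\bar{x}^2$ is an asymptotically efficient estimator of $A^2$.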
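Asymptotic efficiency can be made concrete by taking the ratio of the variance of the squared sample mean, $4A^2\sigma^2/N + 2\sigma^4/N^2$, to the CRLB for $A^2$, $4A^2\sigma^2/N$. The ratio simplifies to $1 + \sigma^2/(2NA^2)$ and tends to 1 as $N$ grows. The function names and numerical values below are illustrative assumptions.

```python
def crlb_a_squared(A, sigma2, N):
    """CRLB for estimating A**2 from N samples of a DC level in WGN."""
    return 4.0 * A * A * sigma2 / N


def var_squared_sample_mean(A, sigma2, N):
    """Exact variance of (sample mean)**2 when the sample mean is N(A, sigma2/N)."""
    return 4.0 * A * A * sigma2 / N + 2.0 * sigma2 ** 2 / N ** 2


def efficiency_ratio(A, sigma2, N):
    """Variance divided by the CRLB; equals 1 + sigma2 / (2 N A**2)."""
    return var_squared_sample_mean(A, sigma2, N) / crlb_a_squared(A, sigma2, N)


if __name__ == "__main__":
    for N in (1, 10, 100, 1000):
        print(N, efficiency_ratio(A=1.0, sigma2=1.0, N=N))
```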
593.6 Transformation of Parameters
- This situation occurs due to the statistical linearity of the transformation, as illustrated in the figure. As $N$ increases, the PDF of $\bar{x}$ becomes more concentrated about the mean $A$.
603.6 Transformation of Parameters
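- If we linearize $g$ about $A$, we have the approximation $g(\bar{x}) \approx g(A) + \frac{dg(A)}{dA}(\bar{x} - A)$.
- Within this approximation, $E[g(\bar{x})] = g(A) = A^2$, so the estimator is unbiased (asymptotically).
- Also, $\mathrm{var}[g(\bar{x})] = \left[\frac{dg(A)}{dA}\right]^2 \mathrm{var}(\bar{x}) = \frac{(2A)^2\sigma^2}{N} = \frac{4A^2\sigma^2}{N}$, so the estimator achieves the CRLB (asymptotically).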
613.7 Extension to a Vector Parameter
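- Now we extend the results to a vector parameter $\boldsymbol{\theta} = [\theta_1\ \theta_2\ \ldots\ \theta_p]^T$. As derived in Appendix 3B, the CRLB is found as the $[i, i]$ element of the inverse of a matrix, $\mathrm{var}(\hat{\theta}_i) \ge [\mathbf{I}^{-1}(\boldsymbol{\theta})]_{ii}$,
- where $\mathbf{I}(\boldsymbol{\theta})$ is the $p \times p$ Fisher information matrix, defined by $[\mathbf{I}(\boldsymbol{\theta})]_{ij} = -E\left[\frac{\partial^2 \ln p(\mathbf{x}; \boldsymbol{\theta})}{\partial \theta_i\, \partial \theta_j}\right]$
- for $i = 1, 2, \ldots, p$ and $j = 1, 2, \ldots, p$.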
623.7 Extension to a Vector Parameter
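- Example 3.6: DC Level in White Gaussian Noise
- We now extend Example 3.3, $x[n] = A + w[n]$, $n = 0, 1, \ldots, N-1$,
- to the case where in addition to $A$ the noise variance $\sigma^2$ is also unknown.
- The parameter vector is $\boldsymbol{\theta} = [A\ \sigma^2]^T$, hence $p = 2$. The $2 \times 2$ Fisher information matrix is
$\mathbf{I}(\boldsymbol{\theta}) = \begin{bmatrix} -E\left[\frac{\partial^2 \ln p(\mathbf{x}; \boldsymbol{\theta})}{\partial A^2}\right] & -E\left[\frac{\partial^2 \ln p(\mathbf{x}; \boldsymbol{\theta})}{\partial A\, \partial \sigma^2}\right] \\ -E\left[\frac{\partial^2 \ln p(\mathbf{x}; \boldsymbol{\theta})}{\partial \sigma^2\, \partial A}\right] & -E\left[\frac{\partial^2 \ln p(\mathbf{x}; \boldsymbol{\theta})}{\partial (\sigma^2)^2}\right] \end{bmatrix}$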
633.7 Extension to a Vector Parameter
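- The log-likelihood function is $\ln p(\mathbf{x}; \boldsymbol{\theta}) = -\frac{N}{2}\ln 2\pi - \frac{N}{2}\ln \sigma^2 - \frac{1}{2\sigma^2} \sum_{n=0}^{N-1} (x[n] - A)^2$.
- The second derivatives are easily found as $\frac{\partial^2 \ln p}{\partial A^2} = -\frac{N}{\sigma^2}$, $\frac{\partial^2 \ln p}{\partial A\, \partial \sigma^2} = -\frac{1}{\sigma^4} \sum_{n=0}^{N-1} (x[n] - A)$, and $\frac{\partial^2 \ln p}{\partial (\sigma^2)^2} = \frac{N}{2\sigma^4} - \frac{1}{\sigma^6} \sum_{n=0}^{N-1} (x[n] - A)^2$.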
643.7 Extension to a Vector Parameter
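- Taking the negative expectation, the Fisher information matrix becomes $\mathbf{I}(\boldsymbol{\theta}) = \begin{bmatrix} \frac{N}{\sigma^2} & 0 \\ 0 & \frac{N}{2\sigma^4} \end{bmatrix}$.
- Although not true in general, for this example the Fisher information matrix is diagonal and hence easily inverted to yield $\mathrm{var}(\hat{A}) \ge \frac{\sigma^2}{N}$ and $\mathrm{var}(\hat{\sigma^2}) \ge \frac{2\sigma^4}{N}$.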
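The vector bound can be computed by building the Fisher information matrix for the parameter vector $[A, \sigma^2]$, with diagonal entries $N/\sigma^2$ and $N/(2\sigma^4)$, inverting it, and reading off the diagonal. The sketch below uses a general 2x2 inverse so that it would also handle a non-diagonal matrix; the function names and numerical values are illustrative assumptions.

```python
def fisher_information_matrix(sigma2, N):
    """2x2 Fisher information matrix for theta = [A, sigma2] (DC level in WGN)."""
    return [[N / sigma2, 0.0],
            [0.0, N / (2.0 * sigma2 ** 2)]]


def invert_2x2(m):
    """Inverse of a 2x2 matrix given as a list of two rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det],
            [-c / det, a / det]]


def crlb_vector(sigma2, N):
    """Return the CRLBs (for A, for sigma2) as the diagonal of the inverse FIM."""
    inverse = invert_2x2(fisher_information_matrix(sigma2, N))
    return inverse[0][0], inverse[1][1]


if __name__ == "__main__":
    bound_A, bound_sigma2 = crlb_vector(sigma2=2.0, N=10)
    print("var(A_hat) >=", bound_A)
    print("var(sigma2_hat) >=", bound_sigma2)
```

Because the matrix is diagonal here, the bound for $A$ is the same as when $\sigma^2$ is known; that need not hold when the off-diagonal terms are nonzero.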