Title: Econometric Methods 1
1. Econometric Methods 1
- Lecture 5: Stochastic regressors and IV estimation
2. The non-stochastic assumption
- Suppose you are conducting trials of a pesticide's effect on harvest.
- Randomly choose which fields you apply pesticide to, and how much.
- Collect data on harvests (y); the dose of pesticide is the X variable helping to explain the size of the harvest from a field.
- Since the spraying of the fields was undertaken by the experimenter, a statistician could then treat the dose as fixed.
3. The non-stochastic assumption
- Now think of walking into randomly sampled households.
- Ask them how many boxes of breakfast cereal they keep, how many pints of milk are in the refrigerator, how many people are in the house, and particularly how many children are in the house.
- All of these quantities are things you, the investigator, didn't know before you performed the interview, and all could be correlated with holdings of breakfast cereal.
- For all of them you can calculate a mean and a standard deviation, which allow you to say something about the relevant population. In other words, we might treat them all as realisations of random variables.
4. Estimation with stochastic regressors
- Consider modelling cereal holding per household with these variables.
- Run a regression of cereal holding with number of children as an explanatory variable.
- Would it be OK to treat the number of children as if it were a control variable like the pesticide in the field trial example?
- $cereal = \alpha + \beta\, nchild + e$   (1)
- The assumption we require is the independence of nchild and e, or that the conditional expectation of e given nchild equals zero.
- With independence, (1) can be estimated by OLS, and all the desirable properties (unbiasedness) can be asserted.
5. Estimation with stochastic regressors
- Recall that the OLS estimator with non-stochastic x in deviation form is
- $b = \sum xy / \sum x^2 = \beta + \sum xu / \sum x^2$
- And $E(b) = \beta$ because $E(\sum xu) = \sum x\,E(u) = 0$.
- If x is stochastic we can no longer decompose $E(\sum xu)$ so conveniently. Taking expectations:
- $E(b) = \beta + E[\sum xu / \sum x^2]$, where both u and x are random variables.
- Unfortunately, $E[\sum xu / \sum x^2] \neq E[\sum x / \sum x^2] \times E(u) = 0$ unless x and u are independent.
- So OLS will be biased unless x and u are independent.
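The bias claim can be checked by simulation. The following is a minimal sketch, not part of the lecture: the helper names `ols_slope` and `mean_slope`, the true slope of 2, and the way correlation is induced (x = z + rho·u) are all illustrative assumptions.

```python
import random

def ols_slope(x, y):
    """OLS slope in deviation form: sum(x*y) / sum(x^2) after demeaning."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def mean_slope(rho, beta=2.0, n=50, reps=2000, seed=0):
    """Average OLS slope over many samples (hypothetical design, see lead-in)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        u = [rng.gauss(0, 1) for _ in range(n)]
        # x = z + rho*u, so rho = 0 makes x independent of u
        x = [rng.gauss(0, 1) + rho * ui for ui in u]
        y = [beta * xi + ui for xi, ui in zip(x, u)]
        total += ols_slope(x, y)
    return total / reps

independent = mean_slope(rho=0.0)  # close to the true slope of 2
correlated = mean_slope(rho=0.5)   # noticeably above 2
print(independent, correlated)
```

With rho = 0.5 the slope settles near 2 + Cov(x,u)/Var(x) = 2 + 0.5/1.25 = 2.4 rather than 2, which is the upward bias produced by positive correlation between x and u.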
6. Independence can be violated in a number of ways
- Simultaneity bias (reverse causality)
- Suppose having lots of breakfast cereal increases fertility (unlikely!).
- Then there is a second model with nchild as the dependent variable and cereal as an explanatory variable, so nchild contains e!
- The covariance of nchild and e cannot be zero.
7. Independence can be violated in a number of ways
- Measurement error in X
- Suppose that the true number of children is $nchild^*$.
- But we measure this with error: $nchild = nchild^* + v$
- where $v$ is a random error with mean zero, $E(v) = 0$, and it is uncorrelated with the true $nchild^*$.
- True model is $cereal = \alpha + \beta\, nchild^* + e$
- But we estimate $cereal = \alpha + \beta\, nchild + (e - \beta v)$
- X is now correlated with the error term:
- $Cov(nchild, v) = E(nchild \cdot v) = E(nchild^* \cdot v) + E(v^2) = E(v^2) \neq 0$
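8. Measurement error in X
- It can be shown (Wooldridge p322) that
- $plim\ b = \beta \, \dfrac{\sigma^2_{x^*}}{\sigma^2_{x^*} + \sigma^2_v}$
- where $\sigma^2_{x^*}$ is the variance of the true X variable and $\sigma^2_v$ is the variance of the measurement error.
- So the bias in b depends upon the ratio $Var(x^*)/Var(x)$.
- This is sometimes called the signal-to-noise ratio.
- It is less than 1 whenever there is measurement error.
- So b is biased towards zero (attenuation bias).
- If the variance of $x^*$ is large relative to the measurement error then the bias is small.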
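Attenuation bias is easy to see in a simulation. This sketch is illustrative only: the variances (4 for the true regressor, 1 for the measurement error), the true slope of 1.5, and the helper `ols_slope` are assumptions made here, not values from the lecture.

```python
import random

def ols_slope(x, y):
    """OLS slope in deviation form."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

rng = random.Random(1)
n = 20000
beta = 1.5                        # assumed true slope
var_true, var_noise = 4.0, 1.0    # assumed Var(x*) and Var(v)

x_true = [rng.gauss(0, var_true ** 0.5) for _ in range(n)]
x_obs = [xt + rng.gauss(0, var_noise ** 0.5) for xt in x_true]  # x = x* + v
y = [beta * xt + rng.gauss(0, 1) for xt in x_true]              # y depends on x*

b_true = ols_slope(x_true, y)     # regression on the correctly measured x*
b_obs = ols_slope(x_obs, y)       # regression on the mismeasured x
predicted = beta * var_true / (var_true + var_noise)  # beta * Var(x*)/Var(x)
print(b_true, b_obs, predicted)
```

The slope on the mismeasured regressor should sit close to `predicted` (1.5 × 4/5 = 1.2), shrunk towards zero relative to the slope on the true regressor.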
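9. Independence can be violated in a number of ways
- Omitted variables
- These can be observed or unobserved.
- The true model is $Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + u$
- But if we omit $X_3$ then
- $E(b_2) = \beta_2 + \beta_3 \dfrac{\sum x_2 x_3}{\sum x_2^2}$ (with $x_2$, $x_3$ in deviation form)
- This equals $\beta_2$ only if $\beta_3 = 0$ or there is zero correlation between $X_2$ and $X_3$.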
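The omitted-variable result follows in a few lines. The derivation below is a sketch that assumes lower-case letters denote deviations from sample means and that the regressors are treated as fixed (or conditioned on), so that only u is random.

```latex
\begin{align*}
b_2 &= \frac{\sum x_2 y}{\sum x_2^2}
     && \text{(OLS of $y$ on $x_2$ alone)} \\
    &= \frac{\sum x_2 (\beta_2 x_2 + \beta_3 x_3 + u)}{\sum x_2^2}
     && \text{(substitute the true model)} \\
    &= \beta_2 + \beta_3 \frac{\sum x_2 x_3}{\sum x_2^2}
       + \frac{\sum x_2 u}{\sum x_2^2} \\
E(b_2) &= \beta_2 + \beta_3 \frac{\sum x_2 x_3}{\sum x_2^2}
     && \text{(since $E(u) = 0$)}
\end{align*}
```

The extra term is $\beta_3$ times the slope from regressing $x_3$ on $x_2$, so it vanishes exactly when $\beta_3 = 0$ or the sample covariance of $x_2$ and $x_3$ is zero.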
10. Consequences of stochastic X
- Suppose that there is positive correlation between x and u (reflecting a lack of independence); then the OLS estimate of the slope is biased upwards (see formula).
- Can we assume independence?
- We need to be reasonably convinced there are:
- (a) No feedbacks/simultaneity
- (b) No (serious) measurement error
- (c) No omitted variables likely to be correlated with x.
- Independence requires that all of the observations on x are uncorrelated with all of the error terms.
- It is perhaps plausible that my error term is uncorrelated with other households' number of children.
11. How about in time series?
- It is much less plausible that independence holds.
- Is last year's inflation error uncorrelated with this year's unemployment?
- Independence does not apply in a model with a lagged dependent variable:
- $y_t = \alpha + \beta x_t + \gamma y_{t-1} + u_t$
- since one of the x variables ($y_{t-1}$) is correlated with $u_{t-1}$.
12. Asymptotic properties
- Recall consistency. If an estimator is consistent, its sampling distribution 'homes in' on the true value of the parameter as the sample size increases.
- $\lim_{n \to \infty} P(|b - \beta| < \varepsilon) = 1$ for arbitrarily small $\varepsilon$, written $plim\ b = \beta$.
- Sufficient (not necessary) conditions for consistency are that the bias and sampling variance of the estimator both go to zero.
- Example: consider $\bar{x} + 1/n$ as an estimator of $\mu$. It is biased (bias $1/n$) in small samples and has sampling variance $\sigma^2/n$.
- Both of these go to zero as n goes to infinity, so it is a consistent estimator.
- Hence consistency might be a nice property to have in the absence of unbiasedness (at least if you've got a large sample).
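The consistency example can be illustrated numerically. This sketch assumes the example estimator is the sample mean plus 1/n (which matches the stated bias of 1/n and variance of σ²/n); the function names and the values μ = 5, σ = 2 are hypothetical.

```python
import random
import statistics

def biased_mean(sample):
    """Sample mean plus 1/n: bias 1/n, sampling variance sigma^2 / n."""
    n = len(sample)
    return sum(sample) / n + 1.0 / n

def simulate(n, mu=5.0, sigma=2.0, reps=2000, seed=0):
    """Return (estimated bias, estimated sampling variance) for sample size n."""
    rng = random.Random(seed)
    draws = [
        biased_mean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]
    bias = statistics.fmean(draws) - mu
    variance = statistics.pvariance(draws)
    return bias, variance

# Bias and variance for increasing sample sizes
results = {n: simulate(n) for n in (5, 50, 500)}
for n, (bias, variance) in results.items():
    print(n, round(bias, 4), round(variance, 4))
```

Both columns shrink as n grows, which is the sufficient condition for consistency stated above even though the estimator is biased at every finite n.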
13. Properties of plim
- Plim has some convenient properties. In particular, if $b_1$ is a consistent estimator of $\beta_1$ and $b_2$ is a consistent estimator of $\beta_2$ (i.e. $plim\ b_1 = \beta_1$ etc.) then:
- $f(b_1)$ is a consistent estimator of $f(\beta_1)$ for any continuous function f. E.g. $\log b_1$ is a consistent estimator of $\log \beta_1$.
- $f_1(b_1)/f_2(b_2)$ is a consistent estimator of $f_1(\beta_1)/f_2(\beta_2)$ for continuous functions $f_1$ and $f_2$ (with $f_2(\beta_2) \neq 0$). This implies that $b_1/b_2$ is consistent for $\beta_1/\beta_2$, $b_1 \times b_2$ is consistent for $\beta_1 \times \beta_2$, etc.
- These apply irrespective of whether $b_1$ and $b_2$ are independent or not.
14. Asymptotic properties of regression coefficients
- Using the plim properties we can now write
- $plim\ b = \beta + \dfrac{plim(\sum xu / n)}{plim(\sum x^2 / n)}$
- Note we have divided top and bottom by n to obtain sensible probability limits.
- The second term on the RHS will be 0 as long as
- (i) $plim(\sum xu)/n = 0$ and (ii) $plim(\sum x^2)/n$ exists and is non-zero.
- The first part of this is satisfied if $E(\sum xu) = 0$
- (the contemporaneously uncorrelated case).
- This is a weaker condition than independence, since only $x_t$ and $u_t$ need to be uncorrelated (not all past and future values). This allows the lagged dependent variable model.
16. Instrumental Variables Estimation
- When x and u are contemporaneously correlated we need an alternative method of estimation.
- Instrumental variables (IV) is a general method which, given valid instruments, yields consistent estimates even when x and u are correlated.
- The principle is simple: replace each of the problem x variables with an instrument z which is highly correlated with x but is uncorrelated with u.
- Two properties are needed: instrument relevance (i.e. it is an important determinant of x) and instrument exogeneity (i.e. it is not correlated with u). A valid instrument must have both.
17. Instrumental Variables Estimation
- The intuition for relevance seems pretty straightforward.
- Exogeneity means the instrument has no independent place in the regression: it only affects y through x.
- The instrument need not consist of a single variable; several could be used to instrument a single x.
- We start with the simpler case with one instrument for each problem variable.
- If we call the $n \times k$ matrix of instruments W then the IV estimator of $\beta$ is given by
- $b_{IV} = (W'X)^{-1} W'y$
18. Instrumental Variables Estimation
- IV is effectively a two-stage procedure. Take the case of one x and one instrument w. First do the OLS regression $x = \gamma w + e$.
- Then let $\hat{x} = \hat{\gamma} w$.
- Now do the OLS regression of y on $\hat{x}$.
- IV is the two stages collapsed into one.
- In using $\hat{x}$ we are only using that variation in x which is due to w.
- We have purged it of the variation that is correlated with u.
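The two-stage description can be sketched on simulated data. Everything here is an illustrative assumption rather than the lecture's own example: the data-generating process, the true slope of 1, and the helpers `mean`, `cov` and `ols`. Both stages include an intercept, which is equivalent to working in deviation form.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance (divisor n; the divisor cancels in every ratio below)."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    slope = cov(x, y) / cov(x, x)
    return mean(y) - slope * mean(x), slope

rng = random.Random(2)
n = 20000
beta = 1.0                                    # assumed true slope
w = [rng.gauss(0, 1) for _ in range(n)]       # instrument: independent of u
u = [rng.gauss(0, 1) for _ in range(n)]       # structural error
x = [0.8 * wi + 0.6 * ui + rng.gauss(0, 1)    # x is correlated with u
     for wi, ui in zip(w, u)]
y = [beta * xi + ui for xi, ui in zip(x, u)]

_, b_ols = ols(x, y)                          # inconsistent: about 1.3 here

g0, g1 = ols(w, x)                            # stage 1: regress x on w
x_hat = [g0 + g1 * wi for wi in w]            # fitted values use only w
_, b_2sls = ols(x_hat, y)                     # stage 2: regress y on x_hat

b_iv = cov(w, y) / cov(w, x)                  # the two stages collapsed into one
print(b_ols, b_2sls, b_iv)
```

The second-stage slope and the collapsed ratio cov(w, y)/cov(w, x) agree up to floating-point error, and both sit near the true slope while OLS does not. The standard errors from a literal second-stage OLS are not the correct IV standard errors, so the manual route is for intuition only.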
19. Variance of IV estimates
- IV estimates have higher variances than OLS.
- The variance of the IV estimator is given by
- $V(b_{IV}) = \sigma^2 (W'X)^{-1} W'W (X'W)^{-1}$
- (for the single instrument per regressor case).
- $\sigma^2$ is estimated by $(y - Xb_{IV})'(y - Xb_{IV})/(n-k)$.
20. Testing instrument validity
- Testing for instrument relevance is straightforward.
- Estimate the first-stage regression to see if w affects x.
- Rule of thumb: first-stage F-statistic > 10.
- Checking instrument exogeneity is much harder, and impossible when we have just one w for each x.
- We have $y = \beta x + u$.
- We want to test that w is uncorrelated with u: $cov(w,u) = 0$.
- But which u's do we use here?
- IV residuals are not appropriate, since they may be inconsistent if w is not an exogenous instrument.
- OLS residuals are also likely to be inconsistent.
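The relevance check on this slide (first-stage F-statistic above 10) can be sketched for the one-instrument case, where the F-statistic is the squared t-ratio on w. The function name `first_stage_f` and the simulated strong and weak instruments are hypothetical.

```python
import random

def first_stage_f(w, x):
    """F statistic for H0: slope = 0 in the OLS regression of x on a constant and w."""
    n = len(x)
    mw = sum(w) / n
    mx = sum(x) / n
    sww = sum((wi - mw) ** 2 for wi in w)
    swx = sum((wi - mw) * (xi - mx) for wi, xi in zip(w, x))
    slope = swx / sww
    intercept = mx - slope * mw
    rss = sum((xi - intercept - slope * wi) ** 2 for wi, xi in zip(w, x))
    s2 = rss / (n - 2)                 # residual variance, 2 estimated parameters
    se = (s2 / sww) ** 0.5             # standard error of the slope
    return (slope / se) ** 2           # with one instrument, F = t^2

rng = random.Random(3)
n = 500
w = [rng.gauss(0, 1) for _ in range(n)]
x_strong = [0.8 * wi + rng.gauss(0, 1) for wi in w]   # w matters a lot for x
x_weak = [0.02 * wi + rng.gauss(0, 1) for wi in w]    # w barely matters for x

f_strong = first_stage_f(w, x_strong)
f_weak = first_stage_f(w, x_weak)
print(f_strong, f_weak)
```

The strong instrument clears the rule-of-thumb threshold by a wide margin, while the weak one will typically fall well short of it. This check says nothing about exogeneity, which is the harder problem described above.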
21. Testing instrument validity
- If there are more instruments than endogenous x's then we can test exogeneity, using the over-identifying restrictions test. See Stock and Watson p354. The intuition is that, because there are more instruments than endogenous x's, we can test whether the instruments significantly enter the equation in addition to their role as instruments.
- Since exogeneity cannot be tested in the just-identified case, we always need a convincing story.
- Appeal to economic theory or introspection.