Title: Generalized method of moments estimation
1 Generalized method of moments estimation
- FINA790C
- HKUST Spring 2006
2 Outline
- Basic motivation and examples
- First case: linear model and instrumental variables
  - Estimation
  - Asymptotic distribution
  - Test of model specification
- General case of GMM
  - Estimation
  - Testing
3 Basic motivation
- Method of moments: suppose we want to estimate the population mean $\mu$ and population variance $\sigma^2$ of a random variable $v_t$
- These two parameters satisfy the population moment conditions
  - $E[v_t] - \mu = 0$
  - $E[v_t^2] - (\sigma^2 + \mu^2) = 0$
4 Method of moments
- So let's think of the analogous sample moment conditions
  - $T^{-1}\sum_t v_t - \hat\mu = 0$
  - $T^{-1}\sum_t v_t^2 - (\hat\sigma^2 + \hat\mu^2) = 0$
- This implies
  - $\hat\mu = T^{-1}\sum_t v_t$
  - $\hat\sigma^2 = T^{-1}\sum_t (v_t - \hat\mu)^2$
- Basic idea: population moment conditions provide information which can be used for estimation of parameters
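As a small illustration of the two sample moment conditions, the sketch below solves them directly for the mean and variance. The function name `mom_mean_variance` and the four data points are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import numpy as np

def mom_mean_variance(v):
    """Solve the two sample moment conditions for (mu_hat, sigma2_hat)."""
    v = np.asarray(v, dtype=float)
    # First condition: T^-1 sum v_t - mu_hat = 0
    mu_hat = v.mean()
    # Second condition: T^-1 sum v_t^2 - (sigma2_hat + mu_hat^2) = 0
    sigma2_hat = (v ** 2).mean() - mu_hat ** 2
    return mu_hat, sigma2_hat

# Hypothetical sample, for illustration only.
v = np.array([1.0, 2.0, 4.0, 5.0])
mu_hat, sigma2_hat = mom_mean_variance(v)
print(mu_hat, sigma2_hat)
```

The variance estimate coincides with the average squared deviation from the sample mean (divisor T, not T-1), which is what the second bullet under "This implies" states.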
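5 Example
- Simple supply and demand system
  - $q_t^D = \alpha p_t + u_t^D$
  - $q_t^S = \beta_1 n_t + \beta_2 p_t + u_t^S$
  - $q_t^D = q_t^S = q_t$
- $q_t^D$, $q_t^S$ are quantity demanded and supplied; $p_t$ is price and $q_t$ equals quantity produced
- Problem: how to estimate $\alpha$, given $q_t$ and $p_t$
- OLS estimator runs into problems: price and quantity are determined simultaneously, so $p_t$ is correlated with $u_t^D$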
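6 Example
- One solution: find $z_t^D$ such that $\mathrm{cov}(z_t^D, u_t^D) = 0$
- Then $\mathrm{cov}(z_t^D, q_t) - \alpha\,\mathrm{cov}(z_t^D, p_t) = 0$
- So if $E[u_t^D] = 0$ then $E[z_t^D q_t] - \alpha E[z_t^D p_t] = 0$
- Method of moments leads to
  - $\hat\alpha = (T^{-1}\sum_t z_t^D q_t)/(T^{-1}\sum_t z_t^D p_t)$
- which is the instrumental variables estimator of $\alpha$ with instrument $z_t^D$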
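The contrast between OLS and the instrumental variables estimator can be seen in a short simulation. This is only a sketch under assumptions that are not in the slides: the parameter values, the normal shocks, and the use of the supply shifter $n_t$ as the instrument $z_t^D$ are all chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 20_000
alpha, beta1, beta2 = -1.0, 1.0, 1.0   # hypothetical true values

n = rng.normal(size=T)     # supply shifter n_t, used here as the instrument z_t^D
u_d = rng.normal(size=T)   # demand shock u_t^D
u_s = rng.normal(size=T)   # supply shock u_t^S

# Market clearing: alpha*p + u_d = beta1*n + beta2*p + u_s, solved for the price.
p = (beta1 * n + u_s - u_d) / (alpha - beta2)
q = alpha * p + u_d

# OLS of q on p (no intercept; all variables have mean zero by construction).
alpha_ols = (p @ q) / (p @ p)
# IV estimator: (T^-1 sum z q) / (T^-1 sum z p); the T^-1 factors cancel.
alpha_iv = (n @ q) / (n @ p)

print(alpha_ols, alpha_iv)
```

In this design the price depends on the demand shock, so the OLS slope is pulled away from the true value, while the instrument is uncorrelated with the demand shock and the IV ratio stays close to it.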
7 Method of moments
- Population moment condition: a vector of observed variables, $v_t$, and a vector of $p$ parameters $\theta$, satisfy a $p \times 1$ vector of conditions $E[f(v_t,\theta)] = 0$ for all $t$
- The method of moments estimator $\theta_T$ solves the analogous sample moment conditions
  - $g_T(\theta_T) = T^{-1}\sum_t f(v_t,\theta_T) = 0$ (1)
- where $T$ is the sample size
- Intuitively $\theta_T$ converges in probability to $\theta_0$, the solution of the population analogue of (1)
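8 Generalized method of moments
- Now suppose $f$ is a $q \times 1$ vector and $q > p$
- The GMM estimator chooses the value of $\theta$ which comes closest to satisfying (1) as the estimator of $\theta_0$, where closeness is measured by
  - $Q_T(\theta) = [T^{-1}\sum_t f(v_t,\theta)]' W_T [T^{-1}\sum_t f(v_t,\theta)] = g_T(\theta)' W_T g_T(\theta)$
- and $W_T$ is positive semi-definite with $\mathrm{plim}(W_T) = W$, a positive definite matrix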
9 GMM example 1
- Power utility based asset pricing model
  - $E_t[\beta (C_{t+1}/C_t)^{-\alpha} R_{i,t+1} - 1] = 0$ with unknown parameters $\beta$, $\alpha$
- The population unconditional moment conditions are
  - $E[(\beta (C_{t+1}/C_t)^{-\alpha} R_{i,t+1} - 1) z_{jt}] = 0$ for $j = 1,\dots,q$, where $z_{jt}$ is in the information set at time $t$
10 GMM example 2
- CAPM says
  - $E[R_{i,t+1} - \gamma_0(1-\beta_i) - \beta_i R_{m,t+1}] = 0$
- Market efficiency
  - $E[(R_{i,t+1} - \gamma_0(1-\beta_i) - \beta_i R_{m,t+1}) z_{jt}] = 0$
11 GMM example 3
- Suppose the conditional probability density function of the continuous stationary random vector $v_t$, given $V_{t-1} = \{v_{t-1}, v_{t-2}, \dots\}$, is $p(v_t; \theta_0, V_{t-1})$
- The MLE of $\theta_0$ based on the conditional log likelihood function is the value of $\theta$ which maximizes $L_T(\theta) = \sum_t \ln p(v_t; \theta, V_{t-1})$, i.e. which solves $\partial L_T(\theta)/\partial\theta = 0$
- This implies that the MLE is just the GMM estimator based on the population moment condition
  - $E[\partial \ln p(v_t; \theta_0, V_{t-1})/\partial\theta] = 0$
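12 GMM estimation
- The GMM estimator $\theta_T = \arg\min_{\theta \in \Theta} Q_T(\theta)$ generates the first order conditions
  - $[T^{-1}\sum_t \partial f(v_t,\theta_T)/\partial\theta']' W_T [T^{-1}\sum_t f(v_t,\theta_T)] = 0$ (2)
- where $\partial f(v_t,\theta_T)/\partial\theta'$ is the $q \times p$ matrix with $i,j$ element $\partial f_i(v_t,\theta_T)/\partial\theta_j$
- There is typically no closed form solution for $\theta_T$ so it must be obtained through numerical optimization methods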
13 Example: Instrumental variable estimation of linear model
- Suppose we have $y_t = x_t'\theta_0 + u_t$, $t = 1,\dots,T$
- where $x_t$ is a $(p \times 1)$ vector of observed explanatory variables, $y_t$ is an observed scalar, $u_t$ is an unobserved error term with $E[u_t] = 0$
- Let $z_t$ be a $q \times 1$ vector of instruments such that $E[z_t u_t] = 0$ (contemporaneously uncorrelated)
- Problem is to estimate $\theta_0$
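14 IV estimation
- The GMM estimator $\theta_T = \arg\min_{\theta \in \Theta} Q_T(\theta)$
- where $Q_T(\theta) = [T^{-1}u(\theta)'Z] W_T [T^{-1}Z'u(\theta)]$, $u(\theta) = y - X\theta$, $X$ is the $T \times p$ matrix with $t$-th row $x_t'$ and $Z$ is the $T \times q$ matrix with $t$-th row $z_t'$
- The FOCs are
  - $(T^{-1}X'Z) W_T (T^{-1}Z'y) = (T^{-1}X'Z) W_T (T^{-1}Z'X)\,\theta_T$
- When # of instruments $q$ = # of parameters $p$ (just-identified) and $(T^{-1}Z'X)$ is nonsingular then
  - $\theta_T = (T^{-1}Z'X)^{-1}(T^{-1}Z'y)$
- independently of the weighting matrix $W_T$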
15 IV estimation
- When # instruments $q$ > # parameters $p$ (over-identified) then
  - $\theta_T = [(T^{-1}X'Z) W_T (T^{-1}Z'X)]^{-1} (T^{-1}X'Z) W_T (T^{-1}Z'y)$
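The closed-form linear GMM estimator can be written in a few lines of NumPy. The sketch below is illustrative only: the function name `linear_gmm`, the simulated data, and the particular weighting matrices are assumptions, not part of the lecture.

```python
import numpy as np

def linear_gmm(y, X, Z, W):
    """Linear GMM/IV estimate for y = X theta + u with instruments Z and weight W."""
    T = len(y)
    Szx = Z.T @ X / T                  # T^-1 Z'X, shape (q, p)
    Szy = Z.T @ y / T                  # T^-1 Z'y, shape (q,)
    A = Szx.T @ W @ Szx                # (T^-1 X'Z) W (T^-1 Z'X)
    b = Szx.T @ W @ Szy                # (T^-1 X'Z) W (T^-1 Z'y)
    return np.linalg.solve(A, b)

# Simulated data (all numbers below are illustrative assumptions).
rng = np.random.default_rng(1)
T = 5_000
z1, z2, e = rng.normal(size=(3, T))
x = z1 + z2 + e                        # regressor is correlated with the error through e
u = 0.8 * e + rng.normal(size=T)
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(T), x])          # p = 2
Z = np.column_stack([np.ones(T), z1, z2])     # q = 3, over-identified

theta_over = linear_gmm(y, X, Z, np.eye(3))

# Just-identified case (q = p = 2): the weighting matrix should not matter.
Z_just = Z[:, :2]
theta_a = linear_gmm(y, X, Z_just, np.eye(2))
theta_b = linear_gmm(y, X, Z_just, np.diag([1.0, 5.0]))
print(theta_over, theta_a, theta_b)
```

With two instruments and two parameters the two weighting matrices give the same estimate up to rounding, which is the just-identified result; with three instruments the estimate depends on the choice of `W`.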
16 Identifying and overidentifying restrictions
- Go back to the first-order conditions
  - $(T^{-1}X'Z) W_T (T^{-1}Z'u(\theta_T)) = 0$
- These imply that GMM = method of moments based on population moment conditions $E[x_t z_t'] W E[z_t u_t(\theta_0)] = 0$
- When $q = p$: GMM = method of moments based on $E[z_t u_t(\theta_0)] = 0$
- When $q > p$: GMM sets $p$ linear combinations of $E[z_t u_t(\theta_0)]$ equal to 0
17 Identifying and over-identifying restrictions: details
- From (2), GMM is the method of moments estimator based on the population moments
  - $\{E[\partial f(v_t,\theta_0)/\partial\theta']\}' W E[f(v_t,\theta_0)] = 0$ (3)
- Let $F(\theta_0) = W^{1/2} E[\partial f(v_t,\theta_0)/\partial\theta']$ and $\mathrm{rank}(F(\theta_0)) = p$. (The rank condition is necessary for identification of $\theta_0$.) Rewrite (3) as
  - $F(\theta_0)' W^{1/2} E[f(v_t,\theta_0)] = 0$, or
  - $F(\theta_0)[F(\theta_0)'F(\theta_0)]^{-1}F(\theta_0)' W^{1/2} E[f(v_t,\theta_0)] = 0$ (4)
- This says that the least squares projection of $W^{1/2}E[f(v_t,\theta_0)]$ onto the column space of $F(\theta_0)$ is zero. In other words the GMM estimator is based on $\mathrm{rank}\{F(\theta_0)[F(\theta_0)'F(\theta_0)]^{-1}F(\theta_0)'\} = p$ restrictions on the (transformed) population moment condition $W^{1/2}E[f(v_t,\theta_0)]$. These are the identifying restrictions; GMM chooses $\theta_T$ to satisfy them.
18 Identifying and over-identifying restrictions: details
- The restrictions that are left over are
  - $\{I_q - F(\theta_0)[F(\theta_0)'F(\theta_0)]^{-1}F(\theta_0)'\} W^{1/2} E[f(v_t,\theta_0)] = 0$
- This says that the projection of $W^{1/2}E[f(v_t,\theta_0)]$ onto the orthogonal complement of the column space of $F(\theta_0)$ is zero, generating $q-p$ restrictions on the transformed population moment condition. These over-identifying restrictions are ignored by the GMM estimator, so they need not be satisfied in the sample
- From (2),
  - $W_T^{1/2} T^{-1}\sum_t f(v_t,\theta_T) = \{I_q - F_T(\theta_T)[F_T(\theta_T)'F_T(\theta_T)]^{-1}F_T(\theta_T)'\} W_T^{1/2} T^{-1}\sum_t f(v_t,\theta_T)$
- where $F_T(\theta) = W_T^{1/2} T^{-1}\sum_t \partial f(v_t,\theta)/\partial\theta'$. Therefore $Q_T(\theta_T)$ is like a sum of squared residuals, and can be interpreted as a measure of how far the sample is from satisfying the over-identifying restrictions.
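19 Asymptotic properties of GMM instrumental variables estimator
- It can be shown that
  - $\theta_T$ is consistent for $\theta_0$
  - $(\theta_{T,i} - \theta_{0,i})/\sqrt{V_{T,ii}/T} \sim N(0,1)$ asymptotically, where
  - $V_T = [(T^{-1}X'Z) W_T (T^{-1}Z'X)]^{-1} (T^{-1}X'Z) W_T S_T W_T (T^{-1}Z'X) [(T^{-1}X'Z) W_T (T^{-1}Z'X)]^{-1}$
  - $S_T$ is a consistent estimator of $S = \lim_{T\to\infty} \mathrm{Var}[T^{-1/2}Z'u]$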
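20 Covariance matrix estimation
- Assuming $u_t$ is serially uncorrelated
  - $\mathrm{Var}[T^{-1/2}Z'u] = E[(T^{-1/2}\sum_t z_t u_t)(T^{-1/2}\sum_t z_t u_t)']$
  - $= T^{-1}\sum_t E[u_t^2 z_t z_t']$
- Therefore,
  - $S_T = T^{-1}\sum_t u_t(\theta_T)^2 z_t z_t'$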
21 Two step estimator
- Notice that the asymptotic variance depends on the weighting matrix $W_T$
- The optimal choice is $W_T = S_T^{-1}$, to give $V_T = [(T^{-1}X'Z) S_T^{-1} (T^{-1}Z'X)]^{-1}$
- But we need $\theta_T$ to construct $S_T$. This suggests a two-step (iterative) procedure
  - (a) estimate with sub-optimal $W_T$, get $\theta_T(1)$, get $S_T(1)$
  - (b) estimate with $W_T = S_T(1)^{-1}$
  - (c) iterated GMM: from (b) get $\theta_T(2)$, get $S_T(2)$, set $W_T = S_T(2)^{-1}$, repeat
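Steps (a) to (c) can be sketched for the linear instrumental variables model as follows. This is a minimal illustration, assuming serially uncorrelated errors (so that $S_T = T^{-1}\sum_t u_t^2 z_t z_t'$), a starting weight of $(T^{-1}Z'Z)^{-1}$, and simulated heteroskedastic data; the function names and all numerical values are hypothetical.

```python
import numpy as np

def linear_gmm(y, X, Z, W):
    """Closed-form linear GMM estimate for a given weighting matrix W."""
    Szx, Szy = Z.T @ X / len(y), Z.T @ y / len(y)
    return np.linalg.solve(Szx.T @ W @ Szx, Szx.T @ W @ Szy)

def moment_cov(y, X, Z, theta):
    """S_T = T^-1 sum u_t(theta)^2 z_t z_t' (assumes serially uncorrelated u_t)."""
    u = y - X @ theta
    return (Z * (u ** 2)[:, None]).T @ Z / len(y)

def two_step_gmm(y, X, Z, n_iter=1):
    T = len(y)
    W = np.linalg.inv(Z.T @ Z / T)     # (a) starting weight: an assumption, any pd W works
    theta = linear_gmm(y, X, Z, W)
    for _ in range(n_iter):            # n_iter > 1 corresponds to iterated GMM, step (c)
        S = moment_cov(y, X, Z, theta)
        theta = linear_gmm(y, X, Z, np.linalg.inv(S))   # (b) re-estimate with W = S^-1
    S = moment_cov(y, X, Z, theta)
    Szx = Z.T @ X / T
    V = np.linalg.inv(Szx.T @ np.linalg.inv(S) @ Szx)   # V_T when W_T = S_T^-1
    se = np.sqrt(np.diag(V) / T)       # asymptotic standard errors
    return theta, se

# Simulated heteroskedastic data (illustrative assumptions only).
rng = np.random.default_rng(2)
T = 5_000
z1, z2, e = rng.normal(size=(3, T))
x = z1 + z2 + e
u = (0.8 * e + rng.normal(size=T)) * np.sqrt(0.5 + z1 ** 2)
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(T), x])
Z = np.column_stack([np.ones(T), z1, z2])

theta_hat, se_hat = two_step_gmm(y, X, Z)
print(theta_hat, se_hat)
```

Calling `two_step_gmm(y, X, Z, n_iter=5)` repeats the update of the weighting matrix several times, which is one simple way to approximate the iterated procedure in (c).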
22 GMM and instrumental variables
- If $u_t$ is homoskedastic, $\mathrm{var}[u] = \sigma^2 I_T$, then $S_T = \hat\sigma^2\, T^{-1} Z'Z$, where $Z$ is a $(T \times q)$ matrix with $t$-th row $z_t'$ and $\hat\sigma^2$ is a consistent estimate of $\sigma^2$
- Choosing this as the weighting matrix, $W_T = S_T^{-1}$, gives
  - $\theta_T = [X'Z(Z'Z)^{-1}Z'X]^{-1} X'Z(Z'Z)^{-1}Z'y$
  - $= [\hat X'\hat X]^{-1}\hat X'y$
- where $\hat X = Z(Z'Z)^{-1}Z'X$ is the predicted value of $X$ from a regression of $X$ on $Z$. This is two-stage least squares (2SLS): first regress $X$ on $Z$, then regress $y$ on the fitted value of $X$ from the first stage.
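The equivalence between GMM with a weighting matrix proportional to $(Z'Z)^{-1}$ and two-stage least squares can be checked numerically. The sketch below uses simulated data with made-up parameter values; only the algebra comes from the slide.

```python
import numpy as np

# Simulated data (illustrative assumptions only).
rng = np.random.default_rng(3)
T = 1_000
z1, z2, e = rng.normal(size=(3, T))
x = z1 + z2 + e
y = 1.0 + 2.0 * x + 0.8 * e + rng.normal(size=T)
X = np.column_stack([np.ones(T), x])
Z = np.column_stack([np.ones(T), z1, z2])

# GMM with W_T proportional to (Z'Z)^-1; the scalar sigma^2 and T cancel out.
ZZ_inv = np.linalg.inv(Z.T @ Z)
A = X.T @ Z @ ZZ_inv @ Z.T @ X
b = X.T @ Z @ ZZ_inv @ Z.T @ y
theta_gmm = np.linalg.solve(A, b)

# Explicit two stages: regress X on Z, then y on the fitted values.
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
theta_2sls = np.linalg.lstsq(X_hat, y, rcond=None)[0]

print(theta_gmm, theta_2sls)
```

The two estimates agree up to floating-point rounding, since both compute $[\hat X'\hat X]^{-1}\hat X'y$.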
23 Model specification test
- Identifying restrictions are satisfied in the sample regardless of whether the model is correct
- Over-identifying restrictions are not imposed in the sample
- This gives rise to the overidentifying restrictions test
  - $J_T = T Q_T(\theta_T) = [T^{-1/2}u(\theta_T)'Z]\, S_T^{-1}\, [T^{-1/2}Z'u(\theta_T)]$
- Under $H_0$: $E[z_t u_t(\theta_0)] = 0$, $J_T$ asymptotically follows a $\chi^2(q-p)$ distribution
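A rough sketch of the test statistic for the linear model is given below. It assumes serially uncorrelated errors, a two-step estimate with an assumed first-step weight of $(T^{-1}Z'Z)^{-1}$, and simulated data in which one instrument is deliberately made invalid; the function name `j_statistic` and all numbers are hypothetical.

```python
import numpy as np

def j_statistic(y, X, Z):
    """Return (J_T, degrees of freedom) for the linear IV model."""
    T, q = Z.shape
    p = X.shape[1]
    Szx, Szy = Z.T @ X / T, Z.T @ y / T

    def estimate(W):
        return np.linalg.solve(Szx.T @ W @ Szx, Szx.T @ W @ Szy)

    theta1 = estimate(np.linalg.inv(Z.T @ Z / T))   # first step (assumed starting weight)
    u1 = y - X @ theta1
    S = (Z * (u1 ** 2)[:, None]).T @ Z / T          # S_T for serially uncorrelated u_t
    S_inv = np.linalg.inv(S)
    theta2 = estimate(S_inv)                        # second step with W_T = S_T^-1
    g = Z.T @ (y - X @ theta2) / T                  # T^-1 Z'u(theta_T)
    return T * g @ S_inv @ g, q - p

# Simulated data (illustrative assumptions only).
rng = np.random.default_rng(4)
T = 2_000
z1, z2, e = rng.normal(size=(3, T))
x = z1 + z2 + e
X = np.column_stack([np.ones(T), x])
Z = np.column_stack([np.ones(T), z1, z2])
noise = rng.normal(size=T)

y_valid = 1.0 + 2.0 * x + 0.8 * e + noise   # E[z_t u_t] = 0 holds
y_invalid = y_valid + 0.5 * z2              # z2 now enters the error term

J_valid, df = j_statistic(y_valid, X, Z)
J_invalid, _ = j_statistic(y_invalid, X, Z)
CRIT_5PCT_CHI2_1 = 3.841                    # 5% critical value of chi-square(1)
print(J_valid, J_invalid, df)
```

With three instruments and two parameters there is one over-identifying restriction, so the statistic is compared with a $\chi^2(1)$ critical value; the misspecified design produces a statistic far above it.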
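24 (From last time)
- In this case the MRS is
  - $m_{t,t+j} = \delta^j (C_{t+j}/C_t)^{-\gamma}$
- Substituting in above gives
  - $E_t[\ln R_{i,t,t+j}] = -j\ln\delta + \gamma E_t[\Delta\ln c_{t+j}] - \tfrac{1}{2}\gamma^2\,\mathrm{var}_t[\Delta\ln c_{t+j}] - \tfrac{1}{2}\mathrm{var}_t[\ln R_{i,t,t+j}] + \gamma\,\mathrm{cov}_t[\Delta\ln c_{t+j}, \ln R_{i,t,t+j}]$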
25 Example: power utility and lognormal distributions
- First order condition from an investor maximizing power utility with log-normal distribution gives
  - $\ln(R_{i,t+1}) = \mu_i + \alpha\,\Delta\ln C_{t+1} + u_{i,t+1}$
- The error term $u_{i,t+1}$ could be correlated with $\Delta\ln C_{t+1}$ so we can't use OLS
- However $E_t[u_{i,t+1}] = 0$ means $u_{i,t+1}$ is uncorrelated with any $z_t$ known at time $t$
26 Instrumental variables regressions for returns and consumption growth

Return   Stage 1            Stage 2            JT test
         r       Δln c      ?        ?         r       Δln c
CP       0.297   0.102      -0.953   -0.118    0.221   0.091
         (0.00)  (0.15)     (0.57)   (0.11)    (0.00)  (0.10)
Stock    0.110   0.102      -0.235   -0.008    0.105   0.097
         (0.11)  (0.15)     (1.65)   (0.06)    (0.06)  (0.08)

- Annual US data 1889-1994 on growth in log real consumption of nondurables & services, log real return on S&P 500, real return on 6-month commercial paper. Instruments are 2 lags each of real commercial paper rate, real consumption growth rate and log of dividend-price ratio
27 GMM estimation: general case
- Go back to GMM estimation but let $f$ be a vector of continuous nonlinear functions of the data and unknown parameters
- In our case, we have $N$ assets and the moment condition is that $E[(m(x_t,\theta_0)R_{it} - 1)z_{j,t-1}] = 0$ using instruments $z_{j,t-1}$, for each asset $i = 1,\dots,N$ and each instrument $j = 1,\dots,q$
- Collect these as $f(v_t,\theta) = z_{t-1}' \otimes u_t(x_t,\theta)$, where $z_t$ is a $1 \times q$ vector of instruments and $u_t$ is an $N \times 1$ vector with $i$-th element $m(x_t,\theta)R_{it} - 1$. $f$ is a column vector with $qN$ elements; it contains the cross-product of each instrument with each element of $u$.
- The population moment condition is
  - $E[f(v_t,\theta_0)] = 0$
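28 GMM estimation: general case
- As before, let $g_T(\theta) = T^{-1}\sum_t f(v_t,\theta)$. The GMM estimator is $\theta_T = \arg\min_{\theta \in \Theta} Q_T(\theta)$
- The FOCs are
  - $G_T(\theta_T)' W_T g_T(\theta_T) = 0$
- where $G_T(\theta)$ is a matrix of partial derivatives with $i,j$ element $\partial g_{T,i}/\partial\theta_j$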
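29 GMM asymptotics
- It can be shown that
  - $\theta_T$ is consistent for $\theta_0$
  - the asymptotic variance of $T^{1/2}(\theta_T - \theta_0)$ is $V = M S M'$, where
  - $M = (G_0' W G_0)^{-1} G_0' W$
  - $G_0 = E[\partial f(v_t,\theta_0)/\partial\theta']$
  - $S = \lim_{T\to\infty} \mathrm{Var}[T^{1/2} g_T(\theta_0)]$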
30 Covariance matrix estimation for GMM
- In practice $V_T = M_T S_T M_T'$ is a consistent estimator of $V$, where
  - $M_T = (G_T(\theta_T)' W_T G_T(\theta_T))^{-1} G_T(\theta_T)' W_T$
  - $S_T$ is a consistent estimator of $S$
- The estimator of $S$ depends on the time series properties of $f(v_t,\theta_0)$. In general $S$ is
  - $S = \Gamma_0 + \sum_{i=1}^{\infty} (\Gamma_i + \Gamma_i')$
- where $\Gamma_i = E[(f_t - E(f_t))(f_{t-i} - E(f_{t-i}))'] = E[f_t f_{t-i}']$ is the $i$-th autocovariance matrix of $f_t = f(v_t,\theta_0)$.
31 GMM asymptotics
- So $(\theta_{T,i} - \theta_{0,i})/\sqrt{V_{T,ii}/T} \sim N(0,1)$ asymptotically
- As before, we can choose $W_T = S_T^{-1}$ and
  - $V_T$ is then $(G_T(\theta_T)' S_T^{-1} G_T(\theta_T))^{-1}$
  - a test of the model's over-identifying restrictions is given by $J_T = T Q_T(\theta_T)$, which is asymptotically $\chi^2(qN-p)$
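32 Covariance matrix estimation
- We can consistently estimate $S$ with
  - $S_T = \Gamma_0(\theta_T) + \sum_i (\Gamma_i(\theta_T) + \Gamma_i(\theta_T)')$
- where $\Gamma_i(\theta_T) = T^{-1}\sum_t f(v_t,\theta_T) f(v_{t-i},\theta_T)'$
- If theory implies that the autocovariances of $f(v_t,\theta_0)$ are zero at some lags $j$ then we can exclude these from $S_T$
  - e.g. $u_t$ serially uncorrelated implies $S_T = T^{-1}\sum_t (z_{t-1}' z_{t-1}) \otimes [u_t(v_t,\theta_T)\, u_t(v_t,\theta_T)']$, with the Kronecker ordering following $f = z_{t-1}' \otimes u_t$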
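The autocovariance-based estimator of $S$ can be sketched as a short function that takes the $T \times (\text{number of moments})$ array of fitted moment contributions $f(v_t,\theta_T)$. The truncation parameter `max_lag` is an assumption introduced here (the slide leaves the summation range open), and the function name is hypothetical.

```python
import numpy as np

def long_run_cov(f, max_lag):
    """S_T = Gamma_0 + sum_{i=1}^{max_lag} (Gamma_i + Gamma_i'), with
    Gamma_i = T^-1 sum_t f_t f_{t-i}'.  `f` has one row per time period."""
    f = np.asarray(f, dtype=float)
    T = f.shape[0]
    S = f.T @ f / T                        # Gamma_0
    for i in range(1, max_lag + 1):
        gamma_i = f[i:].T @ f[:-i] / T     # pairs f_t with f_{t-i}
        S += gamma_i + gamma_i.T
    return S

# Tiny hand-checkable example with a single moment and T = 3 (illustrative only).
f_demo = np.array([[1.0], [2.0], [3.0]])
S0 = long_run_cov(f_demo, max_lag=0)   # Gamma_0 = (1 + 4 + 9) / 3
S1 = long_run_cov(f_demo, max_lag=1)   # adds 2 * (2*1 + 3*2) / 3
print(S0, S1)
```

Setting `max_lag=0` gives the serially uncorrelated case. An unweighted truncated sum like this one is not guaranteed to be positive semi-definite in finite samples; kernel-weighted variants (for example with Bartlett weights) are a common alternative but are not covered in the slides.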
33 GMM adjustments
- Iterated GMM is recommended in small samples
- More powerful tests by subtracting sample means of $f_t(v_t,\theta_T)$ in calculating $\Gamma_i(\theta_T)$
- Asymptotic standard errors may be understated in small samples: multiply asymptotic variances by a degrees of freedom adjustment $T/(T-p)$ or $(Nq)T/[(Nq)T - k]$, where $k = p + ((Nq)^2 + Nq)/2$
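To make the size of these adjustments concrete, the sketch below evaluates both factors for made-up values of $T$, $p$, $N$ and $q$. It assumes the second adjustment is $(Nq)T/[(Nq)T - k]$ with $k = p + ((Nq)^2 + Nq)/2$, that is, the number of parameters plus the number of distinct elements of the $Nq \times Nq$ matrix $S_T$; the function name is hypothetical.

```python
def dof_adjustments(T, p, N, q):
    """Return the two variance inflation factors for sample size T,
    p parameters, N assets and q instruments (assumed formula, see lead-in)."""
    simple = T / (T - p)
    nq = N * q                          # number of moment conditions
    k = p + (nq ** 2 + nq) / 2          # parameters plus distinct elements of S_T
    full = nq * T / (nq * T - k)
    return simple, full

# Illustrative values only: T = 100, p = 2, N = 2 assets, q = 3 instruments.
simple_adj, full_adj = dof_adjustments(T=100, p=2, N=2, q=3)
print(simple_adj, full_adj)
```

Both factors exceed one, so multiplying the asymptotic variances by either of them widens the reported standard errors.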