Title: Econometric Methods 1
1. Econometric Methods 1
- Lecture 4: Specification Tests and Violations of the OLS Assumptions
2. Tests of multiple hypotheses
- We can test more than one hypothesis at a time in a regression, e.g. that the constant equals -60 and that the slope coefficients are equal but opposite in sign.
- By a multiple hypothesis we mean there is more than one equals sign (i.e. more than one restriction) in H0. e.g. the above hypothesis can be written
- H0: β1 = -60 and β2 = -β3   vs   H1: β1 ≠ -60 or β2 ≠ -β3, or both.
- A hypothesis is essentially a restriction on the model, a simplification. Here we have q = 2 restrictions in the null hypothesis. We might consider estimating both the unrestricted and the restricted model.
3. Tests of multiple hypotheses
- Unrestricted model:
- y = β1 + β2 x2 + β3 x3 + u
- while the restricted version is
- y = -60 + β2 (x2 - x3) + u'
4. Example
- Illustrate with the farm data. Let our unrestricted regression be:

      Source |       SS       df       MS              Number of obs =     369
-------------+------------------------------           F(  6,   362) =   29.56
       Model |  67.6016953     6  11.2669492           Prob > F      =  0.0000
    Residual |  137.966779   362  .381123698           R-squared     =  0.3289
-------------+------------------------------           Adj R-squared =  0.3177
       Total |  205.568474   368  .558609983           Root MSE      =  .61735

------------------------------------------------------------------------------
         lva |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lfs |  -.1879987   .0519786    -3.62   0.000    -.2902165   -.0857808
    propwell |    1.01175   .1179663     8.58   0.000     .7797651    1.243736
     propcan |    1.14663   .1882754     6.09   0.000     .7763792    1.516881
       lpopd |   .2375268   .0527616     4.50   0.000     .1337692    .3412845
      lplotn |   .2219142    .071631     3.10   0.002      .081049    .3627793
    lfamsize |   .0256384   .0846921     0.30   0.762    -.1409118    .1921886
       _cons |   7.071058    .171034    41.34   0.000     6.734713    7.407403
------------------------------------------------------------------------------
5. Example
- Test the joint restriction b(lfamsize) = 0 and b(propwell) = b(propcan) by excluding lfamsize and creating a variable (propirr) which is the sum of propwell and propcan:

      Source |       SS       df       MS              Number of obs =     369
-------------+------------------------------           F(  4,   364) =   44.37
       Model |  67.3785946     4  16.8446486           Prob > F      =  0.0000
    Residual |  138.189879   364  .379642526           R-squared     =  0.3278
-------------+------------------------------           Adj R-squared =  0.3204
       Total |  205.568474   368  .558609983           Root MSE      =  .61615

------------------------------------------------------------------------------
         lva |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lfs |  -.1845138   .0516282    -3.57   0.000    -.2860408   -.0829868
     propirr |   1.034815     .11124     9.30   0.000     .8160611    1.253569
       lpopd |   .2299474   .0515964     4.46   0.000     .1284829    .3314119
      lplotn |   .2234388   .0682978     3.27   0.001      .089131    .3577466
       _cons |   7.110554   .1296769    54.83   0.000     6.855544    7.365564
------------------------------------------------------------------------------
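- Assuming the lecture's farm dataset is in memory (variable names as above), the same joint restriction could also be checked with a Wald test after the unrestricted regression, without constructing propirr by hand; a minimal sketch:

    * Wald/F test of two joint restrictions after the unrestricted regression
    regress lva lfs propwell propcan lpopd lplotn lfamsize
    test (lfamsize = 0) (propwell = propcan)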
6. Example of hypothesis testing
- We can test using classical or ML methods, comparing the two regressions (restricted and unrestricted). We need information on the RSS and the likelihoods.
- Remember that
- F = [(RSSr - RSSu)/q] / [RSSu/(n - k)] ~ F(q, n - k)
- and LR = -2(logLr - logLu) ~ χ²(q).
7. Example
- For the F test:
- F = 0.2886 < 3.0, the 5% critical value of F(2, 362). Cannot reject H0.
- q is the number of restrictions in the null.
- It is sometimes tricky to work out the number of restrictions in hypothesis tests. q (the degrees of freedom in the numerator) can be calculated as the difference between the degrees of freedom of the restricted and unrestricted models. In the denominator we simply have the number of degrees of freedom of the unrestricted model.
- LR test: -2(logLr - logLu) = 0.6 < 5.99, the 5% critical value of χ²(2). Again, cannot reject H0.
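- A minimal sketch of the F calculation, using the RSS figures reported in the two regressions above (the slide reports 0.2886; small differences reflect rounding in the reported values):

    * F test of q = 2 restrictions computed from the restricted and unrestricted RSS
    scalar RSSu = 137.966779
    scalar RSSr = 138.189879
    scalar q    = 2
    scalar dfu  = 362
    scalar F    = ((RSSr - RSSu)/q) / (RSSu/dfu)
    display "F = " F "   5% critical value = " invFtail(q, dfu, 0.05)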
8. Specification Problems: Stability testing
- A common problem is that the regression may not apply over the whole sample.
- e.g. the oil price shocks in the 1970s may have altered price and wage setting and hence the parameters of the Phillips curve.
- We use a Chow test.
- Suppose we want to test whether the sample should be split into two sub-samples. We could
- (a) estimate one equation for each sub-sample, or
- (b) estimate one equation over the whole sample.
- The RSS in the latter case must be larger than the (sum of the) former. If there is a big enough difference we reject the null of stability.
9. Chow Test
- Formally, we wish to test for a break after n1 periods.
- Estimate y = Xβ + e over the first n1 observations. Obtain RSS1.
- Then estimate y = Xβ' + e over the remaining n2 observations. Obtain RSS2.
- Calculate RSSU = RSS1 + RSS2.
- Estimate the equation over the whole sample. Obtain RSSR.
- Use the F test: F = [(RSSR - RSSU)/k] / [RSSU/(n1 + n2 - 2k)] ~ F(k, n1 + n2 - 2k).
- If it is greater than the critical value, reject H0: the coefficients are stable.
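- A minimal sketch of the Chow calculation on made-up data (the break point, sample size and data-generating process are all assumptions for illustration):

    * Chow test: compare the pooled RSS with the sum of the two sub-sample RSS
    clear
    set obs 100
    set seed 123
    gen t = _n
    gen x = rnormal()
    gen y = 1 + 2*x + cond(t > 50, 0.5*x, 0) + rnormal()   // slope shifts after obs 50
    regress y x if t <= 50
    scalar RSS1 = e(rss)
    regress y x if t > 50
    scalar RSS2 = e(rss)
    regress y x
    scalar RSSR = e(rss)
    scalar k = 2                                            // constant + slope
    scalar F = ((RSSR - RSS1 - RSS2)/k) / ((RSS1 + RSS2)/(_N - 2*k))
    display "Chow F = " F "   5% critical value = " invFtail(k, _N - 2*k, 0.05)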
10. Specification Problems: Omitted explanatory variables
- The true model is Y = β1 + β2 X2 + β3 X3 + u
- but we estimate Y = β1 + β2 X2 + u by mistake, i.e. we leave out X3.
- Consequences: the OLS estimate of β2 is in general biased:
- b2 = β2 + β3 (Σx2 x3 / Σx2²) + Σx2 u / Σx2²  (in deviation form)
- and E(b2) = β2 + β3 (Σx2 x3 / Σx2²).
- This equals β2 only if (a) β3 = 0 or (b) there is zero correlation between X2 and X3.
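- A small simulation sketch of this bias (the numbers and the correlation between X2 and X3 are illustrative assumptions):

    * Omitted-variable bias: x2 and x3 are positively correlated and b3 > 0,
    * so dropping x3 biases the coefficient on x2 upwards
    clear
    set obs 1000
    set seed 456
    gen x3 = rnormal()
    gen x2 = 0.6*x3 + rnormal()
    gen y  = 1 + 1*x2 + 2*x3 + rnormal()
    regress y x2 x3        // correct model: coefficient on x2 close to 1
    regress y x2           // x3 omitted: coefficient on x2 biased upwards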
11. Omitted explanatory variables: other consequences
- The estimates are also inconsistent, unless all correlations between included and excluded variables disappear as n → ∞.
- The estimated variance of b2 in the mis-specified model could be larger or smaller than in the correct model. In practice, it is often 'too large'.
- Omission could lead to symptoms such as autocorrelation or heteroscedasticity in the error term (see later).
- Tests of functional form, stability and normality of the errors may be failed.
- The exercise had an example of the omission of house characteristics in a regression of price on distance to an incinerator.
12. Specification Problems: Inclusion of an irrelevant regressor
- This is generally a less severe problem.
- The true model is Y = β1 + β2 X2 + u
- We estimate Y = β1 + β2 X2 + β3 X3 + u
- The coefficient b3 should be insignificantly different from zero (its true value). The estimate of b2 will be unbiased but, in general, there is some loss in efficiency,
- i.e. the sampling variance of b2 is larger than it need be. Hence, we should omit X3.
13. Specification Problems: Multicollinearity
- If two variables move together (over time or in cross-section) then it is difficult to tell which is influencing Y.
- If they move exactly together, then it is impossible. Multicollinearity is a problem of degree, ranging from perfect to zero.
- Perfect multicollinearity:
- There is an exact linear relationship between variables, e.g. X2 = a + bX3.
- X has less than full column rank.
- (X'X)-1 does not exist, so there are no estimates.
- In practice, perfect multicollinearity often occurs by mistake: entering the same X variable twice, or too many dummy variables.
14. Near Perfect Multicollinearity
- Approximate linear dependence: X still has full column rank, so OLS estimates exist.
- (X'X) is nearly singular (i.e. its determinant is near to 0) and (X'X)-1 is large.
- OLS is still BLUE, but V(b) may be large.
- Hence inference may be affected: t-ratios are small and you will think variables are not significant when in fact they are.
- All models contain some multicollinearity; the question is when it is a problem.
15. Near Perfect Multicollinearity
- Symptoms
- There is no one test, since multicollinearity is a question of degree.
- High R² and F, but high variances and low t-statistics.
- High correlations between X variables (neither necessary nor sufficient).
- Small changes in the sample data lead to big changes in the estimates (of coefficients and variances).
- Cures
- Respecify after thinking hard about the possible causalities and correlations.
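- One common diagnostic, sketched below on Stata's bundled auto data (not the farm data), is the variance inflation factor of each regressor:

    * Variance inflation factors after an OLS regression
    sysuse auto, clear
    regress price mpg weight length
    estat vif        // large VIFs (e.g. above 10) suggest troublesome collinearity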
16. Specification Problems: Functional form
- We have assumed the equation is linear. Ramsey's RESET tests whether this is appropriate.
- One idea is to add powers of the X variables to the estimating equation. Instead of using powers of X, however, RESET uses powers of the fitted values ŷ.
- The procedure is:
- 1. estimate Y = b1 + b2 X2 + ... + bk Xk + u
- 2. obtain the fitted values ŷ
- 3. estimate Y = b'1 + b'2 X2 + ... + b'k Xk + c·ŷ² + u'
- 4. test for the significance of c. If it is significantly different from 0, reject the linear specification.
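- A sketch of this procedure in Stata, assuming the lecture's farm data is in memory; it mirrors the regression reported on the next slide, and estat ovtest is Stata's built-in RESET (which uses several powers of the fitted values):

    * Manual RESET: add the squared fitted values and test their significance
    regress lva lfs propirr lpopd frag
    predict fits, xb
    gen fits2 = fits^2
    regress lva lfs propirr lpopd frag fits2
    * Built-in version of the Ramsey RESET test
    regress lva lfs propirr lpopd frag
    estat ovtest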
17. Example: RESET Test on Farm data
- Take the fitted values and square them → fits2.
- reg lva lfs propirr lpopd frag fits2

      Source |       SS       df       MS              Number of obs =     369
-------------+------------------------------           F(  5,   363) =   35.79
       Model |  67.8727105     5  13.5745421           Prob > F      =  0.0000
    Residual |  137.695763   363  .379327172           R-squared     =  0.3302
-------------+------------------------------           Adj R-squared =  0.3209
       Total |  205.568474   368  .558609983           Root MSE      =   .6159

------------------------------------------------------------------------------
         lva |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         lfs |  -.6413701   .4036006    -1.59   0.113    -1.435059    .1523188
     propirr |   3.653044   2.296729     1.59   0.113    -.8635211    8.169609
       lpopd |   .7999175   .5020516     1.59   0.112    -.1873774    1.787212
        frag |   .7609251   .4758566     1.60   0.111    -.1748567    1.696707
       fits2 |  -.1606089   .1407221    -1.14   0.254    -.4373418     .116124
       _cons |   15.21461   7.101781     2.14   0.033     1.248809    29.18041
------------------------------------------------------------------------------
18. Specification Problems: Normality
- The errors are assumed to be Normal for inference. Are they?
- Tests are based on the skewness and kurtosis:
- skewness is related to the 3rd moment, kurtosis to the 4th;
- these should be 0 and 3 respectively under Normality.
- The Jarque-Bera test is based on this, with H0: Normality:
- JB = n [S²/6 + (K - 3)²/24] ~ χ²(2) under the null,
- where S = m3/m2^(3/2), K = m4/m2², and each mj is the sum of the residuals raised to the power j, divided by the sample size.
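- A minimal sketch of the calculation on Stata's auto data (the regression itself is only illustrative):

    * Jarque-Bera statistic from the skewness and kurtosis of the residuals
    sysuse auto, clear
    regress price mpg weight
    predict ehat, resid
    summarize ehat, detail
    scalar JB = r(N)/6 * (r(skewness)^2 + (r(kurtosis) - 3)^2/4)
    display "JB = " JB "   5% critical value of chi2(2) = " invchi2tail(2, 0.05)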
19. Heteroscedasticity
- We now relax some of the classical assumptions, starting with assumption A2.
- Recall
- A2: E(uu') = σ²I.
- This is a matrix with σ² on the diagonal and zeroes everywhere else. We can relax this in two ways:
- 1. have non-constant values on the diagonal;
- 2. have non-zero elements off the diagonal.
- 1. leads to the problem of heteroscedasticity,
- 2. leads to autocorrelation (later).
20. Heteroscedasticity
- The variance of the error term now varies across observations.
- It may increase with one X variable, with time, or in some more complex way.
- (NB we still assume independence of the errors.)
- It often occurs in cross-section data when there is a wide range to the X variables,
- e.g. holiday expenditure against income: expenditure probably rises with income but also becomes more variable.
- It can also arise in grouped data, when each observation is an average for a group (village, industry, etc.) and the groups are of different sizes.
21. Heteroscedasticity
- We now assume
- V(u) = E(uu') = σ²Ω, a diagonal matrix with σ1², σ2², ..., σn² on the diagonal and zeroes elsewhere,
- and in general σ1² ≠ σ2², etc.
22. Heteroscedasticity
23. Consequences of heteroscedasticity
- OLS estimates of β are unbiased and consistent,
- but inefficient (not BLUE) and not asymptotically efficient.
- E(b) = β + E(Σxu/Σx²) = β: there is still no correlation between x and u (E(xu) = 0), hence OLS is still unbiased.
- The estimate of the error variance is biased, invalidating inference.
- Inefficiency is harder to demonstrate:
- OLS gives equal weight to all observations.
- If we knew which error terms have high variance, it would be better to give less weight to them.
24. Goldfeld-Quandt test
- Applicable if the heteroscedasticity is related to only one of the x variables, say x1. The procedure is (where σ² increases with x1):
- order your observations in increasing order of x1;
- omit the central r ≈ n/3 observations;
- run a regression on the first n1 = (n - r)/2 observations, obtaining RSS1;
- run a regression on the last n2 = (n - r)/2 observations, obtaining RSS2;
- the test statistic for H0: σ1² = σ2² = ... = σn² is
- F = RSS2/RSS1 ~ F(c, c), where c = (n - r - 2k)/2.
- It is a bit restrictive since you have to know which variable is x1.
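- A minimal sketch on Stata's auto data, taking weight as the suspect x1 (the split points are illustrative):

    * Goldfeld-Quandt: compare residual variances in the low-x1 and high-x1 blocks
    sysuse auto, clear
    sort weight                           // order by the suspect variable
    regress price mpg weight in 1/25      // first block
    scalar RSS1 = e(rss)
    scalar c1   = e(df_r)
    regress price mpg weight in 50/74     // last block (central ~1/3 omitted)
    scalar RSS2 = e(rss)
    scalar c2   = e(df_r)
    display "GQ F = " RSS2/RSS1 "   5% critical value = " invFtail(c2, c1, 0.05)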
25. White's test
- We assume that V(ui) = σ² + γ[E(Yi)]²,
- i.e. the variance of u depends upon the expected value of Y (and hence on at least one of the x variables). Hence
- V(ui) = σ² + γ(β1 + β2 x2 + ... + βk xk)².
- If H0: γ = 0 holds then we have homoscedasticity, otherwise heteroscedasticity.
- Hence we regress e² on x2², x3², etc. and x2·x3, x3·x4, etc. (i.e. all the cross-products).
- From this auxiliary regression we obtain nR² as an LM statistic, distributed as χ² with degrees of freedom equal to the number of regressors in the auxiliary regression (excluding the constant).
26. Example: Farm data
- After running our standard regression, regress the squared residuals on the original Xs, plus their squares and cross-products.
- nR² = 369 × 0.0803 = 29.6 ~ χ²(14). Here the 5% critical value is 26.1 and the 1% value is 29.1, so it is significant at the 1% level.
- Reject the null of homoscedasticity.
- A simpler version uses the squared fitted values ŷ² instead of all the Xs.
- In other words the auxiliary regression is
- e² = a + b·ŷ² + v.
- The test is of H0: b = 0.
- This gives similar results: t = -2.1, again between the 5% and 1% critical values.
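- Stata has built-in close analogues of both versions, sketched here on the bundled auto data (the farm data would be used in the same way):

    * White's general test and a fitted-values-based variant
    sysuse auto, clear
    regress price mpg weight
    estat imtest, white      // nR² test with squares and cross-products of the Xs
    estat hettest            // Breusch-Pagan/Cook-Weisberg test using the fitted values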
27. A now-standard cure
- White's heteroscedasticity-consistent standard errors (HCSEs).
- Since it is only the standard errors, rather than the coefficients themselves, which are biased (and inconsistent), another approach is to find alternative estimates of the standard errors.
- White suggests replacing σi² = σ² with σ̂i² = ei², the square of the appropriate OLS residual. This is implemented in Stata using the robust option.
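- A minimal sketch of the robust option (auto data for illustration):

    * Conventional versus heteroscedasticity-robust standard errors
    sysuse auto, clear
    regress price mpg weight                 // conventional OLS standard errors
    regress price mpg weight, vce(robust)    // White/Huber robust standard errors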
28. Estimation when heteroscedasticity is present
- We illustrate the use of generalised least squares (GLS).
- OLS assumes E(uu') = σ²I, a very restricted form of the error structure.
- GLS allows the assumption that E(uu') = σ²Ω, where Ω is any n × n matrix.
- By taking account of Ω, GLS estimates give us BLUEs.
- There are two (equivalent) ways of obtaining GLS estimates:
- use a different formula from OLS;
- transform the data and use OLS on the transformed data.
29. Generalised least squares (GLS)
- For efficient estimates we need to use the GLS estimator b = (X' Ω-1 X)-1 X' Ω-1 y.
- Let y = Xβ + u with E(uu') = σ²Ω, e.g. Ω with λi on the diagonal and zeroes elsewhere.
- Let PP' = Ω. (Such a P always exists if Ω is symmetric and positive-definite.)
- Hence P-1 Ω P-1' = I (easy to show)
- and P-1' P-1 = Ω-1 (NOTE the primes).
- Now premultiply the original model by P-1:
- P-1 y = P-1 Xβ + P-1 u, i.e. y* = X*β + u*,
- where y* = P-1 y, etc.
30. Generalised least squares (GLS)
- Note now that E(u*u*') = E(P-1 u u' P-1') = P-1 E(uu') P-1' = P-1 σ²Ω P-1' = σ²I.
- So the starred model satisfies the classical assumptions about u* and we could apply OLS to it to obtain BLUEs.
- We have b = (X*'X*)-1 X*'y* = (X' P-1' P-1 X)-1 X' P-1' P-1 y = (X' Ω-1 X)-1 X' Ω-1 y.
- Note that Ω-1 is the matrix with 1/λi on the diagonal.
- V(b) = σ² (X*'X*)-1 = σ² (X' Ω-1 X)-1,
- and our estimate of σ² is e' Ω-1 e/(n - k).
31. Generalised least squares (GLS)
- If we know Ω we can use the GLS formulae to find BLUEs. Alternatively, from Ω find P-1, transform the data and use OLS.
- How do we find P-1? We have to assume something, giving us feasible GLS. Suppose a graph suggests that σi² ∝ x2i², i.e. E(ui²) = σ² x2i².
- Hence Ω is the diagonal matrix with x21², x22², ..., x2n² on the diagonal.
32. Generalised least squares (GLS)
- Hence P-1 is the diagonal matrix with 1/x21, 1/x22, ..., 1/x2n on the diagonal,
- and so P-1 y = (y1/x21, y2/x22, ..., yn/x2n)'.
- Similarly, all the explanatory variables are divided by x2. Hence instead of estimating
- y = β1 + β2 x2 + ... + βk xk + u, we estimate (by OLS)
- y/x2 = β1/x2 + β2 + β3 x3/x2 + ... + βk xk/x2 + u/x2,
- i.e. we deflate all variables by x2.
- Note that V(u/x2) = V(u)/x2² = σ² x2²/x2² = σ².
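- A minimal sketch of this "deflate and run OLS" approach on simulated data (the data-generating process is an assumption for illustration); Stata's analytic weights give the equivalent weighted least squares fit directly:

    * Feasible GLS by transformation when V(u_i) = sigma^2 * x2_i^2
    clear
    set obs 200
    set seed 789
    gen x2 = 1 + 4*runiform()
    gen x3 = rnormal()
    gen y  = 1 + 0.5*x2 + 2*x3 + x2*rnormal()      // error s.d. proportional to x2
    regress y x2 x3                                 // plain OLS: unbiased but inefficient
    gen ystar  = y/x2
    gen invx2  = 1/x2
    gen x3star = x3/x2
    regress ystar invx2 x3star                      // transformed model: the constant estimates b2
    regress y x2 x3 [aweight=1/(x2^2)]              // equivalent WLS via analytic weights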