Heteroskedasticity - PowerPoint PPT Presentation

Slides: 20
Provided by: unkn755
Learn more at: http://cob.jmu.edu
1
Heteroskedasticity
  • Outline
  • 1) What is it?
  • 2) What are the consequences for our Least
    Squares estimator when we have heteroskedasticity?
  • 3) How do we test for heteroskedasticity?
  • 4) How do we correct a model that has
    heteroskedasticity?

2
What is Heteroskedasticity?
Review the Gauss-Markov assumptions
  • Linear Regression Model:
    $y = \beta_1 + \beta_2 x + e$
  • Error term has a mean of zero: $E(e) = 0 \Rightarrow E(y) = \beta_1 + \beta_2 x$
  • Error term has constant variance: $Var(e) = E(e^2) = \sigma^2$
  • Error term is not correlated with itself (no
    serial correlation): $Cov(e_i, e_j) = E(e_i e_j) = 0$ for $i \neq j$
  • Data on X are not random and thus are
    uncorrelated with the error term: $Cov(X, e) = E(Xe) = 0$

The constant-variance assumption is the assumption of a homoskedastic error.
A homoskedastic error is one that has constant
variance. A heteroskedastic error is one that has
a nonconstant variance.
Heteroskedasticity is more commonly a problem for
cross-section data sets, although a time-series
model can also have a non-constant variance.
3
This diagram shows a non-constant variance for
the error term that appears to increase as X
increases. There are other possibilities. In
general, any error that has a non-constant
variance is heteroskedastic.
[Figure: conditional densities f(y|x) drawn at x1, x2, x3 along the x axis, with the spread of y increasing as x increases]
4
What are the Implications for Least Squares?
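  • We have to ask: where did we use the
    assumption? Or why was the assumption needed in
    the first place?
  • We used the assumption in the derivation of the
    variance formulas for the least squares
    estimators, $b_1$ and $b_2$.
  • For $b_2$ it was
    $Var(b_2) = \sum w_t^2 \sigma_t^2 = \sigma^2 / \sum (x_t - \bar{x})^2$,
    where $w_t = (x_t - \bar{x}) / \sum (x_t - \bar{x})^2$.

This last step uses the assumption that $\sigma_t^2$ is a
constant $\sigma^2$.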
5
If we don't make this assumption, then the
formula is
$Var(b_2) = \sum w_t^2 \sigma_t^2$
Remember
$w_t = (x_t - \bar{x}) / \sum (x_t - \bar{x})^2$
Therefore, if we ignore the problem of a
heteroskedastic error and estimate the variance
of $b_2$ using the formula on the previous slide,
when in fact we should have used the formula
directly on this slide, then our estimates of
the variance of $b_2$ are wrong. Any hypothesis
tests or confidence intervals based on them will
be invalid. However, $E(b_2) = \beta_2$. (Verify that
the proof of unbiasedness did not use the
assumption of a homoskedastic error.)
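The verification suggested above can be sketched in a few lines. The notation $w_t$ and $\bar{x}$ is an assumption here, since the slide's own formulas did not survive in the transcript; the steps follow the standard simple-regression decomposition of the least squares slope.

```latex
\begin{aligned}
b_2 &= \beta_2 + \sum_t w_t e_t,
  \qquad w_t = \frac{x_t - \bar{x}}{\sum_s (x_s - \bar{x})^2} \\
E(b_2) &= \beta_2 + \sum_t w_t \, E(e_t) = \beta_2
  \quad \text{(uses only } E(e_t) = 0 \text{ and nonrandom } x_t \text{)} \\
Var(b_2) &= \sum_t w_t^2 \, Var(e_t) = \sum_t w_t^2 \sigma_t^2
  \quad \text{(uses only the absence of serial correlation)}
\end{aligned}
```

The constant-variance assumption enters only if $\sigma_t^2$ is replaced by a common $\sigma^2$ in the last line, which is why the estimator stays unbiased while its usual variance formula does not hold.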
6
How Do We Test for a Heteroskedastic Error?
  • 1) Visual Inspection of the residuals
  • Because we never observe actual values for the
    error term, we never know for sure whether it is
    heteroskedastic or not. However, we can run a
    least squares regression and examine the
    residuals to see if they show a pattern
    consistent with a non-constant variance.
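A minimal SAS sketch of this visual check, assuming a data set named food with variables y and x as in the food expenditure example. The output data set foodout and the residual variable ehat are hypothetical names chosen for illustration, and PROC SGPLOT assumes SAS 9.2 or later.

```sas
/* Fit the least squares regression and save the residuals.        */
/* foodout and ehat are illustrative names, not from the slides.   */
proc reg data=food;
  model y = x;
  output out=foodout r=ehat;   /* ehat = least squares residuals */
run;
quit;

/* Plot the residuals against x and look for a spread that changes with x. */
proc sgplot data=foodout;
  scatter x=x y=ehat;
  refline 0 / axis=y;          /* horizontal reference line at zero */
run;
```

A fan shape that widens (or narrows) along the x axis is the kind of pattern that suggests a non-constant error variance.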

7
This regression resulted in the following
residuals plotted against the variable X (weekly
income). What do you see?
8
  • 2) Formal Tests for Heteroskedasticity (Goldfeld-Quandt
    Test)
  • There are many different tests; we will study the
    Goldfeld-Quandt test.
  • a) Examine the residuals and notice that the
    variance in the residuals appears to be larger
    for larger values of $x_t$
  • Must make some assumption about the form of the
    heteroskedasticity (how the variance of $e_t$
    changes)
  • For the food expenditure problem, the residuals
    tell us that an increasing function of $x_t$ (weekly
    income) is a good candidate. Other models may
    have a variance that is a decreasing function of
    $x_t$ or is a function of some variable other than
    $x_t$.

9
  • The Goldfeld-Quandt Test
  • Sort the data in descending order of $x_t$, then split
    the data in half.
  • Run the regression on each half of the data.
  • Use the SSE from each regression to conduct a
    formal hypothesis test for heteroskedasticity.
  • If the error is heteroskedastic with a larger
    variance for the larger values of $x_t$, then we
    should find
    $GQ = \hat{\sigma}_1^2 / \hat{\sigma}_2^2 > 1$

where $\hat{\sigma}_1^2 = SSE_1 / (t_1 - k)$ and $\hat{\sigma}_2^2 = SSE_2 / (t_2 - k)$,
and where $SSE_1$ comes from the
regression using the subset of large values of
$x_t$, which has $t_1$ observations, and $SSE_2$ comes from
the regression using the subset of small values
of $x_t$, which has $t_2$ observations. Here $k$ is the
number of coefficients estimated in each regression.
10
  • Conducting the Test

$H_0$: The error is homoskedastic, so that $\sigma_1^2 = \sigma_2^2$
$H_1$: The error is heteroskedastic, with $\sigma_1^2 > \sigma_2^2$
It can be shown that under $H_0$ the GQ statistic has an
F-distribution with $(t_1 - k)$ d.o.f. in the
numerator and $(t_2 - k)$ d.o.f. in the
denominator. If $GQ > F_c$, we reject $H_0$ and conclude
that the error is heteroskedastic.
11
Food Expenditure Example
This code sorts the data according to x, in descending order, because
we believe that the error variance is increasing
in $x_t$.
proc sort data=food;
  by descending x;
run;
This code estimates the model for the first 20
observations, which are the observations with
large values of $x_t$.
data food1;
  set food;
  if _n_ <= 20;   /* keep observations 1 to 20 */
run;
proc reg data=food1;
  bigvalues: model y = x;
run;
This code estimates the model for the second 20
observations, which are the observations with
small values of $x_t$.
data food2;
  set food;
  if _n_ >= 21;   /* keep observations 21 onward */
run;
proc reg data=food2;
  littlevalues: model y = x;
run;
12
The REG Procedure
Model: bigvalues
Dependent Variable: y

Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1        4756.81422     4756.81422       2.08    0.1663
Error              18             41147     2285.93938
Corrected Total    19             45904

Root MSE           47.81150    R-Square    0.1036
Dependent Mean    148.32250    Adj R-Sq    0.0538
Coeff Var          32.23483

Parameter Estimates
Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1              48.17674          70.24191       0.69      0.5015
x             1               0.11767           0.08157       1.44      0.1663

The REG Procedure
Model: littlevalues
Dependent Variable: y

Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               1        8370.95124     8370.95124      12.27    0.0025
Error              18             12284      682.45537
Corrected Total    19             20655

Root MSE           26.12385    R-Square    0.4053
Dependent Mean    112.30350    Adj R-Sq    0.3722
Coeff Var          23.26183

Parameter Estimates
Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept     1              12.93884          28.96658       0.45      0.6604
x             1               0.18234           0.05206       3.50      0.0025
$GQ = 2285.93938 / 682.45537 \approx 3.35 > F_c = 2.22$ (see SAS) $\Rightarrow$ Reject $H_0$
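The test arithmetic can be reproduced in a short DATA step. This is a sketch under stated assumptions: the two error mean squares and their 18 degrees of freedom are copied from the REG output above, the 5% significance level is an assumption (it is consistent with the critical value of 2.22 quoted on the slide), and the data set and variable names are hypothetical.

```sas
/* Goldfeld-Quandt statistic from the two error mean squares.   */
/* gq_test and all variable names are illustrative.             */
data gq_test;
  mse_big    = 2285.93938;   /* Error Mean Square, model bigvalues    */
  mse_little = 682.45537;    /* Error Mean Square, model littlevalues */
  df_big     = 18;           /* t1 - k = 20 - 2 */
  df_little  = 18;           /* t2 - k = 20 - 2 */
  gq     = mse_big / mse_little;                 /* about 3.35 */
  fc     = finv(0.95, df_big, df_little);        /* 5% critical value (assumed level) */
  pvalue = 1 - probf(gq, df_big, df_little);     /* upper-tail p-value */
run;

proc print data=gq_test;
run;
```

Because the statistic of about 3.35 exceeds the critical value, the printed row leads to the same conclusion as the slide: reject the hypothesis of a homoskedastic error.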
13
How Do We Correct for a Heteroskedastic Error?
  • 1) White Standard Errors: the correct formula for
    the variance of $b_2$ is
    $Var(b_2) = \sum w_t^2 \sigma_t^2$, where $w_t = (x_t - \bar{x}) / \sum (x_t - \bar{x})^2$
  • Estimate $\sigma_t^2$ in the above formula using the
    squared residual for each observation as the
    estimate of its variance.
  • This gives us what is called White's estimator
    of the variance of $b_2$.
  • In SAS: PROC REG;
  • MODEL Y = X / ACOV; RUN;
  • Food Expenditure example:
  • White standard error: se($b_2$) = 0.0382
  • Typical Least Squares: se($b_2$) = 0.0305
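As a hedged alternative to reading the standard errors off the ACOV matrix, later releases of SAS let PROC REG print heteroskedasticity-consistent standard errors directly. The sketch below assumes SAS/STAT 9.2 or later, where the MODEL statement accepts the HCC and HCCMETHOD= options, and assumes the data set food with variables y and x; the model labels are hypothetical.

```sas
/* Usual and heteroskedasticity-consistent standard errors side by side. */
/* Assumes SAS/STAT 9.2 or later; labels ols and white are illustrative. */
proc reg data=food;
  ols:   model y = x;                       /* typical least squares standard errors */
  white: model y = x / hcc hccmethod=0;     /* squared residuals, no small-sample adjustment */
run;
quit;
```

The coefficient estimates are the same in both models; only the reported standard errors, and therefore the t statistics, differ.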

14
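  • 2) Generalized Least Squares
  • Idea: Transform the model with a heteroskedastic
    error into a model with a homoskedastic error.
    Then do least squares.

$y_t = \beta_1 + \beta_2 x_t + e_t$
where $Var(e_t) = \sigma_t^2$.
Suppose we knew $\sigma_t$. Transform the model by
dividing every piece of it by the standard
deviation of the error:
$y_t/\sigma_t = \beta_1 (1/\sigma_t) + \beta_2 (x_t/\sigma_t) + e_t/\sigma_t$
This model has an error with a constant variance:
$Var(e_t/\sigma_t) = (1/\sigma_t^2) Var(e_t) = 1$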
15
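  • 2) Generalized Least Squares (cont.)
  • Problem: we don't know $\sigma_t$. This requires us to
    assume a specification for the error variance.
    Let's assume that the variance increases linearly
    with $x_t$.

$y_t = \beta_1 + \beta_2 x_t + e_t$
where $Var(e_t) = \sigma_t^2 = \sigma^2 x_t$.
Transform the model by dividing every piece of it
by $\sqrt{x_t}$, which is proportional to the standard deviation of the error:
$y_t/\sqrt{x_t} = \beta_1 (1/\sqrt{x_t}) + \beta_2 (x_t/\sqrt{x_t}) + e_t/\sqrt{x_t}$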
16
This new model has an error term that is the
original error term divided by the square root of
$x_t$. Its variance is constant.
  • This method is called Weighted Least Squares.
  • It is more efficient than Least Squares.
  • Least Squares gives equal weight to all
    observations.
  • Weighted Least Squares gives each observation a
    weight that is inversely related to the
    square root of $x_t$, so observations with large values
    of $x_t$, which we have assumed have a large variance,
    get less weight than observations with smaller values
    of $x_t$ when estimating the intercept and slope of the
    regression line.
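The claim that the transformed error has a constant variance can be checked in two lines, assuming the specification $Var(e_t) = \sigma^2 x_t$ with $x_t > 0$. The starred notation $e_t^*$ is introduced here for illustration.

```latex
\begin{aligned}
e_t^* &= \frac{e_t}{\sqrt{x_t}}, \qquad Var(e_t) = \sigma^2 x_t \\
Var(e_t^*) &= \frac{1}{x_t} \, Var(e_t) = \frac{\sigma^2 x_t}{x_t} = \sigma^2 \\
\sum_t (e_t^*)^2 &= \sum_t \frac{1}{x_t} \left( y_t - \beta_1 - \beta_2 x_t \right)^2
\end{aligned}
```

The last line shows where the name comes from: least squares on the transformed model minimizes a sum of squared errors in which observation $t$ carries weight $1/x_t$.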

17
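We need to estimate this model:
$y_t/\sqrt{x_t} = \beta_1 (1/\sqrt{x_t}) + \beta_2 (x_t/\sqrt{x_t}) + e_t/\sqrt{x_t}$
This requires us to construct 3 new variables:
$y_t^* = y_t/\sqrt{x_t}$, $x_{1t}^* = 1/\sqrt{x_t}$, $x_{2t}^* = x_t/\sqrt{x_t}$
We estimate this model:
$y_t^* = \beta_1 x_{1t}^* + \beta_2 x_{2t}^* + e_t^*$
Notice that it doesn't have an intercept.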
18
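SAS code to do Weighted Least Squares
data foodgls;               /* data set name is illustrative */
  set food;
  ystar  = y/sqrt(x);       /* transformed dependent variable */
  x1star = 1/sqrt(x);       /* transformed intercept column */
  x2star = x/sqrt(x);       /* transformed x */
run;
proc reg data=foodgls;
  foodgls: model ystar = x1star x2star / noint;
run;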
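The same weighted least squares fit can also be sketched with the WEIGHT statement of PROC REG instead of building the starred variables by hand. This assumes the data set food with variables y and x, and the variance specification $Var(e_t) = \sigma^2 x_t$ used above; the data set foodw, the weight variable w, and the model label foodwls are hypothetical names.

```sas
/* Weighted least squares via the WEIGHT statement.                 */
/* foodw, w, and foodwls are illustrative names.                    */
data foodw;
  set food;
  w = 1 / x;     /* weight proportional to 1/Var(e_t) when Var(e_t) = sigma^2 * x */
run;

proc reg data=foodw;
  foodwls: model y = x;   /* original variables, intercept included */
  weight w;               /* minimizes sum of w * (y - b1 - b2*x)**2 */
run;
quit;
```

Here the intercept and slope are reported for the original variables y and x, which can be easier to read than the no-intercept regression on the transformed variables; the two approaches minimize the same weighted sum of squares.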
