Conceptualizing Heteroskedasticity - PowerPoint PPT Presentation

Conceptualizing Heteroskedasticity & Autocorrelation Quantitative Methods II Lecture 18 Edmund Malesky, Ph.D., UCSD * ... – PowerPoint PPT presentation

Learn more at: http://irps.ucsd.edu

1
Conceptualizing Heteroskedasticity & Autocorrelation
  • Quantitative Methods II
  • Lecture 18

Edmund Malesky, Ph.D., UCSD
2
OLS Assumptions about Error Variance and
Covariance
Remember, the formula for covariance: $\operatorname{cov}(A,B) = E[(A-\mu_A)(B-\mu_B)]$
  • Just finished our discussion of Omitted Variable
    Bias
  • Violates the assumption $E(u) = 0$
  • This was only one of the assumptions we made
    about errors to show that OLS is BLUE
  • Also assumed $\operatorname{cov}(u) = E(uu') = \sigma^2 I_n$
  • That is, we assumed $u \sim (0, \sigma^2 I_n)$

3
What Should $uu'$ Look Like?
  • Note $uu'$ is an $n \times n$ matrix
  • Different from $u'u$, a scalar sum of squared
    errors
  • Variances of $u_1, \dots, u_n$ on the diagonal
  • Covariances of $u_1u_2$, $u_1u_3$, ... are off the diagonal

4
A Well-Behaved $uu'$ Matrix
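The matrix shown on this slide did not survive in the transcript. As a sketch, the well-behaved case implied by the assumption $E(uu') = \sigma^2 I_n$ stated earlier has a constant variance on the diagonal and zeros everywhere else:

```latex
E(uu') =
\begin{bmatrix}
\sigma^2 & 0        & \cdots & 0 \\
0        & \sigma^2 & \cdots & 0 \\
\vdots   & \vdots   & \ddots & \vdots \\
0        & 0        & \cdots & \sigma^2
\end{bmatrix}
= \sigma^2 I_n
```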
5
Violations of $E(uu') = \sigma^2 I_n$
  • Two basic reasons that $E(uu')$ may not be equal to
    $\sigma^2 I_n$
  • Diagonal elements of $uu'$ may not be constant
  • Off-diagonal elements of $uu'$ may not be zero

6
Problematic Population Error Variances and
Covariances
  • Problem of non-constant error variances is known
    as HETEROSKEDASTICITY
  • Problem of non-zero error covariances is known as
    AUTOCORRELATION
  • These are different problems and generally occur
    with different types of data.
  • Nevertheless, the implications for OLS are the
    same.

7
The Causes of Heteroskedasticity
  • Often a problem in cross-sectional data
    especially aggregate data
  • Accuracy of measures may differ across units
  • Data availability or number of observations
    within aggregate observations may differ
  • If the error is proportional to the size of the decision unit,
    then the variance is related to unit size (example: GDP)

8
Demonstration of the Homoskedasticity Assumption:
Predicted Line Drawn Under Homoskedasticity
[Figure: conditional densities f(y|x) drawn at x1, x2, x3, x4 along the predicted line; axes x and y. Variance across values of x is constant.]
9
Demonstration of the Homoskedasticity Assumption:
Predicted Line Drawn Under Heteroskedasticity
[Figure: conditional densities f(y|x) drawn at x1, x2, x3, x4 along the predicted line; axes x and y. Variance differs across values of x.]
10
(No Transcript)
11
Looking for Heteroskedasticity
  • In a classic case, a plot of residuals against
    the dependent variable or another variable will often
    produce a fan shape

12
Sometimes the variance is different across
different levels of the dependent variable.
13
Causes of Autocorrelation
  • Often a problem in time-series data
  • Spatial autocorrelation is possible and is more
    difficult to address
  • May be a result of measurement errors correlated
    over time
  • Any excluded x's that cause y, are uncorrelated
    with our x's, and are correlated over time
  • Wrong Functional Form

14
Looking for Autocorrelation
  • Plotting the residuals over time will often show
    an oscillating pattern
  • Correlation of $u_t$ and $u_{t-1}$ = .85
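A minimal sketch of how such a lag-one correlation could be checked numerically. The AR(1) process, the value rho = 0.85, the sample size, and the function names are illustrative assumptions, not the data behind the slide.

```python
import random

def lag1_correlation(resid):
    """Sample correlation between u_t and u_{t-1}."""
    current, lagged = resid[1:], resid[:-1]
    mean_c = sum(current) / len(current)
    mean_l = sum(lagged) / len(lagged)
    cov = sum((c - mean_c) * (l - mean_l) for c, l in zip(current, lagged))
    var_c = sum((c - mean_c) ** 2 for c in current)
    var_l = sum((l - mean_l) ** 2 for l in lagged)
    return cov / (var_c * var_l) ** 0.5

def simulate_ar1(n, rho, seed=0):
    """Hypothetical errors following u_t = rho * u_{t-1} + e_t, e_t ~ N(0, 1)."""
    rng = random.Random(seed)
    u = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        u.append(rho * u[-1] + rng.gauss(0, 1))
    return u

autocorrelated = simulate_ar1(2000, rho=0.85)
independent = simulate_ar1(2000, rho=0.0)
print("autocorrelated errors:", round(lag1_correlation(autocorrelated), 2))
print("independent errors:   ", round(lag1_correlation(independent), 2))
```

In practice the same function would be applied to the residuals of the fitted model, ordered by time.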

15
Looking for Autocorrelation
  • As compared to a non-autocorrelated model

16
How does it impact our results?
  • Does not cause bias or inconsistency in OLS
    estimators ($\hat{\beta}$).
  • R-squared also unaffected.
  • The estimated variance of $\hat{\beta}$ is biased without
    the homoskedasticity assumption.
  • T-statistics become invalid and the problem is
    not resolved by larger sample sizes.
  • Similarly, F-tests are invalid.
  • Moreover, if $\operatorname{Var}(u \mid X)$ is not constant, OLS is no
    longer BLUE. It is neither BEST nor EFFICIENT.
  • What can we do??

17
OLS if $E(uu')$ is not $\sigma^2 I_n$
  • If errors are heteroskedastic or autocorrelated,
    then our OLS model is
  • $Y = X\beta + u$
  • $E(u) = 0$
  • $\operatorname{Cov}(u) = E(uu') = W$
  • Where $W$ is an unknown $n \times n$ matrix
  • $u \sim (0, W)$

18
OLS is Still Unbiased if $E(uu')$ is not $\sigma^2 I_n$
  • We don't need $E(uu')$ for unbiasedness
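The derivation on this slide is missing from the transcript. A standard version of the argument, assuming the model $Y = X\beta + u$ and a zero conditional mean $E(u \mid X) = 0$ (slightly stronger than the $E(u) = 0$ written on the slides), runs as follows. Note that $E(uu')$ never appears:

```latex
\begin{aligned}
\hat{\beta} &= (X'X)^{-1}X'Y = (X'X)^{-1}X'(X\beta + u) = \beta + (X'X)^{-1}X'u \\
E(\hat{\beta} \mid X) &= \beta + (X'X)^{-1}X'\,E(u \mid X) = \beta
\end{aligned}
```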

19
But OLS is not Best if $E(uu')$ is not $\sigma^2 I_n$
  • Remember from our derivation of the variance of
    the $\hat{\beta}$s
  • Now, we square the distances to get the variance
    of $\hat{\beta}$s around the true $\beta$s
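The two equations for this slide were lost in extraction. A sketch of the usual steps, treating $X$ as fixed and using $\hat{\beta} = (X'X)^{-1}X'Y$ with $Y = X\beta + u$:

```latex
\begin{aligned}
\hat{\beta} - \beta &= (X'X)^{-1}X'u \\
\operatorname{cov}(\hat{\beta})
  &= E\left[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'\right]
   = (X'X)^{-1}X'\,E(uu')\,X(X'X)^{-1}
\end{aligned}
```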

20
Comparing the Variance of $\hat{\beta}$
  • Thus if $E(uu')$ is not $\sigma^2 I_n$ then
  • Recall CLM assumed $E(uu') = \sigma^2 I_n$ and thus
    estimated $\operatorname{cov}(\hat{\beta})$ as
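The formulas being compared are missing from the transcript. Assuming the general expression $\operatorname{cov}(\hat{\beta}) = (X'X)^{-1}X'E(uu')X(X'X)^{-1}$, the two cases would read:

```latex
\begin{aligned}
E(uu') = W:\quad
  & \operatorname{cov}(\hat{\beta}) = (X'X)^{-1}X'WX(X'X)^{-1} \\
E(uu') = \sigma^2 I_n:\quad
  & \operatorname{cov}(\hat{\beta}) = \sigma^2 (X'X)^{-1}X'X(X'X)^{-1} = \sigma^2 (X'X)^{-1}
\end{aligned}
```

The second line is what conventional OLS output reports, so it is only correct when $W = \sigma^2 I_n$.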

21
Results of Heteroskedasticity and Autocorrelation
  • Thus if we unwittingly use OLS when we have
    heteroskedastic or autocorrelated errors, our
    estimates will have the wrong error variances
  • Thus our t-tests will also be wrong
  • Direction of bias depends on nature of the
    covariances and changing variances

22
What is Generalized Least Squares (GLS)?
  • One solution to both heteroskedasticity and
    autocorrelation is GLS
  • GLS is like OLS, but we provide the estimator
    with information about the variance and
    covariance of the errors
  • In practice the nature of this information will
    differ: specific applications of GLS will differ
    for heteroskedasticity and autocorrelation

23
From OLS to GLS
  • We began with the problem that $E(uu') = W$ instead
    of $E(uu') = \sigma^2 I_n$
  • Where $W$ is an unknown matrix
  • Thus we need to define a matrix of information $\Omega$
  • Such that $E(uu') = W = \Omega\sigma^2 I_n$
  • The $\Omega$ matrix summarizes the pattern of variances
    and covariances among the errors

24
From OLS to GLS
  • In the case of heteroskedasticity, we give
    information in $\Omega$ about variance of the errors
  • In the case of autocorrelation, we give
    information in $\Omega$ about covariance of the errors
  • To counterbalance the impact of the variances and
    covariances in $\Omega$, we multiply our OLS estimator by
    $\Omega^{-1}$

25
From OLS to GLS
  • We do this because
  • if $E(uu') = W = \Omega\sigma^2 I_n$
  • then $W\Omega^{-1} = \Omega\sigma^2 I_n \Omega^{-1} = \sigma^2 I_n$
  • Thus our new GLS estimator is
  • This estimator is unbiased and has a variance
26
What IS GLS?
  • Conceptually what GLS is doing is weighting the
    data
  • Notice we are multiplying X and y by the inverse
    of error covariance $\Omega$
  • We weight the data to counterbalance the variance
    and covariance of the errors

27
GLS, Heteroskedasticity and Autocorrelation
  • For heteroskedasticity, we weight by the inverse
    of the variable associated with the variance of
    the errors
  • For autocorrelation, we weight by the inverse of
    the covariance among errors
  • This is also referred to as weighted regression

28
The Problem of Heteroskedasticity
  • Heteroskedasticity is one of two possible
    violations of our assumption $E(uu') = \sigma^2 I_n$
  • Specifically, it is a violation of the assumption
    of constant error variance
  • If errors are heteroskedastic, then coefficients
    are unbiased, but standard errors and t-tests are
    wrong.

29
How Do We Diagnose Heteroskedasticity?
  • There are numerous possible tests for
    heteroskedasticity
  • We have used two: the White test (whitetst) and hettest.
  • All of them consist of taking residuals from our
    equation and looking for patterns in variances.
  • Thus no single test is definitive, since we can't
    look everywhere.
  • As you have noticed, sometimes hettest and
    whitetst conflict.

30
Heteroskedasticity Tests
  • Informal Methods
  • Graph the data and look for patterns!
  • The Residual versus Fitted plot is an excellent
    one.
  • Look for differences in variance across the
    fitted values, as we did above.

31
Heteroskedasticity Tests
  • Goldfeld-Quandt test
  • Sort the n cases by the x that you think is
    correlated with $u_i^2$.
  • Drop a section of c cases out of the
    middle (one-fifth is a reasonable number).
  • Run separate regressions on both upper and lower
    samples.

32
Heteroskedasticity Tests
  • Goldfeld-Quandt test (cont.)
  • The ratio of the error variances in the two
    regressions has an F distribution
  • $n_1 - k_1$ is the degrees of freedom for the first
    regression and $n_2 - k_2$ is the degrees of freedom
    for the second
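The Goldfeld-Quandt steps can be sketched in a few lines. This is a simplified illustration with a single regressor and simulated data; the function names, the sample size, and the choice to put the high-x sample in the numerator are assumptions made for the example.

```python
import random

def ols_ssr(x, y):
    """Sum of squared residuals from regressing y on x (with intercept)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))

def goldfeld_quandt(x, y, drop_frac=0.2, k=2):
    """Return (F, df_high, df_low); k = coefficients estimated per regression."""
    pairs = sorted(zip(x, y))                  # sort the cases by x
    n = len(pairs)
    keep = (n - int(n * drop_frac)) // 2       # size of each outer sample
    low, high = pairs[:keep], pairs[n - keep:] # middle cases are dropped
    s2_low = ols_ssr(*zip(*low)) / (keep - k)
    s2_high = ols_ssr(*zip(*high)) / (keep - k)
    return s2_high / s2_low, keep - k, keep - k

rng = random.Random(1)
x = [rng.uniform(1, 10) for _ in range(200)]
y_het = [1 + 2 * a + a * rng.gauss(0, 1) for a in x]   # error s.d. grows with x
y_hom = [1 + 2 * a + 5 * rng.gauss(0, 1) for a in x]   # constant error s.d.

f_het, df1, df2 = goldfeld_quandt(x, y_het)
f_hom, _, _ = goldfeld_quandt(x, y_hom)
print(f"heteroskedastic data: F({df1}, {df2}) = {f_het:.2f}")
print(f"homoskedastic data:   F({df1}, {df2}) = {f_hom:.2f}")
```

The resulting F value would then be compared with the critical value for the two degrees of freedom; a value far above 1 points to a larger error variance in the high-x sample.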

33
Heteroskedasticity Tests
  • Breusch-Pagan Test (Wooldridge, 281).
  • Useful if Heteroskedasticity depends on more than
    one variable
  • Estimate model with OLS
  • Obtain the squared residuals
  • Estimate the equation

34
Heteroskedasticity Tests
  • Where $z_1, \dots, z_k$ are the variables that are possible
    sources of heteroskedasticity.
  • The ratio of the explained sum of squares to the
    variance of the residuals tells us if this model
    is getting any purchase on the size of the errors
  • It turns out that
  • Where $k$ = the number of $z$ variables
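A rough sketch of the Breusch-Pagan procedure follows. The slide's own test statistic is missing from the transcript, so this example uses the LM form n·R² from the auxiliary regression, as in Wooldridge's presentation of the test; that choice, the single z variable, the simulated data, and the function names are all assumptions of the example.

```python
import random

def fit_line(x, y):
    """Intercept and slope from regressing y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    slope = (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
             / sum((a - xbar) ** 2 for a in x))
    return ybar - slope * xbar, slope

def r_squared(x, y):
    """R-squared from regressing y on x (with intercept)."""
    b0, b1 = fit_line(x, y)
    ybar = sum(y) / len(y)
    ssr = sum((b - b0 - b1 * a) ** 2 for a, b in zip(x, y))
    sst = sum((b - ybar) ** 2 for b in y)
    return 1 - ssr / sst

def breusch_pagan_lm(x, y, z):
    """LM = n * R^2 from regressing squared OLS residuals on z."""
    b0, b1 = fit_line(x, y)                               # step 1: OLS
    u2 = [(b - b0 - b1 * a) ** 2 for a, b in zip(x, y)]   # step 2: squared residuals
    return len(y) * r_squared(z, u2)                      # step 3: auxiliary regression

CHI2_1DF_5PCT = 3.841   # 5% critical value of chi-square with 1 degree of freedom

rng = random.Random(2)
x = [rng.uniform(1, 10) for _ in range(500)]
y_het = [1 + 2 * a + a * rng.gauss(0, 1) for a in x]   # error s.d. grows with x
y_hom = [1 + 2 * a + 5 * rng.gauss(0, 1) for a in x]   # constant error s.d.

lm_het = breusch_pagan_lm(x, y_het, z=x)
lm_hom = breusch_pagan_lm(x, y_hom, z=x)
print(f"heteroskedastic data: LM = {lm_het:.2f} (reject if > {CHI2_1DF_5PCT})")
print(f"homoskedastic data:   LM = {lm_hom:.2f}")
```

With k variables in z, the statistic would be compared with a chi-square distribution with k degrees of freedom instead.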

35
White Test (WHITETST)
  • Estimate the model using OLS. Obtain the OLS
    residuals and the predicted values. Compute the
    squared residuals and squared predicted values.
  • Run the equation
  • Keep the R2 from this regression.
  • Form the F-statistic and compute the p-value.
    Stata uses the $\chi^2$ distribution which resembles
    the F distribution.
  • Look for a significant p-value.
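The auxiliary equation is missing from the transcript. Since the steps use only the fitted values and their squares, it is presumably the special form of the White test described in Wooldridge; under that assumption the regression and the two versions of the statistic would be:

```latex
\begin{aligned}
\hat{u}^2 &= \delta_0 + \delta_1 \hat{y} + \delta_2 \hat{y}^2 + v \\
F &= \frac{R^2_{\hat{u}^2}/2}{(1 - R^2_{\hat{u}^2})/(n - 3)} \sim F_{2,\,n-3},
\qquad
LM = n \cdot R^2_{\hat{u}^2} \sim \chi^2_2
\end{aligned}
```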

36
Problems with tests of Heteroskedasticity
  • Tests rely on the first four assumptions of the
    classical linear model being true!
  • If assumption 4 (the zero conditional mean
    assumption) is violated, then a test for
    heteroskedasticity may reject the null hypothesis
    even if $\operatorname{Var}(y \mid X)$ is constant.
  • This is true if our functional form is specified
    incorrectly (omitting a quadratic term or
    specifying a log instead of a level).

37
If Heteroskedasticity is discovered
  • The solution we have learned thus far and the
    easiest solution overall is to use the
    heteroskedasticity-robust standard error.
  • In Stata, this is the robust option added after the
    variable list in the regression command (for example, reg y x1 x2, robust).

38
Remedying Heteroskedasticity: Robust Standard
Errors
  • By hand, we use the formula
  • The square root of this formula is the
    heteroskedasticity robust standard error.
  • t-statistics are calculated using the new
    standard error.
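The formula itself is missing from the transcript. As an illustration only, the sketch below uses the simple-regression version of the heteroskedasticity-robust variance, the sum of (x_i − x̄)² û_i² divided by SST_x², and compares it with the conventional standard error. The simulated data and the function name are assumptions of the example.

```python
import random

def slope_with_standard_errors(x, y):
    """Return (slope, conventional SE, heteroskedasticity-robust SE)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sst_x = sum((a - xbar) ** 2 for a in x)
    slope = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / sst_x
    intercept = ybar - slope * xbar
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    # Conventional: one common error variance for every observation
    sigma2 = sum(u * u for u in resid) / (n - 2)
    conventional = (sigma2 / sst_x) ** 0.5
    # Robust: each squared residual stands in for its own error variance
    robust_var = sum((a - xbar) ** 2 * u * u for a, u in zip(x, resid)) / sst_x ** 2
    return slope, conventional, robust_var ** 0.5

rng = random.Random(3)
x = [rng.uniform(1, 10) for _ in range(1000)]
y_hom = [1 + 2 * a + 5 * rng.gauss(0, 1) for a in x]   # constant error s.d.
y_het = [1 + 2 * a + a * rng.gauss(0, 1) for a in x]   # error s.d. grows with x

b_hom, se_hom, rse_hom = slope_with_standard_errors(x, y_hom)
b_het, se_het, rse_het = slope_with_standard_errors(x, y_het)
print(f"homoskedastic:   slope={b_hom:.3f} conventional={se_hom:.4f} robust={rse_hom:.4f}")
print(f"heteroskedastic: slope={b_het:.3f} conventional={se_het:.4f} robust={rse_het:.4f}")
```

The robust t-statistic is the slope divided by the robust standard error. With homoskedastic errors the two standard errors should be close; they can diverge when the error variance changes with x.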

39
Remedying Heteroskedasticity: GLS, WLS, FGLS
  • Generalized Least Squares
  • Adds the $\Omega^{-1}$ matrix to our OLS estimator to
    eliminate the pattern of error variances and
    covariances
  • A.K.A. Weighted Least Squares
  • An estimator used to adjust for a known form of
    heteroskedasticity where each squared residual is
    weighted by the inverse of the estimated variance
    of the error.
  • Rather than explicitly creating $\Omega^{-1}$ we can weight
    the data and perform OLS on the transformed
    variables.
  • Feasible Generalized Least Squares
  • A type of WLS where the variance or correlation
    parameters are unknown and therefore must first
    be estimated.

40
Before robust, statisticians used Generalized or
Weighted Least Squares
  • Recall our GLS Estimator
  • We can estimate this equation by weighting our
    independent and dependent variables and then
    doing OLS
  • But what is the correct weight?

41
GLS, WLS and Heteroskedasticity
  • Note that we have $X'X$ and $X'y$ in this equation
  • Thus to get the appropriate weight for the X's
    and y's we need to define a new matrix $F$
  • Such that $F'F$ is an $n \times n$ matrix where
  • $F'F = \Omega^{-1}$

42
GLS, WLS and Heteroskedasticity
  • Then we can weight the x's and y by $F$ such that
  • $X^* = FX$ and $y^* = Fy$
  • Now we can see that
  • Thus performing OLS on the transformed data IS
    the WLS or FGLS estimator
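The equation after "Now we can see that" is missing. With $X^* = FX$, $y^* = Fy$ and $F'F = \Omega^{-1}$, the step it most likely showed is the standard identity:

```latex
\begin{aligned}
\hat{\beta}^{*} &= (X^{*\prime}X^{*})^{-1}X^{*\prime}y^{*} \\
                &= (X'F'FX)^{-1}X'F'Fy \\
                &= (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y = \hat{\beta}_{GLS}
\end{aligned}
```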

43
How Do We Choose the Weight?
  • Now our only remaining job is to figure out what
    F should be
  • Recall if there is a heteroskedasticity problem,
    then

44
Determining F
  • Thus

45
Determining F
  • And since $F'F = \Omega^{-1}$
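The matrices on the last three slides did not survive extraction. A sketch of the pure heteroskedasticity case, assuming $\operatorname{Var}(u_i) = \sigma^2 h_i$ with no covariance between errors (the symbol $h_i$ is borrowed from the weighting variable $h$ on the next slide):

```latex
\begin{aligned}
E(uu') &= \sigma^2\Omega = \sigma^2
\begin{bmatrix}
h_1    & 0      & \cdots & 0 \\
0      & h_2    & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0      & 0      & \cdots & h_n
\end{bmatrix} \\
\Omega^{-1} &= \operatorname{diag}\left(\tfrac{1}{h_1}, \tfrac{1}{h_2}, \dots, \tfrac{1}{h_n}\right) \\
F &= \operatorname{diag}\left(\tfrac{1}{\sqrt{h_1}}, \tfrac{1}{\sqrt{h_2}}, \dots, \tfrac{1}{\sqrt{h_n}}\right)
\quad\text{so that}\quad F'F = \Omega^{-1}
\end{aligned}
```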

46
Identifying our Weights
  • That is, if we believe that the variance of the
    errors depends on some variable $h$,
  • then we create our estimator by weighting our x
    and y variables by the square root of the inverse
    of that variable (WLS)
  • If the error variance is unknown, I estimate it by regressing
    the squared residuals on the independent variables
    and use the square root of the inverse of the
    predicted value (h-hat) as my weight.
  • Then we perform OLS on the equation
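A minimal sketch of the weighting step, assuming one regressor and a known variance variable h with Var(u) proportional to h. Every variable, including the constant, is multiplied by 1/sqrt(h), and OLS is then run without an intercept on the transformed data. The function names and simulated data are illustrative only; in FGLS the values of h would be replaced by the estimated h-hat.

```python
import random

def ols_two_regressors_no_constant(r1, r2, y):
    """Solve the 2x2 normal equations for y = a*r1 + b*r2 + error."""
    s11 = sum(a * a for a in r1)
    s12 = sum(a * b for a, b in zip(r1, r2))
    s22 = sum(b * b for b in r2)
    s1y = sum(a * c for a, c in zip(r1, y))
    s2y = sum(b * c for b, c in zip(r2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def wls_by_transformation(x, y, h):
    """Return (intercept, slope) from OLS on data weighted by 1/sqrt(h)."""
    w = [1 / v ** 0.5 for v in h]                  # square root of the inverse of h
    const_star = w                                 # transformed constant term
    x_star = [a * wi for a, wi in zip(x, w)]
    y_star = [b * wi for b, wi in zip(y, w)]
    return ols_two_regressors_no_constant(const_star, x_star, y_star)

rng = random.Random(4)
n = 2000
x = [rng.uniform(0, 10) for _ in range(n)]
h = [rng.uniform(1, 10) for _ in range(n)]         # hypothetical variance variable
y = [1 + 2 * a + hv ** 0.5 * rng.gauss(0, 5) for a, hv in zip(x, h)]

b0, b1 = wls_by_transformation(x, y, h)
print(f"WLS intercept = {b0:.3f}, slope = {b1:.3f}")
```

The coefficient on the transformed constant plays the role of the intercept. This matches the way the weight variable enters the regression as an ordinary regressor when the constant is suppressed.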

47
FGLS: An Example
  • I created a dataset where
  • $Y = 1 + 2x_1 - 3x_2 + u$
  • Where $u$ = h_hat × $e$
  • And $e \sim N(0, 25)$
  • $x_1$ and $x_2$ are uniform and uncorrelated
  • h_hat is uniform and uncorrelated with y or the
    x's
  • Thus, I will need to re-weight by h_hat

48
FGLS Properties
  • FGLS is no longer unbiased, but it is consistent
    and asymptotically efficient.

49
FGLS An Example
reg y x1 x2 Source SS df MS
Number of obs
100 ---------------------------------------
F( 2, 97) 16.31 Model
29489.1875 2 14744.5937 Prob gt
F 0.0000 Residual 87702.0026 97
904.144357 R-squared
0.2516 ---------------------------------------
Adj R-squared 0.2362 Total
117191.19 99 1183.74939 Root
MSE 30.069 -------------------------------
-----------------------------------------------
y Coef. Std. Err. t Pgtt
95 Conf. Interval ----------------------
--------------------------------------------------
----- x1 3.406085 1.045157 3.259
0.002 1.331737 5.480433 x2
-2.209726 .5262174 -4.199 0.000
-3.254122 -1.16533 _cons -18.47556
8.604419 -2.147 0.034 -35.55295
-1.398172 ----------------------------------------
--------------------------------------
50
Tests are Significant
. whitetst
White's general test statistic :  1.180962  Chi-sq( 2)  P-value =  .005

. bpagan x1 x2
Breusch-Pagan LM statistic:  5.175019  Chi-sq( 1)  P-value =  .0229
51
FGLS in STATA: Giving it the Weight
reg y x1 x2 [aweight=1/h_hat]
(sum of wgt is   4.9247e+001)

      Source |       SS       df       MS              Number of obs =     100
-------------+------------------------------           F(  2,    97) =   44.53
       Model |  26364.7129     2  13182.3564           Prob > F      =  0.0000
    Residual |   28716.157    97  296.042856           R-squared     =  0.4787
-------------+------------------------------           Adj R-squared =  0.4679
       Total |  55080.8698    99  556.372423           Root MSE      =  17.206

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |    2.35464   .7014901    3.357   0.001     .9623766    3.746904
          x2 |  -2.707453   .3307317   -8.186   0.000    -3.363863   -2.051042
       _cons |  -4.079022   5.515378   -0.740   0.461    -15.02552    6.867476
------------------------------------------------------------------------------
52
FGLS By Hand
reg yhhat x1hhat x2hhat weight, noc

      Source |       SS       df       MS              Number of obs =     100
-------------+------------------------------           F(  3,    97) =   75.54
       Model |  33037.8848     3  11012.6283           Prob > F      =  0.0000
    Residual |  14141.7508    97  145.791245           R-squared     =  0.7003
-------------+------------------------------           Adj R-squared =  0.6910
       Total |  47179.6355   100  471.796355           Root MSE      =  12.074

------------------------------------------------------------------------------
       yhhat |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      x1hhat |    2.35464   .7014901    3.357   0.001     .9623766    3.746904
      x2hhat |  -2.707453   .3307317   -8.186   0.000    -3.363863   -2.051042
      weight |  -4.079023   5.515378   -0.740   0.461    -15.02552    6.867476
------------------------------------------------------------------------------
53
Tests Now Not-Significant
. whitetst
White's general test statistic :  1.180962  Chi-sq( 2)  P-value =  .589

. bpagan x1 x2
Breusch-Pagan LM statistic:  5.175019  Chi-sq( 1)  P-value =  .229