Heteroske...what?

Transcript and Presenter's Notes
1
Heteroske...what?
2
O.L.S. is B.L.U.E.
  • BLUE means Best Linear Unbiased Estimator.
  • What does that mean?
  • We need to define
  • Unbiased: The mean of the sampling distribution
    is the true population parameter.
  • What is a sampling distribution?
  • Imagine taking a sample, finding b, taking another
    sample, finding b again, and repeating over and over.
    The sampling distribution describes the possible
    values b can take on in repeated sampling.

3
We hope that E(b) = β

[Figure: sampling distribution of b centered on the true parameter β]

If the sampling distribution centers on the true
population parameter, our estimates will, on average,
be right. We get this with the 10 assumptions.
4
If some assumptions don't hold

[Figure: sampling distribution of b whose average is not the true β]

  • We can get a biased estimate. That is, E(b) ≠ β

5
Bias is Bad
  • If your parameter estimates are biased, your
    answers (coefficients) relating x and y are,
    on average, wrong. They do not describe the true
    relationship.

6
Efficiency / Inefficiency
  • What makes one unbiased estimator better than
    another?

7
Efficiency
  • Sampling Distributions with less variance
    (smaller standard errors) are more efficient
  • OLS is the Best linear unbiased estimator
    because its sampling distribution has less
    variance than that of any other linear unbiased
    estimator.

[Figure: two sampling distributions; the OLS Regression
distribution is narrower than the LAV Regression distribution]
8
Under the 10 regression assumptions and assuming
normally distributed errors
  • We will get estimates using OLS
  • Those estimates will be unbiased
  • Those estimates will be efficient (the best)
  • They will be the Best Unbiased Estimator out of
    all possible estimators

9
If we violate
  • No perfect collinearity or n > k
  • We cannot get any estimates; there is nothing we
    can do to fix it
  • Normal error term assumption
  • OLS is BLUE, but not BUE.
  • Heteroskedasticity or serial correlation
  • OLS is still unbiased, but not efficient
  • Everything else (omitted variables, endogeneity,
    linearity)
  • OLS is biased

10
What do Bias and Efficiency Mean?
[Figure: four sampling distributions around the true β,
one for each combination]
  • Biased, but very efficient
  • Unbiased, but inefficient
  • Biased and inefficient
  • Unbiased and efficient
11
Today: Heteroskedasticity
  • Consequence: OLS is still unbiased, but it is not
    efficient (and std. errors are wrong)
  • Today we will learn
  • How to diagnose heteroskedasticity
  • How to remedy heteroskedasticity
  • A new estimator for coefficients and std. errs.
  • Keeping the OLS estimator but fixing the std. errs.

12
What is heteroskedasticity?
  • Heteroskedasticity occurs when the size of the
    errors varies across observations. This arises
    generally in two ways.
  • When increases in an independent variable are
    associated with changes in the size of the
    prediction errors.

13
What is Heteroskedasticity?
  • Heteroskedasticity occurs when the size of the
    errors varies across observations. This arises
    generally in two ways.
  • When you have subgroups or clusters in your
    data.
  • We might try to predict presidential popularity.
    We measure average popularity in each year. Of
    course, there are clusters of years where the
    same president is in office. Because each
    president is unique, the errors in predicting
    Bush's popularity are likely to be a bit
    different from the errors predicting Clinton's.

14
How do we recognize this beast?
  • Three Methods
  • Think about your data: look for analogs of the two
    ways heteroskedasticity can strike.
  • Graphical Analysis
  • Formal statistical test

15
Graphical Analysis
  • Plot residuals against the independent
    variables.
  • Expect to see residuals randomly clustered around
    zero
  • However, you might see a pattern. This is bad.
  • Examples

16
[Figure: three residual plots showing heteroskedasticity]
  • scatter resid x: as x increases, so does the error variance
  • scatter resid x: as x increases, the error variance decreases
  • rvfplot (or scatter resid yhat): as the predicted value of y
    increases, so does the error variance
17
Good Examples
[Figure: plots with no heteroskedasticity; residuals cluster
randomly around zero]
  • scatter y x
  • scatter resid x
  • rvfplot (scatter resid yhat)
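
A minimal Stata sketch of the workflow behind these plots; the model
and variable names follow the running turnout example and are
otherwise assumptions.

. regress turnout diplomau mdnincm   // fit the model by OLS
. predict yhat, xb                   // save the fitted values
. predict resid, residuals           // save the OLS residuals
. scatter resid diplomau             // residuals vs. one regressor
. rvfplot, yline(0)                  // residuals vs. fitted values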
18
Formal Statistical Tests
  • White's Test
  • Heteroskedasticity occurs when the size of the
    errors is correlated with one or more independent
    variables.
  • We can run OLS, get the residuals, and then see
    if they are correlated with the independent
    variables

19
More Formally,

state  district   turnout  diplomau  mdnincm  pred_turnout    residual
AL            1   151,188      14.7   27,360     200,757.4   -49,569.4
AL            2   216,788      16.7   29,492       205,330    11,457.96
AL            3   147,317      12.3   26,800     197,491.7   -50,174.7
AL            4   226,409       8.1   25,401     191,310.8    35,098.16
AL            5   186,059      20.4   33,189     213,514.6   -27,455.6
20
So, if error increases with x, we violate
homoskedasticity
  • If we can predict the error with a regression line,
    we have heteroskedasticity.
  • To make this prediction, we need to make
    every residual positive, so we square them.

21
So, if error increases with x, we violate
homoskedasticity
  • Finally, we use these squared residuals as the
    dependent variable in a new regression.
  • If we can predict increases/decreases in the size
    of the residual, we have found evidence of
    heteroskedasticity
  • For Ind. Vars., we use the same ones as in the
    original regression plus their squares and their
    cross-products.

22
The Result
  • Take the R2 from this regression and multiply it
    by n.
  • This test statistic is distributed χ2 with
    degrees of freedom equal to the number of
    independent variables in the 2nd regression
  • In other words, nR2 is the χ2 you calculate from
    your data; compare it to a critical χ2 from a χ2
    table. If your calculated χ2 is greater than the
    critical χ2, then you reject the null hypothesis
    (of homoskedasticity)
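
As a sketch, the same test can be computed by hand in Stata; the
squared-term and cross-product variable names below are made up for
illustration.

. regress turnout diplomau mdnincm
. predict e, residuals
. generate e2 = e^2                        // squared residuals become the D.V.
. generate diplomau2 = diplomau^2          // squares of the regressors...
. generate mdnincm2 = mdnincm^2
. generate dipXinc = diplomau*mdnincm      // ...and their cross-product
. regress e2 diplomau mdnincm diplomau2 mdnincm2 dipXinc
. display "chi2(5) = " e(N)*e(r2)          // the n*R2 test statistic
. display "p = " chi2tail(5, e(N)*e(r2))   // compare to chi2, 5 d.f.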

23
A Sigh of Relief
  • Stata will calculate this for you
  • After running the regression, type
  • imtest, white

. imtest, white

White's test for Ho: homoskedasticity
         against Ha: unrestricted heteroskedasticity

         chi2(5)      =      9.97
         Prob > chi2  =    0.0762

Cameron & Trivedi's decomposition of IM-test

---------------------------------------------------
              Source |       chi2     df         p
---------------------+-----------------------------
  Heteroskedasticity |       9.97      5    0.0762
            Skewness |       3.96      2    0.1378
            Kurtosis |  -28247.96      1    1.0000
---------------------+-----------------------------
               Total |  -28234.03      8    1.0000
---------------------------------------------------
24
An Alternative Test: Breusch/Pagan
  • Based on similar logic
  • Three changes
  • 1. Instead of using e2 as the D.V. in the 2nd
    regression, use e_i^2 / σ̂^2, where σ̂^2 = Σ e_i^2 / n
  • 2. Instead of using every variable (plus squares and
    cross-products), you specify the variables you
    think are causing the heteroskedasticity
  • Alternatively, use only the fitted values (ŷ) as a
    catch-all

25
An Alternative Test: Breusch/Pagan
  • 3. The test statistic is the RegSS from the 2nd
    regression divided by 2. It is distributed χ2
    with degrees of freedom equal to the number of
    independent variables in the 2nd regression.
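
A sketch of this recipe by hand in Stata, assuming we use only the
fitted values as the catch-all regressor (so 1 degree of freedom):

. regress turnout diplomau mdnincm
. predict e, residuals
. predict yhat, xb
. generate e2 = e^2
. summarize e2, meanonly
. generate g = e2/r(mean)                  // e_i^2 / sigma-hat^2
. regress g yhat                           // the 2nd regression
. display "chi2(1) = " e(mss)/2            // RegSS divided by 2
. display "p = " chi2tail(1, e(mss)/2)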

26
Stata Command: hettest

. hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of turnout

         chi2(1)      =     8.76
         Prob > chi2  =   0.0031

. hettest senate

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: senate

         chi2(1)      =     4.59
         Prob > chi2  =   0.0321

. hettest, rhs

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: diplomau mdnincm senate guber

         chi2(4)      =    11.33
         Prob > chi2  =   0.0231
27
What are you gonna do about it?
  • Two Remedies
  • We might need to try a different estimator. This
    will be the Generalized Least Squares
    estimator. This GLS Estimator can be applied
    to data with heteroskedasticity and serial
    correlation.
  • OLS is still consistent (just inefficient) and
    Standard Errors are wrong. We could fix the
    standard errors and stick with OLS.

28
Generalized Least Squares
  • When used to correct heteroskedasticity, we refer
    to GLS as Weighted Least Squares or WLS.
  • Intuition
  • Some data points have better quality information
    about the regression line than others because
    they have less error. We should give those
    observations more weight.

29
Non-Constant Variance
  • We want constant error variance for all
    observations,
  • E(e_i^2) = σ^2, estimated by the RMSE
  • However, with heteroskedasticity, the error variance
    (σ_i^2) is not constant
  • E(e_i^2) = σ_i^2, not constant (indexed by i)
  • If we know what σ_i^2 is, we can re-weight the
    equation to make the error variance constant

30
Re-weighting the regression

Begin with the formula:
    y_i = a + b x_i + e_i
Add x_0i, a variable that is always 1:
    y_i = a x_0i + b x_i + e_i
Divide through by σ_i to weight it:
    y_i/σ_i = a (x_0i/σ_i) + b (x_i/σ_i) + e_i/σ_i
We can simplify notation and show it's really
just a regression with transformed variables:
    y_i* = a x_0i* + b x_i* + e_i*
Last, we just need to show that the transformation
makes the variance of the new error term, e_i*,
constant:
    Var(e_i*) = Var(e_i)/σ_i^2 = σ_i^2/σ_i^2 = 1
31
GLS vs. OLS
  • In OLS, we minimize the sum of the squared
    errors:
    SSE = Σ (y_i - a - b x_i)^2
  • In GLS, we minimize a weighted sum of the
    squared errors:
    SSE_w = Σ w_i (y_i - a - b x_i)^2, where w_i = 1/σ_i^2

Set the partial derivatives to 0 and solve for a and b
to get the estimating equations.
32
GLS vs. OLS
  • Minimize errors (OLS): Σ (y_i - a - b x_i)^2
  • Minimize weighted errors (GLS): Σ w_i (y_i - a - b x_i)^2
  • GLS (WLS) is just doing OLS with transformed
    variables (see the sketch below).
  • In the same way that we transformed non-linear
    data to fit the assumptions of OLS, we
    can transform the data with weights to help
    heteroskedastic data meet the assumptions of OLS
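
A sketch showing the equivalence directly in Stata: transform the
variables by hand and run OLS with no constant. The variable sigma,
holding each σ_i, is hypothetical; in practice σ_i must be known or
estimated (see the next slides).

. generate ystar = turnout/sigma           // y_i / sigma_i
. generate x0star = 1/sigma                // the transformed constant
. generate x1star = diplomau/sigma         // x_i / sigma_i
. regress ystar x0star x1star, noconstant  // WLS as plain OLS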

33
GLS vs. OLS
  • In matrix form,
  • OLS: b = (X'X)^-1 X'y
  • GLS: b = (X'Ω^-1 X)^-1 X'Ω^-1 y
  • The weights are included in a matrix, Ω^-1

34
Problem
  • We rarely know exactly how to weight our data
  • Solutions
  • Plan A: If heteroskedasticity comes from one
    specific variable, we can use that variable as
    the weight
  • Plan B: Alternatively, we could run OLS and use the
    residuals to estimate the weights (observations
    with large OLS residuals get little weight in the
    WLS estimates)

35
Plan A: A Single, Known Villain
  • Example Household income
  • Households that earn little must spend it all on
    necessities. When income is low, there is little
    variance in spending.
  • Households that earn a great deal can either
    spend it all or buy just essentials and save the
    rest. More error variance as income increases
  • Note the changes in interpretation
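
A sketch of Plan A in Stata for this example, under the assumption
that the error variance grows with income squared, so each
observation is weighted by 1/income^2; the variable names are
hypothetical.

. generate w = 1/(income^2)                // assume Var(e_i) ~ income_i^2
. regress spending income [aweight=w]      // aweights are treated as inversely
                                           // proportional to each obs.'s variance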

36
Plan B: Estimate the weights
  • Run OLS to get an estimate of the residuals
  • Regress those residuals (squared) on the set of
    independent variables and get the predicted values
  • Use those predicted values as the weights
  • Because this is a GLS that is doable, it is
    called Feasible GLS or FGLS
  • FGLS is asymptotically equivalent to GLS as the
    sample size goes to infinity
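
A sketch of this FGLS recipe with the running example's variables.
One caveat beyond the slide: predicted values of e^2 can come out
negative or near zero, which is why many texts model the log of e^2
instead.

. regress turnout diplomau mdnincm
. predict e, residuals
. generate e2 = e^2
. regress e2 diplomau mdnincm              // model the error variance
. predict e2hat, xb                        // predicted variances
. generate w = 1/e2hat if e2hat > 0        // guard against nonpositive predictions
. regress turnout diplomau mdnincm [aweight=w]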

37
I don't want to do GLS
  • I don't blame you
  • GLS is usually best if we know something about the
    nature of the heteroskedasticity
  • OLS was unbiased, so why can't we just use that?
  • It is inefficient (but only problematic with very
    severe heteroskedasticity)
  • Incorrect standard errors (the formula changes)
  • What if we could just fix the standard errors?

38
White Standard Errors
  • We can use OLS and just fix the Standard Errors.
    There are a number of ways to do this, but the
    classic is White Standard Errors
  • It goes by a number of names
  • White Std. Errs.
  • Huber-White Std. Errs.
  • Robust Std. Errs.
  • Heteroskedasticity-Consistent Std. Errs.

39
The big idea
  • In OLS, Standard Errors come from the
    Variance-Covariance Matrix.
  • The Std. Err. is the std. dev. of a sampling
    distribution
  • Variance is the square of the standard deviation
    (the std. dev. is the square root of the variance)
  • The Variance-Covariance matrix for OLS is given by
    σ̂_e^2 (X'X)^-1

. vce

Variances:
             |  diplomau   mdnincm     _cons
-------------+------------------------------
    diplomau |    254467
     mdnincm |  -178.899   .187128
       _cons |   1.4e+06  -3172.43   9.3e+07
40
With Heteroskedasticity
  • The Variance-Covariance matrix for OLS is given by
    σ̂_e^2 (X'X)^-1
  • The Variance-Covariance matrix under
    heteroskedasticity is given by
    (X'X)^-1 (X'ΩX) (X'X)^-1
  • Problem: We still don't know the σ_i^2 that make up Ω
  • Solution: We can estimate (X'ΩX) quite well
    using OLS residuals:
    X'ΩX ≈ Σ_i e_i^2 x_i'x_i, where x_i is the row of X for obs. i
41
In Stata
  • Specify the robust option after regression

. regress turnout diplomau mdnincm, robust

Regression with robust standard errors            Number of obs =     426
                                                  F(  2,   423) =   33.93
                                                  Prob > F      =  0.0000
                                                  R-squared     =  0.1291
                                                  Root MSE      =   47766

------------------------------------------------------------------------------
             |               Robust
     turnout |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    diplomau |   1101.359   548.7361     2.01   0.045     22.77008    2179.948
     mdnincm |   1.111589   .4638605     2.40   0.017       .19983    2.023347
       _cons |   154154.4   9903.283    15.57   0.000     134688.6    173620.1
------------------------------------------------------------------------------
42
Drawbacks
  • OLS is still inefficient (though this is not much
    of a problem unless heteroskedasticity is really
    bad)
  • Requires larger sample sizes to give good
    estimates of Std. Errs. (which means t tests are
    only OK asymptotically)
  • If there is no heteroskedasticity and you use
    robust SEs, you do slightly worse than regular
    Std. Errs.

43
Moral of the Story
  • If you know something about the nature of the
    heteroskedasticity, WLS is good: it is BLUE
  • If you don't, use OLS with robust Std. Errs.
  • Now, Group heteroskedasticity

44
Group Heteroskedasticity
  • No GLS/WLS option
  • There is a robust Std. Err. option
  • It essentially stacks clusters into their own kind
    of mini-White correction (see the sketch below)
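
A sketch of that option in Stata, using the presidential-popularity
example from earlier; the variable names (popularity, econ,
president) are hypothetical.

. * robust std. errs. that allow correlation within each president's years
. regress popularity econ, vce(cluster president)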