QM1 Week 8 FTest in Multiple Linear Regressions OLS Assumptions - PowerPoint PPT Presentation

1
QM1 Week 8: F-Test in Multiple Linear
Regressions; OLS Assumptions
  • Dr Alexander Moradi
  • University of Oxford, Dept. of Economics
  • Email: alexander.moradi@economics.ox.ac.uk

2
9.1 F-Test
  • Each t-statistic indicates the statistical
    significance of one regressor
  • What if we want to test whether a group of
    variables has an effect on the dependent
    variable?
  • F-Test: multiple linear restrictions
  • Example
  • y = a + b1x1 + b2x2 + b3x3 + b4x4 + e
  • H0: b1 = b2 = b3 = 0
  • In words: explanatory variables x1, x2 and x3 do
    not jointly influence y
  • H1: H0 is not true
  • In contrast, t-statistics refer to a test for
    each single coefficient
  • H0: b1 = 0  H1: b1 ≠ 0
  • H0: b2 = 0  H1: b2 ≠ 0
  • H0: b3 = 0  H1: b3 ≠ 0

3
9.1 F-Test
  • What is the effect of imposing the restrictions
    (b1 = b2 = b3 = 0)?
  • Two regressions: (1) without and (2) with the
    restrictions in H0
  • F(m, n−k) = [(RSSR − RSSU)/m] / [RSSU/(n−k)]
  • n = number of observations
  • k = number of estimated coefficients in the
    unrestricted model
  • m = number of restrictions
  • RSSU = residual sum of squares in the unrestricted
    model
  • RSSR = residual sum of squares in the restricted
    model
  • Example
  • (1) Unrestricted model: y = a + b1x1 + b2x2 + b3x3 + b4x4 + e
  • H0: b1 = b2 = b3 = 0
  • (2) Restricted model: y = d + b5x4 + u
  • ⇒ k = 4, m = 3, n = 64, RSSU = Σe² from (1), RSSR = Σu²
    from (2)
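The F statistic compares the fit of the two regressions: F(m, n−k) = [(RSSR − RSSU)/m] / [RSSU/(n−k)]. A minimal numpy sketch with simulated data (the data-generating process is invented for illustration; the degrees-of-freedom counting follows the slide's example, which uses k = 4 and n − k = 60):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64                                        # observations, as in the example
x = rng.normal(size=(n, 4))                   # regressors x1..x4
y = 1.0 + 0.5 * x[:, 3] + rng.normal(size=n)  # simulated data: only x4 matters

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X plus a constant."""
    Xc = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    e = y - Xc @ beta
    return e @ e

RSS_U = rss(x, y)              # (1) unrestricted: y on x1..x4
RSS_R = rss(x[:, [3]], y)      # (2) restricted: y on x4 only (b1 = b2 = b3 = 0)
m = 3                          # number of restrictions
df = 60                        # n - k, following the slide's counting (k = 4)
F = ((RSS_R - RSS_U) / m) / (RSS_U / df)
print(F)                       # compare with Fcrit = 2.76 from an F(3, 60) table
```

Because the restricted model is nested in the unrestricted one, RSS_R ≥ RSS_U always holds, so F is never negative; a large F means the restrictions cost a lot of fit.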

4
9.1 F-Test
  • If F(m, n-k)gt Fcrit, reject H0
  • ? The restrictions can be rejected
  • ? The variables do significantly explain the
    variation in the dependent variable and
    therefore, the variables must not be excluded
    from the regression model
  • Fcrit depends on m, n, k. The exact value can be
    found in F distribution tables (Appendix in
    almost any statistical textbook)
  • Example Fcrit2.76 for m3, n-k60
  • STATA reports the p-value

5
9.1 F-Test
  • Hint for joint significance when excluding
    variables with low t-values, R²-adj. decreases
    considerably ? F Test should be carried out
  • F Test is used for all kinds of joint hypotheses,
    e.g. whether there is a structural break,
    b1b2b30, etc.
  • Intuition If we impose restrictions on
    parameters in a regression model, will the
    residual sum of squares significantly increase
    (and the goodness of fit decrease)?

6
OLS Assumptions
7
9.2 OLS Assumptions
  • 1. e is normally distributed
  • 2. E(e) = 0 (no systematic influence of the
    error term on y)
  • 3. var(e) = constant (homoscedasticity)
  • 4. cov(ei, ej) = 0 (residuals do not
    correlate)
  • 5. cov(xi, ei) = 0 (error term and the exogenous
    variables do not correlate)
  • If one or more of the assumptions are violated,
    OLS will lead to inconsistent estimates and
    confidence intervals

8
9.3 Problems of Model Specification
  • Fundamental requirement: the relationship between
    the dependent variable and the explanatory
    variables is correctly modelled
  • What is the underlying model that the data
    follow?
  • What is the functional form? Is the relationship
    linear?
  • Is the list of explanatory variables complete?
  • Are there structural breaks (are the parameters
    stable)?

9
9.4 Homoskedasticity
Regression model: y = a + bx + e (5 observations
were drawn)

[Figure: regression line y = a + bx through
observations x1, ..., x5; the residuals have the
same spread everywhere]

Homoscedasticity means that the variance of the
error term is equal across all observations:
var(e) = constant
10
9.4 Heteroscedasticity

[Figure: regression line y = a + bx; the residuals
follow a horn-shaped pattern, spreading out as x
grows]

Heteroscedasticity: the error term is normally
distributed with mean 0, but the variance is no
longer constant; the variance of the error term
differs across observations
11
9.4 Heteroscedasticity
  • Example: WAGE = a + b·AGE + e

12
9.4 Consequences of Heteroscedasticity
  • Note: random differences in the size of the
    residuals across observations do not constitute
    heteroscedasticity
  • ⇒ The error term must show a clear and systematic
    (statistically significant) pattern of distortion
  • Consequences
  • OLS regression coefficients are unbiased
  • Effect on the variance of residuals and standard
    errors ⇒ t-tests, F-tests, and confidence
    intervals are inconsistent and should not be
    interpreted
  • ⇒ Test for heteroscedasticity before testing
    for statistical significance

13
9.4 Breusch-Pagan Test
  • Is there a pattern in the variation of the
    residuals?
  • Breusch-Pagan test for heteroscedasticity: if
    var(e) = constant, there should be no significant
    correlation of the squared residuals with the
    independent variables
  • Example: three explanatory variables,
    y = a + b1x1 + b2x2 + b3x3 + e
  • Auxiliary regression: e² = c + d1x1 + d2x2 + d3x3
    + error term
  • Test: do the regression coefficients (except for
    the constant c) jointly differ from 0 (H0:
    d1 = d2 = d3 = 0)?
  • H0: Constant variance/homoscedasticity
  • H1: Heteroscedasticity
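The test can be carried out by hand as a Lagrange-multiplier test: regress the squared OLS residuals on the regressors and compute n·R² of that auxiliary regression, which under H0 follows a χ² distribution with 3 degrees of freedom here. A numpy sketch on simulated data (the data-generating process, with error spread growing in x1, is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
X = rng.uniform(1.0, 5.0, size=(n, 3))              # regressors x1..x3
sd = 0.5 * X[:, 0]                                  # error spread grows with x1
y = 1.0 + X @ np.array([1.0, -0.5, 0.3]) + sd * rng.normal(size=n)

def ols_residuals(X, y):
    """Residuals from an OLS fit of y on X plus a constant."""
    Xc = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    return y - Xc @ beta

e2 = ols_residuals(X, y) ** 2          # squared residuals of the main regression
u = ols_residuals(X, e2)               # auxiliary regression: e^2 on x1..x3
R2 = 1.0 - (u @ u) / ((e2 - e2.mean()) @ (e2 - e2.mean()))
LM = n * R2                            # Breusch-Pagan LM statistic, chi2(3) under H0
print(LM)                              # reject H0 at 5% if LM > 7.81 (chi2, 3 df)
```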

14
  • WAGE = a + b·AGE + e
  • The variance of the error term is not constant
  • Residuals follow a horn-shaped pattern
  • Squared residuals: residuals are mirrored at the
    regression line
  • Squared residuals increase with the values of the
    explanatory variable

15
9.4 Causes and Remedies
  • 1. Differences in the scale of variables
  • Remedy: transformation of the dependent variable/
    explanatory variables, e.g. log, square root,
    square, etc.
  • 2. Omitted variables: factors that gain in
    importance as the values of the dependent
    variable vary
  • Remedy: include the omitted explanatory
    variables
  • 3. True heteroscedasticity
  • Remedies
  • Weighted Least Squares: weight the observations
    with a factor that removes the heteroscedasticity,
    i.e. 1/var(ei)
  • Heteroscedasticity-robust standard errors
    (Huber/White/sandwich estimate of the variance)
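Both remedies can be sketched with plain numpy. In this toy setup the true error standard deviation is proportional to x, so the WLS transformation (divide each row by x) is known exactly; in practice it would have to be estimated. All numbers are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
x = rng.uniform(1.0, 5.0, size=n)
y = 2.0 + 0.5 * x + x * rng.normal(size=n)   # sd of the error grows with x
X = np.column_stack([np.ones(n), x])

# Remedy A: OLS with heteroscedasticity-robust (White/sandwich) standard errors
beta = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta
XtX_inv = np.linalg.inv(X.T @ X)
meat = X.T @ (e[:, None] ** 2 * X)           # X' diag(e^2) X
robust_se = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

# Remedy B: Weighted Least Squares, down-weighting the noisy observations
# (sd(e_i) is proportional to x_i here, so divide each row by x_i)
w = 1.0 / x
Xw, yw = w[:, None] * X, w * y
beta_wls = np.linalg.solve(Xw.T @ Xw, Xw.T @ yw)
print(beta, robust_se, beta_wls)
```

Remedy A keeps the OLS coefficients and only repairs the standard errors; Remedy B changes the estimator itself and is more efficient when the weights are correct.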

16
9.5 Model Misspecification
  • Including irrelevant variables in a regression
    model (that is, variables that have no partial
    effect on y in the population ⇒ population
    coefficient is 0)
  • Regression coefficients are unbiased
  • Larger standard errors of the regression
    coefficients
  • Omitting relevant variables: the true underlying
    model contains determinants that we omit in
    our regression model
  • Regression coefficients are biased
  • Test statistics (t-statistics) are biased and
    invalid
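The first claim, that an irrelevant regressor leaves the coefficient of interest unbiased but inflates its standard error when the two regressors are correlated, can be checked by simulation; all numbers below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n_reps, n = 500, 50
slim, full = [], []
for _ in range(n_reps):
    x1 = rng.normal(size=n)
    x2 = 0.8 * x1 + rng.normal(size=n)        # irrelevant but correlated with x1
    y = 1.0 + 0.5 * x1 + rng.normal(size=n)   # true model: x2 has no effect
    X1 = np.column_stack([np.ones(n), x1])    # correct specification
    X2 = np.column_stack([np.ones(n), x1, x2])  # overspecified model
    slim.append(np.linalg.lstsq(X1, y, rcond=None)[0][1])
    full.append(np.linalg.lstsq(X2, y, rcond=None)[0][1])
slim, full = np.array(slim), np.array(full)

print(slim.mean(), full.mean())   # both hover around the true slope 0.5
print(slim.std(), full.std())     # the overspecified model has the larger spread
```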

17
9.5 One Source of Endogeneity Omitted Variable
Bias
  • Other sources of endogeneity cov(xi,et)?0
  • Reverse causality
  • Simultaneity
  • If we know the true model, we can predict the
    size and direction of the OVB
  • Example
  • True model yab1x1b2x2e1
  • Estimated model ycdx1e2
  • If cov(x1, x2)?0, then cov(x1, e2)?0

18
9.5 Omitted Variable Bias
True model: y = a + b1x1 + b2x2 + e1
Estimated model: y = c + d·x1 + e2

[Figure: path diagram – x1 affects y directly (b1)
and indirectly via x2 (b3, then b2); the estimated
d picks up both channels]

  • ⇒ Consequence: biased estimate of the influence
    of x1
  • Here, with the auxiliary regression
    x2 = g + b3x1 + u: d = b1 + b2b3 (d = estimated
    impact of x1 on y)
  • ⇒ d ≠ b1
  • Overestimation of the influence of edu, if
    b2b3 > 0
  • Underestimation of the influence of edu, if
    b2b3 < 0
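The bias formula d = b1 + b2·b3, where b3 is the slope of the auxiliary regression of the omitted x2 on the included x1, can be verified by simulation (all parameter values are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
b1, b2, b3 = 1.0, 2.0, 0.6
x1 = rng.normal(size=n)
x2 = 0.5 + b3 * x1 + rng.normal(size=n)   # omitted variable, correlated with x1
y = 1.0 + b1 * x1 + b2 * x2 + rng.normal(size=n)

# Misspecified model omits x2: y = c + d*x1 + e2
X = np.column_stack([np.ones(n), x1])
c, d = np.linalg.lstsq(X, y, rcond=None)[0]
print(d)   # close to b1 + b2*b3 = 1.0 + 2.0*0.6 = 2.2, not to b1 = 1.0
```

With b2·b3 > 0 the influence of x1 is overestimated, exactly as the slide states.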

19
9.5 Omitted Variable Bias
  • Regression coefficients are unbiased if
  • the omitted variable is irrelevant (it does not
    appear in the true model ⇒ b2 = 0), or
  • the omitted variable does not correlate with the
    explanatory variables included in the model
    (⇒ b3 = 0)
  • The more omitted variables, and the less clear
    the correlations between included and omitted
    variables, the less clear are the size and
    direction of the bias

20
9.1 Exercise: F-Test
  • Data set: weimar_election.dta
  • Run a regression of Nazi votes on unemployment
    rate, share of workers, Catholics, farmers, voter
    participation, and dummy variables for each
    general election
  • Are the explanatory variables jointly
    significant?
  • Is the influence of unemployment on Nazi votes
    constant over time? Test for parameter stability
    in the unemployment rate over the four elections.
    Hint: use interaction terms election
    dummy × unemployment
  • p_nsdap = F(unemp, workers, cath, farmers,
    votpart, d3207, d3211, d3303, d3207×unemp,
    d3211×unemp, d3303×unemp)
  • Is the model in (3) correctly specified? What
    about changing influences of the other
    explanatory variables? Run a regression for each
    election
  • Do the parameters vary significantly over the
    four elections?
  • Repeat (3) with the last three elections (t =
    3207, 3211, 3303). Test whether the influence of
    unemployment varied significantly in the last
    three elections

21
9 Exercise: Heteroscedasticity
  • Dataset: india.dta
  • Estimate the model WI = a + b1AGE + b2EDU +
    b3FEMALE + b4EDU×FEMALE + e. Interpret the
    results
  • Test for heteroscedasticity
  • Plot a scatterplot with the residuals from the
    model in (1) on the vertical axis and AGE on the
    horizontal axis
  • Would you expect the variance in the residuals to
    depend mainly on AGE? What about the dummy
    variables EDU and FEMALE?
  • What is the likely cause of the
    heteroscedasticity?
  • Use the ladder of powers to arrive at a suitable
    transformation of the wage variable
  • Use the log of wage as the dependent variable.
    Test for heteroscedasticity
  • Estimate the model in (1) using robust standard
    errors. Interpret the results. Compare the
    results with (1) and (6). Which model
    specification would you prefer?

22
9 STATA commands
23
9 Homework Exercises Week 8
  • Read chapter 9.3 of FT (p. 268-272)
  • Do the following exercises from FT (p. 278): 3
  • Read chapter 11 of FT (p. 300-311, 316-325)
  • Do the following exercises from FT (p. 325-329):
    1, 6, 8