The Classical Linear Regression Model and Hypothesis Testing

Transcript and Presenter's Notes
1
The Classical Linear Regression Model and
Hypothesis Testing
2
The Assumptions of the Classical LRM
  • The OLS estimators of the model coefficients have
    some nice properties under certain assumptions
  • These assumptions constitute what is known as the
    classical Linear Regression Model (LRM)
  • We can show that, if these assumptions hold, then
    the OLS estimator is the Best, Linear, Unbiased
    Estimator (BLUE)
  • If one or more of these assumptions do not hold,
    then we must compare OLS estimation with an
    alternative estimator and examine the pros and
    cons of each approach

3
The Assumptions of the Classical LRM
  • The assumptions of the classical LRM are:
  • The regression model is linear in the
    coefficients, has an additive error term and is
    correctly specified
  • The error term has a mean of zero
  • All explanatory variables are uncorrelated with
    the error term
  • Observations of the error term are uncorrelated
    with each other
  • The error term has a constant variance
  • No explanatory variable is a perfect linear
    function of any other explanatory variable(s)
  • Additionally, we can assume that the error term
    follows a normal distribution

4
What Do These Assumptions Mean?
  • The first assumption says that our model has to
    be linear in the coefficients
  • The regression model does not have to be linear
    in the variables, meaning that OLS can also be
    applied to models that are nonlinear in the
    variables
  • Example: An equation where the variables are in
    logs can be estimated by OLS
  • ln(Yi) = β0 + β1 ln(Xi) + εi
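As a quick illustration, here is a minimal sketch (Python with numpy and statsmodels, using simulated data in place of real observations; the variable names and true elasticity are invented) of estimating such a log-log model by OLS:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.uniform(1.0, 100.0, size=200)                      # hypothetical regressor
Y = 3.0 * X**0.5 * np.exp(rng.normal(0.0, 0.1, size=200))  # true elasticity is 0.5

exog = sm.add_constant(np.log(X))     # intercept column plus ln(X)
res = sm.OLS(np.log(Y), exog).fit()   # OLS on the logged variables
print(res.params)                     # [beta0, beta1]; beta1 should be near 0.5
```

The model is nonlinear in X and Y but linear in the coefficients, which is all the first assumption requires.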

5
What Do These Assumptions Mean?
  • The second assumption says that, on average, we
    expect the impact of all left-out factors in our
    model to be zero
  • The third assumption says that the observed
    values of the explanatory variables are not
    related to the values of the error term
  • If there were a relationship, then OLS would
    likely attribute some of the variation in Y to X
    even though it actually came from the error term

6
What Do These Assumptions Mean?
  • If the fourth assumption does not hold, then it
    is difficult to get precise estimates with OLS
  • This phenomenon is common in regression analysis
    with time series data and is known as serial
    correlation or autocorrelation
  • It is commonly observed that a random shock in
    one period will have a lasting effect for several
    periods
  • For example, there have historically been
    extended periods of above-average returns
    (1982-99) and periods of dreadful returns
    (1966-81)

7
What Do These Assumptions Mean?
  • Example: Suppose we want to estimate Gillette's
    beta using the CAPM
  • We collect data on monthly returns for Gillette's
    stock and the NYSE Composite index for 120 months
  • We estimate the following model
  • RGt = β0 + β1 Rmt + εt
  • A random shock that affects the error in period t
    (e.g., the burst of a speculative bubble) will
    have a lasting impact and affect the error in
    period t+1, as well
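A hedged sketch of this CAPM regression, with simulated return series standing in for Gillette and the NYSE Composite (the coefficients below are made up); the Durbin-Watson statistic is one common check for the first-order autocorrelation just described:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(1)
r_market = rng.normal(0.008, 0.04, size=120)        # 120 simulated monthly market returns
r_stock = 0.002 + 0.9 * r_market + rng.normal(0.0, 0.02, size=120)

res = sm.OLS(r_stock, sm.add_constant(r_market)).fit()
print(res.params[1])             # estimated beta, near 0.9 in this simulation
print(durbin_watson(res.resid))  # values near 2 suggest no first-order autocorrelation
```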

8
What Do These Assumptions Mean?
  • The fifth assumption says that the variance of
    the errors in our model does not change for each
    observation or range of observations in our
    sample
  • This assumption frequently breaks down in
    cross-section data and then we face the problem
    of heteroscedasticity (OLS method not best)
  • Example: We estimate a multiple regression model
    of DPS with a cross-section sample of 100 firms
  • DPS = β0 + β1 EPS + β2 AGE + εi

9
What Do These Assumptions Mean?
  • It may be the case that the variation in DPS is
    not the same for small and large firms (defined
    in terms of asset size)
  • Other factors, besides EPS and AGE, captured by
    the error term may affect the DPS of larger firms
    differently from that of smaller firms
  • For example, larger firms' shareholders may
    dislike volatility and prefer to receive a target
    level of DPS, while smaller firms' shareholders
    may be more willing to accept a volatile pattern
    of DPS
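One standard way to check the constant-variance assumption in such a model is a Breusch-Pagan test. Below is a sketch with simulated EPS, AGE, and DPS values, constructed so that the error variance grows with firm earnings (all numbers are invented for illustration):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(2)
EPS = rng.uniform(1.0, 10.0, size=100)
AGE = rng.uniform(1.0, 50.0, size=100)
# error variance grows with EPS, so heteroscedasticity holds by construction
DPS = 0.5 + 0.3 * EPS + 0.01 * AGE + rng.normal(0.0, 0.1 * EPS)

exog = sm.add_constant(np.column_stack([EPS, AGE]))
res = sm.OLS(DPS, exog).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(res.resid, exog)
print(lm_pvalue)   # a small p-value rejects the constant-variance null
```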

10
What Do These Assumptions Mean?
  • If the sixth assumption does not hold in a
    multiple regression model, then we face the
    problem of multicollinearity
  • In this case, two or more of the explanatory
    variables are related (there exists some
    correlation between them)
  • A movement in one explanatory variable is matched
    by a relative movement in another, so that
  • The OLS procedure produces unstable estimates
  • The OLS estimates are difficult to interpret

11
What Do These Assumptions Mean?
  • Example: Let's return to the example of DPS and
    suppose that we add the firm's interest expense as
    a third explanatory variable
  • DPS = β0 + β1 EPS + β2 AGE + β3 INT + εi
  • Since higher interest expense implies lower
    earnings, the two variables EPS and INT are
    correlated
  • Thus, β1 does not show the impact on DPS of a
    one-dollar change in EPS holding all other
    variables constant
  • The reason is that a higher EPS may itself be due
    to lower interest expenses
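Variance inflation factors (VIFs) are a standard way to quantify this kind of multicollinearity. The sketch below simulates negatively correlated EPS and INT series to mirror the example (the strength of the correlation is an assumption made for illustration):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(3)
INT = rng.uniform(0.0, 5.0, size=100)
EPS = 8.0 - 1.2 * INT + rng.normal(0.0, 0.5, size=100)  # higher INT -> lower EPS
AGE = rng.uniform(1.0, 50.0, size=100)

exog = sm.add_constant(np.column_stack([EPS, AGE, INT]))
for i, name in enumerate(["const", "EPS", "AGE", "INT"]):
    # a VIF well above ~10 is a common rule-of-thumb warning sign
    print(name, variance_inflation_factor(exog, i))
```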

12
The Properties of OLS Estimators
  • We want to see how close the OLS estimators of
    the coefficients of a model come to the
    coefficients of the true model
  • If the assumptions of the classical LRM hold,
    then the OLS estimators are the Best, Linear,
    Unbiased Estimators (BLUE)
  • This means that
  • The OLS estimates are centered around the true
    values of the coefficients (unbiased estimates)
  • The distribution of OLS estimates has the lowest
    variance among all linear unbiased estimators
  • The OLS estimates are normally distributed (given
    the added assumption of normally distributed
    errors)
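The "unbiased" part of BLUE can be illustrated with a small Monte Carlo sketch: when the classical assumptions hold, the average of many OLS slope estimates sits close to the true slope. The data-generating process and true coefficients below are invented for the simulation:

```python
import numpy as np

rng = np.random.default_rng(4)
true_beta = 2.0
estimates = []
for _ in range(1000):
    x = rng.uniform(0.0, 10.0, size=50)
    y = 1.0 + true_beta * x + rng.normal(0.0, 1.0, size=50)  # classical errors
    X = np.column_stack([np.ones(50), x])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)         # OLS via least squares
    estimates.append(beta_hat[1])
print(np.mean(estimates))   # close to 2.0, the true slope
```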

13
Testing Hypotheses About the Model's Coefficients
  • A major use of regression analysis is that it
    allows us to empirically test hypotheses about
    relationships among financial variables
  • For example, we may want to test the argument
    that the recent consolidation in US banking has
    resulted in a lower supply of credit to small
    businesses
  • Drawing a sample, estimating a model, and testing
    our hypothesis empirically does not necessarily
    allow us to prove that our theory is correct

14
Testing Hypotheses About the Model's Coefficients
  • Often, the most we can do is reject the null
    hypothesis with a certain degree of confidence
  • Before estimating a model, we need to specify our
    testable hypothesis in the form of a null and an
    alternative hypothesis
  • The null hypothesis (H0) is a statement of the
    range of values of the estimated coefficient that
    we would expect to occur if our theory were not
    true
  • The alternative hypothesis (H1) specifies the
    range of values of the coefficient that we would
    expect to occur if our theory were true

15
Testing Hypotheses About the Model's Coefficients
  • Example: Suppose we believe that higher bank
    consolidation will lead to less small business
    lending
  • We estimate the model SBL = β0 + β1(Bank
    Consolidation) + error
  • Null Hypothesis: β1 ≥ 0
  • Alternative Hypothesis: β1 < 0
  • This is an example of a one-sided hypothesis test
    because the alternative hypothesis is on only one
    side of the null hypothesis
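A sketch of this one-sided test, with simulated stand-ins for the consolidation and small-business-lending data (the variable names and coefficients are placeholders, not the presenter's data):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(5)
consolidation = rng.uniform(0.0, 1.0, size=200)
sbl = 5.0 - 2.0 * consolidation + rng.normal(0.0, 1.0, size=200)

res = sm.OLS(sbl, sm.add_constant(consolidation)).fit()
t_stat = res.tvalues[1]
p_left = stats.t.cdf(t_stat, res.df_resid)  # left-tail p-value for H1: beta1 < 0
print(t_stat, p_left)                       # reject H0: beta1 >= 0 if p_left < 0.05
```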

16
Testing Hypotheses About the Model's Coefficients
  • Another way to test a hypothesis is through a
    two-sided test
  • Null Hypothesis: β1 = 0
  • Alternative Hypothesis: β1 ≠ 0
  • In our example, if we did not have a prior theory
    about the impact of bank consolidation on small
    business lending, we could test the
  • Null Hypothesis: the impact of bank consolidation
    on SBL is not significantly different from zero
  • Alternative Hypothesis: the impact is
    significantly different from zero

17
The t-Test: Testing the Significance of
Individual Regression Coefficients
  • Testing the significance of individual regression
    coefficients is equivalent to testing the
    significance of including a particular
    explanatory variable in our model
  • We know the following result
  • The t-statistic for the kth coefficient, given by
  • tk = (β̂k − βk,0) / SE(β̂k)
  • where βk,0 is the value of βk under the null
    hypothesis, follows the t distribution with
    n-k-1 degrees of freedom

18
The t-Test: Testing the Significance of
Individual Regression Coefficients
  • SE(β̂k) is the standard error of the estimated
    coefficient
  • This is nothing other than the standard deviation
    of the sampling distribution of the different
    coefficient estimates
  • In other words, it shows whether the various
    estimated coefficients (from various samples)
    vary a little or a lot
  • Since we usually want to test whether a
    coefficient is significantly different from zero,
    the t-statistic can be stated as
  • tk = β̂k / SE(β̂k)
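This ratio is exactly what regression software reports. A small sketch with simulated data confirms that the coefficient estimate divided by its standard error reproduces the t-value statsmodels prints:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
x = rng.normal(size=100)
y = 0.5 * x + rng.normal(size=100)

res = sm.OLS(y, sm.add_constant(x)).fit()
t_manual = res.params[1] / res.bse[1]  # beta-hat divided by its standard error
print(t_manual, res.tvalues[1])        # the two values match
```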

19
The t-Test Decision Rule
  • To decide whether or not to reject the null
    hypothesis, we must compare the calculated
    t-statistic with a critical t-value
  • The critical t-value is based on our choice of
    level of significance
  • The level of significance shows the probability
    that we will make a Type I error, meaning that we
    will reject a true null hypothesis
  • Example: A 5% significance level implies that we
    will reject a true null hypothesis only 5% of the
    time

20
The t-Test Decision Rule
  • The critical t-value is given in tables of the t
    distribution
  • Decision rule: Reject the null hypothesis if
    |tk| > the critical t-value
  • A typical choice of significance level in
    empirical work is 5%
  • With a large enough sample (n > 120), the
    critical t-value for a one-sided test at the 5%
    level is 1.645, and for a two-sided test it is
    1.96
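Instead of a printed table, the critical values can be looked up with scipy (a sketch; the degrees of freedom here are an arbitrary large-sample choice):

```python
from scipy import stats

df = 500                       # a large sample, n > 120
print(stats.t.ppf(0.95, df))   # one-sided 5% critical value, about 1.645
print(stats.t.ppf(0.975, df))  # two-sided 5% critical value, about 1.96
```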