The Classical Linear Regression Model and Hypothesis Testing

Transcript and Presenter's Notes
1
The Classical Linear Regression Model and
Hypothesis Testing
2
The Assumptions of the Classical LRM
  • The OLS estimators of the model coefficients have
    some nice properties under certain assumptions
  • These assumptions constitute what is known as the
    classical Linear Regression Model (LRM)
  • We can show that, if these assumptions hold, then
    the OLS estimator is the Best, Linear, Unbiased
    Estimator (BLUE)
  • If one or more of these assumptions do not hold,
    then we must compare OLS estimation with an
    alternative estimator and examine the pros and
    cons of each approach

3
The Assumptions of the Classical LRM
  • The assumptions of the classical LRM are:
  • The regression model is linear in the
    coefficients, has an additive error term and is
    correctly specified
  • The error term has a mean of zero
  • All explanatory variables are uncorrelated with
    the error term
  • Observations of the error term are uncorrelated
    with each other
  • The error term has a constant variance
  • No explanatory variable is a perfect linear
    function of any other explanatory variable(s)
  • Additionally, we can assume that the error term
    follows a normal distribution

4
What Do These Assumptions Mean?
  • The first assumption says that our model has to
    be linear in the coefficients
  • The regression model does not have to be linear
    in the variables, meaning that OLS can also be
    applied to models that are nonlinear in the
    variables
  • Example: An equation where the variables are in
    logs can be estimated by OLS
  • ln(Yi) = β0 + β1 ln(Xi) + εi
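As a quick illustration, here is a minimal sketch (Python with numpy and statsmodels, using simulated data in place of real observations; the variable names and true elasticity are invented) of estimating such a log-log model by OLS:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.uniform(1.0, 100.0, size=200)                      # hypothetical regressor
Y = 3.0 * X**0.5 * np.exp(rng.normal(0.0, 0.1, size=200))  # true elasticity is 0.5

exog = sm.add_constant(np.log(X))     # intercept column plus ln(X)
res = sm.OLS(np.log(Y), exog).fit()   # OLS on the logged variables
print(res.params)                     # [beta0, beta1]; beta1 should be near 0.5
```

The model is nonlinear in X and Y but linear in the coefficients, which is all the first assumption requires.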

5
What Do These Assumptions Mean?
  • The second assumption says that, on average, we
    expect the impact of all left-out factors in our
    model to be zero
  • The third assumption says that the observed
    values of the explanatory variables are not
    related to the values of the error term
  • If there were a relationship, then OLS would
    likely attribute some of the variation in Y to X
    even though it actually came from the error term

6
What Do These Assumptions Mean?
  • If the fourth assumption does not hold, then it
    is difficult to get precise estimates with OLS
  • This phenomenon is common in regression analysis
    with time series data and is known as serial
    correlation or autocorrelation
  • It is commonly observed that a random shock in
    one period will have a lasting effect for several
    periods
  • For example, there have historically been
    extended periods of above-average returns
    (1982-99) and periods of dreadful returns
    (1966-81)

7
What Do These Assumptions Mean?
  • Example: Suppose we want to estimate Gillette's
    beta using the CAPM
  • We collect data on monthly returns for Gillette's
    stock and the NYSE Composite index for 120 months
  • We estimate the following model
  • RGt = β0 + β1 Rmt + εt
  • A random shock that affects the error in period t
    (e.g., the burst of a speculative bubble) will
    have a lasting impact and affect the error in
    period t+1, as well
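A hedged sketch of this CAPM regression, with simulated return series standing in for Gillette and the NYSE Composite (the coefficients below are made up); the Durbin-Watson statistic is one common check for the first-order autocorrelation just described:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(1)
r_market = rng.normal(0.008, 0.04, size=120)        # 120 simulated monthly market returns
r_stock = 0.002 + 0.9 * r_market + rng.normal(0.0, 0.02, size=120)

res = sm.OLS(r_stock, sm.add_constant(r_market)).fit()
print(res.params[1])             # estimated beta, near 0.9 in this simulation
print(durbin_watson(res.resid))  # values near 2 suggest no first-order autocorrelation
```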

8
What Do These Assumptions Mean?
  • The fifth assumption says that the variance of
    the errors in our model does not change for each
    observation or range of observations in our
    sample
  • This assumption frequently breaks down in
    cross-section data and then we face the problem
    of heteroscedasticity (OLS method not best)
  • Example: We estimate a multiple regression model
    of DPS with a cross-section sample of 100 firms
  • DPS = β0 + β1 EPS + β2 AGE + εi

9
What Do These Assumptions Mean?
  • It may be the case that the variation in DPS is
    not the same for small and large firms (defined
    in terms of asset size)
  • Other factors, besides EPS and AGE, captured by
    the error term may affect the DPS of larger firms
    differently from that of smaller firms
  • For example, larger firms' shareholders may
    dislike volatility and prefer to receive a target
    level of DPS, while smaller firms' shareholders
    may be more willing to accept a volatile pattern
    of DPS
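One standard way to check the constant-variance assumption in such a model is a Breusch-Pagan test. Below is a sketch with simulated EPS, AGE, and DPS values, constructed so that the error variance grows with firm earnings (all numbers are invented for illustration):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(2)
EPS = rng.uniform(1.0, 10.0, size=100)
AGE = rng.uniform(1.0, 50.0, size=100)
# error variance grows with EPS, so heteroscedasticity holds by construction
DPS = 0.5 + 0.3 * EPS + 0.01 * AGE + rng.normal(0.0, 0.1 * EPS)

exog = sm.add_constant(np.column_stack([EPS, AGE]))
res = sm.OLS(DPS, exog).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(res.resid, exog)
print(lm_pvalue)   # a small p-value rejects the constant-variance null
```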

10
What Do These Assumptions Mean?
  • If the sixth assumption does not hold in a
    multiple regression model, then we face the
    problem of multicollinearity
  • In this case, two or more of the explanatory
    variables are related (there exists some
    correlation between them)
  • A movement in one explanatory variable is matched
    by a relative movement in another, so that
  • The OLS procedure produces unstable estimates
  • The OLS estimates are difficult to interpret

11
What Do These Assumptions Mean?
  • Example: Let's return to the example of DPS and
    suppose that we add the firm's interest expense as
    a third explanatory variable
  • DPS = β0 + β1 EPS + β2 AGE + β3 INT + εi
  • Since higher interest expense implies lower
    earnings, the two variables EPS and INT are
    correlated
  • Thus, β1 does not show the impact on DPS of a
    one-dollar change in EPS holding all other
    variables constant
  • The reason is that a higher EPS may itself be due
    to lower interest expenses
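Variance inflation factors (VIFs) are a standard way to quantify this kind of multicollinearity. The sketch below simulates negatively correlated EPS and INT series to mirror the example (the strength of the correlation is an assumption made for illustration):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(3)
INT = rng.uniform(0.0, 5.0, size=100)
EPS = 8.0 - 1.2 * INT + rng.normal(0.0, 0.5, size=100)  # higher INT -> lower EPS
AGE = rng.uniform(1.0, 50.0, size=100)

exog = sm.add_constant(np.column_stack([EPS, AGE, INT]))
for i, name in enumerate(["const", "EPS", "AGE", "INT"]):
    # a VIF well above ~10 is a common rule-of-thumb warning sign
    print(name, variance_inflation_factor(exog, i))
```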

12
The Properties of OLS Estimators
  • We want to see how close the OLS estimators of
    the coefficients of a model come to the
    coefficients of the true model
  • If the assumptions of the classical LRM hold,
    then the OLS estimators are the Best, Linear,
    Unbiased Estimators (BLUE)
  • This means that
  • The OLS estimates are centered around the true
    values of the coefficients (unbiased estimates)
  • The distribution of OLS estimates has the lowest
    variance among all linear unbiased estimators
  • The OLS estimates are normally distributed (given
    the added assumption of normally distributed
    errors)
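The "unbiased" part of BLUE can be illustrated with a small Monte Carlo sketch: when the classical assumptions hold, the average of many OLS slope estimates sits close to the true slope. The data-generating process and true coefficients below are invented for the simulation:

```python
import numpy as np

rng = np.random.default_rng(4)
true_beta = 2.0
estimates = []
for _ in range(1000):
    x = rng.uniform(0.0, 10.0, size=50)
    y = 1.0 + true_beta * x + rng.normal(0.0, 1.0, size=50)  # classical errors
    X = np.column_stack([np.ones(50), x])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)         # OLS via least squares
    estimates.append(beta_hat[1])
print(np.mean(estimates))   # close to 2.0, the true slope
```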

13
Testing Hypotheses About the Model's Coefficients
  • A major use of regression analysis is that it
    allows us to empirically test hypotheses about
    relationships among financial variables
  • For example, we may want to test the argument
    that the recent consolidation in US banking has
    resulted in a lower supply of credit to small
    businesses
  • Drawing a sample, estimating a model, and testing
    our hypothesis empirically does not necessarily
    allow us to prove that our theory is correct

14
Testing Hypotheses About the Model's Coefficients
  • Often, the most we can do is reject the null
    hypothesis with a certain degree of confidence
  • Before estimating a model, we need to specify our
    testable hypothesis in the form of a null and an
    alternative hypothesis
  • The null hypothesis (H0) is a statement of the
    range of values of the estimated coefficient that
    we would expect to occur if our theory were not
    true
  • The alternative hypothesis (H1) specifies the
    range of values of the coefficient that we would
    expect to occur if our theory were true

15
Testing Hypotheses About the Model's Coefficients
  • Example: Suppose we believe that higher bank
    consolidation will lead to less small business
    lending
  • We estimate the model SBL = β0 + β1(Bank
    Consolidation) + error
  • Null Hypothesis: β1 ≥ 0
  • Alternative Hypothesis: β1 < 0
  • This is an example of a one-sided hypothesis test
    because the alternative hypothesis is on only one
    side of the null hypothesis
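A sketch of this one-sided test, with simulated stand-ins for the consolidation and small-business-lending data (the variable names and coefficients are placeholders, not the presenter's data):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(5)
consolidation = rng.uniform(0.0, 1.0, size=200)
sbl = 5.0 - 2.0 * consolidation + rng.normal(0.0, 1.0, size=200)

res = sm.OLS(sbl, sm.add_constant(consolidation)).fit()
t_stat = res.tvalues[1]
p_left = stats.t.cdf(t_stat, res.df_resid)  # left-tail p-value for H1: beta1 < 0
print(t_stat, p_left)                       # reject H0: beta1 >= 0 if p_left < 0.05
```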

16
Testing Hypotheses About the Model's Coefficients
  • Another way to test a hypothesis is through a
    two-sided test
  • Null Hypothesis: β1 = 0
  • Alternative Hypothesis: β1 ≠ 0
  • In our example, if we did not have a prior theory
    about the impact of bank consolidation on small
    business lending, we could test the
  • Null Hypothesis: the impact of bank consolidation
    on SBL is not significantly different from zero
  • Alternative Hypothesis: the impact is
    significantly different from zero

17
The t-Test: Testing the Significance of
Individual Regression Coefficients
  • Testing the significance of individual regression
    coefficients is equivalent to testing the
    significance of including a particular
    explanatory variable in our model
  • We know the following result
  • The t-statistic for the kth coefficient, given by
  • tk = (β̂k − βk,0) / SE(β̂k)
  • where βk,0 is the value of βk under the null
    hypothesis, follows the t distribution with
    n-k-1 degrees of freedom

18
The t-Test: Testing the Significance of
Individual Regression Coefficients
  • SE(β̂k) is the standard error of the estimated
    coefficient
  • This is nothing other than the standard deviation
    of the sampling distribution of the different
    coefficient estimates
  • In other words, it shows whether the various
    estimated coefficients (from various samples)
    vary a little or a lot
  • Since we usually want to test whether a
    coefficient is significantly different from zero,
    the t-statistic can be stated as
  • tk = β̂k / SE(β̂k)
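This ratio is exactly what regression software reports. A small sketch with simulated data confirms that the coefficient estimate divided by its standard error reproduces the t-value statsmodels prints:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
x = rng.normal(size=100)
y = 0.5 * x + rng.normal(size=100)

res = sm.OLS(y, sm.add_constant(x)).fit()
t_manual = res.params[1] / res.bse[1]  # beta-hat divided by its standard error
print(t_manual, res.tvalues[1])        # the two values match
```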

19
The t-Test Decision Rule
  • To decide whether or not to reject the null
    hypothesis, we must compare the calculated
    t-statistic with a critical t-value
  • The critical t-value is based on our choice of
    level of significance
  • The level of significance shows the probability
    that we will make a Type I error, meaning that we
    will reject a true null hypothesis
  • Example: A 5% significance level implies that we
    will reject a true null hypothesis only 5% of the
    time

20
The t-Test Decision Rule
  • The critical t-value is given in tables of the t
    distribution
  • Decision rule: Reject the null hypothesis if
    |tk| > the critical t-value
  • A typical choice of significance level in
    empirical work is 5%
  • With a large enough sample (n > 120), the
    critical t-value for a one-sided test at the 5%
    level is 1.645, and for a two-sided test it is
    1.96
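Instead of a printed table, the critical values can be looked up with scipy (a sketch; the degrees of freedom here are an arbitrary large-sample choice):

```python
from scipy import stats

df = 500                       # a large sample, n > 120
print(stats.t.ppf(0.95, df))   # one-sided 5% critical value, about 1.645
print(stats.t.ppf(0.975, df))  # two-sided 5% critical value, about 1.96
```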