Chapter 5 Inference in the Simple Regression Model - PowerPoint PPT Presentation

About This Presentation
Title:

Chapter 5 Inference in the Simple Regression Model

Description:

1) Null Hypothesis: specify a value for the parameter. Ho: β2 = c, where c can be any value. For example, let c = 0; then the Null Hypothesis becomes Ho: β2 = 0. ... – PowerPoint PPT presentation

Slides: 25
Provided by: unkn1156
Learn more at: http://cob.jmu.edu

Transcript and Presenter's Notes

Title: Chapter 5 Inference in the Simple Regression Model


1
Chapter 5 Inference in the Simple Regression Model
  • In this chapter we study how to construct
    confidence intervals and how to conduct
    hypothesis tests using the simple regression
    model from Chapters 3 and 4.
  • Concepts for review
  • The estimators b1 and b2 are random variables
    where
  • b2 ~ Normal(β2, Var(b2))
  • b1 ~ Normal(β1, Var(b1))
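
The claim that b2 is a random variable centered at β2 can be checked with a small simulation. The sketch below is illustrative only: the population values (BETA1, BETA2, SIGMA), the sample size and the range of x are hypothetical choices, not figures from the chapter. It draws many samples from the same model, re-estimates the slope each time, and compares the spread of the estimates with the usual formula Var(b2) = σ² / Σ(x − x̄)².

```python
import random
import statistics

def ols_slope(x, y):
    """Least squares slope b2 = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)."""
    xbar = statistics.fmean(x)
    ybar = statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical population values, chosen only for illustration.
BETA1, BETA2, SIGMA, T = 40.0, 0.10, 30.0, 40

rng = random.Random(0)
x = [rng.uniform(200, 1200) for _ in range(T)]  # x is held fixed across samples

slopes = []
for _ in range(5000):
    # New random errors each time give a new sample and a new estimate b2.
    y = [BETA1 + BETA2 * xi + rng.gauss(0, SIGMA) for xi in x]
    slopes.append(ols_slope(x, y))

xbar = statistics.fmean(x)
theory_var = SIGMA ** 2 / sum((xi - xbar) ** 2 for xi in x)  # Var(b2)
sim_mean = statistics.fmean(slopes)
sim_var = statistics.pvariance(slopes)

print("mean of b2 estimates:", sim_mean)
print("simulated Var(b2):   ", sim_var)
print("theoretical Var(b2): ", theory_var)
```

The average of the estimates should sit close to the assumed β2, and the two variance figures should roughly agree.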

2
Interval Estimation
  • Least Squares gives us point estimates for β1
    and β2.
  • Need to address the issue of precision using
    knowledge of
  • the variance of b2 and
  • the shape of b2's probability distribution
  • We can construct a margin of error around the
    point estimates.
  • Review: Confidence Intervals
  • We know that 95% of all possible values for a
    normal random variable lie within 1.96 standard
    deviations of the mean.
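
The 1.96 figure can be verified directly. This short sketch uses `statistics.NormalDist` from the Python standard library; the variable names are our own.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

central = z.cdf(1.96) - z.cdf(-1.96)  # area within 1.96 sd of the mean
tail = 1 - z.cdf(1.96)                # area in one tail
z_crit = z.inv_cdf(0.975)             # value leaving 0.025 in the upper tail

print(round(central, 4), round(tail, 4), round(z_crit, 2))
```

The central area comes out at about 0.95, with about 0.025 in each tail.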

[Figure: normal density of b2 centered at β2, with 0.95 of the area in the center and 0.025 in each tail.]
3
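$$ P(b_2 - 1.96\,\sigma_{b_2} \le \beta_2 \le b_2 + 1.96\,\sigma_{b_2}) = 0.95 $$
where
$$ \sigma_{b_2} = \sqrt{\mathrm{Var}(b_2)} = \sqrt{\frac{\sigma^2}{\sum (x_t - \bar{x})^2}} $$
Note that the above interval makes a
probabilistic statement about the interval, whose
endpoints depend on the random variable b2, not
about β2, which is a constant.
If we knew σ, then we would have no problem
constructing the interval.
However, σ is unknown and must be estimated. This
adds an additional source of uncertainty to the
interval and also changes the shape of the
standardized distribution.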
4
The Student t-distribution
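We know how to estimate σ:
$$ \hat{\sigma}^2 = \frac{\sum \hat{e}_t^2}{T - 2} $$
However, when we standardize b2 using an estimate
of σ, we no longer have a standard normal random
variable. Instead we have a random variable with
a t-distribution:
$$ t = \frac{b_2 - \beta_2}{\mathrm{se}(b_2)} \sim t_{(T-2)} $$
But what is se(b2)?
$$ \mathrm{se}(b_2) = \sqrt{\frac{\hat{\sigma}^2}{\sum (x_t - \bar{x})^2}} $$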
5
About the Student t-distribution
Compare a z random variable to a t random
variable:
1) In the expression for z, the only random
variable is b2 ⇒ z has the same type of
distribution as b2 because β2 and σ_b2 are
constants. The distribution is Normal.
2) In the expression for t, both b2 and se(b2)
are random variables: b2 has a normal
distribution and se(b2) is a function of σ̂²,
where (T − 2)σ̂²/σ² has a χ² distribution. The
ratio of a standard normal random variable to
the square root of an independent χ² random
variable divided by its degrees of freedom has
a t-distribution.
6
More on the t
t-values have an associated number of degrees of
freedom. For a simple regression model, this is
T − 2. See Table 2, front cover of book. Suppose
T = 40 ⇒ 38 d.o.f., and 95% of the values lie
within ±2.024 of the mean. Identify the relevant
area on the diagram.
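
The table value for 38 degrees of freedom can be reproduced numerically. The following is a minimal sketch that avoids third-party libraries: it integrates the t density with Simpson's rule and finds the critical value by bisection. The function names (`t_pdf`, `t_upper_tail`, `t_critical`) are our own, and a statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of the Student t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(x, df, steps=2000):
    """P(t >= x) for x >= 0, using Simpson's rule on [0, x] and symmetry."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(tail_area, df):
    """Find t_c with P(t >= t_c) = tail_area by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > tail_area:
            lo = mid   # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

t_c = t_critical(0.025, 38)
print(round(t_c, 3))  # close to the 2.024 quoted on the slide
print(t_c > 1.96)     # the t critical value exceeds the normal one
```

The critical value is larger than 1.96 because estimating σ adds uncertainty, which puts more probability in the tails of the t-distribution.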
7
Confidence Intervals Using the t-Distribution
2.024 is the critical t value that leaves 2.5% of
the values in each tail. Its value depends on
the degrees of freedom and the level of
confidence. A confidence interval for β2 has the
general form
$$ b_2 \pm t_c\,\mathrm{se}(b_2) $$
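
As a rough illustration of the general form, the sketch below computes the interval estimate ± t_c × standard error. The slope 0.1283 is quoted in the p-value slides later in this transcript. The standard error is an assumption: it is backed out from the t-statistic of 4.20 quoted in those same slides, because the transcript does not preserve it directly.

```python
def confidence_interval(estimate, std_error, t_crit):
    """Return (lower, upper) for estimate ± t_crit * std_error."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin

b2 = 0.1283        # slope estimate quoted in the p-value slides
se_b2 = b2 / 4.20  # ASSUMPTION: implied by the quoted t-statistic of 4.20
t_c = 2.024        # critical value for 38 degrees of freedom, 95% confidence

lower, upper = confidence_interval(b2, se_b2, t_c)
print(f"95% interval for beta2: [{lower:.4f}, {upper:.4f}]")
```

A larger standard error or a higher confidence level (larger t_c) widens the interval.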
8
Example of a Confidence Interval
In Chapter 3 we found for the food expenditure
example
In Chapter 4 we found for the food expenditure
example
9
This is the 95% confidence interval. In repeated
sampling, 95% of intervals constructed this way
contain the true value of β2.
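
The 95% figure describes how often the interval procedure captures β2 across repeated samples. A simulation makes this concrete. The sketch below is illustrative only: the true parameter values and the x data are hypothetical, and the critical value 2.024 is the one the slides give for T = 40.

```python
import math
import random
import statistics

def fit_slope_and_se(x, y):
    """Return (b2, se(b2)) from a simple least squares regression."""
    n = len(x)
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b1 = ybar - b2 * xbar
    sse = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = sse / (n - 2)  # estimate of the error variance
    return b2, math.sqrt(sigma2_hat / sxx)

BETA1, BETA2, SIGMA, T = 40.0, 0.10, 30.0, 40  # hypothetical true values
T_CRIT = 2.024                                   # 38 d.o.f., 95% confidence

rng = random.Random(1)
x = [rng.uniform(200, 1200) for _ in range(T)]

reps, covered = 4000, 0
for _ in range(reps):
    y = [BETA1 + BETA2 * xi + rng.gauss(0, SIGMA) for xi in x]
    b2, se = fit_slope_and_se(x, y)
    # Each sample produces a different interval; count those containing beta2.
    if b2 - T_CRIT * se <= BETA2 <= b2 + T_CRIT * se:
        covered += 1

coverage = covered / reps
print("share of intervals containing beta2:", coverage)
```

The share should come out near 0.95, although any single interval either contains β2 or does not.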
10
Hypothesis Testing
  • The Idea
  • A hypothesis is a conjecture about a population
    parameter, such as "we believe the marginal
    propensity to spend on food is 0.10 out of every
    dollar" ⇒ β2 = 0.10
  • Remember that population parameters are unknown
    constants.
  • We test hypotheses about β2 using b2, our
    estimator of β2.
  • b2 is calculated using a sample of data. If b2
    is reasonably close to the hypothesized value
    for β2, then we say that the data support the
    hypothesis. If b2 is NOT reasonably close,
    then we say that the data do not support the
    hypothesis.

11
Formal Hypothesis Testing
  • y = β1 + β2x + e
  • 1) Null Hypothesis: specify a value for the
    parameter
  • Ho: β2 = c, where c can be any value.
  • For example, let c = 0; then the Null Hypothesis
    becomes
  • Ho: β2 = 0.
  • Note that if this were true, then it says that x
    has no effect on y. This test is called a test
    of significance.

12
  • 2) Alternative Hypothesis: a logical alternative
    to the Null Hypothesis, because if we reject the
    Null Hypothesis, then we must be prepared to
    accept the Alternative Hypothesis. Typically, it
    is
  • H1: β2 ≠ c or H1: β2 < c or H1: β2 > c.
  • If we have a test of significance where Ho: β2 =
    0, then the Alternative Hypothesis is
  • H1: β2 ≠ 0 or H1: β2 < 0 or H1: β2 > 0
  • Whether we use ≠, < or > depends on the
    situation and economic theory. For example, it is
    theoretically impossible that β2 < 0 where β2 is
    the marginal propensity to consume. Therefore, a
    test of significance would be
  • Ho: β2 = 0
  • H1: β2 > 0

13
  • 3) Test Statistic: we use a statistic to test the
    hypothesis.
  • The idea: if the test statistic disagrees with
    Ho ⇒ reject Ho.
  • Whether the test statistic agrees or disagrees
    with Ho must be addressed in probabilistic terms.
  • Our test statistic is based on b2. The mean of
    b2 is β2, but β2 is unknown.
  • Make this assumption: Ho is true.
  • Suppose Ho: β2 = c ⇒ we now know that b2's
    distribution is centered at c, so
  • $$ t = \frac{b_2 - c}{\mathrm{se}(b_2)} \sim t_{(T-2)} $$
  • This is our test statistic.
  • What do we do with it?

14
  • 4) The Rejection Region
  • We have assumed Ho to be true ⇒ examine the
    distribution of b2 under this hypothesis.
  • Suppose that we calculate our test statistic and
    it falls into the tail of this distribution.
    There are 2 reasons why this might happen:
  • The assumption that Ho is true is a bad one
    (meaning the true distribution is centered at a
    value other than c)
  • Ho is true but our sample data were very
    unlikely (came from the tail)
  • Extreme values are those values that fall into
    one or both tails, depending on the alternative
    hypothesis. We typically use the 5% most extreme
    values: a region of low probability.

[Figure: distribution of b2 centered at β2 = c, with the corresponding t axis centered at 0.]
15
Suppose Ho: β2 = 0 and H1: β2 ≠ 0. The test
statistic is t = b2 / se(b2). The rejection
region will be t values that fall into either
tail ⇒ Two-Tailed Test, because H1: β2 ≠ 0. If we
use a 5% level of significance, then we put 2.5%
into each tail. What t-values leave 0.025 in each
tail? Use the t-table. Suppose T = 40 so that we
have 38 degrees of freedom.
[Figure: distribution of b2 centered at β2 = 0, with 0.025 in each tail; the corresponding t axis is centered at 0.]
16
Suppose Ho: β2 = 0 and H1: β2 > 0. The test
statistic is t = b2 / se(b2). The rejection
region will be t values that fall into the right
tail ⇒ One-Tailed Test. If we use a 5% level of
significance, then we put 5% into the right tail.
What t-values leave 0.05 in the tail? Use the
t-table. Suppose T = 40 so that we have 38
degrees of freedom.
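[Figure: distribution of b2 centered at β2 = 0, with 0.05 in the right tail; the corresponding t axis is centered at 0.]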
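
The two rejection regions can be written as simple decision rules. In this sketch the two-tailed critical value 2.024 comes from the slides. The one-tailed value 1.686 is the standard t-table entry for 38 degrees of freedom with 0.05 in the right tail. The function names and the example statistic are hypothetical.

```python
def reject_two_tailed(t_stat, t_crit):
    """H1: beta2 != c. Reject Ho when t falls in either tail."""
    return abs(t_stat) >= t_crit

def reject_right_tailed(t_stat, t_crit):
    """H1: beta2 > c. Reject Ho only when t falls in the right tail."""
    return t_stat >= t_crit

T_CRIT_TWO = 2.024  # leaves 0.025 in each tail, 38 d.o.f. (from the slides)
T_CRIT_ONE = 1.686  # leaves 0.05 in the right tail, 38 d.o.f. (t-table value)

t_example = 1.80    # hypothetical test statistic
print(reject_two_tailed(t_example, T_CRIT_TWO))    # False
print(reject_right_tailed(t_example, T_CRIT_ONE))  # True
```

The same statistic can lead to different conclusions under the two alternatives, because the one-tailed test puts the full 5% in a single tail and so has a smaller critical value.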
17
  • 5) Conduct the Test
  • Compare the t-statistic to the rejection region
    and conclude whether to reject or fail to reject
    the null hypothesis (Ho)
  • Example: Food Expenditure
  • Ho: β2 = 0
  • H1: β2 > 0
  • Conclusion?
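
One way to carry out this step is sketched below. It borrows b2 = 0.1283 and t = 4.20 from the p-value slides later in this transcript, on the assumption that they belong to the same food expenditure example. The standard error is backed out from those two numbers rather than taken from the source, and 1.686 is the standard t-table value for a 5% right tail with 38 degrees of freedom.

```python
def t_statistic(b2, c, se_b2):
    """t = (b2 - c) / se(b2), computed under Ho: beta2 = c."""
    return (b2 - c) / se_b2

b2 = 0.1283            # slope estimate quoted in the p-value slides
se_b2 = 0.1283 / 4.20  # ASSUMPTION: standard error implied by the quoted t of 4.20
t_crit = 1.686         # ASSUMPTION: t-table value, 5% right tail, 38 d.o.f.

t = t_statistic(b2, 0.0, se_b2)
decision = "reject Ho" if t >= t_crit else "fail to reject Ho"
print(f"t = {t:.2f}, critical value = {t_crit}, decision: {decision}")
```

Under these assumptions the statistic lies far into the right tail, so Ho: β2 = 0 would be rejected in favor of H1: β2 > 0.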

18
  • 6) Think about Possible Errors
  • We never know for sure whether we have made an
    error because the truth is never revealed to us.
  • We can only analyze the probability of making an
    error. When we set our level of significance, we
    are actually setting the probability of a Type I
    error. Why? Suppose that Ho is true ⇒ 5% of the
    time we will get samples of data that generate a
    test statistic t that lies in the rejection
    region, leading us to reject Ho when in fact it
    is true.
  • We can make the probability of a Type I error
    smaller by using a 1% level of significance
    instead of 5%.

                     The truth
Our Decision         Ho is true      Ho is false
Reject Ho            Type I Error    Correct
Fail to Reject Ho    Correct         Type II Error
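
The link between the significance level and the Type I error rate can be checked by simulation. In the sketch below Ho: β2 = 0 is true by construction, since the data are generated with no x effect. The other settings (BETA1, SIGMA, the x values) are hypothetical, and the two-tailed critical value 2.024 is the one the slides give for T = 40.

```python
import math
import random
import statistics

def slope_t_statistic(x, y, c=0.0):
    """t = (b2 - c) / se(b2) from a simple least squares regression."""
    n = len(x)
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b1 = ybar - b2 * xbar
    sse = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    se_b2 = math.sqrt(sse / (n - 2) / sxx)
    return (b2 - c) / se_b2

BETA1, SIGMA, T = 40.0, 30.0, 40  # hypothetical values; beta2 = 0, so Ho is true
T_CRIT = 2.024                    # two-tailed 5% critical value, 38 d.o.f.

rng = random.Random(2)
x = [rng.uniform(200, 1200) for _ in range(T)]

reps, rejections = 4000, 0
for _ in range(reps):
    y = [BETA1 + rng.gauss(0, SIGMA) for _ in x]  # y does not depend on x
    if abs(slope_t_statistic(x, y)) >= T_CRIT:
        rejections += 1  # rejecting a true Ho is a Type I error

type_one_rate = rejections / reps
print("share of samples rejecting a true Ho:", type_one_rate)
```

The rejection share should land near 0.05, matching the chosen level of significance.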
19
  • A Type II Error occurs when we fail to reject Ho
    when in fact it is false (meaning the alternative
    hypothesis H1 is true). In order to measure the
    probability of this error occurring, we need a
    more specific H1.

20
  • P-Values
  • As an alternative to specifying the level of
    significance for a test, we can calculate the
    p-value of the test, which stands for
    probability value.
  • It is simply the probability of getting the
    sample test statistic or something more extreme
    under the assumption that Ho is true.
  • Suppose Ho: β2 = 0
  • H1: β2 > 0
  • and our b2 = 0.1283
  • P-value is P(b2 ≥ 0.1283) = P(t ≥ 4.20) = area
    in right tail.
  • In Excel, use this formula: =TDIST(4.2,38,1)
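[Figure: distribution of b2 centered at β2 = 0 with b2 = 0.1283 marked; on the corresponding t axis, centered at 0, this is t = 4.20, and the p-value is the area to its right.]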
21
  • For a two-tailed test, we multiply the one-tail
    p-value by 2
  • Suppose Ho: β2 = 0
  • H1: β2 ≠ 0
  • and our b2 = 0.1283
  • P-value is 2 × P(b2 ≥ 0.1283) = 2 × P(t ≥ 4.20)
  • In Excel, use this formula: =TDIST(4.2,38,2)
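
For readers without Excel, the same tail areas can be approximated numerically. This is a minimal standard-library sketch that integrates the t density with Simpson's rule. The helper names are our own, and `t_dist_p_value` only imitates the tails argument of TDIST for a non-negative statistic.

```python
import math

def t_pdf(x, df):
    """Density of the Student t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(x, df, steps=2000):
    """P(t >= x) for x >= 0, using Simpson's rule on [0, x] and symmetry."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_dist_p_value(t_stat, df, tails):
    """Tail area for t_stat >= 0; tails=1 is one-tailed, tails=2 doubles it."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return tails * t_upper_tail(t_stat, df)

p_one = t_dist_p_value(4.20, 38, 1)
p_two = t_dist_p_value(4.20, 38, 2)
print(f"one-tailed p-value: {p_one:.6f}")
print(f"two-tailed p-value: {p_two:.6f}")
```

Both values are far below 0.05, so the null hypothesis would be rejected at the 5% level under either alternative.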

22
Least Squares Predictor
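  • This predictor is a random variable because it
    is a function of b1 and b2, which are random
    variables.
  • Suppose x = xo; the model predicts
  • $$ \hat{y}_o = b_1 + b_2 x_o $$
  • The error is
  • $$ f = y_o - \hat{y}_o = (\beta_1 + \beta_2 x_o + e_o) - (b_1 + b_2 x_o) $$
  • The variance of this error tells us about the
    precision of the prediction
  • $$ \mathrm{var}(f) = \sigma^2 \left[ 1 + \frac{1}{T} + \frac{(x_o - \bar{x})^2}{\sum (x_t - \bar{x})^2} \right] $$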

23
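An estimator of var(f) uses an estimator for σ²:
$$ \widehat{\mathrm{var}}(f) = \hat{\sigma}^2 \left[ 1 + \frac{1}{T} + \frac{(x_o - \bar{x})^2}{\sum (x_t - \bar{x})^2} \right] $$
We can now construct a confidence interval for
our predictor:
$$ \hat{y}_o \pm t_c\,\mathrm{se}(f), \qquad \mathrm{se}(f) = \sqrt{\widehat{\mathrm{var}}(f)} $$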
Example
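
A minimal sketch of a prediction interval follows. It uses made-up data, not the food expenditure figures, and the standard simple-regression formula for the forecast error variance, var(f) = σ̂² [1 + 1/T + (xo − x̄)² / Σ(x − x̄)²]. The function name is hypothetical, and 2.447 is the t-table value for 6 degrees of freedom at 95% confidence.

```python
import math
import statistics

def prediction_interval(x, y, x0, t_crit):
    """Return (y0_hat, se_f, lower, upper) for a least squares prediction at x0."""
    n = len(x)
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b1 = ybar - b2 * xbar
    sse = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = sse / (n - 2)
    # Estimated variance of the forecast error f = y0 - y0_hat
    var_f = sigma2_hat * (1 + 1 / n + (x0 - xbar) ** 2 / sxx)
    se_f = math.sqrt(var_f)
    y0_hat = b1 + b2 * x0
    return y0_hat, se_f, y0_hat - t_crit * se_f, y0_hat + t_crit * se_f

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
T_CRIT = 2.447  # t-table value for 6 d.o.f., 95% confidence

near = prediction_interval(x, y, 4.5, T_CRIT)  # x0 at the sample mean of x
far = prediction_interval(x, y, 12.0, T_CRIT)  # x0 well outside the data
print("se(f) at x0 = 4.5:", round(near[1], 4))
print("se(f) at x0 = 12: ", round(far[1], 4))
```

The standard error of the forecast is smallest when xo equals the sample mean of x and grows as xo moves away from it, so predictions far from the observed data are less precise.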
24
The Idea Behind Hypothesis Testing
  • 1) The probability distribution for b2 is centered
    at β2, which is an unknown parameter. Remember
    that E(b2) = β2.
  • 2) Assume a value for β2. The value we assume is
    the value of β2 in the null hypothesis. By
    assuming a value, we tie down the distribution
    for b2 (we center the distribution for b2 at the
    assumed value for β2).
  • 3) Use a sample of data on X and Y to calculate
    the b2 estimate.
  • 4) Take this value of b2 and match it up to the
    distribution from 2) above. Does the value of b2
    fall near the center of the distribution or out
    into the tails? If it falls near the center,
    then this value of b2 has a high probability of
    occurring under the assumed β2 value; therefore,
    the assumed value is said to be consistent with
    the data. If, on the other hand, the b2 value
    falls into the tails, then we say that it has a
    low probability of occurring under the assumed
    value; therefore, the assumed value is not
    consistent with the data.
  • 5) Now, we just need to clarify what it means to
    be "out into the tails" or "near the center".
    This is determined by setting a significance
    level and the rejection region.