Title: Chapter 12 Simple Linear Regression
1Chapter 12 Simple Linear Regression
- Simple Linear Regression Model
- Least Squares Method
- Coefficient of Determination
- Model Assumptions
- Testing for Significance
- Using the Estimated Regression Equation for Estimation and Prediction
- Computer Solution
- Residual Analysis: Validating Model Assumptions
2Simple Linear Regression Model
- The equation that describes how y is related to x and an error term is called the regression model.
- The simple linear regression model is
- $y = \beta_0 + \beta_1 x + \varepsilon$
- $\beta_0$ and $\beta_1$ are called parameters of the model.
- $\varepsilon$ is a random variable called the error term.
3Simple Linear Regression Equation
- The simple linear regression equation is
- $E(y) = \beta_0 + \beta_1 x$
- The graph of the regression equation is a straight line.
- $\beta_0$ is the y intercept of the regression line.
- $\beta_1$ is the slope of the regression line.
- $E(y)$ is the expected value of y for a given x value.
4Simple Linear Regression Equation
- Positive Linear Relationship
[Figure: regression line, E(y) versus x, with intercept $\beta_0$ and positive slope $\beta_1$]
5Simple Linear Regression Equation
- Negative Linear Relationship
[Figure: regression line, E(y) versus x, with intercept $\beta_0$ and negative slope $\beta_1$]
6Simple Linear Regression Equation
[Figure: regression line, E(y) versus x, with intercept $\beta_0$ and slope $\beta_1$ equal to 0]
7Estimated Simple Linear Regression Equation
- The estimated simple linear regression equation is
- $\hat{y} = b_0 + b_1 x$
- The graph is called the estimated regression line.
- $b_0$ is the y intercept of the line.
- $b_1$ is the slope of the line.
- $\hat{y}$ is the estimated value of y for a given x value.
8Estimation Process
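- Regression Model: $y = \beta_0 + \beta_1 x + \varepsilon$
- Regression Equation: $E(y) = \beta_0 + \beta_1 x$
- Unknown Parameters: $\beta_0$, $\beta_1$
- Sample Data: $(x_1, y_1), \ldots, (x_n, y_n)$
- Estimated Regression Equation: $\hat{y} = b_0 + b_1 x$
- Sample Statistics: $b_0$, $b_1$
- $b_0$ and $b_1$ provide estimates of $\beta_0$ and $\beta_1$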
9Least Squares Method
- Least Squares Criterion
- $\min \sum (y_i - \hat{y}_i)^2$
- where
- $y_i$ = observed value of the dependent variable for the ith observation
- $\hat{y}_i$ = estimated value of the dependent variable for the ith observation
10The Least Squares Method
- Slope for the Estimated Regression Equation
- $b_1 = \dfrac{\sum x_i y_i - (\sum x_i \sum y_i)/n}{\sum x_i^2 - (\sum x_i)^2/n}$
11The Least Squares Method
- y-Intercept for the Estimated Regression Equation
- $b_0 = \bar{y} - b_1 \bar{x}$
- where
- $x_i$ = value of independent variable for ith observation
- $y_i$ = value of dependent variable for ith observation
- $\bar{x}$ = mean value for independent variable
- $\bar{y}$ = mean value for dependent variable
- n = total number of observations
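The slope and y-intercept calculations can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `least_squares_fit` and the four data points are hypothetical, and the slope uses the computational form (sum of x*y minus the product of the sums over n, divided by the sum of x squared minus the squared sum over n).

```python
def least_squares_fit(x, y):
    """Return (b0, b1) for the estimated regression equation y_hat = b0 + b1*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y need the same length and at least 2 observations")

    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)

    denominator = sum_x2 - sum_x ** 2 / n
    if denominator == 0:
        raise ValueError("all x values are identical, so the slope is undefined")

    b1 = (sum_xy - sum_x * sum_y / n) / denominator  # slope
    b0 = sum_y / n - b1 * sum_x / n                  # intercept: y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data lying exactly on the line y = 1 + 2x.
b0, b1 = least_squares_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)
```

Because the hypothetical points fall exactly on a line, the fit should recover that line's intercept and slope.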
12Example Reed Auto Sales
- Simple Linear Regression
- Reed Auto periodically has a special
week-long sale. As part of the advertising
campaign Reed runs one or more television
commercials during the weekend preceding the
sale. Data from a sample of 5 previous sales are
shown on the next slide.
13Example Reed Auto Sales
- Simple Linear Regression
- Number of TV Ads | Number of Cars Sold
- 1 | 14
- 3 | 24
- 2 | 18
- 1 | 17
- 3 | 27
14Example Reed Auto Sales
- Slope for the Estimated Regression Equation
- $b_1 = \dfrac{220 - (10)(100)/5}{24 - (10)^2/5} = 5$
- y-Intercept for the Estimated Regression Equation
- $b_0 = 20 - 5(2) = 10$
- Estimated Regression Equation
- $\hat{y} = 10 + 5x$
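The sums that enter the slope and intercept calculations are not shown on the slide. As a worked check, they can be recomputed from the five observations (x = number of TV ads, y = number of cars sold):

```latex
\begin{align*}
\sum x_i &= 1 + 3 + 2 + 1 + 3 = 10, & \sum y_i &= 14 + 24 + 18 + 17 + 27 = 100, \\
\sum x_i y_i &= 14 + 72 + 36 + 17 + 81 = 220, & \sum x_i^2 &= 1 + 9 + 4 + 1 + 9 = 24, \\
\bar{x} &= 10/5 = 2, & \bar{y} &= 100/5 = 20.
\end{align*}
```

These values match the 220, 10, 100, 24, 20, and 2 that appear in the slope and intercept calculations.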
15Example Reed Auto Sales
16The Coefficient of Determination
- Relationship Among SST, SSR, SSE
- $SST = SSR + SSE$
- where
- SST = total sum of squares
- SSR = sum of squares due to regression
- SSE = sum of squares due to error
17The Coefficient of Determination
- The coefficient of determination is
- $r^2 = SSR/SST$
- where
- SST = total sum of squares
- SSR = sum of squares due to regression
18Example Reed Auto Sales
- Coefficient of Determination
- $r^2 = SSR/SST = 100/114 = .8772$
- The regression relationship is very strong because 88% of the variation in number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.
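The values 100 and 114 can be reproduced from the Reed Auto data. The sketch below assumes the standard definitions, which the slides name but do not spell out: SST sums squared deviations of y from its mean, SSR sums squared deviations of the fitted values from that mean, and SSE sums the squared residuals. The variable names are illustrative.

```python
ads = [1, 3, 2, 1, 3]          # number of TV ads (x)
cars = [14, 24, 18, 17, 27]    # number of cars sold (y)
b0, b1 = 10, 5                 # estimated regression equation from the example

y_bar = sum(cars) / len(cars)
y_hat = [b0 + b1 * x for x in ads]

sst = sum((y - y_bar) ** 2 for y in cars)               # total sum of squares
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # due to regression
sse = sum((y - yh) ** 2 for y, yh in zip(cars, y_hat))  # due to error

r2 = ssr / sst
print(sst, ssr, sse, round(r2, 4))
```

The three sums should also satisfy SST = SSR + SSE, which is a quick sanity check on the fit.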
19The Correlation Coefficient
- Sample Correlation Coefficient
- $r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$
- where
- $b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$
20Example Reed Auto Sales
- Sample Correlation Coefficient
- The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is "+".
- $r_{xy} = +\sqrt{.8772} = +.9366$
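As a cross-check, the correlation can be computed two ways for the Reed Auto data: by attaching the sign of the slope to the square root of the coefficient of determination, and by the usual direct sample correlation formula. The direct formula is a standard result that is assumed here rather than taken from the slides.

```python
import math

ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]
b1 = 5           # slope from the example
r2 = 100 / 114   # coefficient of determination from the example

# Route 1: sign of the slope times the square root of r^2.
r_from_r2 = math.copysign(math.sqrt(r2), b1)

# Route 2: direct sample correlation (assumed standard formula).
x_bar = sum(ads) / len(ads)
y_bar = sum(cars) / len(cars)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ads, cars))
sxx = sum((x - x_bar) ** 2 for x in ads)
syy = sum((y - y_bar) ** 2 for y in cars)
r_direct = sxy / math.sqrt(sxx * syy)

print(round(r_from_r2, 4), round(r_direct, 4))
```

Both routes should agree in simple linear regression, since there is only one independent variable.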
21Model Assumptions
- Assumptions About the Error Term $\varepsilon$
- The error $\varepsilon$ is a random variable with mean of zero.
- The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.
- The values of $\varepsilon$ are independent.
- The error $\varepsilon$ is a normally distributed random variable.
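One way to see what these assumptions mean in practice is to simulate data that satisfies them. The sketch below is illustrative only: the function name `simulate_responses` and the parameter values are hypothetical. Each error is drawn independently from a normal distribution with mean zero and the same standard deviation for every x.

```python
import random


def simulate_responses(x_values, beta0, beta1, sigma, seed=0):
    """Draw one y per x from y = beta0 + beta1*x + error, with error ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # private generator, so a given seed is reproducible
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in x_values]


# Hypothetical parameter values, chosen only for illustration.
x_values = [1, 2, 3, 1, 3]
y_values = simulate_responses(x_values, beta0=10.0, beta1=5.0, sigma=2.0, seed=42)
print(y_values)
```

Setting `sigma` to 0 removes the error term, so the simulated values fall exactly on the regression line.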
22Testing for Significance
- To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.
- Two tests are commonly used
- t Test
- F Test
- Both tests require an estimate of $\sigma^2$, the variance of $\varepsilon$ in the regression model.
23Testing for Significance
- An Estimate of $\sigma^2$
- The mean square error (MSE) provides the estimate of $\sigma^2$, and the notation $s^2$ is also used.
- $s^2 = MSE = SSE/(n-2)$
- where
- $SSE = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$
24Testing for Significance
- An Estimate of $\sigma$
- To estimate $\sigma$ we take the square root of $s^2$.
- The resulting s is called the standard error of the estimate.
- $s = \sqrt{MSE} = \sqrt{SSE/(n-2)}$
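For the Reed Auto data, the mean square error and the standard error of the estimate can be computed as follows. This is a small sketch with illustrative variable names; the fitted equation is the one from the example.

```python
import math

ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]
b0, b1 = 10, 5   # estimated regression equation from the example

n = len(ads)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ads, cars))
mse = sse / (n - 2)   # divide by n - 2 because two parameters were estimated
s = math.sqrt(mse)    # standard error of the estimate

print(sse, round(mse, 3), round(s, 3))
```

The MSE of about 4.667 is the value that appears later in the F test for this example.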
25Testing for Significance t Test
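- Hypotheses
- $H_0: \beta_1 = 0$
- $H_a: \beta_1 \neq 0$
- Test Statistic
- $t = b_1 / s_{b_1}$, where $s_{b_1} = s / \sqrt{\sum (x_i - \bar{x})^2}$ is the estimated standard deviation of $b_1$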
26Testing for Significance t Test
- Rejection Rule
- Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
- where $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom
27Example Reed Auto Sales
- t Test
- Hypotheses
- $H_0: \beta_1 = 0$
- $H_a: \beta_1 \neq 0$
- Rejection Rule
- For $\alpha = .05$ and d.f. = 3, $t_{.025} = 3.182$
- Reject $H_0$ if $t < -3.182$ or $t > 3.182$
28Example Reed Auto Sales
- t Test
- Test Statistic
- $t = 5/1.08 = 4.63$
- Conclusion
- $t = 4.63 > 3.182$, so reject $H_0$
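The 1.08 in the denominator is the estimated standard deviation of the slope. A minimal sketch of where it comes from, assuming the standard formula (the standard error of the estimate divided by the square root of the sum of squared x deviations), with the critical value 3.182 taken from the example rather than computed:

```python
import math

ads = [1, 3, 2, 1, 3]
b1 = 5               # slope from the example
mse = 14 / 3         # SSE / (n - 2) for the Reed Auto data
t_critical = 3.182   # t_.025 with 3 degrees of freedom, as given in the example

x_bar = sum(ads) / len(ads)
sxx = sum((x - x_bar) ** 2 for x in ads)   # sum of squared x deviations

s_b1 = math.sqrt(mse / sxx)                # estimated standard deviation of b1
t_stat = b1 / s_b1
reject_h0 = abs(t_stat) > t_critical       # two-tailed rejection rule

print(round(s_b1, 2), round(t_stat, 2), reject_h0)
```

Using the absolute value of the statistic covers both tails of the rejection rule in one comparison.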
29Confidence Interval for $\beta_1$
- We can use a 95% confidence interval for $\beta_1$ to test the hypotheses just used in the t test.
- $H_0$ is rejected if the hypothesized value of $\beta_1$ is not included in the confidence interval for $\beta_1$.
30Confidence Interval for $\beta_1$
- The form of a confidence interval for $\beta_1$ is
- $b_1 \pm t_{\alpha/2} s_{b_1}$
- where $b_1$ is the point estimate
- $t_{\alpha/2} s_{b_1}$ is the margin of error
- $t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with n - 2 degrees of freedom
31Example Reed Auto Sales
- Rejection Rule
- Reject $H_0$ if 0 is not included in the confidence interval for $\beta_1$.
- 95% Confidence Interval for $\beta_1$
- $5 \pm 3.182(1.08) = 5 \pm 3.44$
- or 1.56 to 8.44
- Conclusion
- 0 is not included in the confidence interval.
- Reject $H_0$
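The interval arithmetic can be checked with a few lines of Python. This sketch reuses quantities from the Reed Auto example: the slope 5, the t value 3.182, and a standard deviation of the slope computed as the square root of MSE over the sum of squared x deviations (14/3 over 4, an assumed standard formula). Variable names are illustrative.

```python
import math

b1 = 5                            # point estimate of the slope
s_b1 = math.sqrt((14 / 3) / 4)    # about 1.08 for the Reed Auto data
t_critical = 3.182                # t_.025 with 3 degrees of freedom, from the example

margin = t_critical * s_b1
lower, upper = b1 - margin, b1 + margin

# H0: beta1 = 0 is rejected when 0 falls outside the interval.
reject_h0 = not (lower <= 0 <= upper)

print(round(margin, 2), round(lower, 2), round(upper, 2), reject_h0)
```

This reaches the same decision as the t test, because both use the same estimate, standard deviation, and t value.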
32Testing for Significance F Test
- Hypotheses
- $H_0: \beta_1 = 0$
- $H_a: \beta_1 \neq 0$
- Test Statistic
- $F = MSR/MSE$
33Testing for Significance F Test
- Rejection Rule
- Reject $H_0$ if $F > F_\alpha$
- where $F_\alpha$ is based on an F distribution with 1 d.f. in the numerator and n - 2 d.f. in the denominator
34Example Reed Auto Sales
- F Test
- Hypotheses
- $H_0: \beta_1 = 0$
- $H_a: \beta_1 \neq 0$
- Rejection Rule
- For $\alpha = .05$ and d.f. = 1, 3: $F_{.05} = 10.13$
- Reject $H_0$ if $F > 10.13$.
35Example Reed Auto Sales
- F Test
- Test Statistic
- $F = MSR/MSE = 100/4.667 = 21.43$
- Conclusion
- $F = 21.43 > 10.13$, so we reject $H_0$.
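A short sketch of the F computation for the Reed Auto data follows. It assumes MSR is SSR divided by the number of independent variables (1 here), which the slides do not state explicitly, and takes the critical value 10.13 from the example. It also checks a known property of simple linear regression: the F statistic equals the square of the t statistic for the slope.

```python
ssr, sse, n = 100, 14, 5   # sums of squares and sample size from the example

msr = ssr / 1              # one independent variable (assumed definition of MSR)
mse = sse / (n - 2)
f_stat = msr / mse

f_critical = 10.13         # F_.05 with 1 and 3 degrees of freedom, from the example
reject_h0 = f_stat > f_critical

# t statistic for the slope: b1 = 5, sum of squared x deviations = 4.
t_stat = 5 / (mse / 4) ** 0.5

print(round(f_stat, 2), round(t_stat ** 2, 2), reject_h0)
```

With only one independent variable, the two tests always lead to the same conclusion.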
36Some Cautions about the Interpretation of Significance Tests
- Rejecting $H_0: \beta_1 = 0$ and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y.
- Just because we are able to reject $H_0: \beta_1 = 0$ and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y.
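The second caution can be illustrated with data that is deliberately not linear. In the hypothetical example below, y is exactly x squared, yet a straight-line fit still produces a large F statistic. The slope uses the deviation form of the least squares formula, which is algebraically equivalent to the computational form.

```python
x = [1, 2, 3, 4, 5]
y = [xi ** 2 for xi in x]   # hypothetical data: exactly quadratic, not linear

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx              # least squares slope
b0 = y_bar - b1 * x_bar     # least squares intercept

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)
ssr = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
f_stat = ssr / (sse / (n - 2))

print(round(f_stat, 2), residuals)
```

With n = 5, the F statistic is well above the 10.13 critical value used in the Reed Auto example, so the test would reject. The residuals, however, are positive at both ends and negative in the middle, which shows that the true relationship is curved.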
37Using the Estimated Regression Equation for Estimation and Prediction
- Confidence Interval Estimate of $E(y_p)$
- $\hat{y}_p \pm t_{\alpha/2} s_{\hat{y}_p}$
- Prediction Interval Estimate of $y_p$
- $\hat{y}_p \pm t_{\alpha/2} s_{ind}$
- where confidence coefficient is $1 - \alpha$ and $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom
38Example Reed Auto Sales
- Point Estimation
- If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be
- $\hat{y} = 10 + 5(3) = 25$ cars
39Example Reed Auto Sales
- Confidence Interval for E(yp)
- 95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is
- $25 \pm 4.61$ = 20.39 to 29.61 cars
40Example Reed Auto Sales
- Prediction Interval for yp
- 95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is
- $25 \pm 8.28$ = 16.72 to 33.28 cars
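The two margins, 4.61 and 8.28, can be reproduced from the Reed Auto data. The sketch below assumes the standard formulas for the two standard deviations, which the slides do not spell out: for the mean response, s times the square root of 1/n plus the squared distance of the x value from the mean of x over the sum of squared x deviations; for an individual response, the same expression with 1 added under the square root. The t value 3.182 is taken from the example.

```python
import math

ads = [1, 3, 2, 1, 3]
b0, b1 = 10, 5            # estimated regression equation from the example
s = math.sqrt(14 / 3)     # standard error of the estimate
t_critical = 3.182        # t_.025 with 3 degrees of freedom, from the example
x_p = 3                   # number of TV ads of interest

n = len(ads)
x_bar = sum(ads) / n
sxx = sum((x - x_bar) ** 2 for x in ads)

y_hat_p = b0 + b1 * x_p
h_p = 1 / n + (x_p - x_bar) ** 2 / sxx   # shared term in both standard deviations

s_mean = s * math.sqrt(h_p)       # for the mean response E(y_p)
s_ind = s * math.sqrt(1 + h_p)    # for an individual response y_p

ci_margin = t_critical * s_mean
pi_margin = t_critical * s_ind

print(y_hat_p, round(ci_margin, 2), round(pi_margin, 2))
```

The prediction interval is wider because it covers the variability of a single week's sales as well as the uncertainty in the estimated mean.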
41Residual Analysis
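- Residual for Observation i
- $y_i - \hat{y}_i$
- Standardized Residual for Observation i
- $\dfrac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}}$
- where
- $s_{y_i - \hat{y}_i} = s\sqrt{1 - h_i}$ and $h_i = \dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{\sum (x_i - \bar{x})^2}$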
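A minimal sketch of the residual calculations for the Reed Auto data follows. It assumes the usual textbook form for the standard deviation of a residual: s times the square root of 1 minus h, where h is 1/n plus the squared x deviation over the sum of squared x deviations. Variable names are illustrative.

```python
import math

ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]
b0, b1 = 10, 5   # estimated regression equation from the example

n = len(ads)
x_bar = sum(ads) / n
sxx = sum((x - x_bar) ** 2 for x in ads)

residuals = [y - (b0 + b1 * x) for x, y in zip(ads, cars)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))  # standard error of the estimate

standardized = []
for x, e in zip(ads, residuals):
    h = 1 / n + (x - x_bar) ** 2 / sxx            # leverage of this observation
    standardized.append(e / (s * math.sqrt(1 - h)))

print(residuals)
print([round(z, 3) for z in standardized])
```

Plotting either list against x gives the kind of residual plot used to check the model assumptions.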
42Example Reed Auto Sales
43Example Reed Auto Sales
44Residual Analysis
[Figure: residual plot, Residual versus x with a reference line at 0, titled "Good Pattern"]
45Residual Analysis
[Figure: residual plot, Residual versus x with a reference line at 0, titled "Nonconstant Variance"]
46Residual Analysis
[Figure: residual plot, Residual versus x with a reference line at 0, titled "Model Form Not Adequate"]