Title: Chapter 10: Simple Linear Regression
Chapter 10 Simple Linear Regression
- A model in which a variable, X, explains another
variable, Y, using a linear structure, with
allowance for error, e, the unexplained part of Y:
Y = b + mX + e
Regression Analysis Assesses Two Sets of Issues
- How well does X explain Y? (Regression Analysis)
- Do the regression residuals behave as they
theoretically should? (Residuals Analysis)
Regression Analysis: 4 issues
- R2, the coefficient of determination: evaluates the
fit of the regression line to the data. 0 ≤ R2 ≤ 1.
Ideally, R2 = 1.
- SE, the standard error of the regression: measures
the scatter of the actual data points around the
regression line. The SE is measured in units of
Y, and ideally, SE = 0. Can also compare SE to
average(Y) and obtain a Coefficient of Variation
to assess the magnitude of SE.
- ANOVA table: Significance F is the p-value for the
test of the null hypothesis that the regression
line is statistically insignificant (H0: m = 0
vs. Ha: m ≠ 0).
- Coefficients table: reports the estimated
intercept and slope for the regression line,
their respective standard errors, test
statistics, and p-values for their statistical
significance (Ha: m ≠ 0 and Ha: b ≠ 0, versus
H0: m = 0 and H0: b = 0, respectively).
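The quantities in the list above can be computed by hand for a small data set. The following is a minimal sketch, assuming five made-up (X, Y) pairs that are not taken from the slides; the variable names are illustrative. It stops at the t statistics, because turning them into p-values needs a t distribution, which the Python standard library does not provide.

```python
import math

# Hypothetical data for illustration only (not taken from the slides).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n
s_xx = sum((x - x_bar) ** 2 for x in X)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

m = s_xy / s_xx          # estimated slope
b = y_bar - m * x_bar    # estimated intercept

residuals = [y - (b + m * x) for x, y in zip(X, Y)]
sse = sum(e ** 2 for e in residuals)      # unexplained variation
sst = sum((y - y_bar) ** 2 for y in Y)    # total variation

r2 = 1 - sse / sst                # coefficient of determination
se = math.sqrt(sse / (n - 2))     # standard error of the regression, in units of Y
cv = se / y_bar                   # SE relative to average(Y)

# Coefficients table: standard errors and t statistics for H0: m = 0 and H0: b = 0.
se_m = se / math.sqrt(s_xx)
se_b = se * math.sqrt(1 / n + x_bar ** 2 / s_xx)
t_m = m / se_m
t_b = b / se_b

print(f"R2 = {r2:.3f}, SE = {se:.3f}, CV = {cv:.3f}")
print(f"slope     m = {m:.3f}, std err = {se_m:.3f}, t = {t_m:.3f}")
print(f"intercept b = {b:.3f}, std err = {se_b:.3f}, t = {t_b:.3f}")
```

With one explanatory variable, the square of the slope's t statistic equals the F statistic of the ANOVA table, so the two tests of H0: m = 0 agree.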
Regression Statistics: Coefficient of
Determination, r2, and Standard Error
Chapter 10, Regression Analysis
ANOVA
             df       SS    MS                   F         Significance F
Regression   k        SSR   MSR = SSR/k          MSR/MSE   P-value of the F test
Residuals    n-k-1    SSE   MSE = SSE/(n-k-1)
Total        n-1      SST
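The table entries follow directly from the observed and fitted values. The sketch below assumes a hypothetical helper, `anova_table`, and made-up observations with their least-squares fitted values (2.2 + 0.6X for X = 1 to 5); neither comes from the slides. It computes everything up to F. Significance F would additionally need the upper tail of an F distribution with (k, n-k-1) degrees of freedom, which is left out here because the standard library has no F distribution.

```python
def anova_table(y, y_hat, k=1):
    """Return the ANOVA quantities for observed y and fitted y_hat with k predictors."""
    n = len(y)
    y_bar = sum(y) / n
    ssr = sum((f - y_bar) ** 2 for f in y_hat)           # explained by the regression
    sse = sum((a - f) ** 2 for a, f in zip(y, y_hat))    # left in the residuals
    sst = sum((a - y_bar) ** 2 for a in y)               # total; equals SSR + SSE for a least-squares fit
    msr = ssr / k
    mse = sse / (n - k - 1)
    return {
        "df": (k, n - k - 1, n - 1),
        "SSR": ssr, "SSE": sse, "SST": sst,
        "MSR": msr, "MSE": mse, "F": msr / mse,
    }

# Hypothetical observations and their least-squares fitted values.
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

table = anova_table(y, y_hat)
for name, value in table.items():
    print(name, value)
```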
Estimating the Regression Line Using Least Squares
Two assumptions give 2 equations to solve for 2 unknowns: intercept b0 and slope b1
Assumption                                    Equation
Unbiased explanation                          Σe = 0
Explanatory factor, X, uncorrelated with e    ΣXe = 0
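Substituting e = Y - b0 - b1·X into the two conditions turns them into a 2x2 linear system in b0 and b1. A minimal sketch of solving it, assuming a hypothetical function name and made-up data (neither is from the slides):

```python
def solve_normal_equations(x, y):
    """Solve sum(e) = 0 and sum(x*e) = 0 for (b0, b1), where e = y - b0 - b1*x."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(u * v for u, v in zip(x, y))
    # The two conditions expand to the linear system
    #   n  * b0 + sx  * b1 = sy
    #   sx * b0 + sxx * b1 = sxy
    det = n * sxx - sx * sx   # zero when all x values are identical (or x is empty)
    if det == 0:
        raise ValueError("x must contain at least two distinct values")
    b1 = (n * sxy - sx * sy) / det
    b0 = (sy * sxx - sx * sxy) / det
    return b0, b1

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = solve_normal_equations(x, y)
e = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

print("intercept b0 =", b0, "slope b1 =", b1)
# Both sums should be zero up to floating-point rounding.
print("sum(e) =", sum(e), "sum(x*e) =", sum(xi * ei for xi, ei in zip(x, e)))
```

Checking the two sums after the fit is a quick way to confirm that the estimates really satisfy both assumptions.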
Coeff. table
Residuals Analysis: 3 issues
- Normality of residuals: checked by constructing
a histogram of the residuals, a Box-Whisker
Plot of the residuals, or a Normal Probability
Plot of the residuals with the assistance of
MS Excel.
- The residuals plot should show no pattern or
regularities in the scatterplot between X and e.
Otherwise, the linear model inconsistently
explains Y as a function of X, and a nonlinear
function of X would better explain Y.
- Autocorrelation of the residuals can be tested by
using Excel to compute the Durbin-Watson
statistic from the residuals calculated by the
Regression process. 1.4 < DW < 2.6 indicates no
significant autocorrelation.
1. Checking for Normality of Residuals
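The coordinates of a Normal Probability Plot can be produced without a spreadsheet. The sketch below is one possible construction, not necessarily the one Excel uses: it assumes the plotting position (i - 0.5)/n, made-up residuals, and hypothetical function names. Points that fall close to a straight line are consistent with normal residuals, and the correlation printed at the end is only a rough summary of that straightness.

```python
import math
from statistics import NormalDist

def normal_plot_points(residuals):
    """Pair each sorted residual with the standard normal quantile for its rank."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Plotting position (i - 0.5) / n is one common convention; others exist.
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

def correlation(pairs):
    """Pearson correlation of the (quantile, residual) pairs."""
    n = len(pairs)
    mq = sum(q for q, _ in pairs) / n
    mr = sum(r for _, r in pairs) / n
    cov = sum((q - mq) * (r - mr) for q, r in pairs)
    vq = sum((q - mq) ** 2 for q, _ in pairs)
    vr = sum((r - mr) ** 2 for _, r in pairs)
    return cov / math.sqrt(vq * vr)

# Hypothetical residuals for illustration only.
residuals = [-0.8, 0.6, 1.0, -0.6, -0.2]

points = normal_plot_points(residuals)
for q, r in points:
    print(f"{q:8.3f} {r:8.3f}")   # normal quantile, sorted residual
print("straightness (correlation):", round(correlation(points), 3))
```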
2. Checking for Uniform Variation in Residuals
Relative to X
3. Checking for Autocorrelation in Residuals
Durbin-Watson Calculations
Sum of Squared Difference of Residuals 2123665.578
Sum of Squared Residuals 870949.4547
Durbin-Watson Statistic 2.438333897
Want this value to be close to 2.00
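The Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. The sketch below first reproduces the statistic from the two sums reported above, then applies the same formula to a short list of made-up residuals; the function name and the example residuals are assumptions, not part of the slides.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Ratio of the two sums reported in the Durbin-Watson calculations.
dw_reported = 2123665.578 / 870949.4547
print("DW from reported sums:", round(dw_reported, 6))
print("inside the 1.4 < DW < 2.6 band:", 1.4 < dw_reported < 2.6)

# Same formula on hypothetical residuals, listed in observation order.
example_residuals = [-0.8, 0.6, 1.0, -0.6, -0.2]
dw_example = durbin_watson(example_residuals)
print("DW for example residuals:", round(dw_example, 4))
```

The residuals must be in their original observation order, since the statistic depends on which residuals are adjacent.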