Title: Fitting curves through points: Regression
1Chapter 8
- Fitting curves through points: Regression
2Introduction
- This chapter considers data comprising a continuous independent variable x and a continuous response variable y
- Simplest model is a straight line
- Use the model to test the significance of the linear relationship and make predictions
- Least squares is the most common way to choose the best line
- demoLeastSquares.jsl
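To make the least squares idea concrete, here is a minimal Python sketch (not the demoLeastSquares.jsl script itself). The data are made up for illustration, and the function names `fit_line` and `sum_squared_residuals` are assumptions introduced here. It computes the intercept and slope that minimize the sum of squared vertical distances from the points to the line, then checks that a shifted line does worse.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sxx: sum of squared deviations of x; Sxy: sum of cross-products.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Made-up illustrative data (not the bank data from the slides).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, b0, b1)
# Shifting the fitted line up by 0.5 increases the sum of squared residuals.
worse = sum_squared_residuals(xs, ys, b0 + 0.5, b1)
print(b0, b1, best, worse)
```

Any line other than the fitted one gives a larger sum of squared residuals, which is the sense in which least squares chooses the "best" line.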
4Example
- Data from 232 branches of an East Coast bank
- First look at Number of New Accounts versus Number of Sales Staff during a particular one-year period
6- Similar to ANOVA except the fitted means are
constrained to fall on a line
7Linear Regression
ANOVA
8Regression Model
- Number of new accounts = 179.4 + (78.9 × sales staff) + residual
- Parameter estimates are subject to sampling variability
- Can test hypotheses, compute confidence intervals, etc.
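As a rough illustration of how sampling variability in the slope is quantified, the sketch below computes the standard error of the slope and the t statistic for testing a zero slope. The data are made up, and the function name `slope_inference` is an assumption introduced here; this is not JMP's output.

```python
import math


def slope_inference(xs, ys):
    """Return (slope, standard error of slope, t statistic for slope = 0)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and residual standard deviation (n - 2 df).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    se_slope = s / math.sqrt(sxx)
    t_stat = slope / se_slope
    return slope, se_slope, t_stat


# Made-up illustrative data (not the bank data from the slides).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, se_slope, t_stat = slope_inference(xs, ys)
print(slope, se_slope, t_stat)
```

Under the regression assumptions, the t statistic is compared with a t distribution on n - 2 degrees of freedom, and a confidence interval for the slope is the estimate plus or minus a t quantile times `se_slope`. The quantile is left out here because the Python standard library has no t distribution.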
9Confidence Curves Fit
10Regression Model Assumptions
- At each level of x, we have a random sample of y's
- At each level of x, the y's are normally distributed
- All those normal distributions have the same variance
11Residual Plots
- Use residual plots to check assumptions
- Residuals should be approximately normally distributed
- Residuals, when plotted against x or predicted y, should show no particular pattern
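A small sketch of what a "pattern" in the residuals looks like, assuming made-up data that follow a curve (y = x squared) rather than a line. The helper name `line_residuals` is an assumption introduced here. Fitting a straight line to curved data leaves residuals that are positive at both ends and negative in the middle.

```python
def line_residuals(xs, ys):
    """Fit a least-squares line and return the residuals (observed - fitted)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up curved data: y = x^2, so a straight line is the wrong model.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [x ** 2 for x in xs]

residuals = line_residuals(xs, ys)
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The U shape in the residuals signals that the straight-line model is missing curvature. Least-squares residuals always sum to zero, so the check is about shape, not about the average.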
13Forensic Example (FSI, 1994)
14Example: Growth.jmp
Exclude points to the left???
15Polynomial Models
16Transformations
fit special Splines? Piecewise?
17Tukey's Bulging Rule
x^2, y^2
y^2, sqrt x, log x, 1/x
x^2, sqrt y, log y, 1/y
sqrt x, sqrt y, log x, log y, 1/x, 1/y
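To show how one of the listed transformations can straighten a curved relationship, here is a minimal sketch on made-up exponential data (y = 3 exp(0.5 x)). The helper name `r_squared` is an assumption introduced here. It compares how well a straight line describes y versus log y.

```python
import math


def r_squared(xs, ys):
    """Squared correlation between x and y (R^2 of the straight-line fit)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


# Made-up data that grow exponentially: y = 3 * exp(0.5 * x).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [3.0 * math.exp(0.5 * x) for x in xs]

r2_raw = r_squared(xs, ys)                         # line fitted to y
r2_log = r_squared(xs, [math.log(y) for y in ys])  # line fitted to log y
print(r2_raw, r2_log)
```

Here log y is exactly linear in x, so the transformed fit is essentially perfect while the raw fit is not. With real data the improvement is rarely this clean, and the residual plot remains the check on whether the chosen transformation worked.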
18Another example: Cell Phone Use
Which transformation? Polynomial?
19Always look at the data!
21Prediction: Cottages Example
22Prediction for 3,000 sq. feet?
- Two components of uncertainty
- Don't know where the true line is
- Even if we knew where the true line is, the data are normally distributed about this line
- Confidence Curves Fit shows only the uncertainty about the true line
- Confidence Curves Indiv includes both components of uncertainty
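The two components of uncertainty can be sketched numerically. The example below uses made-up data (not the cottages data), and the function name `prediction_standard_errors` is an assumption introduced here. For a chosen x0 it returns the standard error of the fitted mean (uncertainty about the line) and the standard error for an individual new observation (line uncertainty plus scatter about the line).

```python
import math


def prediction_standard_errors(xs, ys, x0):
    """Return (predicted y, SE of fitted mean, SE of individual, residual SD) at x0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))  # residual standard deviation
    # Uncertainty about the line grows as x0 moves away from the mean of x.
    line_term = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    se_fit = s * math.sqrt(line_term)
    # An individual prediction adds the scatter of the data about the line.
    se_indiv = s * math.sqrt(1.0 + line_term)
    y_hat = intercept + slope * x0
    return y_hat, se_fit, se_indiv, s


# Made-up data: x could be size in thousands of square feet, y some response.
xs = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
ys = [10.2, 14.8, 21.1, 24.7, 30.9, 34.6]

y_hat, se_fit, se_indiv, s = prediction_standard_errors(xs, ys, 3.0)
_, se_fit_center, _, _ = prediction_standard_errors(xs, ys, 2.25)
print(y_hat, se_fit, se_indiv)
```

The individual standard error is always the larger of the two, since `se_indiv**2 = se_fit**2 + s**2`, and both widen as x0 moves away from the mean of x. Each interval is the prediction plus or minus a t quantile (n - 2 degrees of freedom) times the corresponding standard error.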
23Fit
Indiv
24Revisit Cell Phone Use
Prediction for Dec, 1999 (period 31)?