Title: Review of Lecture 13
1 Lecture 14
- Review of Lecture 13
- I Standard Regression Assumptions
  - a). about the form of the model
  - b). about the measurement errors
  - c). about the predictor variables
  - d). about the observations
- II Examples of Anscombe's Quartet Data show that
  - a). Gross violations of assumptions will lead to serious problems
  - b). Summary statistics may miss or overlook features of the data
- III Types of Residuals
  - a). Ordinary b). Standardized c). Studentized
- What will we talk about today?
- I Graphical Methods for Exploring Data Structures
  - a) Graphs before fitting b) Graphs after fitting
2 Graphical Methods
Graphical methods play an important role in data analysis, especially in
linear regression analysis. They can reveal important features that summary
statistics may miss, e.g., the scatter plots of Anscombe's data.
a). Graphs before fitting a model
    Functions: 1) Detect outliers 2) Suggest a model
b). Graphs after fitting a model
    Functions: 1) Check for assumption violations 2) Detect outliers
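The point about summary statistics can be checked numerically. The sketch below uses the published values of Anscombe's quartet (Anscombe, 1973); the helper `fit_line` is a name introduced here for illustration, not something from the lecture. It fits a least-squares line to each of the four data sets and reports the intercept, slope, and correlation.

```python
# Anscombe's quartet: sets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(v):
    return sum(v) / len(v)


def fit_line(x, y):
    """Return (intercept, slope, correlation) of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope, sxy / (sxx * syy) ** 0.5


summary = {name: fit_line(x, y) for name, (x, y) in quartet.items()}
for name, (b0, b1, r) in summary.items():
    print(f"{name}: intercept={b0:.2f} slope={b1:.2f} r={r:.2f}")
```

All four sets give nearly the same fitted line and correlation, so these numbers alone cannot distinguish a straight-line pattern from a curve or from a single extreme point. Only the scatter plots do that.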
3 - Graphs before fitting a model
  - Functions: 1) Detect outliers, high leverage points, or influential points
               2) Recognize patterns
               3) Explore the relationships between variables
  - Types: 1). One-dimensional
           2). Two-dimensional
           3). Rotating plot
4 - One-dimensional graphs
  - Histogram
  - Stem-and-leaf display
  - Dot plot
  - Box plot
  - Functions: (1) Show the distribution of a single variable
               (2) Detect outliers, high leverage points, or influential points
- Two-dimensional graphs
  - Matrix plot: pair-wise scatter plots
  - Purpose: explore patterns between pairs of variables
5 Stem-and-leaf of Y   N = 15
Leaf Unit = 0.10

  2   10  88
  3   11  0
  4   11  2
  6   11  45
  6   11
  7   11  9
 (1)  12  0
  7   12  3
  6   12  4
  5   12  66
  3   12  8
  2   13  01
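A stem-and-leaf display is simple to build by hand or in code. The sketch below is a minimal version, assuming non-negative data and one row per stem (the display above splits each stem into five rows and adds a depth column, which this sketch omits). The function name `stem_and_leaf` is introduced here for illustration; the 15 values are the ones read off the display above with leaf unit 0.10.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=0.1):
    """Return {stem: string of leaves}, one row per stem, for non-negative data."""
    rows = defaultdict(list)
    for v in sorted(values):
        scaled = round(v / leaf_unit)     # e.g. 10.8 -> 108 when leaf_unit is 0.1
        stem, leaf = divmod(scaled, 10)   # 108 -> stem 10, leaf 8
        rows[stem].append(str(leaf))
    return {stem: "".join(leaves) for stem, leaves in rows.items()}


# Y values recovered from the stem-and-leaf display (N = 15).
y = [10.8, 10.8, 11.0, 11.2, 11.4, 11.5, 11.9, 12.0,
     12.3, 12.4, 12.6, 12.6, 12.8, 13.0, 13.1]

display = stem_and_leaf(y)
for stem, leaves in display.items():
    print(f"{stem:3d} | {leaves}")
```

Each row keeps the actual digits, so the display shows the shape of the distribution like a histogram while still letting individual values, including any outliers, be read back.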
7 Drawback: when p > 1, the scatter plots of Y vs Xj may or may not show
linear patterns, even when Y and X1, X2, ..., Xp have a good or perfect
linear relationship.
Hamilton's Data
8 Hamilton's Data: Y, X1, X2
Fitted Results
Y vs X1: Y = 11.989 + 0.004 X1, t-test = 0.09, R_sq = 0.0
  Question: Is Y uncorrelated with X1?
Y vs X2: Y = 10.632 + 0.195 X2, t-test = 1.74, R_sq = 0.188
  Question: Is Y uncorrelated with X2?
Y vs X1, X2: Y = -4.515 + 3.097 X1 + 1.032 X2, F-test = 39222, R_sq = 1.0
  Question: Is Y almost perfectly linearly correlated with X1 and X2?
Question: What assumption is violated by Hamilton's Data?
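The same effect can be reproduced on a small made-up data set. The sketch below does not use Hamilton's data. It builds synthetic values in which X2 nearly mirrors X1 and Y is exactly X1 + X2, then compares the pair-wise correlations with the two-predictor least-squares fit. All names (`corr`, `wiggle`, and so on) are introduced here for illustration.

```python
def mean(v):
    return sum(v) / len(v)


def centered(v):
    m = mean(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def corr(u, v):
    cu, cv = centered(u), centered(v)
    return dot(cu, cv) / (dot(cu, cu) * dot(cv, cv)) ** 0.5


# Synthetic data (NOT Hamilton's): x2 is almost -x1, and y = x1 + x2 exactly.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
wiggle = [0.1, -0.1, 0.0, -0.1, 0.1]   # chosen to be orthogonal to centered x1
x2 = [-a + w for a, w in zip(x1, wiggle)]
y = [a + b for a, b in zip(x1, x2)]

# Pair-wise view: what the scatter plots of Y vs X1 and Y vs X2 would show.
r_y_x1 = corr(y, x1)
r_y_x2 = corr(y, x2)

# Joint view: two-predictor least squares via the centered 2x2 normal equations.
c1, c2, cy = centered(x1), centered(x2), centered(y)
s11, s22, s12 = dot(c1, c1), dot(c2, c2), dot(c1, c2)
s1y, s2y = dot(c1, cy), dot(c2, cy)
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)
fitted = [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
r_squared = 1 - sse / dot(cy, cy)

print(f"corr(Y, X1) = {r_y_x1:.3f}, corr(Y, X2) = {r_y_x2:.3f}")
print(f"joint fit: b1 = {b1:.3f}, b2 = {b2:.3f}, R_sq = {r_squared:.3f}")
```

In this construction both pair-wise correlations are close to zero while the joint fit is essentially perfect, which is the pattern the fitted results for Hamilton's data show. The strong (negative) relationship between the two predictors is what hides the structure from the pair-wise plots.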
9 Rotating Plots
3-dimensional plot: rotate the points in different directions so that the
three-dimensional structure becomes apparent.
Dynamic Graphs (p > 3)
Graphs are dynamic instead of static. Good for exploring structure and
relationships in more than 3 dimensions.
10 - b) Graphs after fitting a model
  - Functions: 1) Check assumptions
               2) Detect outliers, high leverage points, and influential points
               3) Diagnostic plots for the effect of variables
- Standardized residuals-based plots
1. Normal probability plot of standardized residuals: ordered standardized
residuals vs normal scores.
Function: check the normality assumption.
Main Idea: If the residuals are normally distributed, the ordered
standardized residuals should be approximately the same as the ordered
normal scores. In this case, the plot should resemble a (nearly) straight
line with intercept 0 and slope 1.
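The coordinates of a normal probability plot can be sketched as follows. Two things here are assumptions rather than part of the lecture: the normal scores are approximated with Blom's formula, the standard normal quantile at (i - 0.375) / (n + 0.25), and the residual values are hypothetical, made up only to show the mechanics.

```python
from statistics import NormalDist


def normal_scores(n):
    """Approximate expected standard-normal order statistics (Blom's formula)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def qq_points(std_residuals):
    """Pair each normal score with the matching ordered standardized residual."""
    ordered = sorted(std_residuals)
    return list(zip(normal_scores(len(ordered)), ordered))


# Hypothetical standardized residuals, for illustration only.
example = [0.3, -1.2, 0.9, -0.4, 1.6, -0.1, 0.0]
points = qq_points(example)
for score, resid in points:
    print(f"normal score {score:6.2f}   ordered residual {resid:6.2f}")
```

Plotting the second column against the first gives the normal probability plot. Points that track the line y = x support the normality assumption, while systematic bending at the ends suggests skewness or heavy tails.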
11 2. Scatter plots of standardized residuals against each of the predictor
variables.
Function: check the linearity or homogeneity assumptions on Xj.
Main Idea: Under the standard assumptions, the standardized residuals are
nearly uncorrelated with each of the predictor variables. In this case, the
residual points should be randomly scattered across the range of Xj, with no
visible pattern.
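A residual-versus-predictor check can be sketched for the one-predictor case. The definition used here is an assumption, since the lecture's exact formula is not shown in this transcript: each ordinary residual is divided by s * sqrt(1 - h_ii), where s is the residual standard error and h_ii is the leverage. The data are the second set of Anscombe's quartet, whose true pattern is curved.

```python
def standardized_residuals(x, y):
    """Residuals of the least-squares line, each scaled by s * sqrt(1 - h_ii)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [b - (intercept + slope * a) for a, b in zip(x, y)]
    s = (sum(e ** 2 for e in resid) / (n - 2)) ** 0.5      # residual standard error
    leverage = [1 / n + (a - mx) ** 2 / sxx for a in x]    # h_ii for simple regression
    return [e / (s * (1 - h) ** 0.5) for e, h in zip(resid, leverage)]


# Anscombe's quartet, data set II (curved relationship).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

r = standardized_residuals(x, y)
by_x = sorted(zip(x, r))   # order the residuals by the predictor value
for xi, ri in by_x:
    print(f"x = {xi:2d}   standardized residual = {ri:6.2f}")
```

Read in order of x, the residuals are negative at both ends and positive in the middle. That arch is a systematic pattern rather than a random scatter, so the plot flags a violation of the linearity assumption even though the fitted line's summary statistics look unremarkable.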
12 3. Scatter plots of standardized residuals against the fitted values.
Function: check independence and homogeneity (constant variance) of the
measurement errors, and linearity of the data.
Main Idea: Under the assumptions of independence and homogeneity of the
measurement errors and linearity of the data, the standardized residuals
are nearly uncorrelated with the fitted values. In this case, the residual
points should be randomly scattered across the range of the fitted values.
13 4. Index plot of standardized residuals, i.e., the scatter plot of
standardized residuals against the indices of the observations.
Function: check independence and homogeneity of the measurement errors, and
linearity of the data.
Main Idea: Under the assumption of independent errors, the standardized
residuals should be randomly scattered within a horizontal band around 0.
18 - After-class Questions
  - Are graphs made before fitting a model model-based?
  - Are graphs made after fitting a model model-based?
  - Why are graphs sometimes more useful than a statistic?