Title: Part II: Inference for MLR
Slide 1: Chapter 9
- Part II: Inference for MLR
Slide 2: SLR alternative
- So far we have discussed the different parts of the JMP output: Summary of Fit, Parameter Estimates, and (just barely) Analysis of Variance (ANOVA from here on)
- We can use the Parameter Estimates to get a CI and do a hypothesis test for H0: β1 = 0 vs HA: β1 ≠ 0
- There is another way to do this using the ANOVA table in SLR
Slide 3: SLR alternative
- Fact: In an SLR context, under the SLR model, if H0: β1 = 0 is true, the ratio F = MSmodel/MSE has a so-called F(1, n-2) distribution
Slide 4: F-distribution
- An F-distribution with degrees of freedom ν1 and ν2 is labeled F(ν1, ν2)
- Table B.6A-E gives some F-distribution quantiles
Slide 5: F-distribution
- The F-distribution quantile tables are very similar to t-tables
Slide 6: F-distribution
- Example: U ~ F(2,8). Find a so that P(U > a) = 0.05.
- The .95 quantile for F(2,8) is 4.46
- i.e., P(F(2,8) < 4.46) = 0.95, so P(F(2,8) > 4.46) = 0.05
- Example: V ~ F(1,6). Find a so that P(V > a) = 0.01.
- The .99 quantile for F(1,6) is 13.75
- i.e., P(F(1,6) < 13.75) = 0.99, so P(F(1,6) > 13.75) = 0.01
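These table quantiles can be sanity-checked numerically. A minimal sketch (not part of the course software; it just integrates the F density with Simpson's rule, and only handles ν1 ≥ 2 so the density is finite at 0):

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F(d1, d2) distribution (use d1 >= 2 so pdf(0) is finite)."""
    b = math.gamma(d1 / 2) * math.gamma(d2 / 2) / math.gamma((d1 + d2) / 2)
    return ((d1 / d2) ** (d1 / 2) * x ** (d1 / 2 - 1)
            * (1 + d1 * x / d2) ** (-(d1 + d2) / 2) / b)

def f_cdf(x, d1, d2, n=2000):
    """P(F(d1, d2) <= x) by Simpson's rule on [0, x]."""
    h = x / n
    s = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return s * h / 3

# Table B.6 check: the .95 quantile of F(2,8) is 4.46,
# so the right-tail probability P(F(2,8) > 4.46) should be 0.05.
print(round(1 - f_cdf(4.46, 2, 8), 3))  # 0.05
```

For F(2,8) this can also be confirmed in closed form, since with ν1 = 2 the right-tail probability is (1 + 2x/ν2)^(-ν2/2).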
Slide 7: F-distribution
- Find the p-value for the following:
- f = 4, ν1 = 3, ν2 = 10: at the ν1 = 3, ν2 = 10 spot, Q(.95) = 3.71 < 4 < 6.55 = Q(.99), so .01 < p-value < .05
- f = 10, ν1 = 2, ν2 = 20: at the ν1 = 2, ν2 = 20 spot, Q(.999) = 9.95 < 10, so p-value < .001
- f = 1, ν1 = 8, ν2 = 30: at the ν1 = 8, ν2 = 30 spot, 1 < 1.37 = Q(.75), so p-value > .25
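The lookup pattern above can be written as a small helper. A sketch (the function name and dict layout are my own, not from the notes): given the table quantiles for one (ν1, ν2) row, it brackets the p-value of an observed f.

```python
def bracket_p_value(f, table):
    """table: {quantile: table value} for one (v1, v2) row of Table B.6.
    Returns (lo, hi) with lo < p-value < hi."""
    lo, hi = 0.0, 1.0
    for q, cutoff in sorted(table.items()):
        if f > cutoff:
            hi = 1 - q   # f is beyond Q(q), so p-value < 1 - q
        else:
            lo = 1 - q   # f is below Q(q), so p-value > 1 - q
            break
    return lo, hi

# f = 4 with v1 = 3, v2 = 10: Q(.95) = 3.71 < 4 < 6.55 = Q(.99)
lo, hi = bracket_p_value(4, {0.95: 3.71, 0.99: 6.55})
print(round(lo, 3), "< p-value <", round(hi, 3))  # 0.01 < p-value < 0.05
```

With only one quantile in the row, the helper still behaves sensibly: bracket_p_value(10, {0.999: 9.95}) gives (0, .001), and bracket_p_value(1, {0.75: 1.37}) gives (.25, 1), matching the slide.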
Slide 8: SLR alternative
- Fact: The square of the t-statistic for testing H0: β1 = 0 is the F ratio (t² = F), which has an F(1, n-2) distribution if H0 is true and tends to be larger if H0 is false
- i.e., we can use large F as evidence against H0 as a sensible testing method
Slide 9: SLR alternative
- These calculations are summarized in the ANOVA table
- ANOVA table from SLR (for testing H0: β1 = 0)
- F is the test statistic for this test, and the table gives the corresponding p-value as well
Slide 10: SLR alternative
- Example: stress/time till failure
- x = uniaxial stress applied (kg/mm²)
- y = time till fracture (hours)
Slide 11: SLR alternative
- [JMP output for the stress data: Analysis of Variance, Parameter Estimates, and Summary of Fit panels]
Slide 12: SLR alternative
- Stress/time till failure ANOVA table
- F = 13.77 on F(1,8) degrees of freedom, p-value = 0.006: reject H0 and conclude β1 ≠ 0
- Using the Parameter Estimates table yields the exact same result
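The two approaches agree exactly because t² = b1²·Sxx/MSE = SSM/MSE = F in SLR. A sketch with made-up data (any numbers would do):

```python
# Hypothetical (x, y) pairs, roughly linear.
x = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 12.8]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

fitted = [b0 + b1 * xi for xi in x]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ssm = sum((fi - ybar) ** 2 for fi in fitted)
mse = sse / (n - 2)                  # error mean square
f_ratio = ssm / mse                  # ANOVA F, df = (1, n - 2)
t_slope = b1 / (mse / sxx) ** 0.5    # t statistic for H0: beta1 = 0
print(abs(t_slope ** 2 - f_ratio) < 1e-6)  # True
```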
Slide 13: Multiple Linear Regression Review
- MLR: the term used to describe fitting equations with multiple experimental (x) variables, e.g., Yi = β0 + β1X1i + β2X2i or Yi = β0 + β1Xi + β2Xi²
- The Principle of Least Squares is still used to fit such models: minimize the sum of squared residuals Σ(yi - ŷi)²
- Recall hand formulas don't work here, so we rely on JMP to get our least squares estimates
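Under the hood, software solves the normal equations X'X b = X'y. A rough sketch with made-up data (a tiny Gaussian-elimination solver for illustration, not how JMP actually computes it), fitting yi = b0 + b1·xi + b2·xi²:

```python
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.2, 3.9, 6.1, 8.2, 9.9, 12.1]          # made-up response values
X = [[1.0, x, x * x] for x in xs]             # design matrix columns: 1, x, x^2
k = len(X[0])

# Build the normal equations X'X b = X'y.
XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
Xty = [sum(r[i] * y for r, y in zip(X, ys)) for i in range(k)]

# Solve by Gaussian elimination with partial pivoting.
A = [row[:] + [rhs] for row, rhs in zip(XtX, Xty)]
for c in range(k):
    pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
    A[c], A[pivot] = A[pivot], A[c]
    for r in range(c + 1, k):
        m = A[r][c] / A[c][c]
        A[r] = [a - m * b for a, b in zip(A[r], A[c])]
b = [0.0] * k
for c in reversed(range(k)):
    b[c] = (A[c][k] - sum(A[c][j] * b[j] for j in range(c + 1, k))) / A[c][c]

print([round(v, 4) for v in b])               # least squares estimates b0, b1, b2
```

A useful check: at the least squares solution, the residuals are orthogonal to every column of X, which is exactly what the normal equations say.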
Slide 14: Example
Slide 15: Example
- Model II: yi = β0 + β1xi + β2xi² + ei
Slide 16: Example
- Model III: yi = β0 + β1xi + β2xi² + β3xi³ + ei
Slide 17: Example
- Use the output from the three models to write out the following models with appropriate estimates:
- Model that predicts the mean of y for all x
- Polynomial model of degree 1
- Polynomial model of degree 2
- Polynomial model of degree 3
Slide 18: Example
- Which model should we use and why?
- So far, we can only look at residual plots and R2
- Model I: R2 = 0.211 and the residual plot is quadratic (BAD)
- Model II: R2 = 0.873 and the residual plot is random (not too bad)
- Model III: R2 = 0.876 and the residual plot is random, not too bad but slightly worse than Model II
Slide 19: Multiple Linear Regression Review
- Common models, each with ei iid N(0, σ²):
- Constant mean: yi = β0 + ei
- Simple linear regression: yi = β0 + β1xi + ei
- Multiple linear regression: yi = β0 + β1x1i + β2x2i + ... + βkxki + ei
Slide 20: Multiple Linear Regression Review
- Example: A table in the textbook contains data from the operation of a plant for the oxidation of ammonia to nitric acid. In plant operation, the nitric oxides produced are absorbed in an absorption tower. The three experimental (predictor/x) variables are x1 (the rate of operation of the plant), x2 (cooling water inlet temperature), and x3 (acid concentration, which is the percent circulating minus 50, times 10). The response variable is y (stack loss: ten times the percentage of ingoing ammonia that escapes the absorption column unabsorbed, i.e., an inverse measure of overall plant efficiency).
- Note: In any model fitting exercise, the first step should be to visualize the data. For multiple linear regression models, a good place to start is by examining the correlation matrix and all possible bivariate scatterplots.
Slide 21: Multiple Linear Regression Review
- Producing the correlation matrix and all possible scatterplots in JMP
- Directions: Click Analyze, then Multivariate Methods, and finally Multivariate. For each variable, highlight the name of the variable and click Y, Columns. Click OK.
Slide 22: Multiple Linear Regression Review
- If I gave you the following models with their respective R2 values, which model would you choose?
- Model 1: R2 = .950
- Model 2: R2 = .695
- Model 3: R2 = .165
- Model 4: R2 = .973 (.023 more than Model 1)
- Model 5: R2 = .952
- Model 6: R2 = .975 (.002 more than Model 4: too small an improvement?)
Slide 23: Multiple Linear Regression Review
- What model would you choose?
- The 1st model with R2 = .950?
- The 4th model with R2 = .973?
- The 6th model with R2 = .975?
- Is the difference between the 4th and 6th significant enough to warrant adding an extra term to the model? No!
- From the calculation of R2, adding terms inflates R2 slightly regardless of whether it helped the model
- Check R2adj instead
Slide 24: Multiple Linear Regression Review
- If I gave you the following models with their respective R2 and R2adj values, which model would you choose?

Model | R2   | R2adj
1     | .950 | .947
2     | .695 | .674
3     | .165 | .109
4     | .973 | .969
5     | .952 | .945
6     | .975 | .969
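R2adj applies a degrees-of-freedom penalty: R2adj = 1 - (1 - R2)(n - 1)/(n - p). A sketch with a hypothetical n = 15 and parameter counts (the notes don't state n or p for these six models) showing how a tiny R2 gain can still lower R2adj:

```python
def r2_adj(r2, n, p):
    """Adjusted R^2 for n observations and p parameters (betas incl. intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - p)

# Hypothetical: n = 15; a model with p = 4 vs the same model plus one more term.
print(round(r2_adj(0.973, 15, 4), 3))  # 0.966
print(round(r2_adj(0.975, 15, 5), 3))  # 0.965 : R2 rose, R2adj fell
```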
Slide 25: Multiple Linear Regression Review
- When building statistical models, we must be careful not to put everything but the kitchen sink into the model
- Occam's razor: all things being equal, the simplest solution tends to be the best one
- Recall with MLR: we need to look at residual plots for the whole model (observed vs predicted) and for each variable individually
Slide 26: Inference for MLR
- If yi = β0 + β1x1i + ... + βkxki + ei and E(ei) = 0, then E(yi) = β0 + β1x1i + ... + βkxki
- Note: our assumptions for the model don't change when you add variables
- i.e., we still need ei iid N(0, σ²), which can still be checked using residual plots
- From the SLR notes, we know how to get a confidence interval and perform appropriate hypothesis tests for β0 and β1. What happens to each part of the JMP output when we add variables and include polynomials?
Slide 27: Example
Slide 28: Example
- Model II: yi = β0 + β1xi + β2xi² + ei
Slide 29: Example
- Model III: yi = β0 + β1xi + β2xi² + β3xi³ + ei
Slide 30: Inference for MLR
- Summary of Fit box (RSquare, Mean of Response, Observations): nothing changes
- For MLR, we generally look at R2adj instead of R2
- Parameter Estimates (Term, Estimate, Std Error, t Ratio, Prob>|t|): nothing changes, simply add more terms
- It produces t-test results for every term
Slide 31: Inference for MLR
- What will happen to the ANOVA table? (Source, DF, SS, MS, F Ratio)
- The DF change to reflect the number of parameters in the model
- The F-test is different depending on the model
- Rows stay the same; column values/interpretations change as the model changes
Slide 32: Inference for MLR
- In general, the ANOVA table for n observations and p parameters in the model:

Source   | DF    | Sum of Squares   | Mean Square | F Ratio
Model    | p - 1 | SSM (from table) | SSM/(p - 1) | MSmodel/MSE
Error    | n - p | SSE (from table) | SSE/(n - p) | (Prob > F gives the p-value)
C. Total | n - 1 | SST (from table) |             |

- Recall p is the number of parameters (the number of β's, including β0) and n is the number of observations
- This performs a test of H0: β1 = β2 = ... = βp-1 = 0 vs HA: at least one βi ≠ 0, i = 1, 2, ..., p - 1
Slide 33: Inference for MLR
- We can use the F-tests to compare two models!
- What does R2 represent? R2 = SSM/SST, the fraction of total variation explained by the model
- How do you calculate SSTot? SST = Σ(yi - ȳ)²
Slide 34: Inference for MLR
- We can use the F-tests to compare two models!
- How do you calculate SSE? SSE = Σ(yi - ŷi)²
- How do you calculate SSM? SSM = Σ(ŷi - ȳ)² = SST - SSE
Slide 35: Inference for MLR
- Recall the 3 models compared earlier; compare the following values
- Note:
- For all three models, SST stays the same (the predicted values do not appear in SST = Σ(yi - ȳ)², so it depends only on the data)
- As model complexity increases, so does SSM, which means SSE decreases
Slide 36: Inference for MLR
- Using these ideas, we can test to find the better of two nested models
- Nested models: all the parameters in the previous (reduced) model are contained in the current (full) one, along with at least one more
- Example: the following are nested models
Slide 37: Inference for MLR
- We can compare the ANOVA tables for the polynomial degree 1 and polynomial degree 2 models
- p - 1 = 1, so there is 1 non-intercept term; this is our degree 1 polynomial. It tests H0: β1 = 0 vs HA: β1 ≠ 0
- p - 1 = 2, so there are 2 non-intercept terms; this is our degree 2 polynomial. It tests H0: β1 = β2 = 0 vs HA: at least one βi ≠ 0, i = 1, 2
Slide 38: Inference for MLR
- Example: Fill in the missing blanks in the ANOVA table for the model yi = β0 + β1x1 + β2x2 + ei
- State the hypotheses and interpret the p-value for this test
Slide 39: Inference for MLR
- Example: Fill in the missing blanks in the ANOVA table for the model yi = β0 + β1x1 + β2x2 + ei
- From the table, the total DF is 42, so n = 43; from the model, we know p = 3
- 1. DFModel = p - 1 = 2
- 2. DFError = DFTotal - DFModel = 40
- 3. SSE = SST - SSM = 80,000
- 4. MSM = SSM/DFM = 10,000
- 5. MSE = SSE/DFE = 2,000
- 6. F = MSM/MSE = 5
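The fill-in arithmetic can be verified directly. A sketch assuming SSM = 20,000 and SST = 100,000 (values implied by MSM = 10,000 and SSE = 80,000; the actual table entries are not shown in these notes):

```python
p, df_error = 3, 40
ssm, sst = 20_000.0, 100_000.0     # assumed table values
df_model = p - 1                   # degrees of freedom for the model
sse = sst - ssm                    # SSE = SST - SSM
msm = ssm / df_model               # mean square for the model
mse = sse / df_error               # mean square error
f_ratio = msm / mse
print(df_model, sse, msm, mse, f_ratio)  # 2 80000.0 10000.0 2000.0 5.0
```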
Slide 40: Inference for MLR
- State the hypotheses that this F-ratio is testing and find the p-value
- H0: β1 = β2 = 0 vs. HA: at least one βi ≠ 0 for i = 1, 2
- This F-ratio follows an F(2,40) distribution, so the p-value is P(F(2,40) > 5)
- Q(.95) = 3.23 < 5 < 5.18 = Q(.99), so .01 < p-value < .05
Slide 41: Inference for MLR
- In general, when we compare two nested models, the test statistic is F = [(SSEred - SSEfull)/(dfred - dffull)] / [SSEfull/dffull], where df = n - p for each model
- Note: FTR (failing to reject) H0 means choose the reduced model; rejecting H0 means choose the full model
Slide 42: Inference for MLR
- Use the polynomial degree 2 (reduced) and degree 3 (full) models from before to test these two nested models
- Models: reduced yi = β0 + β1xi + β2xi² + ei; full yi = β0 + β1xi + β2xi² + β3xi³ + ei (the extra term is β3xi³)
- Hypotheses: H0: β3 = 0 (reduced model is adequate) vs HA: β3 ≠ 0 (full model)
- Other important terms:
- SSEfull = 683,453, pfull = 4, dffull = 11
- SSEred = 702,153.0, pred = 3, dfred = 12
- n = 15
Slide 43: Inference for MLR
- Calculate the test statistic: F = [(702,153.0 - 683,453)/(12 - 11)] / (683,453/11) = 18,700/62,132.1 ≈ 0.30
- Determine P(F(ν1,ν2) > f) and state the appropriate conclusion
- P(F(1,11) > 0.30): 0.30 is below the .75 quantile, so p-value > 0.25
- With a large p-value (larger than 0.05), we FTR H0 and conclude that the cubic term is not useful to add to the model, so we should go with the reduced model
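The partial F computation above takes only a few lines of code (values taken from the slide):

```python
sse_full, df_full = 683_453.0, 11   # degree 3 (full) model
sse_red,  df_red  = 702_153.0, 12   # degree 2 (reduced) model
f = ((sse_red - sse_full) / (df_red - df_full)) / (sse_full / df_full)
print(round(f, 2))  # 0.3
```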