Title: Prediction
1. Prediction
- Confidence Intervals, Cross-validation, and Predictor Selection
2. Skill Set
- Why is the confidence interval for an individual point larger than for the regression line?
- Describe the steps in forward (backward, stepwise, blockwise, all possible regressions) predictor selection.
- What is cross-validation? Why is it important?
- What are the main problems as far as R-square and prediction are concerned with forward (backward, stepwise, blockwise, all possible regressions) predictor selection?
3. Prediction vs. Explanation
- Prediction is important for practice
  - WWII pilot training: ability tests (e.g., eye-hand coordination) and items such as "built an airplane that flew," "fear of heights," and "favorite flavor of ice cream"
  - Age and driving accidents
- Explanation is crucial for theory. Highly correlated variables may not help predict, but may help explain, e.g., team outcomes as a function of team resources and team backup.
4. Confidence Intervals
CI for the line, i.e., the mean score at a given value X':
  Y' ± t(α/2, N−k−1) · √{ MSres [ 1/N + (X' − X̄)² / Σ(X − X̄)² ] }
Note the shape: the band is narrowest at X̄ and curves away from the line as X' moves from the mean.
MSres is the variance of the residuals; N is the sample size. The df are those for MSres (N − k − 1).
CI for a single person's score:
  Y' ± t(α/2, N−k−1) · √{ MSres [ 1 + 1/N + (X' − X̄)² / Σ(X − X̄)² ] }
The extra 1 under the radical reflects the variance of individual scores about the line.
5. Computing Confidence Intervals
Suppose we have a sample with N = 20 and k = 1 predictor. Find the CI for the line (mean) at X1.
df = N − k − 1 = 20 − 1 − 1 = 18.
CI: 3.81 to 7.79.
For an individual at X1, what is the CI?
CI: 0.29 to 11.31.
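As a minimal sketch of how both intervals are computed: the values below (Y' = 5.8, MSres = 6, X̄ = 3, Σ(X − X̄)² = 40) are assumed for illustration, not taken from the original example; they approximately reproduce the intervals above.

```python
import numpy as np
from scipy import stats

# Illustrative values only (assumed, not the slide's original data):
y_hat = 5.80    # predicted score Y' at X1
ms_res = 6.0    # mean square residual (variance of residuals)
n, k = 20, 1    # sample size and number of predictors
x1, x_bar, ss_x = 1.0, 3.0, 40.0   # X1, mean of X, sum of squares of X

t = stats.t.ppf(0.975, n - k - 1)  # t for a 95% CI, df = 18

# CI for the line (mean score) at X1
se_mean = np.sqrt(ms_res * (1 / n + (x1 - x_bar) ** 2 / ss_x))
print(f"mean:       {y_hat - t * se_mean:.2f} to {y_hat + t * se_mean:.2f}")

# CI for a single person's score at X1 (note the extra 1)
se_ind = np.sqrt(ms_res * (1 + 1 / n + (x1 - x_bar) ** 2 / ss_x))
print(f"individual: {y_hat - t * se_ind:.2f} to {y_hat + t * se_ind:.2f}")
```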
6. Review
- Why is the confidence interval for the individual wider than a similar interval for the regression line?
- Why are the confidence intervals for the regression line curved instead of being straight lines?
7. Shrinkage
R² is biased (the sample value is too large) because the regression capitalizes on chance to minimize SSe in the sample.
If the population value of R² is zero, the expected value in the sample is E(R²) = k/(N − 1), where k is the number of predictors and N is the number of people in the sample. If you have many predictors, you can make R² as large as you want.
What is the expected value of R-square if N = 101 and k = 10? (Here, 10/100 = .10.) There is an ethical issue here.
A common adjustment or shrinkage formula:
  Adj R² = 1 − (1 − R²)(N − 1)/(N − k − 1)
This is reported by SAS (PROC REG) under "Adj R-Sq." It adjusts for both k and N and the size of the initial R².
8. Shrinkage Examples
Suppose R² = .6405 with k = 4 predictors and a sample size of 30. Then Adj R² = .583. Varying N (with k = 4):

R² = .6405:
  N     Adj R²
  15    .497
  30    .583
  100   .625

R² = .30:
  N     Adj R²
  15    .020
  30    .188
  100   .271

Note: small N means lots of shrinkage, but a smaller initial R² also shrinks more.
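The table above follows directly from the adjustment formula; a minimal sketch:

```python
def adjusted_r2(r2, n, k):
    """Shrinkage adjustment reported by SAS PROC REG as Adj R-Sq."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Reproduce the table above (k = 4 predictors throughout)
for r2 in (.6405, .30):
    for n in (15, 30, 100):
        print(f"R2 = {r2:.4f}, N = {n:3d} -> Adj R2 = {adjusted_r2(r2, n, 4):.3f}")
```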
9. Cross-Validation
- Compute a and b(s) (can have one or more IVs) on an initial sample.
- Find a new sample; do not re-estimate a and b, but use the original a and b to find Y' (the predicted scores).
- Compute the correlation between Y and Y' in the new sample and square it. Ta da! Cross-validation R².
- Cross-validation R² does not capitalize on chance and estimates the operational R² (see the sketch below).
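A minimal sketch of these steps, using simulated data in place of the two real samples (the data, sample sizes, and weights below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sample(n):
    """Simulated (X, y) pairs standing in for a real sample."""
    X = rng.normal(size=(n, 2))                       # two IVs
    y = X @ np.array([0.5, 0.3]) + rng.normal(size=n)
    return X, y

X1, y1 = make_sample(100)   # derivation sample
X2, y2 = make_sample(100)   # cross-validation sample

# Step 1: estimate a and b(s) on the initial sample only
D1 = np.column_stack([np.ones(len(y1)), X1])
coef, *_ = np.linalg.lstsq(D1, y1, rcond=None)

# Step 2: apply those same weights to the new sample (no re-estimation)
D2 = np.column_stack([np.ones(len(y2)), X2])
y_prime = D2 @ coef

# Step 3: correlate Y with Y' in the new sample and square it
r = np.corrcoef(y2, y_prime)[0, 1]
print(f"cross-validation R2 = {r ** 2:.3f}")
```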
10. Cross-validation (2)
- Double cross-validation
- Data splitting
- Expert judgment weights (don't try this at home)
- Mathematical estimates
  - Fixed
  - Random
11. Review
- What is shrinkage in the context of multiple regression? What are the things that affect the expected amount of shrinkage?
- What is cross-validation? Why is it important?
12. Predictor Selection
- Widely misunderstood and widely misused.
- Algorithms labeled forward, backward, stepwise, etc.
- NEVER use them for work involving theory or explanation (hint: this clearly means your thesis and dissertation).
- NEVER use them for estimating the importance of variables.
- Use them SOLELY for economy (tossing out predictors).
13. All Possible Regressions
Data from a Pedhazur example. GPA (the criterion) is grade point average. GREQ is the Graduate Record Exam, Quantitative. GREV is GRE Verbal. MAT is the Miller Analogies Test. AR is an Arithmetic Reasoning test.
14. All Possible Regressions (2)
Note how easy it is to choose the model with the highest R² for any given number of predictors (a sketch follows below). In predictor selection, you also need to worry about cost: you get both V and Q GRE scores from a single test. Also consider what a change in R² means in practical terms, e.g., accuracy in predicting dropout.
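A rough sketch of the all-possible-regressions idea, with simulated stand-ins for the Pedhazur variables (the data and coefficients below are assumptions, not the original data set):

```python
from itertools import combinations
import numpy as np

rng = np.random.default_rng(1)

# Simulated stand-ins for GREQ, GREV, MAT, AR predicting GPA
n = 60
names = ["GREQ", "GREV", "MAT", "AR"]
X = rng.normal(size=(n, 4))
gpa = X @ np.array([0.4, 0.3, 0.3, 0.2]) + rng.normal(size=n)

def r_squared(cols):
    """R-square from regressing GPA on the given predictor columns."""
    D = np.column_stack([np.ones(n), X[:, list(cols)]])
    coef, *_ = np.linalg.lstsq(D, gpa, rcond=None)
    resid = gpa - D @ coef
    return 1 - resid @ resid / ((gpa - gpa.mean()) @ (gpa - gpa.mean()))

# For each model size, show the subset with the highest R-square
for size in range(1, 5):
    best = max(combinations(range(4), size), key=r_squared)
    print(size, [names[j] for j in best], f"R2 = {r_squared(best):.3f}")
```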
15. Predictor Selection Algorithms
- Forward: build up from scratch, entering variables by p value. End when no remaining variables meet PIN (the p-to-enter criterion). May include duds. (A sketch of the forward step follows this list.)
- Backward: start with all variables and pull them out by POUT (the p-to-remove criterion). May lose gems.
- Stepwise: start forward, but check backward at each step. Not guaranteed to give the best R².
- Blockwise: not used much. Forward by blocks, then any method (e.g., stepwise) within each block to choose the best predictors.
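A rough sketch of the forward algorithm with a PIN-style entry criterion (the function and threshold are illustrative, not a reproduction of SPSS or SAS output):

```python
import numpy as np
from scipy import stats

def forward_select(X, y, pin=0.05):
    """Forward selection: enter the predictor with the smallest p value
    at each step; stop when no candidate meets the PIN criterion."""
    n, p = X.shape
    chosen, remaining = [], list(range(p))
    while remaining:
        best_p, best_j = 1.0, None
        for j in remaining:
            cols = chosen + [j]
            D = np.column_stack([np.ones(n), X[:, cols]])
            coef, *_ = np.linalg.lstsq(D, y, rcond=None)
            resid = y - D @ coef
            df = n - len(cols) - 1
            mse = resid @ resid / df
            se = np.sqrt(mse * np.linalg.inv(D.T @ D)[-1, -1])
            # two-sided t test on the candidate's coefficient
            p_val = 2 * stats.t.sf(abs(coef[-1] / se), df)
            if p_val < best_p:
                best_p, best_j = p_val, j
        if best_j is None or best_p > pin:
            break  # no remaining variable meets PIN
        chosen.append(best_j)
        remaining.remove(best_j)
    return chosen
```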
16. Things to Consider in Predictor Selection
- Algorithms consider statistical significance, but you have to consider practical significance and cost; i.e., the algorithms by themselves don't work well.
- Surviving variables are often there by chance. Do the analysis again and you would choose a different set. That is OK for prediction.
- The value of correlated variables is quite different when considered in path analysis and SEM.
17. Hierarchical Regression
- An alternative to predictor selection algorithms
- Theory-based (a priori) tests of increments to R-square
18. Example of Hierarchical Regression
Does personality increase the prediction of medical school success beyond that afforded by cognitive ability? Collect data on 250 med students over their first two years.

Model 1 (cognitive ability only): R² = .10, p < .05
Model 2 (cognitive ability plus personality): R² = .13, p < .05
Test of the increment: F(2, 245) = 4.22, p < .05
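The model comparison can be reproduced from the two R² values. A minimal sketch, assuming (from the reported df) that Model 1 has two cognitive predictors and Model 2 adds two personality measures:

```python
from scipy import stats

def r2_increment_test(r2_full, r2_reduced, n, k_full, k_reduced):
    """F test for the increase in R-square when predictors are added."""
    df1 = k_full - k_reduced
    df2 = n - k_full - 1
    f = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return f, df1, df2, stats.f.sf(f, df1, df2)

# Slide values: N = 250; predictor counts (2 -> 4) are inferred from
# the reported df, F(2, 245)
f, df1, df2, p = r2_increment_test(.13, .10, 250, 4, 2)
print(f"F({df1},{df2}) = {f:.2f}, p = {p:.4f}")
```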
19. Review
- Describe the steps in forward (backward, stepwise, blockwise, all possible regressions) predictor selection.
- What are the main problems as far as R-square and prediction are concerned with forward (backward, stepwise, blockwise, all possible regressions) predictor selection?
- Why avoid predictor selection algorithms when doing substantive research (when you want to explain variance in the DV)?