Statistics 350 Lecture 27

Slides: 13
Provided by: Fuji261
Transcript and Presenter's Notes

1
Statistics 350 Lecture 27
2
Today
  • Last Day: Start Chapter 9 (9.1-9.3); please read
    9.1 and 9.2 thoroughly
  • Today: More Chapter 9; stepwise regression

3
Comment on α levels from last day
  • Stepwise Selection

4
Model Validation
  • Can be viewed as the final step of the model
    building process
  • Up to now, you have built a model using
  • Decided which variables are in the final model
    using

5
Model Validation
  • It is important to note that the model selected
    reflects the properties of the data collected
  • If data were collected again at the same Xs, would
    we get the same model?
  • Want to be sure that the model is capturing main
    features of the population of interest

6
Model Validation
  • Three basic approaches to validate the model

7
Model Validation
  • Collection of new data
  • If a new set of data is available, you can
    compute the Mean Square Error of Prediction, MSPR
    (for each model if there is more than one) by
    using the model to predict each observation in
    the new set, and then computing the mean squared
    deviation between the observed and predicted
    values
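
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function name `mspr`, the fitted line in `predict`, and the small "new" data set are all made up for the example. The sum of squared deviations is divided by the number of new observations.

```python
def mspr(observed, predicted):
    """Mean squared deviation between observed and predicted values
    in a new (validation) data set."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("validation set is empty")
    squared_devs = [(y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)]
    return sum(squared_devs) / len(observed)


# Hypothetical model fitted on the original data: Y-hat = 1.0 + 2.0 * X.
# The coefficients are invented for illustration only.
def predict(x):
    return 1.0 + 2.0 * x


# Hypothetical new data set (X values and observed Y values).
new_x = [0.0, 1.0, 2.0, 3.0]
new_y = [1.5, 2.5, 5.0, 8.0]

new_mspr = mspr(new_y, [predict(x) for x in new_x])
print(new_mspr)
```

With more than one candidate model, the same function would be called once per model on the same new data.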

8
Model Validation
  • If the MSPR is much bigger than the MSE from the
    original model-building data set, then the model
    was overfitting the data (chasing the errors)
  • If several models are being compared, the one
    with the smallest MSPR appears to be the best for
    the new data set
  • If the MSPRs are similar among all the candidate
    models, then the choice of a model can be made
    based on other (nonstatistical) criteria, such as
    simplicity or interpretability
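
A rough sketch of this comparison, assuming each candidate model's MSE (from the model-building data) and MSPR (from the new data) have already been computed. The helper `compare_models`, the model names, and the numbers are hypothetical.

```python
def compare_models(results):
    """results maps a model name to a (mse, mspr) pair.

    Returns the name of the model with the smallest MSPR and a dict of
    MSPR/MSE ratios; a ratio far above 1 hints at overfitting.
    """
    ratios = {name: mspr / mse for name, (mse, mspr) in results.items()}
    best = min(results, key=lambda name: results[name][1])
    return best, ratios


# Invented values for two candidate models.
candidates = {
    "model_A": (4.0, 4.4),  # MSPR close to MSE
    "model_B": (3.0, 9.0),  # MSPR much bigger than MSE
}

best_name, ratios = compare_models(candidates)
print(best_name, ratios)
```

Here `model_B` fits the original data better (smaller MSE) but predicts the new data worse, which is the pattern the slide calls chasing the errors.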

9
Model Validation
  • How much bigger is "much bigger" than the MSE?
  • The modeling and validation sets ought to have
    the same population variance, because they are
    both (supposedly) drawn from the same population
  • Therefore, it is reasonable to treat the ratio
    MSPR/MSE as approximately
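
The sentence above is cut off in the transcript, so the intended reference distribution is not recoverable from the slide itself. One plausible reading, offered here only as an assumption, is the usual comparison of two estimates of the same variance with an F distribution. The symbols n* (number of new observations), n (number of model-building observations), and p (number of regression parameters) are introduced for this sketch and do not appear in the transcript.

```latex
\text{MSPR} = \frac{\sum_{i=1}^{n^*} \left( Y_i - \hat{Y}_i \right)^2}{n^*},
\qquad
\frac{\text{MSPR}}{\text{MSE}} \;\overset{\text{approx.}}{\sim}\; F_{\,n^*,\; n-p}
```

Under this reading, a ratio well above the upper percentiles of that F distribution would count as "much bigger".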

10
Model Validation
  • What to do if overfitting is indicated?

11
Model Validation
  • Comparison to theoretical expectations or earlier
    results

12
Model Validation
  • Data Splitting
  • What if no new dataset is available?
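
The slide gives only the heading for this approach. A minimal sketch, assuming the usual meaning of data splitting (randomly set aside part of the existing data as a validation set and build the model on the rest), might look like the following. The function name `split_data`, the split fraction, and the toy data are hypothetical.

```python
import random


def split_data(rows, validation_fraction=0.5, seed=0):
    """Randomly split rows into (model_building, validation) lists."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be strictly between 0 and 1")
    if len(rows) < 2:
        raise ValueError("need at least two rows to split")
    shuffled = list(rows)                      # leave the caller's data untouched
    random.Random(seed).shuffle(shuffled)      # seeded so the split is repeatable
    n_val = round(len(shuffled) * validation_fraction)
    n_val = min(len(shuffled) - 1, max(1, n_val))  # keep both parts non-empty
    return shuffled[n_val:], shuffled[:n_val]


# Toy data: (X, Y) pairs, invented for illustration.
rows = [(x, 2.0 * x + 1.0) for x in range(10)]
building, validation = split_data(rows, validation_fraction=0.3, seed=1)
print(len(building), len(validation))
```

The model would then be fitted on `building` only, and the MSPR computed on `validation` as if it were a new data set.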