Some Regression Pitfalls - PowerPoint PPT Presentation

About This Presentation
Title:

Some Regression Pitfalls

Description:

... if we were to let x = √p, we would compute the square root of each price value, and these square roots would be the values of x that would be used in regression ...

Slides: 16
Provided by: dav5214

Transcript and Presenter's Notes



1
Some Regression Pitfalls
  • Chapter 7

2
Observational Data Versus Designed Experiments
  • Observational: the values of the
    independent variables are uncontrolled
  • Experimental: the x's are controlled via a
    designed experiment

3
Caution
  • With observational data, a statistically
    significant relationship between a response y and
    a predictor variable x does not necessarily imply
    a cause-and-effect relationship.
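
The caution above can be illustrated with a small simulation. This is only a sketch on made-up data: the lurking variable z, the coefficients, and the sample size are all assumptions chosen for illustration. Here z drives both x and y, while x has no effect on y at all.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Hypothetical observational data: z is a lurking variable.
z = rng.normal(size=n)
x = z + 0.5 * rng.normal(size=n)   # x depends on z
y = 2.0 * z + rng.normal(size=n)   # y depends on z only, never on x

# x and y are strongly correlated even though x does not cause y.
r_xy = np.corrcoef(x, y)[0, 1]

# Once z is included in the model, the least squares slope on x is near zero.
design = np.column_stack([np.ones(n), x, z])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
slope_x_given_z = coef[1]

print("corr(x, y):", round(r_xy, 2))
print("slope on x with z in the model:", round(slope_x_given_z, 2))
```

With observational data, a variable like z is often unrecorded, which is why the significant x-y relationship alone cannot establish cause and effect.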

4
Standardized Regression Coefficients
Which variable is most important? I do not
know how to answer that.
5
Grandfather Clock Example
6
Definition 7.1
  • Multicollinearity exists when two or more of the
    independent variables used in regression are
    moderately or highly correlated

7
Detecting Multicollinearity in the Regression
Model
  • The following are indicators of
    multicollinearity:
  • Significant correlations between pairs of
    independent variables in the model
  • Nonsignificant t tests for all (or nearly all)
    the individual β parameters when the F test for
    overall model adequacy is significant
  • Signs opposite from what is expected in the
    estimated parameters
  • A variance inflation factor (VIF) for a β
    parameter greater than 10, where
    VIF_i = 1 / (1 - R_i^2), i = 1, 2, ..., k,
    and R_i^2 is the multiple coefficient of
    determination for the model that regresses x_i
    on the other independent variables
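
As a rough sketch of the VIF indicator, the code below regresses each independent variable on the others and applies VIF = 1 / (1 - R²). The function name vif and the simulated variables x1, x2, x3 are illustrative assumptions, not part of the slides.

```python
import numpy as np

def vif(X):
    """Return 1 / (1 - R_i^2) for each column i of X (n rows, k columns)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for i in range(k):
        target = X[:, i]
        others = np.delete(X, i, axis=1)
        # Regress x_i on the remaining independent variables (with intercept).
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        sse = resid @ resid
        sst = np.sum((target - target.mean()) ** 2)
        r2 = 1.0 - sse / sst
        out[i] = 1.0 / (1.0 - r2)
    return out

# Hypothetical data: x2 is nearly a copy of x1, x3 is unrelated to both.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)

vifs = vif(np.column_stack([x1, x2, x3]))
print(np.round(vifs, 1))
```

In this setup the first two VIFs should land well above the rule-of-thumb cutoff of 10, while the third stays close to 1.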

8
Solutions to Some Problems Created by
Multicollinearity
  • Drop one or more of the correlated independent
    variables from the final model. A screening
    procedure such as stepwise regression is helpful
    in determining which variables to drop.
  • If you decide to keep all the independent
    variables in the model:
  • Avoid making inferences about the individual β
    parameters (such as establishing a
    cause-and-effect relationship between y and the
    predictor variables).
  • Restrict inferences about E(y) and future y
    values to values of the independent variables
    that fall within the experimental region (see
    Section 7.6).
  • If your ultimate objective is to establish a
    cause-and-effect relationship between y and the
    predictor variables, use a designed experiment
    (see Chapters 11 and 12).
  • To reduce rounding errors in polynomial
    regression models, code the independent variables
    so that first-, second-, and higher-order terms
    for a particular x variable are not highly
    correlated (see Section 5.6).
  • To reduce rounding errors and stabilize the
    regression coefficients, use ridge regression to
    estimate the β parameters (see Section 9.7).
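
The coding remedy for polynomial models can be sketched as follows. The coding used here, u = (x - mean) / standard deviation, is one common choice and is assumed rather than taken from the slides; the x values are made up.

```python
import numpy as np

# Hypothetical x values that sit far from zero.
x = np.arange(50, 61, dtype=float)   # 50, 51, ..., 60

# Raw first- and second-order terms are almost perfectly correlated.
r_raw = np.corrcoef(x, x ** 2)[0, 1]

# Code x by centering and scaling, then form the second-order term from u.
u = (x - x.mean()) / x.std(ddof=1)
r_coded = np.corrcoef(u, u ** 2)[0, 1]

print("corr(x, x^2):", r_raw)
print("corr(u, u^2):", r_coded)
```

Because these x values are symmetric about their mean, the coded terms are uncorrelated up to rounding. With asymmetric data the correlation is usually reduced but not removed.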

9
Data Transformations
  • The word transformation means to change the form
    of some object or thing. Consequently, the
    phrase data transformation means that we have
    done, or plan to do, something to change the form
    of the data. For example, if one of the
    independent variables in a model is the price p
    of a commodity, we might choose to introduce this
    variable into the model as x = 1/p, x = √p, or
    x = e^(-p). Thus, if we were to let x = √p, we
    would compute the square root of each price
    value, and these square roots would be the values
    of x that would be used in regression analysis.
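
A minimal sketch of the square-root example above: the prices p and responses y are invented for illustration, and a straight-line model in x = √p is assumed.

```python
import numpy as np

# Hypothetical prices and responses (not from the slides).
p = np.array([1.0, 4.0, 9.0, 16.0, 25.0])
y = np.array([5.1, 6.9, 9.2, 10.8, 13.1])

# The three transformations mentioned in the text.
x_recip = 1.0 / p
x_sqrt = np.sqrt(p)
x_exp = np.exp(-p)

# Use x = sqrt(p) as the predictor and fit y = b0 + b1 * x by least squares.
x = x_sqrt
design = np.column_stack([np.ones_like(x), x])
(b0, b1), *_ = np.linalg.lstsq(design, y, rcond=None)

print("transformed x values:", x)
print("intercept:", b0, "slope:", b1)
```

The regression never sees p itself. Only the transformed values enter the design matrix.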

10
Box-Cox
11
Which Model is the Best ?
Can you compare bell peppers and apples?
12
Which Model is the Best ?
  • To compare models using R-squared (RSQ), both
    models must have the same dependent variable
  • To compare models with different dependent
    variables, we use predicted mean squares (PREDMS)
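
The PREDMS formula itself is not reproduced in this transcript. As an assumed stand-in, the sketch below puts a model for y and a model for ln(y) on a common footing by back-transforming the second model's predictions and computing the mean squared prediction error in the original units of y. The data, the helper fit_line, and this particular error measure are all illustrative assumptions.

```python
import numpy as np

def fit_line(x, y):
    """Least squares intercept and slope for y = b0 + b1 * x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Hypothetical data generated from an exponential relationship.
rng = np.random.default_rng(2)
x = np.linspace(1.0, 10.0, 40)
y = np.exp(0.3 * x + 0.05 * rng.normal(size=x.size))

# Model A: dependent variable is y.
a = fit_line(x, y)
pred_a = a[0] + a[1] * x

# Model B: dependent variable is ln(y); back-transform its predictions to y.
b = fit_line(x, np.log(y))
pred_b = np.exp(b[0] + b[1] * x)

# Assumed comparison measure: mean squared prediction error on the y scale.
mse_a = np.mean((y - pred_a) ** 2)
mse_b = np.mean((y - pred_b) ** 2)

print("model A (y):", mse_a)
print("model B (ln y, back-transformed):", mse_b)
```

Both errors are in the same units, so they can be compared directly, which is not true of the two models' R-squared values.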

13
PREDMS
For the original model, we use
14
PREDMS1
15
Example