Multiple Regression: Explanatory variables, slopes - PowerPoint PPT Presentation

About This Presentation
Title:

Multiple Regression: Explanatory variables, slopes

Description:

Introduce different kinds of explanatory variables. ... including x² will allow you to model a parabola. other alternatives are ln(x), x³, etc. ...

Slides: 29
Provided by: hagt

Transcript and Presenter's Notes


1
Multiple Regression: Explanatory variables,
slopes and inferences. Evaluation, Estimation
and Prediction
  • MSIT3000
  • Lecture 22

2
Objectives
  • Introduce different kinds of explanatory
    variables.
  • Explain and apply inferential statistics on the
    slope estimates using confidence intervals and
    hypothesis testing.
  • Learn the steps in Multivariate Regression
    Modeling.
  • Learn how to evaluate and properly apply a
    multiple regression model.
  • Confidence and Prediction Intervals for Yhat.
  • Text reference: Sections 10.4-10.6

3
Model Specification
  • The title Model Specification refers to the
    equation
  • $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon$
  • What kinds of variables can the x's be?
  • Continuous.
  • Dummy variables (0 or 1)
  • Quadratic terms or other transformations.

4
Continuous variables
  • These are the standard variables, e.g.
  • Price
  • Advertising
  • Age
  • Temperature

5
Dummy variables
  • Dummy, or Indicator, variables allow us to model
    discontinuous effects.
  • Special times, e.g.
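  • war: $y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + \gamma I(w) + \varepsilon$
  • I = 1 if there is a war, and I = 0 if there is peace.
  • seasons: $y = \beta_0 + \beta_1 x_1 + \gamma_W I(W) + \gamma_S I(S) + \gamma_F I(F) + \varepsilon$
  • I(W) = 1 for Winter, 0 otherwise, etc.
  • Why isn't I(Summer) included?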
  • discrete groups, e.g.
  • gender
  • Dummy variables can be used to shift the
    intercept, the slope, or both.
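The seasonal coding above can be sketched in a few lines of Python. This is a minimal illustration, assuming S stands for Spring and F for Fall, and that Summer is the omitted baseline; the function names and the sample observations are hypothetical.

```python
# Seasonal dummy coding with Summer as the omitted baseline category.
SEASONS = ["Winter", "Spring", "Summer", "Fall"]
BASELINE = "Summer"  # absorbed by the intercept, so it gets no dummy


def season_dummies(season):
    """Return [I(W), I(S), I(F)] for one observation."""
    if season not in SEASONS:
        raise ValueError(f"unknown season: {season}")
    return [1 if season == s else 0 for s in SEASONS if s != BASELINE]


def design_row(x1, season):
    """One row of the design matrix: intercept, x1, then the three dummies."""
    return [1, x1] + season_dummies(season)


# Hypothetical observations: (x1, season)
rows = [design_row(x1, s) for x1, s in [(5.0, "Winter"), (7.5, "Summer"), (6.1, "Fall")]]

# If all four dummies were included, they would sum to 1 in every row,
# duplicating the intercept column (perfect collinearity). That is why
# one season has to be left out.
full = [[1 if s == name else 0 for name in SEASONS] for s in SEASONS]
all_rows_sum_to_one = all(sum(r) == 1 for r in full)
```

A Summer observation gets zeros in all three dummy columns, so its expected value is carried by the intercept alone, and each seasonal coefficient measures a shift relative to Summer.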

6
Variable transformations
  • If the data follow a nonlinear pattern, a model
    with only straight-line terms will not be correct.
  • You can correct for that by adding nonlinear
    terms:
  • including x² will allow you to model a parabola.
  • other alternatives are ln(x), x³, etc.
  • What does "linear" refer to in OLS, if not the
    variables?

7
Can OLS fit any kind of model?
  • No. If the function is not linear in the
    parameters, OLS cannot be used.
  • That sums up objective 1
  • Model Specification.
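To illustrate that a model with an x² term is still linear in the parameters, here is a small sketch that fits a parabola by ordinary least squares using the normal equations. The data are hypothetical and noise-free, and the helper names `solve` and `ols` are made up for this example; a statistics package would normally do this step.

```python
def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(X, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


# Hypothetical data generated exactly from y = 3 - 2x + 0.5x^2
xs = [0, 1, 2, 3, 4, 5]
ys = [3 - 2 * x + 0.5 * x**2 for x in xs]

# Columns 1, x, x^2: nonlinear in x, but linear in the betas.
X = [[1, x, x**2] for x in xs]
b0, b1, b2 = ols(X, ys)
print(b0, b1, b2)
```

The squared term is just another column of the design matrix, so the same least squares machinery applies. A model such as y = β0·exp(β1·x) would not fit this pattern, because the parameters no longer enter linearly.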

8
Objective 2: Inferences on the slopes.
  • If we wish to test a hypothesis or construct a CI
    for a slope-estimate, we do so exactly as we did
    for univariate OLS.
  • CI: $b_i \pm t \cdot s_{b_i}$
  • The degrees of freedom for t are $df = n - (k + 1)$
  • The standard error will come from a computer
    printout.
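As a worked sketch of the interval formula, the snippet below uses the EXP row of the SAS output from the time-study problem in this deck (estimate -0.67, standard error 0.155, n = 15, k = 2). The critical value 2.179 is the usual t-table entry for 12 degrees of freedom at 95% confidence; the function name `slope_ci` is hypothetical.

```python
def slope_ci(b, se, t_crit):
    """Confidence interval b +/- t * se for one slope estimate."""
    half_width = t_crit * se
    return b - half_width, b + half_width


n, k = 15, 2        # observations and explanatory variables (EXP, EXPSQ)
df = n - (k + 1)    # degrees of freedom for t
T_CRIT_95 = 2.179   # t-table value for df = 12, two-sided 95% (assumed from a table)

low, high = slope_ci(b=-0.67, se=0.155, t_crit=T_CRIT_95)
print(f"df = {df}, 95% CI for the EXP slope: ({low:.3f}, {high:.3f})")
```

Because the whole interval lies below zero, it agrees with the small p-value reported for EXP in the printout.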

9
Conclusion of Part I
  • Objectives addressed
  • Cover model specification and the types of
    explanatory variables we can use.
  • Discuss inferences (CI and HT) on the slope
    estimates.
  • Problems: next page and 10.7

10
Experience and speed of assembly: Problem 10.9 in
the 7th edition of the text
  • Running a manufacturing operation efficiently
    requires knowledge of the time it takes employees
    to manufacture the product, otherwise the cost of
    making the product cannot be determined.
    Furthermore, management would not be able to
    establish an effective incentive plan for its
    employees because it would not know how to set
    work standards (Chase and Aquilano, Production
    and Operations Management, 1992). Estimates of
    production time are frequently obtained using
    time studies. The data in the accompanying table
    came from a recent time study of a sample of 15
    employees performing a particular task on an
    automobile assembly line.

11
Data
13
SAS results
  • Dep Variable: TIME
  • Analysis of Variance

    Source     DF   Sum of Squares   Mean Square   F Value   Prob>F
    Model       2        156.11948         78.06    65.594   0.0001
    Error      12         14.28052          1.19
    C Total    14        170.40000

    Root MSE    1.09089    R-square   0.9162
    Dep Mean   14.8000     Adj R-sq   0.9022
    C.V.        7.37089

  • Parameter Estimates

    Variable   DF   Parameter Estimate   Standard Error   T for H0: Parameter=0   Prob>|T|
    INTERCEP    1              20.09              0.725                   27.7      0.0001
    EXP         1              -0.67              0.155                  -4.33      0.0010
    EXPSQ       1               0.0095            0.00633                 1.507     0.1576
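The summary figures in the printout follow from the sums of squares and degrees of freedom by standard formulas. The sketch below recomputes them as a check; the variable names are my own, and only the numbers come from the slide.

```python
import math

# Values copied from the ANOVA table above.
ss_model, df_model = 156.11948, 2
ss_error, df_error = 14.28052, 12
dep_mean = 14.8

ss_total = ss_model + ss_error          # C Total sum of squares
df_total = df_model + df_error          # C Total degrees of freedom

ms_model = ss_model / df_model          # Mean Square (Model)
ms_error = ss_error / df_error          # Mean Square (Error)
f_value = ms_model / ms_error           # F Value
root_mse = math.sqrt(ms_error)          # Root MSE
r_square = ss_model / ss_total          # R-square
adj_r_square = 1 - ms_error / (ss_total / df_total)  # Adj R-sq
cv = 100 * root_mse / dep_mean          # C.V. (percent)

# T for H0: Parameter=0 is estimate / standard error. The slide shows rounded
# estimates, so these ratios differ slightly from the printed T values.
estimates = {"INTERCEP": (20.09, 0.725), "EXP": (-0.67, 0.155), "EXPSQ": (0.0095, 0.00633)}
t_values = {name: est / se for name, (est, se) in estimates.items()}

print(round(f_value, 3), round(r_square, 4), round(adj_r_square, 4))
```

The recomputed F, R-square, adjusted R-square, Root MSE and C.V. agree with the printout to the digits shown there.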

14
Questions
  • What is the least squares prediction equation?
  • Test the null hypothesis that $\beta_2 = 0$ using a
    level of significance of 0.01. Does the quadratic
    term contribute significantly?
  • What is the expected Time to Assemble for an
    employee with 2 years of experience?
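One way to work through these questions from the SAS output is sketched below. It assumes EXP is recorded in the same units the question uses (years); if the data table records experience in other units, the input to `predicted_time` would need converting first. The critical value 3.055 is the usual t-table entry for 12 degrees of freedom at a two-sided 0.01 level, and the function name is hypothetical.

```python
# Least squares prediction equation from the Parameter Estimates table:
# TIME-hat = 20.09 - 0.67 * EXP + 0.0095 * EXP^2
B0, B1, B2 = 20.09, -0.67, 0.0095


def predicted_time(exp):
    """Predicted assembly time for a given amount of experience."""
    return B0 + B1 * exp + B2 * exp**2


# Test H0: beta_2 = 0 against a two-sided alternative at alpha = 0.01.
T_EXPSQ = 1.507      # printed t statistic for EXPSQ
P_EXPSQ = 0.1576     # printed two-sided p-value for EXPSQ
ALPHA = 0.01
T_CRIT = 3.055       # t-table value, df = 12, two-sided 0.01 (assumed from a table)

reject_h0 = abs(T_EXPSQ) > T_CRIT   # same decision as P_EXPSQ < ALPHA

# Assumes EXP is measured in years, as in the question.
time_at_2 = predicted_time(2)
print(reject_h0, round(time_at_2, 3))
```

With these numbers the null hypothesis is not rejected at the 0.01 level, so the sample does not show a significant contribution from the quadratic term.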

15
Part II
  • Objectives
  • Learn the steps in Multivariate Regression
    Modeling.
  • Learn how to evaluate and properly apply a
    multiple regression model.
  • Confidence and Prediction Intervals for Yhat.
  • Text reference: Sections 10.5 and 10.6.

16
Steps in Multivariate Regression Modeling
  • Specify the model.
  • Fit the model to the data.
  • Evaluate the model.
  • Apply the model.

17
How many ways can a model be evaluated?
  • Does the model have any interpretable meaning?
  • Does the model have global significance?
  • Statistical significance.
  • Practical significance.
  • Are the individual explanatory variables
    significant?

18
Interpretation of the model.
  • This is a problem that must be addressed before
    the data is used to estimate a model.
  • If the answer is no (the model cannot be
    interpreted in a useful manner), then you stop
    and re-specify the model.
  • A typical example is to treat nominal data as if
    it were ordinal or numeric.
  • E.g. define College = 1 if a student went to UGA,
    College = 2 if the student went to Ga Tech and
    College = 3 if the student went to GSU. The
    variable College cannot be used in an OLS (why
    not?).

19
Evaluating the global model: Statistical
significance.
  • If the explanatory variables say nothing about
    the behavior of the dependent variable, what must
    the parameters ($\beta_i$) all be equal to?
  • The start of every multivariate regression output
    consists of an ANOVA (for ANalysis Of VAriance).
    It tests the null hypothesis
  • $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$
  • What is the alternative hypothesis?

20
Evaluating the global model: Statistical
significance (cont).
  • The test statistic for this hypothesis test is
    F-distributed under the null hypothesis.
  • The test statistic: $F = \text{Mean Square (Model)} / \text{Mean Square (Error)}$
  • F has the degrees of freedom of the model
    associated with the numerator and the degrees of
    freedom for the error term associated with the
    denominator.
  • Tables VIII-XI
  • See the problem's SAS output.
  • Typically, it will suffice to look at the p-value.

21
Evaluating the global model: Practical
significance.
  • A model can be statistically significant and
    practically useless if only a very small portion
    of the variation in the dependent variable is
    explained.
  • This can happen if we have a huge number of
    observations, or
  • if there is a large amount of noise.
  • What can we use to evaluate whether or not the
    model explains enough variation to be useful?
  • Hint: think back to univariate OLS.

22
Evaluating the global model: Practical
significance and R².
  • R² measures the proportion of the variation in Y
    explained by all the explanatory variables.
  • A model that explains 90% of the variation in Y,
    where Y varies quite a bit, is significant from a
    practical viewpoint.
  • A model that explains 4% of the variation in Y, or
    where Y doesn't vary much, is of little practical
    use.

23
Applying a model
  • Does the model have to be significant in order to
    be applied?
  • The model must be globally statistically
    significant and to some extent practically
    significant.

24
Do all the predictors have to be significant?
  • NO.
  • You may have the situation that the marginal
    significance is low, but the variables definitely
    belong in the model.
  • If none of the variables are significant but the
    model is globally significant, you can still use
    the model...
  • for prediction, but
  • not for explanation.

25
Prediction vs Extrapolation
  • Which values of the predictors are acceptable as
    inputs to the model?
  • If the explanatory variables are within the range
    of the data (prediction), the model is valid.
    Otherwise, use the model with extreme caution (or
    preferably not at all).
  • Extrapolation refers to using the model with
    explanatory variables that are outside the
    original range of the data.
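A simple guard against accidental extrapolation is to compare each new input with the range observed in the data. The sketch below does this one variable at a time; the function name and the sample rows are hypothetical.

```python
def is_extrapolation(x_new, training_rows):
    """True if any predictor in x_new lies outside the range seen in the data."""
    for j, value in enumerate(x_new):
        column = [row[j] for row in training_rows]
        if not (min(column) <= value <= max(column)):
            return True
    return False


# Hypothetical data: each row is (price, advertising)
training_rows = [(10.0, 2.0), (12.0, 3.5), (15.0, 1.0), (11.0, 4.0)]

inside = is_extrapolation((13.0, 2.5), training_rows)    # within both ranges
outside = is_extrapolation((20.0, 2.5), training_rows)   # price above observed maximum
print(inside, outside)
```

This per-variable check is only a first screen. A combination of values can sit inside every individual range and still be far from any observed combination, so passing the check does not by itself guarantee a safe prediction.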

26
Yhat as a conditional mean vs. Yhat as a specific
prediction
  • Exactly as in univariate OLS, there are two ways
    to regard the predicted value of Y (Yhat).
  • As a conditional mean, or
  • As a specific prediction
  • These two statistics have different standard
    errors because the second includes individual
    error, whereas the first only includes how much
    our line, based on all the data, missed the true
    line.

27
Confidence and Prediction Intervals
  • Based on the standard errors, two different
    confidence intervals can be constructed
  • A confidence interval for the conditional mean,
    and
  • A confidence interval for a specific prediction,
    i.e. a prediction interval.
  • We will not calculate these, but we will
    interpret the output.
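Although the intervals are read from output in this course, a small sketch may make the difference between the two standard errors concrete. It uses the univariate formulas (one explanatory variable) on hypothetical data; the multiple regression case has the same structure but needs matrix algebra. The critical value 2.776 is the usual t-table entry for 4 degrees of freedom at 95%.

```python
import math


def fit_line(xs, ys):
    """Univariate OLS fit; returns b0, b1, s (root MSE), x_bar and Sxx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    return b0, b1, s, x_bar, sxx


def intervals(xs, ys, x0, t_crit):
    """Yhat at x0 with a CI for the conditional mean and a prediction interval."""
    b0, b1, s, x_bar, sxx = fit_line(xs, ys)
    n = len(xs)
    y_hat = b0 + b1 * x0
    line_error = 1 / n + (x0 - x_bar) ** 2 / sxx   # uncertainty in the fitted line
    se_mean = s * math.sqrt(line_error)            # conditional mean
    se_pred = s * math.sqrt(1 + line_error)        # adds the individual error
    ci = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)
    pi = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)
    return y_hat, ci, pi


# Hypothetical data, n = 6, so df = n - 2 = 4
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
T_CRIT = 2.776  # t-table value for df = 4, two-sided 95% (assumed from a table)

y_hat, ci, pi = intervals(xs, ys, 3.5, T_CRIT)
print(y_hat, ci, pi)
```

Both intervals are centered on the same Yhat, and the prediction interval is always the wider of the two because its standard error includes the extra "1" for the individual observation's error.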

28
Conclusion of Part II
  • Review of objectives
  • Steps in Multivariate Regression Modeling
  • Specify the model.
  • Fit the model to the data.
  • Evaluate the model.
  • Apply the model.
  • Evaluate the model
  • Globally
  • for statistical significance
  • for practical significance.
  • Individual variables: statistical significance.
  • Confidence and Prediction Intervals for Yhat.
  • Problems: 10.22, 10.30, 10.34