Title: Class 5
1Class 5
- Multiple Regression Models
2Multiple Regression Models
- We can readily imagine that there may be several
factors that we can include in our model to
explain test scores.
3Using EXCEL
- The procedure is the same: Tools / Data Analysis / Regression.
- Note that the independent variables have to be in
contiguous columns.
- The F-test now tests whether at least one of the
variables explains variation in y.
- The problem becomes tricky because the degree to
which a variable appears to be important in
explaining the variation in y depends on the
other variables present!
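A minimal Python sketch of the last point, using made-up numbers rather than the class data. The names x1, x2, y and the helper ols are hypothetical. Here y is constructed to depend only on x2, yet x1 appears important when x2 is left out of the model.

```python
def ols(columns, y):
    """Least-squares coefficients [b0, b1, ...] from the normal equations."""
    n = len(y)
    # Design matrix with a leading column of ones for the intercept.
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented system (X'X | X'y).
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    return [A[r][p] / A[r][r] for r in range(p)]


# Illustrative data only: y = 2 + 3*x2 exactly, and x1 is correlated with x2.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1, 1, 2, 2, 3, 3]
y = [2 + 3 * v for v in x2]

b_simple = ols([x1], y)      # y regressed on x1 alone
b_full = ols([x1, x2], y)    # y regressed on x1 and x2

print("x1 coefficient without x2:", round(b_simple[1], 4))
print("x1 coefficient with x2:   ", round(b_full[1], 4))
```

With x2 in the model, the coefficient on x1 drops to essentially zero, because x1 was only standing in for x2.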
4Hypothesis Testing
- The F-test tests to see if all of the
coefficients of the independent variables are
zero. For our model
- The t-test tests whether the coefficient of each
individual independent variable is zero.
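For reference, the two tests can be written out for a generic model with k independent variables and n observations. This is the standard textbook form, not necessarily the specific model shown on the slide. Here b_j is the estimated coefficient and s_{b_j} its standard error.

```latex
\begin{gather*}
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon \\[6pt]
\text{F-test: } H_0:\ \beta_1 = \beta_2 = \dots = \beta_k = 0
  \quad\text{vs.}\quad H_a:\ \text{at least one } \beta_j \neq 0 \\
F = \frac{SSR / k}{SSE / (n - k - 1)} \\[6pt]
\text{t-test: } H_0:\ \beta_j = 0 \quad\text{vs.}\quad H_a:\ \beta_j \neq 0 \\
t = \frac{b_j}{s_{b_j}}, \qquad \text{df} = n - k - 1
\end{gather*}
```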
5Some Final Comments
- The first step in building a regression model is
to develop a list of candidate variables.
- Notice that measurement might be a problem.
- Note that the t-test now takes on an important
role. But all you need are the p-values!
- Examination of residuals may provide clues about
other factors that you have left out.
6Adding Qualitative Factors
- Qualitative factors can be added to the model
through the use of dummy variables.
- Consider the following data
7Coding the Data
- We can add the gender factor by coding a variable
in the following way:
- If Female, then x = 1,
- If Male, then x = 0.
- What does our model say about salary?
E(y) = expected salary = β0 + β1x
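A small Python sketch of what this coding implies, using hypothetical salaries (in $1000s) rather than the data from the slides. With x = 0 for Male and x = 1 for Female, the fitted intercept is the mean male salary and the fitted slope is the difference between the female and male means.

```python
from statistics import mean

# Hypothetical salaries in $1000s; not the data from the slides.
salary = [40, 42, 44, 50, 52, 54]
female = [1, 1, 1, 0, 0, 0]   # x = 1 if Female, x = 0 if Male

x_bar, y_bar = mean(female), mean(salary)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(female, salary))
s_xx = sum((x - x_bar) ** 2 for x in female)

b1 = s_xy / s_xx          # estimated slope
b0 = y_bar - b1 * x_bar   # estimated intercept

mean_male = mean(y for x, y in zip(female, salary) if x == 0)
mean_female = mean(y for x, y in zip(female, salary) if x == 1)

print("b0 =", b0, " mean male salary =", mean_male)
print("b1 =", b1, " female mean - male mean =", mean_female - mean_male)
```

So E(y) = β0 for males and E(y) = β0 + β1 for females, and β1 measures the gap between the two groups.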
8Doing the Analysis
- After doing the regression analysis, what
hypothesis should we test?
- Is there another way of doing this test? From
prior material?
9Coding Variables with More than Two Levels
- Consider the following data set. How would you
code the qualitative factor "additive" for the
model?
The additives were added to the gasoline and
resulted in the following miles per gallon (MPG).
Is there a difference in the additives? What
model should we build to check this? Be careful
about what the model implies!
10Coding Qualitative Variables--Summary
- The coding of dummy variables depends upon the
number of levels that the qualitative factor has.
For k levels, use k-1 dummy variables. The case
where k = 5:
This adds four variables to the model (four
columns in your spreadsheet).
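The k-1 rule can be sketched in Python as follows. The helper dummy_code and the level labels A through E are hypothetical, chosen only for illustration. One level is treated as the baseline and gets no column of its own.

```python
def dummy_code(values, baseline):
    """Return {level: 0/1 column} for every level except the baseline."""
    levels = sorted(set(values))
    return {
        level: [1 if v == level else 0 for v in values]
        for level in levels
        if level != baseline
    }


# Hypothetical factor with k = 5 levels (labels are made up).
factor = ["A", "B", "C", "D", "E", "B", "E"]
columns = dummy_code(factor, baseline="A")

for level, column in columns.items():
    print(level, column)
```

An observation at the baseline level has 0 in all four columns, so the intercept represents the baseline mean and each dummy coefficient is a difference from it.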
11More on Dummy Variables
- Of course, these dummy variables just define
different populations whose means we are
comparing.
- If there are only two populations (one dummy
variable), you can use the pooled t-test.
- In a regression model, we have the luxury of
including other factors!
Controlling for other factors!
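A short Python check of the pooled t-test point, using hypothetical salaries (in $1000s), not the class data. It computes the pooled two-sample t statistic and the t statistic for the slope of a regression on a single dummy variable. The two agree.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical salaries in $1000s; not the data from the slides.
female = [40, 42, 44]
male = [50, 52, 54]
n1, n2 = len(female), len(male)

# Pooled two-sample t statistic (equal variances assumed).
sp2 = ((n1 - 1) * variance(female) + (n2 - 1) * variance(male)) / (n1 + n2 - 2)
t_pooled = (mean(female) - mean(male)) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Regression of salary on the dummy x (1 = Female, 0 = Male).
y = female + male
x = [1] * n1 + [0] * n2
x_bar, y_bar = mean(x), mean(y)
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
se_b1 = sqrt(sse / (len(y) - 2) / s_xx)   # standard error of the slope
t_slope = b1 / se_b1

print("pooled t:", round(t_pooled, 4))
print("slope t: ", round(t_slope, 4))
```

The regression version has the advantage that further variables can be added to the model, which the pooled t-test cannot accommodate.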
12More on Dummy Variables
- If you have only a set of dummy variables (like
the fuel additive problem), you can use ANOVA.
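A minimal one-way ANOVA sketch in Python, with made-up MPG values for three hypothetical additives labeled A, B, and C (the class data set is not reproduced here). It computes the F statistic from the between-group and within-group sums of squares.

```python
from statistics import mean

# Hypothetical MPG readings per additive; not the data from the slides.
groups = {"A": [20, 22, 24], "B": [25, 27, 29], "C": [30, 32, 34]}

all_values = [v for g in groups.values() for v in g]
grand_mean = mean(all_values)
k, n = len(groups), len(all_values)

# Variation of the group means around the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# Variation of the observations around their own group mean.
ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print("F =", f_stat, "with", k - 1, "and", n - k, "degrees of freedom")
```

A regression of MPG on k-1 = 2 dummy variables for the additive gives the same F statistic in its overall F-test.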