Title: Statistics with Economics and Business Applications
Slide 1: Statistics with Economics and Business Applications
- Chapter 12: Multiple Regression Analysis
- A brief exposition
Slide 2: Introduction
- We can use the same basic ideas as in simple linear regression to analyze relationships between a dependent variable and several independent variables.
- Multiple regression is an extension of simple linear regression for investigating how a response y is affected by several independent variables, $x_1, x_2, x_3, \ldots, x_k$.
- Our objectives are to
  - find relationships between y and $x_1, x_2, x_3, \ldots, x_k$
  - predict y using $x_1, x_2, x_3, \ldots, x_k$
Slide 3: Example
- Fatness (y) may depend on
  - $x_1$ = age
  - $x_2$ = sex
  - $x_3$ = body type
- Monthly sales (y) of a retail store may depend on
  - $x_1$ = advertising expenditure
  - $x_2$ = time of year
  - $x_3$ = state of the economy
  - $x_4$ = size of inventory
Slide 4: Some Questions
- Which of the independent variables are useful and which are not?
- How could we create a prediction equation that allows us to predict y using knowledge of $x_1, x_2, x_3$, etc.?
- How strong is the relationship between y and the independent variables?
- How good is this prediction?
Slide 5: The General Linear Model
- $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$
- y is the dependent variable.
- $\beta_0, \beta_1, \beta_2, \ldots, \beta_k$ are unknown parameters.
- $x_1, x_2, \ldots, x_k$ are independent predictor variables.
- The deterministic part of the model, $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$, describes the average value of y for any fixed values of $x_1, x_2, \ldots, x_k$. The observation y deviates from the deterministic model by an amount $\varepsilon$.
- $\varepsilon$ is the random error. We assume the random errors are independent normal random variables with mean zero and constant variance $\sigma^2$.
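To make the two parts of the model concrete, the sketch below evaluates the deterministic part E(y) for fixed predictor values and then adds a normal random error to simulate observations. The parameter values, predictor values, error standard deviation, and the function names `mean_response` and `simulate_observation` are all hypothetical, chosen only for illustration.

```python
import random


def mean_response(betas, xs):
    """Deterministic part: E(y) = beta_0 + beta_1*x_1 + ... + beta_k*x_k."""
    return betas[0] + sum(b * x for b, x in zip(betas[1:], xs))


def simulate_observation(betas, xs, sigma, rng):
    """One observation y = E(y) + error, with error ~ Normal(0, sigma^2)."""
    return mean_response(betas, xs) + rng.gauss(0.0, sigma)


# Hypothetical values (not taken from the slides).
betas = [10.0, 2.0, -1.5]   # beta_0, beta_1, beta_2
xs = [3.0, 4.0]             # fixed values of x_1 and x_2
rng = random.Random(0)      # seeded so that repeated runs give the same draws

expected = mean_response(betas, xs)
draws = [simulate_observation(betas, xs, sigma=1.0, rng=rng) for _ in range(5)]

print("E(y) =", expected)
print("simulated observations:", [round(y, 2) for y in draws])
```

Each simulated observation scatters around the same E(y), which is what the constant-variance, mean-zero error assumption describes.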
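Slide 6: The Method of Least Squares
- Data: n observations on the response y and the independent variables $x_1, x_2, x_3, \ldots, x_k$.
- The best-fitting prediction equation is
  $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \cdots + \hat{\beta}_k x_k$
- We choose our estimates $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$ to minimize
  $SSE = \sum (y - \hat{y})^2 = \sum (y - \hat{\beta}_0 - \hat{\beta}_1 x_1 - \cdots - \hat{\beta}_k x_k)^2$
- The computation is usually done by a computer.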
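As a rough illustration of what the computer does, the following sketch finds the least-squares estimates by solving the normal equations (X'X)b = X'y with plain Gauss-Jordan elimination. This is only one way to carry out the computation, and statistical packages typically use more numerically stable methods. The toy data and the helper names `solve`, `least_squares`, and `sse` are assumptions made for this example.

```python
def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def least_squares(rows, y):
    """Estimates of the intercept and slopes that minimize the sum of squared errors."""
    X = [[1.0] + list(r) for r in rows]  # leading 1 carries the intercept
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


def sse(coefs, rows, y):
    """Sum of squared deviations between observed y and the prediction equation."""
    fitted = [coefs[0] + sum(c * x for c, x in zip(coefs[1:], r)) for r in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


# Hypothetical toy data: six observations on (x1, x2) and y.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [6.1, 6.4, 13.2, 13.4, 19.9, 20.6]

b_hat = least_squares(rows, y)
print("estimates:", [round(b, 3) for b in b_hat])
print("SSE at the estimates:", round(sse(b_hat, rows, y), 4))
```

Moving any estimate away from the least-squares value can only increase SSE, which is the sense in which the fitted equation is "best".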
Slide 7: Steps in Regression Analysis
When you perform multiple regression analysis, use a step-by-step approach:
1. Fit the model to the data and estimate the parameters.
2. Use the analysis of variance F test and $R^2$ to determine how well the model fits the data.
3. Check the t tests for the partial regression coefficients to see which ones are contributing significant information in the presence of the others.
4. Use diagnostic plots to check for violations of the regression assumptions.
5. Proceed to estimate or predict the quantity of interest.
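Steps 1 and 2 can be sketched in a few lines: fit the model, split the total variation into regression and error sums of squares, and form $R^2$ and the analysis of variance F statistic with k and n - k - 1 degrees of freedom. The toy data and the helper names `solve`, `fit`, and `anova_summary` are hypothetical. Looking up the p-value for F is left to a statistical table or package.

```python
def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit(rows, y):
    """Step 1: least-squares estimates from the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


def anova_summary(rows, y):
    """Step 2: return (R^2, F) for the fitted model."""
    n, k = len(y), len(rows[0])
    b = fit(rows, y)
    fitted = [b[0] + sum(c * x for c, x in zip(b[1:], r)) for r in rows]
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # error sum of squares
    total_ss = sum((yi - y_bar) ** 2 for yi in y)           # total sum of squares
    ssr = total_ss - sse                                    # regression sum of squares
    r_squared = ssr / total_ss
    f_statistic = (ssr / k) / (sse / (n - k - 1))
    return r_squared, f_statistic


# Hypothetical toy data: six observations on (x1, x2) and y.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [6.1, 6.4, 13.2, 13.4, 19.9, 20.6]

r2, f_stat = anova_summary(rows, y)
print("R^2 =", round(r2, 4))
print("F =", round(f_stat, 2), "with", len(rows[0]), "and", len(y) - len(rows[0]) - 1, "df")
```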
Slide 8: Example
A data set contains the selling price y (in thousands of dollars), the amount of living area $x_1$ (in hundreds of square feet), and the number of floors $x_2$, bedrooms $x_3$, and bathrooms $x_4$, for n = 15 randomly selected residences currently on the market.

Property   y       x1   x2   x3   x4
1           69.0    6    1    2    1
2          118.5   10    1    2    2
3          116.5   10    1    3    2
...
15         209.9   21    2    4    3
Slide 9: Minitab Output
Slide 10: Minitab Output
Slide 11: Minitab Output
Is the overall model useful in predicting list
price? How much of the overall variation in the
response is explained by the regression model?
Slide 12: Minitab Output
In the presence of the other three independent variables, is the number of bedrooms significant in predicting the list price of homes? Test using $\alpha = .05$.
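A question of this kind is answered with the t statistic for a single partial regression coefficient: the estimate divided by its standard error, compared with a critical value on n - k - 1 degrees of freedom. For the housing data that is 15 - 4 - 1 = 10 degrees of freedom, with a two-sided critical value of about 2.228 at the .05 level. The sketch below shows the computation on hypothetical toy data (n = 8, k = 2, so 5 degrees of freedom and a critical value of about 2.571). It does not reproduce the Minitab output, and the helper names are assumptions.

```python
import math


def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def t_statistics(rows, y):
    """Return (estimates, t statistics) for the intercept and each slope."""
    X = [[1.0] + list(r) for r in rows]
    n, p = len(X), len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    b = solve(XtX, Xty)
    sse = sum((yi - sum(c * x for c, x in zip(b, row))) ** 2 for row, yi in zip(X, y))
    s2 = sse / (n - p)  # estimate of the error variance
    t_values = []
    for j in range(p):
        unit = [1.0 if i == j else 0.0 for i in range(p)]
        c_jj = solve(XtX, unit)[j]  # j-th diagonal entry of the inverse of X'X
        t_values.append(b[j] / math.sqrt(s2 * c_jj))
    return b, t_values


# Hypothetical toy data: eight observations on (x1, x2) and y.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5), (7, 8), (8, 7)]
y = [5.3, 6.8, 9.1, 10.6, 13.2, 15.5, 16.7, 18.9]

CRITICAL_T = 2.571  # two-sided .05 critical value for 8 - 2 - 1 = 5 degrees of freedom

estimates, t_values = t_statistics(rows, y)
significant = [abs(t) > CRITICAL_T for t in t_values]
for name, b_j, t_j, sig in zip(["intercept", "x1", "x2"], estimates, t_values, significant):
    print(name, round(b_j, 3), "t =", round(t_j, 2), "significant" if sig else "not significant")
```

A coefficient that is not significant here is not necessarily useless on its own. The test only says it adds little once the other predictors are already in the model.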
Slide 13: Historical Note
- Where does the name "regression" come from?
- In 1884, geneticist Francis Galton set up a stand at the International Health Exhibition in London, where he measured the heights of families attending. He discovered a phenomenon called regression toward the mean. Seeking laws of inheritance, he found that sons' heights tended to regress toward the mean height of the population, compared with their fathers' heights. Tall fathers tended to have somewhat shorter sons, and vice versa. Galton developed regression analysis to study this effect, which he optimistically referred to, in an 1886 paper, as "regression towards mediocrity".