MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE

Transcript and Presenter's Notes
1
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
[3D diagram: EARNINGS plotted against S and ASVABC, with the intercept b1 marked on the EARNINGS axis]
This sequence provides a geometrical interpretation of a multiple regression model with two explanatory variables.
2
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
[Same 3D diagram]
Specifically, we will look at an earnings function model where hourly earnings, EARNINGS, depend on years of schooling (highest grade completed), S, and a measure of cognitive ability, ASVABC.
3
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
[Same 3D diagram]
The model has three dimensions, one each for EARNINGS, S, and ASVABC. The starting point for investigating the determination of EARNINGS is the intercept, b1.
4
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
[Same 3D diagram]
Literally, the intercept gives EARNINGS for those respondents who have no schooling and who scored zero on the ability test. However, the ability score is scaled in such a way as to make it impossible to score zero. Hence a literal interpretation of b1 would be unwise.
5
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
[3D diagram: the line b1 + b2S, labeled "pure S effect"]
The next term on the right side of the equation gives the effect of variations in S. A one-year increase in S causes EARNINGS to increase by b2 dollars, holding ASVABC constant.
6
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
[3D diagram: the line b1 + b3ASVABC, labeled "pure ASVABC effect"]
Similarly, the third term gives the effect of variations in ASVABC. A one-point increase in ASVABC causes earnings to increase by b3 dollars, holding S constant.
7
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
[3D diagram: the plane b1 + b2S + b3ASVABC, labeled "combined effect of S and ASVABC", together with the pure S and pure ASVABC effects]
Different combinations of S and ASVABC give rise to values of EARNINGS which lie on the plane shown in the diagram, defined by the equation EARNINGS = b1 + b2S + b3ASVABC. This is the nonstochastic (nonrandom) component of the model.
8
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
[3D diagram: an observation EARNINGS = b1 + b2S + b3ASVABC + u, lying a distance u above the plane]
The final element of the model is the disturbance term, u. This causes the actual values of EARNINGS to deviate from the plane. In this observation, u happens to have a positive value.
9
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
[Same 3D diagram, with a sample observation shown]
A sample consists of a number of observations generated in this way. Note that the interpretation of the model does not depend on whether S and ASVABC are correlated or not.
10
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
[Same 3D diagram]
However, we do assume that the effects of S and ASVABC on EARNINGS are additive. The impact of a difference in S on EARNINGS is not affected by the value of ASVABC, or vice versa.
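Since the model is easier to grasp with numbers attached, here is a minimal sketch (not part of the original deck) that simulates the model EARNINGS = b1 + b2S + b3ASVABC + u and recovers the coefficients by least squares. The parameter values and data-generating assumptions are purely illustrative, not Data Set 21.

    # Simulate the two-regressor earnings model and fit it by OLS.
    # All parameter values below are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 570
    S = rng.integers(6, 21, size=n).astype(float)    # years of schooling
    ASVABC = 50 + 10 * rng.standard_normal(n)        # ability score: mean 50, s.d. 10
    u = 7.7 * rng.standard_normal(n)                 # disturbance term
    b1, b2, b3 = -4.6, 0.74, 0.15                    # assumed "true" parameters
    EARNINGS = b1 + b2 * S + b3 * ASVABC + u

    X = np.column_stack([np.ones(n), S, ASVABC])     # regressors, with a constant
    est, *_ = np.linalg.lstsq(X, EARNINGS, rcond=None)
    print(est)                                       # close to (-4.6, 0.74, 0.15)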
11
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
The regression coefficients are derived using the same least squares principle used in simple regression analysis. The fitted value of Y in observation i depends on our choice of b1, b2, and b3.
12
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
The residual ei in observation i is the difference between the actual and fitted values of Y.
13
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
We define RSS, the sum of the squares of the residuals, and choose b1, b2, and b3 so as to minimize it.
14
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
First we expand RSS as shown, and then we use the first-order conditions for minimizing it.
15
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
We thus obtain three equations in three unknowns. Solving for b1, b2, and b3, we obtain the expressions shown above.
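The expressions themselves were not captured in this transcript. For a model with two explanatory variables they take the standard textbook form (a reconstruction, using the deck's notation):

    b_2 = \frac{\mathrm{Cov}(X_2,Y)\,\mathrm{Var}(X_3) - \mathrm{Cov}(X_3,Y)\,\mathrm{Cov}(X_2,X_3)}
               {\mathrm{Var}(X_2)\,\mathrm{Var}(X_3) - \left[\mathrm{Cov}(X_2,X_3)\right]^2}

with b_3 given by the same expression with the subscripts 2 and 3 interchanged, and

    b_1 = \bar{Y} - b_2\bar{X}_2 - b_3\bar{X}_3.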
16
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
The expression for b1 is a straightforward extension of the expression for it in simple regression analysis.
17
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
However, the expressions for the slope coefficients are considerably more complex than that for the slope coefficient in simple regression analysis.
18
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
For the general case when there are many explanatory variables, ordinary algebra is inadequate. It is necessary to switch to matrix algebra.
19
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE

. reg EARNINGS S ASVABC

      Source |       SS       df       MS              Number of obs =     570
-------------+------------------------------           F(  2,   567) =   39.98
       Model |  4745.74965     2  2372.87483           Prob > F      =  0.0000
    Residual |  33651.2874   567  59.3497133           R-squared     =  0.1236
-------------+------------------------------           Adj R-squared =  0.1205
       Total |  38397.0371   569  67.4816117           Root MSE      =  7.7039

------------------------------------------------------------------------------
    EARNINGS |     Coef.   Std. Err.       t     P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
           S |  .7390366   .1606216     4.601   0.000      .4235506    1.054523
      ASVABC |  .1545341   .0429486     3.598   0.000      .0701764    .2388918
       _cons | -4.624749     2.0132    -2.297   0.022     -8.578989   -.6705095
------------------------------------------------------------------------------

Here is the regression output for the earnings function using Data Set 21.
20
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
[Same regression output as the previous slide]
It indicates that earnings increase by $0.74 for every extra year of schooling and by $0.15 for every extra point on the ASVABC scale.
21
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
[Same regression output as above]
Literally, the intercept indicates that an individual who had no schooling and an ASVABC score of zero would have hourly earnings of -$4.62.
22
MULTIPLE REGRESSION WITH TWO EXPLANATORY VARIABLES: EXAMPLE
[Same regression output as above]
Obviously, this is impossible. The lowest value of S in the sample was 6, and the lowest ASVABC score was 22. We have obtained a nonsense estimate because we have extrapolated too far from the data range.
23
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL
[Output of . reg EARNINGS S ASVABC, as shown in the previous sequence]
The output above shows the result of regressing EARNINGS, hourly earnings in dollars, on S, years of schooling, and ASVABC, the cognitive ability score.
24
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL
Suppose that you were particularly interested in the relationship between EARNINGS and S and wished to represent it graphically, using the sample data.
25
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL
[Scatter diagram of EARNINGS against S]
A simple plot, like the one above, would be misleading.
26
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL

. cor S ASVABC
(obs=570)

        |      S  ASVABC
--------+-----------------
      S |  1.0000
 ASVABC |  0.5779  1.0000

There appears to be a strong positive relationship, but it is distorted by the fact that S is positively correlated with ASVABC, which also has a positive effect on EARNINGS.
27
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL
[Same correlation output as above]
We will investigate the distortion mathematically when we come to omitted variable bias.
28
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL

. reg EARNINGS ASVABC

      Source |       SS       df       MS              Number of obs =     570
-------------+------------------------------           F(  1,   568) =   56.78
       Model |  3489.30726     1  3489.30726           Prob > F      =  0.0000
    Residual |  34907.7298   568  61.4572708           R-squared     =  0.0909
-------------+------------------------------           Adj R-squared =  0.0893
       Total |  38397.0371   569  67.4816117           Root MSE      =  7.8395

------------------------------------------------------------------------------
    EARNINGS |     Coef.   Std. Err.       t     P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ASVABC |  .2687432    .035666     7.535   0.000      .1986898    .3387966
       _cons |  -.359883   1.818571    -0.198   0.843     -3.931829    3.212063
------------------------------------------------------------------------------

To eliminate the distortion, you purge both EARNINGS and S of their components related to ASVABC and then draw a scatter diagram using the purged variables.
29
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL
[Same regression output as above]

. predict EEARN, resid

We start by regressing EARNINGS on ASVABC, as shown above. The residuals are the part of EARNINGS which is not related to ASVABC. The "predict" command is the Stata command for saving the residuals from the most recent regression. We name them EEARN.
30
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL

. reg S ASVABC

      Source |       SS       df       MS              Number of obs =     570
-------------+------------------------------           F(  1,   568) =  284.89
       Model |  1153.80864     1  1153.80864           Prob > F      =  0.0000
    Residual |  2300.43873   568  4.05006818           R-squared     =  0.3340
-------------+------------------------------           Adj R-squared =  0.3329
       Total |  3454.24737   569  6.07073351           Root MSE      =  2.0125

------------------------------------------------------------------------------
           S |     Coef.   Std. Err.       t     P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
      ASVABC |  .1545378   .0091559    16.879   0.000      .1365543    .1725213
       _cons |  5.770845   .4668473    12.361   0.000      4.853888    6.687803
------------------------------------------------------------------------------

. predict ES, resid

We do the same with S. We regress it on ASVABC and save the residuals as ES.
31
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL
[Scatter diagram of EEARN against ES, with the trend line in black]
Now we plot EEARN on ES and the scatter is a faithful representation of the relationship, both in terms of the slope of the trend line (the black line) and in terms of the variation about that line.
32
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL
[Same scatter diagram, with the earlier uncontrolled trend line reproduced in gray]
As you would expect, the trend line is flatter than in the scatter diagram which did not control for ASVABC (reproduced here as the gray line).
33
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL

. reg EEARN ES

      Source |       SS       df       MS              Number of obs =     570
-------------+------------------------------           F(  1,   568) =   21.21
       Model |  1256.44239     1  1256.44239           Prob > F      =  0.0000
    Residual |  33651.2873   568  59.2452241           R-squared     =  0.0360
-------------+------------------------------           Adj R-squared =  0.0343
       Total |  34907.7297   569  61.3492613           Root MSE      =  7.6971

------------------------------------------------------------------------------
       EEARN |     Coef.   Std. Err.       t     P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
          ES |  .7390366   .1604802     4.605   0.000      .4238296    1.054244
       _cons | -5.99e-09   .3223957     0.000   1.000     -.6332333    .6332333
------------------------------------------------------------------------------

Here is the regression of EEARN on ES.
34
GRAPHING A RELATIONSHIP IN A MULTIPLE REGRESSION MODEL
[Same regression of EEARN on ES as above]

From the multiple regression:

------------------------------------------------------------------------------
    EARNINGS |     Coef.   Std. Err.       t     P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
           S |  .7390366   .1606216     4.601   0.000      .4235506    1.054523
      ASVABC |  .1545341   .0429486     3.598   0.000      .0701764    .2388918
       _cons | -4.624749     2.0132    -2.297   0.022     -8.578989   -.6705095
------------------------------------------------------------------------------

A mathematical proof that the technique works requires matrix algebra. We will content ourselves with verifying that the estimate of the slope coefficient, and, equally importantly, its standard error and t statistic, are the same as in the multiple regression.
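A minimal numerical sketch of the purging technique (illustrative code, not the deck's; it uses simulated data, since Data Set 21 is not available here). This is the Frisch-Waugh-Lovell result, which is exactly what the slide verifies: the slope from the purged regression equals the coefficient on S in the multiple regression.

    # Verify that regressing purged EARNINGS on purged S reproduces the
    # multiple-regression coefficient on S (Frisch-Waugh-Lovell).
    import numpy as np

    rng = np.random.default_rng(1)
    n = 570
    ASVABC = 50 + 10 * rng.standard_normal(n)
    S = 6 + 0.15 * ASVABC + 2 * rng.standard_normal(n)   # S correlated with ASVABC
    EARNINGS = -4.6 + 0.74 * S + 0.15 * ASVABC + 7.7 * rng.standard_normal(n)

    def resid(y, x):
        """Residuals from an OLS regression of y on a constant and x."""
        X = np.column_stack([np.ones(len(x)), x])
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        return y - X @ b

    EEARN = resid(EARNINGS, ASVABC)          # EARNINGS purged of ASVABC
    ES = resid(S, ASVABC)                    # S purged of ASVABC

    b_purged, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), ES]), EEARN, rcond=None)
    b_multi, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), S, ASVABC]), EARNINGS, rcond=None)
    print(b_purged[1], b_multi[1])           # identical up to floating-point error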
35
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

Provided that the model is correctly specified
and that the Gauss-Markov conditions are
satisfied, the OLS estimators in the multiple
regression model are unbiased, efficient, and
consistent, as in the simple regression model.
36
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

We will not attempt to prove efficiency or
consistency. We will however give a proof of
unbiasedness. The mathematical details of the
proof are unimportant, but you should understand
the general principle.
37
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

The first step, as always, is to substitute for Y
from the true relationship. To save space, we
will write the denominator as D.
38
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

When we decompose the covariance expressions, the terms involving β1 disappear because Cov(X2, β1) and Cov(X3, β1) are both zero.
39
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

This is the last line of the previous slide.
40
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

The slide shows where the terms involving β2 came from.
41
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

Similarly with the β3 terms. Notice that they are going to cancel.
42
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

The coefficient of the true value β2 is D, as defined above. This cancels with the D outside the parentheses. Hence we have decomposed b2 into the true value β2 plus a complicated error term.
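The decomposition (reconstructed here, since the slide's formula did not survive extraction) is

    b_2 = \beta_2 + \frac{\mathrm{Cov}(X_2,u)\,\mathrm{Var}(X_3) - \mathrm{Cov}(X_3,u)\,\mathrm{Cov}(X_2,X_3)}
                         {\mathrm{Var}(X_2)\,\mathrm{Var}(X_3) - \left[\mathrm{Cov}(X_2,X_3)\right]^2}

where β2 denotes the true value of the coefficient.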
43
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

We now examine the expected value of b2 to
determine whether it is unbiased. Assuming that
X2 and X3 are nonstochastic, we can write the
expected value expression as shown.
44
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

E[Cov(X2, u)] and E[Cov(X3, u)] are both zero for the same reasons that E[Cov(X, u)] was zero in the simple regression model (see the second sequence in Chapter 3).
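The reasoning step, sketched for X2 (and identically for X3), assuming the X variables are nonstochastic:

    E\big[\mathrm{Cov}(X_2,u)\big] = E\Big[\tfrac{1}{n}\sum_i (X_{2i}-\bar{X}_2)(u_i-\bar{u})\Big]
                                   = \tfrac{1}{n}\sum_i (X_{2i}-\bar{X}_2)\,E(u_i-\bar{u}) = 0,

since E(u_i) = 0 in every observation.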
45
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

Hence E(b2) is equal to the true value β2, and so b2 is an unbiased estimator. Similarly for b3.
46
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

Finally, we will show that b1 is an unbiased estimator of the true value β1. This is quite simple, so you should attempt to do this yourself, before looking at the rest of this sequence.
47
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

First substitute for the sample mean of Y.
48
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

Now take expectations.
49
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

The expected value of the mean of the disturbance term is zero since E(u) is zero in each observation. We have just shown that E(b2) is equal to β2 and that E(b3) is equal to β3.
50
PROPERTIES OF THE MULTIPLE REGRESSION COEFFICIENTS

Hence b1 is an unbiased estimator of β1.
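For reference, the substitution runs as follows (a sketch; the slide's own algebra was not captured). Since \bar{Y} = \beta_1 + \beta_2\bar{X}_2 + \beta_3\bar{X}_3 + \bar{u},

    b_1 = \bar{Y} - b_2\bar{X}_2 - b_3\bar{X}_3
        = \beta_1 + (\beta_2 - b_2)\bar{X}_2 + (\beta_3 - b_3)\bar{X}_3 + \bar{u},

and taking expectations, E(b_1) = \beta_1 because E(b_2) = \beta_2, E(b_3) = \beta_3, and E(\bar{u}) = 0.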
51
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

This sequence investigates the population
variances and standard errors of the slope
coefficients in a model with two explanatory
variables. The expression for the population
variance of b2 is shown above. The expression
for b3 is the same, with the subscripts 2 and 3
interchanged.
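The expression referred to (reconstructed; the slide's formula was not captured) is

    \sigma_{b_2}^2 = \frac{\sigma_u^2}{n\,\mathrm{Var}(X_2)} \cdot \frac{1}{1 - r_{X_2 X_3}^2}

where r_{X_2 X_3} is the sample correlation between X2 and X3.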
52
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

The first factor in the expression is identical
to that for the population variance of the slope
coefficient in a simple regression model. The
population variance of b2 depends on the
population variance of the disturbance term, the
number of observations, and the variance of X2
for exactly the same reasons as in a simple
regression model.
53
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

The difference is that in multiple regression
analysis the expression is multiplied by a factor
which depends on the correlation between X2 and
X3. The higher is the correlation between the
explanatory variables, the greater will be the
population variance. This is easy to understand
intuitively. The greater the correlation, the
harder it is to discriminate between the effects
of the explanatory variables on Y, and the less
accurate will be the regression estimates. Note
that the population variance expression above is
valid only for a model with two explanatory
variables. When there are more than two, the
expression becomes much more complex and it is
sensible to switch to matrix algebra.
54
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

The standard deviation of the distribution of b2
is of course given by the square root of the
population variance.
55
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

With the exception of the population variance of
u, we can calculate the components of the
standard deviation from the sample data.
56
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

The population variance of u has to be estimated.
The sample variance of the residuals provides a
consistent estimator, but it is biased downwards
by a factor (n-k)/n in a finite sample, where k
is the number of parameters.
57
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

Obviously we can obtain an unbiased estimator by multiplying the sample variance of the residuals by a factor n/(n-k). We denote this unbiased estimator su².
58
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

Thus the estimate of the standard deviation of
the probability distribution of b2, known as the
standard error of b2 for short, is given by the
expression above.
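Putting the pieces together (a reconstruction of the elided formulas):

    s_u^2 = \frac{n}{n-k}\,\mathrm{Var}(e) = \frac{RSS}{n-k},
    \qquad
    \mathrm{s.e.}(b_2) = \sqrt{\frac{s_u^2}{n\,\mathrm{Var}(X_2)} \cdot \frac{1}{1 - r_{X_2 X_3}^2}}.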
59
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

. reg EARNINGS S ASVABC if COLLBARG==0

      Source |       SS       df       MS              Number of obs =     507
-------------+------------------------------           F(  2,   504) =   40.31
       Model |  4966.96516     2  2483.48258           Prob > F      =  0.0000
    Residual |  31052.2066   504  61.6115211           R-squared     =  0.1379
-------------+------------------------------           Adj R-squared =  0.1345
       Total |  36019.1718   506   71.184134           Root MSE      =  7.8493

------------------------------------------------------------------------------
    EARNINGS |     Coef.   Std. Err.       t     P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
           S |  .8891909   .1741617     5.106   0.000      .5470186    1.231363
      ASVABC |  .1398727   .0461806     3.029   0.003      .0491425    .2306029
       _cons | -6.100961    2.15968    -2.825   0.005     -10.34404   -1.857877
------------------------------------------------------------------------------

We will use this expression to analyze why the standard error of S in an earnings function regression is smaller for the non-union subsample than for the union subsample, using Data Set 21.
60
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
[Same regression output as above]
To select a subsample in Stata, you add an "if" statement to a command. The COLLBARG variable is equal to 1 for respondents whose rates of pay are determined by collective bargaining, and it is 0 for the others.
61
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
[Same regression output as above]
Note that "equals" in Stata is rendered as a double equals sign (==), for some arcane reason.
62
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
[Same regression output as above]
In the case of the non-union subsample, the standard error of S is 0.1742.
63
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

. reg EARNINGS S ASVABC if COLLBARG==1

      Source |       SS       df       MS              Number of obs =      63
-------------+------------------------------           F(  2,    60) =    2.58
       Model |  172.902083     2  86.4510417           Prob > F      =  0.0844
    Residual |  2012.88504    60  33.5480841           R-squared     =  0.0791
-------------+------------------------------           Adj R-squared =  0.0484
       Total |  2185.78713    62  35.2546311           Root MSE      =  5.7921

------------------------------------------------------------------------------
    EARNINGS |     Coef.   Std. Err.       t     P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
           S | -.3872787   .3530145    -1.097   0.277     -1.093413    .3188555
      ASVABC |  .2309133   .1019211     2.266   0.027      .0270407    .4347858
       _cons |  8.291716   4.869209     1.703   0.094     -1.448152    18.03158
------------------------------------------------------------------------------

In the case of the union subsample, the standard error of S is 0.3530, twice as large.
64
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

Decomposition of the standard error of S

Component:         su        n    Var(S)   rS,ASVABC      s.e.
  Non-union                                               0.1742
  Union                                                   0.3530

Factor:
  Non-union
  Union

We will explain the difference by looking at the components of the standard error.
65
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
We will start with su.
66
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
We have replaced Var(e) by the mathematical
expression for a variance.
67
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
The mean of the residuals in an OLS regression
must be zero.
68
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
Hence our estimator of the population variance of
the disturbance term can be written as the
residual sum of squares, RSS, divided by n-k.
69
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
[Non-union regression output as above; Residual SS = 31052.2066]
Here is RSS for the non-union subsample.
70
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
[Same regression output as above]
There are 507 observations in the non-union subsample. k is equal to 3. Thus n-k is equal to 504.
71
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
[Same regression output as above]
Thus RSS/(n-k) is equal to 61.6115. To obtain su, we take the square root. This is 7.8493.
72
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

Decomposition of the standard error of S

Component:         su        n    Var(S)   rS,ASVABC      s.e.
  Non-union    7.8493      507                            0.1742
  Union                                                   0.3530

Factor:
  Non-union
  Union

We place this in the table, along with the number of observations.
73
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
[Union regression output as above; Residual MS = 33.5480841, Root MSE = 5.7921]
Similarly, in the case of the union subsample, su is the square root of 33.54808, which is 5.7921. We also note that the number of observations in that subsample is 63.
74
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

Decomposition of the standard error of S

Component:         su        n    Var(S)   rS,ASVABC      s.e.
  Non-union    7.8493      507                            0.1742
  Union        5.7921       63                            0.3530

Factor:
  Non-union
  Union

We place these in the table.
75
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

Decomposition of the standard error of S

Component:         su        n    Var(S)   rS,ASVABC      s.e.
  Non-union    7.8493      507    6.0645                  0.1742
  Union        5.7921       63    6.0136                  0.3530

Factor:
  Non-union
  Union

We calculate the sample variance of S for the two subsamples from the sample data.
76
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

. cor S ASVABC if COLLBARG==0
(obs=507)

        |      S  ASVABC
--------+-----------------
      S |  1.0000
 ASVABC |  0.5826  1.0000

. cor S ASVABC if COLLBARG==1
(obs=63)

        |      S  ASVABC
--------+-----------------
      S |  1.0000
 ASVABC |  0.5380  1.0000

The correlation coefficients for S and ASVABC are 0.5826 and 0.5380 for the non-union and union subsamples, respectively. (Note that "cor" is the Stata command for computing correlations.)
77
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

Decomposition of the standard error of S

Component:         su        n    Var(S)   rS,ASVABC      s.e.
  Non-union    7.8493      507    6.0645      0.5826      0.1742
  Union        5.7921       63    6.0136      0.5380      0.3530

Factor:
  Non-union
  Union

These entries complete the top half of the table. We will now look at the impact of each item on the standard error, using the mathematical expression at the top.
78
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

Decomposition of the standard error of S

Component:         su        n    Var(S)   rS,ASVABC      s.e.
  Non-union    7.8493      507    6.0645      0.5826      0.1742
  Union        5.7921       63    6.0136      0.5380      0.3530

Factor:
  Non-union    7.8493
  Union        5.7921

The su component needs no modification. It is relatively large for the non-union subsample, having an adverse effect on the standard error.
79
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

Decomposition of the standard error of S

Component:         su        n    Var(S)   rS,ASVABC      s.e.
  Non-union    7.8493      507    6.0645      0.5826      0.1742
  Union        5.7921       63    6.0136      0.5380      0.3530

Factor:
  Non-union    7.8493   0.0444
  Union        5.7921   0.1260

The number of observations is much larger for the non-union subsample, so the second factor is much smaller than that for the union subsample.
80
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

Decomposition of the standard error of S

Component:         su        n    Var(S)   rS,ASVABC      s.e.
  Non-union    7.8493      507    6.0645      0.5826      0.1742
  Union        5.7921       63    6.0136      0.5380      0.3530

Factor:
  Non-union    7.8493   0.0444    0.4061
  Union        5.7921   0.1260    0.4078

Perhaps a little surprisingly, the variance in schooling is about the same for both subsamples.
81
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

Decomposition of the standard error of S

Component:         su        n    Var(S)   rS,ASVABC      s.e.
  Non-union    7.8493      507    6.0645      0.5826      0.1742
  Union        5.7921       63    6.0136      0.5380      0.3530

Factor:
  Non-union    7.8493   0.0444    0.4061      1.2304
  Union        5.7921   0.1260    0.4078      1.1863

The correlation between schooling and ASVABC is greater for the non-union subsample, and this has an adverse effect on its standard error.
82
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS

Decomposition of the standard error of S

Component:         su        n    Var(S)   rS,ASVABC      s.e.
  Non-union    7.8493      507    6.0645      0.5826      0.1742
  Union        5.7921       63    6.0136      0.5380      0.3530

Factor:                                                  product
  Non-union    7.8493   0.0444    0.4061      1.2304      0.1741
  Union        5.7921   0.1260    0.4078      1.1863      0.3531

Multiplying the four factors together, we obtain the standard errors. (The discrepancies in the last digit have been caused by rounding error.)
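The multiplication is easy to check (a verification sketch, not part of the deck, using only the components tabulated above):

    # Reproduce the standard errors from the four factors:
    # s.e.(b2) = su * 1/sqrt(n) * 1/sqrt(Var(S)) * 1/sqrt(1 - r^2)
    import math

    for label, su, n, varS, r in [("Non-union", 7.8493, 507, 6.0645, 0.5826),
                                  ("Union",     5.7921,  63, 6.0136, 0.5380)]:
        se = su / math.sqrt(n) / math.sqrt(varS) / math.sqrt(1 - r ** 2)
        print(label, round(se, 4))    # 0.1742 and 0.353, matching up to rounding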
83
PRECISION OF THE MULTIPLE REGRESSION COEFFICIENTS
[Same table as above]
We see that the reason that the standard error is smaller for the non-union subsample is that there are far more observations than in the union subsample. Otherwise the standard error would have been slightly greater.
84
MULTICOLLINEARITY

 X2   X3    Y
 10   19   51
 11   21   56
 12   23   61
 13   25   66
 14   27   71
 15   29   76

Suppose that Y = 2 + 3X2 + X3 and that X3 = 2X2 - 1. There is no disturbance term in the equation for Y, but that is not important. Suppose that we have the six observations shown.
85
MULTICOLLINEARITY
[Line graphs of Y, X3, and X2 plotted against observation number]
The three variables are plotted as line graphs above. Looking at the data, it is impossible to tell whether the changes in Y are caused by changes in X2, by changes in X3, or jointly by changes in both X2 and X3.
86
MULTICOLLINEARITY

 X2   X3    Y    Change in X2   Change in X3   Change in Y
 10   19   51          1              2              5
 11   21   56          1              2              5
 12   23   61          1              2              5
 13   25   66          1              2              5
 14   27   71          1              2              5
 15   29   76          1              2              5

Numerically, Y increases by 5 in each observation. X2 changes by 1.
87
MULTICOLLINEARITY
[Line graphs as before, with the line Y = 1 + 5X2 superimposed]
Hence the true relationship could have been Y = 1 + 5X2.
88
MULTICOLLINEARITY
[Table of observations and changes, repeated from above]
However, it can also be seen that X3 increases by 2 in each observation.
89
MULTICOLLINEARITY
[Line graphs as before, with the line Y = 3.5 + 2.5X3 superimposed]
Hence the true relationship could have been Y = 3.5 + 2.5X3.
90
MULTICOLLINEARITY
[Line graphs as before, with Y = 3.5 - 2.5p + 5pX2 + 2.5(1-p)X3 superimposed]
These two possibilities are special cases of Y = 3.5 - 2.5p + 5pX2 + 2.5(1-p)X3, which would fit the relationship for any value of p.
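To check that every member of this family fits the data, substitute X3 = 2X2 - 1:

    Y = 3.5 - 2.5p + 5pX_2 + 2.5(1-p)(2X_2 - 1)
      = (3.5 - 2.5p - 2.5 + 2.5p) + (5p + 5 - 5p)X_2
      = 1 + 5X_2,

which holds for all six observations whatever the value of p.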
91
MULTICOLLINEARITY
[Line graphs as before]
There is no way that regression analysis, or any other technique, could determine the true relationship from this infinite set of possibilities, given the sample data.
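A minimal numerical illustration (not in the deck): with X3 = 2X2 - 1 the data matrix has rank 2 rather than 3, so X'X is singular and the normal equations have no unique solution.

    # Exact multicollinearity: the six observations from the slide.
    import numpy as np

    X2 = np.array([10., 11., 12., 13., 14., 15.])
    X3 = 2 * X2 - 1
    Y = 2 + 3 * X2 + X3

    X = np.column_stack([np.ones(6), X2, X3])
    print(np.linalg.matrix_rank(X))     # 2, not 3: exact linear dependence
    print(np.linalg.det(X.T @ X))       # ~0: X'X is singular

    # lstsq still returns one least-squares solution (the minimum-norm one),
    # but it is just one member of the infinite family identified above.
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    print(b)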
92
MULTICOLLINEARITY
What would happen if you tried to run a regression when there is an exact linear relationship among the explanatory variables?
93
MULTICOLLINEARITY
We will investigate, using the model with two explanatory variables shown above. Note: a disturbance term has now been included in the true model, but it makes no difference to the analysis.
94
MULTICOLLINEARITY

The multiple regression coefficient b2 is
calculated as shown.
95
MULTICOLLINEARITY
Substitute for X3 (equal to l + mX2, where l and m are constants) wherever it appears.
96
MULTICOLLINEARITY

Using Variance Rule 4, we can drop the additive l
in the variance terms.
97
MULTICOLLINEARITY

Likewise, using Covariance Rules 1 and 3, we can
drop the additive l in the covariance terms.
98
MULTICOLLINEARITY

This is the last line from the previous slide.
99
MULTICOLLINEARITY

We can take m out of the variance terms, squaring
it as we do.
100
MULTICOLLINEARITY

Likewise we can take m out of the covariance
terms.
101
MULTICOLLINEARITY

Cov(X2, X2) is the same as Var(X2).
102
MULTICOLLINEARITY

It turns out that both the numerator and the
denominator are equal to zero. The regression
coefficient is not defined.
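Both facts follow directly from the variance and covariance rules (a reconstruction of the elided algebra). With X3 = l + mX2:

    \mathrm{Var}(X_3) = m^2\,\mathrm{Var}(X_2), \quad
    \mathrm{Cov}(X_2, X_3) = m\,\mathrm{Var}(X_2), \quad
    \mathrm{Cov}(X_3, Y) = m\,\mathrm{Cov}(X_2, Y),

so the denominator is \mathrm{Var}(X_2)\,m^2\,\mathrm{Var}(X_2) - [m\,\mathrm{Var}(X_2)]^2 = 0, and the numerator is \mathrm{Cov}(X_2,Y)\,m^2\,\mathrm{Var}(X_2) - m\,\mathrm{Cov}(X_2,Y)\cdot m\,\mathrm{Var}(X_2) = 0.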
103
MULTICOLLINEARITY
It is unusual for there to be an exact relationship among the explanatory variables in a regression. When this occurs, it is typically because there is a logical error in the specification.
104
MULTICOLLINEARITY

. reg EARNINGS S ASVABC ASVAB5

      Source |       SS       df       MS              Number of obs =     570
-------------+------------------------------           F(  3,   566) =   27.66
       Model |  4909.11468     3  1636.37156           Prob > F      =  0.0000
    Residual |  33487.9224   566  59.1659406           R-squared     =  0.1279
-------------+------------------------------           Adj R-squared =  0.1232
       Total |  38397.0371   569  67.4816117           Root MSE      =  7.6919

------------------------------------------------------------------------------
    EARNINGS |     Coef.   Std. Err.       t     P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
           S |  .7115506   .1612235     4.413   0.000      .3948811     1.02822
      ASVABC |  .1104595   .0504223     2.191   0.029      .0114219    .2094972
      ASVAB5 |  .0770794   .0463868     1.662   0.097     -.0140319    .1681908
       _cons | -5.944977   2.161409    -2.751   0.006     -10.19034   -1.699616
------------------------------------------------------------------------------

However, it often happens that there is an approximate relationship. Here is a regression of EARNINGS on S, ASVABC, and ASVAB5. ASVAB5 is the score on a speed test of the ability to perform very simple arithmetical computations.
105
MULTICOLLINEARITY
[Same regression output as above]
Like ASVABC, the raw scores on this test were scaled so that they had mean 50 and standard deviation 10.
106
MULTICOLLINEARITY
[Same regression output as above]
The regression result indicates that an extra year of schooling increases hourly earnings by $0.71.
107
MULTICOLLINEARITY
[Same regression output as above]
An extra point on ASVABC increases hourly earnings by $0.11. Someone with a score one standard deviation above the mean would therefore tend to earn an extra $1.10 per hour, compared with someone at the mean.
108
MULTICOLLINEARITY
[Same regression output as above]
An extra point on the numerical computation speed test increases hourly earnings by $0.08.
109
MULTICOLLINEARITY
[Same regression output as above]
Does ASVAB5 belong in the earnings function? If we perform a t test, we find that its coefficient is just significantly different from zero at the 5% level, using a one-tailed test. (Justification: it is unlikely that a good score on this test would adversely affect earnings.)
110
MULTICOLLINEARITY
[Same regression output as above]
We note that in this regression the coefficient of ASVABC is significant only at the 5% level.
111
MULTICOLLINEARITY
[Output of . reg EARNINGS S ASVABC, as shown earlier]
In the regression without ASVAB5, its t statistic was 3.60, making it significantly different from zero at the 0.1% level.
112
MULTICOLLINEARITY
[Same regression with ASVAB5 as above]

. cor ASVABC ASVAB5
(obs=570)

        |  ASVABC  ASVAB5
--------+-----------------
 ASVABC |  1.0000
 ASVAB5 |  0.6371  1.0000

The reason for the reduction in its t ratio is that it is highly correlated with ASVAB5.