Title: MULTIPLE REGRESSION OF TIME SERIES
CHAPTER 10
- MULTIPLE REGRESSION OF TIME SERIES
- LINEAR MULTIPLE REGRESSION MODEL
- General Multiple Regression Model
BIG CITY BOOKSTORE EXAMPLE
- Multiple Regression Model
- Coefficient of Multiple Determination (R2)
- Partial Regression Coefficients
- Describing the Regression Plane
THE MULTIPLE REGRESSION MODELING PROCESS
- MULTICOLLINEARITY
- Collinearity Among Variables
- Solutions to Multicollinearity Problems
- An Example Solution
- PARTIAL F-TEST FOR INCLUDING VARIABLES
- SERIAL CORRELATION PROBLEMS
- Forecasting with Serially Correlated Errors
ANALYSIS OF STOCK INDEXES USING COILS
- ELASTICITIES AND LOGARITHMIC RELATIONSHIPS
- HETEROSCEDASTICITY
- Wrong Functional Form: UK and US Stock Indexes
- Goldfeld-Quandt Test
- Interpretation of Elasticities
- WEIGHTED LEAST SQUARES
- GENERALIZED LEAST SQUARES
- BETA COEFFICIENTS
- DICHOTOMOUS (DUMMY) VARIABLES FOR MODELING EVENTS
- CONSTRUCTING CONFIDENCE AND PREDICTION INTERVALS
- PARSIMONY AND REGRESSION ANALYSIS
- AUTOMATED REGRESSION METHODS
CHAPTER 10: MULTIPLE REGRESSION OF TIME SERIES
- "Theorize longer, analyze shorter. Don't be in a rush to run the program. Think about the model from every angle, hypothesize how different variables affect each other. When you have a theory, then try it. Impatience is the enemy of valid models. Contemplation is productive work." - The Author
- "Measure twice, cut once." - The Carpenter's Rule
GENERAL LINEAR MULTIPLE REGRESSION
- General Multiple-Regression Model:
  Y = a + b1X1 + b2X2 + ... + bnXn + e   (10-1)
- "n" is rarely above 6 to 7. (A fitting sketch follows.)
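Eq. 10-1 is fit by ordinary least squares. As a minimal sketch (the sample size, data, and coefficient values below are invented for illustration), the model can be estimated with NumPy's least-squares solver; any OLS routine gives the same answer once the design matrix has a column of ones for the intercept a:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
X = rng.normal(size=(n, 3))                    # X1, X2, X3
y = 5.0 + 1.5*X[:, 0] - 2.0*X[:, 1] + 0.7*X[:, 2] + rng.normal(size=n)

A = np.column_stack([np.ones(n), X])           # design matrix: [1, X1, X2, X3]
coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # returns [a, b1, b2, b3]
print(coef)
```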
BIG CITY BOOKSTORE EXAMPLE
- Table 10-1. Big City Bookstore
  Year   Sales(Y)   Advertising(X1)   Competition(X2)
         (1000)     (1000)            (1000 sq. ft.)
   1       27          20                 10
   2       23          20                 15
   3       31          25                 15
   4       45          28                 15
   5       47          29                 20
   6       42          28                 25
   7       39          31                 35
   8       45          34                 35
   9       57          35                 20
  10       59          36                 30
  11       73          41                 20
  12       84          45                 20
Multiple Regression Model
- Correlation Matrix:
                 Sales   Advertising   Competition
  Sales          1       .964          .221
  Advertising    .964    1             .426
  Competition    .221    .426          1
- SALES = f(ADV.), IGNORING COMPETITION:
  y = -23.02 + 2.280X1   (10-2)
      (-3.64)  (11.5)        (X1 = advertising; t-values in parentheses)
  Syx = 5.039   R2 = .923   n = 12   F = 132.26   DW = 1.137
- SALES = f(COMP.), IGNORING ADVERTISING:
  y = 37.34 + .477X2   (10-3)
      (2.339)  (.687)       (X2 = competition)
  Syx = 18.574   R2 = -.050   n = 12   F = .472   DW = .377
- SALES = f(ADV., COMP.) SIMULTANEOUSLY:
  y = -18.80 + 2.525X1 - .545X2   (10-4)
      (-4.879)  (19.50)   (-4.432)
  Syx = 2.978   R2 = .973   n = 12   F = 199.21   DW = 1.771
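The three equations above can be reproduced from the Table 10-1 data with any OLS routine. A sketch using statsmodels (one common choice, not necessarily the package used in the text):

```python
import numpy as np
import statsmodels.api as sm

sales = np.array([27, 23, 31, 45, 47, 42, 39, 45, 57, 59, 73, 84], dtype=float)
adv   = np.array([20, 20, 25, 28, 29, 28, 31, 34, 35, 36, 41, 45], dtype=float)
comp  = np.array([10, 15, 15, 15, 20, 25, 35, 35, 20, 30, 20, 20], dtype=float)

print(sm.OLS(sales, sm.add_constant(adv)).fit().params)    # Eq. 10-2: ~[-23.02, 2.280]
print(sm.OLS(sales, sm.add_constant(comp)).fit().params)   # Eq. 10-3: ~[37.34, 0.477]
both = sm.add_constant(np.column_stack([adv, comp]))
print(sm.OLS(sales, both).fit().params)                    # Eq. 10-4: ~[-18.80, 2.525, -0.545]
```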
- Table 10-2. Simple and Multiple Regression for Big City Bookstore
  a) Linear Regression: Sales = f(Advertising), Eq. 10-2
     Dependent Variable               SALES
     Usable Observations              12       Degrees of Freedom   10
     R Bar2                           0.9227
     Std Error of Dependent Variable  18.1225
     Standard Error of Estimate       5.0394
     Sum of Squared Residuals         253.9530
     Regression F(1,10)               132.26
     Significance Level of F          0.00000044
     Durbin-Watson Statistic          1.137

     Variable   Coeff    Std Error   T-Stat   Signif
     Constant   -23.02   6.316       -3.644   0.0045
     ADVERT     2.28     0.198       11.500   0.0000
  b) Linear Regression: Sales = f(Competition), Eq. 10-3
     Dependent Variable               SALES
     Usable Observations              12       Degrees of Freedom   10
     R Bar2                           -0.050
     Std Deviation of Dependent Var.  18.123
     Standard Error of Estimate       18.574
     Sum of Squared Residuals         3449.780
     Regression F(1,10)               0.472
     Significance Level of F          0.5076
     Durbin-Watson Statistic          0.377

     Variable   Coeff     Std Error   T-Stat   Signif
     Constant   37.3372   15.960      2.339    0.0414
     COMP       0.4767    0.694       0.687    0.5076
  c) Multiple Regression: Sales = f(Advertising, Competition), Eq. 10-4
     Dependent Variable               SALES
     Usable Observations              12       Degrees of Freedom   9
     R Bar2                           0.9730
     Std Deviation of Dependent Var.  18.1225
     Standard Error of Estimate       2.978
     Sum of Squared Residuals         79.803
     Regression F(2,9)                199.2155
     Significance Level of F          0.00000004
     Durbin-Watson Statistic          1.771

     Variable   Coeff      Std Error   T-Stat   Signif
     Constant   -18.7958   3.8520      -4.880   0.0009
     ADVERT     2.5248     0.1295      19.495   0.0000
     COMP       -0.5449    0.1230      -4.432   0.0016
Multiple Coefficient of Determination (R2)
- R2 = (Explained Variance)/(Total Variance) = 1 - (Unexplained Variance)/(Total Variance) = 1 - Syx^2/Sy^2
- Partial Regression Coefficients (holding advertising fixed at X1 = 30):
  Y = -18.80 + 2.525(30) - .545X2 + e
  Y = -18.80 + 75.75 - .545X2 + e
  Y = 56.95 - .545X2 + e   (10-5)
  (A short computation follows.)
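Eq. 10-5 in code, with coefficients taken from Eq. 10-4 (the variable names here are illustrative):

```python
# Holding ADVERT fixed at X1 = 30, the fitted plane of Eq. 10-4
# collapses to a line in COMP alone.
a, b1, b2 = -18.80, 2.525, -0.545   # coefficients from Eq. 10-4
x1 = 30                             # advertising held constant
intercept = a + b1 * x1             # -18.80 + 75.75 = 56.95
print(f"Sales = {intercept:.2f} + ({b2})*X2")
```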
- [Figure 10-2 here: Deviations About a Plane or Hyperplane]
- Describing the Regression Plane
- [Figure 10-3 here: Regression Plane for Equation 10-4]
- [Figure 10-4 here: Several Regression Lines on the Regression Plane]
- [Figure 10-5 here: The Multiple Regression Modeling Process]
MULTIPLE REGRESSION MODELING - DIAGNOSTIC PLOTS
- Residuals vs. included independent variables: detect heteroscedasticity, misspecification (nonlinearity)
- Residuals vs. excluded independent variables: detect variables to be included, misspecification (nonlinearity)
- Residuals vs. fitted Y: detect serial correlation, heteroscedasticity, misspecification
- Residuals(t) vs. Residuals(t-1): detect serial correlation, out-of-sample projection problems, unreasonable forecasts
(A plotting sketch follows.)
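A sketch of these four plots using matplotlib and statsmodels; the data, the included regressor x_in, and the deliberately excluded candidate z_out are all hypothetical:

```python
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
n = 80
x_in, z_out = rng.normal(size=n), rng.normal(size=n)
y = 2.0 + 1.0 * x_in + 0.8 * z_out + rng.normal(scale=0.5, size=n)

fit = sm.OLS(y, sm.add_constant(x_in)).fit()   # z_out deliberately excluded
r, yhat = fit.resid, fit.fittedvalues

fig, ax = plt.subplots(2, 2, figsize=(8, 6))
ax[0, 0].scatter(x_in, r);       ax[0, 0].set_title("Residuals vs included X")
ax[0, 1].scatter(z_out, r);      ax[0, 1].set_title("Residuals vs excluded variable")
ax[1, 0].scatter(yhat, r);       ax[1, 0].set_title("Residuals vs fitted Y")
ax[1, 1].scatter(r[:-1], r[1:]); ax[1, 1].set_title("Residuals(t) vs Residuals(t-1)")
plt.tight_layout()
plt.show()
```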
MULTICOLLINEARITY
- Highly related independent variables may or may not be a problem; when they are, a coefficient can be wrong, and so can its standard error:
  Sb1 = Syx / sqrt(Σx1^2 (1 - r12^2))   (10-6)
  where x1 is measured as deviations from its mean.
- As r12 approaches 1, it is impossible to fit a unique model because of the variables' redundancy.
- Multicollinearity Problems (MCP):
  - May not be evident to the analyst
  - Yield wrong signs or insignificant t-values
- Avoid MCPs by:
  - Good theory
  - Large sample sizes
  - Good diagnostic procedures
- Sometimes MCPs are simply an artifact of the sample
DETECTING MULTICOLLINEARITY
- Insignificant or incorrect regression coefficients
- Some strange regression results from MCPs. Assume the correlation between X1 and X2 is high, and each is highly correlated with Y:
  - Often one regression coefficient is negative despite the positive relationship
  - Often one variable is highly significant, the other not
  - Often the sum of the regression coefficients equals the true, single-variable regression coefficient
COLLINEARITY AMONG MORE THAN TWO VARIABLES
- What is the problem? Consider that some form of linear transformation of X2 and X3 perfectly defines X1:
  X1 = a + b2X2 + b3X3
  with Syx = 0, R2 = 1, and r1.23 = 1. Thus, when an attempt is made to fit the following relationship, a solution is not possible.
- Y = a + b1X1 + b2X2 + b3X3 + e
  With perfect collinearity, the estimation procedure aborts. This often occurs with dichotomous variables:
  t    Yt   d1   d2   d3   d4
  1    10   1    0    0    0
  2    20   0    1    0    0
  3    30   0    0    1    0
  4     5   0    0    0    1
- Thus, d1 = 1 - d2 - d3 - d4, and a solution is impossible.
- Avoid this by always defining one less dummy variable than there are categories.
- The omitted variable's effect becomes part of the constant.
- Dummy variables are studied later in this chapter. (A short illustration follows.)
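A short illustration of the trap and its standard fix, using pandas with hypothetical quarterly categories (the drop_first option implements "one less variable"):

```python
import pandas as pd

quarter = pd.Series(["Q1", "Q2", "Q3", "Q4"] * 3, name="quarter")

trap = pd.get_dummies(quarter)                    # four dummies: d1 = 1 - d2 - d3 - d4,
                                                  # perfectly collinear with the constant
safe = pd.get_dummies(quarter, drop_first=True)   # three dummies; Q1 folds into the constant
print(trap.head(4))
print(safe.head(4))
```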
Solutions to Multicollinearity Problems
- With redundant measures, delete the redundant variable. Good theory precludes most redundant variables.
- Some MCPs are an artifact of a specific sample. Additional observations may then eliminate the problem.
- Some MCPs come from flawed theories. When variables represent different dimensions of one influence, they might be combined using factor analysis.
- When an MCP is caused by a unique sample, consider ridge regression (sketched below).
- When theory dictates that both variables should be included, include them. While an MCP affects regression coefficients and their interpretability, it might not alter the predictive power of the regression model. That is, the overall relationship may still be useful for prediction, as confirmed by a low standard error of estimate and a high F-value. To better understand this, consider the example below.
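A minimal ridge sketch with scikit-learn on simulated collinear data (the alpha value and all data are illustrative, not a recommendation):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(5)
x1 = rng.normal(size=60)
x2 = x1 + rng.normal(scale=0.05, size=60)      # nearly a copy of x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(size=60)

ridge = Ridge(alpha=1.0).fit(X, y)             # alpha > 0 shrinks collinear coefficients
print(ridge.intercept_, ridge.coef_)           # roughly splits the true effect, not erratic
```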
An Example of MCPs (MULT.DAT)
- Table 10-3. Correlation Matrix
        X1        X2        X3        X4        Y
  X1    1.0000   -0.1067    0.1821    0.9998    0.4622
  X2   -0.1067    1.0000    0.1031   -0.1053    0.7479
  X3    0.1821    0.1031    1.0000    0.1830    0.5334
  X4    0.9998   -0.1053    0.1830    1.0000    0.4638
  Y     0.4622    0.7479    0.5334    0.4638    1.0000
- Table 10-4. Models Illustrating Multicollinearity Problems
  (t-values in parentheses under each coefficient; F significance in parentheses under the F-value)

  Model   X1        X2        X3        X4        R2      F-value   Syx
  M1                9.16      6.06                76.51   162.2     2166.5
                    (14.3)    (9.4)                       (.0000)
  M2      14.82     9.96      4.84                98.51   2188.9    544.9
          (37.9)    (61.26)   (29.33)                     (.0000)
  M3                9.95      4.83      14.83     98.53   2217.0    541.4
                    (61.61)   (29.47)   (38.17)           (.0000)
  M4      -7.07     9.94      4.83      21.89     98.52   1648.3    543.8
          (-.38)    (61.18)   (29.30)   (1.17)            (.0000)
  (A simulation sketch follows.)
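A simulation sketch of the Table 10-4 pattern; this is synthetic data standing in for MULT.DAT, with x4 a near-copy of x1 (as r = .9998 above), so M4-style models split one true effect across two unstable terms:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 100
x1 = rng.normal(size=n)
x4 = x1 + rng.normal(scale=0.02, size=n)       # nearly perfectly collinear with x1
y = 15.0 * x1 + rng.normal(size=n)

alone = sm.OLS(y, sm.add_constant(x1)).fit()
both  = sm.OLS(y, sm.add_constant(np.column_stack([x1, x4]))).fit()
print(alone.params[1], alone.bse[1])           # stable coefficient near 15
print(both.params[1:], both.bse[1:])           # erratic pair whose sum is still ~15
```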
Partial F-test for Including Variables
- DETERMINING WHETHER VARIABLES SHOULD BE IN A RELATIONSHIP: TEST WHETHER m VARIABLES SHOULD BE INCLUDED
  F-calculated = [(SSER - SSEU)/m] / [SSEU/(n-k-1)]   (10-7)
- where:
  SSEU = Sum of Squared Errors with all variables in the relationship, called the unrestricted SSE
  SSER = Sum of Squared Errors with m variables excluded, called the restricted SSE
  k = total number of independent variables in the unrestricted model
  m = number of restricted (excluded) independent variables
  alpha = chosen level of significance, typically .01 or .05
- This test is used as follows:
  - Estimate a full, unrestricted k-variable model; capture the SSEU.
  - Estimate a partial, restricted model with k-m variables; capture the SSER.
  - Calculate F using Eq. 10-7 and compare it to the F-table value with degrees of freedom (m, n-k-1) and the chosen alpha, that is, F(m, n-k-1, alpha).
  - If F-calculated > F-table, then SSER is significantly greater than SSEU.
- This denotes that the unexplained variance of the restricted model is greater than that of the unrestricted model.
- If F-calculated < F-table, then SSER ≈ SSEU; there is no significant additional explained variance from the unrestricted model.
- Again, if F-calculated > F-table, then SSER > SSEU; there is additional explained variance from the unrestricted model.
- Consider Big City Bookstore with and without Competition.
- F-calculated = [(SSER - SSEU)/m] / [SSEU/(n-k-1)]
               = [(253.9 - 79.8)/1] / [79.8/(12-2-1)] = 19.64   (10-7a)
- F-calculated = 19.64 > F-table = F(1, 9, .05) = 5.12
- F-calculated = 19.64 > F-table = F(1, 9, .01) = 10.56
  We infer: include COMP. This is a powerful test. (A worked computation follows.)
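The same computation in Python, with scipy supplying the F-table values (SSEs taken from Tables 10-2a and 10-2c):

```python
from scipy import stats

sse_r, sse_u = 253.953, 79.803   # restricted (ADVERT only) vs unrestricted
m, n, k = 1, 12, 2               # variables tested, observations, unrestricted regressors

f_calc = ((sse_r - sse_u) / m) / (sse_u / (n - k - 1))
print(round(f_calc, 2))                        # 19.64, as in Eq. 10-7a
print(stats.f.ppf(0.95, m, n - k - 1))         # ~5.12  (alpha = .05)
print(stats.f.ppf(0.99, m, n - k - 1))         # ~10.56 (alpha = .01)
```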
SERIAL CORRELATION PROBLEMS
- An assumption of OLS: residuals are independent. That is, ACF(k) = 0 for all k > 0.
- When the et have ACF(k) ≠ 0, there may be a deficiency in the model or estimation. Consider Table 10-2 a), b), and c).
- Serial correlation denotes that the following may be incorrect: R2, Syx, Sb, and the t-values of the b's.
- First-order serial correlation denotes:
  Yt = a + bXt + ρet-1 + et   (10-8)
  where ρ is rho, the first-order serial-correlation coefficient. In ARIMA terms, ρ is actually θ1, a first-order coefficient on the lagged error.
How to estimate ρ?
- One of several iterative processes, including Cochrane-Orcutt Iterative Least Squares (COILS), the Hildreth-Lu method, and the Prais-Winsten method. We illustrate the COILS method.
- COILS: Given
  Yt = a + bXt + ρet-1 + et   (10-9)
- Therefore, from et-1 = Yt-1 - Ŷt-1:
  ρet-1 = ρ(Yt-1 - Ŷt-1) = ρ(Yt-1 - (a + bXt-1 + ρet-2))   (10-10)
- Substituting equation 10-10 into 10-9 yields:
  Yt = a + bXt + ρ(Yt-1 - (a + bXt-1 + ρet-2)) + et
- Expanding and combining the a's into a new term a' = a - ρa:
  Yt = a + bXt + ρYt-1 - ρa - ρbXt-1 + et
  Yt - ρYt-1 = a' + bXt - ρbXt-1 + et   (10-11)
- Reintroducing the backshift operator: (1-B)Yt = Yt - Yt-1
- and therefore (1-ρB)Yt = Yt - ρYt-1, so equation 10-11 can be simplified to:
  (1-ρB)Yt = a' + b(1-ρB)Xt + et   (10-12)
- This is estimated iteratively, by trial and error over different values of ρ, in the Cochrane-Orcutt Iterative Least Squares (COILS) procedure.
- COILS can be used with OLS software:
  - Run OLS to determine a first ρ (i.e., the ACF(1) of the et).
  - Using ρ, transform Yt and Xt to Yt' and Xt':
    Yt' = Yt - ρYt-1 = (1-ρB)Yt
    Xt' = Xt - ρXt-1 = (1-ρB)Xt
  - Save these new variables for use in:
    Yt' = a' + bXt' + et   (10-13)
    (We lose one observation in backshifting; the Prais-Winsten method does not.)
- Estimate a' and b using OLS on Eq. 10-13.
- Iteratively search for the ρ that gives MIN(SSE). (A code sketch of one pass follows.)
- Using this ρ, use the coefficients of Eq. 10-13 in Eq. 10-8; however, remember that a' = (1-ρ)a, so a = a'/(1-ρ).
- Figures 10-5 and 10-6 here illustrate Xt and Yt.
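One COILS pass as a helper function, following the steps above (a sketch assuming y and x are NumPy arrays; statsmodels is one possible OLS routine):

```python
import numpy as np
import statsmodels.api as sm

def coils_pass(y, x, rho):
    """One COILS pass: transform by (1 - rho*B), refit by OLS, return (a, b, SSE)."""
    y_star = y[1:] - rho * y[:-1]     # (1 - rho*B)Y; loses the first observation
    x_star = x[1:] - rho * x[:-1]     # (1 - rho*B)X
    fit = sm.OLS(y_star, sm.add_constant(x_star)).fit()
    a_prime, b = fit.params
    return a_prime / (1.0 - rho), b, fit.ssr   # recover a = a'/(1 - rho)
```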
- Table 10-5. OLS Between Y and X, AR1DAT.DAT
  Usable Observations              100      Degrees of Freedom   98
  R Bar2                           0.5924
  Std Error of Dependent Variable  2.898
  Standard Error of Estimate       1.850
  Sum of Squared Residuals         335.476
  Regression F(1,98)               144.895
  Significance Level of F          0.00000000
  Durbin-Watson Statistic          0.905
  Q(25)                            61.009
  Significance Level of Q          0.00008

  Variable      Coeff    Std Error   T-Stat   Signif
  1. Constant   79.894   9.899       8.071    0.00000000
  2. X          0.681    0.057       12.037   0.00000000
- Table 10-6. ACFs of et for the OLS of Table 10-5
  Lags 1-6:    0.547   0.239   0.242   0.147   0.084   0.049
  Lags 7-12:   0.005  -0.003  -0.025  -0.141  -0.103   0.001
  Approx. 2 Se(ACF) = 2/sqrt(100) = .20
- Using the ACF(1) of .547 (rounded to .55) yields:
  Yt' = Yt - .55Yt-1
  Xt' = Xt - .55Xt-1
  Regressing these two variables yields Table 10-7.
- Table 10-7. Y' = f(X') for ρ = .55
  Dependent Variable Y' - Estimation by Least Squares
  Usable Observations              99       Degrees of Freedom   97
  R Bar2                           0.456
  Std Error of Dependent Variable  2.0089
  Standard Error of Estimate       1.4823
  Sum of Squared Residuals         213.145
  Regression F(1,97)               82.99
  Significance Level of F          0.00000000
  Durbin-Watson Statistic          1.637
  Q(24)                            22.958
  Significance Level of Q          0.5223

  Variable      Coeff    Std Error   T-Stat   Signif
  1. Constant   49.731   4.3745      11.368   0.00000000
  2. X'         0.506    0.0555      9.110    0.00000000
- Now, let's try ρ = .45 and ρ = .65:
  Yt' = Yt - .45Yt-1 and Xt' = Xt - .45Xt-1
- Table 10-8. Y' = f(X') for ρ = .45
  Dependent Variable Y' - Estimation by Least Squares
  Usable Observations              99       Degrees of Freedom   97
  R Bar2                           0.483
  Standard Error of Estimate       1.519
  Sum of Squared Residuals         223.879
  Regression F(1,97)               92.44
  Significance Level of F          0.00000000
  Durbin-Watson Statistic          1.478
  Q(24)                            26.128
  Significance Level of Q          0.34668

  Variable      Coeff    Std Error   T-Stat   Signif
  1. Constant   57.403   5.4164      10.598   0.00000000
  2. X'         0.541    0.0563      9.615    0.00000000

  This SSE is worse (higher) than for ρ = .55.
- Consider ρ in the opposite direction, ρ = .65:
  Yt' = Yt - .65Yt-1 and Xt' = Xt - .65Xt-1
- Table 10-9. Y' = f(X') for ρ = .65
  Dependent Variable Y' - Estimation by Least Squares
  Usable Observations              99       Degrees of Freedom   97
  R Bar2                           0.432
  Standard Error of Estimate       1.465
  Sum of Squared Residuals         208.201
  Regression F(1,97)               75.52
  Significance Level of F          0.00000000
  Durbin-Watson Statistic          1.795
  Q(24)                            23.328
  Significance Level of Q          0.5005

  Variable      Coeff    Std Error   T-Stat   Signif
  1. Constant   40.549   3.353       12.09    0.00000000
  2. X'         0.476    0.055       8.69     0.00000000
- Table 10-10. Iterations of ρ to Minimum SSE
  ρ      SSE       D-W Statistic
  .00    535.5     0.905
  .45    223.88    1.478
  .55    213.145   1.637
  .65*   208.20    1.795
  .75    209.79    1.933
  .85    218.626   2.033
  .95    235.22    2.082
  * denotes the optimal value of ρ in the manual search. (A grid-search sketch follows.)
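The manual search of Table 10-10 as a loop, reusing the coils_pass helper from the sketch above. The series here are synthetic stand-ins for AR1DAT.DAT, generated with a known ρ of .65:

```python
import numpy as np

# Synthetic data: X a drifting series, errors AR(1) with rho = .65.
rng = np.random.default_rng(11)
n = 100
x = np.cumsum(rng.normal(size=n)) + 50.0
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.65 * e[t - 1] + rng.normal()
y = 100.0 + 0.5 * x + e

grid = np.arange(0.0, 0.96, 0.05)                              # trial rho values
sse_by_rho = {round(r, 2): coils_pass(y, x, r)[2] for r in grid}
best = min(sse_by_rho, key=sse_by_rho.get)
print(best, sse_by_rho[best])                                  # minimum SSE near the true rho
```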
Forecasting With Serially Correlated Errors
- Yt = a + bXt + ρet-1 + et
  Yt = 40.55/(1-.65) + .476Xt + .65et-1 + et
  Yt = 115.85 + .476Xt + .65et-1 + et   (10-14)
- Forecast of Yt made at the end of period t-1:
  Ŷt = 115.85 + .476Xt + .65et-1
- Forecast of Yt+1 made at the end of period t-1:
  Ŷt+1 = 115.85 + .476Xt+1 + .65(0)   (10-15)
  where et is unknown at the end of period t-1, so its expected value of zero is used. (A code sketch follows.)
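A sketch of the forecasting rule in Eqs. 10-14 and 10-15; the coefficient values come from the COILS results below, while the function name and inputs are illustrative:

```python
def forecast(x_future, last_error, a=115.85, b=0.476, rho=0.65):
    """Forecasts for successive future X values from one forecast origin."""
    preds = []
    for step, x in enumerate(x_future, start=1):
        # Only the first step can use the last observed error; beyond that
        # the needed error is unknown and is set to zero, as in Eq. 10-15.
        err = rho * last_error if step == 1 else 0.0
        preds.append(a + b * x + err)
    return preds

print(forecast([100.0, 102.0, 101.0], last_error=1.2))
```

Cochrane-Orcutt Iterative Least Squares (COILS)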
- Table 10-11. Y = f(X) by COILS
  Usable Observations              99       Degrees of Freedom   96
  R Bar2                           0.744
  Std Error of Dependent Variable  2.910
  Standard Error of Estimate       1.472
  Sum of Squared Residuals         207.954
  Durbin-Watson Statistic          1.835
  Q(24)                            24.037
  Significance Level of Q          0.4018

  Variable      Coeff     Std Error   T-Stat   Signif
  1. Constant   117.105   9.592       12.208   0.00000000
  2. Xt         0.468     0.055       8.550    0.00000000
  3. RHO(ρ)     0.677     0.077       8.815    0.00000000
- Because ρ is so high, only a fraction of the explained variance is attributed to Xt.
- This R2 and RSE are indicative of a one-period-ahead forecast. After one period, the influence of ρ declines to zero, so the RSE (Standard Error of Estimate) for Ŷt+k with k > 1 grows toward the larger, unconditional (OLS) value.
REVIEWING COILS
- OLS: Yt = 79.894 + .681Xt + et, with RSE = 1.850 and DW = 0.905.
- Corrected coefficients from COILS: Yt = 117.105 + .468Xt + et, with RSE = 1.472 and DW = 1.835.
- Figures 10-7 and 10-8 here.
- Table 10-12. Y = f(X) by OLS, ARDAT.DAT
  Usable Observations              100      Degrees of Freedom   98
  R Bar2                           0.368
  Std Error of Dependent Variable  1.644
  Standard Error of Estimate       1.306
  Sum of Squared Residuals         167.243
  Regression F(1,98)               58.70
  Significance Level of F          0.00000000
  Durbin-Watson Statistic          0.211
  Q(25)                            310.561
  Significance Level of Q          0.00000000

  Variable      Coeff     Std Error   T-Stat   Signif
  1. Constant   192.602   12.077      15.948   0.00000000
  2. Xt         -0.907    0.118       -7.661   0.00000000
- Table 10-13. Y = f(X) - Estimation by COILS
  Usable Observations              99       Degrees of Freedom   96
  R Bar2                           0.910
  Std Error of Dependent Variable  1.652
  Standard Error of Estimate       0.497
  Sum of Squared Residuals         23.6708
  Durbin-Watson Statistic          2.012
  Q(24)                            25.823
  Significance Level of Q          0.30927

  Variable      Coeff     Std Error   T-Stat   Signif
  1. Constant   114.298   12.331      9.269    0.00000000
  2. Xt         -0.136    0.120       -1.133   0.2596
  3. RHO(ρ)     0.952     0.032       29.582   0.00000000
- These X and Y series were generated using a random number generator, with:
  X0 = 100, Y0 = 100
  Xt = Xt-1 + (1.5 - (RAN1 + RAN2 + RAN3))
  Yt = Yt-1 + (1.5 - (RAN4 + RAN5 + RAN6))
  That is, two unrelated random walks: OLS reports a strong relationship (Table 10-12), while COILS correctly finds almost none (Table 10-13).
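A sketch reproducing this generator, assuming each RAN is an independent uniform(0, 1) draw (so each increment has mean zero and the two series are unrelated random walks):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100
x = np.empty(n)
y = np.empty(n)
x[0] = y[0] = 100.0
for t in range(1, n):
    x[t] = x[t - 1] + (1.5 - rng.uniform(size=3).sum())   # RAN1 + RAN2 + RAN3
    y[t] = y[t - 1] + (1.5 - rng.uniform(size=3).sum())   # RAN4 + RAN5 + RAN6
```

ANALYSIS OF STOCK INDEXES USING COILS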