IV. Multivariate Linear Regression
A. The Basic Principle

We consider the multivariate extension of multiple linear regression, modeling the relationship between m responses \(Y_1, \ldots, Y_m\) and a single set of r predictor variables \(z_1, \ldots, z_r\). Each of the m responses is assumed to follow its own regression model, i.e.,

\[
\begin{aligned}
Y_1 &= \beta_{01} + \beta_{11}z_1 + \beta_{21}z_2 + \cdots + \beta_{r1}z_r + \varepsilon_1 \\
Y_2 &= \beta_{02} + \beta_{12}z_1 + \beta_{22}z_2 + \cdots + \beta_{r2}z_r + \varepsilon_2 \\
&\;\;\vdots \\
Y_m &= \beta_{0m} + \beta_{1m}z_1 + \beta_{2m}z_2 + \cdots + \beta_{rm}z_r + \varepsilon_m
\end{aligned}
\]

where the error vector \(\boldsymbol{\varepsilon} = (\varepsilon_1, \ldots, \varepsilon_m)'\) has \(E(\boldsymbol{\varepsilon}) = \mathbf{0}\) and \(\operatorname{Var}(\boldsymbol{\varepsilon}) = \Sigma\).
2 Conceptually, we can let \(z_{j0}, z_{j1}, \ldots, z_{jr}\) denote the values of the predictor variables for the jth trial and let

\[
\mathbf{Y}_j = [Y_{j1}, Y_{j2}, \ldots, Y_{jm}]' \quad \text{and} \quad \boldsymbol{\varepsilon}_j = [\varepsilon_{j1}, \varepsilon_{j2}, \ldots, \varepsilon_{jm}]'
\]

be the responses and errors for the jth trial. Thus we have an \(n \times (r+1)\) design matrix

\[
\mathbf{Z} = \begin{bmatrix}
z_{10} & z_{11} & \cdots & z_{1r} \\
z_{20} & z_{21} & \cdots & z_{2r} \\
\vdots & \vdots & \ddots & \vdots \\
z_{n0} & z_{n1} & \cdots & z_{nr}
\end{bmatrix},
\]

where \(z_{j0} = 1\) for every trial.
3 If we now set

\[
\mathbf{Y}_{(n \times m)} = \begin{bmatrix} Y_{11} & \cdots & Y_{1m} \\ \vdots & \ddots & \vdots \\ Y_{n1} & \cdots & Y_{nm} \end{bmatrix}, \quad
\boldsymbol{\beta}_{((r+1) \times m)} = \begin{bmatrix} \beta_{01} & \cdots & \beta_{0m} \\ \vdots & \ddots & \vdots \\ \beta_{r1} & \cdots & \beta_{rm} \end{bmatrix}, \quad
\boldsymbol{\varepsilon}_{(n \times m)} = \begin{bmatrix} \varepsilon_{11} & \cdots & \varepsilon_{1m} \\ \vdots & \ddots & \vdots \\ \varepsilon_{n1} & \cdots & \varepsilon_{nm} \end{bmatrix},
\]

4 the multivariate linear regression model is

\[
\mathbf{Y} = \mathbf{Z}\boldsymbol{\beta} + \boldsymbol{\varepsilon}
\]

with

\[
E(\boldsymbol{\varepsilon}_{(i)}) = \mathbf{0}
\]

and

\[
\operatorname{Cov}(\boldsymbol{\varepsilon}_{(i)}, \boldsymbol{\varepsilon}_{(k)}) = \sigma_{ik}\mathbf{I}, \qquad i, k = 1, 2, \ldots, m.
\]

Note also that the m observed responses on the jth trial have covariance matrix \(\Sigma = \{\sigma_{ik}\}\).
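To make the matrix formulation concrete, here is a minimal numpy simulation sketch; all dimensions, coefficient values, and the covariance matrix below are illustrative assumptions, not taken from the example later in these notes:

```python
import numpy as np

rng = np.random.default_rng(0)

n, r, m = 50, 2, 2                      # trials, predictors, responses
Z = np.column_stack([np.ones(n),        # n x (r+1) design matrix, z_j0 = 1
                     rng.uniform(60, 90, size=(n, r))])
beta = np.array([[1.0, 2.0],            # (r+1) x m coefficient matrix:
                 [0.5, 0.3],            # column k holds the coefficients
                 [0.2, 0.4]])           # of the model for response k
Sigma = np.array([[1.0, 0.6],           # common error covariance of the
                  [0.6, 1.5]])          # m responses on each trial
eps = rng.multivariate_normal(np.zeros(m), Sigma, size=n)
Y = Z @ beta + eps                      # n x m response matrix
```

Each row of Y is one trial's m correlated responses; each column follows its own univariate regression on the shared design matrix Z.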
5 The ordinary least squares estimates \(\hat{\boldsymbol{\beta}}\) are found in a manner analogous to the univariate case: we begin by taking

\[
\hat{\boldsymbol{\beta}}_{(i)} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y}_{(i)};
\]

collecting the univariate least squares estimates yields

\[
\hat{\boldsymbol{\beta}} = \left[\hat{\boldsymbol{\beta}}_{(1)} \mid \hat{\boldsymbol{\beta}}_{(2)} \mid \cdots \mid \hat{\boldsymbol{\beta}}_{(m)}\right] = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y}.
\]

Now for any choice of parameters \(\mathbf{B}\), the resulting matrix of errors is \(\mathbf{Y} - \mathbf{Z}\mathbf{B}\).

6 The resulting Error Sums of Squares and Crossproducts matrix is

\[
(\mathbf{Y} - \mathbf{Z}\mathbf{B})'(\mathbf{Y} - \mathbf{Z}\mathbf{B}).
\]

We can show that the selection \(\mathbf{B} = \hat{\boldsymbol{\beta}}\) minimizes the ith diagonal sum of squares for every i; consequently both the trace and the generalized variance (determinant) of \((\mathbf{Y} - \mathbf{Z}\mathbf{B})'(\mathbf{Y} - \mathbf{Z}\mathbf{B})\) are minimized.
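The "stack the univariate fits" result is easy to check numerically; the data below are arbitrary simulated values used only for the check:

```python
import numpy as np

rng = np.random.default_rng(1)
Z = np.column_stack([np.ones(20), rng.normal(size=(20, 2))])
Y = rng.normal(size=(20, 3))            # three arbitrary responses

# Joint multivariate fit: beta_hat = (Z'Z)^{-1} Z'Y
beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)

# Column-by-column univariate least squares gives the same columns
for i in range(Y.shape[1]):
    b_i, *_ = np.linalg.lstsq(Z, Y[:, i], rcond=None)
    assert np.allclose(b_i, beta_hat[:, i])
```

So nothing new happens computationally in the multivariate model: the m univariate fits share one \((\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\) and are simply collected side by side.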
7 so we have the matrix of predicted values

\[
\hat{\mathbf{Y}} = \mathbf{Z}\hat{\boldsymbol{\beta}} = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y}
\]

and a resulting matrix of residuals

\[
\hat{\boldsymbol{\varepsilon}} = \mathbf{Y} - \hat{\mathbf{Y}}.
\]

Note that the orthogonality conditions among residuals, predicted values, and columns of the design matrix which hold in the univariate case are also true in the multivariate case, because

\[
\mathbf{Z}'\hat{\boldsymbol{\varepsilon}} = \mathbf{Z}'\mathbf{Y} - \mathbf{Z}'\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y} = \mathbf{0},
\]

8 which means the residuals are perpendicular to the columns of the design matrix and to the predicted values:

\[
\hat{\mathbf{Y}}'\hat{\boldsymbol{\varepsilon}} = \hat{\boldsymbol{\beta}}'\mathbf{Z}'\hat{\boldsymbol{\varepsilon}} = \mathbf{0}.
\]

Furthermore, because \(\mathbf{Y} = \hat{\mathbf{Y}} + \hat{\boldsymbol{\varepsilon}}\), we have

\[
\underbrace{\mathbf{Y}'\mathbf{Y}}_{\text{total SSCP}} = \underbrace{\hat{\mathbf{Y}}'\hat{\mathbf{Y}}}_{\text{predicted SSCP}} + \underbrace{\hat{\boldsymbol{\varepsilon}}'\hat{\boldsymbol{\varepsilon}}}_{\text{residual (error) SSCP}}.
\]
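These orthogonality and decomposition identities can be verified numerically; the data below are arbitrary simulated values:

```python
import numpy as np

rng = np.random.default_rng(2)
Z = np.column_stack([np.ones(15), rng.normal(size=(15, 2))])
Y = rng.normal(size=(15, 2))

beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)
Y_hat = Z @ beta_hat                     # matrix of predicted values
resid = Y - Y_hat                        # matrix of residuals

# residuals are orthogonal to the design columns and to the fits
assert np.allclose(Z.T @ resid, 0)
assert np.allclose(Y_hat.T @ resid, 0)

# total SSCP = predicted SSCP + residual SSCP
assert np.allclose(Y.T @ Y, Y_hat.T @ Y_hat + resid.T @ resid)
```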
9 Example: suppose we had the following six sample observations on two independent variables (palatability and texture) and two dependent variables (purchase intent and overall quality):

z1 (Palatability)   z2 (Texture)   y1 (Overall Quality)   y2 (Purchase Intent)
       65                71                 63                     67
       72                77                 70                     70
       77                73                 72                     70
       68                78                 75                     72
       81                76                 89                     88
       73                87                 76                     77

Use these data to estimate the multivariate linear regression model for which palatability and texture are independent variables while purchase intent and overall quality are the dependent variables.
10 We wish to estimate

\[
\hat{Y}_1 = \hat{\beta}_{01} + \hat{\beta}_{11}z_1 + \hat{\beta}_{21}z_2 \quad \text{and} \quad \hat{Y}_2 = \hat{\beta}_{02} + \hat{\beta}_{12}z_1 + \hat{\beta}_{22}z_2
\]

jointly. The design matrix is

\[
\mathbf{Z} = \begin{bmatrix} 1 & 65 & 71 \\ 1 & 72 & 77 \\ 1 & 77 & 73 \\ 1 & 68 & 78 \\ 1 & 81 & 76 \\ 1 & 73 & 87 \end{bmatrix},
\]

11 so

\[
\mathbf{Z}'\mathbf{Z} = \begin{bmatrix} 6 & 436 & 462 \\ 436 & 31852 & 33591 \\ 462 & 33591 & 35728 \end{bmatrix}
\quad \text{and} \quad
\mathbf{Z}'\mathbf{Y} = \begin{bmatrix} 445 & 444 \\ 32536 & 32430 \\ 34345 & 34260 \end{bmatrix},
\]

12-14 so the joint least squares estimates are

\[
\hat{\boldsymbol{\beta}} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y} = \begin{bmatrix} -37.5012 & -21.4323 \\ 1.1346 & 0.9409 \\ 0.3795 & 0.3514 \end{bmatrix}.
\]

This gives us the estimated values matrix (rounded to two decimal places)

\[
\hat{\mathbf{Y}} = \mathbf{Z}\hat{\boldsymbol{\beta}} = \begin{bmatrix} 63.19 & 64.68 \\ 73.41 & 73.37 \\ 77.57 & 76.67 \\ 69.25 & 69.96 \\ 83.24 & 81.49 \\ 78.34 & 77.83 \end{bmatrix}
\]

15 and the residuals matrix (rounded to two decimal places)

\[
\hat{\boldsymbol{\varepsilon}} = \mathbf{Y} - \hat{\mathbf{Y}} = \begin{bmatrix} -0.19 & 2.32 \\ -3.41 & -3.37 \\ -5.57 & -6.67 \\ 5.75 & 2.04 \\ 5.76 & 6.51 \\ -2.34 & -0.83 \end{bmatrix}.
\]

Note that each column sums to zero!
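The entire computation for this example can be reproduced in a few lines of numpy; this is a sketch, and the estimates it produces should agree with the PROC GLM parameter estimates reported later in these notes:

```python
import numpy as np

z1 = np.array([65, 72, 77, 68, 81, 73])    # palatability
z2 = np.array([71, 77, 73, 78, 76, 87])    # texture
y1 = np.array([63, 70, 72, 75, 89, 76])    # overall quality
y2 = np.array([67, 70, 70, 72, 88, 77])    # purchase intent

Z = np.column_stack([np.ones(6), z1, z2])  # 6 x 3 design matrix
Y = np.column_stack([y1, y2])              # 6 x 2 response matrix

# beta_hat = (Z'Z)^{-1} Z'Y, solved without forming the inverse
beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)
resid = Y - Z @ beta_hat                   # residuals; columns sum to zero
```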
16 B. Inference in Multivariate Regression

The least squares estimator \(\hat{\boldsymbol{\beta}} = [\hat{\boldsymbol{\beta}}_{(1)} \mid \hat{\boldsymbol{\beta}}_{(2)} \mid \cdots \mid \hat{\boldsymbol{\beta}}_{(m)}]\) of the multivariate regression model has the following properties:

- \(E(\hat{\boldsymbol{\beta}}_{(i)}) = \boldsymbol{\beta}_{(i)}\) (unbiasedness)
- \(\operatorname{Cov}(\hat{\boldsymbol{\beta}}_{(i)}, \hat{\boldsymbol{\beta}}_{(k)}) = \sigma_{ik}(\mathbf{Z}'\mathbf{Z})^{-1}\)
- \(E(\hat{\boldsymbol{\varepsilon}}) = \mathbf{0}\) and \(E(\hat{\boldsymbol{\varepsilon}}'\hat{\boldsymbol{\varepsilon}}) = (n - r - 1)\Sigma\)

if the model is of full rank, i.e., \(\operatorname{rank}(\mathbf{Z}) = r + 1 < n\). Note that \(\hat{\boldsymbol{\varepsilon}}\) and \(\hat{\boldsymbol{\beta}}\) are also uncorrelated.
17 This means that, for any observation \(\mathbf{z}_0 = [1, z_{01}, \ldots, z_{0r}]'\), \(\mathbf{z}_0'\hat{\boldsymbol{\beta}}\) is an unbiased estimator, i.e.,

\[
E(\mathbf{z}_0'\hat{\boldsymbol{\beta}}) = \mathbf{z}_0'\boldsymbol{\beta}.
\]

We can also determine from these properties that the estimation errors \(\mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(i)} - \mathbf{z}_0'\boldsymbol{\beta}_{(i)}\) have covariances

\[
E\!\left[\mathbf{z}_0'(\hat{\boldsymbol{\beta}}_{(i)} - \boldsymbol{\beta}_{(i)})(\hat{\boldsymbol{\beta}}_{(k)} - \boldsymbol{\beta}_{(k)})'\mathbf{z}_0\right] = \sigma_{ik}\,\mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0.
\]
18 Furthermore, we can easily ascertain that

\[
E(\mathbf{Y}_0 - \hat{\boldsymbol{\beta}}'\mathbf{z}_0) = \mathbf{0},
\]

i.e., the forecasted vector \(\hat{\mathbf{Y}}_0 = \hat{\boldsymbol{\beta}}'\mathbf{z}_0\) associated with the values of the predictor variables \(\mathbf{z}_0\) is an unbiased estimator of \(\mathbf{Y}_0\). The forecast errors \(Y_{0i} - \mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(i)}\) have covariance

\[
E\!\left[(Y_{0i} - \mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(i)})(Y_{0k} - \mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(k)})\right] = \sigma_{ik}\left(1 + \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0\right).
\]
19 Thus, for the multivariate regression model with full \(\operatorname{rank}(\mathbf{Z}) = r + 1\), \(n \ge (r + 1) + m\), and normally distributed errors \(\boldsymbol{\varepsilon}\),

\[
\hat{\boldsymbol{\beta}} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y}
\]

is the maximum likelihood estimator of \(\boldsymbol{\beta}\), and \(\hat{\boldsymbol{\beta}}\) is normally distributed with \(E(\hat{\boldsymbol{\beta}}) = \boldsymbol{\beta}\) and \(\operatorname{Cov}(\hat{\boldsymbol{\beta}}_{(i)}, \hat{\boldsymbol{\beta}}_{(k)}) = \sigma_{ik}(\mathbf{Z}'\mathbf{Z})^{-1}\), where the elements of \(\Sigma\) are \(\sigma_{ik}\).

20 Also, the maximum likelihood estimator of \(\boldsymbol{\beta}\) is independent of the maximum likelihood estimator of the positive definite matrix \(\Sigma\), given by

\[
\hat{\Sigma} = \frac{1}{n}\hat{\boldsymbol{\varepsilon}}'\hat{\boldsymbol{\varepsilon}} = \frac{1}{n}(\mathbf{Y} - \mathbf{Z}\hat{\boldsymbol{\beta}})'(\mathbf{Y} - \mathbf{Z}\hat{\boldsymbol{\beta}}),
\]

and

\[
n\hat{\Sigma} \sim W_{n-r-1}(\Sigma),
\]

a Wishart distribution with \(n - r - 1\) degrees of freedom. All of this provides additional support for using the least squares estimates: when the errors are normally distributed, \(\hat{\boldsymbol{\beta}}\) and \(\hat{\Sigma}\) are the maximum likelihood estimators of \(\boldsymbol{\beta}\) and \(\Sigma\).
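For the worked example, \(\hat{\Sigma}\) and the error SSCP matrix \(n\hat{\Sigma}\) can be computed directly; as a sketch, the resulting \(n\hat{\Sigma}\) should match the Error SSCP matrix E in the SAS output later in these notes:

```python
import numpy as np

Z = np.column_stack([np.ones(6),
                     [65, 72, 77, 68, 81, 73],    # palatability
                     [71, 77, 73, 78, 76, 87]])   # texture
Y = np.column_stack([[63, 70, 72, 75, 89, 76],    # overall quality
                     [67, 70, 70, 72, 88, 77]])   # purchase intent
n = Z.shape[0]

beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)
resid = Y - Z @ beta_hat
Sigma_hat = resid.T @ resid / n          # MLE of Sigma (divides by n, not n-r-1)
E = n * Sigma_hat                        # error SSCP matrix
```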
21 These results can be used to develop likelihood ratio tests for the multivariate regression parameters. The hypothesis that the responses do not depend on predictor variables \(z_{q+1}, z_{q+2}, \ldots, z_r\) is \(H_0\colon \boldsymbol{\beta}_{(2)} = \mathbf{0}\), where we partition

\[
\boldsymbol{\beta} = \begin{bmatrix} \boldsymbol{\beta}_{(1)} \\ \boldsymbol{\beta}_{(2)} \end{bmatrix}
\qquad
\begin{matrix} (q+1) \times m \\ (r-q) \times m \end{matrix}
\]

If we partition \(\mathbf{Z}\) in a similar manner,

\[
\mathbf{Z} = \left[\, \mathbf{Z}_1 \mid \mathbf{Z}_2 \,\right]
\qquad
\begin{matrix} n \times (q+1) & n \times (r-q) \end{matrix}
\]
22 we can write the general model as

\[
E(\mathbf{Y}) = \mathbf{Z}\boldsymbol{\beta} = \mathbf{Z}_1\boldsymbol{\beta}_{(1)} + \mathbf{Z}_2\boldsymbol{\beta}_{(2)}.
\]

The extra sums of squares and crossproducts associated with \(\boldsymbol{\beta}_{(2)}\) are

\[
n(\hat{\Sigma}_1 - \hat{\Sigma}),
\]

where

\[
\hat{\boldsymbol{\beta}}_{(1)} = (\mathbf{Z}_1'\mathbf{Z}_1)^{-1}\mathbf{Z}_1'\mathbf{Y}
\]

and

\[
\hat{\Sigma}_1 = \frac{1}{n}(\mathbf{Y} - \mathbf{Z}_1\hat{\boldsymbol{\beta}}_{(1)})'(\mathbf{Y} - \mathbf{Z}_1\hat{\boldsymbol{\beta}}_{(1)}).
\]

23 The likelihood ratio for the test of the hypothesis \(H_0\colon \boldsymbol{\beta}_{(2)} = \mathbf{0}\) is given by the ratio of generalized variances

\[
\Lambda = \left(\frac{|\hat{\Sigma}|}{|\hat{\Sigma}_1|}\right)^{n/2},
\]

which is often converted to Wilks' lambda statistic

\[
\Lambda^{2/n} = \frac{|\hat{\Sigma}|}{|\hat{\Sigma}_1|}.
\]
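This full-versus-reduced recipe can be sketched in numpy (rather than SAS; the helper name `sscp` is ours) for the texture test from the worked example: drop z2, refit, and compare the two residual SSCP matrices:

```python
import numpy as np

z1 = np.array([65, 72, 77, 68, 81, 73])   # palatability
z2 = np.array([71, 77, 73, 78, 76, 87])   # texture
Y = np.column_stack([[63, 70, 72, 75, 89, 76],    # overall quality
                     [67, 70, 70, 72, 88, 77]])   # purchase intent
n = len(z1)

def sscp(Z, Y):
    """Residual SSCP matrix (= n * Sigma_hat) after regressing Y on Z."""
    beta = np.linalg.solve(Z.T @ Z, Z.T @ Y)
    resid = Y - Z @ beta
    return resid.T @ resid

full = sscp(np.column_stack([np.ones(n), z1, z2]), Y)   # n * Sigma_hat
reduced = sscp(np.column_stack([np.ones(n), z1]), Y)    # n * Sigma_hat_1 (z2 dropped)
H = reduced - full                        # extra SSCP attributable to texture
wilks = np.linalg.det(full) / np.linalg.det(reduced)
```

The matrix H here is the hypothesis SSCP for z2, and `wilks` is the Wilks' lambda for the texture hypothesis tested later in these notes.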
24 Finally, for the multivariate regression model with full \(\operatorname{rank}(\mathbf{Z}) = r + 1\), \(n \ge (r + 1) + m\), normally distributed errors \(\boldsymbol{\varepsilon}\), and a true null hypothesis (so that \(n(\hat{\Sigma}_1 - \hat{\Sigma}) \sim W_{r-q}(\Sigma)\)),

\[
-\left[n - r - 1 - \tfrac{1}{2}(m - r + q + 1)\right] \ln\!\left(\frac{|\hat{\Sigma}|}{|\hat{\Sigma}_1|}\right) \;\dot\sim\; \chi^2_{m(r-q)}
\]

when \(n - r\) and \(n - m\) are both large.
25 If we again refer to the Error Sum of Squares and Crossproducts as \(\mathbf{E} = n\hat{\Sigma}\) and the Hypothesis Sum of Squares and Crossproducts as \(\mathbf{H} = n(\hat{\Sigma}_1 - \hat{\Sigma})\), then we can define Wilks' lambda as

\[
\Lambda^* = \frac{|\mathbf{E}|}{|\mathbf{E} + \mathbf{H}|} = \prod_{i=1}^{s} \frac{1}{1 + \eta_i},
\]

where \(\eta_1 \ge \eta_2 \ge \cdots \ge \eta_s\) are the ordered eigenvalues of \(\mathbf{H}\mathbf{E}^{-1}\) and \(s = \min(m, r - q)\).
26 There are other similar tests (as we have seen in our discussion of MANOVA):

- Pillai's Trace: \(\displaystyle\sum_{i=1}^{s} \frac{\eta_i}{1 + \eta_i} = \operatorname{tr}\!\left[\mathbf{H}(\mathbf{H} + \mathbf{E})^{-1}\right]\)
- Hotelling-Lawley Trace: \(\displaystyle\sum_{i=1}^{s} \eta_i = \operatorname{tr}\!\left[\mathbf{H}\mathbf{E}^{-1}\right]\)
- Roy's Greatest Root: \(\eta_1\), the largest eigenvalue of \(\mathbf{H}\mathbf{E}^{-1}\)

Each of these statistics is an alternative to Wilks' lambda, and they perform in a very similar manner (particularly for large sample sizes).
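All four criteria can be computed from the eigenvalues of \(\mathbf{H}\mathbf{E}^{-1}\); as a sketch, here are the E and H matrices for the palatability test in the example below, with values taken from the SAS output later in these notes:

```python
import numpy as np

# Error SSCP E and hypothesis SSCP H for the palatability (z1) test
E = np.array([[114.31302415,  99.33514368],
              [ 99.33514368, 108.50942980]])
H = np.array([[214.96186763, 178.26225891],
              [178.26225891, 147.82823253]])

eta = np.sort(np.linalg.eigvals(H @ np.linalg.inv(E)).real)[::-1]

wilks = np.prod(1.0 / (1.0 + eta))       # also |E| / |E + H|
pillai = np.sum(eta / (1.0 + eta))       # tr[H(H + E)^{-1}]
hotelling_lawley = np.sum(eta)           # tr[H E^{-1}]
roy = eta[0]                             # largest eigenvalue
```

Because this hypothesis has a single degree of freedom, only one eigenvalue is nonzero, so the Hotelling-Lawley trace and Roy's greatest root coincide here.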
27 Example: for our previous data (six sample observations on two independent variables, palatability and texture, and two dependent variables, purchase intent and overall quality), use these data to test the hypotheses that i) palatability has no joint relationship with purchase intent and overall quality, and ii) texture has no joint relationship with purchase intent and overall quality.
28 We first test the hypothesis that palatability has no joint relationship with purchase intent and overall quality, i.e., \(H_0\colon \boldsymbol{\beta}_{(1)} = \mathbf{0}\), where \(\boldsymbol{\beta}_{(1)}\) holds the palatability coefficients. The likelihood ratio for the test of this hypothesis is given by the ratio of generalized variances

\[
\Lambda = \left(\frac{|\hat{\Sigma}|}{|\hat{\Sigma}_1|}\right)^{n/2}.
\]

For ease of computation, we'll use the Wilks' lambda statistic

\[
\Lambda^* = \frac{|\mathbf{E}|}{|\mathbf{E} + \mathbf{H}|}.
\]
29 The error sum of squares and crossproducts matrix is

\[
\mathbf{E} = \begin{bmatrix} 114.3130 & 99.3351 \\ 99.3351 & 108.5094 \end{bmatrix}
\]

and the hypothesis sum of squares and crossproducts matrix for this null hypothesis is

\[
\mathbf{H} = \begin{bmatrix} 214.9619 & 178.2623 \\ 178.2623 & 147.8282 \end{bmatrix},
\]

30 so the calculated value of the Wilks' lambda statistic is

\[
\Lambda^* = \frac{|\mathbf{E}|}{|\mathbf{E} + \mathbf{H}|} = \frac{2536.57}{7345.23} \approx 0.3453.
\]
31 The transformation to a chi-square distributed statistic (which is actually valid only when \(n - r\) and \(n - m\) are both large) is

\[
-\left[n - r - 1 - \tfrac{1}{2}(m - r + q + 1)\right] \ln \Lambda^* \;\dot\sim\; \chi^2_{m(r-q)}.
\]

At \(\alpha = 0.01\) and \(m(r - q) = 2\) degrees of freedom, the critical value is 9.210351, so we clearly fail to reject the null hypothesis. Also, the approximate p-value of this chi-square test is 0.630174; note that this is an extremely gross approximation (since \(n - r = 4\) and \(n - m = 4\)).
32 We next test the hypothesis that texture has no joint relationship with purchase intent and overall quality, i.e., \(H_0\colon \boldsymbol{\beta}_{(2)} = \mathbf{0}\), where \(\boldsymbol{\beta}_{(2)}\) holds the texture coefficients. The likelihood ratio for the test of this hypothesis is given by the ratio of generalized variances

\[
\Lambda = \left(\frac{|\hat{\Sigma}|}{|\hat{\Sigma}_1|}\right)^{n/2}.
\]

For ease of computation, we'll use the Wilks' lambda statistic

\[
\Lambda^* = \frac{|\mathbf{E}|}{|\mathbf{E} + \mathbf{H}|}.
\]
33 The error sum of squares and crossproducts matrix is

\[
\mathbf{E} = \begin{bmatrix} 114.3130 & 99.3351 \\ 99.3351 & 108.5094 \end{bmatrix}
\]

and the hypothesis sum of squares and crossproducts matrix for this null hypothesis is

\[
\mathbf{H} = \begin{bmatrix} 21.8720 & 20.2554 \\ 20.2554 & 18.7583 \end{bmatrix},
\]

34 so the calculated value of the Wilks' lambda statistic is

\[
\Lambda^* = \frac{|\mathbf{E}|}{|\mathbf{E} + \mathbf{H}|} = \frac{2536.57}{3030.06} \approx 0.8371.
\]
35 The transformation to a chi-square distributed statistic (which is actually valid only when \(n - r\) and \(n - m\) are both large) is

\[
-\left[n - r - 1 - \tfrac{1}{2}(m - r + q + 1)\right] \ln \Lambda^* \;\dot\sim\; \chi^2_{m(r-q)}.
\]

At \(\alpha = 0.01\) and \(m(r - q) = 2\) degrees of freedom, the critical value is 9.210351, so we clearly fail to reject the null hypothesis. Also, the approximate p-value of this chi-square test is 0.925701; note that this is an extremely gross approximation (since \(n - r = 4\) and \(n - m = 4\)).
36 SAS code for a Multivariate Linear Regression Analysis

OPTIONS LINESIZE=72 NODATE PAGENO=1;
DATA stuff;
  INPUT z1 z2 y1 y2;
  LABEL z1='Palatability Rating' z2='Texture Rating'
        y1='Overall Quality Rating' y2='Purchase Intent';
CARDS;
65 71 63 67
72 77 70 70
77 73 72 70
68 78 75 72
81 76 89 88
73 87 76 77
;
PROC GLM DATA=stuff;
  MODEL y1 y2 = z1 z2;
  MANOVA H=z1 z2 / PRINTE PRINTH;
  TITLE4 'Using PROC GLM for Multivariate Linear Regression';
RUN;
37 SAS output for a Multivariate Linear Regression Analysis

Dependent Variable: y1   Overall Quality Rating

                                     Sum of
Source                  DF          Squares     Mean Square    F Value    Pr > F
Model                    2      256.5203092     128.2601546       3.37    0.1711
Error                    3      114.3130241      38.1043414
Corrected Total          5      370.8333333

R-Square     Coeff Var     Root MSE      y1 Mean
0.691740      8.322973     6.172871     74.16667

Source      DF        Type I SS     Mean Square    F Value    Pr > F
z1           1      234.6482940     234.6482940       6.16    0.0891
z2           1       21.8720152      21.8720152       0.57    0.5037

Source      DF      Type III SS     Mean Square    F Value    Pr > F
z1           1      214.9618676     214.9618676       5.64    0.0980
z2           1       21.8720152      21.8720152       0.57    0.5037

                                    Standard
Parameter        Estimate              Error     t Value    Pr > |t|
Intercept    -37.50120546        48.82448511       -0.77      0.4984
z1             1.13458373         0.47768661        2.38      0.0980
z2             0.37949941         0.50090335        0.76      0.5037
38 SAS output for a Multivariate Linear Regression Analysis

Dependent Variable: y2   Purchase Intent

                                     Sum of
Source                  DF          Squares     Mean Square    F Value    Pr > F
Model                    2      181.4905702      90.7452851       2.51    0.2289
Error                    3      108.5094298      36.1698099
Corrected Total          5      290.0000000

R-Square     Coeff Var     Root MSE      y2 Mean
0.625830      8.127208     6.014134     74.00000

Source      DF        Type I SS     Mean Square    F Value    Pr > F
z1           1      162.7322835     162.7322835       4.50    0.1241
z2           1       18.7582867      18.7582867       0.52    0.5235

Source      DF      Type III SS     Mean Square    F Value    Pr > F
z1           1      147.8282325     147.8282325       4.09    0.1364
z2           1       18.7582867      18.7582867       0.52    0.5235

                                    Standard
Parameter        Estimate              Error     t Value    Pr > |t|
Intercept    -21.43229335        47.56894895       -0.45      0.6829
z1             0.94088063         0.46540276        2.02      0.1364
z2             0.35144979         0.48802247        0.72      0.5235
39 SAS output for a Multivariate Linear Regression Analysis

The GLM Procedure
Multivariate Analysis of Variance

E = Error SSCP Matrix

                  y1                y2
y1      114.31302415      99.335143683
y2      99.335143683       108.5094298

Partial Correlation Coefficients from the Error SSCP Matrix / Prob > |r|

DF = 3          y1          y2
y1        1.000000    0.891911
                        0.1081
y2        0.891911    1.000000
            0.1081
40 SAS output for a Multivariate Linear Regression Analysis

The GLM Procedure
Multivariate Analysis of Variance

H = Type III SSCP Matrix for z1

                  y1                y2
y1      214.96186763      178.26225891
y2      178.26225891      147.82823253

Characteristic Roots and Vectors of E Inverse * H, where
H = Type III SSCP Matrix for z1
E = Error SSCP Matrix

Characteristic                Characteristic Vector  V'EV=1
          Root    Percent             y1              y2
    1.89573606     100.00     0.10970859     -0.01905206
    0.00000000       0.00    -0.17533407      0.21143084

MANOVA Test Criteria and Exact F Statistics for
the Hypothesis of No Overall z1 Effect
H = Type III SSCP Matrix for z1
E = Error SSCP Matrix

S=1    M=0    N=0

Statistic                       Value    F Value    Num DF    Den DF    Pr > F
Wilks' Lambda              0.34533534       1.90         2         2    0.3453
Pillai's Trace             0.65466466       1.90         2         2    0.3453
Hotelling-Lawley Trace     1.89573606       1.90         2         2    0.3453
Roy's Greatest Root        1.89573606       1.90         2         2    0.3453
41 SAS output for a Multivariate Linear Regression Analysis

The GLM Procedure
Multivariate Analysis of Variance

H = Type III SSCP Matrix for z2

                  y1                y2
y1      21.872015222      20.255407498
y2      20.255407498      18.758286731

Characteristic Roots and Vectors of E Inverse * H, where
H = Type III SSCP Matrix for z2
E = Error SSCP Matrix

Characteristic                Characteristic Vector  V'EV=1
          Root    Percent             y1              y2
    0.19454961     100.00     0.06903935      0.02729059
    0.00000000       0.00    -0.19496558      0.21052601

MANOVA Test Criteria and Exact F Statistics for
the Hypothesis of No Overall z2 Effect
H = Type III SSCP Matrix for z2
E = Error SSCP Matrix

S=1    M=0    N=0

Statistic                       Value    F Value    Num DF    Den DF    Pr > F
Wilks' Lambda              0.83713560       0.19         2         2    0.8371
Pillai's Trace             0.16286440       0.19         2         2    0.8371
Hotelling-Lawley Trace     0.19454961       0.19         2         2    0.8371
Roy's Greatest Root        0.19454961       0.19         2         2    0.8371
42 We can also build confidence intervals for the predicted mean value of \(\mathbf{Y}_0\) associated with \(\mathbf{z}_0\): if the model

\[
\mathbf{Y} = \mathbf{Z}\boldsymbol{\beta} + \boldsymbol{\varepsilon}
\]

has normal errors, then

\[
\hat{\boldsymbol{\beta}}'\mathbf{z}_0 \sim N_m\!\left(\boldsymbol{\beta}'\mathbf{z}_0,\; \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0\,\Sigma\right)
\]

and

\[
n\hat{\Sigma} \sim W_{n-r-1}(\Sigma),
\]

independent, so

\[
T^2 = \left(\hat{\boldsymbol{\beta}}'\mathbf{z}_0 - \boldsymbol{\beta}'\mathbf{z}_0\right)' \left[\mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0 \cdot \frac{n\hat{\Sigma}}{n-r-1}\right]^{-1} \left(\hat{\boldsymbol{\beta}}'\mathbf{z}_0 - \boldsymbol{\beta}'\mathbf{z}_0\right) \sim \frac{m(n-r-1)}{n-r-m}\,F_{m,\,n-r-m}.
\]

43 Thus the 100(1 − α)% confidence region for the predicted mean value of \(\mathbf{Y}_0\) associated with \(\mathbf{z}_0\) (\(\boldsymbol{\beta}'\mathbf{z}_0\)) is given by

\[
\left(\hat{\boldsymbol{\beta}}'\mathbf{z}_0 - \boldsymbol{\beta}'\mathbf{z}_0\right)' \left(\frac{n\hat{\Sigma}}{n-r-1}\right)^{-1} \left(\hat{\boldsymbol{\beta}}'\mathbf{z}_0 - \boldsymbol{\beta}'\mathbf{z}_0\right) \le \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0 \cdot \frac{m(n-r-1)}{n-r-m}\,F_{m,\,n-r-m}(\alpha),
\]

and the 100(1 − α)% simultaneous confidence intervals for the mean value of \(Y_i\) associated with \(\mathbf{z}_0\) (\(\mathbf{z}_0'\boldsymbol{\beta}_{(i)}\)) are

\[
\mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(i)} \pm \sqrt{\frac{m(n-r-1)}{n-r-m}\,F_{m,\,n-r-m}(\alpha)} \sqrt{\mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0 \left(\frac{n}{n-r-1}\hat{\sigma}_{ii}\right)}, \qquad i = 1, \ldots, m.
\]
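A numpy sketch of these simultaneous mean-response intervals for the example data, evaluated at a hypothetical new point \(\mathbf{z}_0 = (1, 70, 75)'\); here m = 2 and n − r − m = 2, and the critical value \(F_{2,2}(0.05) = 19.00\) is a hard-coded table value so the sketch has no scipy dependency:

```python
import numpy as np

F_CRIT = 19.00                            # F_{2,2}(0.05), hard-coded table value

Z = np.column_stack([np.ones(6),
                     [65, 72, 77, 68, 81, 73],    # palatability
                     [71, 77, 73, 78, 76, 87]])   # texture
Y = np.column_stack([[63, 70, 72, 75, 89, 76],    # overall quality
                     [67, 70, 70, 72, 88, 77]])   # purchase intent
n, r, m = 6, 2, 2

beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)
Sigma_hat = (Y - Z @ beta_hat).T @ (Y - Z @ beta_hat) / n

z0 = np.array([1.0, 70.0, 75.0])          # hypothetical new design point
quad = z0 @ np.linalg.solve(Z.T @ Z, z0)  # z0'(Z'Z)^{-1} z0
scale = np.sqrt(m * (n - r - 1) / (n - r - m) * F_CRIT)
half = scale * np.sqrt(quad * n / (n - r - 1) * np.diag(Sigma_hat))
center = z0 @ beta_hat                    # estimated mean responses at z0
lo, hi = center - half, center + half     # simultaneous 95% CIs
```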
44 Finally, we can build prediction intervals for the predicted value of \(\mathbf{Y}_0\) associated with \(\mathbf{z}_0\); here the prediction error has a normal distribution,

\[
\mathbf{Y}_0 - \hat{\boldsymbol{\beta}}'\mathbf{z}_0 \sim N_m\!\left(\mathbf{0},\; \left(1 + \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0\right)\Sigma\right)
\]

and

\[
n\hat{\Sigma} \sim W_{n-r-1}(\Sigma),
\]

independent, so

\[
T^2 = \left(\mathbf{Y}_0 - \hat{\boldsymbol{\beta}}'\mathbf{z}_0\right)' \left[\left(1 + \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0\right) \frac{n\hat{\Sigma}}{n-r-1}\right]^{-1} \left(\mathbf{Y}_0 - \hat{\boldsymbol{\beta}}'\mathbf{z}_0\right) \sim \frac{m(n-r-1)}{n-r-m}\,F_{m,\,n-r-m}.
\]

45 Thus the 100(1 − α)% prediction region associated with \(\mathbf{z}_0\) is given by

\[
\left(\mathbf{Y}_0 - \hat{\boldsymbol{\beta}}'\mathbf{z}_0\right)' \left(\frac{n\hat{\Sigma}}{n-r-1}\right)^{-1} \left(\mathbf{Y}_0 - \hat{\boldsymbol{\beta}}'\mathbf{z}_0\right) \le \left(1 + \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0\right) \frac{m(n-r-1)}{n-r-m}\,F_{m,\,n-r-m}(\alpha),
\]

and the 100(1 − α)% simultaneous prediction intervals associated with \(\mathbf{z}_0\) are

\[
\mathbf{z}_0'\hat{\boldsymbol{\beta}}_{(i)} \pm \sqrt{\frac{m(n-r-1)}{n-r-m}\,F_{m,\,n-r-m}(\alpha)} \sqrt{\left(1 + \mathbf{z}_0'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{z}_0\right)\left(\frac{n}{n-r-1}\hat{\sigma}_{ii}\right)}, \qquad i = 1, \ldots, m.
\]
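The simultaneous prediction intervals differ from the mean-response intervals only through the extra "1 +" term inside the second square root; a sketch under the same assumptions (hypothetical point \(\mathbf{z}_0 = (1, 70, 75)'\), hard-coded F critical value):

```python
import numpy as np

F_CRIT = 19.00                            # F_{2,2}(0.05), hard-coded table value

Z = np.column_stack([np.ones(6),
                     [65, 72, 77, 68, 81, 73],    # palatability
                     [71, 77, 73, 78, 76, 87]])   # texture
Y = np.column_stack([[63, 70, 72, 75, 89, 76],    # overall quality
                     [67, 70, 70, 72, 88, 77]])   # purchase intent
n, r, m = 6, 2, 2

beta_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y)
Sigma_hat = (Y - Z @ beta_hat).T @ (Y - Z @ beta_hat) / n

z0 = np.array([1.0, 70.0, 75.0])          # hypothetical new design point
quad = z0 @ np.linalg.solve(Z.T @ Z, z0)  # z0'(Z'Z)^{-1} z0
scale = np.sqrt(m * (n - r - 1) / (n - r - m) * F_CRIT)
# prediction intervals carry the extra "1 +" term for the new observation's error
half = scale * np.sqrt((1.0 + quad) * n / (n - r - 1) * np.diag(Sigma_hat))
center = z0 @ beta_hat
lo, hi = center - half, center + half     # simultaneous 95% prediction intervals
```

As expected, these intervals are wider than the corresponding confidence intervals for the mean response, since they must also absorb the variability of a single new observation.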