Title: Linear regression models in matrix terms
1 Linear regression models in matrix terms
2 The regression function in matrix terms
3 Simple linear regression function
4 Simple linear regression function in matrix notation
5 Definition of a matrix
An r × c matrix is a rectangular array of symbols or numbers arranged in r rows and c columns. A matrix is almost always denoted by a single capital letter in boldface type.
6 Definition of a vector and a scalar
A column vector is an r × 1 matrix, that is, a matrix with only one column.
A row vector is a 1 × c matrix, that is, a matrix with only one row.
A 1 × 1 matrix is called a scalar, but it's just an ordinary number, such as 29 or σ².
7 Matrix multiplication
- The Xβ in the regression function is an example of matrix multiplication.
- Two matrices can be multiplied together only if the number of columns of the first matrix equals the number of rows of the second matrix.
- Then
  - the number of rows of the resulting matrix equals the number of rows of the first matrix.
  - the number of columns of the resulting matrix equals the number of columns of the second matrix.
8 Matrix multiplication
- If A is a 2 × 3 matrix and B is a 3 × 5 matrix, then the matrix multiplication AB is possible. The resulting matrix C = AB has 2 rows and 5 columns.
- Is the matrix multiplication BA possible?
- If X is an n × p matrix and β is a p × 1 column vector, then Xβ is an n × 1 column vector.
9 Matrix multiplication
The entry in the ith row and jth column of C is
the inner product (element-by-element products
added together) of the ith row of A with the jth
column of B.
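A quick NumPy sketch of these shape rules (the particular matrices here are made up for illustration):

```python
import numpy as np

A = np.arange(6).reshape(2, 3)       # a 2 x 3 matrix
B = np.arange(15).reshape(3, 5)      # a 3 x 5 matrix

C = A @ B                            # AB is possible: A has 3 columns, B has 3 rows
print(C.shape)                       # (2, 5): rows of A by columns of B

# The entry in row i, column j of C is the inner product of
# row i of A with column j of B.
i, j = 1, 4
print(C[i, j] == A[i, :] @ B[:, j])  # True

# BA is not possible: B has 5 columns but A has only 2 rows.
```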
10 The Xβ multiplication in the simple linear regression setting
11 Matrix addition
- The Xβ + e in the regression function is an example of matrix addition (see the sketch after this list).
- Simply add the corresponding elements of the two matrices.
- For example, add the entry in the first row, first column of the first matrix with the entry in the first row, first column of the second matrix, and so on.
- Two matrices can be added together only if they have the same number of rows and columns.
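A small sketch of that element-by-element addition (the numbers are invented):

```python
import numpy as np

Xb = np.array([2.0, 3.5, 5.0])   # hypothetical X*beta values (n = 3)
e  = np.array([0.3, -0.1, 0.4])  # hypothetical error terms, same shape

Y = Xb + e                       # corresponding elements are added together
print(Y)                         # [2.3 3.4 5.4]
```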
12 Matrix addition
13 The Xβ + e addition in the simple linear regression setting
14 Multiple linear regression function in matrix notation
15 Least squares estimates of the parameters
16 Least squares estimates
The p × 1 vector b containing the estimates of the p parameters can be shown to equal
b = (X'X)⁻¹X'Y,
where (X'X)⁻¹ is the inverse of the X'X matrix and X' is the transpose of the X matrix.
17 Definition of the transpose of a matrix
The transpose of a matrix A is a matrix, denoted A' or Aᵀ, whose rows are the columns of A and whose columns are the rows of A, all in the same original order.
18 The X'X matrix in the simple linear regression setting
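In the simple linear regression setting the design matrix X is a column of 1s next to the column of x-values, so X'X works out to a 2 × 2 matrix containing n, Σxᵢ, and Σxᵢ². A small NumPy check with made-up x-values:

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])          # made-up predictor values (n = 4)
X = np.column_stack([np.ones_like(x), x])   # design matrix: a column of 1s and the x-values

XtX = X.T @ X
print(XtX)
# [[ 4. 14.]
#  [14. 70.]]   i.e. [[n, sum(x)], [sum(x), sum(x^2)]]
```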
19Definition of the identity matrix
The (square) nn identity matrix, denoted In, is
a matrix with 1s on the diagonal and 0s
elsewhere.
The identity matrix plays the same role as the
number 1 in ordinary arithmetic.
20 Definition of the inverse of a matrix
The inverse A⁻¹ of a square (!!) matrix A is the unique matrix such that
A⁻¹A = AA⁻¹ = I.
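A short NumPy check of this definition (the matrix is arbitrary; it reuses the X'X values from the sketch above):

```python
import numpy as np

A = np.array([[ 4.0, 14.0],
              [14.0, 70.0]])       # the X'X matrix from the earlier sketch

A_inv = np.linalg.inv(A)

print(np.allclose(A_inv @ A, np.eye(2)))   # True
print(np.allclose(A @ A_inv, np.eye(2)))   # True
```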
21 Least squares estimates in the simple linear regression setting
Find X'X.
22 Least squares estimates in the simple linear regression setting
Find the inverse of X'X.
It's very messy to determine inverses by hand. We let computers find inverses for us.
Therefore
23 Least squares estimates in the simple linear regression setting
Find X'Y.
24 Least squares estimates in the simple linear regression setting
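Putting slides 21 through 24 together as a NumPy sketch with invented data: form X'X and X'Y, let the computer find the inverse, and multiply to get b = (X'X)⁻¹X'Y. (np.linalg.solve gives the same estimates and is numerically preferable to forming the inverse explicitly.)

```python
import numpy as np

# Invented simple linear regression data
x = np.array([1.0, 2.0, 4.0, 7.0])
Y = np.array([2.1, 3.9, 6.2, 10.8])
X = np.column_stack([np.ones_like(x), x])   # n x 2 design matrix

XtX = X.T @ X                 # slide 21: find X'X
XtX_inv = np.linalg.inv(XtX)  # slide 22: let the computer find the inverse
XtY = X.T @ Y                 # slide 23: find X'Y

b = XtX_inv @ XtY             # slide 24: b = (X'X)^-1 X'Y, i.e. [b0, b1]
print(b)

print(np.linalg.solve(XtX, XtY))   # same estimates, numerically preferable
```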
25 Linear dependence
If none of the columns can be written as a linear combination of the others, then we say the columns are linearly independent.
26 Linear dependence is not always obvious
Formally, the columns a1, a2, ..., an of an n × n matrix are linearly dependent if there are constants c1, c2, ..., cn, not all 0, such that
c1a1 + c2a2 + ... + cnan = 0.
27 Implications of linear dependence on regression
- The inverse of a square matrix exists only if the columns are linearly independent.
- Since the regression estimate b depends on (X'X)⁻¹, the parameter estimates b0, b1, ... cannot be (uniquely) determined if some of the columns of X are linearly dependent.
28 The main point about linear dependence
- If the columns of the X matrix (that is, if two or more of your predictor variables) are linearly dependent (or nearly so), you will run into trouble when trying to estimate the regression function, as the sketch below illustrates.
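A small sketch of that trouble with invented data: when one predictor is an exact linear combination of another, X'X is singular, so (X'X)⁻¹ does not exist.

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0])
x2 = 2.0 * x1                               # x2 is an exact linear combination of x1

X = np.column_stack([np.ones(4), x1, x2])   # intercept, x1, x2
XtX = X.T @ X

print(np.linalg.matrix_rank(XtX))           # 2, not 3: the columns are linearly dependent
print(np.linalg.det(XtX))                   # 0 (to rounding): X'X is singular

# Because (X'X)^-1 does not exist, b = (X'X)^-1 X'Y cannot be uniquely
# determined; np.linalg.inv(XtX) would fail or return meaningless numbers.
```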
29 Implications of linear dependence on regression
30 Fitted values and residuals
31 Fitted values
32 Fitted values
33 The residual vector
34 The residual vector written as a function of the hat matrix
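A NumPy sketch of these quantities, assuming the standard forms: fitted values ŷ = Xb = HY with hat matrix H = X(X'X)⁻¹X', and residual vector e = Y − ŷ = (I − H)Y (the data are the invented values used earlier):

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])
Y = np.array([2.1, 3.9, 6.2, 10.8])
X = np.column_stack([np.ones_like(x), x])
n = len(Y)

H = X @ np.linalg.inv(X.T @ X) @ X.T    # the n x n hat matrix H = X (X'X)^-1 X'
Y_hat = H @ Y                           # fitted values: H "puts a hat on" Y
resid = (np.eye(n) - H) @ Y             # residual vector e = (I - H) Y

print(np.allclose(Y_hat + resid, Y))    # True: fitted values plus residuals give Y back
```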
35 Sum of squares and the analysis of variance table
36 Analysis of variance table in matrix terms
Source       DF     SS     MS     F
Regression   p-1
Error        n-p
Total        n-1
37 Sum of squares
In general, if you pre-multiply a vector by its
transpose, you get a sum of squares.
38 Error sum of squares
39 Error sum of squares
40 Total sum of squares
Previously, we'd write SSTO = Σ(yᵢ − ȳ)².
41 An example of total sum of squares
If n = 2
42 Analysis of variance table in matrix terms
Source       DF     SS     MS     F
Regression   p-1
Error        n-p
Total        n-1
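A sketch of the table's entries with the invented data from earlier, using the usual matrix forms SSTO = Y'Y − (1/n)Y'JY (J an n × n matrix of 1s), SSE = Y'Y − b'X'Y, and SSR = SSTO − SSE:

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])
Y = np.array([2.1, 3.9, 6.2, 10.8])
X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

b = np.linalg.solve(X.T @ X, X.T @ Y)    # least squares estimates

J = np.ones((n, n))                      # n x n matrix of 1s
SSTO = Y @ Y - (Y @ J @ Y) / n           # total sum of squares
SSE  = Y @ Y - b @ X.T @ Y               # error sum of squares, e'e
SSR  = SSTO - SSE                        # regression sum of squares

MSR, MSE = SSR / (p - 1), SSE / (n - p)  # mean squares from the DF column
F = MSR / MSE
print(SSR, SSE, SSTO, F)
```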
43 Model assumptions
44 Error term assumptions
- As always, the error terms ei are
  - independent
  - normally distributed (with mean 0)
  - with equal variances σ²
- Now, how can we say the same thing using matrices and vectors?
45 Error terms as a random vector
The n × 1 random error term vector, denoted as e, is
e = [e1, e2, ..., en]'.
46 The mean (expectation) of the random error term vector
Definition: the mean of the error vector is the vector of means, E(e) = [E(e1), E(e2), ..., E(en)]'.
Assumption: each error term has mean 0, that is, E(ei) = 0.
Therefore E(e) = 0, the n × 1 vector of 0s.
47 The variance of the random error term vector
The n × n variance matrix, denoted as σ²(e), is defined as
Diagonal elements are the variances of the errors. Off-diagonal elements are the covariances between errors.
48 The ASSUMED variance of the random error term vector
BUT, we assume the error terms are independent (covariances are 0) and have equal variances (σ²).
49 Scalar by matrix multiplication
Just multiply each element of the matrix by the scalar.
50 The ASSUMED variance of the random error term vector
Under these assumptions the variance matrix simplifies to σ²(e) = σ²Iₙ, that is, the scalar σ² times the n × n identity matrix.
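A small sketch (n and σ² are made up) of that scalar-by-matrix multiplication:

```python
import numpy as np

n = 4
sigma2 = 2.5                   # a made-up common error variance

var_e = sigma2 * np.eye(n)     # scalar-by-matrix multiplication: sigma^2 times I_n
print(var_e)
# Diagonal entries are sigma^2 (equal variances);
# off-diagonal entries are 0 (independent errors have covariance 0).
```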
51 The general linear regression model
Y = Xβ + e, where
- Y is an (n × 1) vector of response values
- β is a (p × 1) vector of unknown parameters
- X is an (n × p) matrix of predictor values
- e is an (n × 1) vector of independent, normal error terms with mean 0 and (equal) variance σ², so that σ²(e) = σ²I.
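A closing NumPy sketch (all numbers invented) that simulates data from this model and recovers the parameter estimates with b = (X'X)⁻¹X'Y:

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 100, 3
beta = np.array([1.0, 2.0, -0.5])            # the unknown parameters (chosen for the demo)
X = np.column_stack([np.ones(n),             # n x p matrix of predictor values
                     rng.normal(size=n),
                     rng.normal(size=n)])
sigma = 1.5
e = rng.normal(0.0, sigma, size=n)           # independent N(0, sigma^2) error terms

Y = X @ beta + e                             # the general linear regression model

b = np.linalg.solve(X.T @ X, X.T @ Y)        # b = (X'X)^-1 X'Y
print(b)                                     # close to beta
```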