Title: Linear regression in matrix terms
The regression function in matrix terms
Simple linear regression function
yi = β0 + β1xi + εi,  i = 1, …, n
Simple linear regression function written succinctly in matrix notation
Y = Xβ + ε, where Y is the n × 1 response vector, X is the n × 2 design matrix, β is the 2 × 1 parameter vector, and ε is the n × 1 error vector.
Definition of a matrix
An r × c matrix is a rectangular array of symbols or numbers arranged in r rows and c columns. A matrix is almost always denoted by a single capital letter in boldface type.
Definition of a vector and a scalar
A column vector is an r × 1 matrix, that is, a matrix with only one column.
A row vector is a 1 × c matrix, that is, a matrix with only one row.
A 1 × 1 matrix is called a scalar, but it's just an ordinary number, such as 29 or s².
Matrix multiplication
- The Xβ in the regression function is an example of matrix multiplication.
- Two matrices can be multiplied together only if the number of columns of the first matrix equals the number of rows of the second matrix.
- The number of rows of the resulting matrix equals the number of rows of the first matrix.
- The number of columns of the resulting matrix equals the number of columns of the second matrix.
Matrix multiplication
- If A is a 2 × 3 matrix and B is a 3 × 5 matrix, then the matrix multiplication AB is possible. The resulting matrix C = AB has (how many?) rows and (how many?) columns.
- Is the matrix multiplication BA possible?
- If X is an n × p matrix and β is a p × 1 column vector, then Xβ is an n × 1 column vector.
Matrix multiplication
The entry in the ith row and jth column of C = AB is the inner product (element-by-element products added together) of the ith row of A with the jth column of B. That is, cij = ai1b1j + ai2b2j + … = Σk aikbkj.
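The shape rules and the inner-product definition above can be checked numerically. This is an illustrative sketch, assuming NumPy is available (the slides themselves use no code); the matrices A and B are made up for the example.

```python
import numpy as np

# A is 2 x 3 and B is 3 x 5, so AB is defined and C = AB is 2 x 5.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
B = np.arange(15.0).reshape(3, 5)

C = A @ B                           # matrix product

# Entry (i, j) of C is the inner product of row i of A with column j of B.
i, j = 1, 2
assert C[i, j] == A[i, :] @ B[:, j]

print(C.shape)                      # (2, 5)

# BA is NOT possible: B has 5 columns but A has only 2 rows.
```

Note that BA would require the 5 columns of B to match the 2 rows of A, so it is undefined, answering the slide's quiz question.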
The Xβ multiplication in the simple linear regression setting
Here X is the n × 2 matrix whose first column is all 1s and whose second column holds the predictor values, so Xβ is the n × 1 vector whose ith entry is β0 + β1xi.
Matrix addition
- The Xβ + ε in the regression function is an example of matrix addition.
- Simply add the corresponding elements of the two matrices; for example, add the entry in the first row, first column of the first matrix to the entry in the first row, first column of the second matrix, and so on.
- Two matrices can be added together only if they have the same number of rows and columns.
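As a quick sketch of elementwise addition (assuming NumPy; the vectors are made-up stand-ins for Xβ and ε):

```python
import numpy as np

X_beta = np.array([[3.0], [5.0], [7.0]])    # a 3 x 1 vector, playing the role of Xβ
eps    = np.array([[0.2], [-0.1], [0.4]])   # a 3 x 1 error vector, playing the role of ε

# Addition is elementwise and requires identical shapes.
Y = X_beta + eps
print(Y.ravel())                            # [3.2 4.9 7.4]
```

Adding a 3 × 1 vector to, say, a 2 × 1 vector would raise an error, matching the rule in the last bullet.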
Matrix addition
The Xβ + ε addition in the simple linear regression setting
Multiple linear regression function written succinctly in matrix notation
Y = Xβ + ε, where X is now an n × p matrix (a column of 1s plus the predictor columns) and β is the p × 1 vector of parameters.
Least squares estimates of the parameters
Least squares estimates
The p × 1 vector b containing the estimates of the p parameters can be shown to equal (succinctly)
b = (X'X)^(-1) X'Y
where X' is the transpose of the X matrix and (X'X)^(-1) is the inverse of the X'X matrix.
Definition of the transpose of a matrix
The transpose of a matrix A is the matrix, denoted A' or A^T, whose rows are the columns of A and whose columns are the rows of A, all in the same original order.
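A minimal sketch of the transpose, assuming NumPy (the matrix is made up for illustration):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])     # 2 x 3

At = A.T                      # 3 x 2: rows of A become columns of A'
print(At.shape)               # (3, 2)

# Entry (i, j) of A' equals entry (j, i) of A.
assert At[2, 1] == A[1, 2]
```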
The X'X matrix in the simple linear regression setting
With a first column of 1s and a second column of predictor values x1, …, xn, X'X is the 2 × 2 matrix whose entries are n, Σxi, Σxi, and Σxi².
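This structure is easy to verify numerically. A sketch assuming NumPy, using the soap values from the worked example later in the deck as the predictor:

```python
import numpy as np

soap = np.array([4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0])
X = np.column_stack([np.ones_like(soap), soap])   # column of 1s, then x

# For simple linear regression, X'X = [[n, Σx], [Σx, Σx²]].
XtX = X.T @ X
print(XtX)   # entries: n = 7, Σx = 38.5, Σx = 38.5, Σx² = 218.75
```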
Definition of the identity matrix
The (square) n × n identity matrix, denoted In, is a matrix with 1s on the diagonal and 0s elsewhere.
The identity matrix plays the same role as the number 1 in ordinary arithmetic. What is the value of ?
Definition of the inverse of a matrix
The inverse A^(-1) of a square matrix (!!) A is the unique matrix such that
A^(-1)A = AA^(-1) = I
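Both definitions can be checked numerically. A sketch assuming NumPy, with a made-up invertible 2 × 2 matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])        # square, with linearly independent columns

A_inv = np.linalg.inv(A)
I2 = np.eye(2)                    # the 2 x 2 identity matrix

# The defining property of the inverse: A^(-1) A = A A^(-1) = I
assert np.allclose(A_inv @ A, I2)
assert np.allclose(A @ A_inv, I2)

# The identity acts like the number 1: I A = A I = A
assert np.allclose(I2 @ A, A)
assert np.allclose(A @ I2, A)
```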
Least squares estimates in the simple linear regression setting

soap   suds   soap*suds   soap^2
 4.0     33       132.0    16.00
 4.5     42       189.0    20.25
 5.0     45       225.0    25.00
 5.5     51       280.5    30.25
 6.0     53       318.0    36.00
 6.5     61       396.5    42.25
 7.0     62       434.0    49.00
----   ----     -------   ------
38.5    347      1975.0   218.75
Least squares estimates in the simple linear regression setting
The regression equation is suds = -2.68 + 9.50 soap
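The fitted equation above can be reproduced directly from the formula b = (X'X)^(-1) X'Y. A sketch assuming NumPy, using the soap/suds data from the table on the previous slide:

```python
import numpy as np

soap = np.array([4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0])
suds = np.array([33, 42, 45, 51, 53, 61, 62], dtype=float)

# Design matrix: a column of 1s for the intercept, then the predictor.
X = np.column_stack([np.ones_like(soap), soap])
Y = suds

# b = (X'X)^(-1) X'Y
b = np.linalg.inv(X.T @ X) @ X.T @ Y
print(np.round(b, 2))   # approximately [-2.68, 9.5]
```

The two entries of b are b0 ≈ -2.68 and b1 = 9.50, matching the slide's regression equation.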
Linear dependence and rank
If none of the columns can be written as a linear combination of the others, then we say the columns are linearly independent.
The rank of a matrix is the maximum number of linearly independent columns in the matrix.
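A quick numerical illustration of rank, assuming NumPy (the matrix is made up so that one column is an exact multiple of another):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 5.0],
              [3.0, 6.0, 9.0]])

# Column 2 is exactly 2 x column 1, so at most 2 columns are independent.
print(np.linalg.matrix_rank(A))   # 2
```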
Linear dependence is not always obvious
Implications of linear dependence on regression
- The inverse of a square matrix exists only if the columns are linearly independent.
- Since the regression estimate b depends on (X'X)^(-1), the parameter estimates b0, b1, … cannot be (uniquely) determined if some of the columns of X are linearly dependent.
Implications of linear dependence

soap1   soap2   suds
  4.0       8     33
  4.5       9     42
  5.0      10     45
  5.5      11     51
  6.0      12     53
  6.5      13     61
  7.0      14     62

Here soap2 is exactly 2 × soap1, so the columns of X are linearly dependent.
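The consequence for X'X can be seen numerically. A sketch assuming NumPy, using the soap1/soap2 data above:

```python
import numpy as np

soap1 = np.array([4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0])
soap2 = 2 * soap1                        # soap2 is exactly 2 x soap1
X = np.column_stack([np.ones_like(soap1), soap1, soap2])

# X has 3 columns but only 2 are linearly independent, so the
# 3 x 3 matrix X'X is singular: it has no inverse, and b = (X'X)^(-1) X'Y
# cannot be uniquely determined.
print(np.linalg.matrix_rank(X))          # 2
print(np.linalg.matrix_rank(X.T @ X))    # 2, not 3
```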
Fitted values and residuals
Fitted values
The vector of fitted values is ŷ = Xb = X(X'X)^(-1)X'Y = HY, where H = X(X'X)^(-1)X' is called the hat matrix.
The residual vector
e = Y - ŷ
The residual vector written as a function of the hat matrix
e = Y - HY = (I - H)Y
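The fitted-value and residual identities can be sketched numerically, again assuming NumPy and reusing the soap/suds data from the earlier worked example:

```python
import numpy as np

soap = np.array([4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0])
suds = np.array([33, 42, 45, 51, 53, 61, 62], dtype=float)
X = np.column_stack([np.ones_like(soap), soap])

# Hat matrix H = X (X'X)^(-1) X' puts the "hat" on Y: y_hat = H Y.
H = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat = H @ suds                          # fitted values, the same as X @ b
resid = (np.eye(len(suds)) - H) @ suds    # e = (I - H) Y

assert np.allclose(resid, suds - y_hat)   # both residual formulas agree
print(np.round(y_hat, 2))
```

With an intercept column in X, the residuals also sum to zero (up to floating-point rounding), which is a handy check on the computation.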