Linear Regression

Provided by: AnnN154

1
Linear Regression

Least Squares Method: the Meaning of r²

2
We are given the following ordered pairs:
(1.2, 1), (1.3, 1.6), (1.7, 2.7), (2, 2), (3, 1.8),
(3, 3), (3.8, 3.3), (4, 4.2). They are shown in the
scatterplot below.
3
Now we show the line y = ȳ, where ȳ is the mean
of the y values (here ȳ = 2.45).
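As a quick check of where that horizontal line sits, here is a minimal Python sketch that computes the mean of the y values from the ordered pairs on slide 2. The variable names are illustrative and not part of the original slides.

```python
# Ordered pairs from slide 2, stored as (x, y).
points = [(1.2, 1), (1.3, 1.6), (1.7, 2.7), (2, 2),
          (3, 1.8), (3, 3), (3.8, 3.3), (4, 4.2)]

ys = [y for _, y in points]

# The horizontal line y = y_mean is the mean of the y values.
y_mean = sum(ys) / len(ys)
print(round(y_mean, 2))  # 2.45
```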
4
The line segments show the deviations from the
mean for each data point.
5
The squares of the deviations are shown
geometrically. Squaring makes every term
nonnegative, so deviations above and below the
mean cannot cancel. The greater the variation in
y, the larger the squares. If the y values are
close together, the squares will be small.
6
This is the geometric representation of the sum
of the squares from the previous slide.
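The sum pictured on this slide can be reproduced numerically. The sketch below, written under the assumption that the data are exactly the eight pairs from slide 2, computes each deviation from the mean, squares it, and adds the squares. The names `deviations` and `sst` are illustrative.

```python
# y values of the eight data points from slide 2.
ys = [1, 1.6, 2.7, 2, 1.8, 3, 3.3, 4.2]
y_mean = sum(ys) / len(ys)

# Deviation of each y value from the mean; these add to zero.
deviations = [y - y_mean for y in ys]

# Squaring removes the sign; the total is the sum of squares about the mean.
squares = [d ** 2 for d in deviations]
sst = sum(squares)
print(round(sst, 2))  # 7.6
```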
7
Now the best fit line is shown.
8
The directed distance y − ŷ is called the
residual. For each point this is the difference
between the actual y value (y) and the predicted
y value (ŷ). As with deviations, some residuals
are positive and some are negative. For the least
squares line, they add to zero.
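The slides do not state the equation of the best fit line, so the following sketch assumes the standard closed-form least squares formulas (slope = Sxy / Sxx, intercept = ȳ − slope · x̄) to obtain it. It then lists the residuals and checks that they add to zero up to floating-point rounding.

```python
# Ordered pairs from slide 2, stored as (x, y).
points = [(1.2, 1), (1.3, 1.6), (1.7, 2.7), (2, 2),
          (3, 1.8), (3, 3), (3.8, 3.3), (4, 4.2)]
n = len(points)

x_mean = sum(x for x, _ in points) / n
y_mean = sum(y for _, y in points) / n

# Standard least squares formulas (an assumption; not shown on the slides).
sxy = sum((x - x_mean) * (y - y_mean) for x, y in points)
sxx = sum((x - x_mean) ** 2 for x, _ in points)
slope = sxy / sxx
intercept = y_mean - slope * x_mean

# Residual = actual y minus predicted y.
residuals = [y - (intercept + slope * x) for x, y in points]
for (x, y), e in zip(points, residuals):
    print(f"x={x}, y={y}, residual={e:+.3f}")

# The residuals of the least squares line sum to zero (within rounding).
print(abs(sum(residuals)) < 1e-9)  # True
```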
9
This graph gives a geometric representation of
the squares of the residuals. As with the
squares of the deviations, this produces only
nonnegative quantities.
10
This is the geometric representation of the sum
of the squares of the residuals. This quantity
is minimized in the least squares method of
linear regression. We use the line that produces
the smallest sum of the squares of the residuals.
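To illustrate what "minimized" means here, the sketch below computes the sum of squared residuals for the least squares line and compares it with a few slightly shifted lines. The helper `sse` and the size of the nudges are hypothetical choices for illustration; the fitted line again comes from the standard closed-form formulas, which the slides do not show.

```python
# Ordered pairs from slide 2, stored as (x, y).
points = [(1.2, 1), (1.3, 1.6), (1.7, 2.7), (2, 2),
          (3, 1.8), (3, 3), (3.8, 3.3), (4, 4.2)]
n = len(points)
x_mean = sum(x for x, _ in points) / n
y_mean = sum(y for _, y in points) / n

# Least squares slope and intercept (standard formulas, assumed here).
slope = (sum((x - x_mean) * (y - y_mean) for x, y in points)
         / sum((x - x_mean) ** 2 for x, _ in points))
intercept = y_mean - slope * x_mean


def sse(m, b):
    """Sum of squared residuals for the line y = b + m * x."""
    return sum((y - (b + m * x)) ** 2 for x, y in points)


best = sse(slope, intercept)
print(round(best, 2))  # 2.44

# Any other line gives a larger sum; try a few nudged lines.
nudges = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
others = [sse(slope + dm, intercept + db) for dm, db in nudges]
print(all(value > best for value in others))  # True
```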
11
We now see both the squares of the deviations
from the mean (green squares) and squares of
residuals (red squares).
12
This geometric representation of the sum of the
squares of the residuals (in red) shows that this
quantity is a portion of the total sum of the
squares of the deviations (imagine the entire
green square). All of the variation in y is
represented by the larger green square, and the
part that is not explainable by the regression
equation is in red.
The green square is called SST, the total sum of
squares about the mean. The red square is called
SSE, the sum of squares of the error about the
line.
13
If we now measure the quantities SSE and SST, we
can make a useful calculation, the coefficient of
determination, or r². Recall that SST is the
total sum of the squares of the deviations about
the mean value of y, and SSE is the sum of the
squares of the error (residuals) about the line.
The coefficient of determination is the fraction
of SST that is not left over as SSE:
r² = (SST − SSE) / SST = 1 − SSE / SST
N.B. This r² is the square of the correlation
coefficient.
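The note that r² equals the square of the correlation coefficient can be checked on the example data. This sketch assumes the usual Pearson formula r = Sxy / sqrt(Sxx · Syy), which the slides do not spell out, and compares r² with 1 − SSE / SST.

```python
import math

# Ordered pairs from slide 2, stored as (x, y).
points = [(1.2, 1), (1.3, 1.6), (1.7, 2.7), (2, 2),
          (3, 1.8), (3, 3), (3.8, 3.3), (4, 4.2)]
n = len(points)
x_mean = sum(x for x, _ in points) / n
y_mean = sum(y for _, y in points) / n

sxx = sum((x - x_mean) ** 2 for x, _ in points)
syy = sum((y - y_mean) ** 2 for _, y in points)  # this is SST
sxy = sum((x - x_mean) * (y - y_mean) for x, y in points)

# Pearson correlation coefficient (standard formula, assumed here).
r = sxy / math.sqrt(sxx * syy)

# Coefficient of determination from the least squares fit.
slope = sxy / sxx
intercept = y_mean - slope * x_mean
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in points)
r_squared = 1 - sse / syy

# The two quantities agree up to floating-point rounding.
print(abs(r ** 2 - r_squared) < 1e-9)  # True
```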
14
In our example, SST = 7.6 and SSE ≈ 2.44.
Therefore,
r² = (SST − SSE) / SST = (7.6 − 2.44) / 7.6 ≈ 0.68
This means that 68% of the variation in y is
explained by the regression line. The meaning of
r² is extremely important in statistics.
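The arithmetic on this slide is small enough to verify directly. The sketch below uses only the two rounded values quoted on the slide; the variable names are illustrative.

```python
# Rounded values quoted on the slide.
sst = 7.6   # total sum of squares about the mean
sse = 2.44  # sum of squares of the residuals about the line

# Coefficient of determination: share of SST not left over as SSE.
r_squared = (sst - sse) / sst
print(f"{r_squared:.0%}")  # 68%
```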
