Title: Understanding Best Fit Linear Regression
1Understanding Best Fit Linear Regression
- Dr. David Thomas
- Centenary College
- thomas_at_centenary.edu
2- Consider the graph pictured to the right. For
the seven data points shown, which of the lines
pictured is the line of best fit?
- Line A
- Line B
- Line C
- The line of best fit is not pictured.
3- The table to the right contains data on the
amount of money spent on toys and sport supplies
in the US from 1990 through 1995. Use this data
to find the line of best fit.
4Results from Summer Program
- 14 students were given the preceding two
questions at the beginning of three weeks of
summer training and the same two questions at the
end of the training.
- They scored as follows:
- Question 1: Pretest 3 correct, Posttest 2 correct
- Question 2: Pretest 4 correct, Posttest 10 correct
5Regression Analysis
- Regression analysis is concerned with finding a
mathematical model or formula that relates the
values of one variable to those of another.
- Consider the box score data from Game 6 of the
2006 NBA Western Conference Finals on the
following slide.
6(Box score table; not included in this transcript)
7- Consider the third column in the table (FG and
FGA). This column represents two variables, field goals
made and field goals attempted.
- Of the two variables, field goals made and field
goals attempted, which is independent and which
is dependent?
- Plot field goals attempted on the x-axis and
field goals made on the y-axis. Plot one point
for each player.
8We want a formula that relates field goals made
to field goals attempted. Using your data, draw the
line that you think best fits the data.
Find the equation of your line.
9Criterion Used
- What criterion did you use to pick your line?
- One common measure of best fit is to make the
sum of the squares of the differences between the
actual y-values and the predicted y-values as
small as possible (hence the name "least squares
fit").
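In symbols, the criterion can be written as follows. The notation is introduced here for illustration and is not taken from the slides: the data points are (x_i, y_i), a candidate line is y = mx + b, and S is the sum being minimized.

```latex
S(m, b) = \sum_{i=1}^{n} \bigl( y_i - (m x_i + b) \bigr)^2
```

The least squares line is the choice of m and b that makes S as small as possible.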
11Calculating the Sum of the Squares of the
Differences
- Place the equation you think best fits the data
in Y1.
- Enter the data for field goals attempted in L1.
- Enter the data for field goals made in L2.
- Calculate the sum of the squares of the differences
between the actual and predicted values by entering
sum((L2 - Y1(L1))^2). The command sum is on the
MATH submenu of the LIST key, which is the 2nd
function of the STAT key.
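For readers without a calculator at hand, the same calculation can be sketched in Python. The lists below are made-up placeholder values, not the box score data, and the names `sum_squared_differences` and `predicted` are hypothetical; `predicted` plays the role of Y1.

```python
def sum_squared_differences(xs, ys, predict):
    """Return the sum of (actual y - predicted y)^2 over all data points."""
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))


# Placeholder data (NOT the box score): L1 = attempts, L2 = field goals made.
L1 = [4, 7, 10, 12, 15]
L2 = [2, 3, 5, 5, 8]


def predicted(x):
    """Hypothetical guess at the line of best fit, standing in for Y1."""
    return 0.5 * x


# Mirrors the calculator expression sum((L2 - Y1(L1))^2).
total = sum_squared_differences(L1, L2, predicted)
print(total)
```

Trying a different guess in `predicted` and rerunning shows how the total changes from line to line.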
12Using the Calculator to Find the Best Fit Line
- Use your calculator to find the linear regression
for the actual data (option 4 on the CALC
submenu of the STAT key). Enter this equation in
Y2. Calculate the sum of the squares of the
differences as you did on the previous slide to
find the minimum sum.
- Why do you think the difference between the
actual and predicted values is squared?
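As a rough sketch of what the calculator's linear regression computes, the slope and intercept can be found from the standard closed-form least squares formulas. The data are made-up placeholder values, not the box score, and the function names are hypothetical. The final check nudges the slope and intercept slightly and confirms that none of the nudged lines gives a smaller sum.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line for the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx  # assumes the x-values are not all equal
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sse(xs, ys, slope, intercept):
    """Sum of the squares of the differences for the line y = slope*x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Placeholder data (NOT the box score).
xs = [4, 7, 10, 12, 15]
ys = [2, 3, 5, 5, 8]

a, b = least_squares_line(xs, ys)
best = sse(xs, ys, a, b)

# Sums for lines with slightly different slopes and intercepts.
nudged = [
    sse(xs, ys, a + da, b + db)
    for da in (-0.05, 0.0, 0.05)
    for db in (-0.2, 0.0, 0.2)
]

print(a, b, best)
print(all(best <= value + 1e-12 for value in nudged))
```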
13The sum of the squares of the differences is 5.76.
This is the minimum value.
14The Theory Behind This
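The transcript does not preserve the content of the theory slide. As a hedged sketch of the standard argument, which may differ from what was presented, write the data as (x_i, y_i) and the line as y = mx + b (symbols assumed here, not taken from the slides). Setting both partial derivatives of the sum of squares to zero gives two linear equations in m and b, and solving them gives the slope and intercept:

```latex
\begin{aligned}
S(m, b) &= \sum_{i=1}^{n} (y_i - m x_i - b)^2 \\
\frac{\partial S}{\partial m} &= -2 \sum_{i=1}^{n} x_i \,(y_i - m x_i - b) = 0 \\
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{n} (y_i - m x_i - b) = 0 \\
m &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad b = \bar{y} - m\,\bar{x}
\end{aligned}
```

Here \bar{x} and \bar{y} are the means of the x-values and y-values, so the least squares line always passes through the point (\bar{x}, \bar{y}).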
16- Consider the graph pictured to the right. For
the seven data points shown, which of the lines
pictured is the line of best fit?
- Line A
- Line B
- Line C
- The line of best fit is not pictured.
17(Graph with lines labeled Line C, Line A, and Line B)
18Correlation Coefficient
- The r-value listed with the results of the linear
regression is called the correlation coefficient.
It measures how close the actual data are to
lying on a straight line. If all the data points
are on the same line, the r-value is 1 (positive
correlation) or -1 (negative correlation).
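A minimal sketch of how the r-value can be computed, using the standard Pearson formula. The function name and all three data sets are hypothetical and chosen only to illustrate the two extreme cases and one in-between case.

```python
import math


def correlation_coefficient(xs, ys):
    """Return the Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)  # assumes neither list is constant


xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]    # all points on a line with positive slope
falling = [10 - 3 * x for x in xs]  # all points on a line with negative slope
scattered = [2, 1, 4, 3, 5]         # points that are not on a single line

r_rising = correlation_coefficient(xs, rising)
r_falling = correlation_coefficient(xs, falling)
r_scattered = correlation_coefficient(xs, scattered)
print(r_rising, r_falling, r_scattered)
```

The first two values come out as 1 and -1 (up to rounding), while the third falls strictly between them.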
19- Using the calculator, enter assists in L5 and
rebounds in L6.
- Turn off all other plots and functions. Graph
these points with L5 as the x-values and L6 as
the y-values.
- Is there a correlation between these variables?
Does the number of rebounds depend on the number
of assists?
- Use your calculator to find the linear regression
equation and the correlation coefficient.