Title: Linear regression
1Linear regression
2- Suppose we found the age and weight of a sample of 10 adults.
- Create a scatterplot of the data below.
- Is there any relationship between the age and weight of these adults?
Age  24   30   41   28   50   46   49   35   20   39
Wt   256  124  320  185  158  129  103  196  110  130
3- Suppose we found the height and weight of a sample of 10 adults.
- Create a scatterplot of the data below.
- Is there any relationship between the height and weight of these adults? Is it positive or negative? Weak or strong?
Ht   74   65   77   72   68   60   62   73   61   64
Wt   256  124  320  185  158  129  103  196  110  130
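A quick way to look at the height and weight data without plotting software is a rough text scatterplot. The sketch below is only an illustration: the function name `ascii_scatter` and its grid size are assumptions, not part of the slides, and it assumes the x values and the y values are not all identical.

```python
def ascii_scatter(xs, ys, width=40, height=12):
    """Return a text scatterplot: one '*' per point, with y increasing upward."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column/row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"
    return "\n".join("".join(line).rstrip() for line in grid)

heights = [74, 65, 77, 72, 68, 60, 62, 73, 61, 64]
weights = [256, 124, 320, 185, 158, 129, 103, 196, 110, 130]

plot = ascii_scatter(heights, weights)
print(plot)
```

The same function can be called with the age data from the previous slide to compare the two patterns side by side.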
4The closer the points in a scatterplot are to a straight line, the stronger the relationship.
The farther the points are from a straight line, the weaker the relationship.
5Identify as having a positive association, a
negative association, or no association.
- Heights of mothers and heights of their adult daughters
- Age of a car in years and its current value
- Weight of a person and calories consumed
- Height of a person and the person's birth month
No association
- Number of hours spent in safety training and the number of accidents that occur
6Correlation Coefficient (r)
- A quantitative assessment of the strength and direction of the linear relationship between bivariate, quantitative data
- Pearson's sample correlation is used most
- parameter: ρ (rho)
- statistic: r
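The slides do not spell out a formula for r. One standard way to write Pearson's sample correlation for n data pairs, using the sample means, is shown here. The notation is an assumption and may differ from the form used in class, such as the equivalent z-score version.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \,\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```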
7Speed limit (mph)                  55  50  45  40  30  20
Avg. number of accidents (weekly)  28  25  21  17  11   6
Calculate r. Interpret r in context.
There is a strong, positive, linear relationship between speed limit and average number of accidents per week.
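The calculation can be checked with a few lines of Python. This is a minimal sketch, not the method used in class: `pearson_r` is a helper name chosen here, and it applies the usual sum-of-products form of Pearson's sample correlation.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

speed_limit = [55, 50, 45, 40, 30, 20]
accidents = [28, 25, 21, 17, 11, 6]

r = pearson_r(speed_limit, accidents)
print(round(r, 3))  # 0.996
```

A value this close to 1 is consistent with the interpretation above: strong, positive, and linear.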
8Properties of r (correlation coefficient)
- legitimate values of r lie in the interval [-1, 1]
9The value of r does not depend on the unit of measurement for either variable
- x (in mm)  12  15  21  32  26  19  24
- y           4   7  10  14   9   8  12
- Find r.
- Change x to cm and find r again.
The correlations are the same.
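The unit change can be tried directly. The sketch below assumes a hand-written helper `pearson_r` (a name chosen here) and converts x from mm to cm by dividing by 10.

```python
from math import sqrt, isclose

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x_mm = [12, 15, 21, 32, 26, 19, 24]
y = [4, 7, 10, 14, 9, 8, 12]
x_cm = [value / 10 for value in x_mm]  # 10 mm = 1 cm

r_mm = pearson_r(x_mm, y)
r_cm = pearson_r(x_cm, y)
print(round(r_mm, 3), round(r_cm, 3))  # 0.918 0.918
print(isclose(r_mm, r_cm))             # True
```

Rescaling x multiplies the numerator and the denominator of r by the same factor, so the ratio is unchanged.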
10The value of r does not depend on which of the two variables is labeled x
- x  12  15  21  32  26  19  24
- y   4   7  10  14   9   8  12
- Switch x and y, then find r.
The correlations are the same.
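Swapping the roles of the two variables can be checked the same way. This is a sketch using a hand-written helper `pearson_r` (a name assumed here, not from the slides).

```python
from math import sqrt, isclose

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [12, 15, 21, 32, 26, 19, 24]
y = [4, 7, 10, 14, 9, 8, 12]

r_xy = pearson_r(x, y)  # x treated as the explanatory variable
r_yx = pearson_r(y, x)  # labels switched
print(isclose(r_xy, r_yx))  # True
```

The formula treats x and y symmetrically, which is why the labels do not matter for r.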
11The value of r is non-resistant
- x  12  15  21  32  26  19  24
- y   4   7  10  14   9   8  22
- Find r.
Outliers affect the correlation coefficient.
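To see how much one unusual point matters, r can be computed for the earlier data set (last y value 12) and for this one (last y value 22). The helper `pearson_r` is a name assumed for this sketch.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [12, 15, 21, 32, 26, 19, 24]
y_original = [4, 7, 10, 14, 9, 8, 12]
y_outlier = [4, 7, 10, 14, 9, 8, 22]  # only the last value differs

r_original = pearson_r(x, y_original)
r_outlier = pearson_r(x, y_outlier)
print(round(r_original, 2), round(r_outlier, 2))  # 0.92 0.63
```

Changing a single observation moves r from about 0.92 to about 0.63, which is what non-resistant means here.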
12The value of r is a measure of the extent to which x and y are linearly related
- A value of r close to zero does not rule out a strong relationship between x and y.
r ≈ 0, but there is a definite relationship!
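A small made-up example shows how this can happen. The data below are hypothetical, not from the slides: y is exactly x squared, so the relationship is perfect but curved, and r comes out to zero. The helper `pearson_r` is a name assumed for this sketch.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: a perfect quadratic relationship, symmetric about x = 0.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r = pearson_r(x, y)
print(abs(r) < 1e-12)  # True
```

The positive and negative products cancel, so r detects no linear trend even though y is completely determined by x.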
13 Minister data (Data on Elmo)
r = .9999
So does an increase in ministers cause an
increase in consumption of rum?
14Correlation does not imply causation