Title: Chapter 7 Part 1
1Chapter 7 - Part 1
2Correlation Topics
- Co-relationship between two variables.
- Linear vs Curvilinear relationships
- Positive vs Negative relationships
- Strength of relationship
3Mythical relationship between Baseball and
Football performance
Football skill predicts baseball skill.
There is a strong relationship.
Player   Baseball skill   Football skill
Al       Very good        Very good
Ben      Very poor        Very poor
Chuck    Good             Good
David    Terrible         Terrible
Ed       Poor             Poor
Frank    Average          Average
George   Excellent        Excellent
Baseball skill predicts football skill.
Is this a linear relationship?
4First we must arrange the scores in order
Player   Football skill   Baseball skill
David    Terrible         Terrible
Ben      Very Poor        Very Poor
Ed       Poor             Poor
Frank    Average          Average
Chuck    Good             Good
Al       Very Good        Very Good
George   Excellent        Excellent
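The ordering step can be sketched in a few lines of Python. This is only an illustration: the names `SKILL_ORDER`, `RANK`, and `players` are hypothetical, and the skill labels are the ones listed for the seven players on the slides.

```python
# Skill labels from lowest to highest, as used on the slides.
SKILL_ORDER = ["Terrible", "Very poor", "Poor", "Average",
               "Good", "Very good", "Excellent"]
RANK = {label: position for position, label in enumerate(SKILL_ORDER)}

# (baseball skill, football skill) for each player, in the original slide order.
players = {
    "Al": ("Very good", "Very good"),
    "Ben": ("Very poor", "Very poor"),
    "Chuck": ("Good", "Good"),
    "David": ("Terrible", "Terrible"),
    "Ed": ("Poor", "Poor"),
    "Frank": ("Average", "Average"),
    "George": ("Excellent", "Excellent"),
}

# Arrange the players from lowest to highest football skill.
ordered = sorted(players, key=lambda name: RANK[players[name][1]])
print(ordered)
# ['David', 'Ben', 'Ed', 'Frank', 'Chuck', 'Al', 'George']
```

Because every player has the same label on both skills, sorting by baseball skill instead would give the same order.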
5Then we plot the scores
[Scatterplot: Football Skill plotted against Baseball Skill; the points run in a straight line from David (lowest) to George (highest).]
This is definitely a linear relationship!
6Let's get more abstract?
[Plot: Football Skill on the Y axis, Baseball Skill on the X axis.]
7Linear or nonlinear? Let's look at another set of values.
Football skill: Terrible, Average, Average, Very Good, Excellent, Good, Poor
Is this a linear relationship?
8Is this linear?
[Scatterplot: Football Skill plotted against Baseball Skill; labeled points: Chuck, Frank, Al, Ben, Ed, George, David.]
NO! It is best described by a curved line. It is a curvilinear relationship!
9Positive vs Negative relationships
- In a positive relationship, as one value increases, the other value tends to increase as well. Example: The longer a sailboat is, the more it tends to cost. As length goes up, price tends to go up.
- In a negative relationship, as one value increases, the other value tends to decrease. Example: The older a sailboat is, the less it tends to cost. As years go up, price tends to go down.
11Positive vs Negative scatterplot
12Correlation Characteristics
Linear vs Curvilinear
13The strength of a relationship tells us
approximately how the dots will fall around a
best fitting line.
- Perfect - scores fall exactly on a straight line.
- Strong - most scores fall near the line.
- Moderate - some are near the line, some not.
- Weak - lots of scores fall close to the line, but many fall quite far from it.
- Independent - the scores are not close to the line and form a circular or square pattern.
14Strength of a relationship
15Strength of a relationship
16Strength of a relationship
Moderate
17Strength of a relationship
18What is this relationship?
19What is this?
20What is this?
21What is this?
22Comparing apples to oranges? Use t scores!
- You can use correlation to look for the relationship between ANY two values that you can measure on a single subject.
- However, there may not be any relationship (independence).
- A correlation tells us if scores are consistently similar on two measures, consistently different from each other, or have no real pattern.
23Comparing apples to oranges? Use t scores!
- To compare scores on two different variables, you transform them into tX and tY scores.
- tX and tY scores can be directly compared to each other to see whether they are consistently similar, consistently quite different, or show no consistent pattern of similarity or difference.
24Similar tX and tY scores = positive correlation. Dissimilar = negative correlation. No pattern = independence.
- When t scores are consistently more similar than different, we have a positive correlation.
- When t scores are consistently more different than similar, we have a negative correlation.
- When t scores show no consistent pattern of similarity or difference, we have independence.
25Comparing variables
- Anxiety symptoms, e.g., heartbeat, with number of hours driving to class.
- Hat size with drawing ability.
- Math ability with verbal ability.
- Number of children with IQ.
- Turn them all into t scores.
26Pearson's Correlation Coefficient
- coefficient - noun, a number that serves as a measure of some property.
- The correlation coefficient indexes the consistency and direction of a correlation.
- Pearson's rho (ρ) is the parameter that characterizes the strength and direction of a linear relationship (and only a linear relationship) between two population variables.
- Pearson's r is a least squares, approximately unbiased estimate of rho.
27Pearson's Correlation Coefficient
- r and rho vary from -1.000 to +1.000.
- A negative value indicates a negative relationship; a positive value indicates a positive relationship.
- Values of r close to 1.000 or -1.000 indicate a strong (consistent) relationship; values close to 0.000 indicate a weak (inconsistent) or independent relationship.
28r, strength and direction
Relationship           r
Perfect, positive     1.00
Strong, positive       .75
Moderate, positive     .50
Weak, positive         .25
Independent            .00
Weak, negative        -.25
Moderate, negative    -.50
Strong, negative      -.75
Perfect, negative    -1.00
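As a rough illustration, the benchmark values above can be turned into a small lookup. The function name `describe_r` and the "nearest benchmark" rule are assumptions made for this sketch; the slides list only the benchmark values themselves and do not say where one label ends and the next begins.

```python
# Benchmark magnitudes of r from the slide (a perfect |r| = 1.00 is handled separately).
BENCHMARKS = [(0.75, "Strong"), (0.50, "Moderate"),
              (0.25, "Weak"), (0.00, "Independent")]

def describe_r(r):
    """Label r by its nearest benchmark (an assumed rule, not from the slides)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size == 1.0:
        label = "Perfect"
    else:
        # Pick the benchmark whose magnitude is closest to |r|.
        label = min(BENCHMARKS, key=lambda pair: abs(pair[0] - size))[1]
    if label == "Independent":
        return label
    direction = "positive" if r > 0 else "negative"
    return f"{label}, {direction}"

print(describe_r(-0.50))  # Moderate, negative
```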
29Calculating Pearson's r
- Select a random sample from a population; obtain scores on two variables, which we will call X and Y.
- Convert all the scores into t scores.
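The conversion step might look like the following in Python. This is a minimal sketch, assuming a t score is a score's deviation from the sample mean divided by the sample standard deviation s (the square root of SS/(n - 1)); the function name `t_scores` is hypothetical. The sample values are the X data from this chapter's worked example.

```python
from statistics import mean, stdev

def t_scores(scores):
    """Convert raw scores to t scores: (score - sample mean) / s."""
    m = mean(scores)
    s = stdev(scores)  # sample standard deviation: sqrt(SS / (n - 1))
    return [(x - m) / s for x in scores]

x = [2, 4, 6, 8, 10]
print([round(t, 2) for t in t_scores(x)])
# [-1.26, -0.63, 0.0, 0.63, 1.26]
```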
30Calculating Pearson's r
- First, subtract the tY score from the tX score in each pair.
- Then square all of the differences and add them up, that is, $\sum (t_X - t_Y)^2$.
31Calculating Pearson's r
- Estimate the average squared distance between ZX and ZY by dividing the sum of squared differences by (nP - 1), that is, $\sum (t_X - t_Y)^2 / (n_P - 1)$.
- To turn this estimate into Pearson's r, use the formula $r = 1 - \frac{1}{2} \cdot \frac{\sum (t_X - t_Y)^2}{n_P - 1}$.
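For readers who know the cross-product form of r, the short derivation below suggests why this formula gives the same value. It is a sketch that assumes the t scores are computed with the sample standard deviation, so that the squared t scores on each variable sum to $n_P - 1$.

```latex
\begin{align*}
\sum t_X^2 &= \frac{\sum (X - \bar{X})^2}{s_X^2} = n_P - 1,
  \qquad \sum t_Y^2 = n_P - 1 \\
\sum (t_X - t_Y)^2 &= \sum t_X^2 + \sum t_Y^2 - 2 \sum t_X t_Y
  = 2(n_P - 1) - 2 \sum t_X t_Y \\
\frac{\sum (t_X - t_Y)^2}{n_P - 1} &= 2 - 2\,\frac{\sum t_X t_Y}{n_P - 1} \\
1 - \frac{1}{2} \cdot \frac{\sum (t_X - t_Y)^2}{n_P - 1}
  &= \frac{\sum t_X t_Y}{n_P - 1} = r
\end{align*}
```

Under that assumption, the average squared difference is 0 when every pair of t scores matches (r = 1) and 4 when every pair is exactly opposite (r = -1).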
32Note seeming exception
- Usually we divide a sum of squared deviations around a mean by df to estimate the variance.
- Here the sum of squares is not around a mean and we are not estimating a variance.
- So you divide $\sum (t_X - t_Y)^2$ by $(n_P - 1)$.
- $n_P - 1$ is not the df for correlation and regression ($df_{REG} = n_P - 2$).
33Example: Calculate t scores for X
DATA: 2, 4, 6, 8, 10
MSW = 40.00/(5-1) = 10
sX = 3.16
34Calculate t scores for Y
DATA: 9, 11, 10, 12, 13
MSW = 10.00/(5-1) = 2.50
sY = 1.58
35Calculate r
  tX      tY     tX - tY   (tX - tY)^2
-1.26   -1.26     0.00        0.00
-0.63    0.00    -0.63        0.40
 0.00   -0.63     0.63        0.40
 0.63    0.63     0.00        0.00
 1.26    1.26     0.00        0.00
$\sum (t_X - t_Y)^2 / (n_P - 1) = 0.80 / 4 = 0.200$
$r = 1.000 - \frac{1}{2} \cdot \frac{\sum (t_X - t_Y)^2}{n_P - 1}$
$r = 1.000 - \frac{1}{2}(0.200)$
$r = 1.000 - 0.100 = 0.900$
This is a very strong, positive relationship.
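The whole calculation can be checked with a short Python sketch. The function names are hypothetical, and the t scores are assumed to use the sample standard deviation (dividing SS by n - 1), as in the example. The second function computes r the conventional way, from deviation cross-products, purely as a cross-check.

```python
from statistics import mean, stdev

def t_scores(scores):
    """(score - sample mean) / sample standard deviation."""
    m, s = mean(scores), stdev(scores)
    return [(v - m) / s for v in scores]

def pearson_r_from_t(x, y):
    """r = 1 - 1/2 * sum((tX - tY)^2) / (nP - 1), the formula used in this chapter."""
    tx, ty = t_scores(x), t_scores(y)
    n_pairs = len(x)
    avg_sq_diff = sum((a - b) ** 2 for a, b in zip(tx, ty)) / (n_pairs - 1)
    return 1 - 0.5 * avg_sq_diff

def pearson_r_products(x, y):
    """Conventional form: sum of cross-products / sqrt(SS_X * SS_Y)."""
    mx, my = mean(x), mean(y)
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return sp / (ss_x * ss_y) ** 0.5

x = [2, 4, 6, 8, 10]
y = [9, 11, 10, 12, 13]
print(round(pearson_r_from_t(x, y), 3))    # 0.9
print(round(pearson_r_products(x, y), 3))  # 0.9
```

Both routes give r = .900 for these data, matching the hand calculation.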