Title: Statistics and Computing 101
1 Correlation
Week 3
2 OBJECTIVES
- 1. TO DISCUSS THE DIFFERENCE BETWEEN CORRELATION AND REGRESSION ANALYSIS.
- 2. TO DRAW SCATTER DIAGRAMS.
- 3. TO OBTAIN AND DISCUSS PEARSON'S CORRELATION COEFFICIENT.
3 1. Examining Relationships
- We often see things that are related to one another.
- Time spent studying / Examination Mark
- Mother's weight and her baby's birthweight
- Tablet Weight and Tablet Potency
- Tablet Weight and Dissolution
4 Examining Relationships
- Independent variable (also known as an explanatory or X variable): a variable that attempts to explain the variation in Y.
- Dependent variable (also known as a response or Y variable): a variable that measures the outcome of a study.
5 Correlation and Regression
- The most common procedures for examining relationships between quantitative variables.
- Correlation: is there a relationship between two (or more) variables and, if there is, how strong is it?
- Regression:
  - Explore the nature of the relationship
  - Develop a model that relates Y to X
  - Predict future values of the Y variable
6 Correlation and Regression
- Note Carefully
- Neither regression nor correlation can be interpreted as establishing cause-effect relationships.
- Correlation is probably the most abused concept in statistics.
7 2. Scatterplot (Scatter Diagram)
Archaeopteryx is the classic example of a transitional form, in this case between reptiles and birds: a so-called "missing link", according to Charles Darwin, who published his book on evolutionary theory in 1859. Today Archaeopteryx is still a key textbook example of evolutionary theory and a landmark fossil for palaeontologists and evolutionary biologists.
8 Choice of variables
- In correlation the choice of which variable to call X and which to call Y is arbitrary.
- However, in regression the labelling is important.
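The asymmetry can be checked numerically. The sketch below uses made-up study-time and exam-mark values (hypothetical, not from the lecture); `pearson_r` and `slope` are helper names introduced here for illustration. Swapping X and Y leaves r unchanged, but the fitted least-squares slope changes.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def slope(xs, ys):
    """Least-squares slope of the line that predicts ys from xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

hours = [2, 4, 5, 7, 9]        # hypothetical hours of study
marks = [50, 58, 65, 70, 85]   # hypothetical exam marks

r_xy = pearson_r(hours, marks)  # X = hours, Y = marks
r_yx = pearson_r(marks, hours)  # labels swapped
b_yx = slope(hours, marks)      # marks gained per extra hour
b_xy = slope(marks, hours)      # hours per extra mark

print(f"r(hours, marks) = {r_xy:.3f}, r(marks, hours) = {r_yx:.3f}")
print(f"slope of marks on hours = {b_yx:.3f}")
print(f"slope of hours on marks = {b_xy:.3f}")
```

The two correlations agree, while the two slopes answer different questions and carry different units.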
9 Scatter Diagram (1)
10 Scatter Diagram (2)
- What can we conclude by inspecting a scatter diagram?
1) Is there a relationship?
2) Shape of the relationship
3) Strength of the relationship
4) Direction of the relationship
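A scatter diagram can be sketched even without plotting software. The example below is a minimal text-only version, assuming made-up study-time and exam-mark values (hypothetical data); `ascii_scatter` is a helper name introduced here, not a library function. X runs across and Y runs upward, with one `*` per observation.

```python
def ascii_scatter(xs, ys, width=40, height=12):
    """Return a text scatter diagram with one '*' per (x, y) pair.

    Assumes xs and ys each contain at least two distinct values.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column / row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid[0] is the top line
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width

hours = [2, 4, 5, 7, 9]        # hypothetical hours of study (X)
marks = [50, 58, 65, 70, 85]   # hypothetical exam marks (Y)

plot = ascii_scatter(hours, marks)
print(plot)
```

The points rise from the lower left to the upper right, which suggests a direct, roughly linear relationship for these illustrative values.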
11 Scatter Diagram (3)
No Relationship
12 Scatter Diagram (4)
Perfect, Direct, Linear
Perfect, Inverse, Linear
13 3.0 Pearson's Correlation Coefficient
- The most widely used method of measuring correlation is Pearson's Correlation Coefficient, r.
14 Pearson Correlation Coefficient
- r is a measure of the strength of the linear relationship between the paired X and Y values in a sample
- it is a unitless (dimensionless) measure
- it ranges from -1 to 1
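Because r is unitless, changing the units of X or Y does not change it. A minimal sketch, assuming made-up tablet weights and potencies (hypothetical values); `pearson_r` is a helper name introduced here for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

weight_mg = [248, 250, 251, 253, 255]         # hypothetical tablet weights, mg
potency = [98.1, 99.0, 99.6, 100.2, 101.5]    # hypothetical potency values

weight_g = [w / 1000 for w in weight_mg]      # the same weights in grams

r_mg = pearson_r(weight_mg, potency)
r_g = pearson_r(weight_g, potency)
# Reversing the sign of one variable flips the direction, not the strength.
r_flipped = pearson_r(weight_mg, [-p for p in potency])

print(f"r with weight in mg: {r_mg:.4f}")
print(f"r with weight in g:  {r_g:.4f}")
print(f"r with potency negated: {r_flipped:.4f}")
```

The first two values match, and all three stay inside the range -1 to 1.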
15 Pearson Correlation Coefficient
Scale of r (from the diagram):
-1: Perfect Inverse Correlation
-0.7: Strong Negative
0: No Linear Correlation
0.7: Strong Positive
1: Perfect Direct Correlation
16 Correlation - Calculation
Let us calculate the correlation coefficient for our Example 1 data, first in the traditional way (by hand).
Formula
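A standard way of writing Pearson's r for n paired observations (x_i, y_i) is given below. The notation (x-bar and y-bar for the sample means) is an assumption and may differ from the form used in the lecture.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator measures how X and Y vary together; the denominator rescales it so that r always lies between -1 and 1.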
17 Correlation - Minitab (1)
18 Correlation - Minitab (2)
MTB >
Correlations (Pearson)
Correlation of Femur and Humerus = 0.994
MTB >
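The Minitab result can be cross-checked by hand. The sketch below assumes the femur and humerus lengths usually quoted for the five Archaeopteryx specimens in textbook versions of this example; the measurements are not listed in these notes, so treat the data values as an assumption.

```python
import math

# Assumed data: values commonly quoted for this textbook example,
# not listed in these notes.
femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]

n = len(femur)
mean_x = sum(femur) / n
mean_y = sum(humerus) / n

# Sums of squares and cross-products of the deviations from the means.
sxx = sum((x - mean_x) ** 2 for x in femur)
syy = sum((y - mean_y) ** 2 for y in humerus)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(femur, humerus))

r = sxy / math.sqrt(sxx * syy)
print(f"Correlation of Femur and Humerus = {r:.3f}")
```

With these assumed values the result rounds to 0.994, in agreement with the Minitab output.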
19 Lecture Exercises
- Each of the following statements contains a blunder. Explain what is wrong.
- 1. There is a high correlation between the gender of Australian workers and their income.
- 2. We found a high correlation (r = 1.09) between students' ratings of faculty teaching and ratings made by other faculty members.
- 3. The correlation between planting rate and yield of corn was found to be r = 0.23 bushel.
20 Last year's exam question
If the correlation between two measures is 1.20, we can conclude:
(a) The value of one measure increases by 1.20 when the other increases by 1.
(b) 1.20% of the observations fall on the regression line.
(c) 20% of the observations fall on the regression line.
(d) 20% of the variation in one measure is accounted for by the other.
(e) None of the above