Title: Correlation
1Correlation
- The Association Between Variables
2When to Use
- t-test or ANOVA
- When the independent variable is Categorical
3When to Use
- Correlation
- Regression
- When the independent variable is Ratio or Interval
4ScatterPlots
5ScatterPlots
6Times and costs for five word-processing jobs
7Four data points
8Direct Relationship Positive Slope
9Age and price data for a sample of 11 used cars
10Scatter diagram for the age and price data of
used cars
11Inverse Relationship Negative Slope
12Various degrees of linear correlation (Slide 1 of
3)
13Various degrees of linear correlation (Slide 3 of
3)
14Various degrees of linear correlation (Slide 2 of
3)
15Examples of positive and negative relationships
16The Simple Idea
- If the corresponding x and y z-scores are always in agreement, r will be high.
- If they are sometimes in agreement, r will be moderate.
- If they are generally different, r will be near zero.
- If they are in agreement, but in opposite directions, r will be negative.
17The Basic Idea
 Zx     Zy       Zx     Zy       Zx     Zy
 1.2    1.6      1.2   -2.3      1.2   -1.6
-1.1   -0.7     -1.1   -0.2     -1.1    0.7
 0.8    1.1      0.8    1.6      0.8   -1.1
 3.2    2.8      3.2   -0.7      3.2   -2.8
-2.7   -2.3     -2.7    1.1     -2.7    2.3
 0.1   -0.2      0.1    2.8      0.1    0.2
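The three column pairs above can be checked numerically. The following is a minimal sketch, assuming r is computed as the sum of z-score products divided by n - 1. Because the slide does not say whether the printed values are exact z-scores, the sketch re-standardizes each column first. The function and variable names are illustrative, not from the slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Correlation as the sum of z-score products divided by n - 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sample standard deviations (divisor n - 1).
    sd_x = sqrt(sum((v - mean_x) ** 2 for v in x) / (n - 1))
    sd_y = sqrt(sum((v - mean_y) ** 2 for v in y) / (n - 1))
    zx = [(v - mean_x) / sd_x for v in x]
    zy = [(v - mean_y) / sd_y for v in y]
    return sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# Values copied from the table; the Zx column is the same in all three pairs.
zx = [1.2, -1.1, 0.8, 3.2, -2.7, 0.1]
zy_agree = [1.6, -0.7, 1.1, 2.8, -2.3, -0.2]      # first pair
zy_mixed = [-2.3, -0.2, 1.6, -0.7, 1.1, 2.8]      # second pair
zy_opposite = [-1.6, 0.7, -1.1, -2.8, 2.3, 0.2]   # third pair

r_agree = pearson_r(zx, zy_agree)
r_mixed = pearson_r(zx, zy_mixed)
r_opposite = pearson_r(zx, zy_opposite)

for label, r in [("agree", r_agree), ("mixed", r_mixed), ("opposite", r_opposite)]:
    print(f"{label:>8}: r = {r:+.3f}")
```

The third pair is the first pair with every Zy sign flipped, so its r is the exact negative of the first; the second pair, where the z-scores agree only some of the time, gives a weaker r.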
18Notation Used in Regression and Correlation
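We define SSx, SSp and SSy by
$$SS_x = \sum (x - \bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n}$$
$$SS_p = \sum (x - \bar{x})(y - \bar{y}) = \sum xy - \frac{(\sum x)(\sum y)}{n}$$
$$SS_y = \sum (y - \bar{y})^2 = \sum y^2 - \frac{(\sum y)^2}{n}$$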
19Obtaining the three sums of squares for the used
car data using the computational formulas
20Linear Correlation Coefficient
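The linear correlation coefficient, r, of n data points is defined by
$$r = \frac{1}{n - 1} \sum \frac{(x - \bar{x})(y - \bar{y})}{s_x s_y}$$
where s_x and s_y are the sample standard deviations of x and y, or by the computational formula
$$r = \frac{SS_p}{\sqrt{SS_x \, SS_y}}$$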
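The computational route can be sketched in a few lines of Python. This is an illustration only: the data set is hypothetical (it is not the used-car data from the slides), and the helper names `sums_of_squares` and `correlation` are assumptions made for the example.

```python
from math import sqrt

def sums_of_squares(x, y):
    """Return (SSx, SSp, SSy) using the computational (shortcut) formulas."""
    n = len(x)
    ss_x = sum(v * v for v in x) - sum(x) ** 2 / n
    ss_y = sum(v * v for v in y) - sum(y) ** 2 / n
    ss_p = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    return ss_x, ss_p, ss_y

def correlation(x, y):
    """Linear correlation coefficient r = SSp / sqrt(SSx * SSy)."""
    ss_x, ss_p, ss_y = sums_of_squares(x, y)
    return ss_p / sqrt(ss_x * ss_y)

# Hypothetical data, chosen so the arithmetic is easy to follow by hand.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

ss_x, ss_p, ss_y = sums_of_squares(x, y)
r = correlation(x, y)
print(ss_x, ss_p, ss_y)  # 10.0 6.0 6.0
print(round(r, 4))       # 0.7746
```

By hand: the sums are 15 and 20, the sums of squares are 55 and 86, and the sum of products is 66, which gives SSx = 10, SSp = 6, SSy = 6 and r = 6 / sqrt(60).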
21Linear Correlation Coefficient
22Coefficient of Determination
The coefficient of determination, r², is the
proportion of variation in the observed values of
the response variable that is explained by the
regression. The coefficient of determination
always lies between 0 and 1 and is a descriptive
measure of the utility of the regression equation
for making predictions. Values of r² near 0
indicate that the regression equation is not
useful for making predictions, whereas values
near 1 indicate that the regression equation is
extremely useful for making predictions.
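To make "proportion of variation explained" concrete, the sketch below fits a least-squares line and divides the variation of the predicted values around the mean by the total variation of the observed values. The data are hypothetical and the names `explained` and `total` are assumptions for the example, not notation from the slides.

```python
def coefficient_of_determination(x, y):
    """Explained variation divided by total variation for a least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((v - mean_x) ** 2 for v in x)
    ss_p = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Least-squares slope and intercept.
    slope = ss_p / ss_x
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * v for v in x]
    explained = sum((p - mean_y) ** 2 for p in predicted)  # variation explained by the line
    total = sum((v - mean_y) ** 2 for v in y)              # total variation in y
    return explained / total

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_squared = coefficient_of_determination(x, y)
print(round(r_squared, 3))  # 0.6
```

For these data r is about 0.7746, and its square is 0.6, matching the ratio computed from the fitted line: 60% of the variation in y is accounted for by the regression.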
23t-Distribution for a Correlation Test
For samples of size n, the variable
$$t = \frac{r}{\sqrt{(1 - r^2)/(n - 2)}}$$
has the t-distribution with df = n - 2 if the null
hypothesis ρ = 0 is true. (ρ, or rho, is pronounced "row".)
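A short sketch of the standard conversion from r to a t statistic with n - 2 degrees of freedom shows why sample size matters: the same r yields a larger t when n is larger. The values of r and n are hypothetical, and `t_statistic` is an illustrative name.

```python
from math import sqrt

def t_statistic(r, n):
    """Convert a sample correlation r from n data points to a t value (df = n - 2)."""
    return r / sqrt((1 - r * r) / (n - 2))

# Same hypothetical correlation, two hypothetical sample sizes.
t_small = t_statistic(0.6, 6)    # df = 4
t_large = t_statistic(0.6, 18)   # df = 16
print(round(t_small, 3), round(t_large, 3))  # 1.5 3.0
```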
24The t-test for correlation (Slide 1 of 3)
With df = n - 2, use Table B.6
25The t-test for correlation (Slide 2 of 3)
26The t-test for correlation (Slide 3 of 3)
Step 4: Compute the test statistic r. Table B.6 allows a direct lookup. Alternatively, r can be converted to a t statistic with df = n - 2, and Table B.2 will yield an identical conclusion.
Step 5: If the value of the test statistic falls in the rejection region, reject the null hypothesis.
Step 6: State the conclusion in words.
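The claim that the two lookups agree can be illustrated as follows. This sketch assumes Table B.6 lists critical values of r and Table B.2 lists critical values of t; the slides do not show either table, so the critical t below (2.120, two-tailed, alpha = 0.05, df = 16) is taken from a standard t table, and the sample values are hypothetical.

```python
from math import sqrt

def t_from_r(r, n):
    """t statistic corresponding to a sample correlation r (df = n - 2)."""
    return r / sqrt((1 - r * r) / (n - 2))

def critical_r(t_crit, df):
    """Critical |r| implied by a critical t, obtained by inverting t_from_r."""
    return t_crit / sqrt(t_crit ** 2 + df)

# Hypothetical sample: r = 0.6 from n = 18 data points.
r, n = 0.6, 18
df = n - 2
t_crit = 2.120  # standard two-tailed critical t for alpha = 0.05, df = 16

reject_by_t = abs(t_from_r(r, n)) > t_crit          # compare t with critical t
reject_by_r = abs(r) > critical_r(t_crit, df)       # compare r with critical r
print(round(critical_r(t_crit, df), 3))  # 0.468
print(reject_by_t, reject_by_r)          # True True
```

Because the critical r is just the critical t pushed back through the same formula, the two comparisons always lead to the same decision.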
27Criterion for deciding whether or not to reject
the null hypothesis
28Correlation Matrix
     a      b      c      d
a   1.00
b   0.84   1.00
c   0.68   0.58   1.00
d   0.12   0.19   0.08   1.00
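Only the lower triangle is printed because the correlation of a with b equals the correlation of b with a. A minimal sketch of reading such a matrix, using the values above (the structure names `lower` and `full` are illustrative):

```python
labels = ["a", "b", "c", "d"]

# Lower triangle exactly as printed in the matrix.
lower = [
    [1.00],
    [0.84, 1.00],
    [0.68, 0.58, 1.00],
    [0.12, 0.19, 0.08, 1.00],
]

# Mirror the lower triangle to get a full symmetric lookup table.
full = {row: {} for row in labels}
for i, row in enumerate(labels):
    for j in range(i + 1):
        col = labels[j]
        full[row][col] = lower[i][j]
        full[col][row] = lower[i][j]

# Find the pair of distinct variables with the strongest correlation.
pairs = [(labels[i], labels[j]) for i in range(len(labels)) for j in range(i)]
strongest = max(pairs, key=lambda p: abs(full[p[0]][p[1]]))

print(full["a"]["c"], full["c"]["a"])  # 0.68 0.68
print(strongest)                       # ('b', 'a')
```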
29Computer printouts for correlations