Correlation: Finding Statistical Relationships

Slides: 29
Provided by: melanie58

Transcript and Presenter's Notes

1
Section 2.2
  • Correlation: Finding Statistical Relationships

2
Placing Correlation in Context
  • Step one: Create a Scatterplot
  • Interpret the Scatterplot
  • Form, Direction, Strength as well as Outliers
  • Create a Numerical Summary
  • Mean of X and Y
  • Standard Deviation for X and Y
  • Correlation
  • Interpret the results with a model

3
Introduction
  • Linear relationships (straight lines) occur
    fairly often in practice (e.g., ACT vs. GPA)
  • A linear relation is strong if the points lie
    close to a straight line. Conversely a linear
    relationship is weak if the points are widely
    scattered about a line
  • Warning: human eyes can be fooled by changing
    the plotting scales or the amount of white space
    displayed in a scatterplot.
  • Hence, we use a numeric measure called correlation

4
Defining Correlation
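  • Correlation (r)
  • Correlation is a quantity used to measure the
    direction and strength of the linear relationship
    between two quantitative variables.
  • Formula: $r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)$,
    where $\bar{x}$, $\bar{y}$ are the means and $s_x$, $s_y$ the
    standard deviations of x and y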

5
Equation
  • Calculating r
  • Let x, y be any two quantitative variables for n
    individuals
  • The values for the first individual are $x_1$ and
    $y_1$, the values for the second are $x_2$ and $y_2$,
    etc.

6
Does a graphic help?

7
Correlation
  • Remember: $\frac{x_i - \bar{x}}{s_x}$ and $\frac{y_i - \bar{y}}{s_y}$ are the standardized
    values of variables x and y, respectively. Think
    about how the z-score standardizes values.
  • You can think of the correlation (r) as an
    average (using the degrees-of-freedom divisor,
    n - 1) of the products of the standardized values
    of the two variables x and y for the n
    observations. Remember, each observation comes
    in a pair.
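
The "average of products of standardized values" description can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `correlation` and the data values are made up for the example, and the sample standard deviation (n - 1 divisor) is assumed throughout.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Pearson r as the (n - 1)-divisor average of products of z-scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1 divisor)
    # One product of standardized values per (x, y) pair.
    z_products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)


# Hypothetical paired observations (invented for illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation(x, y)
print(round(r, 3))  # 0.775
```

Each observation contributes one product, which is why the data must come in pairs.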

8
Properties of r
  • Correlation makes no distinction between
    explanatory and response variables
  • Correlation requires that both variables be
    quantitative.
  • r does not change when we change the units of
    measurement of x, y, or both. Why?
  • r has no units of measurement. r is simply a
    number
  • For example, changing the unit of x from
    millimeter to kilometer will not change r.
    Likewise, changing the unit of y from grams to
    decagrams will not change r.
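
One way to see the answer to "Why?": standardizing divides each deviation by a standard deviation in the same units, so the units cancel before the products are averaged. The sketch below checks this numerically, and also checks that swapping the roles of x and y leaves r unchanged. The helper `correlation` and the measurement values are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Pearson r via z-scores, using the n - 1 divisor."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


# Hypothetical measurements: lengths in millimeters, masses in grams.
length_mm = [120.0, 250.0, 310.0, 480.0, 560.0]
mass_g = [15.0, 22.0, 41.0, 48.0, 70.0]

length_km = [v / 1_000_000 for v in length_mm]  # millimeters -> kilometers
mass_dag = [v / 10 for v in mass_g]             # grams -> decagrams

r_original = correlation(length_mm, mass_g)
r_rescaled = correlation(length_km, mass_dag)
r_swapped = correlation(mass_g, length_mm)      # explanatory/response swapped

# Compare with a tolerance because of floating-point rounding.
print(abs(r_original - r_rescaled) < 1e-9)  # True
print(abs(r_original - r_swapped) < 1e-9)   # True
```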

9
Properties of r
  • The correlation r is always a number between -1
    and 1
  • values of r near 0 indicate a very weak linear
    relationship
  • the strength of the linear relationship increases
    as r moves away from 0 toward either -1 or 1
  • values of r close to -1 or 1 indicate that the
    points in a scatterplot lie close to a straight
    line
  • the extreme values r = -1 and r = 1 occur only in
    the case of a perfect linear relationship
  • Correlation measures the strength of a linear
    relationship.
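
A short derivation shows why points lying exactly on a line give r = -1 or r = 1. It assumes the z-score form of r with the n - 1 divisor, and an exact line $y_i = a + b x_i$ with $b \neq 0$; the symbols a and b are introduced here for illustration and do not appear in the slides. It covers only this direction of the claim.

```latex
\begin{align*}
\bar{y} &= a + b\,\bar{x}, \qquad s_y = |b|\, s_x \\
\frac{y_i - \bar{y}}{s_y}
  &= \frac{b\,(x_i - \bar{x})}{|b|\, s_x}
   = \operatorname{sign}(b)\, \frac{x_i - \bar{x}}{s_x} \\
r &= \frac{1}{n-1} \sum_{i=1}^{n}
     \frac{x_i - \bar{x}}{s_x} \cdot \frac{y_i - \bar{y}}{s_y}
   = \operatorname{sign}(b)\, \frac{1}{n-1} \sum_{i=1}^{n}
     \left( \frac{x_i - \bar{x}}{s_x} \right)^{2}
   = \operatorname{sign}(b)
\end{align*}
```

The last step uses $\sum_{i=1}^{n} (x_i - \bar{x})^2 = (n-1)\, s_x^2$, which is the definition of the sample standard deviation. An upward-sloping line therefore gives r = 1 and a downward-sloping line gives r = -1.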

10
Properties of r
  • If r > 0, then there is a positive correlation
    between the two variables.
  • If r < 0, then there is a negative correlation
    between the two variables.
  • Once again, correlation only measures the
    strength of a linear relationship.

11
r > 0 (positive)
  • An example of a positively correlated X and Y

12
r < 0 (negative)
  • Negatively correlated X and Y

13
r near zero
  • Little if any correlation between X and Y

14
Non-linear?
  • r is not useful for non-linear data
  • The data have a clear relationship: a curve.
    But the correlation will be near 0.
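
A small numeric check makes the point concrete. The data below are hypothetical: y is exactly x squared on x values placed symmetrically around zero, so y is completely determined by x, yet the linear correlation comes out at (essentially) zero. The helper `correlation` is an illustrative name, not something from the slides.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Pearson r via z-scores, using the n - 1 divisor."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


# Hypothetical curved data: a perfect parabola, symmetric about x = 0.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r_curve = correlation(x, y)
# The positive and negative products cancel, so r is zero up to rounding.
print(abs(r_curve) < 1e-9)  # True
```

The cancellation depends on the symmetry of these particular x values. A curve sampled on only one side of its turning point can still produce a sizable r, which is another reason to look at the scatterplot first.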

15
Properties of r
  • Correlation measures the strength of the linear
    relationship between two variables
  • Correlation does not describe curved
    relationships between variables, no matter how
    strong they are
  • This is why we identify form first
  • Like the mean and standard deviation, the
    correlation is not resistant to outliers
  • r is strongly affected by a few outlying
    observations
  • This makes sense by looking at the formula

16
Correlation is not resistant
  • [Figure panels: Original Graph; Outlier Graph]
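
The effect of a single outlier can also be checked numerically. The sketch below uses hypothetical data (not the data behind the slide's graphs): six points close to a rising line, then the same points plus one observation far from the pattern. The helper name `correlation` is likewise an assumption for illustration.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Pearson r via z-scores, using the n - 1 divisor."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


# Hypothetical data lying close to a rising straight line.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]

# The same data with one outlying observation appended: (12, 1.0).
x_out = x + [12]
y_out = y + [1.0]

r_original = correlation(x, y)
r_outlier = correlation(x_out, y_out)
print(round(r_original, 2))  # 1.0
print(round(r_outlier, 2))   # -0.18
```

With these made-up numbers one added point moves r from nearly 1 to a small negative value. The outlier has large deviations in both x and y, so its z-score product dominates the sum in the formula.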

17
Other Notes
  • At times, scatterplots may disguise the true
    correlation.
  • Correlation is not a complete description of
    two-variable data, even when the relationship
    between the variables is linear.
  • Also report the means and standard deviations of
    both x and y as well as the correlation.
  • Reminder: Correlation only measures the
    strength of a linear relationship.
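
To illustrate why r alone is an incomplete description, the sketch below builds two hypothetical data sets with the same correlation but very different centers and spreads. The function names `correlation` and `summarize` and all data values are assumptions made for this example.

```python
from statistics import mean, stdev


def correlation(xs, ys):
    """Pearson r via z-scores, using the n - 1 divisor."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


def summarize(xs, ys):
    """Return both means, both standard deviations, and r."""
    return {
        "mean_x": mean(xs), "sd_x": stdev(xs),
        "mean_y": mean(ys), "sd_y": stdev(ys),
        "r": correlation(xs, ys),
    }


# Hypothetical data: y_b is a shifted and stretched copy of y_a.
x = [1, 2, 3, 4, 5]
y_a = [2, 4, 5, 4, 5]
y_b = [10 * v + 100 for v in y_a]

summary_a = summarize(x, y_a)
summary_b = summarize(x, y_b)
for label, summary in (("A", summary_a), ("B", summary_b)):
    print(label, {key: round(value, 3) for key, value in summary.items()})
```

Both summaries report the same r, but the mean and standard deviation of y differ by a wide margin, so the correlation by itself would not distinguish the two data sets.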

18
Example: ACT (x) vs. GPA (y)

19
  • Form: linear
  • r = 0.622, which tells us
  • Direction: positive
  • Strength: moderate

20
Text Example
  • Data

21
The Result

22
Example -- Correlation Calc.
  • Two managers evaluate the performance of 5
    candidates
  • n = 5
  • Mean score of X is mean score of Y
    is
  • Standard deviations

23
Example Cont.
  • Now plug the data into the formula (magnifying
    glass?)

r = 0.8. The two evaluations are highly correlated.
The managers agree on the score level for each
candidate, even though A tends to give higher scores.

24
Correlation.
  • Identify the flaws
  • We found a high correlation between eating celery
    and adult height.
  • We found a correlation of r = 1.02 between years
    of age and weight for the experimental subjects.
  • The correlation between eating x pounds of potato
    chips and gaining y pounds of weight is r = 0.55
    pounds.

25
Section 2.2 Summary
  • The correlation (r) measures the strength and
    direction of the linear association between two
    quantitative variables x and y. Although you can
    calculate a correlation for any scatterplot, r
    measures only straight-line relationships.

26
Section 2.2 Summary
  • Correlation indicates the direction of a linear
    relationship by its sign: r > 0 for a positive
    association and r < 0 for a negative association.

27
Section 2.2 Summary
  • Correlation always satisfies -1 ≤ r ≤ 1 and
    indicates the strength of a relationship by how
    close it is to -1 or 1. Perfect correlation, r =
    ±1, occurs only when the points on a
    scatterplot lie exactly on a straight line.

28
Section 2.2 Summary
  • Correlation ignores the distinction between
    explanatory and response variables. The value of
    r is not affected by changes in the unit of
    measurement of either variable. Correlation is
    not resistant, so outliers can greatly change the
    value of r.