Correlation and Regression - PowerPoint PPT Presentation

About This Presentation
Title:

Correlation and Regression

Description:

Point-biserial correlation: used with one qualitative (with just 2 levels) ... Point-biserial used whenever an independent samples t-test can be used ... – PowerPoint PPT presentation

Slides: 24
Provided by: nfr5

Transcript and Presenter's Notes

Title: Correlation and Regression


1
Correlation and Regression
  • Chapter 15

2
A new sort of test
  • One qualitative and one quantitative variable:
    t-test
  • Two qualitative variables: chi square
  • What if you have two quantitative variables, and
    you want to know how much they go together?
  • E.g., study time and grades
  • → correlation

3
Information in a correlation
  • 1. direction of association (positive or
    negative)
  • Captured by sign of correlation
  • 2. strength of association (how much does one
    variable tell you about the other?)
  • Captured by absolute value of correlation, which
    can vary between 0 and 1

4
Watch wording
  • Tell about, rather than dictate
  • Correlation does not equal causation
  • Any time neither variable is manipulated, you
    just know that two variables are related or two
    groups are different; you don't know why

5
Primary use of correlation
  • Examining two quantitative variables
  • When there is a linear relationship between them

6
Graphing correlations
  • Scatterplot: one variable on X axis, one variable
    on Y axis
  • One dot per participant
  • If most people are above the mean on both
    variables or below the mean on both variables,
    positive correlation
  • If most people are below the mean on one
    variable, but above the mean on the other,
    negative correlation
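
The rule of thumb on this slide can be sketched in a few lines. The data below are hypothetical (study hours and grades for five participants), and counting dots on the same or opposite sides of the means is only a rough check of direction, not the correlation itself.

```python
# Hypothetical (study hours, grade) pairs, one per participant.
participants = [(2, 65), (4, 70), (6, 80), (8, 85), (10, 100)]

mean_x = sum(x for x, _ in participants) / len(participants)
mean_y = sum(y for _, y in participants) / len(participants)

# Participants on the same side of both means (both above or both below).
same_side = sum(1 for x, y in participants
                if (x - mean_x) * (y - mean_y) > 0)
# Participants above the mean on one variable and below it on the other.
opposite_side = sum(1 for x, y in participants
                    if (x - mean_x) * (y - mean_y) < 0)

if same_side > opposite_side:
    direction = "positive"
elif opposite_side > same_side:
    direction = "negative"
else:
    direction = "no clear direction"

print(same_side, opposite_side, direction)
```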

7
Calculating correlation
  • 1. determine whether people are above the mean or
    below the mean, and by how much
  • For each person, for each variable, subtract mean
    from score
  • 2. combine this information together
  • For each person, multiply the results from step 1
    for the two variables together: cross-products
  • Add this up across all participants
  • → sum of cross-products
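
As a minimal sketch of the two steps above, using hypothetical study-time and grade data (the variable and function names are illustrative, not from the slides):

```python
# Hypothetical data: hours studied and exam grade for five students.
study_hours = [2, 4, 6, 8, 10]
grades = [65, 70, 80, 85, 100]

def deviations(values):
    """Step 1: subtract the mean from each score."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def sum_of_cross_products(x, y):
    """Step 2: multiply each person's two deviations, then add them up."""
    return sum(dx * dy for dx, dy in zip(deviations(x), deviations(y)))

sp = sum_of_cross_products(study_hours, grades)
print(sp)  # positive here, since high study time goes with high grades
```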

8
Taking into account variability of variables
  • If we just used the sum of cross-products, we
    would end up with a large value if there were
    many participants, or if the variables had a wide
    possible range of values
  • → need to correct for this
  • Divide by square root of (sum of squares for one
    variable, times sum of squares for the other
    variable)
  • → value that can vary between -1 and 1
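
The correction described above can be sketched as follows. The data are hypothetical and the function name `pearson_r` is just an illustrative label.

```python
import math

def pearson_r(x, y):
    """Sum of cross-products divided by sqrt(SS_x * SS_y)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sp = sum(a * b for a, b in zip(dx, dy))   # sum of cross-products
    ss_x = sum(a * a for a in dx)             # sum of squares for X
    ss_y = sum(b * b for b in dy)             # sum of squares for Y
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data: hours studied and exam grade for five students.
study_hours = [2, 4, 6, 8, 10]
grades = [65, 70, 80, 85, 100]

r = pearson_r(study_hours, grades)
print(round(r, 3))
```

Because both the numerator and the denominator grow with the number of participants and with the spread of the scores, the ratio stays between -1 and 1.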

9
Is it significant?
  • Null hypothesis for correlation?
  • How big is big enough that it's not due to
    chance?
  • Critical value of correlation
  • Based on df = n - 2
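
One common way to get a critical value of r is to convert a critical t value with the same df. The sketch below assumes the usual null hypothesis that the population correlation is zero, a two-tailed test at alpha = .05, and a hypothetical sample (r = .70, n = 10). The critical t of 2.306 for df = 8 is taken from a standard t table.

```python
import math

def t_from_r(r, n):
    """Convert a correlation into a t statistic with df = n - 2."""
    df = n - 2
    return r * math.sqrt(df / (1 - r ** 2))

def critical_r(t_critical, df):
    """Smallest |r| that reaches the given critical t for this df."""
    return t_critical / math.sqrt(t_critical ** 2 + df)

n = 10                  # hypothetical sample size
r = 0.70                # hypothetical observed correlation
df = n - 2
T_CRITICAL_DF8 = 2.306  # two-tailed, alpha = .05, df = 8 (from a t table)

r_crit = critical_r(T_CRITICAL_DF8, df)
significant = abs(r) > r_crit
print(round(r_crit, 3), round(t_from_r(r, n), 3), significant)
```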

10
Fun with correlation
  • Predicting one variable based on another
  • Who will do best in college?
  • Measuring whether a measure is valid
  • Is it related to other measures of the same
    construct?
  • Measuring whether a measure is reliable
  • Is there a relationship between people's answers
    at time 1 and people's answers at time 2?

11
Some points of caution
  • Outliers
  • Restricted range

12
Telling the world
  • r(df) = correlation value, p information
  • Or
  • r = correlation value, n = sample size, p
    information

13
What about effect size?
  • Measured by r²
  • Symbol for correlation: r
  • → to calculate r²?
  • Referred to as coefficient of determination
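
As a quick illustration with a hypothetical correlation value, squaring r gives the coefficient of determination, usually read as the proportion of variance in one variable accounted for by the other:

```python
r = 0.50                 # hypothetical correlation
r_squared = r ** 2       # coefficient of determination

print(f"r = {r:.2f}, r squared = {r_squared:.2f}")
```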

14
What if the association isn't linear?
  • Correlation, so far, only works with linear
    associations between variables
  • The correlation we've talked about so far:
    Pearson correlation
  • Most common form of correlation
  • If people just say "correlation", they mean
    Pearson correlation

15
Curvilinear associations
  • Can still use a correlation to examine, as long
    as it's not a U-shaped or inverted U-shaped
    association
  • Can use Spearman correlation
  • 1. replace raw scores with rank order
  • 2. calculate Pearson correlation, using rank
    order data
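
The two steps above can be sketched like this. The data are hypothetical, and giving tied scores the average of their ranks is an assumption (a common convention that the slides do not specify).

```python
import math

def pearson_r(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sp = sum(a * b for a, b in zip(dx, dy))
    return sp / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def ranks(values):
    """Step 1: replace raw scores with rank order (ties share the average rank)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_r(x, y):
    """Step 2: Pearson correlation computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical curvilinear (but always increasing) association.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]

print(round(pearson_r(x, y), 3), round(spearman_r(x, y), 3))
```

On data like these, Spearman reaches 1 because the rank orders match exactly, while Pearson falls short of 1 because the raw scores do not lie on a straight line.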

16
Independent samples t-tests are a special case of
correlation
  • Point-biserial correlation: used with one
    qualitative (with just 2 levels) variable and one
    quantitative variable, i.e., the same situation
    in which you can use an independent samples t-test
  • 1. assign 0 to one group, and 1 to the other
  • 2. compute a Pearson correlation
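
A small sketch of why the two are equivalent, using hypothetical scores for two groups. The t statistic obtained from the point-biserial r (via t = r × sqrt(df / (1 - r²)), with df = n - 2) matches the pooled-variance independent samples t computed directly.

```python
import math

def pearson_r(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sp = sum(a * b for a, b in zip(dx, dy))
    return sp / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Hypothetical scores for two groups.
group_a = [4, 5, 6, 5]
group_b = [7, 8, 9, 8]

# Step 1: assign 0 to one group and 1 to the other.
codes = [0] * len(group_a) + [1] * len(group_b)
scores = group_a + group_b

# Step 2: compute a Pearson correlation on the coded data.
r_pb = pearson_r(codes, scores)
df = len(scores) - 2
t_from_r = r_pb * math.sqrt(df / (1 - r_pb ** 2))

# Independent samples t-test (pooled variance) for comparison.
mean_a = sum(group_a) / len(group_a)
mean_b = sum(group_b) / len(group_b)
ss_a = sum((v - mean_a) ** 2 for v in group_a)
ss_b = sum((v - mean_b) ** 2 for v in group_b)
pooled_variance = (ss_a + ss_b) / df
standard_error = math.sqrt(pooled_variance * (1 / len(group_a) + 1 / len(group_b)))
t_independent = (mean_b - mean_a) / standard_error

print(round(r_pb, 3), round(t_from_r, 3), round(t_independent, 3))
```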

17
Chi square is a special case of correlation
  • Phi coefficient: use when both variables are
    qualitative, with just two levels
  • 1. assign 0 and 1 to the two levels of one
    variable
  • 2. assign 0 and 1 to the two levels of the other
    variable
  • 3. compute a Pearson correlation
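
The link to chi square can be sketched with a hypothetical 2 x 2 table of counts: the Pearson correlation on the 0/1 codes is phi, and n × phi² reproduces the chi square statistic computed from observed and expected frequencies.

```python
import math

def pearson_r(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sp = sum(a * b for a, b in zip(dx, dy))
    return sp / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Hypothetical counts for each (variable 1 level, variable 2 level) cell.
counts = {(0, 0): 10, (0, 1): 5, (1, 0): 4, (1, 1): 11}

# Steps 1 and 2: one 0/1 code per participant on each variable.
var1, var2 = [], []
for (level1, level2), count in counts.items():
    var1 += [level1] * count
    var2 += [level2] * count

# Step 3: compute a Pearson correlation on the codes.
phi = pearson_r(var1, var2)
n = len(var1)

# Chi square computed directly from observed and expected counts.
row_totals = {i: sum(c for (a, _), c in counts.items() if a == i) for i in (0, 1)}
col_totals = {j: sum(c for (_, b), c in counts.items() if b == j) for j in (0, 1)}
chi_square = sum(
    (counts[(i, j)] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in (0, 1) for j in (0, 1)
)

print(round(phi, 3), round(n * phi ** 2, 3), round(chi_square, 3))
```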

18
Correlation summary
  • Most common form (Pearson): used with two
    continuous variables, in a linear association
  • Spearman: used with curvilinear (but not
    U-shaped) associations
  • Point-biserial: used whenever an independent
    samples t-test can be used
  • Phi: used when a chi square test of independence
    (with just 2 levels/variable) can be used
  • Can vary between -1 and 1
  • Does not tell anything about causation

19
Back to the idea of prediction
  • With correlation, you can predict the value of
    one variable based on the value of another
    variable
  • If you know someone's marital problems, you can
    predict that person's level of satisfaction
  • But, if you knew more about that person you could
    do an even better job predicting satisfaction
  • → regression: used to predict one quantitative
    variable from a whole mess of quantitative
    variables

20
Building up to regression
  • First, the equation for a line?
  • Y = bX + a
  • AKA Y = mX + b
  • In both, have intercept and slope
  • Intercept: predicted value of Y when X is zero
  • Slope: how much Y is predicted to change as X
    changes by one unit
  • Goal of regression line: minimize the discrepancy
    between predicted and actual values of Y
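
A minimal sketch of fitting Y = bX + a, assuming the usual least-squares criterion (minimizing the squared discrepancies between predicted and actual Y). The data and names are hypothetical.

```python
def fit_line(x, y):
    """Return (slope b, intercept a) of the least-squares line Y = bX + a."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = sp / ss_x                      # change in predicted Y per unit of X
    intercept = mean_y - slope * mean_x    # predicted Y when X is zero
    return slope, intercept

# Hypothetical data: hours studied and exam grade for five students.
study_hours = [2, 4, 6, 8, 10]
grades = [65, 70, 80, 85, 100]

b, a = fit_line(study_hours, grades)
predicted_grade = b * 7 + a   # predicted grade for 7 hours of study
print(b, a, predicted_grade)
```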

21
Linking this to correlation
  • Correlation = slope of the regression line, if
    the scores are in z-scores
  • → predicted z-score for Y variable = correlation
    value × z-score for X variable
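
This link can be checked numerically. The sketch below uses hypothetical data, converts both variables to z-scores, fits a least-squares slope to the z-scores, and compares it with the Pearson correlation computed from the raw scores.

```python
import math

def z_scores(values):
    """Convert raw scores to z-scores (mean 0, standard deviation 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def pearson_r(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sp = sum(a * b for a, b in zip(dx, dy))
    return sp / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Hypothetical data: hours studied and exam grade for five students.
study_hours = [2, 4, 6, 8, 10]
grades = [65, 70, 80, 85, 100]

zx = z_scores(study_hours)
zy = z_scores(grades)

# Least-squares slope for z-scores (their means are zero, so no centering needed).
slope_z = sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)
r = pearson_r(study_hours, grades)

# Predicted z-score on Y for someone one standard deviation above the mean on X.
predicted_zy = r * 1.0
print(round(slope_z, 3), round(r, 3), round(predicted_zy, 3))
```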

22
Difference between regression and correlation
  • Correlation is a special case of regression, with
    just one predictor variable
  • Regression lets you add in more predictor
    variables to:
  • Figure out how much of the Y variable is
    explained by a whole mess of predictor variables
  • Figure out how much each predictor variable
    uniquely tells about the Y variable
  • → two tests for significance: for the whole
    model, and for each individual variable
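
A rough sketch of regression with two predictors, solving the least-squares equations by hand. The data are hypothetical and were built from Y = 1 + 2·X1 + 3·X2 with no noise, so the recovered values can be checked. The significance tests mentioned above are not shown.

```python
def centered(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def fit_two_predictors(x1, x2, y):
    """Least-squares fit of Y = a + b1*X1 + b2*X2; returns (a, b1, b2, R squared)."""
    d1, d2, dy = centered(x1), centered(x2), centered(y)
    s11, s22, s12 = dot(d1, d1), dot(d2, d2), dot(d1, d2)
    s1y, s2y, syy = dot(d1, dy), dot(d2, dy), dot(dy, dy)
    det = s11 * s22 - s12 ** 2          # zero if the predictors are redundant
    b1 = (s22 * s1y - s12 * s2y) / det  # unique contribution of X1
    b2 = (s11 * s2y - s12 * s1y) / det  # unique contribution of X2
    n = len(y)
    a = sum(y) / n - b1 * sum(x1) / n - b2 * sum(x2) / n
    r_squared = (b1 * s1y + b2 * s2y) / syy  # share of Y explained by both
    return a, b1, b2, r_squared

# Hypothetical predictors and outcome (Y = 1 + 2*X1 + 3*X2 exactly).
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [9, 8, 19, 18, 29]

a, b1, b2, r_squared = fit_two_predictors(x1, x2, y)
print(round(a, 3), round(b1, 3), round(b2, 3), round(r_squared, 3))
```

Each slope here reflects what one predictor tells about Y with the other predictor held constant, which is the "uniquely tells about" idea on this slide.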

23
Be sure to review
  • What correlation and regression are
  • When to use each type
  • How to calculate all types of correlations
  • How to interpret results