PSY 360 - PowerPoint PPT Presentation

About This Presentation
Title:

PSY 360

Description:

Defined as the degree of linear relationship between X and Y ... Compute the correlation between Quiz 1 and Quiz 2 scores using SPSS. ... – PowerPoint PPT presentation

Slides: 33
Provided by: tooth

Transcript and Presenter's Notes

Title: PSY 360


1
PSY 360
  • Correlation
  • Regression

2
Correlation and Regression
  • Both examine linear (straight line) relationships
  • Both work with a pair of scores, one on each of
    two variables, X and Y
  • Correlation
  • Defined as the degree of linear relationship
    between X and Y
  • Is measured/described by the statistic r
  • Regression
  • Describes the form or function of the linear
    relationship between X and Y
  • Is concerned with the prediction of Y from X
  • Forms a prediction equation to predict Y from X

3
Why do we care?
  • Knowing the values of the one most representative
    score (central tendency) and a measure of spread
    or dispersion (variability) is critical for
    DESCRIBING the characteristics of a distribution
  • However, sometimes we are interested in the
    relationship between variables or to be more
    precise, how the value of one variable changes
    when the value of another variable changes.
    Correlation and regression help us understand
    these relationships

4
Correlation
  • The aspect of the data that we want to
    describe/measure is the degree of linear
    relationship between X and Y
  • The statistic r describes/measures the degree of
    linear relationship between X and Y
  • $r = \sum z_X z_Y / N$, the average product of z scores
    for X and Y
  • Works with two variables, X and Y
  • $-1 \le r \le +1$; the sign shows the type of linear
    relationship (negative, zero, or positive)
  • Measures only the degree of linear relationship
  • $r^2$ = proportion of variability in Y that is
    explained by X
  • r is undefined if X or Y has zero spread
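
As a rough illustration of the definitional formula, the sketch below computes r as the average product of z scores. It assumes the z scores are based on the standard deviation with N in the denominator, which is what makes the average product equal r. The function name `pearson_r` and the sample scores are hypothetical, not part of the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson r as the average product of z scores (N in the denominator)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Standard deviations with N (not N - 1) in the denominator (assumption).
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if sd_x == 0 or sd_y == 0:
        # z scores cannot be formed when a variable has zero spread.
        raise ValueError("r is undefined when X or Y has zero spread")
    z_products = [
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
        for x, y in zip(xs, ys)
    ]
    return sum(z_products) / n

# Hypothetical scores, for illustration only.
x = [1, 2, 3, 4, 5]
print(round(pearson_r(x, [2, 4, 6, 8, 10]), 3))  # points on a rising line
print(round(pearson_r(x, [10, 8, 6, 4, 2]), 3))  # points on a falling line
```

Passing a variable with no spread, such as `[5, 5, 5]`, raises the `ValueError` rather than returning a number, which mirrors the last bullet.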

5
Correlation: $-1 \le r \le +1$
  • The sign of r shows the type of linear
    relationship between X and Y. We can use the
    definitional formula for r and these scatterplots
    to see positive, negative and zero relationships

    [Figure: three scatterplots of Y against X, labeled r = 1, r = 0 and r = -1]
    11
    Interpreting Correlation Coefficients
    Size of the correlation coefficient    General interpretation
    .8 to 1.0                              very strong relationship
    .6 to .8                               strong relationship
    .4 to .6                               moderate relationship
    .2 to .4                               weak relationship
    .0 to .2                               weak or no relationship
    16
    Correlation: Linear
    • If there is a curvilinear relationship between X
      and Y, then r will not detect it. The value of r
      will be zero if there is no linear relationship
      between X and Y.

    [Figure: two scatterplots, each labeled r = 0]
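
To make the point about curvilinear relationships concrete, here is a small sketch with made-up data in which Y is completely determined by X (Y = X squared), yet r comes out at zero because the relationship has no linear component. The helper uses the cross-product form of r, which is algebraically equivalent to the average product of z scores; the data and the name `pearson_r` are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from sums of cross-products (equivalent to mean of zX * zY)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect U-shaped (curvilinear) relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(round(r, 3))  # effectively zero despite the perfect curve
```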
    17
    Correlation: $r^2$
    • $r^2$ = proportion of variability in Y that is
      explained by X.
    • If $r = .5$, then $r^2 = .25$, so the proportion of
      variability in Y that is explained by X is .25
      (as a percentage, 25% is explained by X and
      75% is unexplained).
    • Scatterplots: $r = .5$, $r^2 = .25$; $r = .7$, $r^2 = .49$;
      $r = .9$, $r^2 = .81$
    • Venn diagrams: $r^2$ is represented by the
      proportion of overlap between Y and X.

    [Figure: three Venn diagrams of Y and X]

    18
    Correlation: Undefined
    • If there is no spread in X or Y, then r is
      undefined. Note that any z is undefined if the
      standard deviation is zero, and $r = \sum z_X z_Y / N$.

    [Figure: two scatterplots of Y against X, one with $s_Y = 0$ and one with $s_X = 0$; r is undefined in both]
    19
    Correlation
    • Example of correlation
    • Murder rates and ice cream sales are positively
      correlated
    • As murder rates increase, ice cream sales also
      increase
    • Why?
    • CORRELATION DOES NOT MEAN CAUSATION
    • Murders may cause increases in ice cream sales
    • Ice cream sales may cause more murders
    • Some other variable may cause both murders and
      ice cream sales

    20
    Correlation
    • Things to remember
    • Correlations can range from -1 to 1
    • The absolute value of the correlation coefficient
      reflects the strength of the correlation. So a
      correlation of -.7 is stronger than a correlation
      of .5
    • Do not assign a value judgment to the sign of the
      correlation. Many students assume that negative
      correlations are bad and positive correlations
      are good. This is not true!
    • Population correlation coefficient, ρ (rho)
    • Impact on r
    • Restriction of range
    • Extreme scores (outliers)
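
The two influences on r listed above can be seen in a small sketch. The data are invented for illustration, and `pearson_r` is a hypothetical helper using the cross-product form of r. Restricting the range of X lowers r in this data set, while adding a single extreme score raises it.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from sums of cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical full sample: X runs from 1 to 10.
full_x = list(range(1, 11))
full_y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
r_full = pearson_r(full_x, full_y)

# Restriction of range: keep only cases with X between 4 and 7.
kept = [(x, y) for x, y in zip(full_x, full_y) if 4 <= x <= 7]
restricted_x = [x for x, _ in kept]
restricted_y = [y for _, y in kept]
r_restricted = pearson_r(restricted_x, restricted_y)

# Extreme score: add one outlying case to the restricted sample.
r_outlier = pearson_r(restricted_x + [20], restricted_y + [20])

print(f"full range:       r = {r_full:.3f}")
print(f"restricted range: r = {r_restricted:.3f}")
print(f"with one outlier: r = {r_outlier:.3f}")
```

An outlier can also pull r toward zero or flip its sign, depending on where it falls; this example shows only one of those cases.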

    21
    Regression
    • Not only can we compute the degree to which two
      variables are related (correlation coefficient),
      but we can use these correlations as the basis
      for predicting the value of one variable from the
      value of another
    • Prediction is an activity that computes future
      outcomes from present ones. When we want to
      predict one variable from another, we need to
      first compute the correlation between the two
      variables

    22
    Regression
    Total High School GPA and First-Year College GPA
    are Correlated
    23
    Regression
    r = .68 for these two variables
    24
    Regression


    • Regression is concerned with forming a prediction
      equation to predict Y from X
    • Uses the formula for a straight line, $\hat{Y} = bX + a$
    • $\hat{Y}$ is the predicted Y score on the criterion
      variable
    • b is the slope, $b = \Delta Y / \Delta X$ = rise/run
    • X is a score on the predictor variable
    • a is the Y-intercept, where the line crosses the
      Y axis, the value of $\hat{Y}$ when $X = 0$
    • Example: if $b = .695$, $a = .739$, and $X = 3.5$,
      then $\hat{Y} = .695(3.5) + .739 = 3.17$
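
The worked example on this slide can be checked with a few lines of code. This is only a sketch: the function name `predict` is hypothetical, and the slope and intercept are the values quoted in the example.

```python
def predict(x, slope, intercept):
    """Predicted Y for a given X using the straight-line equation bX + a."""
    return slope * x + intercept

# Values taken from the slide's example: b = .695, a = .739, X = 3.5.
y_hat = predict(3.5, slope=0.695, intercept=0.739)
print(round(y_hat, 2))  # 3.17
```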

    25
    Regression

    Regression line of Y on X
    26
    Regression

    Prediction of $\hat{Y}$, given $X = 3.5$
    27
    Regression


    • Linear only
    • Generalize only within the range of X values in
      your sample
    • Actual observed Y is different from $\hat{Y}$ by an
      amount called error, e; that is, $Y = \hat{Y} + e$
    • Error in regression is $e = Y - \hat{Y}$
    • Many different potential regression lines

    28
    Regression Best Line
    • There are many different potential regression
      lines, but only one best-fitting line

    • The statistics b and a are computed so as to
      minimize the sum of squared errors:
      $\sum e^2 = \sum (Y - \hat{Y})^2$ is a minimum, which is
      called the Least Squares Criterion.
    • This means that the line minimizes the total squared
      vertical distance between the individual points and
      the regression line
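
The slides do not spell out how b and a are computed. The sketch below assumes the standard closed-form least-squares solution, with b equal to the sum of cross-products of deviations divided by the sum of squared X deviations, and a equal to the mean of Y minus b times the mean of X. It then checks that nudging either value makes the sum of squared errors larger. The function names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope b and intercept a for predicting Y from X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return b, a

def sum_squared_errors(xs, ys, b, a):
    """Sum of squared differences between observed Y and predicted Y."""
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b, a = fit_line(xs, ys)
best = sum_squared_errors(xs, ys, b, a)

# Any other line should have a larger sum of squared errors.
for db, da in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.5), (0.0, -0.5)]:
    other = sum_squared_errors(xs, ys, b + db, a + da)
    print(f"b{db:+.1f}, a{da:+.1f}: larger SSE? {other > best}")
```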

    29
    Regression


    • Error in prediction: the distance between
      each individual data point and the
      regression line (a direct reflection of
      the correlation between two variables)

    [Figure: error in prediction for the data point $X = 3.3$, $Y = 3.7$]
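
A possible worked version of the labeled data point follows. It rests on an assumption the slides do not confirm: that the regression line in this figure is the same one used in the earlier example (b = .695, a = .739). Under that assumption, the error for the point X = 3.3, Y = 3.7 is the observed Y minus the predicted Y.

```python
# ASSUMPTION: the line is the one from the earlier example (b = .695, a = .739).
slope, intercept = 0.695, 0.739

# Data point labeled on the slide.
x_obs, y_obs = 3.3, 3.7

y_hat = slope * x_obs + intercept   # predicted Y for X = 3.3
error = y_obs - y_hat               # e = Y - predicted Y

print(round(y_hat, 2), round(error, 2))  # 3.03 0.67
```

A positive error means the observed point lies above the regression line.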
    30
    Regression: $s_{Y \cdot X}$


    • Standard error of estimate is a statistic that
      measures/describes the spread of errors, or of Y
      scores around $\hat{Y}$, in regression.
    • $s_{Y \cdot X}$ is the standard deviation of errors in
      regression
    • $s_{Y \cdot X} = \sqrt{\sum e^2 / (N-2)} = \sqrt{\sum (Y - \hat{Y})^2 / (N-2)}$
    • $s_{Y \cdot X} = \sqrt{(N-1)/(N-2)} \; s_Y \sqrt{1 - r^2}$
    • As $r^2$ increases, $s_{Y \cdot X}$ decreases. For example, if
      $N = 100$ and $s_Y = 4$:

      r     $s_{Y \cdot X}$
      .2    3.94
      .4    3.68
      .6    3.22
      .8    2.41
      .9    1.75

    $s_{Y \cdot X}$ is the standard deviation of Y around the
    regression line $\hat{Y}$
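
The tabled values can be reproduced from the second formula on this slide. In the sketch below the first column is treated as values of r (not of r squared), since that reading is the one that yields the listed numbers for N = 100 and a Y standard deviation of 4. The function name is hypothetical.

```python
import math

def standard_error_of_estimate(r, s_y, n):
    """Standard error of estimate from r, the SD of Y, and the sample size N."""
    return math.sqrt((n - 1) / (n - 2)) * s_y * math.sqrt(1 - r ** 2)

# N = 100 and s_Y = 4, as in the slide's example.
for r in (0.2, 0.4, 0.6, 0.8, 0.9):
    s_yx = standard_error_of_estimate(r, s_y=4, n=100)
    print(f"r = {r:.1f}  s_yx = {s_yx:.2f}")
```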
    31
    Regression: Partitioning


    • Partitioning total variability
    • Total = Explained + Not Explained
    • This is true for proportion of spread and amount
      of spread.
    • Proportion: $1 = r^2 + (1 - r^2)$
    • Amount: $s_Y^2 = r^2 s_Y^2 + (1 - r^2) s_Y^2$
    • Formulas:

                    Total      Explained      Not explained
      Proportion    1          $r^2$          $1 - r^2$
      Amount        $s_Y^2$    $r^2 s_Y^2$    $(1 - r^2) s_Y^2$

    • Example, with $r = .7$ and $s_Y^2 = 150$:

                    Total      Explained      Not explained
      Proportion    1          .49            .51
      Amount        150        73.5           76.5
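
The partitioning example can be verified with a short sketch that splits the variance of Y into its explained and unexplained parts. The function name `partition_variance` is hypothetical; the inputs are the slide's values.

```python
def partition_variance(r, var_y):
    """Split the variance of Y into explained and unexplained amounts."""
    r_squared = r ** 2
    explained = r_squared * var_y
    unexplained = (1 - r_squared) * var_y
    return r_squared, explained, unexplained

# Slide example: r = .7 and a Y variance of 150.
r_squared, explained, unexplained = partition_variance(0.7, 150)
print(round(r_squared, 2), round(explained, 1), round(unexplained, 1))  # 0.49 73.5 76.5
```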
    32
    Quiz 1    Quiz 2
    9.8       10.4
    8.6       9.9
    10.0      10.4
    7.0       7.8
    9.1       8.5
    8.1       9.4
    7.8       9.4
    6.1       5.4
    4.8       5.0
    7.6       8.0
    7.3       9.5
    3.8       1.5
    7.4       3.5
    5.8       4.2
    8.8       7.4
    8.8       5.5
    6.0       7.9
    6.1       6.4

    Compute the correlation between Quiz 1 and Quiz 2
    scores using SPSS. Run a regression model
    predicting Quiz 2 scores using Quiz 1 scores.
    What is the slope? What is the y-intercept? Based
    on the regression results, estimate the Quiz 2
    score of someone who earned a 9.0 on Quiz 1.
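
The exercise asks for SPSS, but the same quantities can be cross-checked by hand or in code. The sketch below is one way to do that in Python, using the quiz scores listed on this slide and the standard least-squares formulas for the slope and intercept, which are an assumption here because the slides do not state them. It is meant as a check on the SPSS output rather than a replacement for it.

```python
import math

# Quiz scores copied from the slide (18 students).
quiz1 = [9.8, 8.6, 10.0, 7.0, 9.1, 8.1, 7.8, 6.1, 4.8,
         7.6, 7.3, 3.8, 7.4, 5.8, 8.8, 8.8, 6.0, 6.1]
quiz2 = [10.4, 9.9, 10.4, 7.8, 8.5, 9.4, 9.4, 5.4, 5.0,
         8.0, 9.5, 1.5, 3.5, 4.2, 7.4, 5.5, 7.9, 6.4]

n = len(quiz1)
mean_x = sum(quiz1) / n
mean_y = sum(quiz2) / n

# Sums of squares and cross-products of deviations from the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(quiz1, quiz2))
sxx = sum((x - mean_x) ** 2 for x in quiz1)
syy = sum((y - mean_y) ** 2 for y in quiz2)

r = sxy / math.sqrt(sxx * syy)          # correlation between Quiz 1 and Quiz 2
slope = sxy / sxx                       # b: predicted change in Quiz 2 per point on Quiz 1
intercept = mean_y - slope * mean_x     # a: predicted Quiz 2 when Quiz 1 = 0
predicted = slope * 9.0 + intercept     # predicted Quiz 2 for a Quiz 1 score of 9.0

print(f"r = {r:.3f}")
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"predicted Quiz 2 for Quiz 1 = 9.0: {predicted:.2f}")
```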