Correlations - PowerPoint PPT Presentation

Slides: 31
Provided by: NAlva1

1
Correlations
  • Bernardo Aguilar-Gonzalez

2
Describing Relationships
  • Positive relationship: high values are paired
    with high values, low with low.
  • Negative relationship: high values are paired
    with low values, low with high.
  • No relationship: no regularity appears between
    pairs of scores in two distributions.

3
Scatterplots
  • One variable is measured on the x-axis, the
    other on the y-axis.
  • Positive relationship: a cluster of dots sloping
    upward from the lower left to the upper right.
  • Negative relationship: a cluster of dots sloping
    down from upper left to lower right.
  • No relationship: no apparent slope.

4
Strength of Relationship
  • The more closely the dots approximate a straight
    line, the stronger the relationship.
  • A perfect relationship forms a straight line.
  • Dots forming a line reflect a linear
    relationship.
  • Dots forming a curved or bent line reflect a
    curvilinear relationship.

5
(No Transcript)
6
Correlation Coefficient
  • Pearson's r: a measure of how well a straight
    line describes the cluster of dots in a plot.
  • Ranges from -1 to 1.
  • The sign indicates a positive or negative
    relationship.
  • The absolute value of r indicates the strength
    of the relationship.
  • Pearson's r is independent of units of measure.

1900, I suggested this when studying natural
selection, remember?
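
The points on this slide can be checked with a short calculation. The following is a minimal sketch in Python, not part of the original slides: it computes Pearson's r from deviation scores and shows that converting the units leaves r unchanged. The height and weight values are hypothetical and only serve as an illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r: sum of cross-products of deviations divided by the
    square root of the product of the two sums of squared deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sp / sqrt(ss_x * ss_y)

# Hypothetical data (assumption): heights in inches, weights in pounds.
height_in = [60, 62, 65, 68, 70, 72]
weight_lb = [110, 120, 140, 155, 165, 190]
r_imperial = pearson_r(height_in, weight_lb)

# Same people, metric units. Rescaling changes the numbers but not r.
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.45359237 for w in weight_lb]
r_metric = pearson_r(height_cm, weight_kg)

print(round(r_imperial, 4), round(r_metric, 4))
```

The two printed values agree because r is built from deviations relative to each variable's own spread, so any change of scale cancels out.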
7
Interpreting Pearson's r
  • The value of r needed to assert a strong
    relationship depends on:
  • The size of n.
  • What is being measured.
  • Pearson's r is NOT the percent or proportion of a
    perfect relationship.
  • Correlation is not causation.
  • Experimentation is used to confirm a suspected
    causal relationship.
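
To illustrate why the size of n matters, here is a rough Python sketch (an addition, not from the slides). It uses the standard t statistic for testing whether a population correlation is zero, t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The function name t_for_r and the sample sizes are hypothetical choices for the illustration.

```python
from math import sqrt

def t_for_r(r, n):
    """t statistic for H0: population correlation = 0 (df = n - 2)."""
    if n < 3:
        raise ValueError("need at least 3 pairs of scores")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * sqrt((n - 2) / (1 - r ** 2))

# The same r = 0.5 observed in samples of different (hypothetical) sizes.
for n in (6, 27, 102):
    print(n, round(t_for_r(0.5, n), 2))
```

The same coefficient yields a larger t as n grows, so a value of r that is unconvincing in a handful of cases can be clear evidence of a relationship in a large sample.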

8
Other Correlation Coefficients
  • Spearman's rho (ρ): based on ranks rather than
    values.
  • Used with ordinal data (qualitative data that can
    be ordered least to most).
  • Point biserial correlation: correlation
    between quantitative data and two coded
    categories.
  • Cramér's phi: correlation between two ordered
    qualitative categories.

In 1904! I'm Spearman, remember?
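
The idea that Spearman's rho is "based on ranks rather than values" can be sketched as follows. This Python example is an illustrative addition under stated assumptions: the helper names (average_ranks, spearman_rho) and the data are hypothetical, and tied scores are given their mean rank, which is the usual convention.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson's r computed from deviation scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the ranks of the scores."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y always rises with x, but along a curve (y = x cubed).
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]
print(round(pearson_r(x, y), 3), round(spearman_rho(x, y), 3))
```

Because the ranks of x and y line up exactly, rho reaches 1 even though the raw values do not fall on a straight line and Pearson's r stays below 1.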
9
Chapter 14 of the text
  • Do the problem on the correlation between the
    physical and intellectual effects of exercising
    in Ch. 14 of the book

10
Procedure and Output
11
(No Transcript)
12
Example 1
  • Do Exercise 1 in the handout
  • Hypothesis: membership growth in large-city
    churches is positively correlated with distance
    from the central business district.

13
Results
14
Example 2
  • Do Exercise 18 in the handout
  • Hypothesis: there is a positive correlation
    between the rankings of counties by their schools
    produced by two different consulting agencies.

15
Results
16
SPSS can also produce a scatterplot
17
(No Transcript)
18
The GSS
  • The GSS (General Social Survey) is an almost
    annual (see Note 1), "omnibus," personal
    interview survey of U.S. households conducted by
    the National Opinion Research Center (NORC) with
    James A. Davis, Tom W. Smith, and Peter V.
    Marsden as principal investigators (PIs). The
    first survey took place in 1972 and since then
    more than 38,000 respondents have answered over
    3,260 different questions. The special features
    of the GSS follow from its unique origin as the
    first, perhaps only, social science data set
    designed to be analyzed by "users," rather than
    the PIs and project staff.
  • The mission of the GSS is to make timely,
    high-quality, scientifically relevant data
    available to the social science research
    community.
  • Key features of the GSS are its broad coverage,
    its use of replication, its cross-national
    perspective, and its attention to data quality.

19
Example 3
  • How does education influence the types of
    occupations that people enter?  One way to think
    about occupations is in terms of occupational
    prestige. Load the data set gss00a.sav. Your
    data set includes a variable, PRESTG80, in which
    a prestige score was assigned to respondents'
    occupations, where higher numbers indicate
    greater prestige.  (To get more information about
    how the occupational prestige scale was
    constructed, you can go to
    http://www.csub.edu/ssric-trd/SPSS/xtras.html)
    Let's hypothesize that as education increases,
    the level of prestige of one's occupation also
    increases.  To test this hypothesis, click on
    "Analyze," "Correlate," and "Bivariate."  The
    dialog box shown will appear on your screen.
    Click on EDUC, and then click the arrow to move
    it into the box.  Do the same with PRESTG80.

20
Variable Code
21
  • The most widely used bivariate test is the
    Pearson correlation.  It is intended to be used
    when both variables are measured at either the
    interval or ratio level, and each variable is
    normally distributed.  However, sometimes we do
    violate these assumptions. If you plot a histogram
    of both EDUC (see chapter 4) and PRESTG80, you
    will notice that neither is actually normally
    distributed.  Furthermore, if you noted that
    PRESTG80 is really an ordinal measure, not an
    interval one, you would be correct. 

22
  • Nevertheless, most analysts would use the Pearson
    correlation because the variables are close to
    being normally distributed, the ordinal variable
    has many ranks, and because the Pearson
    correlation is the one they are used to.  SPSS
    includes another correlation test, Spearman's
    rho, that is designed to analyze variables that
    are not normally distributed, or are ranked, as
    is PRESTG80.  We will conduct both tests to see
    if our hypothesis is supported, and also to see
    how much the results differ depending on the test
    used; in other words, whether those who use the
    Pearson correlation on these types of variables
    are seriously off base.

23
In the dialog box, the box next to Pearson is
already checked, as this is the default.  Click
in the box next to Spearman.  Your dialog box
should now look like the following figure.  Click
OK to run the tests.
  • Your output screen will show two tables: one for
    the Pearson correlation, and one for
    Spearman's rho.  The results of the Pearson
    correlation, presented in what is called a
    correlation matrix, should look like the
    following one.

24
(No Transcript)
25
Notice that the Pearson coefficient for the
relationship between education and occupational
prestige is .520, and it is positive.  This tells
us that, just as we predicted, as education
increases, occupational prestige increases.  But
should we consider the relationship strong?  At
.520, the coefficient is only about half as large
as is possible.  It should not surprise us,
however, that the relationship is not perfect
(a coefficient of 1).  Education appears to be an
important predictor of occupational prestige, but
no doubt you can think of other reasons why
people might enter a particular occupation. For
example, someone with a college degree may decide
that they really wanted to be a cheese-maker,
which has an occupational prestige score of only
29, while a high-school dropout may one day
become an owner of a bowling alley, which has a
prestige score of 44.  Given the variety of
factors that may influence one's occupational
choice, a coefficient of .520 suggests that the
relationship between education and occupational
prestige is actually quite strong.
The correlation matrix also gives a probability
value (labeled as Sig. (2-tailed)): the probability
of finding a correlation at least this strong in a
sample if there were no relationship between
education and occupational prestige in the total
population from which the sample was drawn.  The
probability value is shown as .000 (remember that
the value is rounded to three decimal places),
which is well below the conventional threshold of
p < .05.  Thus, our hypothesis is supported.  There
is a relationship (the coefficient is not 0), it is
in the predicted direction (positive), and we can
generalize the results to the population (p < .05).
26
Recall that we had some concerns about using the
Pearson coefficient, given that PRESTG80 is
measured as an ordinal variable.  The following
figure shows the results using Spearman's rho.
Notice that the coefficient, .523, is nearly
identical to the coefficient obtained using the
Pearson correlation.  What do you conclude?
27
Example 4
  • The sizes of breeding pairs of penguins were
    measured to see if there was a correlation
    between the sizes of the two sexes.
  • Calculate both the parametric and the
    nonparametric correlation between the sizes of
    the two sexes.

28
Data Procedure
29
Parametric Result
30
Non Parametric
What happened?