Agenda - PowerPoint PPT Presentation

About This Presentation
Title:

Agenda

Description:

Agenda: Review Association for Nominal/Ordinal Data (χ²-Based Measures, PRE measures); Introduce Association Measures for I-R data (Regression, Pearson's r, R²)

Slides: 34
Provided by: rwei2
Learn more at: https://www.d.umn.edu

Transcript and Presenter's Notes

Title: Agenda


1
Agenda
  • Review Association for Nominal/Ordinal Data
  • χ²-Based Measures, PRE measures
  • Introduce Association Measures for I-R data
  • Regression, Pearson's r, R²
  • HOMEWORKS
  • Two left, each due one week after posting
  • HW 5 posted later this afternoon
  • HW 6 posted on April 28

2
Why measure of association?
  • What do significance tests tell us?
  • What is the point of calculating measures of
    association?
  • Which do you calculate first (significance tests
    or association measures) and why?

3
χ²-Based Measures
  • How does χ² tap into association?
  • Indicates how different our findings are from what is
    expected under the null
  • Since the null is "not related," a high χ² suggests a
    stronger relationship
  • Why not just use χ²?
  • Phi
  • Cramér's V
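
One way to see why χ² alone is not used as a strength measure is to compute it next to Phi and Cramér's V. The sketch below uses the standard textbook formulas (Phi = sqrt(χ²/N) for 2x2 tables, V = sqrt(χ² / (N × min(rows − 1, columns − 1)))). Both tables are hypothetical counts made up for illustration; they are not from the slides.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

def phi(table):
    """Phi: chi-square rescaled by sample size (intended for 2x2 tables)."""
    n = sum(sum(row) for row in table)
    return math.sqrt(chi_square(table) / n)

def cramers_v(table):
    """Cramer's V: extends Phi to tables larger than 2x2."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi_square(table) / (n * k))

# Hypothetical counts: the second table has the same pattern with 10x the cases
small = [[10, 20], [20, 10]]
large = [[100, 200], [200, 100]]

for name, table in (("small", small), ("large", large)):
    print(name, round(chi_square(table), 2), round(phi(table), 3),
          round(cramers_v(table), 3))
```

In this sketch χ² grows tenfold with the larger sample while Phi and V stay the same, which is the usual argument for rescaling χ² before treating it as a measure of strength.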

4
PRE Measures
  • χ²-based measures have no intuitive meaning
  • PRE measures: proportionate reduction in error
  • Does knowing someone's value on the independent
    variable (e.g., sex) improve our prediction of
    their score/value on the dependent variable
    (whether or not criminal)?
  • Lambda
  • Gamma
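
The PRE logic can be sketched for Lambda as follows. This uses the usual definition, Lambda = (E1 − E2) / E1, where E1 is the number of prediction errors made while ignoring the independent variable and E2 is the number made when using it. The slides name Lambda without giving its formula, so the formula and the counts below are assumptions added for illustration.

```python
def lambda_pre(table):
    """Lambda for a table with DV categories as rows and IV categories as columns."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    # E1: errors when always guessing the most common DV category, ignoring the IV
    e1 = n - max(row_totals)
    # E2: errors when guessing the most common DV category within each IV category
    e2 = sum(sum(col) - max(col) for col in zip(*table))
    return (e1 - e2) / e1

# Hypothetical counts: rows = criminal (yes, no), columns = sex (male, female)
table = [[30, 10],
         [10, 30]]

lam = lambda_pre(table)
print(lam)  # 0.5: knowing the IV cuts prediction errors in half
```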

5
ORDINAL MEASURE OF ASSOCIATION
  • GAMMA
  • For examining STRENGTH & DIRECTION of collapsed
    ordinal variables (<6 categories)
  • Like Lambda, a PRE-based measure
  • Range is -1.0 to 1.0

6
GAMMA
  • Logic: Applying PRE to PAIRS of individuals

Prejudice   Lower Class   Middle Class   Upper Class
Low         Kenny         Tim            Kim
Middle      Joey          Deb            Ross
High        Randy         Eric           Barb

7
GAMMA
  • CONSIDER KENNY-DEB PAIR
  • In the language of Gamma, this is a "same" pair
  • direction of difference on 1 variable is the same
    as direction on the other
  • If you focused on the Kenny-Eric pair, you would
    come to the same conclusion

Prejudice   Lower Class   Middle Class   Upper Class
Low         Kenny         Tim            Kim
Middle      Joey          Deb            Ross
High        Randy         Eric           Barb

8
GAMMA
  • NOW LOOK AT THE TIM-JOEY PAIR
  • In the language of Gamma, this is a "different"
    pair
  • direction of difference on one variable is
    opposite of the difference on the other

Prejudice   Lower Class   Middle Class   Upper Class
Low         Kenny         Tim            Kim
Middle      Joey          Deb            Ross
High        Randy         Eric           Barb

9
GAMMA
  • Logic: Applying PRE to PAIRS of individuals
  • Formula:
    $\gamma = \dfrac{\text{same} - \text{different}}{\text{same} + \text{different}}$

Prejudice   Lower Class   Middle Class   Upper Class
Low         Kenny         Tim            Kim
Middle      Joey          Deb            Ross
High        Randy         Eric           Barb

10
GAMMA
  • If you were to account for all the pairs in this
    table, you would find that there were 9 same &
    9 different pairs
  • Applying the Gamma formula, we would get
    $\gamma = \dfrac{9 - 9}{9 + 9} = \dfrac{0}{18} = 0.0$

Prejudice   Lower Class   Middle Class   Upper Class
Low         Kenny         Tim            Kim
Middle      Joey          Deb            Ross
High        Randy         Eric           Barb

11
GAMMA
  • 3-case example
  • Applying the Gamma formula, we would get
    $\gamma = \dfrac{3 - 0}{3 + 0} = \dfrac{3}{3} = 1.00$

Prejudice   Lower Class   Middle Class   Upper Class
Low         Kenny
Middle                    Deb
High                                     Barb
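
The pair counting behind Gamma can be sketched in a few lines. The function below takes a table of counts (rows and columns both ordered low to high) and tallies "same" and "different" pairs; the two tables mirror the 9-person and 3-person examples above, with one person per occupied cell. The function names are illustrative only.

```python
def count_pairs(table):
    """Return (same, different) pair counts for an ordered table of counts."""
    same = 0
    different = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            # Compare cell (i, j) with every cell in a lower row
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:    # higher on both variables: "same" pair
                        same += table[i][j] * table[k][l]
                    elif l < j:  # higher on one, lower on the other: "different"
                        different += table[i][j] * table[k][l]
    return same, different

def gamma(table):
    same, different = count_pairs(table)
    return (same - different) / (same + different)

# One person in every cell (the 9-person table)
nine_people = [[1, 1, 1],
               [1, 1, 1],
               [1, 1, 1]]

# Kenny, Deb and Barb on the diagonal (the 3-case example)
three_people = [[1, 0, 0],
                [0, 1, 0],
                [0, 0, 1]]

print(count_pairs(nine_people), gamma(nine_people))    # (9, 9) 0.0
print(count_pairs(three_people), gamma(three_people))  # (3, 0) 1.0
```

Pairs tied on either variable (same row or same column) are skipped, which is why only 18 of the 36 possible pairs in the 9-person table enter the formula.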
12
Gamma Example 1
  • Examining the relationship between:
  • FEHELP (Wife should help husband's career
    first)
  • FEFAM (Better for man to work, women to tend
    home)
  • Both variables are ordinal, coded 1 (strongly
    agree) to 4 (strongly disagree)

13
Gamma Example 1
  • Based on the info in this table, does there seem
    to be a relationship between these factors?
  • Does there seem to be a positive or negative
    relationship between them?
  • Does this appear to be a strong or weak
    relationship?

14
GAMMA
  • Do we reject the null hypothesis of independence
    between these 2 variables?
  • Yes, the Pearson chi-square p value (.000) is
    < alpha (.05)
  • It's worthwhile to look at gamma.
  • Interpretation
  • There is a strong positive relationship between
    these factors.
  • Knowing someone's view on a wife's first
    priority improves our ability to predict whether
    they agree that women should tend home by 75.5%.

15
ASSOCIATION BETWEEN INTERVAL-RATIO VARIABLES

16
Scattergrams
  • Allow quick identification of important features
    of relationship between interval-ratio variables
  • Two dimensions
  • Scores of the independent (X) variable
    (horizontal axis)
  • Scores of the dependent (Y) variable (vertical
    axis)

17
3 Purposes of Scattergrams
  • 1. To give a rough idea about the existence,
    strength & direction of a relationship
  • The direction of the relationship can be detected
    by the angle of the regression line
  • 2. To give a rough idea about whether a
    relationship between 2 variables is linear
    (defined with a straight line)
  • 3. To predict scores of cases on one variable (Y)
    from the score on the other (X)

18
  • IV and DV?
  • What is the direction of this relationship?

19
  • IV and DV?
  • What is the direction of this relationship?

20
The Regression line
  • Properties
  • The sum of positive and negative vertical
    distances from it is zero
  • The standard deviation of the points from the
    line is at a minimum
  • The line passes through the point (mean x, mean
    y)
  • Bivariate Regression Applet
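
These properties can be checked numerically. The sketch below fits a least-squares line, written with intercept `a` and slope `b`, to a small made-up data set (the five points are hypothetical, not from the slides) and verifies that the vertical distances sum to zero and that the line passes through (mean x, mean y).

```python
def fit_line(xs, ys):
    """Least-squares intercept (a) and slope (b) for predicting y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical scores for five cases
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)

# Vertical distance of each point from the line (positive = above the line)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

print(round(a, 2), round(b, 2))            # 2.2 0.6
print(abs(sum(residuals)) < 1e-9)          # True: distances cancel out
print(abs(a + b * mean_x - mean_y) < 1e-9)  # True: passes through the means
```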

21
Regression Line Formula
  • $Y = a + bX$
  • Y = score on the dependent variable
  • X = the score on the independent variable
  • a = the Y intercept
  • point where the regression line crosses the Y
    axis
  • b = the slope of the regression line
  • SLOPE = the amount of change produced in Y by a
    unit change in X; or,
  • a measure of the effect of the X variable on Y

22
Regression Line Formula
  • $Y = a + bX$
  • y-intercept (a) = 102
  • slope (b) = .9
  • $Y = 102 + (.9)X$
  • This information can be used to predict weight
    from height.
  • Example: What is the predicted weight of a male
    who is 70" tall (5'10")?
  • $Y = 102 + (.9)(70) = 102 + 63 = 165$ pounds
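
The same prediction as a minimal sketch in code; the function name `predict` is just an illustrative choice.

```python
def predict(a, b, x):
    """Predicted Y from the regression line Y = a + bX."""
    return a + b * x

# Intercept and slope from the slide; height in inches
weight = predict(a=102, b=0.9, x=70)
print(round(weight, 1))  # 165.0 pounds
```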

23
Example 2: Examining the link between hours of
daily TV watching (X) & number of cans of soda
consumed per day (Y)

Case   Hours TV/Day (X)   Cans Soda Per Day (Y)
1      1                  2
2      3                  6
3      2                  3
4      2                  4
5      1                  1
6      4                  6
7      6                  7
8      4                  2
9      4                  5
10     2                  0

24
Example 2
  • Example 2: Examining the link between hours of
    daily TV watching (X) & number of cans of soda
    consumed per day (Y)
  • The regression line for this problem:
  • $Y = 0.7 + .99X$
  • If a person watches 3 hours of TV per day, how
    many cans of soda would he be expected to consume
    according to the regression equation?
  • $Y = .7 + .99(3) = 3.67$

25
The Slope (b): A Strength & A Weakness
  • We know that b indicates the change in Y for a
    unit change in X, but b is not really a good
    measure of strength
  • Weakness
  • It is unbounded (can be >1 or <-1) making it hard
    to interpret
  • The size of b is influenced by the scale that
    each variable is measured on

26
Pearson's r Correlation Coefficient
  • By contrast, Pearson's r is bounded
  • a value of 0.0 indicates no linear relationship
    and a value of +/-1.00 indicates a perfect linear
    relationship

27
Pearson's r
  • $Y = 0.7 + .99X$
  • $s_x = 1.51$
  • $s_y = 2.24$
  • Converting the slope to a Pearson's r correlation
    coefficient
  • Formula: $r = b(s_x/s_y)$
  • $r = .99(1.51/2.24)$
  • $r = .67$
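
A short sketch of the conversion, using the slope and standard deviations from the slide. The second call is an added illustration rather than part of the slides: it re-expresses TV time in minutes instead of hours, which changes the slope and the standard deviation of X but leaves r untouched.

```python
def pearson_r_from_slope(b, s_x, s_y):
    """Convert a regression slope to Pearson's r: r = b * (s_x / s_y)."""
    return b * (s_x / s_y)

# Values from the slide (X measured in hours)
r_hours = pearson_r_from_slope(b=0.99, s_x=1.51, s_y=2.24)

# Same data with X in minutes: slope shrinks by 60, s_x grows by 60
r_minutes = pearson_r_from_slope(b=0.99 / 60, s_x=1.51 * 60, s_y=2.24)

print(round(r_hours, 2))    # 0.67
print(round(r_minutes, 2))  # 0.67
```

The slope depends on the measurement scale, while r does not, which is why r is the easier of the two to read as a measure of strength.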

28
The Coefficient of Determination
  • The interpretation of Pearson's r (like Cramér's
    V) is not straightforward
  • What is a strong or weak correlation?
  • Subjective
  • The coefficient of determination (r²) is a more
    direct way to interpret the association between 2
    variables
  • r² represents the proportion of variation in Y
    explained by X
  • You can interpret r² with PRE logic
  • predict Y while ignoring info. supplied by X
  • then account for X when predicting Y

29
Coefficient of Determination Example
  • Without info about X (hours of daily TV
    watching), the best predictor we have is the mean
    of cans of soda consumed (mean of Y)
  • The green line (the slope) is what we would
    predict WITH info about X

30
Coefficient of Determination
  • Conceptually, the formula for r² is
    $r^2 = \dfrac{\text{Explained variation}}{\text{Total variation}}$
  • The proportion of the total variation in Y that
    is attributable to or explained by X.
  • The variation not explained by X is called the
    unexplained variation
  • Usually attributed to measurement error, random
    chance, or some combination of other variables
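
The split into explained and unexplained variation can be sketched directly. The five data points below are hypothetical (not from the slides), and the variable names are illustrative. Total variation is measured around the mean of Y (the prediction without X), and unexplained variation is measured around the regression line (the prediction with X).

```python
def variation_breakdown(xs, ys):
    """Return (total, explained, unexplained) variation in y for a least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    predicted = [a + b * x for x in xs]
    # Errors when predicting every case with the mean of Y (ignoring X)
    total = sum((y - mean_y) ** 2 for y in ys)
    # Errors that remain when predicting with the regression line (using X)
    unexplained = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    # Portion of the total accounted for by the line
    explained = sum((p - mean_y) ** 2 for p in predicted)
    return total, explained, unexplained

# Hypothetical scores for five cases
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

total, explained, unexplained = variation_breakdown(xs, ys)
r_squared = explained / total

print(round(total, 2), round(explained, 2), round(unexplained, 2))  # 6.0 3.6 2.4
print(round(r_squared, 2))  # 0.6
```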

31
Coefficient of Determination
  • Interpreting the meaning of the coefficient of
    determination in the example
  • Squaring Pearson's r (.67) gives us an r² of .45
  • Interpretation
  • The number of hours of daily TV watching (X) explains
    45% of the total variation in soda consumed (Y)

32
Another Example: Relationship between Mobility
Rate (x) & Divorce Rate (y)
  • The formula for this regression line is
  • $Y = -2.5 + (.17)X$
  • 1) What is this slope telling you?
  • 2) Using this formula, if the mobility rate for a
    given state was 45, what would you predict the
    divorce rate to be?
  • 3) The standard deviation (s) for x = 6.57 & the s
    for y = 1.29. Use this info to calculate Pearson's
    r. How would you interpret this correlation?
  • 4) Calculate & interpret the coefficient of
    determination (r²)

33
Another Example: Relationship between Mobility
Rate (x) & Divorce Rate (y)
  • The formula for this regression line is
  • $Y = -2.5 + (.17)X$
  • 1) What is this slope telling you?
  • For every one unit increase in x (mobility rate),
    divorce rate (y) goes up .17
  • 2) Using this formula, if the mobility rate for a
    given state was 45, what would you predict the
    divorce rate to be?
  • $Y = -2.5 + (.17)(45) = 5.15$
  • 3) The standard deviation (s) for x = 6.57 & the s
    for y = 1.29. Use this info to calculate Pearson's
    r. How would you interpret this correlation?
  • $r = .17(6.57/1.29) = .17(5.093) = .866$
  • There is a strong positive association between
    mobility rate & divorce rate.
  • 4) Calculate & interpret the coefficient of
    determination (r²)
  • $r^2 = (.866)^2 = .75$
  • A state's mobility rate explains 75% of the
    variation in its divorce rate.