Agenda - PowerPoint PPT Presentation

About This Presentation

Title:

Agenda

Description:

Agenda: Review association for nominal/ordinal data (χ²-based measures, PRE measures); introduce association measures for I-R data (regression, Pearson's r, r²)

Slides: 34

Provided by: rwei2

Learn more at: https://www.d.umn.edu

Transcript and Presenter's Notes

1
Agenda

Review Association for Nominal/Ordinal Data
χ²-based measures, PRE measures
Introduce Association Measures for I-R (interval-ratio) data
Regression, Pearson's r, r²
HOMEWORKS
Two left, each due one week after posting
HW 5 posted later this afternoon
HW 6 posted on April 28

2
Why measures of association?

What do significance tests tell us?
What is the point of calculating measures of
association?
Which do you calculate first (significance tests
or association measures) and why?

3
χ²-Based Measures

How does χ² tap into association?
It indicates how different our findings are from
what is expected under the null
Since the null = not related, a high χ² suggests
a stronger relationship
Why not just use χ²?
Phi
Cramér's V
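
One way to see why χ² is not used on its own: it grows with sample size, while Phi and Cramér's V rescale it. The sketch below assumes the standard textbook formulas, Phi = sqrt(χ²/N) and V = sqrt(χ²/(N × min(rows − 1, columns − 1))), which are not spelled out on the slide. The 2x2 counts are hypothetical.

```python
import math

def chi_square(table):
    """Return (chi-square, N) for a table given as a list of rows of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected under the null
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n

def phi(table):
    chi2, n = chi_square(table)
    return math.sqrt(chi2 / n)

def cramers_v(table):
    chi2, n = chi_square(table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))

small = [[30, 10], [10, 30]]      # hypothetical counts
doubled = [[60, 20], [20, 60]]    # same pattern, twice the sample size

for name, table in (("small", small), ("doubled", doubled)):
    chi2, n = chi_square(table)
    print(name, "chi2 =", chi2, "phi =", phi(table), "V =", cramers_v(table))
```

In this sketch χ² doubles when the sample doubles, but Phi and V stay the same, because the pattern in the table has not changed.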

4
PRE Measures

χ²-based measures have no intuitive meaning
PRE measures = proportionate reduction in error
Does knowing someone's value on the independent
variable (e.g., sex) improve our prediction of
their score/value on the dependent variable
(e.g., whether or not criminal)?
Lambda
Gamma
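
The PRE logic can be sketched for Lambda as follows. This assumes the usual textbook definition, Lambda = (E1 − E2) / E1, where E1 is the number of prediction errors made while ignoring the independent variable and E2 is the number made when using it. The formula and the sex-by-criminal counts below are illustrative assumptions, not taken from the slides.

```python
def lambda_pre(table):
    """Lambda for a table whose rows are DV categories and columns are IV categories."""
    n = sum(sum(row) for row in table)
    # E1: always guess the modal DV category, ignoring the IV
    e1 = n - max(sum(row) for row in table)
    # E2: within each IV category, guess that category's modal DV value
    e2 = sum(sum(col) - max(col) for col in zip(*table))
    return (e1 - e2) / e1

# Hypothetical counts: rows = criminal (yes, no), columns = sex (male, female)
hypothetical = [
    [30, 10],   # criminal: yes
    [20, 40],   # criminal: no
]
print(lambda_pre(hypothetical))
```

Here E1 = 40 and E2 = 30, so Lambda = 0.25: knowing sex would reduce prediction errors by 25% in this made-up table.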

5
ORDINAL MEASURE OF ASSOCIATION

GAMMA
For examining STRENGTH & DIRECTION of collapsed
ordinal variables (<6 categories)
Like Lambda, a PRE-based measure
Range is -1.0 to 1.0

6
GAMMA

Logic: Applying PRE to PAIRS of individuals

Prejudice   Lower Class   Middle Class   Upper Class
Low         Kenny         Tim            Kim
Middle      Joey          Deb            Ross
High        Randy         Eric           Barb
7
GAMMA

CONSIDER KENNY-DEB PAIR
In the language of Gamma, this is a "same" pair:
the direction of difference on one variable is the
same as the direction on the other
If you focused on the Kenny-Eric pair, you would
come to the same conclusion

Prejudice   Lower Class   Middle Class   Upper Class
Low         Kenny         Tim            Kim
Middle      Joey          Deb            Ross
High        Randy         Eric           Barb
8
GAMMA

NOW LOOK AT THE TIM-JOEY PAIR
In the language of Gamma, this is a "different"
pair:
the direction of difference on one variable is
the opposite of the difference on the other

Prejudice   Lower Class   Middle Class   Upper Class
Low         Kenny         Tim            Kim
Middle      Joey          Deb            Ross
High        Randy         Eric           Barb
9
GAMMA

Logic: Applying PRE to PAIRS of individuals
Formula:
$\gamma = \dfrac{\text{same} - \text{different}}{\text{same} + \text{different}}$

Prejudice   Lower Class   Middle Class   Upper Class
Low         Kenny         Tim            Kim
Middle      Joey          Deb            Ross
High        Randy         Eric           Barb
10
GAMMA

If you were to account for all the pairs in this
table, you would find that there were 9 "same" &
9 "different" pairs
Applying the Gamma formula, we would get:
$\gamma = \dfrac{9 - 9}{9 + 9} = \dfrac{0}{18} = 0.0$

Prejudice   Lower Class   Middle Class   Upper Class
Low         Kenny         Tim            Kim
Middle      Joey          Deb            Ross
High        Randy         Eric           Barb
11
GAMMA

3-case example
Applying the Gamma formula, we would get:
$\gamma = \dfrac{3 - 0}{3 + 0} = \dfrac{3}{3} = 1.00$

Prejudice   Lower Class   Middle Class   Upper Class
Low         Kenny
Middle                    Deb
High                                     Barb
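
The pair counts in the nine-case and three-case tables can be checked with a short sketch. It codes each person as a (class rank, prejudice rank) tuple, an encoding assumed here for illustration, and skips pairs tied on either variable, which is how the 9 + 9 = 18 usable pairs arise from the 36 possible pairs.

```python
from itertools import combinations

def gamma(cases):
    """cases: list of (x_rank, y_rank) tuples, one per individual."""
    same = different = 0
    for (x1, y1), (x2, y2) in combinations(cases, 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            same += 1          # differences run in the same direction
        elif product < 0:
            different += 1     # differences run in opposite directions
        # product == 0: pair is tied on at least one variable, so it is ignored
    return same, different, (same - different) / (same + different)

# Class rank: 1 = lower, 2 = middle, 3 = upper; prejudice rank: 1 = low, 2 = middle, 3 = high
nine_cases = [(cls, prej) for prej in (1, 2, 3) for cls in (1, 2, 3)]
three_cases = [(1, 1), (2, 2), (3, 3)]   # Kenny, Deb, Barb

print(gamma(nine_cases))    # (same, different, gamma) for the full table
print(gamma(three_cases))   # (same, different, gamma) for the 3-case example
```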
12
Gamma Example 1

Examining the relationship between
FEHELP (Wife should help husband's career
first)
FEFAM (Better for man to work, women to tend
home)
Both variables are ordinal, coded 1 (strongly
agree) to 4 (strongly disagree)

13
Gamma Example 1

Based on the info in this table, does there seem
to be a relationship between these factors?
Does there seem to be a positive or negative
relationship between them?
Does this appear to be a strong or weak
relationship?

14
GAMMA

Do we reject the null hypothesis of independence
between these 2 variables?
Yes, the Pearson chi-square p value (.000) is
< alpha (.05)
It's worthwhile to look at gamma.
Interpretation:
There is a strong positive relationship between
these factors.
Knowing someone's view on a wife's first
priority improves our ability to predict whether
they agree that women should tend home by 75.5%.

15
ASSOCIATION BETWEEN INTERVAL-RATIO VARIABLES

16
Scattergrams

Allow quick identification of important features
of relationship between interval-ratio variables
Two dimensions
Scores of the independent (X) variable
(horizontal axis)
Scores of the dependent (Y) variable (vertical
axis)

17
3 Purposes of Scattergrams

1. To give a rough idea about the existence,
strength & direction of a relationship
The direction of the relationship can be detected
by the angle of the regression line
2. To give a rough idea about whether a
relationship between 2 variables is linear
(defined with a straight line)
3. To predict scores of cases on one variable (Y)
from the score on the other (X)

IV and DV?
What is the direction
of this relationship?

IV and DV?
What is the direction of this relationship?

20
The Regression line

Properties
The sum of positive and negative vertical
distances from it is zero
The standard deviation of the points from the
line is at a minimum
The line passes through the point (mean x, mean
y)
Bivariate Regression Applet
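
Two of these properties can be checked numerically. The sketch below fits a line with the standard least-squares formulas (b = sum of cross-products / sum of squared X deviations, a = mean y − b × mean x), which the slides do not state explicitly, on a small hypothetical data set.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

xs = [1, 2, 3, 4, 5]   # hypothetical X scores
ys = [2, 4, 5, 4, 5]   # hypothetical Y scores
a, b = fit_line(xs, ys)

# Vertical distances of each point from the line (positive above, negative below)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print("a =", a, "b =", b)
print("sum of vertical distances:", sum(residuals))
print("line at mean x:", a + b * mean(xs), "mean y:", mean(ys))
```

Up to floating-point rounding, the vertical distances sum to zero and the line's value at mean x equals mean y.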

21
Regression Line Formula

$Y = a + bX$
Y = score on the dependent variable
X = the score on the independent variable
a = the Y intercept:
the point where the regression line crosses the Y
axis
b = the slope of the regression line
SLOPE = the amount of change produced in Y by a
unit change in X; or,
a measure of the effect of the X variable on the Y

22
Regression Line Formula

$Y = a + bX$
y-intercept (a) = 102
slope (b) = 0.9
$Y = 102 + (0.9)X$
This information can be used to predict weight
from height.
Example: What is the predicted weight of a male
who is 70" tall (5'10")?
$Y = 102 + (0.9)(70) = 102 + 63 = 165$ pounds

23
Example 2: Examining the link between hours of
daily TV watching (X) & # of cans of soda
consumed per day (Y)

Case   Hours TV/Day (X)   Cans Soda Per Day (Y)
1      1                  2
2      3                  6
3      2                  3
4      2                  4
5      1                  1
6      4                  6
7      6                  7
8      4                  2
9      4                  5
10     2                  0
24
Example 2

Example 2: Examining the link between hours of
daily TV watching (X) & # of cans of soda
consumed per day (Y)
The regression line for this problem:
$Y = 0.7 + 0.99X$
If a person watches 3 hours of TV per day, how
many cans of soda would he be expected to consume
according to the regression equation?
$Y = 0.7 + 0.99(3) = 3.67$
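
Both predictions so far (weight from height, and cans of soda from TV hours) are the same one-line calculation. A minimal sketch, with a hypothetical helper name `predict` and the intercepts and slopes taken from the slides:

```python
def predict(a, b, x):
    """Predicted Y from the regression line Y = a + bX."""
    return a + b * x

weight = predict(a=102, b=0.9, x=70)    # height/weight example
soda = predict(a=0.7, b=0.99, x=3)      # TV/soda example

print(round(weight, 2), "pounds")
print(round(soda, 2), "cans per day")
```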

25
The Slope (b): A Strength & A Weakness

We know that b indicates the change in Y for a
unit change in X, but b is not really a good
measure of strength
Weakness:
It is unbounded (can be >1 or <-1), making it hard
to interpret
The size of b is influenced by the scale that
each variable is measured on

26
Pearson's r Correlation Coefficient

By contrast, Pearson's r is bounded:
a value of 0.0 indicates no linear relationship
and a value of ±1.00 indicates a perfect linear
relationship
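
The contrast between b and r can be illustrated by measuring the same X in two different units. The data below are hypothetical, the slope uses the standard least-squares formula, and r is obtained by rescaling the slope with the two standard deviations (population form assumed).

```python
from statistics import mean, pstdev

def slope(xs, ys):
    """Least-squares slope b for predicting ys from xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def pearson_r(xs, ys):
    """Rescale the slope by the standard deviations: r = b * (s_x / s_y)."""
    return slope(xs, ys) * pstdev(xs) / pstdev(ys)

hours = [1, 2, 3, 4, 5]              # hypothetical X, measured in hours
minutes = [60 * h for h in hours]    # the same X, measured in minutes
y = [2, 4, 5, 4, 5]                  # hypothetical Y

b_hours, b_minutes = slope(hours, y), slope(minutes, y)
r_hours, r_minutes = pearson_r(hours, y), pearson_r(minutes, y)

print("b:", b_hours, "vs", b_minutes)   # changes with the unit of X
print("r:", r_hours, "vs", r_minutes)   # does not change
```

The slope shrinks by a factor of 60 when X is expressed in minutes, while r is the same in both cases and stays between -1 and 1.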

27
Pearson's r

$Y = 0.7 + 0.99X$
$s_x = 1.51$
$s_y = 2.24$
Converting the slope to a Pearson's r correlation
coefficient
Formula: $r = b\,(s_x / s_y)$
$r = 0.99\,(1.51 / 2.24)$
$r = 0.67$
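
The two standard deviations can be recomputed from the Example 2 table. The sketch below assumes they are population standard deviations (N in the denominator), and takes the slope of .99 from the slide as given rather than refitting the line.

```python
from statistics import pstdev

# Example 2 data: hours of TV per day (X) and cans of soda per day (Y)
tv_hours = [1, 3, 2, 2, 1, 4, 6, 4, 4, 2]
soda_cans = [2, 6, 3, 4, 1, 6, 7, 2, 5, 0]

s_x = pstdev(tv_hours)    # population standard deviation of X
s_y = pstdev(soda_cans)   # population standard deviation of Y
b = 0.99                  # slope as stated on the slide

r = b * (s_x / s_y)       # convert the slope to Pearson's r
print(round(s_x, 2), round(s_y, 2), round(r, 2))
```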

28
The Coefficient of Determination

The interpretation of Pearson's r (like Cramér's
V) is not straightforward
What is a strong or weak correlation?
Subjective
The coefficient of determination (r²) is a more
direct way to interpret the association between 2
variables
r² represents the proportion of variation in Y
explained by X
You can interpret r² with PRE logic:
predict Y while ignoring info. supplied by X,
then account for X when predicting Y

29
Coefficient of Determination Example

Without info about X (hours of daily TV
watching), the best predictor we have is the mean
of cans of soda consumed (mean of Y)
The green line (the regression line) is what we
would predict WITH info about X

30
Coefficient of Determination

Conceptually, the formula for r² is:
$r^2 = \dfrac{\text{Explained variation}}{\text{Total variation}}$
The proportion of the total variation in Y that
is attributable to or explained by X.
The variation in Y not explained by X is called
the unexplained variation
Usually attributed to measurement error, random
chance, or some combination of other variables
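
The explained/total ratio can be worked through on a small hypothetical data set. The sketch below assumes "variation" means a sum of squared prediction errors: first predicting every Y with the mean of Y (ignoring X), then predicting Y from a least-squares line (using X).

```python
from statistics import mean

xs = [1, 2, 3, 4, 5]   # hypothetical X scores
ys = [2, 4, 5, 4, 5]   # hypothetical Y scores

x_bar, y_bar = mean(xs), mean(ys)
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

# Squared errors when guessing the mean of Y for everyone (ignoring X)
total_variation = sum((y - y_bar) ** 2 for y in ys)
# Squared errors left over when predicting Y from the regression line (using X)
unexplained = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
explained = total_variation - unexplained

r_squared = explained / total_variation
print("total:", total_variation)
print("unexplained:", round(unexplained, 2))
print("r squared:", round(r_squared, 2))
```

In this made-up example, using X removes 60% of the squared prediction error, which is the PRE reading of r² = .60.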

31
Coefficient of Determination

Interpreting the meaning of the coefficient of
determination in the example
Squaring Pearson's r (.67) gives us an r² of .45
Interpretation:
The # of hours of daily TV watching (X) explains
45% of the total variation in soda consumed (Y)

32
Another Example: Relationship between Mobility
Rate (x) & Divorce Rate (y)

The formula for this regression line is:
$Y = -2.5 + (0.17)X$
1) What is this slope telling you?
2) Using this formula, if the mobility rate for a
given state was 45, what would you predict the
divorce rate to be?
3) The standard deviation (s) for x = 6.57 & the s
for y = 1.29. Use this info to calculate Pearson's
r. How would you interpret this correlation?
4) Calculate & interpret the coefficient of
determination (r²)

33
Another Example: Relationship between Mobility
Rate (x) & Divorce Rate (y)

The formula for this regression line is:
$Y = -2.5 + (0.17)X$
1) What is this slope telling you?
For every one-unit increase in x (mobility rate),
divorce rate (y) goes up by 0.17
2) Using this formula, if the mobility rate for a
given state was 45, what would you predict the
divorce rate to be?
$Y = -2.5 + (0.17)(45) = 5.15$
3) The standard deviation (s) for x = 6.57 & the s
for y = 1.29. Use this info to calculate Pearson's
r. How would you interpret this correlation?
$r = 0.17\,(6.57 / 1.29) = 0.17\,(5.093) = 0.866$
There is a strong positive association between
mobility rate & divorce rate.
4) Calculate & interpret the coefficient of
determination (r²)
$r^2 = (0.866)^2 = 0.75$
A state's mobility rate explains 75% of the
variation in its divorce rate.
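
The three calculations in this example chain together, as the sketch below shows. All numbers come from the slide; the variable names are only illustrative.

```python
a, b = -2.5, 0.17          # intercept and slope of the regression line
s_x, s_y = 6.57, 1.29      # standard deviations of mobility rate and divorce rate

predicted_divorce = a + b * 45   # question 2: prediction for a mobility rate of 45
r = b * (s_x / s_y)              # question 3: convert the slope to Pearson's r
r_squared = r ** 2               # question 4: coefficient of determination

print(round(predicted_divorce, 2))
print(round(r, 3))
print(round(r_squared, 2))
```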