AS 737 Categorical Data Analysis - PowerPoint PPT Presentation

Slides: 29
Provided by: asNi7

Transcript and Presenter's Notes

1
AS 737 Categorical Data Analysis
  • Week 2

2
The Data (var00002)
3
Binomial Test
4
The Result
Using SPSS. But how is it calculated?
5
How To Calculate the P-value
Binomial table made in Excel with n = 20 and p = 0.70.
Why? The p-value is the probability of observing
what was observed, or something more extreme, under the null
hypothesis. In our example X = 15, so the p-value
equals P(X ≥ 15) = 0.4163708.
The p-value equals the sum of the probabilities for 15
through 20, which is 0.4163708.
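The upper-tail sum described on this slide can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the Excel table or the SPSS procedure used in class; the function name `binomial_upper_tail` is made up for this example.

```python
from math import comb

def binomial_upper_tail(x, n, p):
    """Return P(X >= x) for X ~ Binomial(n, p) by summing the exact probabilities."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))

# Values from the slide: n = 20, p = 0.70, observed x = 15.
p_value = binomial_upper_tail(15, 20, 0.70)
print(round(p_value, 7))
```

The printed value should agree with the 0.4163708 obtained from the Excel table.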
6
Binomial Test
x = 15
The way the data are analyzed treats the 0 as
a success. There are 15 zeros.
Thus again, the p-value equals the sum of the
probabilities for 15 through 20: P(X ≥ 15) = 0.4163708.
7
Sampling
  • Last week we covered the Binomial distribution
    and Poisson distribution.
  • Count data often comes from the
    Binomial/Multinomial or Poisson distribution.
  • Luckily, whether the data come from a
    Binomial/Multinomial or a Poisson distribution,
    most analyses of categorical data are
    performed in the same manner.
  • For this reason we will often not discuss which
    distribution the data came from.

8
Two-Way Contingency Tables
Belief in Afterlife
Gender
n_rc is the count in row r, column c (e.g., n_12 is the count in the 1st row, 2nd column)
9
Joint, Marginal and Conditional Probabilities
10
Independence
11
Difference of Proportions
When the counts in the two rows are independent
binomial samples, the estimated standard error of
p1 - p2 is
$\sqrt{p_1(1-p_1)/n_1 + p_2(1-p_2)/n_2}$,
where n1 and n2 are the two row totals.
Class: take 10 minutes to do the
following. Calculate the 95% confidence interval
for the difference in proportions between women
and men (women - men) who believe in an afterlife.
12
Difference of Proportions
95% CI for the difference in proportions (which can
range from -1 to 1): 0.010684 ± 1.96 × 0.02656 = 0.010684
± 0.052057 = (-0.04137, 0.062741). Do you believe the
difference is different from zero?
Now that we have calculated a 95% CI, explain
what a 95% CI is. Were we to take an infinite
number of samples and create an infinite number
of 95% confidence intervals, 95% of those
intervals would contain the true
difference of proportions.
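As a rough check of the interval above, the following sketch recomputes it from the estimate and standard error quoted on the slide. The cell counts of the afterlife table are not in this transcript, so the count-based helper is shown but not called; both function names are invented for this illustration.

```python
from math import sqrt

def se_diff_proportions(p1, n1, p2, n2):
    """Estimated standard error of p1 - p2 for two independent binomial samples."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def wald_interval(estimate, se, z=1.96):
    """Return (lower, upper) for estimate +/- z * se; z = 1.96 gives a 95% interval."""
    return estimate - z * se, estimate + z * se

# Estimate and standard error as quoted on the slide.
lower, upper = wald_interval(0.010684, 0.02656)
print(round(lower, 5), round(upper, 5))
```

Because the interval contains zero, the sample does not give evidence of a difference between the two proportions.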
13
Difference of Proportions
Myocardial Infarction (MI)
Group
Class: take 5 minutes to do the following. Calculate
a 95% CI for the difference in proportions.
14
Difference in Proportions vs. Relative Risk
The 95% CI is (0.0171 - 0.0094) ± 1.96(0.0015), approximately
(0.005, 0.011); aspirin appears to diminish the risk of
MI. Another way to compare placebo vs. aspirin
is to look at the relative risk. The sample
relative risk is p1/p2 = 0.0171/0.0094 = 1.82. Thus in
the sample the proportion of MI cases was 82% higher for
placebo than for aspirin. To calculate the CI for
the relative risk, you first calculate the CI of
the log of the relative risk and then take the
antilog of the CI limits. (Note: log
represents the natural log; in Excel you must use LN,
not LOG.)
15
Relative Risk
The confidence interval for the relative risk
is (1.43, 2.31). From this we would conclude that the
risk of MI is at least 43% higher for patients taking
placebo than for those taking aspirin. It can be misleading
to look only at the difference in proportions; looking at this
situation in terms of relative risk, clearly you
would want to take aspirin.
0.597628 ± 1.96 × 0.121347 = (0.359787, 0.835469);
exp(0.359787) and exp(0.835469) give (1.43, 2.31).
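The log-then-antilog steps can be sketched as below. The counts (189 MI and 10845 no MI for placebo, 104 MI and 10933 no MI for aspirin) are the ones that appear in the odds ratio calculation elsewhere in this transcript; the standard error formula for the log relative risk is the usual large-sample one and is assumed here, since the slide's own formula did not survive. The function name is made up for this example.

```python
from math import exp, log, sqrt

def relative_risk_ci(x1, n1, x2, n2, z=1.96):
    """Return (rr, lower, upper): sample relative risk p1/p2 and a CI built on the log scale."""
    p1, p2 = x1 / n1, x2 / n2
    rr = p1 / p2
    # Large-sample standard error of log(rr) (assumed formula).
    se_log = sqrt((1 - p1) / x1 + (1 - p2) / x2)
    lower = exp(log(rr) - z * se_log)
    upper = exp(log(rr) + z * se_log)
    return rr, lower, upper

# Placebo: 189 MI out of 189 + 10845; aspirin: 104 MI out of 104 + 10933.
rr, lower, upper = relative_risk_ci(189, 189 + 10845, 104, 104 + 10933)
print(round(rr, 2), round(lower, 2), round(upper, 2))
```

The result should be close to the slide's relative risk of 1.82 and interval (1.43, 2.31).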
16
The Odds Ratio
The odds are nonnegative; when the odds are
greater than one, a success is more likely than a
failure. The odds ratio can equal any nonnegative
number. When X and Y are independent, the
odds ratio equals 1. An odds ratio of 4 means
that the odds of success in row 1 are 4 times the
odds of success in row 2. When the odds of
success are higher for row 2 than row 1 the odds
ratio is less than 1.
17
The Odds Ratio
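The maximum likelihood estimator of the odds
ratio is
$\hat{\theta} = \frac{n_{11} n_{22}}{n_{12} n_{21}}$
The asymptotic standard error for the log of the
MLE is
$ASE(\log\hat{\theta}) = \sqrt{1/n_{11} + 1/n_{12} + 1/n_{21} + 1/n_{22}}$
The confidence interval for the log odds ratio is
$\log\hat{\theta} \pm z_{\alpha/2}\, ASE(\log\hat{\theta})$;
exponentiate its endpoints to get the interval for the odds ratio.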
18
Inference for Log Odds Ratios
Class: take 10 minutes to do the following. Calculate
the odds ratio for MI, and then a 95% CI for
the odds ratio.
19
Inference for Log Odds Ratios
Odds ratio = (189 × 10933)/(104 × 10845) = 1.832.
log(1.832) = 0.605.
ASE of the log = (1/189 + 1/10933 + 1/10845 + 1/104)^(1/2) = 0.123.
The 95% CI of the log odds ratio is (0.365, 0.846).
Thus the 95% CI of the odds ratio is (1.44, 2.33).
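The calculation on this slide can be reproduced with a short sketch. The function name `odds_ratio_ci` is invented for this illustration; the counts are the ones shown above.

```python
from math import exp, log, sqrt

def odds_ratio_ci(n11, n12, n21, n22, z=1.96):
    """Return (theta, ase, lower, upper) for a 2x2 table with rows (n11, n12) and (n21, n22)."""
    theta = (n11 * n22) / (n12 * n21)               # sample odds ratio
    ase = sqrt(1 / n11 + 1 / n12 + 1 / n21 + 1 / n22)  # ASE of log(theta)
    lower = exp(log(theta) - z * ase)
    upper = exp(log(theta) + z * ase)
    return theta, ase, lower, upper

# Row 1 = placebo (189 MI, 10845 no MI); row 2 = aspirin (104 MI, 10933 no MI).
theta, ase, lower, upper = odds_ratio_ci(189, 10845, 104, 10933)
print(round(theta, 3), round(ase, 3), round(lower, 2), round(upper, 2))
```

The interval is computed on the log scale and then exponentiated, which is why it is not symmetric around 1.832.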
20
Dealing with Small Cell Counts and the Relationship Between Odds Ratio and Relative
Risk
When zero cell counts occur or some cell
counts are very small, the following slightly
amended formula is used
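The amended formula itself did not survive in this transcript. The sketch below assumes it is the common adjustment that adds 0.5 to every cell before forming the odds ratio; that is an assumption, not something the slide confirms. It also illustrates the identity odds ratio = relative risk × (1 - p2)/(1 - p1), which follows directly from the definitions. Both function names and the small table are hypothetical.

```python
def amended_odds_ratio(n11, n12, n21, n22, c=0.5):
    """Odds ratio after adding c to each cell (assumed form of the small-count amendment)."""
    return ((n11 + c) * (n22 + c)) / ((n12 + c) * (n21 + c))

def odds_ratio_from_relative_risk(rr, p1, p2):
    """Odds ratio implied by a relative risk rr = p1 / p2."""
    return rr * (1 - p2) / (1 - p1)

# Hypothetical table with a zero cell: the ordinary odds ratio would divide by zero.
print(amended_odds_ratio(10, 0, 5, 5))

# MI counts used elsewhere in the transcript: the identity recovers the sample odds ratio.
p1, p2 = 189 / (189 + 10845), 104 / (104 + 10933)
print(round(odds_ratio_from_relative_risk(p1 / p2, p1, p2), 3))
```

When both p1 and p2 are close to zero, the factor (1 - p2)/(1 - p1) is close to 1, so the odds ratio and the relative risk take similar values.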
21
Chi-Squared Tests
For calculating chi-squared statistics for testing
a null hypothesis with fixed values, we
use the expected frequencies $\mu_{ij} = n\pi_{ij}$
22
Chi-Squared Tests of Independence
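For calculating chi-squared statistics for testing
a null hypothesis of independence, the expected
frequencies are $\mu_{ij} = n\pi_{i+}\pi_{+j}$

Most likely the true probabilities are unknown
and the sample proportions must be used:
$\hat{\mu}_{ij} = n p_{i+} p_{+j} = n_{i+} n_{+j} / n$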
23
Chi-Squared Test of Independence
Take 15 minutes and calculate the Pearson
statistic and the likelihood ratio chi-squared
statistic for the null hypothesis that the
probability of heads is the same for all people,
assuming the true probability is unknown.
Coin Toss
Person
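A minimal sketch of the two statistics follows, assuming the usual definitions: Pearson X² = Σ (n_ij - μ̂_ij)² / μ̂_ij and likelihood ratio G² = 2 Σ n_ij log(n_ij / μ̂_ij), with μ̂_ij = (row total × column total) / n. The coin toss counts from the slide are not in this transcript, so the table below is hypothetical and only shows the mechanics.

```python
from math import log

def independence_statistics(table):
    """Return (pearson_x2, likelihood_ratio_g2, df) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # estimated expected frequency
            x2 += (observed - expected) ** 2 / expected
            if observed > 0:  # a zero count contributes nothing to G^2
                g2 += 2 * observed * log(observed / expected)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return x2, g2, df

# Hypothetical data: rows are three people, columns are (heads, tails) in 10 tosses each.
hypothetical_tosses = [[6, 4], [5, 5], [9, 1]]
x2, g2, df = independence_statistics(hypothetical_tosses)
print(round(x2, 3), round(g2, 3), df)
```

Both statistics are compared with a chi-squared distribution with (rows - 1)(columns - 1) degrees of freedom.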
24
Adjusted Residuals
When the null hypothesis is true, each adjusted
residual has a large-sample standard normal
distribution. An adjusted residual of about 2-3 or
larger in absolute value indicates lack of fit of the null
hypothesis within that cell. Take 10 minutes to
calculate the adjusted residuals.
Political Party Identification
Gender
25
Adjusted Residuals
From this example we can see how the adjusted
residuals can add further insight beyond the
chi-squared tests of independence, such as the
direction of the association.
Political Party Identification
Gender
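The adjusted residuals can be sketched as follows, assuming the usual definition (n_ij - μ̂_ij) / sqrt(μ̂_ij (1 - p_i+)(1 - p_+j)), where μ̂_ij is the expected count under independence. The party identification counts are not in this transcript, so the 2x2 table below is hypothetical.

```python
from math import sqrt

def adjusted_residuals(table):
    """Return a table of adjusted residuals under the independence hypothesis."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Estimated standard error of (observed - expected) under independence.
            se = sqrt(expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            out_row.append((observed - expected) / se)
        residuals.append(out_row)
    return residuals

# Hypothetical counts, for illustration only.
hypothetical_table = [[30, 20], [20, 30]]
residuals = adjusted_residuals(hypothetical_table)
print(residuals)
```

The sign of each residual shows whether a cell has more or fewer observations than independence predicts, which is the direction information the chi-squared statistic alone does not give.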
26
Chi-Squared Tests of Independence with Ordinal
Data
Linear trend alternative to independence.
$M^2 = (n-1)r^2$ is chi-squared with one degree of freedom,
where r is the sample correlation between the row and column scores.
M, its square root, follows a standard normal
distribution. M gives insight into direction.
Note: when categories do not have scores, such as
education level, logical scores must be assigned,
e.g., High School degree = 1, College degree = 2,
Master's degree = 3.
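A minimal sketch of the linear trend statistic, assuming the standard form M² = (n - 1)r², where r is the correlation between row and column scores weighted by the cell counts. The table and the scores below are hypothetical and are not the alcohol and malformation data; the function name is invented for this example.

```python
from math import sqrt

def linear_trend_test(table, row_scores, col_scores):
    """Return (r, m, m_squared) for a two-way table with assigned row and column scores."""
    n = sum(sum(row) for row in table)
    # Weighted means of the row and column scores.
    u_bar = sum(u * sum(row) for u, row in zip(row_scores, table)) / n
    v_bar = sum(v * sum(col) for v, col in zip(col_scores, zip(*table))) / n
    cov = var_u = var_v = 0.0
    for u, row in zip(row_scores, table):
        for v, count in zip(col_scores, row):
            cov += count * (u - u_bar) * (v - v_bar)
            var_u += count * (u - u_bar) ** 2
            var_v += count * (v - v_bar) ** 2
    r = cov / sqrt(var_u * var_v)
    m = sqrt(n - 1) * r  # approximately standard normal under independence
    return r, m, m * m

# Hypothetical 2x2 table with scores 0 and 1 for both variables.
r, m, m_squared = linear_trend_test([[20, 10], [10, 20]], [0, 1], [0, 1])
print(round(r, 4), round(m, 4), round(m_squared, 4))
```

The sign of M matches the sign of r, so a positive M points to a positive trend between the scored variables.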
27
Example with Ordinal Data Alcohol and Infant
Malformation
Infant Malformation
Alcohol Consumption
Take 2-3 minutes and think of logical value
assignments for scores. Note nominal binary
data can be treated as ordinal.
28
Example with Ordinal Data Alcohol and Infant
Malformation
Infant Malformation
Alcohol Consumption
Take 20 minutes and, using the scores given, calculate
r.