Title: Biostatistics course Part 13 Effect measures in 2 x 2 tables
1Biostatistics course, Part 13: Effect measures in 2 x 2 tables
- Dr. Sc. Nicolas Padilla Raygoza
- Department of Nursing and Obstetrics
- Division of Health Sciences and Engineering
- University of Guanajuato
- Campus Celaya-Salvatierra
2Biosketch
- Medical Doctor, Autonomous University of Guadalajara.
- Pediatrician certified by the Mexican Council of Certification in Pediatrics.
- Postgraduate Diploma in Epidemiology, London School of Hygiene and Tropical Medicine, University of London.
- Master of Science with emphasis in Epidemiology, Atlantic International University.
- Doctor of Science with emphasis in Epidemiology, Atlantic International University.
- Associate Professor B, Department of Nursing and Obstetrics, Division of Health Sciences and Engineering, University of Guanajuato, Campus Celaya-Salvatierra, Mexico.
- padillawarm_at_gmail.com
3Competencies
- The reader will obtain a Risk Ratio or Odds Ratio from a 2 x 2 table.
- He (she) will calculate the 95% confidence interval for the RR or OR.
- He (she) will identify potential confounders and/or interactions.
- He (she) will apply the Mantel-Haenszel method for the RR, OR and Chi-squared.
4Introduction
- In part 12 of the course, we tested the association between two categorical variables.
- Now, we review the methods used to measure the association.
- We will work with binary variables, so we will use 2 x 2 tables.
5Example
- A nurse in a poor area of Mexico was informed that many local children attending the nursery were sick with respiratory infections.
- She designed a cohort study to investigate the problem.
- During the following years, 1000 children were followed.
- The main research question was:
- Is attending nursery associated with respiratory infection?
6Example
| Attending nursery | Respiratory infection: Yes, n (%) | Respiratory infection: No, n (%) | Total |
|---|---|---|---|
| Yes | 37 (33.9) | 72 (66.1) | 109 |
| No | 43 (4.8) | 848 (95.2) | 891 |
| Total | 80 (8) | 920 (92) | 1000 |
7Risk Ratio (RR)
- In health research, the term "risk" is used instead of proportion.
- For example:
- The risk of infection among children attending day care was 33.9%.
- Thus, the risk ratio is the ratio of two proportions.
- The risk of respiratory infection for those attending the nursery is 37 / (37 + 72) = 37/109 = 0.339.
- The risk of respiratory infection in children not attending day care is 43 / (43 + 848) = 43/891 = 0.048.
- The risk ratio (RR) is the ratio of these two risks.
- Risk ratio = 0.339 / 0.048 = 7.06
8Risk Ratio (RR)
- In general, the risk ratio can be obtained with the following formula, where a, b, c and d are the frequencies in the 2 x 2 table.

| Exposure | Outcome Yes | Outcome No | Total |
|---|---|---|---|
| Yes | a | b | a + b |
| No | c | d | c + d |
| Total | a + c | b + d | N |

$$RR = \frac{a/(a+b)}{c/(c+d)}$$
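As a minimal sketch of the formula above, the following Python function computes the risk ratio from the four cell counts. The name `risk_ratio` is an illustrative choice, not part of the course material. Because it works with unrounded risks, it returns about 7.03 for the nursery data; the 7.06 on the slide comes from rounding the two risks to three decimals before dividing.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a 2 x 2 table.

    a, b: exposed with / without the outcome
    c, d: unexposed with / without the outcome
    """
    risk_exposed = a / (a + b)      # risk among the exposed
    risk_unexposed = c / (c + d)    # risk among the unexposed
    return risk_exposed / risk_unexposed


# Nursery attendance and respiratory infection (raw table)
rr = risk_ratio(37, 72, 43, 848)
print(f"RR = {rr:.2f}")
```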
9Odds Ratio (OR)
- The Odds Ratio (OR) is the ratio of the odds of the outcome among the exposed to the odds of the outcome among the non-exposed.
- The odds of infection among attendees of the nursery are 37 / 72 = 0.514.
- The odds of infection among children not attending day care are 43 / 848 = 0.051.
- The Odds Ratio is the ratio of these two odds: OR = 0.514 / 0.051 = 10.08.
- In general, the Odds Ratio is found with the following formula:
- OR = (a / b) / (c / d) = ad / bc
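The cross-product formula can be sketched in Python as follows. The function name `odds_ratio` is hypothetical. With unrounded odds the result for the nursery data is about 10.13, slightly above the 10.08 obtained on the slide from odds rounded to three decimals.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2 x 2 table: (a / b) / (c / d) = ad / bc."""
    return (a * d) / (b * c)


# Nursery attendance and respiratory infection (raw table)
or_value = odds_ratio(37, 72, 43, 848)
print(f"OR = {or_value:.2f}")
```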
10Confidence intervals
- In the analysis of data from children attending day care or not, we have the option to use the RR or the OR to measure the effect of attendance at the nursery.
- Each value is only an estimate, so these values should be reported with confidence intervals.
- An approximate 95% confidence interval for the RR is found using the following formula, where EF is the error factor:
- Minimum value = RR / EF
- Maximum value = RR x EF

$$EF = \exp\left(1.96\sqrt{\frac{1}{a} - \frac{1}{a+b} + \frac{1}{c} - \frac{1}{c+d}}\right)$$
11Confidence intervals
- The CI for the data on children who attend day care or not is:
- EF = exp(1.96 x √(1/37 - 1/109 + 1/43 - 1/891)) = 1.48
- RR = 7.06
- Minimum value = 7.06 / 1.48 = 4.77
- Maximum value = 7.06 x 1.48 = 10.45
- 95% CI: 4.77 to 10.45
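The error-factor calculation for the RR can be reproduced with a short Python sketch. The function name and its return layout are assumptions made for illustration. Since the RR is not rounded before dividing and multiplying, the limits come out at roughly 4.8 to 10.4, very close to the slide's 4.77 to 10.45.

```python
import math


def rr_confidence_interval(a, b, c, d, z=1.96):
    """Return (RR, error factor, lower limit, upper limit) for a 2 x 2 table."""
    rr = (a / (a + b)) / (c / (c + d))
    # Standard error of log(RR)
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    ef = math.exp(z * se_log_rr)  # error factor
    return rr, ef, rr / ef, rr * ef


rr, ef, lower, upper = rr_confidence_interval(37, 72, 43, 848)
print(f"RR = {rr:.2f}, EF = {ef:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```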
12Confidence intervals
- An approximate 95% confidence interval for the OR is found using the following formula:
- Minimum value = OR / EF
- Maximum value = OR x EF

$$EF = \exp\left(1.96\sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}\right)$$
13Confidence intervals
- The CI for the data on children who attend day care or not is:
- EF = exp(1.96 x √(1/37 + 1/72 + 1/43 + 1/848)) = 1.65
- OR = 10.08
- Minimum value = 10.08 / 1.65 = 6.11
- Maximum value = 10.08 x 1.65 = 16.63
- 95% CI: 6.11 to 16.63
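The same steps for the OR can be sketched in Python. The function name is hypothetical. Working with the unrounded OR (about 10.13) gives limits of roughly 6.1 to 16.7, in line with the slide's 6.11 to 16.63.

```python
import math


def or_confidence_interval(a, b, c, d, z=1.96):
    """Return (OR, error factor, lower limit, upper limit) for a 2 x 2 table."""
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ef = math.exp(z * se_log_or)  # error factor
    return odds_ratio, ef, odds_ratio / ef, odds_ratio * ef


or_value, ef, lower, upper = or_confidence_interval(37, 72, 43, 848)
print(f"OR = {or_value:.2f}, EF = {ef:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```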
14Which measure is best?
- Risk Ratios are calculated for cross-sectional and cohort studies.
- The formula for the 95% confidence interval for the RR requires larger sample sizes than that for the OR.
- ORs are calculated for case-control and cross-sectional studies.
- In case-control studies it is not possible to calculate risks, and therefore the RR cannot be calculated.
- There is an advantage in using the OR.
- It is a consistent measure of effect, unlike the RR.
15Example (Cont)
- The data on Mexican children showed a strong association between exposure (attending nursery) and outcome (respiratory infection).
- However, such an association may be confounded by other factor(s).
- For example, although children who attend day care seem to have a 7 times higher risk of respiratory infection, the cause of the infection could be something else that is associated with children who go to day care.
- In other words, attending the nursery may be a marker of another exposure that causes respiratory infection.
- If this is true, we say that the association between respiratory infection and attendance at the nursery is confounded.
16How to identify a potential confounder?
- To evaluate a potential confounder, we should consider three aspects:
- The exposure
- The outcome
- The confounder
17Example
- The nurse is interested in the association between day care attendance and the presence of respiratory infection, but is aware that children might be exposed to other factors that cause respiratory infection.
- For example, overcrowding at home is a risk factor for respiratory infection.
- It is therefore a potential confounder of the association between attendance at day care and respiratory infections.
18Confounders
- For a variable to be a potential confounder, it should meet three conditions. It must:
- be an independent risk factor for the outcome of interest
- be associated with the exposure of interest
- not be on the causal pathway between exposure and outcome.
19Confounders
- How do we check these conditions in the study of Mexican children?
- Condition 1 for confounding
- Risk factor for the outcome of interest
- Is there an association between overcrowding and respiratory infection (RI)?

| Overcrowding in home | RI Yes | RI No | Risk of RI |
|---|---|---|---|
| Yes | 54 | 55 | 54/109 = 0.5 |
| No | 21 | 870 | 21/891 = 0.02 |

RR = 25, 95% CI 15.72 to 39.75, X² = 311.67, P << 0.05
20Confounders
- How do we check these conditions in the study of Mexican children?
- Condition 2 for confounding
- Association with the exposure
- Is there an association between overcrowding and attendance at the nursery?

| Overcrowding in home | Attendance at nursery Yes | Attendance at nursery No |
|---|---|---|
| Yes | 43 | 66 |
| No | 35 | 856 |

X² = 170.39, P << 0.05
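The Chi-squared values quoted for conditions 1 and 2 can be checked with a small Python sketch. It assumes the Pearson Chi-squared statistic for a 2 x 2 table without continuity correction, which reproduces the figures on the slides; the function name is hypothetical.

```python
def chi_squared_2x2(a, b, c, d):
    """Pearson Chi-squared (no continuity correction) for a 2 x 2 table."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Condition 1: overcrowding vs. respiratory infection
x2_outcome = chi_squared_2x2(54, 55, 21, 870)
# Condition 2: overcrowding vs. attendance at the nursery
x2_exposure = chi_squared_2x2(43, 66, 35, 856)
print(f"X2 (outcome) = {x2_outcome:.2f}, X2 (exposure) = {x2_exposure:.2f}")
```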
21Confounders
- How do we check these conditions in the study of Mexican children?
- Condition 3 for confounding
- Is the potential confounder on the causal pathway?
- In this example, it is unlikely that attendance at the nursery causes overcrowding at home, so overcrowding is not a step between exposure and outcome.
22Do we have a confounder?
- In this study, overcrowding has satisfied the three conditions necessary for a confounding variable:
- It is an independent risk factor for the outcome of interest. Overcrowding is associated with respiratory infection.
- It is associated with the exposure of interest. Overcrowding is associated with attendance at the nursery.
- It is not on the causal pathway. Attendance at the nursery is unlikely to cause overcrowding at home.
23Stratified tables
- Now we know that the data must be analyzed further to take the effect of overcrowding into account.
- To adjust for a confounding variable, we stratify the 2 x 2 table of interest.
- The unstratified table is called the raw table.
- It can be divided into strata defined by the confounding variable.
- The sample is divided into two groups; within each of them, the overcrowding status is the same.
- The two groups are:
- With overcrowding and without overcrowding
24Stratified tables
- We want to find out whether nursery attendance is associated with respiratory infection when comparing children within the same category of overcrowding.
- The raw table for the relationship between respiratory infection and nursery attendance:

| Attendance at nursery | Respiratory infection: Yes, n (%) | Respiratory infection: No, n (%) | Total |
|---|---|---|---|
| Yes | 37 (33.9) | 72 (66.1) | 109 |
| No | 43 (4.8) | 848 (95.2) | 891 |
| Total | 80 (8) | 920 (92) | 1000 |
25Stratified tables
- Now the tables are shown stratified by overcrowding: with overcrowding and without overcrowding.

Overcrowding

| | Respiratory infection Yes | Respiratory infection No | Total |
|---|---|---|---|
| Nursery Yes | 61 | 14 | 75 |
| Nursery No | 5 | 21 | 26 |
| Total | 66 | 35 | 101 |

RR = 4.23, X² = 32.88, p < 0.0001, 95% CI 1.91 to 9.37

Without overcrowding

| | Respiratory infection Yes | Respiratory infection No | Total |
|---|---|---|---|
| Nursery Yes | 10 | 24 | 34 |
| Nursery No | 4 | 861 | 865 |
| Total | 14 | 885 | 899 |

RR = 63.6, X² = 178.84, p < 0.0001, 95% CI 21.01 to 192.56
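A brief Python sketch shows how the stratum-specific risk ratios follow from the two tables. The dictionary layout and the helper name `risk_ratio` are illustrative assumptions; the counts are those of the stratified tables (n = 101 with overcrowding, n = 899 without).

```python
def risk_ratio(a, b, c, d):
    """Risk ratio: risk in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))


# (a, b, c, d) = nursery yes/ill, nursery yes/not ill, nursery no/ill, nursery no/not ill
strata = {
    "Overcrowding": (61, 14, 5, 21),
    "Without overcrowding": (10, 24, 4, 861),
}

stratum_rr = {name: risk_ratio(*cells) for name, cells in strata.items()}
for name, rr in stratum_rr.items():
    print(f"{name}: RR = {rr:.2f}")
```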
26Stratified tables
- Do you think that attendance at nursery is a risk factor for respiratory infections among children without overcrowding?
- Yes, children attending day care have a risk of respiratory infection about 63 times higher than those who do not attend nursery.
- The p value indicates a strong association between attendance at day care and respiratory infection in the group without overcrowding.
27Stratified tables
- Do you think that attendance at nursery is a risk factor for respiratory infection in the group with overcrowding?
- Yes, children attending day care have a risk of respiratory infection more than 4 times higher than those not attending the nursery.
- The p value indicates a strong association between attendance at day care and respiratory infection in this group.
- Within each stratum, the association between attendance at day care and respiratory infections is now independent of overcrowding at home.
28Comparison of results
- How do these results compare with those of the raw table?
- The raw table shows a strong relationship between attendance at day care and respiratory infection. The RR differs between the two stratified tables, but the association remains statistically significant in both.

| | RR | 95% CI | X² | P-value |
|---|---|---|---|---|
| Raw | 7.06 | 4.77 to 10.45 | 111.88 | < 0.05 |
| Overcrowding | 4.23 | 1.91 to 9.37 | 32.88 | < 0.05 |
| Without overcrowding | 63.6 | 21.01 to 192.56 | 178.84 | < 0.05 |
29Adjusted Risk Ratios
- The nurse does not want to show the data divided into strata; she prefers a global estimate of the effect of nursery attendance on respiratory infection, adjusted for overcrowding.
- This can be done by calculating the RR using the Mantel-Haenszel method.
- First, look at the 2 x 2 table in each stratum (the subscript e identifies the stratum).

| Exposure | Disease Yes | Disease No | Total |
|---|---|---|---|
| Yes | a_e | b_e | |
| No | c_e | d_e | |
| Total | | | n_e |
30Risk Ratios from Mantel-Haenszel
- The adjusted (summary) RR can be obtained with:

$$RR_{MH} = \frac{\sum a_e (c_e + d_e)/n_e}{\sum c_e (a_e + b_e)/n_e}$$

- This gives an average of the RRs estimated in each table, giving more weight to the tables with larger sample sizes.
31Adjusted Risk Ratio
- We calculate the overcrowding-adjusted RR with the Mantel-Haenszel formula

Overcrowding

| | Respiratory infection Yes | Respiratory infection No | Total |
|---|---|---|---|
| Nursery Yes | 61 | 14 | 75 |
| Nursery No | 5 | 21 | 26 |
| Total | 66 | 35 | 101 |

Non-overcrowding

| | Respiratory infection Yes | Respiratory infection No | Total |
|---|---|---|---|
| Nursery Yes | 10 | 24 | 34 |
| Nursery No | 4 | 861 | 865 |
| Total | 14 | 885 | 899 |

RR (Mantel-Haenszel) = [61 x (5 + 21)/101 + 10 x (4 + 861)/899] / [5 x (61 + 14)/101 + 4 x (10 + 24)/899] = (15.70 + 9.62) / (3.71 + 0.15) = 25.32 / 3.86 = 6.56
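The calculation above can be sketched as a small Python function that sums the numerator and denominator terms over the strata. The function name and the tuple layout (a, b, c, d) are assumptions made for illustration. With unrounded intermediate values the result is about 6.55, which agrees with the slide's 6.56 up to rounding.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary risk ratio.

    strata: iterable of (a, b, c, d) tuples, one 2 x 2 table per stratum.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * (c + d) / n
        denominator += c * (a + b) / n
    return numerator / denominator


# Strata: overcrowding, non-overcrowding
tables = [(61, 14, 5, 21), (10, 24, 4, 861)]
rr_mh = mantel_haenszel_rr(tables)
print(f"Adjusted RR = {rr_mh:.2f}")
```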
32Adjusted Odds Ratio
- The adjusted OR is calculated in a similar way to the adjusted RR.

$$OR_{MH} = \frac{\sum a_e d_e / n_e}{\sum b_e c_e / n_e}$$

| Exposure | Disease Yes | Disease No | Total |
|---|---|---|---|
| Yes | a_e | b_e | |
| No | c_e | d_e | |
| Total | | | n_e |
33Adjusted Odds Ratio
- In a cross-sectional study on the use of quinfamide after amoebic dysentery, the number of carriers of Entamoeba histolytica was reported.

| | Non-carrier | Carrier | Total |
|---|---|---|---|
| Quinfamide | 100 | 54 | 154 |
| Non quinfamide | 15 | 72 | 87 |
| Total | 115 | 126 | 241 |
34Adjusted Odds Ratio
- We calculate the OR adjusted for area of residence with the Mantel-Haenszel formula

Urban

| | Non-carrier | Carrier | Total |
|---|---|---|---|
| Quinfamide Yes | 35 | 39 | 74 |
| Quinfamide No | 10 | 51 | 61 |
| Total | 45 | 90 | 135 |

Rural

| | Non-carrier | Carrier | Total |
|---|---|---|---|
| Quinfamide Yes | 65 | 14 | 79 |
| Quinfamide No | 5 | 21 | 26 |
| Total | 70 | 35 | 105 |

OR (Mantel-Haenszel) = [(35 x 51 / 135) + (65 x 21 / 105)] / [(39 x 10 / 135) + (14 x 5 / 105)] = (13.2 + 13) / (2.89 + 0.67) = 26.2 / 3.56 = 7.4
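The adjusted OR can be sketched in Python in the same way as the adjusted RR. The function name and the tuple layout (a, b, c, d) are hypothetical; the counts are taken from the two residence-area tables of the quinfamide example.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio.

    strata: iterable of (a, b, c, d) tuples, one 2 x 2 table per stratum.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# One (a, b, c, d) tuple per residence-area table
tables = [(35, 39, 10, 51), (65, 14, 5, 21)]
or_mh = mantel_haenszel_or(tables)
print(f"Adjusted OR = {or_mh:.1f}")
```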
35Mantel-Haenszel X²
- The nurse now knows that the association between respiratory infection and nursery attendance persists after adjusting for overcrowding, the confounding variable.
- Now she wants to calculate a Chi-squared test of the significance of this association, adjusted for the confounder.
- This can be done by calculating the Mantel-Haenszel X² test.
36Mantel-Haenszel X²
- To calculate the Chi-squared test adjusted for the confounder, we calculate the Mantel-Haenszel Chi-squared. The null hypothesis is that there is no association between nursery attendance and respiratory infection.
- H0: OR = 1.

$$X^2_{MH} = \frac{\left(\sum a_e - \sum E(a_e)\right)^2}{\sum V(a_e)}$$
37Mantel-Haenszel X²
- We should go step by step, beginning with the 2 x 2 table of each stratum.

| Exposure | Disease Yes | Disease No | Total |
|---|---|---|---|
| Yes | a_e | b_e | |
| No | c_e | d_e | |
| Total | | | n_e |
38Mantel-Haenszel X²
- The Mantel-Haenszel Chi-squared test is an average of the individual Chi-squared tests of each table.
- To calculate the Mantel-Haenszel Chi-squared test, we need three values from each table:
- a_e: number of ill and exposed
- E(a_e): expected value of a_e
- V(a_e): variance (standard error squared) of a_e,
- where

$$E(a_e) = \frac{\text{row total} \times \text{column total}}{\text{grand total}} = \frac{(a_e + b_e)(a_e + c_e)}{n_e}$$

$$V(a_e) = \frac{(a_e + b_e)(c_e + d_e)(a_e + c_e)(b_e + d_e)}{n_e^2 (n_e - 1)}$$
39Example
- Overcrowding table
- a = 61
- E(a) = 75 x 66 / 101 = 49.01
- V(a) = (75 x 66 x 26 x 35) / (101² x (101 - 1)) = 4.42
- Non-overcrowding table
- a = 10
- E(a) = 34 x 14 / 899 = 0.53
- V(a) = (34 x 14 x 865 x 885) / (899² x (899 - 1)) = 0.50
- To obtain the Mantel-Haenszel Chi-squared test (Chi-squared adjusted for overcrowding), we add these values from the two strata, using the formula

$$X^2_{MH} = \frac{\left(\sum a_e - \sum E(a_e)\right)^2}{\sum V(a_e)}$$
40Example
- To obtain the Mantel-Haenszel Chi-squared test (Chi-squared test adjusted for overcrowding), we add these values, using the formula

| | a | E(a) | V(a) |
|---|---|---|---|
| Overcrowding | 61 | 49.01 | 4.42 |
| Non-overcrowding | 10 | 0.53 | 0.50 |
| Total | 71 | 49.54 | 4.92 |

- X² (Mantel-Haenszel) = (71 - 49.54)² / 4.92 = 93.60
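The step-by-step calculation can be sketched in Python as below. The function name and the tuple layout (a, b, c, d) are illustrative assumptions. Using unrounded E(a) and V(a) the statistic is about 93.65; the slide's 93.60 reflects rounding of the intermediate values.

```python
def mantel_haenszel_chi2(strata):
    """Mantel-Haenszel Chi-squared statistic (1 degree of freedom).

    strata: iterable of (a, b, c, d) tuples, one 2 x 2 table per stratum.
    """
    sum_a = 0.0
    sum_expected = 0.0
    sum_variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        sum_a += a
        # Expected value of a: row total x column total / grand total
        sum_expected += (a + b) * (a + c) / n
        # Variance of a
        sum_variance += (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
    return (sum_a - sum_expected) ** 2 / sum_variance


# Strata: overcrowding, non-overcrowding
tables = [(61, 14, 5, 21), (10, 24, 4, 861)]
x2_mh = mantel_haenszel_chi2(tables)
print(f"Mantel-Haenszel X2 = {x2_mh:.2f}")
```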
41Confounding or not confounding
- How do we decide if there is confounding?
- There are no statistical tests to demonstrate confounding.
- We calculate statistical tests and measures of effect for the raw and stratified tables.
- Then we calculate the summary (adjusted) estimates and tests, compare them with the raw ones, and conclude whether there is confounding or not.
42Confounding or not confounding
- If there is an important difference between the raw and adjusted estimates, we say that the association of interest is confounded by another factor.
- We look at the data on children's nursery attendance and respiratory infection.
- After adjusting for overcrowding, the RR decreases from 7.06 to 6.56.
43Possible effects of confounding
- Generally there is more than one confounder.
- They can have different effects:
- The association under study may be significant before adjusting for a confounder and not significant after.
- The association can remain significant after adjusting for a confounder, but with a less significant p-value.
- Strata can show opposite results; in this case, it is better to show the stratified results. This is interaction or effect modification.
- A confounder can hide an existing relationship.