Title: Cohort studies Case control studies Field methods
1Cohort studies, case-control studies, field methods
2- "Every epidemiological study should be viewed as a
measurement exercise." - Kenneth J. Rothman, 2002
- What does this mean?
3Cohort study
- What is a cohort study?
- Types of cohorts
- Observational: retrospective, prospective
- Clinical trials
- Other?
- What is a fixed cohort? An open cohort?
- What is the denominator in a cohort study?
- How is it calculated?
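The denominator question can be made concrete with a small sketch. It assumes a hypothetical open cohort of four participants (the dates and outcomes are invented for illustration) and sums each person's time at risk to get person-years, the usual denominator of an incidence rate. In a fixed cohort with complete follow-up, the denominator of cumulative incidence would instead be the number of people at risk at the start.

```python
from datetime import date

def person_years(entry: date, exit_: date) -> float:
    """Years at risk contributed by one participant (entry to exit)."""
    return (exit_ - entry).days / 365.25

# Hypothetical open cohort: (entry date, exit date, developed outcome?)
# Exit is the date of the outcome, loss to follow-up, or end of study.
cohort = [
    (date(2000, 1, 1), date(2010, 1, 1), False),
    (date(2000, 1, 1), date(2004, 7, 1), True),
    (date(2003, 1, 1), date(2010, 1, 1), False),
    (date(2005, 1, 1), date(2008, 1, 1), True),
]

# Denominator for an incidence rate: total person-time at risk
total_py = sum(person_years(start, end) for start, end, _ in cohort)
events = sum(1 for _, _, is_case in cohort if is_case)
incidence_rate = events / total_py

print(f"{events} events over {total_py:.1f} person-years")
print(f"incidence rate = {1000 * incidence_rate:.1f} per 1,000 person-years")
```

Each participant contributes only the time actually observed at risk, which is why late entrants and early exits are handled naturally in an open cohort.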
4Elements of longitudinal study designs
Essential question: what is the nature of time?
[Diagram labels: history of exposures; exposures at start of study; changes in exposures; confounding; modification; mediation; outcome(s); exposure times; study times]
5Longitudinal studies are all about change over
time
- Change in level and effects of risk factors due
to
- Physiological changes related (perhaps) to age
- Selective survival
- Differential due to higher mortality in high risk
- Competing risk from other outcomes (esp. death)
- Differential attrition by exposure
- Biasing effects of attrition
- Time trends in population levels of exposure
- Reverse causality of disease on exposure
- Proximity of exposure measurement to outcome
6Factors determining trends with age in risk
factor-disease associations
7Change over time in risk factor-disease
associations
8Classifying person time exposed
What is the consequence for effect estimates of
misclassifying exposure time?
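One way to explore the question is a toy calculation. The sketch below uses six invented participants (not data from any study) who may become exposed partway through follow-up. It compares a rate ratio that assigns person-time to the exposure state in which it was actually accrued with one that treats anyone ever exposed as exposed for their entire follow-up. The function names are illustrative assumptions.

```python
def rate_ratio(ev_exp, py_exp, ev_unexp, py_unexp):
    """Incidence rate ratio: exposed rate divided by unexposed rate."""
    return (ev_exp / py_exp) / (ev_unexp / py_unexp)

# Hypothetical participants: (years followed before exposure began,
# years followed after exposure began, when the event occurred).
# The event is "exp" (during exposed time), "unexp", or None (no event).
participants = [
    (2.0, 8.0, "exp"),
    (5.0, 5.0, None),
    (10.0, 0.0, "unexp"),
    (10.0, 0.0, None),
    (4.0, 6.0, "exp"),
    (10.0, 0.0, None),
]

def classify_by_time(people):
    """Count person-time in the exposure state where it was accrued."""
    py_unexp = sum(before for before, _, _ in people)
    py_exp = sum(after for _, after, _ in people)
    ev_exp = sum(1 for _, _, ev in people if ev == "exp")
    ev_unexp = sum(1 for _, _, ev in people if ev == "unexp")
    return ev_exp, py_exp, ev_unexp, py_unexp

def classify_ever_exposed(people):
    """Misclassification: all follow-up of ever-exposed people is 'exposed'."""
    ev_exp = ev_unexp = 0
    py_exp = py_unexp = 0.0
    for before, after, ev in people:
        if after > 0:
            py_exp += before + after
            ev_exp += int(ev is not None)
        else:
            py_unexp += before + after
            ev_unexp += int(ev is not None)
    return ev_exp, py_exp, ev_unexp, py_unexp

correct_rr = rate_ratio(*classify_by_time(participants))
naive_rr = rate_ratio(*classify_ever_exposed(participants))
print(f"time-correct rate ratio:  {correct_rr:.2f}")
print(f"ever-exposed rate ratio:  {naive_rr:.2f}")
```

In this invented example, counting unexposed time as exposed inflates the exposed denominator and pulls the rate ratio toward the null. The direction and size of the distortion in a real study depend on how the time is misclassified.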
9Estimating exposure over time
- Measurement of exposure (validity, reliability)
- Change over time in exposure
- Baseline
- After baseline
- Time that is unexposed
- baseline time, average over time, cumulative
exposure
10Time exposed vs. time at risk
- Time from exposure to event
- Induction period
- Threshold of cumulative exposure
- Average exposure, cumulative exposure
- Immortal person time (?)
- Reverse causality
11Example of cohort study
12The 32-year relationship between cholesterol and
dementia from midlife to late life
Background: Cellular and animal studies suggest
that hypercholesterolemia contributes to
Alzheimer disease (AD). However, the relationship
between cholesterol and dementia at the
population level is less clear and may vary over
the lifespan.
Methods: The Prospective Population Study of Women,
consisting of 1,462 women without dementia aged
38-60 years, was initiated in 1968-1969 in
Gothenburg, Sweden. Follow-ups were conducted in
1974-1975, 1980-1981, 1992-1993, and 2000-2001.
All-cause dementia was diagnosed according to
DSM-III-R criteria and AD according to National
Institute of Neurological and Communicative
Disorders and Stroke-Alzheimer's Disease and
Related Disorders Association criteria. Cox
proportional hazards regression examined baseline,
time-dependent, and change in cholesterol levels
in relation to incident dementia and AD among all
participants. Analyses were repeated among
participants who survived to the age of 70 years
or older and participated in the 2000-2001
examination.
Results: Higher cholesterol level in 1968 was not
associated with an increased risk of AD (highest
vs lowest quartile: hazard ratio [HR] 2.82, 95%
confidence interval [CI] 0.94-8.43) among those
who survived to and participated in the 2000-2001
examination. While there was no association
between cholesterol level and dementia when
considering all participants over 32 years, a
time-dependent decrease in cholesterol over the
follow-up was associated with an increased risk
of dementia (HR 2.35, 95% CI 1.22-4.58).
Conclusion: These data suggest that midlife
cholesterol level is not associated with an
increased risk of AD. However, there may be a
slight risk among those surviving to an age at
risk for dementia. Declining cholesterol levels
from midlife to late life may better predict AD
risk than levels obtained at one time point prior
to dementia onset. Analytic strategies examining
this and other risk factors across the lifespan
may affect interpretation of results.
Mielke et al., Neurology 2010;75:1888-1895
13Attrition, cumulative mortality, cumulative
person years
14Fig 1 - Mean cholesterol levels in the
Prospective Population Study of Women by
examination year and birth cohort
15Table 1 - Characteristics of PPSW participants by
dementia status over 32 years (n = 1,462)
16Mielke et al., Table 2 (embedded spreadsheet; contents not transcribed)
20Case-Control Studies
21General Definition of a Case-Control Study
A method of sampling a population in which cases of
disease are identified and enrolled, and a sample
of non-cases of the population that produced the
cases is identified and enrolled. Exposures are
determined in the same way for individuals in
each group.
22TROHOC ("cohort" spelled backwards) STUDIES
- This disparaging term was given to case-control
studies because their logic seemed backwards and
they seemed more prone to bias than other
designs.
- There is no basis for this derogation.
- Case-control studies are a logical extension of
cohort studies and an efficient way to learn
about associations.
23Introduction
- Hypothetical example: Vitamin D exposure increases the
risk of breast cancer.
- Consider a hypothetical prospective cohort study
of 89,949 women aged 34-59; 1,439 breast cancer
cases identified over 8 years of follow-up
- Blood drawn on all at beginning of follow-up and
frozen
- Exposure: Level of Vit D in blood characterized
as high or low
24[2x2 table: Vit D level (high or low) by breast cancer status; cell counts not transcribed]
25- Practical problem: Quantifying Vit D levels in
the blood is very expensive, so it is not practical
to analyze all 89,949 blood samples
- To be efficient, analyze blood on all cases
(N = 1,439) but just take a sample of the women who
did not get breast cancer, say two times as many
as the cases (N = 2,878)
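26All cases and subsample of controls
[2x2 table: Vit D level (high or low) by breast cancer status, for all cases and the sampled controls; cell counts not transcribed]
These data can be used to estimate the relative
risk.
- Identify cases of disease from a defined
population
- Take a sample of controls from that population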
27When is it a good idea to conduct a case-control
study?
- When exposure data are expensive or difficult to
obtain
- Ex: Vit D study described earlier
- When disease has long induction and latent period
- Ex: Cancer, cardiovascular disease
- When the disease is rare
- Ex: Studying risk factors for birth defects
- When little is known about the disease
- Ex: Early studies of AIDS
- When underlying population is dynamic
- Ex: Studying breast cancer on Cape Cod
28Cases
- Criteria for case definition should lead to
accurate classification of disease
- Efficient and accurate sources should be used to
identify cases, e.g., existing registries,
hospitals
[2x2 table: exposure (exposed or unexposed) by disease status; cell counts not transcribed]
29Cases give you the numerators of the rates of
disease in exposed and unexposed groups being
compared
- Rate of disease in exposed = a/?
- Rate of disease in unexposed = c/?
The denominators are missing. If this were a
cohort study, you would have the total population
(if you were calculating cumulative incidence) or
total person-years (if you were calculating
incidence rates) for both the exposed and
unexposed groups, which would provide the
denominators for the compared rates.
30Where do you get the information for the
denominators in a case-control study? THE
CONTROLS.
- A case-control study can be considered a less
costly, more efficient form of a cohort study.
- Cases are the same as those that would be
included in a cohort study.
- Controls provide a fast and inexpensive means of
obtaining the exposure experience in the
population that gave rise to the cases.
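A small simulation can illustrate why sampled controls can stand in for the missing denominators. It reuses the cohort size and case count from the Vit D example, but the exposure counts are invented for illustration and are not from the slides. The odds ratio computed from all cases plus a random sample of non-cases should land close to the odds ratio computed from the full cohort.

```python
import random

COHORT_SIZE = 89_949        # from the Vit D example
N_CASES = 1_439             # from the Vit D example
N_CONTROLS = 2 * N_CASES    # two controls per case

# Hypothetical exposure counts (assumed for illustration only)
cases_exposed, cases_unexposed = 576, 863
noncases_exposed, noncases_unexposed = 26_553, 61_957
assert cases_exposed + cases_unexposed == N_CASES
assert noncases_exposed + noncases_unexposed == COHORT_SIZE - N_CASES

# Odds ratio if every stored blood sample in the cohort were analyzed
cohort_or = (cases_exposed * noncases_unexposed) / (
    cases_unexposed * noncases_exposed
)

# Case-control version: all cases, plus a random sample of non-cases
rng = random.Random(2010)  # fixed seed so the sketch is reproducible
noncases = [True] * noncases_exposed + [False] * noncases_unexposed
controls = rng.sample(noncases, N_CONTROLS)
b = sum(controls)          # exposed controls
d = N_CONTROLS - b         # unexposed controls
case_control_or = (cases_exposed * d) / (cases_unexposed * b)

print(f"full-cohort odds ratio:  {cohort_or:.2f}")
print(f"case-control odds ratio: {case_control_or:.2f}")
print(f"blood samples analyzed:  {N_CASES + N_CONTROLS:,} of {COHORT_SIZE:,}")
```

The agreement holds because the controls are a random sample of the non-cases, so the ratio of exposed to unexposed controls estimates the same ratio in the population that produced the cases.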
31Controls
- Definition: A subsample of the source population
that gave rise to the cases.
- Ideal: nested case-control study from a cohort
- Purpose: To estimate the exposure distribution in
the source population that produced the cases, for
comparison with the exposure distribution among the cases.
32Selecting Controls
- General population controls
- Existing cohort study
- Most often used when cases are selected from a
defined geographic population
- Sources: random digit dialing, residence lists,
driver's license records
33Selecting Controls
- Advantages of general population or cohort study
controls
- Because of selection process, investigator is
usually assured that they come from the same base
population as the cases.
- In a cohort, exposure measurement is standardized
for both cases and controls
34Selecting Controls
- Disadvantages of general population controls
- Time consuming, expensive, hard to contact and
get cooperation; may remember exposures
differently than cases (recall bias)
35Selecting Controls
- Hospital controls
- Used most often when cases are selected from a
hospital population
Example: Study of cigarette smoking and
myocardial infarction among women. Cases
identified from admissions to hospital coronary
care units. Controls drawn from surgical,
orthopedic, and medical units of same hospital.
Controls included patients with musculoskeletal
and abdominal disease, trauma, and other
non-coronary conditions.
36- Advantages of hospital controls
- Same selection factors that led cases to hospital
led controls to hospital (?)
- Easily identifiable and accessible (so less
expensive than population-based controls)
- Accuracy of exposure recall comparable to that of
cases since controls are also sick (?)
- More willing to participate than population-based
controls
37- Disadvantages of hospital controls
- Since hospital-based controls are ill, they are
unlikely to accurately represent the exposure
history in the population that produced the cases
- Hospital catchment areas may be different for
different diseases
38- What illnesses make good hospital controls?
- Those illnesses that have no relation to the
risk factor(s) under study
- Q: Should respiratory diseases be used as
controls for a study of smoking and myocardial
infarction? Do they represent the distribution of
smoking in the entire population that gave rise
to the cases of MI?
39Selecting Controls
- Special control groups like friends, spouses,
siblings, and deceased individuals.
- These special controls are rarely used.
- Exposures in these controls may not be
independent of cases, e.g., diet in families. What
effect would that have on the estimate?
- Some cases are not able to nominate controls
because they have few appropriate friends, are
widowed, or are only or adopted children.
- Dead controls are tricky to use because they are
more likely than living controls to have smoked
and drunk alcohol.
40Sampling a cohort population for controls: nested
case-control study
- 1. Sample the population at risk at the start of
the observation period
[Timeline: start of follow-up to end of follow-up]
41Sampling a cohort population for controls: nested
case-control study
- 2. Sample population at risk as cases develop
[Timeline: start of follow-up to end of follow-up]
42Sampling a cohort population for controls: nested
case-control study
- 3. Sample survivors at the end of the observation
period
[Timeline: start of follow-up to end of follow-up]
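The three sampling options above can be sketched side by side. This is only an illustration under stated assumptions: the eight-person cohort, the function names, and the choice of two controls per case are all hypothetical. Follow-up runs from time 0 to 8, and each person has an exit time and a flag for whether they exited as a case.

```python
import random

# Hypothetical cohort: id -> (time of exit from follow-up, exited as a case?)
cohort = {
    "p1": (8.0, False), "p2": (3.0, True), "p3": (8.0, False),
    "p4": (5.5, True), "p5": (6.0, False), "p6": (8.0, False),
    "p7": (2.0, False), "p8": (7.0, True),
}

def sample_at_start(cohort, n, rng):
    """Option 1: sample from everyone at risk at the start of follow-up.
    Future cases can be drawn, because all are still at risk at time 0."""
    return rng.sample(list(cohort), n)

def sample_as_cases_develop(cohort, per_case, rng):
    """Option 2: for each case, sample from those still at risk
    (still under follow-up) at the moment that case occurs."""
    matched = {}
    for case_id, (t_case, is_case) in cohort.items():
        if not is_case:
            continue
        at_risk = [pid for pid, (t_exit, _) in cohort.items()
                   if pid != case_id and t_exit > t_case]
        matched[case_id] = rng.sample(at_risk, min(per_case, len(at_risk)))
    return matched

def sample_survivors(cohort, n, rng):
    """Option 3: sample from people who never became cases, identified
    at the end of the observation period. This toy version does not
    additionally require them to still be under follow-up at the end."""
    non_cases = [pid for pid, (_, is_case) in cohort.items() if not is_case]
    return rng.sample(non_cases, n)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
baseline_controls = sample_at_start(cohort, 3, rng)
risk_set_controls = sample_as_cases_develop(cohort, 2, rng)
survivor_controls = sample_survivors(cohort, 3, rng)
print(baseline_controls, risk_set_controls, survivor_controls, sep="\n")
```

Under option 2, a person who later becomes a case is still eligible as a control for an earlier case, because they were at risk at that time.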
43Nested case-control study on Vit D and breast
cancer
- Hypothetical cohort study of 89,949 women; 1,439
breast cancer cases identified over 8 years of
follow-up
- Blood drawn on all 89,949 at beginning of
follow-up and frozen
- Exposure: Level of Vit D in blood characterized
as high or low
44Nested case-control study on Vit D and breast
cancer
Breast Cancer
Vit D
Analyzed blood on all cases (N = 1,439) and a
sample of controls (N = 2,878; 3.3% of non-cases).
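The sampling fraction quoted above follows directly from the cohort numbers. A quick arithmetic check, using only figures given in the example:

```python
cohort_size = 89_949
cases = 1_439
controls = 2 * cases                 # two controls per case

non_cases = cohort_size - cases      # women who did not develop breast cancer
sampling_fraction = controls / non_cases
assays = cases + controls            # blood samples actually analyzed

print(f"non-cases: {non_cases:,}")                         # 88,510
print(f"controls sampled: {sampling_fraction:.1%}")        # 3.3%
print(f"share of cohort assayed: {assays / cohort_size:.1%}")  # 4.8%
```

So the nested design analyzes under 5% of the stored samples while keeping every case.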
45Analysis of case-control studies
[2x2 table, with cell labels as used on the following slides:]
              Cases   Controls
Exposed         a        b
Unexposed       c        d
Because controls are a sample of the population
that produced the cases, size of the total
population may be unknown.
46Analysis of case-control studies
- Two possible outcomes for an exposed person: case
or not. Odds = a/b
- Two possible outcomes for an unexposed person:
case or not. Odds = c/d
- Odds ratio = (odds of an exposed person being a
case) / (odds of an unexposed person being a case)
= (a/b) / (c/d) = ad/bc
- Just like the incidence rate ratio and cumulative
incidence ratio, the odds ratio is a ratio
measure of association.
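The two forms of the odds ratio, (a/b) / (c/d) and the cross-product ad/bc, are algebraically identical. A minimal sketch using exact fractions makes that visible; the counts in the usage line are invented for illustration, and the function name is an assumption.

```python
from fractions import Fraction

def odds_ratio(a: int, b: int, c: int, d: int) -> Fraction:
    """Odds ratio from a 2x2 table.

    a = exposed cases,   b = exposed controls
    c = unexposed cases, d = unexposed controls
    """
    odds_exposed = Fraction(a, b)      # odds of being a case if exposed
    odds_unexposed = Fraction(c, d)    # odds of being a case if unexposed
    ratio = odds_exposed / odds_unexposed
    # The ratio of odds equals the cross-product ratio ad/bc
    assert ratio == Fraction(a * d, b * c)
    return ratio

# Hypothetical counts: 30 exposed cases, 70 exposed controls,
# 10 unexposed cases, 90 unexposed controls
example = odds_ratio(30, 70, 10, 90)
print(example, float(example))  # 27/7, about 3.86
```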
47Analysis of case-control studies
- EXAMPLE: Case-control study of spontaneous
abortion and prior induced abortion (OUTCOME:
spontaneous abortion; EXPOSURE: prior induced
abortion)
48Analysis of case-control studies
- Odds of being a case among the exposed
= 42/247 (a/b)
- Odds of being a case among the unexposed
= 107/825 (c/d)
- Odds ratio = (a/b) / (c/d)
= (42/247) / (107/825)
= 1.31
- Women with a history of induced abortion had a
30% increased risk of having a spontaneous
abortion compared to women who never had an
induced abortion.
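The worked example can be reproduced in a few lines. The slide reports only the point estimate; the confidence interval below is an addition, computed with the standard log-based (Woolf) approximation as one common choice, not necessarily what the original study reported.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI (Woolf's log method).

    a = exposed cases,   b = exposed controls
    c = unexposed cases, d = unexposed controls
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of ln(OR)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Counts from the spontaneous abortion example
estimate, lower, upper = odds_ratio_with_ci(a=42, b=247, c=107, d=825)
print(f"OR = {estimate:.2f}, approx. 95% CI {lower:.2f} to {upper:.2f}")
```

With these counts the approximate interval runs from roughly 0.89 to 1.93, so it includes 1. That is worth keeping in mind when reading the point estimate as an increased risk.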
49Strengths of case-control studies
- Efficient for rare diseases and diseases with
long induction and latent period.
- Can evaluate many risk factors for the same
disease. So, good for diseases about which little
is known.
50Weaknesses of case-control studies
- Inefficient for rare exposures
- Vulnerable to bias because of retrospective
nature of study
- May have poor information on exposure because
retrospective
- Difficult to infer temporal relationship between
exposure and disease
- How do these strengths and weaknesses
compare to cohort studies?
51SUMMARY: Is a cohort study an imitation of a randomized
controlled trial?
- Gold standard (?): randomized, placebo-controlled,
double-blinded study
- Least biased?
- Exposure randomly allocated to subjects:
minimizes selection bias
- Masking of exposure status in subjects and study
staff minimizes information bias
- Selection bias into trials
- Generalizability of trial is not usually good
- Trials often suffer from the same biases as
observational studies
52Bias in prospective cohort studies
- Loss to follow-up
- The major source of bias in cohort studies
- Assume that all do / do not develop outcome?
- Ascertainment and interviewer bias
- Some concern: Knowing exposure may influence how
outcome is determined
- Non-response, refusals
- Little concern: Bias arises only if related to
both exposure and outcome
- Recall bias
- No problem: Exposure determined at time of
enrollment
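The "assume that all do / do not develop outcome?" question can be turned into a simple sensitivity check. The counts below are hypothetical, and the scenario function is only a sketch of the idea: recompute the risk ratio under different assumptions about what happened to participants lost to follow-up.

```python
def risk_ratio(ev_exp, n_exp, ev_unexp, n_unexp):
    """Cumulative incidence ratio: exposed risk over unexposed risk."""
    return (ev_exp / n_exp) / (ev_unexp / n_unexp)

# Hypothetical cohort (all numbers invented for illustration)
exposed = {"enrolled": 1000, "events": 50, "lost": 100}
unexposed = {"enrolled": 1000, "events": 30, "lost": 50}

def scenario(lost_exp_events, lost_unexp_events):
    """Risk ratio if this many of the lost participants had the outcome."""
    return risk_ratio(
        exposed["events"] + lost_exp_events, exposed["enrolled"],
        unexposed["events"] + lost_unexp_events, unexposed["enrolled"],
    )

# Analysis restricted to those with complete follow-up
complete_case = risk_ratio(
    exposed["events"], exposed["enrolled"] - exposed["lost"],
    unexposed["events"], unexposed["enrolled"] - unexposed["lost"],
)

results = {
    "complete cases only": complete_case,
    "no one lost develops outcome": scenario(0, 0),
    "everyone lost develops outcome": scenario(100, 50),
    "only lost exposed develop outcome": scenario(100, 0),
    "only lost unexposed develop outcome": scenario(0, 50),
}
for label, rr in results.items():
    print(f"{label:38s} RR = {rr:.2f}")
```

In this invented example the last two scenarios bracket the estimate widely (0.63 to 5.00), which shows how much differential loss by exposure could matter even when the overall loss is modest.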
53Bias in retrospective cohort and case-control
studies
- Ascertainment bias, participation bias,
interviewer bias
- Exposure and disease have already occurred, so
differential selection / interviewing of
compared groups is possible
- Recall bias
- Cases (or ill) may remember exposures differently
than controls (or healthy)
54- Field methods in epidemiology
55Getting & keeping participants
- Target population
- Study population
- Intervention vs. observational cohorts
- Sources of information on eligible participants
- Human subjects
- Eligibility screening
- Retention
- Recruitment
- http://www.youtube.com/watch?v=104kVHB6nr0&feature=related
- Incentives
- Participant burden
- Selection bias
- Attrition bias
56Would you join the National Children's Study?
57EXAMPLE OF POPULATION SELECTION, RECRUITMENT
AND FOLLOW UP
58Sacramento Area Latino Study on Aging (SALSA)
Location of study population
59SALSA Study Counties: Census Tracts, Percent
Hispanic
Sacramento City
Data Source: U.S. Census Bureau, 2000 Census of
Population and Housing, Summary File 3 Technical
Documentation, 2002.
60SALSA Study Participants
Data Source: U.S. Census Bureau, Geography
Division, Cartographic
Products Management Branch
61Sacramento Area Latino Study on Aging Cohort
Study
- Study Population
- 1,789 Latinos aged 60 and older, primarily of Mexican
ancestry (95%)
- 49% US born and 51% born in Mexico or another Latin
American country
- 58.3% female
- Mean age at baseline 71 (range 60-101)
- 51% Spanish speaking
- Median education 12 years in US born, 4 years in
migrants
- Study period
- Baseline 1998-99
- Annual follow-up through 2008
- Semi-annual phone interviews
- In-home clinical evaluations and interview
- Cognitive testing, clinical assessments
- Socio-demographic factors
- Medical history
- Measured vascular risk factors (blood pressure,
obesity, diabetes)
- Biological samples (DNA, blood, cortisol)
- Mortality
- Dementia
62Sacramento Area Latino Study on Aging (SALSA)
Cohort Study, Baseline and Follow up 1998-2008
12-15 month Home Visits
Fasting blood draw in home visits
Semi-Annual phone calls
Neuropsychological tests, MRIs, and diagnosis of
dementia
Mortality surveillance: ongoing
63Getting & keeping high quality data
- Method: in person, phone, mail
- Interview/questionnaire
- Clinical information
- Biological information
- Data collection protocols
- Pre/pilot testing
- Data management
- Electronic
- Data entry
- Coding
- Review/QC
64Questionnaire design problems
- Survey data appear precise and factual, but are
actually complex estimates
- Some possible threats to accuracy derived from
the questionnaire
- Questions not understood as intended
- Don't adequately capture respondent experience
- Pose an overly challenging response task
- Problems may not be visible in the actual survey
data
- How can we find these before data collection?
65"What, to you, is your abdomen?"
66"In the last year, have you been bothered by pain
in the abdomen?"
- Seems to be straightforward
- But suppose we ask
- "What, to you, is your abdomen?"
- "What does it mean to be bothered by pain in the
abdomen?"
- "What period of time are you thinking about here,
specifically?"
67"In the last year, have you been bothered by pain
in the abdomen?"
- Possible revisions
- Show shaded picture of abdomen
- Drop "bothered"
- Use "In the past 12 months"
- "In the past 12 months, have you had ...?"
- Define pain level, location, persistence
- Clear alternatives address these problems, with
no apparent drawbacks
68Cognitive stages involved in responding to survey
questions
- Comprehension: Respondent interprets the
question
- Retrieval: Respondent searches memory for
relevant information
- Estimation/Judgment: Respondent evaluates and/or
estimates response
- Response: Respondent provides information in the
format requested
69Comprehension task
- "Do you use any assistive devices to help with
mobility, communication, self care, accessing
your workplace?"
- Yes
- No
Source: 1972/74 Social Security
Administration Survey of Disabled and
Non-Disabled Adults
70Comprehension Transcription
- "That's a mouthful of questions. Assistive
device??"
- "Well, I guess we all could be classified. We
use glasses. I guess glasses are assisted devices
and so I guess almost everyone has to say yes to
that and in my case I wear glasses and I also
have hearing aids. That's it. I don't use a cane
or anything."
71"I think a lot of people may have trouble with
that question though because it's kind of stiff
and formal. Like if you would say 'assistive
devices such as' and give some examples, maybe a
person might pick up on it a little bit because
right off, the first thing I think about is a
walker or cane or something, but you are telling
about hearing, communication which includes
hearing and speaking, seeing, all of those are
communication devices. Anyhow."
72Retrieval task
- "How old were you when this (high blood pressure)
was first diagnosed?"
- Source: US/Canada Joint Health Survey
73Retrieval Transcription
- "Huh, shoot, I was in my forties but I don't
remember exactly when."
- "Because you said that it was in 1996."
- "Yeah, I remember this so good because I moved
back here in '95. I know I was in my forties."
74Judgment task
- "Some people who have health conditions,
impairments, or disabilities get help from other
people in order to get around, lift or carry
things, communicate, keep track of things, or
remember things.
- As a result of your compression fracture, or your
hearing or your shoulder, do you require help
from other people?"
- Yes
- No
Source: Disability Statistics Institute
75Judgment Transcription
- "I hate to say 'require', but of course I did
require it when that compression fracture
happened first, but now I'm back doing everything
myself, and I do have the neighbors come in and I
have my lawn mowed instead of mowing it and I have
my gardening done instead of doing it and so
forth, so I guess I required help."
76Response task
- "Does a physical condition or mental health
problem reduce the amount or the kind of activity
you can do at home?"
- Yes, sometimes
- Yes, often
- No
Source: Canadian Cooperative Survey
77Response Transcription
- "Hmm, there's something I can't do, so hmmm, and
there's things that I can but with difficulty or
with aid, hmm, so actually I don't have an answer
to that. I guess I would be, if I have one, always. I
guess we fall in the always."
- "You really would like an 'always' category?"
- "Yes."
78Summary of field operations
- Recruitment issues, retention issues
- Pre/pilot test all protocols
- Establish standardized protocol that is monitored
throughout the study for quality