Title: Ecologic vs. Individual Studies
1Ecologic vs. Individual Studies
Dan Wartenberg, PhD
Chief, Division of Environmental Epidemiology
Department of Environmental and Occupational Medicine
UMDNJ-Robert Wood Johnson Medical School, Piscataway, NJ 08854 USA
W. Douglas Thompson, PhD
Chair, Department of Applied Medical Sciences
School of Applied Science, Engineering and Technology
University of Southern Maine, Portland, ME 04104 USA
Research support: U19/EH000102 from NCEH, CDC, for UMDNJ APEX; P30ES005022 from NIEHS, for UMDNJ CEED
2Context of this Presentation
- SAHSU (UK)
- 20 years of innovation in the assessment of the risk to the health of the population of exposure to environmental factors, with an emphasis on the use and interpretation of routine health statistics
- EPHT (USA)
- 5 years of building a nationwide network of integrated environmental monitoring and public health data systems so that all sectors may take action to prevent and control environmentally related health effects
3Our EPHT Goals
- Develop and maintain population data
- Health
- Environment
- Sociodemographic context
- Adapt, enhance, develop, and validate analytic methods, as needed
- Implement and demonstrate relevance for public health
4Outline
- What are ecologic studies?
- Why do we conduct ecologic studies?
- What are some limitations of ecologic studies?
- What are some strengths and successes of ecologic studies?
- What are some of the newer approaches for conducting ecologic studies?
5Defining Terminology
- What are ecologic data?
- Individual data that are grouped or aggregated (e.g., averaged) for several individuals
- What are ecologic analyses?
- What are ecologic studies?
6Some Types of Ecologic Data
- Generic
- Indicators
- Summary measures that capture multiple aspects of a situation (e.g., low birthweight, extent of wetlands, concentration of criteria pollutants, beach closings)
- Exposure
- Substantive aggregation
- Measure of effective exposure (e.g., dioxin congeners)
- Regional measurement
- May be due to monitoring or reporting system
- Regional exposure (e.g., pesticide use, taxes, BRFSS)
- Temporal exposure (e.g., average annual, 24-hour average)
- Sampling limitations (e.g., ambient air monitors)
- Outcome
- May be aggregated to protect privacy (e.g., registries, surveys)
- Note: because of the collection process, routinely collected data tend not to be as reliable as data collected for a specific study
7Defining Terminology
- What are ecologic data?
- Individual data that are grouped or aggregated (e.g., averaged) for several individuals
- What are ecologic analyses?
- Analyses by group rather than individual
- Semi-ecologic studies may use individual outcome data with grouped exposure data
- Semi-individual (Kunzli and Tager 1997)
- Partially-ecologic (Morgenstern 1998)
- What are ecologic studies?
8Consequences of Aggregating
- Averages out within group variation
- Removes all joint distribution information
- (e.g., is it the smokers who got lung cancer)
- Often called The Ecologic Fallacy
- Random misclassification can lead to bias AWAY
FROM the null
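The loss of joint distribution information can be sketched with a small calculation. The example below is illustrative only: the group size and counts are hypothetical, and the function names are assumptions introduced here. Given only a group's margins (how many people are exposed, how many are cases), it computes the range of exposed-case counts that those margins allow and the risk ratio each extreme would imply.

```python
def exposed_case_bounds(n_people, n_exposed, n_cases):
    """Smallest and largest number of exposed cases compatible with the margins."""
    lowest = max(0, n_exposed + n_cases - n_people)
    highest = min(n_exposed, n_cases)
    return lowest, highest


def risk_ratio(n_people, n_exposed, n_cases, exposed_cases):
    """Individual-level risk ratio for one particular filling of the 2x2 table."""
    risk_exposed = exposed_cases / n_exposed
    risk_unexposed = (n_cases - exposed_cases) / (n_people - n_exposed)
    if risk_unexposed == 0:
        return float("inf")
    return risk_exposed / risk_unexposed


# Hypothetical group: 1000 people, 300 smokers, 50 lung cancer cases.
N_PEOPLE, N_EXPOSED, N_CASES = 1000, 300, 50
low, high = exposed_case_bounds(N_PEOPLE, N_EXPOSED, N_CASES)
for exposed_cases in (low, 15, high):
    rr = risk_ratio(N_PEOPLE, N_EXPOSED, N_CASES, exposed_cases)
    print(f"exposed cases = {exposed_cases:2d}  risk ratio = {rr}")
```

With these hypothetical margins, every count from no exposed cases to all cases being exposed is possible, so the aggregate data alone are compatible with any individual-level risk ratio from zero upward.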
9An Example
10Example Possible Values
11Alternative methods for analysis of racial
effects on low birth weight in the 21 counties
of New Jersey, 2003
12Defining Terminology
- What are ecologic data?
- Individual data that are grouped or aggregated (e.g., averaged) for several individuals
- What are ecologic analyses?
- Analyses by group rather than individual
- Semi-ecologic studies may use individual outcome data with grouped exposure data
- What are ecologic studies?
- Studies that use ecologic data
13Where Have We Come From
- Air pollution epidemiology
- Severe episodes
- Meuse Valley, Donora PA, London Fog
- Time series studies
- Philadelphia daily mortality
- Regional comparisons
- Six Cities Study, ACS
- Migrant study (e.g., diet, cancer mortality)
- Do transplanted populations acquire disease rates of local populations?
- Convenient sample, Integrated exposure
- Descriptive or Analytic
- 1975 NCI Cancer Mortality Atlas
- Found new and confirmed known etiologies
- Validation slow, partially successful but fruitful
- Both occupational and environmental risks
Stomach Cancer
14Map-Based Correlational Studies
- Various historical efforts
- New impetus triggered by
- NCI Atlas (1970s)
- Compared mortality maps to possible exposures
- Then validated with traditional epidemiology
- Bladder cancer and chemical manufacturing
- Nasal adenocarcinoma and furniture manufacturing
- Lung cancer and shipyards
- Oral cancers among women and snuff use
- Despite the Bad Press these can be useful
- Must be careful of limitations of ecologic
analysis
15Why Do We Do Ecologic Studies?
Some motivations for ecologic studies
- Exposure data are not available at the individual level
- Access may be limited to protect confidentiality
- Information is available on the distribution of exposures within each of a series of geographically defined units (e.g., census blocks, municipalities, counties, states)
- Characterizing the spatial distribution of disease is not the focus of these analyses
- Interest is in effects of exposure on disease in individuals (i.e., exposure-health linkage)
16Goal Exposure Etiology
Consider Cancer Incidence
- Ideal Case
- Know the amount (molecules) of the relevant toxicant that enters a susceptible cell and causes a change in DNA leading to disease over a lifetime
- Realistic Case
- Have a crude estimate of ambient or self-reported exposure to the toxicant, for a limited amount of time, through limited exposure routes, and some measure of disease occurrence
17Missing Information
- Space-Time Trajectory
- Full residential/work/travel history
- Complete list of exposures for each location at each time over lifetime
- Risk Factor History
- Diet
- Behavior (drinking, pharmaceuticals)
- Occupation/Hobbies
18Some Major Environmental Health Concerns
Where have we been ... and where do we want to go?
- Exposures often characterized by aggregate measures
- Air quality
- Drinking water quality
- Lead
- Ionizing radiation
- Magnetic fields
- Climate change
- Exposures often characterized for individuals
- Radon
- Cell phones
- Pesticides
19In the Omics Era, Why Work at the Population Level?
EPA's Environmental Public Health Continuum
Study Designs
Ecologic Study
Individual Study
Laboratory Study
Modified from EPA RFA
20Why have Ecologic Studies been given a bad name?
- Misapplication
- Rarely preferable when individual data are available
- Misuse
- Failure to consider bias, confounding, effect measure modification (similar to individual-based studies)
- Misinterpretation
- Lack of familiarity with methodology may lead users to overinterpret or extrapolate results
- Lack of joint distribution of exposure and outcome when both are aggregate
- But, usually blamed on aggregate data
21Many Limitations Common to Both Individual and
Aggregate Approaches
- Potential Problems
- Bias
- Confounding
- Effect Measure Modification
- Misclassification
- effects differ due to aggregation of risk factors
- Disease Latency
- Accurate exposure data
- lack of residential history, migration information
- Recommendations for Validation
- Exposure
- Compare results using related risk factors or surrogates
- Disease
- Compare results using uncorrelated outcome
22Weaknesses of Aggregate Analysis
- Cross-Level Effects
- Confounding
- Effect measure modification
- Absence of joint distribution information
- Misclassification can bias effect estimates AWAY
from null
23Strengths of Aggregate Analysis
- Enables analysis of large populations
- Not easily collectable
- Facilitates study of relatively small risks
- Can assess public health impact of an intervention
- Can be conducted easily and inexpensively with routinely collected databases (surveillance)
24Our Research
- Additive vs. Multiplicative Models
- Evaluation of estimation vs. hypothesis testing
- Development of aggregation recommendations
- Number of units vs. size of units
- Necessary variation across units
- Assess effects of bias and confounding
25Some results
- Additive models show smaller bias than multiplicative models (see figure)
- Exposure measurement error biases estimates towards the null
- Random misclassification biases estimates away from the null
- Statistical inference is valid in spite of biased estimates
- Useful for hypothesis generation and prioritization
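The direction of the misclassification bias can be checked with a small calculation. This is a minimal sketch under stated assumptions, not the authors' analysis: the rates, sensitivity, specificity, and area prevalences are hypothetical, and the helper names are introduced here. Each area's disease rate is a linear function of its true exposure prevalence. When the binary exposure is misclassified nondifferentially, the observed prevalence becomes Se*p + (1 - Sp)*(1 - p), which compresses the spread of prevalence across areas by the factor Se + Sp - 1. Regressing the unchanged rates on the compressed prevalences steepens the slope.

```python
def ols_fit(x, y):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def implied_rate_ratio(slope, intercept):
    """Fitted rate at prevalence 1 divided by fitted rate at prevalence 0."""
    return (intercept + slope) / intercept


# Hypothetical inputs (assumptions for illustration only).
RATE_UNEXPOSED = 0.010   # 10 per 1000 per year
RATE_EXPOSED = 0.030     # 30 per 1000 per year
SENSITIVITY = 0.8
SPECIFICITY = 0.9
true_prevalence = [0.1, 0.2, 0.3, 0.4, 0.5]

# Area rates depend on the TRUE prevalence of exposure.
area_rates = [RATE_UNEXPOSED + (RATE_EXPOSED - RATE_UNEXPOSED) * p
              for p in true_prevalence]
# The analyst only sees prevalence based on the misclassified exposure.
observed_prevalence = [SENSITIVITY * p + (1 - SPECIFICITY) * (1 - p)
                       for p in true_prevalence]

rr_true = implied_rate_ratio(*ols_fit(true_prevalence, area_rates))
rr_observed = implied_rate_ratio(*ols_fit(observed_prevalence, area_rates))
print(f"rate ratio using true prevalence:          {rr_true:.2f}")
print(f"rate ratio using misclassified prevalence: {rr_observed:.2f}")
```

Under these assumptions the true rate ratio of 3 is recovered from the true prevalences, while the misclassified prevalences give a rate ratio of about 5, that is, further from the null. The sketch covers only misclassification of a binary exposure; it does not model sampling error in the estimated prevalences.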
26Example of bias when log-linear modeling is employed for aggregate-level analysis
Two geographic areas with a disease rate of 30 per 1000 per year in the exposed and 10 per 1000 per year in the unexposed
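A rough numerical version of this example can be sketched as follows. The rates of 30 and 10 per 1000 per year come from the slide; the exposure prevalences in the two areas are not given there, so the values used below are hypothetical, as are the function names. With two areas, each model fits the two points exactly, so the fitted coefficients can be computed directly.

```python
import math

RATE_UNEXPOSED = 10.0  # per 1000 per year (from the slide)
RATE_EXPOSED = 30.0    # per 1000 per year (from the slide)


def area_rate(prevalence):
    """Overall rate in an area where a given fraction of people is exposed."""
    return RATE_UNEXPOSED * (1 - prevalence) + RATE_EXPOSED * prevalence


def linear_model_rr(prev_a, prev_b):
    """Rate ratio implied by a straight line through the two area rates."""
    rate_a, rate_b = area_rate(prev_a), area_rate(prev_b)
    slope = (rate_b - rate_a) / (prev_b - prev_a)
    intercept = rate_a - slope * prev_a
    return (intercept + slope) / intercept


def loglinear_model_rr(prev_a, prev_b):
    """exp(coefficient) from a line through the two log area rates."""
    log_a, log_b = math.log(area_rate(prev_a)), math.log(area_rate(prev_b))
    return math.exp((log_b - log_a) / (prev_b - prev_a))


# Hypothetical pairs of exposure prevalences for the two areas.
for prev_a, prev_b in [(0.1, 0.3), (0.2, 0.6), (0.7, 0.9)]:
    print(f"prevalences {prev_a}, {prev_b}: "
          f"linear RR = {linear_model_rr(prev_a, prev_b):.2f}, "
          f"log-linear RR = {loglinear_model_rr(prev_a, prev_b):.2f}")
```

In this sketch the linear (additive) model returns the true rate ratio of 3 for any pair of prevalences, because area rates are exactly linear in prevalence. The log-linear estimate misses 3, and both the size and the direction of the error depend on the prevalences assumed.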
27Two important differences between individual-level and aggregate-level studies
[Table. Columns: sampling error on the exposure variable; misclassification of exposure. Rows: study using individual-level information on exposure; study using aggregate-level information on exposure.]
28Choice of Unit of Aggregation
- Countries are particularly problematic because they differ from one another in so many ways (uncontrolled ecologic and individual-level confounding is likely to bias the results)
- The more that the exposure varies across the units, the greater will be the statistical precision of the estimates
- The larger the number of units available for analysis, the greater will be the statistical precision of the estimates
- The larger the number of individuals in each unit, the greater will be the statistical precision of the estimates
- There need to be sufficient numbers of units to permit control of potential confounding variables that operate at the individual level
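The three precision statements above can be illustrated with the textbook standard error of a regression slope. This is a simplified sketch, assuming an unweighted linear regression of unit rates on unit exposure prevalence with a common residual standard deviation, and a binomial approximation for the noise in each unit's rate. The prevalences, rate, and function names are hypothetical.

```python
import math


def slope_standard_error(exposures, residual_sd):
    """SE of an OLS slope: residual SD divided by sqrt of the exposure sum of squares."""
    mean = sum(exposures) / len(exposures)
    sum_sq = sum((x - mean) ** 2 for x in exposures)
    return residual_sd / math.sqrt(sum_sq)


def rate_sd(rate, n_people):
    """Binomial approximation to the sampling SD of a unit's observed rate."""
    return math.sqrt(rate * (1 - rate) / n_people)


# Hypothetical exposure prevalences across units.
narrow = [0.28, 0.29, 0.30, 0.31, 0.32]           # little variation in exposure
wide = [0.10, 0.20, 0.30, 0.40, 0.50]             # more variation in exposure
many = [0.10 + 0.40 * i / 9 for i in range(10)]   # same range, twice as many units

noise = rate_sd(0.02, 1000)  # 1000 people per unit, rate of 20 per 1000
print("narrow exposure range:", slope_standard_error(narrow, noise))
print("wide exposure range:  ", slope_standard_error(wide, noise))
print("more units:           ", slope_standard_error(many, noise))
print("larger units (10000): ", slope_standard_error(wide, rate_sd(0.02, 10000)))
```

Each change (wider spread of exposure, more units, more people per unit) lowers the standard error in this sketch. It says nothing about confounding, which is the concern behind the first and last points on the slide.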
29Summary
- Ecologic studies involve assigning to individuals information concerning the aggregate-level distribution of the exposure (risk factor)
- This approach is used when individual-level information on exposure is not readily available
- Ecologic studies must generally be regarded as preliminary or as useful for generating hypotheses
- A crucial requirement in an ecologic study is that there be substantial variation in exposure across the ecologic units
- If the information on the distribution of exposure is imprecise, then the estimate of the exposure-disease association will be biased toward the null value in an ecologic analysis
30Summary (continued)
- Misclassification of exposure has the effect of biasing the estimate away from the null value in an ecologic analysis
- Required assumptions regarding the absence of confounding and interaction due to geographic area are relatively weak, provided that aggregate-level data on the distribution of pertinent covariates are available
- Frequent lack of information on confounding variables is perhaps the most important limitation on the usefulness of ecologic analysis
- Sufficient provision for interactive effects due to variables other than geographic area requires information on the joint distribution of exposure and covariates; such information is often lacking in practice
31Summary (continued)
- Because of the possibility of bias away from the null due to misclassification in ecologic studies, more emphasis might be properly placed on hypothesis testing in ecologic studies than in individual-level studies; such emphasis is appropriate for hypothesis-generating studies
- A key element in EPHT is surveillance for possible associations warranting more formal studies; therefore, ecologic analysis is an important tool for EPHT
32Analytic Approaches
- Regression
- Logistic
- Linear (binomial)
- Hierarchical Models
- Two Phase Studies