Title: Study Design in Molecular Epidemiology of Cancer
1Study Design in Molecular Epidemiology of Cancer
- Epi243
- Zuo-Feng Zhang, MD, PhD
2Objectives of Molecular Epidemiology
- To gain knowledge about the distribution and
determinants of disease occurrence and outcome
that may be applied to reduce the frequency and
impact of disease in human populations.
7Epidemiological Study Design and Analysis
- Transitional studies provide a bridge between the
use of biomarkers in laboratory experiments and
their use in cancer epidemiological studies.
- They are employed to characterize biomarkers.
- They address problems in the use of biomarkers.
- They serve as preliminary results rather than end
results about cancer etiology and prevention.
8Epidemiological Study Design and Analysis
- Transitional studies
- Measure intra- and inter-subject variability
- Explore the feasibility of marker use in field
conditions
- Identify potential confounding and
effect-modifying factors for the marker
- Study mechanisms reflected by the biomarker
9Transitional Studies
- Transitional studies can be divided into three
functional categories:
- Developmental
- Characterization
- Applied
10Transitional Studies: Developmental Studies
- Developmental studies involve determining:
- the biological relevance of the marker
- its pharmacokinetics
- the reproducibility of measurement of the marker
- the optimal conditions for collecting,
processing, and storing biological specimens in
which the marker is to be measured
11Transitional Studies: Characterization
- Assessing inter-individual variation and the
genetic and acquired factors that influence the
variation of biomarkers in populations
12Transitional Studies: Characterization
- Assessing the frequency or level of a marker in
populations
- Identifying factors that are potential
confounders or effect modifiers
13Transitional Studies: Characterization
- Establishing the components of variance in
biomarker measurement: laboratory variability,
intra-individual variation, and inter-individual
variation.
- The ratio of intra-individual variation to
inter-individual variation has important
implications for study size and power.
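As a rough illustration of the variance components described here, the following Python sketch estimates within-subject and between-subject variance from repeated biomarker measurements using a balanced one-way random-effects ANOVA, then reports the intraclass correlation (ICC). The data values and the function name are hypothetical, and laboratory variability is lumped into the within-subject term because the sketch has no separate replicate assays.

```python
from statistics import mean

def variance_components(measurements):
    """Balanced one-way random-effects ANOVA estimates.

    measurements: list of per-subject lists, each holding the same number k
    of repeat measurements. Returns (within_var, between_var, icc).
    """
    n = len(measurements)
    k = len(measurements[0])
    grand = mean(x for subj in measurements for x in subj)
    subj_means = [mean(subj) for subj in measurements]
    # Mean square between subjects and mean square within subjects
    msb = k * sum((m - grand) ** 2 for m in subj_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for subj, m in zip(measurements, subj_means) for x in subj
    ) / (n * (k - 1))
    within = msw
    between = max((msb - msw) / k, 0.0)  # truncate negative estimates at zero
    icc = between / (between + within)
    return within, between, icc

# Hypothetical data: four subjects, two repeat measurements each
data = [[10.0, 12.0], [20.0, 18.0], [30.0, 32.0], [40.0, 38.0]]
within, between, icc = variance_components(data)
ratio_within_to_between = within / between
```

A small within-to-between ratio (ICC near 1) suggests a single specimen ranks subjects well; a large ratio suggests that single measurements are noisy and a larger study or repeat sampling may be needed.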
14Transitional Studies: Applied Studies
- Applied studies assess the relationship
between a marker and the event that it marks,
including exposure, pre-clinical effects,
disease, and susceptibility.
- The design is usually cross-sectional or
short-term longitudinal and is not intended to
establish or refute a causal relationship between
a given exposure and disease.
15Transitional Studies: Ethical Issues
- The objectives of the research generally are not
to identify health risks, but to identify
characteristics of the biomarker or the
distribution of the marker by population
subtypes.
- The meaning of the biomarker results is usually
unknown.
- There is a need to anticipate the impact of
transitional studies on study subjects and plan
to address their concerns.
16Cohort or Case-Control Studies
- In clinic-based cohort studies of treated
patients or screened populations, the inclusion
of biological measures of exposure and
susceptibility is both methodologically sound and
logistically feasible.
17Cohort or Case-Control Studies
- In population-based studies, the collection of
biological material for such markers is feasible
but logistically more complex.
- For early biological markers, collection of
materials (e.g., pre-cancerous lesions) is
logistically feasible in a hospital setting, but
becomes more difficult in the population setting.
18Prospective Studies: Strengths
- Exposure is measured before the outcome
- The source population is defined
- The participation rate is high if specimens are
available for all subjects and follow-up is
complete
19Prospective Studies: Weaknesses
- The usually small number of cases of each of the
many types of cancer
- The lack of specimens if the biomarker requires
large amounts of specimen or unusual specimens
- Degradation of the biomarkers during long-term
storage
- The lack of details on other potentially
confounding or interacting exposures
20Prospective Studies
- The major concern in cohort studies of short
duration (as in case-control studies) is the
possibility that the disease process has
influenced the biomarker level among cases
diagnosed within 1 to 2 years of the specimen
being collected.
21Prospective Studies: Misclassification
- In prospective studies of longer duration, there
may be considerable misclassification of the
etiologically relevant exposures if the specimens
have been collected only at baseline.
- This misclassification occurs when individuals'
exposure levels change systematically over
time and when there is intra-individual variation
in biomarker level.
22Prospective Studies: Intra-Individual Variation
- Misclassification due to intra-individual
variation may be reduced by taking multiple
samples, but this will generally increase the
expense of sample collection and storage and the
burden on study subjects.
- The same applies to taking samples at
several points in time in an attempt to estimate
time-integrated exposures or exposure change.
23Prospective Studies
- An alternative approach is to estimate the extent
of intra-individual variation, and the
misclassification involved in taking single
specimens, by taking multiple specimens in a
sample of the cohort.
- If the misclassification is non-differential,
this information can be used to correct for the
bias toward the null that it introduces, and
thereby de-attenuate observed relative risks.
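One common way to carry out such a correction can be sketched as follows. The sketch assumes classical, non-differential measurement error and a log-linear exposure-disease relation, so that the observed log relative risk is roughly the true log relative risk multiplied by a reliability coefficient estimated from the repeat-specimen subsample. The function name and the input values are hypothetical; the slides do not specify a particular correction method.

```python
import math

def deattenuate_rr(rr_observed, reliability):
    """Correct an observed relative risk for regression dilution.

    Assumes log(RR_observed) ~= reliability * log(RR_true), where
    reliability (0 < reliability <= 1) is the share of total biomarker
    variance that is between-subject variance.
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in (0, 1]")
    return math.exp(math.log(rr_observed) / reliability)

# Hypothetical inputs: observed RR of 1.5, reliability of 0.6
rr_corrected = deattenuate_rr(1.5, 0.6)
```

With perfect reliability (1.0) the estimate is unchanged; the lower the reliability, the further the corrected estimate moves away from the null.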
24Prospective Studies: Ethical Issues
- Repeated contact with subjects
- Informing cohort members of their biomarker
level is problematic if the biomarker is not
considered to be sufficiently predictive of
disease and if there are no preventive steps
cohort members can take to reduce their risk of
the disease
25Nested Case-Control Study
- The biomarker can be measured in specimens
matched on storage duration.
- The case-control set can be analyzed in the same
laboratory batch, reducing the potential for bias
introduced by sample degradation and laboratory
drift.
26Case-Cohort Study Design
- Specimens are collected at baseline for the
entire cohort, and then from cases as they occur.
- The biomarker is measured in the newly collected
case specimens, with the baseline cohort specimens
used as controls.
- Because the specimens for cases and controls are
taken at different times, bias will be introduced
if sample degradation or lab drift occurs over time.
27Case-Control Study Design
- For genetic susceptibility markers, the case-control
study design is highly appropriate.
- Clinic-based case-control studies are
particularly suitable for studies of intermediate
endpoints, as these endpoints can be
systematically measured.
- Clinic-based case-control studies are excellent
for studying the etiology of precancerous lesions
(e.g., CIN).
28Case-Control Study Design
- Biomarkers of internal dose (e.g., carrier status
for infectious agents, such as HBsAg) or
effective dose (e.g., PAH-DNA adducts) are appropriate
when they are stable over a long period of time
or when the exposures have been constant over
the exposure period. However, it is essential that
they are not affected by the disease process,
diagnosis, or treatment.
29The Case-Case Design
- Applications in Tumor Markers and Genetic
Polymorphisms Studies
30Case-Case Study Design
- To identify etiological heterogeneity
- To evaluate gene-environment interaction
31Case-Case Study Design
- Also called case-only or case-series studies
- Studies of cases without controls
- Can be employed to evaluate etiological
heterogeneity when studying tumor markers and
exposure
- May be used to assess statistical
gene-environment or gene-gene interactions
32Interaction Assessment using Case-Control Study
- Genotype abnormal: OR1
- Genotype normal: OR2
- Interaction measure = OR1/OR2
- where OR2 = OR01
- and OR1 = OR11/OR10
- OR interaction = OR11/(OR10 x OR01)
33Comparison of Case-Control and Case-Case Study
designs
Parameter          Case-control                   Case-case
Beta(01)           OR01                           Not measured
Beta(10)           OR10                           Not measured
Beta interaction   ORint = OR11/(OR01 x OR10)     Measured
Beta(11)           OR11 = OR01 x OR10 x ORint     Not measured
34Assumptions for Case-Case Study Design
- Exposure and genotype occur independently in the
population.
- The risk of disease is small (the disease is
rare) at all levels of the study variables.
35Smoking and TGF-alpha Polymorphism
Smoking   TGF-alpha   Cases      Controls    Adjusted OR
Never     Normal      36 (A00)   167 (B00)   1.0 (OR00)
Never     Positive     7 (A01)    34 (B01)   1.0 (OR01)
Yes       Normal      13 (A10)    69 (B10)   0.9 (OR10)
Yes       Positive    13 (A11)    11 (B11)   5.5 (OR11)
36OR int = OR11/(OR01 x OR10) = 5.5/(1.0 x 0.9) = 6.1
OR CA = (A11 x A00)/(A10 x A01) = (13 x 36)/(13 x 7) = 5.1
37OR int = OR CA/OR CO = OR11/(OR01 x OR10)
OR11 = (A11 x B00)/(A00 x B11)
OR CA = [OR11/(OR01 x OR10)] x OR CO
Assumption: OR CO = 1, so OR int = OR CA
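The identity OR int = OR CA / OR CO can be checked numerically with the cell counts from the smoking and TGF-alpha table. The Python sketch below uses crude (unadjusted) cross-product odds ratios, so its interaction estimate will differ somewhat from the 6.1 obtained from the adjusted odds ratios; the helper names are illustrative only.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio (a*d)/(b*c) for a 2x2 table."""
    return (a * d) / (b * c)

# Cell counts keyed by (smoking, genotype): 0 = never/normal, 1 = yes/positive
cases = {(0, 0): 36, (0, 1): 7, (1, 0): 13, (1, 1): 13}        # A cells
controls = {(0, 0): 167, (0, 1): 34, (1, 0): 69, (1, 1): 11}   # B cells

def crude_or(i, j):
    """Crude OR for cell (i, j) versus the doubly unexposed cell (0, 0)."""
    return odds_ratio(cases[(i, j)], controls[(i, j)],
                      cases[(0, 0)], controls[(0, 0)])

# Case-control estimate of the multiplicative interaction
or_int = crude_or(1, 1) / (crude_or(0, 1) * crude_or(1, 0))

# Case-only (OR CA) and control-only (OR CO) exposure-genotype odds ratios
or_ca = odds_ratio(cases[(1, 1)], cases[(1, 0)], cases[(0, 1)], cases[(0, 0)])
or_co = odds_ratio(controls[(1, 1)], controls[(1, 0)],
                   controls[(0, 1)], controls[(0, 0)])

print(f"OR CA = {or_ca:.2f}, OR CO = {or_co:.2f}, crude OR int = {or_int:.2f}")
```

In these data the control-only odds ratio is below 1 rather than exactly 1, which is why the case-only estimate (about 5.1) and the case-control interaction estimate do not coincide.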
38Sample Size
Design         Main effect (RR = 2.0)     Interaction (RR = 2.0)
Case-control   150 cases, 150 controls    600 cases, 600 controls
Case-case      -                          300 cases
39Strengths of Case-Case Study Design
- The case-case study design offers greater precision
for estimating gene-environment interaction than
the case-control study design.
- The power for detecting gene-environment
interactions in a case-case study is comparable to
the power for assessing a main effect in a
classic case-control study, which reduces the
sample size needed for interaction assessment.
40Strengths of Case-Case Study Design
- Only cases are needed, thus avoiding the
difficult and often unsatisfactory selection of
appropriate controls (avoiding selection bias in
controls)
41Limitations of Case-Case Study Design
- The main effects of susceptible genotype (G) and
environment effect (E) cannot be estimated.
- The case-case study will miss gene-environment
models with departures from additivity.
42Intervention Studies
- In studies of smoking cessation interventions, we
can measure serum cotinine or protein or
DNA adducts (exposure markers), or p53 mutation,
dysplasia, and cell proliferation (intermediate
markers of disease).
- Biomarkers can also measure compliance with the
intervention, for example by assaying serum
beta-carotene in a randomized trial of
beta-carotene.
43Intervention Studies
- Susceptibility markers (GSTM1) can also be used
to determine whether the randomization is
successful (comparable intervention and control
arms)
44Family Studies
- Does familial aggregation exist for a specific
disease or characteristic?
- Is the aggregation due to genetic factors or
environmental factors, or both?
- If a genetic component exists, how many genes are
involved and what is their mode of inheritance?
- What is the physical location of these genes and
what is their function?
45Issues in Study Design and Analysis
- Relating a particular disease (or marker of early
effect) to a particular exposure while:
- minimizing bias
- controlling for confounding
- assessing and minimizing random error
- assessing interactions
46Sample Size and Power Consideration
- EPI243 Molecular Epidemiology of Cancer
47Sample Size and Power
- False positive (alpha level, or Type I error).
The alpha levels traditionally used and accepted
are 0.01 or 0.05. The smaller the alpha level,
the larger the required sample size.
48Sample Size and Power
- False negative (beta level, or Type II error).
(1 - beta) is called the power of the study.
Investigators like to have a power of around 0.80
or 0.95 when planning a study, which means that
they have an 80% or 95% chance of finding a
statistically significant difference between
study and control groups if such a difference
exists.
49Sample Size and Power
- The difference between study and control groups
(delta). Two factors need to be considered here:
one is what difference is clinically important,
and the other is what difference has been
reported by previous studies.
50Sample Size and Power
- Variability. The greater the variability of the
data, the larger the required sample size.
51Power or Sample Size Estimate for Case-Control
Studies
- Alpha level (false positive): 0.05
- Beta level (false negative; 1 - beta = power):
0.20
- Delta level: proportion exposed among controls
and among cases, or the expected odds ratio
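These inputs can be combined in a sample size calculation. The Python sketch below uses a standard normal-approximation formula for an unmatched case-control study with equal numbers of cases and controls and no continuity correction; it is one of several formulas in use and is not necessarily the one behind the course's own tables. The function names and the 30% exposure prevalence are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def exposure_in_cases(p0, odds_ratio):
    """Exposure proportion in cases implied by p0 (controls) and the OR."""
    return p0 * odds_ratio / (1 + p0 * (odds_ratio - 1))

def cases_needed(p0, odds_ratio, alpha=0.05, power=0.80):
    """Cases required (with an equal number of controls), two-sided test."""
    p1 = exposure_in_cases(p0, odds_ratio)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)

# Hypothetical scenario: 30% of controls exposed, expected odds ratio of 2.0
n_cases = cases_needed(p0=0.30, odds_ratio=2.0)
print(n_cases, "cases and", n_cases, "controls")
```

Raising the power, lowering alpha, or targeting a smaller odds ratio each increases the number returned.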
52Power Estimate
63Sample Size Estimate
67Estimate Minimum Detectable Odds Ratios
71Gene-Environment (Gene-Gene) Interaction
- EPI242 Molecular Epidemiology
- Zuo-Feng Zhang, MD, PhD
72Definition for Interaction
- Interaction (effect modification) occurs when the
estimate of the effect of exposure depends on the
level of another factor in the study base.
- Interaction is distinct from confounding (or
selection or information bias); it reflects a real
difference in the effect of exposure in various
subgroups that may be of considerable interest.
73Interaction Assessment
                    Factor A
                    Absent   Present
Factor B  Absent    RR00     RR01
          Present   RR10     RR11
74Interaction Assessment
- RR00, relative risk when both factors absent
- RR01, relative risk when factor A present only
- RR10, relative risk when factor B present only
- RR11, relative risk when both factors A and B
present
75Interaction Assessment
- Combined RR = RR11
- RR11 > RR01 x RR10 indicates more than
multiplicative interaction
- or RR11/RR10 > or < RR01/RR00
- or RR11/(RR01 x RR10) > or < 1
- Interaction RR = RR11/(RR01 x RR10)
76Odds Ratios for Two Factors: Interaction?
                    Factor B
                    Absent   Present
Factor A  Absent    1.0      2.5
          Present   4.0      10.0
77No More than Multiplicative Interaction
- OR for factor B: 2.5 when factor A is absent and
2.5 (10.0/4.0) when factor A is present
- OR for factor A: 4.0 when factor B is absent and
4.0 (10.0/2.5) when factor B is present
78Odds Ratios for Two Factors: Interaction?
                    Factor B
                    Absent   Present
Factor A  Absent    1.0      2.5
          Present   4.0      20.0
79More than Multiplicative Interaction, Positive
Quantitative Interaction
- OR for factor B: 2.5 when factor A is absent and
5.0 (20.0/4.0) when factor A is present
- OR for factor A: 4.0 when factor B is absent and
8.0 (20.0/2.5) when factor B is present
80Odds Ratios for Two Factors: Interaction?
                    Factor B
                    Absent   Present
Factor A  Absent    1.0      2.5
          Present   4.0      5.0
81Less than Multiplicative Interaction, Negative
Quantitative Interaction
- Both factors increase the risk regardless of the
value of the other factor, but the combined
effect is less than the product of the two,
although greater than that of either factor
alone, giving a negative quantitative
interaction.
82Odds Ratios for Two Factors: Interaction?
                    Factor B
                    Absent   Present
Factor A  Absent    1.0      2.5
          Present   4.0      4.0
83Less than Multiplicative Interaction, Negative
Quantitative Interaction
- Both factors increase the risk
- When A is present, there is no additional effect
of factor B
- Adding factor A to factor B only increases the
risk to the level found for factor A alone
(4.0), leading to a negative quantitative
interaction.
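The four example tables can be checked against the multiplicative model with a few lines of Python. This is a minimal sketch: the function names are illustrative, and the classification looks only at the point estimates, ignoring sampling error.

```python
def interaction_ratio(or_b_only, or_a_only, or_both):
    """OR11 / (OR01 x OR10); 1.0 means exactly multiplicative joint effects."""
    return or_both / (or_b_only * or_a_only)

def describe(ratio, tol=1e-9):
    """Label the departure from the multiplicative model."""
    if abs(ratio - 1.0) <= tol:
        return "no more than multiplicative"
    if ratio > 1.0:
        return "more than multiplicative"
    return "less than multiplicative"

# In all four tables: OR for B alone = 2.5, OR for A alone = 4.0.
# Only the joint odds ratio (both factors present) differs.
joint_ors = [10.0, 20.0, 5.0, 4.0]
ratios = [interaction_ratio(2.5, 4.0, joint) for joint in joint_ors]
labels = [describe(r) for r in ratios]

for joint, ratio, label in zip(joint_ors, ratios, labels):
    print(f"joint OR {joint}: ratio {ratio} ({label})")
```

A ratio above 1 corresponds to the positive quantitative interaction example, and ratios below 1 correspond to the two negative quantitative interaction examples.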
84Sample Size Consideration for Interaction
Assessment
- Evaluation of interaction requires a substantial
increase in study size. For example, in a
case-control study it involves comparing the sizes
of the odds ratios (relating exposure and
disease) in different strata of the effect
modifier, rather than merely testing whether the
overall odds ratio is different from the null
value of 1.0.
85Sample Size Consideration
- The power to test interaction depends on the
number of cases and controls in each stratum (of
the effect modifier) rather than the overall
numbers of cases and controls.
- When considering possible interactions, the size
of the study needs to be at least four times
larger than when interaction is not considered
(Smith and Day).
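One way to see where a factor of about four can come from is to compare variances on the log odds ratio scale. The Python sketch below is an idealized illustration, not Smith and Day's own derivation: it assumes Woolf's variance formula, a hypothetical 2x2 table, and an effect modifier that splits the study into two equal strata with identical cell proportions.

```python
def woolf_variance(a, b, c, d):
    """Woolf's approximate variance of log(OR) for a 2x2 table."""
    return 1 / a + 1 / b + 1 / c + 1 / d

# Hypothetical overall table:
# exposed cases, unexposed cases, exposed controls, unexposed controls
overall = (120, 180, 80, 220)

# Split evenly across two strata of the effect modifier
stratum = tuple(count / 2 for count in overall)

var_main = woolf_variance(*overall)

# log(OR int) = log(OR in stratum 1) - log(OR in stratum 0);
# the strata are independent, so their variances add
var_interaction = woolf_variance(*stratum) + woolf_variance(*stratum)

inflation = var_interaction / var_main
print(f"variance inflation for the interaction estimate: {inflation:.1f}")
```

Because variance scales inversely with sample size, matching the precision of the main-effect estimate under these assumptions would take roughly four times as many subjects; unequal strata would require more.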