Title: Statistical Methods for Testing Carcinogenic Potential of New Drugs in Animal Carcinogenicity Studies
1Statistical Methods for Testing Carcinogenic
Potential of New Drugs in Animal Carcinogenicity
Studies
- Hojin Moon, Ph.D.
- E-mail HMoon_at_nctr.fda.gov
- September 16, 2005
2Collaborators
- Dr. Ralph L. Kodell DBRA, NCTR, FDA
- Dr. Hongshik Ahn SUNY_at_Stony Brook
3Animal Carcinogenicity Study
- Studies are conducted to assess the oncogenic
potential of chemicals encountered in food or
drugs for the protection of public health
- Studies often involve a problem of testing the
statistical significance of a dose-response
relationship among dose (treatment) groups.
- Various statistical testing methods for a
dose-response relationship (Ahn and Kodell, 1998)
4Animal Carcinogenicity Study
- Typical Experimental Design
- A zero-dose control and 2 or 3 dose groups
- 50 or more animals (mice or rats) per sex/group
- Exposure to a test agent at treatment groups of
varying doses for the duration of a study
- At least 18 months in mice, 24 months in rats
(CDER, US FDA in Office of the Federal Register,
1985)
- Multiple (interval) sacrifices or single terminal
sacrifice
- Age at death (survival time) and status
(presence/absence) of specific tumor types
5Animal Carcinogenicity Study
- Methods
- Data with cause-of-death (COD) information assigned by
pathologists (Peto type)
- Data without cause-of-death information (Poly-k
type)
6Animal Carcinogenicity Study
- The statistical analysis of animal
carcinogenicity data and the Peto COD controversy
are current issues in the government-regulated
pharmaceutical industry (Lee et al., 2002; STP Peto
Analysis Working Group, 2001, 2002; U.S. FDA, 2001)
- Town Hall meetings were held in June 2001 and
June 2002 at the annual meetings of the STP to
discuss issues surrounding COD assignment and
implications for using the Peto test or the
alternative Poly-3 test
- Opinions of a number of statisticians (Lee et
al., 2002)
7Dose-Related Trend Tests
- Peto Test (Peto et al., 1980)
- Recommended by IARC
- Required for product registration in Europe
- Commonly used in most pharmaceutical companies
- Requires COD assigned by pathologists
- Use of COD information has been controversial
(Lagakos, 1982; Racine-Poon & Hoel, 1984; Lagakos &
Louis, 1988; Kodell et al., 1995; Ahn et al.,
2000; Moon et al., 2002; Moon et al., 2003)
8Dose-Related Trend Tests
- Modifications of Peto's Test
- The Peto imputed COD test (Moon et al., 2002)
- No COD required
- Developed constrained NPMLE method to impute COD
- Imputation of COD
- Solve the constrained NPMLE problem by
implementing the Newton-based method of Ahn, Moon,
Kim and Kodell (2002)
- The weight-adjusted Peto Test (Moon et al., 2004)
- Use of Fleming-Harrington type weight adjustment
(Fleming & Harrington, 1981)
- Web-based sample size and power estimator (Moon
et al., 2002)
- http://biostatistics.mdanderson.org/ACSS
9Dose-Related Trend Tests
- Cochran-Armitage Trend Test (Cochran, 1954;
Armitage, 1955)
- To detect linear trend across dose groups in
lifetime tumor incidence rates
- Does not require COD
- Requires an assumption under H0 that all animals
are at equal risk of developing a tumor over the
duration of a study
- A problem for this test arises from the presence
of treatment-induced mortality unrelated to the
tumor of interest
- The CA test is known to be sensitive to increases
in treatment lethality and to fail to control the
probability of a Type I error (Bailer & Portier,
1988; Mancuso et al., 2002; Moon et al., 2003)
10Cochran-Armitage Trend Test
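Dose Group   1         2         ...   g         Total
w. T         y1        y2        ...   yg        y.
w/o T        N1 - y1   N2 - y2   ...   Ng - yg   N - y.
subjects     N1        N2        ...   Ng        N
(w. T = with tumor; w/o T = without tumor)
- The CA test utilizes the tumor data pooled over
the study duration for each group
- Expected number with tumor in group i: $E_i = N_i y_\cdot / N$
- Dose level in group i: $d_i$
- $Z_{CA} = \dfrac{\sum_{i=1}^{g} d_i (y_i - E_i)}{\sqrt{(y_\cdot/N)(1 - y_\cdot/N)\left[\sum_{i=1}^{g} N_i d_i^2 - \left(\sum_{i=1}^{g} N_i d_i\right)^2 / N\right]}}$
- $Z_{CA}$ is asymptotically standard normal
under the null hypothesis of equal tumor
incidence rates among groups
- Some treatments shorten overall survival ->
decreased risk of tumor onset
- Survival time is not utilized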
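As an illustration of the pooled-count calculation described on this slide, the following is a minimal Python sketch of the Cochran-Armitage trend statistic in its usual binomial-variance form. The function name and the tumor counts are hypothetical; the dose levels 0, 1, 2, 4 and the group size of 50 mirror the design used later in the simulation study.

```python
import math

def cochran_armitage_z(doses, tumors, totals):
    """Cochran-Armitage trend statistic Z_CA (binomial-variance form).

    doses  : dose level d_i for each group
    tumors : number of animals with tumor y_i in each group
    totals : number of animals N_i in each group
    """
    n_all = sum(totals)
    p = sum(tumors) / n_all  # pooled tumor rate y. / N
    # Numerator: sum_i d_i * (y_i - E_i), with E_i = N_i * y. / N
    num = sum(d * (y - n * p) for d, y, n in zip(doses, tumors, totals))
    # Variance: p(1-p) * [sum N_i d_i^2 - (sum N_i d_i)^2 / N]
    s1 = sum(n * d for d, n in zip(doses, totals))
    s2 = sum(n * d * d for d, n in zip(doses, totals))
    var = p * (1.0 - p) * (s2 - s1 * s1 / n_all)
    return num / math.sqrt(var)

# Hypothetical counts, for illustration only
doses = [0, 1, 2, 4]
tumors = [2, 4, 6, 10]
totals = [50, 50, 50, 50]
z = cochran_armitage_z(doses, tumors, totals)
print(f"Z_CA = {z:.3f}")
```

Because only the pooled counts enter the calculation, two groups with the same y_i and N_i give the same contribution regardless of when the animals died, which is the limitation the slide points out.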
11The Poly-k Trend Test
- Appropriate alternative to the Peto-type test
- No COD required
- Adopted by NTP as its official test for
carcinogenicity
- Survival-adjusted quantal-response procedure that
takes dose-group differences in intercurrent
mortality (all deaths other than those resulting
from a tumor of interest) into account.
12The Poly-k Trend Test
- Bailer & Portier (1988)
- Proposed the Poly-3 test, which made an
adjustment of the CA test by using a fractional
weighting scheme
- Number at risk in group i: $r_i = \sum_{k=1}^{N_i} w_{ik}$
- where $w_{ik} = 1$ if the kth animal in group i
had the tumor of interest and
$w_{ik} = (t_{ik}/t_{max})^3$ otherwise, with
$t_{ik}$ the time of death and $t_{max}$ the
study duration
(time-at-risk weight for the kth animal in group
i)
- Replace $N_i$ with $r_i$ in calculating $Z_{CA}$
- First mentioned the Poly-k test without
specifying how to obtain k
- Recommended k = 3 following evaluation of neoplasm
onset time distribution in control F344 rats and
B6C3F1 mice (Portier et al., 1986)
- The Poly-k test with correct k -> superior
operating characteristics to the Poly-3 test
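The fractional weighting can be made concrete with a short Python sketch. It assumes the usual Poly-k convention: an animal with the tumor of interest counts as 1, and a tumor-free animal counts as (t / t_max)^k, where t is its time on study. The function name and the five records are hypothetical.

```python
def poly_k_at_risk(records, t_max, k=3.0):
    """Poly-k adjusted number at risk r_i for one dose group.

    records : list of (time_on_study, has_tumor) pairs
    t_max   : study duration, in the same units as time_on_study
    k       : Poly-k power (k = 3 gives the Poly-3 weights)
    """
    return sum(1.0 if has_tumor else (t / t_max) ** k
               for t, has_tumor in records)

# Hypothetical group of five animals in a 104-week study
group = [(104, False), (104, True), (52, False), (78, True), (26, False)]
r = poly_k_at_risk(group, t_max=104)
tumors = sum(1 for _, has_tumor in group if has_tumor)
print(f"r_i = {r}, adjusted rate = {tumors / r:.3f}")
```

Here the two early tumor-free deaths contribute 0.125 and 0.015625 instead of 1 each, so the adjusted denominator is 3.140625 rather than 5.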
13The Poly-k Trend Test
- Bieler & Williams (1993)
- Further modified the CA test by an adjustment of
the variance estimation of the test statistic
using the delta method (Woodruff, 1971)
- Showed that the Bailer-Portier Poly-3 test is
anticonservative for low tumor incidence rates
and for high treatment toxicity
- Characteristics of the BP Poly-3 test and the BW
Poly-3 test can be found in Chen et al. (2000)
- Objectives
- The Poly-k statistic is asymptotically normal under
H0 of equal tumor incidence rates among groups
(Bieler & Williams, 1993)
- Valid only if the correct value of k is used
- Develop the method of bootstrap resampling to
estimate the empirical distribution of the test
statistic and corresponding critical value of the
Poly-k test while taking into account the
presence of competing risks
14Generalized Poly-k Test
- Moon et al. (2003)
- Proposed a method for estimating k for data with
interval sacrifices (interim sacrifices and a
terminal sacrifice)
- Estimation of the poly-k based empirical lifetime
cumulative tumor incidence rate, a function of k
- Estimation of cumulative tumor incidence rate
(Kodell & Ahn, 1997)
- Equate the two estimates and solve for k
15Generalized Poly-k Test
- Moon et al. (2005): Bootstrap-based age-adjusted
Poly-k test
- Improving the Poly-k test for data with a single
terminal sacrifice
- Estimation of k for single sacrifice data is more
difficult than that for data with interval
sacrifices due to lack of information on tumor
development among live animals before the
termination of the experiment
- Propose a method of bootstrap-based age-adjusted
resampling to improve the Poly-k test via a
modification of the permutation method of Farrar &
Crump (1990), which was used for exact
statistical tests
16Objectives
- Develop the method of bootstrap resampling to
estimate the empirical distribution of the test
statistic and corresponding critical value of the
Poly-k test while taking into account the
presence of competing risks that are possible COD
- We attempt to preserve the competing-risks
survival rate (CRSR) using an age-adjusted
resampling scheme as well as to preserve the
tumor incidence rates under H0 and to assess the
significance of the Poly-k test
17Bootstrap Method
- Critical value CR(X): 100(1-α)th percentile of
the bootstrap test statistics
- Reject H0 if T(X) > CR(X)
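The percentile rule on this slide can be sketched in Python as follows. This is a minimal illustration of a pooled (unadjusted) bootstrap, not the authors' implementation: the stand-in statistic `high_minus_control`, the function names, and the data are all hypothetical, and a real analysis would plug in the Poly-k statistic instead.

```python
import math
import random

def high_minus_control(groups):
    """Stand-in statistic (hypothetical): tumor proportion in the
    highest dose group minus the proportion in the control group."""
    def prop(g):
        return sum(1 for _, tumor in g if tumor) / len(g)
    return prop(groups[-1]) - prop(groups[0])

def bootstrap_critical_value(groups, statistic, B=1000, alpha=0.05, seed=0):
    """Estimate CR(X) as the 100(1-alpha)th percentile of T(X*b).

    Each bootstrap sample redraws every group, with replacement, from
    the data pooled over all groups, which imposes the null hypothesis
    of equal tumor incidence across groups.
    """
    rng = random.Random(seed)
    pooled = [rec for g in groups for rec in g]
    reps = []
    for _ in range(B):
        sample = [[rng.choice(pooled) for _ in g] for g in groups]
        reps.append(statistic(sample))
    reps.sort()
    idx = min(B - 1, math.ceil((1 - alpha) * B) - 1)
    return reps[idx]

# Hypothetical data: (weeks on study, tumor present)
groups = [
    [(104, False)] * 45 + [(90, True)] * 5,
    [(104, False)] * 40 + [(80, True)] * 10,
]
t_obs = high_minus_control(groups)
cr = bootstrap_critical_value(groups, high_minus_control)
reject = t_obs > cr
print(f"T(X) = {t_obs:.3f}, CR(X) = {cr:.3f}, reject H0: {reject}")
```

Pooling all animals before resampling also mixes the groups' survival patterns, which is the shortcoming the next slide raises.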
18Bootstrap Method
- Suitable for data with the same CRSR
- When the CRSR is different across dose groups in
the original data, the bootstrap samples from the
pooled data may not reflect the CRSR of each
group, while satisfying the null distribution of
equal tumor incidence rate across groups
- Need to modify the bootstrap method in order to
preserve the survival rates in each dose group
- Develop an age-adjusted scheme
19Age-adjusted Bootstrap Scheme
[Diagram: age-adjusted bootstrap scheme]
- Age-adjusted scheme I(i,m), i = 1,...,G; m = 1,...,Mi
- Bootstrap samples X1, X2, ..., XB
- Bootstrap replicates T(X1), T(X2), ..., T(XB)
- Bootstrap 100(1-α)th percentile = CR(X)
- Reject H0 if T(X) > CR(X)
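The slide does not spell out the resampling rule, so the Python sketch below shows only one plausible reading of an age-adjusted scheme, stated here as an assumption: each animal in group i that died in age interval m is replaced by a random draw from the animals of all groups that died in the same interval. The interval bounds are borrowed from the NTP intervals tabulated later in these slides; the function names and data are hypothetical.

```python
import random
from bisect import bisect_left

# Assumed age intervals in days: 0-364, 365-546, 547-644, 645-726
INTERVAL_ENDS = [364, 546, 644, 726]

def interval_index(day):
    """Index m of the age interval that contains a death time."""
    return min(bisect_left(INTERVAL_ENDS, day), len(INTERVAL_ENDS) - 1)

def age_adjusted_resample(groups, rng):
    """Draw one bootstrap sample, stratified by age interval.

    groups : list of groups, each a list of (death_day, has_tumor)
    Every animal is replaced by a random animal pooled over all groups
    from the same age interval, so each group keeps its own count of
    deaths per interval while tumor status comes from the pooled data.
    """
    pooled = [[] for _ in INTERVAL_ENDS]
    for g in groups:
        for rec in g:
            pooled[interval_index(rec[0])].append(rec)
    return [[rng.choice(pooled[interval_index(day)]) for day, _ in g]
            for g in groups]

# Hypothetical two-group data set
original = [
    [(100, False), (400, True), (726, False)],
    [(300, False), (600, True), (726, True)],
]
sample = age_adjusted_resample(original, random.Random(1))
```

Under this reading, a group with heavy early mortality still has heavy early mortality in every bootstrap sample, which is the property the scheme is meant to preserve.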
20Example
- Death times (in days) in a hypothetical animal
carcinogenicity data set with 4 groups
ID Group 1 Group 2 Group 3 Group 4
A 74
B 145
C 176
D 185
E 243
F 300
G 316
H 324
I 340
J 341
K 343
L 345
M 351
N 385
.. .. .. .. ..
24Simulation Study
- To evaluate the improvement of the proposed test
in terms of the robustness to a variety of tumor
onset distributions
- Typical bioassay design according to standard
designs of NTP
- 4 dose groups (dose levels 0, 1, 2 and 4) of 50
animals each
- Experimental duration of 2 yrs.
- A single terminal sacrifice at the end of the
experiment
25Simulation Study
- Illustration of illness and death with possible
transitions (Kodell & Nelson, 1980)
[Diagram: states Normal, Tumor (T1), Death from Tumor (TD), Death from Competing Risks (T3)]
26Simulation Study
- Modeling
- T1: Time to tumor onset
- S(t) = exp[-?d (t/tmax)^k]
- T2: Time after onset until death from the tumor
- Q(t) = exp[-φ(?1 t + ?2 t^?3)]
- T3: Time to death from a competing risk
- The same form as Q(t)
- φ is selected to reflect tumor lethality
- T1 + T2 = TD: Time to death from the tumor of
interest
27Simulation Study
- Tumor onset distributions
- Weibull tumor onset distribution with shape
parameter k = 1.5, 3.0 and 6.0
- Tumor rates
- .05, .15 and .30 for the control
- Size evaluation
- tumor rates are the same across dose groups
- Power evaluation
- tumor rates for the highest dose group by 104
weeks are 5, 3 and 2 times the background tumor
rates of .05, .15 and .30, respectively
- CRSR (from NTP feeding studies, Haseman et al.,
1998)
- (.6, .6, .6, .6); (.6, .5, .4, .3); (.6, .6, .5,
.2); (.5, .5, .5, .2); (.5, .6, .5, .4); (.5, .7,
.6, .4); (.5, .7, .6, .5)
- 5000 simulated data sets; α = .05 significance
level
- For each data set, 5000 bootstrap samples
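To show how a tumor onset time could be drawn for a given shape k and a target tumor rate by 104 weeks, here is a minimal Python sketch using inverse-transform sampling. It assumes a Weibull survival function of the form S(t) = exp[-θ (t / t_max)^k]; the name θ and the calibration θ = -ln(1 - rate), which makes the onset probability by t_max equal to the target rate, are assumptions of this sketch and are not taken from the slides.

```python
import math
import random

def weibull_onset_time(rate_by_tmax, k, t_max=104.0, rng=random):
    """Draw one tumor onset time (weeks) by inverse-transform sampling.

    Assumes S(t) = exp(-theta * (t / t_max) ** k), with theta chosen so
    that P(onset <= t_max) equals rate_by_tmax.
    """
    theta = -math.log(1.0 - rate_by_tmax)
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return t_max * (-math.log(u) / theta) ** (1.0 / k)

# Control group with background rate .30 and shape k = 3.0
rng = random.Random(42)
draws = [weibull_onset_time(0.30, 3.0, rng=rng) for _ in range(20000)]
frac = sum(t <= 104.0 for t in draws) / len(draws)
print(f"fraction with onset by 104 weeks: {frac:.3f}")
```

Onset times beyond 104 weeks simply mean the animal would not have developed the tumor within the study; with a larger k, the onsets that do occur within the study concentrate later in it.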
28Simulation Study
- Size & Power Evaluation with 5000 simulated data
sets, 5000 bootstrap samples for each data set
and 5% nominal significance level
                           Weibull 1.5    Weibull 3.0    Weibull 6.0
        TR   CRSR          B      N       B      N       B      N
Size    .3   .6,.6,.6,.6   .053   .050    .054   .050    .055   .052
        .3   .5,.5,.5,.2   .044   .066    .044   .041    .040   .021
        .3   .6,.6,.5,.2   .036   .072    .033   .037    .033   .018
        .3   .6,.5,.4,.3   .047   .069    .043   .045    .040   .024
        .3   .5,.6,.5,.4   .049   .055    .050   .048    .048   .037
        .3   .5,.7,.6,.4   .046   .053    .048   .046    .045   .036
        .3   .5,.7,.6,.5   .054   .050    .051   .047    .054   .044
Power   .3   .6,.6,.6,.6   .918   .934    .908   .923    .893   .904
        .3   .5,.5,.5,.2   .837   .932    .781   .847    .725   .667
        .3   .6,.6,.5,.2   .790   .939    .734   .846    .668   .638
        .3   .6,.5,.4,.3   .864   .938    .825   .884    .773   .748
        .3   .5,.6,.5,.4   .886   .929    .868   .895    .834   .819
        .3   .5,.7,.6,.4   .881   .930    .856   .892    .817   .810
        .3   .5,.7,.6,.5   .904   .927    .884   .909    .859   .865
29Example
- The 2-yr Gavage Study of Furan
- Furan (C4H4O), a clear and colorless liquid,
serves primarily as an intermediate in the
synthesis and preparation of numerous organic
compounds (NTP, 1993)
- Toxicology and carcinogenesis studies were
conducted by administering furan in corn oil by
gavage to groups of F344/N rats and B6C3F1 mice
of each sex for 2 yrs
- Furan was nominated by the NCI for evaluation of
carcinogenic potential due to its large
production volume and use, and because of the
potential for widespread human exposure to a
variety of furan-containing compounds
30Example
- Female F344/N rats
- Evaluation of carcinogenic potential on
incidences of cholangiocarcinoma or
hepatocellular neoplasms of the liver
- Groups of 50 rats were administered 2, 4 or 8 mg
furan per kg body weight in corn oil by gavage 5
days per week for 2 yrs
- Male B6C3F1 mice
- Evaluation of carcinogenic potential on
incidences of adenocarcinoma or
alveolar/bronchiolar adenoma of the lung.
- Groups of 50 mice received doses of 8 or 15 mg/kg
furan 5 days per week for 2 yrs
31Data
Site      Group             Animal Tumor Pathology
Liver a   Vehicle Control   1(0), 2(16), 3(0), 4(34)
Liver a   2 mg/kg           1(1), 2(17), 3(1), 4(31)
Liver a   4 mg/kg           1(3), 2(19), 3(3), 4(25)
Liver a   8 mg/kg           1(4), 2(27), 3(6), 4(13)
Lung b    Vehicle Control   1(3), 2(14), 3(4), 4(29)
Lung b    8 mg/kg           1(4), 2(24), 3(3), 4(19)
Lung b    15 mg/kg          1(7), 2(23), 3(6), 4(14)
a Cholangiocarcinoma or hepatocellular neoplasms
of the liver in female F344/N rats
b Adenocarcinoma or alveolar/bronchiolar adenoma
of the lung in male B6C3F1 mice
32- Test results on the carcinogenic activity of
furan in female F344/N rats based on increased
incidences of cholangiocarcinoma and
hepatocellular neoplasms of the liver (Reject
when T(X) > CR(X))
mg/kg     T(X) a BW   CR(X) b Normal       CR(X) c Bootstrap
Overall   4.1617      1.6449 (p < .001)    2.0141 (p < .001)
0,2,4     2.7705      1.6449 (p = .003)    1.9584 (p = .004)
0,2,8     4.3559      1.6449 (p < .001)    1.9584 (p < .001)
0,4,8     3.6632      1.6449 (p < .001)    1.8214 (p < .001)
0,2       1.4641      1.6449 (p = .072)    1.4625 (p = .040)
0,4       2.6542      1.6449 (p = .004)    1.5905 (p = .001)
0,8       3.8420      1.6449 (p < .001)    1.7423 (p < .001)
a The BWP3 test statistic obtained from the data
b Standard normal critical value at the
significance level .05
c Critical value estimated by the 95th percentile
of T(X)s from our method
- NTP concluded that under the conditions of these
2-yr gavage studies, there was clear evidence of
carcinogenic activity of furan in female F344/N
rats based on increased incidences of
cholangiocarcinoma and hepatocellular neoplasms
of the liver
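The rejection rule can be checked directly against the rat results tabulated on this slide. The Python sketch below copies T(X) and the bootstrap critical values from the table, compares each statistic with both critical values, and recomputes the one-sided normal p-value from the standard normal upper tail; the helper name `upper_tail_p` is hypothetical.

```python
import math

def upper_tail_p(z):
    """One-sided p-value P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

NORMAL_CR = 1.6449  # standard normal critical value at level .05

# (dose groups in mg/kg, T(X), bootstrap CR(X)) from the rat table
rows = [
    ("Overall", 4.1617, 2.0141),
    ("0,2,4", 2.7705, 1.9584),
    ("0,2,8", 4.3559, 1.9584),
    ("0,4,8", 3.6632, 1.8214),
    ("0,2", 1.4641, 1.4625),
    ("0,4", 2.6542, 1.5905),
    ("0,8", 3.8420, 1.7423),
]

# For each comparison: (reject with normal CR, reject with bootstrap CR)
decisions = {label: (t > NORMAL_CR, t > cr_boot)
             for label, t, cr_boot in rows}

for label, t, cr_boot in rows:
    by_normal, by_boot = decisions[label]
    print(f"{label:8s} T={t:.4f} normal p={upper_tail_p(t):.3f} "
          f"reject(normal)={by_normal} reject(bootstrap)={by_boot}")
```

The two critical values disagree only for the 0 versus 2 mg/kg comparison, where T(X) = 1.4641 falls below 1.6449 but above the bootstrap value 1.4625.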
33- Test results on the carcinogenic potential of
furan on incidences of adenocarcinoma and
alveolar/bronchiolar adenoma of the lung in male
B6C3F1 mice (Reject when T(X) > CR(X))
mg/kg     T(X) a BW   CR(X) b Normal       CR(X) c Bootstrap
Overall   1.6995      1.6449 (p = .045)    1.7774 (p = .058)
0,15      1.6805      1.6449 (p = .046)    1.6938 (p = .052)
0,8       .2229       1.6449 (p = .41)     1.9248 (p = .53)
a The BWP3 test statistic obtained from the data
b Standard normal critical value at the
significance level .05
c Critical value estimated by the 95th percentile
of T(X)s from our method
- Our test results agree with the conclusions from
NTP
34Significance
- The statistical analysis of tumorigenicity data
from animal bioassays remains an important
regulatory issue to FDA and the pharmaceutical
industry
- The present research will further refine
the Poly-k test in order to make it more broadly
competitive with the Peto test
- The improved Poly-k test for dose-related trend
will be robust to a variety of tumor onset
distributions.
- It will control the false positive rate better
than the Poly-3 test, thus having enhanced
performance in identifying dose-related trends.
- With no information on COD or tumor lethality,
the improved version can be used confidently when
Peto's test cannot be implemented
35References
- Ahn H, Kodell RL (1998). Analysis of long-term
carcinogenicity studies. In: Design and Analysis
of Animal Studies in Pharmaceutical Development,
Chow SC, Liu JP (eds). Marcel Dekker, Inc.: New
York, 259-289.
- Armitage P (1955). Tests for linear trends in
proportions and frequencies. Biometrics, 11,
375-386.
- Bailer AJ, Portier CJ (1988). Effects of
treatment-induced mortality and tumor-induced
mortality on tests for carcinogenicity in small
samples. Biometrics, 44, 417-431.
- Bieler GS, Williams RL (1993). Ratio estimates,
the delta method, and quantal response tests for
increased carcinogenicity. Biometrics, 49,
793-801.
- Chen JJ, Lin KK, Huque MF, Arani RB (2000).
Weighted p-value adjustments for animal
carcinogenicity trend test. Biometrics, 56,
586-592.
- Cochran WG (1954). Some methods for strengthening
the common χ2 tests. Biometrics, 10, 417-451.
- Lee PN, Fry JS, Fairweather WR, Haseman JK,
Kodell RL, Chen JJ et al. (2002). Current issues:
statistical methods for carcinogenicity studies.
Toxicologic Pathology, 30, 403-414.
- Mancuso JY, Ahn H, Chen JJ, Mancuso JP (2002).
Age-adjusted exact trend tests in the event of
rare occurrences. Biometrics, 58, 403-412.
- Moon H, Ahn H, Kodell RL, Lee JJ (2003).
Estimation of k for the poly-k test. Statistics
in Medicine, 22, 2619-2636.
- National Toxicology Program (1993). Toxicology
and carcinogenesis studies of furan in F344/N
rats and B6C3F1 mice (Gavage studies). NTP
Technical Report, 402, Research Triangle Park,
NC.
- STP Peto Analysis Working Group (2001). The
Society of Toxicologic Pathology's position on
statistical methods for rodent carcinogenicity
studies. Toxicologic Pathology, 29(6), 670-672.
- STP Peto Analysis Working Group (2002). The
Society of Toxicologic Pathology's
recommendations on rodent carcinogenicity
studies. Toxicologic Pathology, 30, 415-418.
- U.S. FDA (2001). Guidance for industry:
statistical aspects of the design, analysis, and
interpretation of chronic rodent carcinogenicity
studies of pharmaceuticals. Federal Register,
66(89), 23266-23267.
- Woodruff RS (1971). A simple method for
approximating the variance of a complicated
estimate. Journal of the American Statistical
Association, 66, 411-414.
36Ongoing Research
- Developing improved survival-adjusted statistical
tests in long-term tumorigenicity bioassays (NCTR
E07171.01)
- Developing estimators for hazard identification
in long-term tumorigenicity bioassays (NCTR
E07172.01)
- Developing statistical procedures for
incorporating dose-response-model uncertainty
into microbial risk assessment (NCTR E07045.01)
- Developing statistical tests for distinguishing
tumor frequency risks (effect on the number of
induced tumors) from tumor latency risks (effect
on their times to observation) in multiple-tumor
photococarcinogenicity studies (NCTR E07061.01)
37Abstract
- Research on carcinogenicity has been actively
conducted to understand the carcinogenic
potential of chemicals to which humans are
exposed. Long-term, animal-intensive
carcinogenicity studies have been performed to
extrapolate carcinogenic risks in humans from
exposure to drugs and tainted food. In this
seminar, we discuss the recent development of
improved survival-adjusted and age-adjusted
dose-related trend tests that achieve improved
robustness to a variety of tumor onset
distributions. We consider extensions of the
survival-adjusted Cochran-Armitage test, known as
the Poly-k test, to improve the robustness not
only to the effects of differential mortality
across groups but also to various tumor onset
distributions. The Cochran-Armitage test is
routinely applied for detecting a linear trend in
the incidence of a tumor of interest across dose
groups. We examine our recently developed
statistical methods with real data sets,
including National Toxicology Program data sets,
to evaluate a dose-related trend of a test
substance on the incidence of neoplasms.
38Animal Carcinogenicity Data from the ED01 Study -
NCTR
- To find the carcinogenic effect of feeding 2-AAF
to female mice (Littlefield et al., 1980)
- A subset (low dose groups) of data with a single
sacrifice was obtained
- To test dose-related trend of the liver tumor
incidence
39Frequency Tabulation of number of mice in the
ED01 study
Dose    NTP intervals   Fatal tumors assigned   Natural Death                Sacrifice
(ppm)   (days)          by pathologists         With tumor   Without tumor   Tumor present   No tumor present
0       0-364           0                       0            9               0               0
0       365-546         0                       0            15              0               0
0       547-644         1                       1            34              0               0
0       645-726         1                       1            60              7               137
30      0-364           0                       0            17              0               0
30      365-546         0                       2            42              0               0
30      547-644         1                       6            67              0               0
30      645-726         2                       7            84              22              279
35      0-364           0                       1            9               0               0
35      365-546         3                       3            31              0               0
35      547-644         1                       1            55              0               0
35      645-726         0                       1            80              18              192
45      0-364           0                       0            7               0               0
45      365-546         1                       1            13              0               0
45      547-644         3                       5            43              0               0
45      645-726         2                       3            66              19              132
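As a quick tabulation of the frequencies above, the following Python sketch totals the tumor-bearing animals and the group sizes by dose and prints the crude (unadjusted) liver tumor incidence. It assumes that the "Fatal tumors assigned by pathologists" column is a subset of the "Natural Death, With tumor" column, so fatal tumors are not added a second time; that reading is an assumption, although it is consistent with every row of the table.

```python
# Rows: (dose ppm, fatal, natural death with tumor, natural death without
# tumor, sacrificed with tumor, sacrificed without tumor), one per interval
ROWS = [
    (0, 0, 0, 9, 0, 0), (0, 0, 0, 15, 0, 0),
    (0, 1, 1, 34, 0, 0), (0, 1, 1, 60, 7, 137),
    (30, 0, 0, 17, 0, 0), (30, 0, 2, 42, 0, 0),
    (30, 1, 6, 67, 0, 0), (30, 2, 7, 84, 22, 279),
    (35, 0, 1, 9, 0, 0), (35, 3, 3, 31, 0, 0),
    (35, 1, 1, 55, 0, 0), (35, 0, 1, 80, 18, 192),
    (45, 0, 0, 7, 0, 0), (45, 1, 1, 13, 0, 0),
    (45, 3, 5, 43, 0, 0), (45, 2, 3, 66, 19, 132),
]

def crude_counts(rows):
    """Return {dose: (animals with tumor, total animals)}."""
    out = {}
    for dose, _fatal, nd_t, nd_no, sac_t, sac_no in rows:
        with_tumor, total = out.get(dose, (0, 0))
        out[dose] = (with_tumor + nd_t + sac_t,
                     total + nd_t + nd_no + sac_t + sac_no)
    return out

crude = crude_counts(ROWS)
for dose, (with_tumor, total) in sorted(crude.items()):
    print(f"{dose:>2} ppm: {with_tumor}/{total} = {with_tumor / total:.4f}")
```

These crude rates ignore when the animals died, so they correspond to the unadjusted Cochran-Armitage view of the data rather than to a survival-adjusted analysis.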