Title: Mark Carpenter
1MULTIVARIATE SURVIVAL DISTRIBUTIONS
The Role of Statistical Modeling in Modern
Research
- Mark Carpenter
- Department of Mathematics and Statistics
- Auburn University
- Auburn, Alabama
2MULTIVARIATE SURVIVAL DISTRIBUTIONS
- When multiple events are observed on the same
individual or cluster of individuals and the
associated event times are correlated.
- Time to blindness in left or right eye in
diabetics
- Time to event in twins (clustered)
- Tumor disappearance, tumor recurrence, death
3UNIVARIATE SURVIVAL DISTRIBUTIONS
well studied/developed
Generalized Gamma
Lognormal
Gamma
Weibull
Extreme Value
Exponential
Nonparametric
Lifetimes within a population are typically
modeled with probability distributions possessing
positive support, i.e., P(X > 0) = 1.
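Several of the parametric families listed on this slide are nested, which one common parameterization makes explicit. As a sketch, assuming Stacy's form of the generalized gamma density with scale a > 0 and shape parameters d, p > 0 (the symbols are chosen here and are not taken from the slides):

```latex
\[
f(x; a, d, p) \;=\; \frac{p}{a^{d}\,\Gamma(d/p)}\; x^{d-1}
  \exp\!\left\{-\left(\frac{x}{a}\right)^{p}\right\},
  \qquad x > 0 .
\]
% Special cases under this (assumed) parameterization:
%   p = 1      : Gamma with shape d and scale a
%   d = p      : Weibull with shape p and scale a
%   d = p = 1  : Exponential with mean a
% The density is zero for x <= 0, so P(X > 0) = 1.
```

The lognormal is usually described as a limiting case of this family, and the extreme value distribution is the distribution of log X when X is Weibull, so it is a model for log-lifetimes and not for the lifetimes themselves.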
4MULTIVARIATE SURVIVAL DISTRIBUTIONS
- OUTLINE
- Two real examples that motivated this research
- Mechanism for generating a multivariate family
where the marginals are specified: Weibull, Gamma
and Exponential.
- Parameter estimation (Exponential case)
5MOTIVATING APPLICATION
SBIR Phase I
EXCUBITORE: Guarding Against Novel Network
Attacks, Torch Technologies, Huntsville, AL.
6FIRST CONVERSATION
Iyer, Srikanth K. and Manjunath, D. (2004),
"Correlated Bivariate Sequences for Queueing and
Reliability Applications." Communications in
Statistics and Sankhya
Q1: Can this model generate bivariate Weibull?
Q2: Is this appropriate for modeling correlated
inter-arrival and service times (packet downloads)
within subpopulations (clusters)?
Q3: Can we incorporate it into our (EM algorithm)
Normal mixture modeling software?
7SECOND CONVERSATION
Iyer, Srikanth K. and Manjunath, D. (2004),
"Correlated Bivariate Sequences for Queueing and
Reliability Applications." Communications in
Statistics and Sankhya
Q1: Can this model generate bivariate Weibull?
A1: Yes
Q2: Is this appropriate for modeling correlated
inter-arrival and service times (packet downloads)
within subpopulations (clusters)?
A2: No
Q3: Can we incorporate it into our (EM algorithm)
Normal mixture modeling software?
A3: Yes, but No.
9FINAL REPORT
Iyer, Srikanth K. and Manjunath, D. (2004),
"Correlated Bivariate Sequences for Queueing and
Reliability Applications." Communications in
Statistics
- Generated a viable Bivariate Weibull from their
model
- Discovered that the linear structure is not
appropriate for internet data because of
simultaneous occurrence
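The simultaneous-occurrence issue can be illustrated in the exponential case. The sketch below assumes a linear structure Y = a*X + Z with X ~ Exp(rate1) and Z independent of X, where Z equals 0 with probability p = a*rate2/rate1 and is Exp(rate2) otherwise; with that choice Y is exactly Exp(rate2). The function names, parameter names and numeric values are illustrative assumptions, not taken from the cited paper or from the project software.

```python
import random
from statistics import fmean


def simulate_linear_exponential(n, rate1, rate2, a, seed=0):
    """Draw n pairs (X, Y) with Y = a*X + Z (assumed linear structure).

    X ~ Exp(rate1). Z is independent of X, equal to 0 with probability
    p = a*rate2/rate1 and Exp(rate2) otherwise, which makes Y ~ Exp(rate2).
    """
    p = a * rate2 / rate1
    if not 0.0 < p <= 1.0:
        raise ValueError("need 0 < a <= rate1/rate2")
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.expovariate(rate1)
        # Atom at zero: with probability p the latent variable vanishes.
        z = 0.0 if rng.random() < p else rng.expovariate(rate2)
        pairs.append((x, a * x + z))
    return pairs, p


def pearson(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


a = 1.0
pairs, p = simulate_linear_exponential(100_000, rate1=2.0, rate2=1.0, a=a)
mean_y = fmean(y for _, y in pairs)                      # should be near 1/rate2
tied = sum(1 for x, y in pairs if y == a * x) / len(pairs)  # share with Z = 0
corr = pearson(pairs)
print(f"p={p:.2f} mean(Y)={mean_y:.3f} tied={tied:.3f} corr={corr:.3f}")
```

Under these assumptions a fraction p of the simulated pairs lies exactly on the line Y = a*X, and the theoretical correlation is also p. That point mass on a line is what makes the joint distribution fail to be absolutely continuous.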
10FINAL REPORT
Hougaard, Philip (1986), "A class of
multivariate failure time distributions."
Biometrika.
- Viable Multivariate Weibull Model (absolutely
continuous)
- Demonstrated that it fits well even when the data
are Normally distributed
- Developed a multivariate regression model which
can be used as a discriminant function in
supervised classification
- Initial success in fitting mixtures of Weibulls,
both in simulation and on real internet data.
- Generated a multivariate location/scale family
- Theoretically reduces the number of false alarms
by allowing for skewed distributions.
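For reference, the multivariate Weibull attributed to Hougaard (1986) is commonly written through its joint survival function. The form below is a sketch of that standard presentation; the slides do not show the parameterization actually used in the report, so the symbols here are assumptions.

```latex
\[
S(t_1,\dots,t_m) \;=\; P(T_1 > t_1,\dots,T_m > t_m)
  \;=\; \exp\!\left\{-\Bigl(\sum_{j=1}^{m} \varepsilon_j\, t_j^{\gamma}\Bigr)^{\alpha}\right\},
  \qquad \varepsilon_j > 0,\; \gamma > 0,\; 0 < \alpha \le 1 .
\]
% Setting t_k = 0 for all k \ne j gives the j-th marginal:
\[
S_j(t) \;=\; \exp\!\left\{-\varepsilon_j^{\alpha}\, t^{\alpha\gamma}\right\},
\]
% i.e. a Weibull with shape \alpha\gamma.
% \alpha = 1 corresponds to independence; smaller \alpha gives stronger dependence.
```

In this form the joint distribution has a density on the positive orthant, so it places no probability on exact ties such as T_1 = T_2.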
11CHARACTERIZING NORMAL CONNECTION DATA WITH
MIXTURE MODELS
CONTINUOUS VARIABLES
CATEGORICAL VARIABLES
ROBOT ATTACK
SERVER INITIATION
INTERNAL FILE SERVER
PORT SCANS
12MIXTURES OF BIVARIATE WEIBULLS
14Tumorigenesis Animal Study (Breast Cancer,
chemoprevention)
Female Sprague-Dawley CD rats
Feed 1 of 3 Diets
At birth
Green Tea
Red Wine
Control
At birth, litters were placed on one of three
diets: 1) AIN-76A with tap water as the control,
2) 1 gram resveratrol/kg AIN-76A diet with tap
water, or 3) 0.065% EGCG in the drinking water
with AIN-76A diet. At 50 days postpartum, 94
female rats (30 Control, 30 Resveratrol, and 34
EGCG) were gavaged with 60 mg
dimethylbenz[a]anthracene (DMBA)/kg body weight,
a dose sufficient to cause 100% tumor incidence
in the control group over the course of the
study. Animals were palpated twice a week
starting five weeks after DMBA administration in
order to record the presence, location, size, and
date of detection for all tumors. Animals were
sacrificed when the tumor diameter reached one
inch, animals became moribund, or rats reached 18
weeks post-DMBA treatment.
15Tumorigenesis Animal Study (Breast Cancer,
chemoprevention)
Female Sprague-Dawley CD rats
Whitsett, T. Jr., Carpenter, D.M. and Lamartiniere,
C.A. (2006), "Resveratrol, but not EGCG, in the
Diet Suppresses DMBA-induced Mammary Cancer in
Rats," Journal of Carcinogenesis, 5(1):15,
15 May 2006.
Final analysis involved a gamma-frailty model
(repeated measures), generalized Poisson
regression, Kaplan-Meier (KM) curve estimation
and life testing.
But...
16Tumorigenesis Animal Study (Breast Cancer,
chemoprevention)
(Incidence)
(Tumor burden)
17Multivariate Survival Distributions
(Event 1)
(Event 2)
(Event m)
Multivariate Response
Univariate Responses
20Univariate/marginal Gamma Distribution
Shape
Scale
Location
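The density itself did not survive in this transcript. A standard three-parameter gamma density with the shape, scale and location parameters named on the slide can be written as follows; the symbols alpha, beta and gamma are assumptions chosen here, and the original slide may have used a different notation.

```latex
\[
f(x; \alpha, \beta, \gamma) \;=\;
  \frac{(x-\gamma)^{\alpha-1}}{\beta^{\alpha}\,\Gamma(\alpha)}
  \exp\!\left\{-\frac{x-\gamma}{\beta}\right\},
  \qquad x > \gamma,
\]
% \alpha > 0 : shape,  \beta > 0 : scale,  \gamma : location.
% Mean: \gamma + \alpha\beta.   Variance: \alpha\beta^{2}.
% \alpha = 1 and \gamma = 0 give the exponential with mean \beta.
```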
21Generating BVG with Linear Associations
- We start out with Pair-wise Associations
Latent Variable
Carpenter, Diawara and Han (2006); Mathai and
Moschopoulos (1991, 1992)
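One simple way to obtain gamma marginals under a linear pairwise association is an additive construction: take X1 ~ Gamma(shape1, scale1) and a latent Z ~ Gamma(shape_z, a*scale1) independent of X1, and set X2 = a*X1 + Z. Because a*X1 and Z then share the same scale, X2 is Gamma(shape1 + shape_z, a*scale1). This is a minimal sketch under that assumption; it is not necessarily the construction used in the cited papers, and all names and numeric values below are illustrative.

```python
import random
from statistics import fmean


def simulate_linear_gamma(n, shape1, scale1, shape_z, a, seed=0):
    """Draw n pairs (X1, X2) with X2 = a*X1 + Z (assumed construction).

    X1 ~ Gamma(shape1, scale1); Z ~ Gamma(shape_z, a*scale1), independent
    of X1, so that X2 ~ Gamma(shape1 + shape_z, a*scale1).
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x1 = rng.gammavariate(shape1, scale1)      # second argument is the scale
        z = rng.gammavariate(shape_z, a * scale1)  # latent variable
        pairs.append((x1, a * x1 + z))
    return pairs


def pearson(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


shape1, scale1, shape_z, a = 2.0, 1.5, 3.0, 0.5
pairs = simulate_linear_gamma(50_000, shape1, scale1, shape_z, a)
mean_x2 = fmean(x2 for _, x2 in pairs)
corr = pearson(pairs)
expected_mean = (shape1 + shape_z) * a * scale1          # mean of X2
expected_corr = (shape1 / (shape1 + shape_z)) ** 0.5     # a * sd(X1) / sd(X2)
print(f"mean(X2)={mean_x2:.3f} (theory {expected_mean:.3f}), "
      f"corr={corr:.3f} (theory {expected_corr:.3f})")
```

In this construction the latent variable is continuous, so no pairs fall exactly on the line X2 = a*X1, and the correlation is controlled by the share of the shape of X2 that comes from X1.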
26BVG with Linear Associations
28Linearly Related Bivariate Weibull Distributions
Carpenter, Diawara and Han (2005)
- The Latent Variable Z associates X2 to X1.
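The slide's equation is not in the transcript. Assuming the linear relation takes the form X2 = a*X1 + Z with a > 0 and Z nonnegative and independent of X1 (an assumption consistent with the slide title, not a quotation of the authors' model), the basic moment and transform identities follow directly:

```latex
\begin{align*}
X_2 &= a X_1 + Z, \qquad Z \ge 0,\; Z \perp X_1, \\
\mathrm{E}[X_2] &= a\,\mathrm{E}[X_1] + \mathrm{E}[Z], \\
\mathrm{Var}(X_2) &= a^{2}\,\mathrm{Var}(X_1) + \mathrm{Var}(Z), \\
\mathrm{Cov}(X_1, X_2) &= a\,\mathrm{Var}(X_1), \\
\rho(X_1, X_2) &= a\,\frac{\mathrm{sd}(X_1)}{\mathrm{sd}(X_2)} \in (0, 1], \\
L_Z(s) &= \frac{L_{X_2}(s)}{L_{X_1}(a s)},
\qquad L_X(s) = \mathrm{E}\!\left[e^{-sX}\right].
\end{align*}
```

The last identity shows that once the two marginals and a are fixed, the distribution of Z is determined, provided the ratio is a valid Laplace transform.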
29pdf of the Latent Variable Z
31MSE Plot
33Scatter Plots of first/second tumor times
Simultaneous arrivals
34Time to tumor
Treated
Control
35Confidence Ellipsoids
36Conclusions
- Ignoring correlation between events is
inefficient and can lead to difficult
interpretations
- He and Johnson (2004), J. R. S. S., suggest
bivariate location-scale models are more
efficient than working independence.