Title: Min Zhang, PhD
1Estimation and Validation of an Outbreak Simulator
- Min Zhang, PhD
- Xiaohui Kong
- Garrick L. Wallstrom, PhD
- RODS Laboratory
- Department of Biomedical Informatics
- University of Pittsburgh, Pittsburgh, PA
2Background
- Evaluation of detection algorithms
- Real data
- Semi-synthetic
- Outbreak simulation
- Direct simulation of effects
- Disease-specific models
3Outline
- Template-Driven Spatial-Temporal Outbreak Simulator
- Zhang M, Wallstrom GL. Template-driven spatial-temporal outbreak simulation for outbreak detection evaluation. AMIA 2008 Annual Symposium, Washington D.C., Nov. 2008.
- BARD (Bayesian Aerosol Release Detector)
- Hogan WR, Cooper GC, Wallstrom GL, Wagner MM, Depinay J-M. The Bayesian aerosol release detector. Stat Med 2007; 26(29): 5225-52.
- Evaluation of the simulator using BARD data
5Template-Driven Spatial-Temporal Outbreak Simulator
- The template-driven spatial-temporal simulator
- Is a flexible non-disease-specific simulator
- Uses simple simulation methods and minimal parameters
- Simulates either temporal or spatial-temporal event time data
6Temporal outbreak simulation
- Three components for temporal simulation
- Outbreak magnitude C: the expected number of captured outbreak cases during the outbreak
- Temporal template f: a function that describes how the rate of new cases changes over time
- Generation algorithm: three approaches to generate event times according to the user-defined template function
7Temporal template f
- f is defined to be a probability density function that is zero outside of an outbreak interval [0, T)
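Written out in standard notation, the conditions on this slide amount to the following. This is only a restatement, assuming f is a density over time t:

```latex
f(t) \ge 0, \qquad f(t) = 0 \ \text{for } t \notin [0, T), \qquad \int_0^T f(t)\,dt = 1 .
```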
8Generation algorithms
- Deterministic generation
- Create C event times in a regular non-random pattern
- Independent generation
- Draw C random samples according to function f
- Poisson process generation
- Generate event times according to the heterogeneous rate function λ(t) = C f(t)
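The three generation algorithms can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a linear template f(t) = 2t / T² so that the inverse CDF has a closed form, and it assumes the "regular pattern" of deterministic generation means evenly spaced quantiles of f. The function names and the values of T and C are illustrative.

```python
import numpy as np

T = 3.0   # outbreak duration in days (illustrative value)
C = 300   # outbreak magnitude (illustrative value)

def inv_cdf(u):
    """Inverse CDF of the assumed linear template f(t) = 2t / T**2 on [0, T)."""
    return T * np.sqrt(u)

def deterministic_times(c):
    # Assumption: "regular pattern" = evenly spaced quantiles of f.
    u = (np.arange(c) + 0.5) / c
    return inv_cdf(u)

def independent_times(c, rng):
    # Exactly c i.i.d. draws from f via inverse-transform sampling.
    return np.sort(inv_cdf(rng.random(c)))

def poisson_process_times(c, rng):
    # Poisson process with rate lambda(t) = c * f(t): the total count is
    # Poisson(c), and given the count the times are i.i.d. draws from f.
    n = rng.poisson(c)
    return np.sort(inv_cdf(rng.random(n)))

rng = np.random.default_rng(0)
det = deterministic_times(C)
ind = independent_times(C, rng)
poi = poisson_process_times(C, rng)
print(len(det), len(ind), len(poi))
```

Deterministic and independent generation always return exactly C event times, while the Poisson process count varies around C from run to run.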
9Example 1: parameter settings
- Outbreak magnitude
- C = 300 captured cases
- Template function
- a linearly increasing function during [0, T) (T = 3 days)
- Generation algorithm
- Each of the three algorithms
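As a worked illustration of these settings, assume the linear template starts at zero at t = 0 (the slide does not state the intercept). The density, its CDF, and the expected number of captured cases on day k then follow directly:

```latex
f(t) = \frac{2t}{T^2}, \qquad F(t) = \frac{t^2}{T^2}, \qquad 0 \le t < T,
\\[4pt]
C\,\bigl[F(k) - F(k-1)\bigr] = 300 \cdot \frac{2k - 1}{9}
\approx 33.3,\ 100,\ 166.7 \quad \text{for } k = 1, 2, 3 .
```

The three daily expectations sum to C = 300, as they should for a density that integrates to one over [0, T).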
10Example 1: simulation
- Figure 1. Simulated visit times using a linear template function. Hourly-aggregated visit times are created using deterministic (a), independent (b), and Poisson process (c) generation.
11Spatial-Temporal Simulation
- Three components for spatial-temporal simulation
- Outbreak magnitude C: the expected number of captured outbreak cases during the outbreak
- Spatial-temporal template f: a function that describes how the rate of new cases changes over space and time
- Generation algorithm: three approaches to generate event times according to the user-defined spatial-temporal template function
12Spatial-Temporal Template
- f is defined to be a bounded function
13Forms of Spatial-temporal simulation
- General form
- Independent form
- Lagged form
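The formulas for these three forms did not survive in the slide text. One reading that is consistent with the other slides, which refer to a spatial factor, a temporal factor, and a lag function, is sketched below. The factorization and the symbol \ell(s) for the lag are assumptions made for illustration, not the authors' notation.

```latex
\begin{aligned}
\text{General:} \quad & f(s,t) \\
\text{Independent:} \quad & f(s,t) = f_S(s)\, f_T(t) \\
\text{Lagged:} \quad & f(s,t) = f_S(s)\, f_T\bigl(t - \ell(s)\bigr)
\end{aligned}
```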
14Setting f_S
- f_S(s) defines the probability that each captured case is assigned to tract s
- v_s: coverage; n_s: population; r_s: elevated disease risk
- H_s: captured historical non-outbreak cases in tract s; r_s: elevated disease risk
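The formulas that combine these quantities are missing from the slide text. A minimal sketch, assuming the spatial probabilities are proportional to the product of the listed quantities (coverage × population × risk in one variant, historical cases × risk in the other) and then normalized to sum to one. Both weightings and all numeric values below are hypothetical.

```python
import numpy as np

def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    w = np.asarray(weights, dtype=float)
    return w / w.sum()

# Hypothetical values for four tracts (not taken from the slides).
v = np.array([0.6, 0.5, 0.7, 0.4])      # coverage v_s
n = np.array([1200, 800, 1500, 500])    # population n_s
r = np.array([3.0, 2.0, 1.0, 1.0])      # elevated disease risk r_s
H = np.array([40, 25, 60, 10])          # historical non-outbreak cases H_s

# Assumed weighting 1: coverage x population x risk.
f_S_population = normalize(v * n * r)
# Assumed weighting 2: historical cases x risk.
f_S_history = normalize(H * r)
print(f_S_population.round(3), f_S_history.round(3))
```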
15Generation Algorithms
- Deterministic generation
- Distribute C event times in a regular spatial-temporal pattern
- Independent generation
- Determine the number of cases in each tract by simulating one draw from a multinomial distribution over tracts with C trials
- Poisson process generation
- Generate event times for each tract independently according to a Poisson process with rate function λ(t) = C f(s, t)
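The difference between independent and Poisson process generation in the spatial-temporal case can be sketched as follows. The sketch assumes a template that factors into tract probabilities and a common temporal density; the tract probabilities and lognormal parameters are hypothetical, and truncation of the temporal density to the outbreak interval is ignored for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)
C = 4000                                   # outbreak magnitude
f_S = np.array([0.4, 0.3, 0.2, 0.1])       # hypothetical tract probabilities, sum to 1

def draw_times(n, rng):
    # Hypothetical temporal factor: lognormal visit times in days.
    return rng.lognormal(mean=1.6, sigma=0.5, size=n)

# Independent generation: one multinomial draw fixes the per-tract counts,
# so the total is exactly C.
counts_ind = rng.multinomial(C, f_S)
events_ind = [np.sort(draw_times(k, rng)) for k in counts_ind]

# Poisson process generation: each tract gets an independent Poisson count
# with mean C * f_S[s], so the total is random with expectation C.
counts_poi = rng.poisson(C * f_S)
events_poi = [np.sort(draw_times(k, rng)) for k in counts_poi]

print(counts_ind.sum(), counts_poi.sum())
```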
16Example 2: parameter settings
- Outbreak magnitude: C = 4000 cases
- Template function
- r_s: a linearly decreasing function of distance from the outbreak center S_0 = 15213 in the Pittsburgh area
- The lag function: a function of the distance d (in km) from S_0
- f_T(t) is a lognormal function with mean 5.6 days
- Poisson process generation
- Figure: plot of f_T(t) against t
17Example 2 Day 0
18Example 2 Day 1
19Example 2 Day 2
20Example 2 Day 3
21Example 2 Day 4
22Example 2 Day 5
23Example 2 Day 6
24Example 2 Day 7
25Example 2 Day 8
26Example 2 Day 9
27Example 2 Day 10
28Example 2 Day 11
29Example 2 Day 12
30Example 2 Day 13
31Example 2 Day 14
32Example 2 Day 15
33Example 2 Day 16
34Example 2 Day 17
35Example 2 Day 18
36Example 2 Day 19
38BARD (Bayesian Aerosol Release Detector)
- BARD: a disease-specific outbreak simulator
- Diagram labels: Weather Data, Release Parameters, BARD Simulator, ED (Emergency Department) visit data
39BARD Simulation
- Figure: affected zip codes in the Pittsburgh area
41Evaluation of the simulator using BARD data
- The spatial-temporal template f is a bounded function of space s and time t
- Estimate f_S(s) by the proportion of all cases that reside in block group s.
- We model the visit times in each block group by a single lognormal distribution with location-dependent parameters
42Estimation of μ(s) and σ(s)
- We assume that the lognormal parameters μ(s) and σ(s) are smooth functions of space
- Maximum likelihood estimation for each block group
- Computing a spatially-weighted average of the maximum likelihood estimates
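The two estimation steps can be sketched as below. The per-block-group lognormal maximum likelihood estimates are the mean and standard deviation of the log visit times. The smoothing weights are not specified on the slide, so the Gaussian kernel, the scaling by case counts, the bandwidth, and the centroid coordinates are all assumptions made for illustration.

```python
import numpy as np

def lognormal_mle(times):
    """MLE of the lognormal parameters: mean and SD of the log-times."""
    logs = np.log(times)
    return logs.mean(), logs.std()   # np.std defaults to ddof=0, the MLE

def smooth(estimates, coords, counts, bandwidth_km):
    """Spatially weighted average: Gaussian kernel weights scaled by case counts."""
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2 * bandwidth_km ** 2)) * counts[None, :]
    return (w * estimates[None, :]).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(2)
# Hypothetical block-group centroids (km) and simulated visit times.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])
true_mu = np.array([1.5, 1.6, 1.7, 2.0])
samples = [rng.lognormal(m, 0.4, size=200) for m in true_mu]

mle = np.array([lognormal_mle(x) for x in samples])   # columns: mu_hat, sigma_hat
counts = np.array([len(x) for x in samples], dtype=float)
mu_smooth = smooth(mle[:, 0], coords, counts, bandwidth_km=2.0)
sigma_smooth = smooth(mle[:, 1], coords, counts, bandwidth_km=2.0)
print(mu_smooth.round(2), sigma_smooth.round(2))
```

Because each smoothed value is a weighted average with positive weights, it always lies between the smallest and largest raw estimate.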
43Compute P-Values
- Data from 100 BARD simulations in the Pittsburgh region
- Group 1: 0.1 kg release (50 data sets)
- Group 2: 0.5 kg release (50 data sets)
- Compute a Pearson goodness-of-fit test statistic using block groups and days for bins.
- Use Monte Carlo simulation to compute p-values
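A minimal sketch of a Pearson goodness-of-fit test with a Monte Carlo p-value over (block group, day) bins. It assumes the fitted template has already been converted into bin probabilities, that replicate data sets are drawn from a multinomial with the observed total, and that the fitted probabilities are treated as fixed rather than re-estimated per replicate. The probabilities and counts are hypothetical.

```python
import numpy as np

def pearson_stat(observed, expected):
    """Pearson chi-square statistic summed over all (block group, day) bins."""
    mask = expected > 0
    return (((observed - expected) ** 2)[mask] / expected[mask]).sum()

def monte_carlo_p_value(observed, probs, n_sim, rng):
    """Share of simulated data sets whose statistic is at least the observed one."""
    n = int(observed.sum())
    expected = n * probs
    t_obs = pearson_stat(observed, expected)
    sims = rng.multinomial(n, probs.ravel(), size=n_sim).reshape((n_sim,) + probs.shape)
    t_sim = np.array([pearson_stat(x, expected) for x in sims])
    # The +1 terms keep the estimate strictly positive.
    return (1 + (t_sim >= t_obs).sum()) / (n_sim + 1)

rng = np.random.default_rng(3)
# Hypothetical fitted bin probabilities for 3 block groups x 4 days.
probs = np.array([[0.05, 0.10, 0.10, 0.05],
                  [0.05, 0.15, 0.10, 0.05],
                  [0.05, 0.10, 0.15, 0.05]])
observed = rng.multinomial(500, probs.ravel()).reshape(probs.shape)
p = monte_carlo_p_value(observed, probs, n_sim=999, rng=rng)
print(p)
```

A Monte Carlo reference distribution avoids relying on the chi-square approximation, which can be poor when many bins have small expected counts.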
44Results
45Discussion
- Figure panels: 0.1 kg release and 0.5 kg release; axes: P-value and Counts/block
46Summary
- We previously introduced a non-disease-specific simulator for creating outbreak data.
- We conducted a limited validation experiment using simulated releases from BARD.
- The validation experiment yielded mixed results. The simulator is sufficiently flexible to describe some (but not all) simulated releases from BARD.
- Further model validation should include estimation from real outbreak data.
- Despite these results, the simulator is a useful tool for semi-synthetic evaluation of detection algorithms.
47Acknowledgments
- This research was supported by a grant from the Centers for Disease Control and Prevention (R01PH000025). This work is solely the responsibility of its authors and does not necessarily represent the views of the CDC.
- We thank Dr. William Hogan for providing the BARD data, and Dr. Aurel Cami for technical assistance.