Space-Time Modeling and Application to Emerging Infectious Diseases

1
Space-Time Modeling and Application to Emerging
Infectious Diseases
National Health Research Institutes
  • ???

July 26th, 2005
Division of Biostatistics and Bioinformatics
2
Outline
  • Introduction
  • STARMA Models
  • Methods for STARMA Modeling and Software IEAST
  • Modeling Emerging Infectious Diseases using
    STARMA and IEAST
  • Conclusion

3
Introduction
4
Introduction
  • Tobler's First Law of Geography
  • "Everything is related to everything else, but
    near things are more related than distant
    things."

5
Introduction
  • Biological and ecological processes are often
    organized and correlated in both space and time.
  • Why use space-time data and space-time analyses?
  • Various space-time models
  • STKF, KKF, VARMA, STARMA, etc.
  • Why STARMA models?
  • Are emerging infectious diseases the only
    application?

6
Scope of the Work
  • An efficient and robust STARMA modeling method
  • Space-time extensions of optimization algorithm
    and model fitness measures
  • Refinement of the space-time modeling procedure
  • Software development -- IEAST
  • The first general-purpose STARMA modeling and
    analysis software
  • Integrated Environment for Analyzing STARMA
    models
  • Application to the spread of WNV in an epidemic
    in Detroit
  • Modeling and analysis of Dead Crow Data
  • Modeling and analysis of Human Case Data
  • Cross analysis of Human Case Data and Dead Crow
    Data
  • Statistical inferences from these space-time
    analyses

7
STARMA Models
8
Space-Time Variables Evolving over Time
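  • $z_{t,\mathbf{x}}$: some ecological variable at the spatial
    coordinate vector $\mathbf{x}$ at time $t$. $z_{\mathbf{x}}$ forms a time
    series for location $\mathbf{x}$.
  • These time series are not independent, but
    influence each other via spatial proximity.

[Figure: series $z_{t,(0,0)}$, $z_{t,(2,1)}$, $z_{t,(1,2)}$ and $z_{t,(2,2)}$ on a lattice with axes X, Y and time; an additional label reads "random noise".]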
9
General STARMA Models
  • The general STARMA model has the stochastic
    equation
    [Equation: general STARMA model, with its AR terms and MA terms labeled.]
  • Model types
  • STAR model (when $\theta_{k,b} = 0$)
  • STMA model (when $\phi_{k,b} = 0$)
  • Mixed model (when $\phi_{k,b} \neq 0$ and $\theta_{k,b} \neq 0$).

The strength of the autoregressive (AR) components
is measured by $\phi_{k,b}$, and the strength of the
shared moving average (MA) stochastic inputs by
$\theta_{k,b}$.
10
A Useful Form for STARMA Modeling
  • By introducing the spatial weight matrices $W^{(l)}$,
    we can express the general STARMA model in the
    following form
  • This is the equation actually used for the
    implementation of IEAST and applications.

where
  • $l$ is the spatial lag and $k$ is the temporal lag
  • $z_t$ is the observation vector at time $t$
  • $W^{(l)}$ is the weight matrix for the $l$-th spatial order
  • $\phi_{kl}$ are the parameters of the autoregressive terms
  • $\theta_{kl}$ are the parameters of the moving average terms
  • $e_t$ is the random noise vector at time $t$.
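The equation shown on this slide is not preserved in the transcript. As a sketch only, the form conventionally used in the STARMA literature (Pfeifer and Deutsch, 1980) and matching the definitions above would read as follows. The order symbols $p$, $q$, $\lambda_k$ and $m_k$, and the minus sign on the moving average terms, are assumptions taken from that convention, not from the slide.

```latex
z_t = \sum_{k=1}^{p} \sum_{l=0}^{\lambda_k} \phi_{kl} W^{(l)} z_{t-k}
    - \sum_{k=1}^{q} \sum_{l=0}^{m_k} \theta_{kl} W^{(l)} e_{t-k}
    + e_t
```

Under this reading, $W^{(0)}$ is the identity matrix, setting every $\theta_{kl}$ to zero gives a STAR model, and setting every $\phi_{kl}$ to zero gives an STMA model.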
11
Spatial Correlation Structure and Weight Matrices
  • Spatial weight matrices are used to construct the
    spatial correlation structure among locations.
  • The following ordering is an example of the
    definition of spatial correlation structure (up
    to 4th order neighbors) in a 2D system.
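The ordering figure is not preserved here. A minimal sketch of one common convention is given below, assuming a regular lattice, neighbor orders ranked by increasing Euclidean distance (1, sqrt(2), 2, sqrt(5)) and uniform weights within each order. The function name `lattice_weights` is hypothetical and is not part of IEAST.

```python
import numpy as np

def lattice_weights(nrows, ncols, max_order=4):
    """Return [W0, W1, ..., W_max_order] for an nrows x ncols lattice.

    Order-l neighbours of a cell are the cells at the l-th smallest
    distinct Euclidean distance (1, sqrt(2), 2, sqrt(5), ...). Each
    non-empty row is scaled to sum to 1 (uniform weighting); W0 is the
    identity matrix.
    """
    n = nrows * ncols
    rows, cols = np.divmod(np.arange(n), ncols)
    # Squared distance between every pair of cells.
    d2 = (rows[:, None] - rows[None, :]) ** 2 + (cols[:, None] - cols[None, :]) ** 2
    # Distinct squared distances on a lattice: 0, 1, 2, 4, 5, 8, 9, ...
    span = max_order + 1
    levels = sorted({i * i + j * j for i in range(span + 1) for j in range(span + 1)})
    weights = []
    for level in levels[: max_order + 1]:
        mask = (d2 == level).astype(float)
        counts = mask.sum(axis=1, keepdims=True)
        # Cells with no neighbour at this distance keep an all-zero row.
        weights.append(np.divide(mask, counts, out=np.zeros_like(mask), where=counts > 0))
    return weights

W = lattice_weights(3, 3)
```

On the 3x3 example, the centre cell has four first-order neighbors with weight 0.25 each, while a corner cell has only two, each with weight 0.5. Edge cells therefore average over fewer neighbors, which is one way of handling lattice boundaries and is an assumption of this sketch.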

12
Some Limitations of STARMA Modeling
  • Raster based
  • Requires a massive amount of space-time data
  • Models generally may not be fully mechanistic
  • Assumptions
  • Stationarity
  • Spatial Regularity
  • Effects are constant
  • Effects are linearly correlated

13
Methods for STARMA Modeling and Software IEAST
14
Box-Jenkins Modeling Method
[Flowchart: Data → Model Identification → Parameter Estimation → Diagnostic Check → Good? Yes: End. No: Modify Model and repeat.]
15
Model Identification
  • To determine the model type and orders.
  • Conventionally, space-time autocorrelations (i.e.
    STACF/STPACF) are used (Pfeifer and Deutsch,
    1980).
  • In this research, space-time extensions of model
    fitness measures (i.e. AIC, BIC) are used to
    assist identification when the method above does
    not work. These measures are more objective and
    computationally efficient.
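The slides do not give the space-time form of these measures. As a rough sketch, the classical Gaussian AIC and BIC can be applied with the number of observations taken as cells times time steps; that choice, and all numbers below, are assumptions for illustration.

```python
import math

def aic(residual_variance, n_obs, n_params):
    """Akaike information criterion for a Gaussian model (up to a constant)."""
    return n_obs * math.log(residual_variance) + 2 * n_params

def bic(residual_variance, n_obs, n_params):
    """Bayesian information criterion for a Gaussian model (up to a constant)."""
    return n_obs * math.log(residual_variance) + n_params * math.log(n_obs)

# Hypothetical comparison on a 10x10 lattice observed for 28 weeks.
n_obs = 10 * 10 * 28
candidates = {
    "STAR, 2 parameters": (0.95, 2),   # (residual variance, parameter count)
    "STAR, 6 parameters": (0.94, 6),
}
scores = {name: (aic(v, n_obs, k), bic(v, n_obs, k))
          for name, (v, k) in candidates.items()}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
```

With these made-up values the two criteria disagree: AIC prefers the larger model and BIC the smaller one, because BIC penalizes each extra parameter by log(n_obs) rather than 2.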

16
Model Identification using Space-Time
Autocorrelation Functions
  • Example 1: STAR (MaxT = 2, MaxS = 1)
  • STACF tails off
  • STPACF cuts off at T-lag = 2, S-lag = 1
  • Example 2: STMA (MaxT = 1, MaxS = 1)
  • STACF cuts off at T-lag = 1, S-lag = 1
  • STPACF tails off
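A sample STACF of the kind read off in these examples can be sketched as follows. The estimator follows the usual Pfeifer-Deutsch definition as far as we recall it, so the normalization is an assumption, and the data are synthetic white noise on a ring of four sites rather than anything from the slides.

```python
import numpy as np

def stacf(z, weights, max_tlag):
    """Sample space-time autocorrelation rho[l, s] for s = 0..max_tlag.

    z: array (T, N) of mean-removed observations.
    weights: list of (N, N) spatial weight matrices, weights[0] = identity.
    """
    T = z.shape[0]
    lagged = [z @ W.T for W in weights]          # row t holds W^(l) z_t
    var0 = np.sum(z * z) / T                     # variance at spatial order 0
    rho = np.zeros((len(weights), max_tlag + 1))
    for l, zl in enumerate(lagged):
        varl = np.sum(zl * zl) / T               # variance at spatial order l
        for s in range(max_tlag + 1):
            # Covariance between W^(l) z_t and z_{t+s}.
            cov = np.sum(zl[: T - s] * z[s:]) / (T - s)
            rho[l, s] = cov / np.sqrt(varl * var0)
    return rho

# Illustrative data: white noise on a ring of 4 sites (not the WNV data).
rng = np.random.default_rng(0)
z = rng.standard_normal((2000, 4))
z -= z.mean()
eye = np.eye(4)
ring = (np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1)) / 2
rho = stacf(z, [eye, ring], max_tlag=3)
```

For white noise every entry at a non-zero temporal lag should be close to zero. For a STAR process the entries would instead decay gradually ("tail off"), and for an STMA process they would drop to near zero beyond the model order ("cut off").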

17
Model Identification using Space-Time
Autocorrelation Functions
18
Model Identification using Model Fitness
Measures
Accuracies (numbers in red) of model type
selection using (1) variance of residuals, (2) AIC,
(3) BIC, and (4) AIC+BIC, based on 150 Monte Carlo
simulated datasets
19
Parameter Estimation
  • To calculate coefficients of a candidate model
    for given model type and orders.
  • Two methods are needed for two kinds of models
  • Linear models (i.e. STAR): linear ML estimator.
  • Non-linear models (i.e. STMA and Mixed):
    multivariate nonlinear optimization.
  • The multivariate and non-linear nature raises
    problems during optimization
  • Convergence to local optima
  • Very time-consuming
  • A good starting point is crucial for optimization
  • Extra step: pre-estimation
  • The space-time extended Hannan-Rissanen algorithm is
    used.
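For the linear (STAR) case, a minimal sketch of the estimation step is shown below. It assumes Gaussian noise, under which the conditional ML estimate coincides with ordinary least squares on the lagged, spatially weighted regressors. The function `fit_star` and the simulated ring lattice are hypothetical and are not IEAST code.

```python
import numpy as np

def fit_star(z, weights, p):
    """Least-squares estimate of phi[k-1, l] in a STAR model with p time lags.

    z: (T, N) mean-removed data; weights: [W0, ..., W_L] with W0 = identity.
    Every temporal lag k = 1..p uses the same spatial orders 0..L.
    """
    T, N = z.shape
    columns = []
    for k in range(1, p + 1):
        for W in weights:
            # Regressor W^(l) z_{t-k} for t = p..T-1, flattened over space.
            columns.append((z[p - k : T - k] @ W.T).ravel())
    X = np.column_stack(columns)
    y = z[p:].ravel()
    phi, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ phi
    return phi.reshape(p, len(weights)), resid.var()

# Simulated check (illustrative values, not estimates from the slides).
rng = np.random.default_rng(1)
n_sites, n_steps = 25, 400
eye = np.eye(n_sites)
ring = (np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1)) / 2
z = np.zeros((n_steps, n_sites))
for t in range(1, n_steps):
    z[t] = 0.4 * z[t - 1] + 0.3 * ring @ z[t - 1] + rng.standard_normal(n_sites)
phi_hat, noise_var = fit_star(z, [eye, ring], p=1)
```

The fitted coefficients should land near the simulated values 0.4 and 0.3. Models with MA terms have no such closed form, which is why the slide calls for nonlinear optimization started from a pre-estimate.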

20
Diagnostic Check
  • To decide the adequacy of a candidate model for
    representing the given data.
  • Methods
  • Variance of residuals
  • Space-time autocorrelations of residuals
  • Significance testing of parameters
  • Space-time extension of AIC/BIC

21
Modeling Procedures
[Flowchart (Box-Jenkins method): Data → Model Identification → Parameter Estimation → Diagnostic Check → Good? Yes: End. No: Modify Model and repeat.]
22
Software for STARMA Modeling -- IEAST
  • Developed using GNU Octave v2.1.40; runs under
    various popular operating systems, e.g. MS Windows,
    Mac OS, Unix.
  • Two interfaces: menu-driven mode and programming
    mode.
  • Features
  • True spatio-temporal analysis software
  • Analyzing 2D lattice space-time datasets
  • Full configurability
  • Programming environment
  • Improved estimation algorithms
  • Improved diagnostic measures
  • Estimation of spatial correlation structure
  • Cross correlation analysis
  • 2D/3D plotting abilities

23
IEAST Menu-Driven Mode vs Programming Mode
In menu-driven mode, users can conduct the
modeling procedure by selecting a series of
commands/options from the menu hierarchy.
24
IEAST Menu-Driven Mode vs Programming Mode
In programming mode, a set of sophisticated
instructions can be used to compose programs to
control the modeling flow and to conduct
statistical analyses.
25
Modeling Emerging Infectious Diseases using
STARMA and IEAST
26
State of the Art for Statistical Analyses of Emerging
Infectious Diseases
  • As far as we know, no true spatial-temporal
    statistical models and methods have been used.
  • Space-time cluster analysis is available
    (Theophilides et al., 2003; Mostashari et al.,
    2003; Hoebe et al., 2004)
  • Spatial models are available (Watson et al., 2004).
  • Temporal models are available.

27
Limitations of Simply Observing How a Spatial
Distribution Changes over Time
  • For example, expansion of the leading edge of a
    disease range.
  • Is the disease spreading directly over long
    distances but infrequently, or over short
    distances frequently?
  • This is important for projecting the future
    spread.

28
STARMA Has Potential for the Early
Characterization of Infectious Diseases.
  • STARMA acts as a prism: it can separate the
    spatial-temporal correlations into direct effects
    with known magnitude and spatial and temporal
    lags.
  • Not generally a complete, mechanistic model, but
    puts critical constraints on models.

29
West Nile Virus
  • West Nile virus (WNV) was first detected
    in a woman with a mild fever in the West Nile
    District of Uganda in 1937. Since then WNV has
    spread to North Africa, Europe, West and
    Central Asia, and the Middle East.

30
West Nile Virus in the United States
  • Outbreak in NYC in Sep 1999. Vector is Culex
    mosquitoes.
  • Wild birds (89% are American crows) are the
    principal hosts. Humans, horses, etc. are
    incidental hosts.
  • The incidence rate among crows is high. Infected
    crows almost always die (68).
  • Surveillance of dead crows has been used as an
    indicator of WNV epidemics.

31
Dead Crow Data (DCD) and Human Case Data (HCD)
in 2002
  • Time: summer of 2002 (April-October)
  • Place: Detroit metro area (Oakland, Macomb, and
    Wayne counties)
  • DCD were collected systematically before and
    during an outbreak among humans. Data mainly
    consisted of locations and dates of reported
    public sightings.
  • HCD were obtained from clinicians in Michigan.
    Data on address of residence and date of onset of
    disease were obtained from the case-patient or
    attending physician through telephone interviews.

32
Two Datasets Collected in 2002
[Diagram of data collection and processing. Labels: Human Cases, Interview, Dead Crows, Toll-free, WWW pages, Data Cleaning, Geocoding, Longitude/Latitude, GIS - ArcMap. Image from www.rci.rutgers.edu/ insects/crowid.htm]
33
Space-Time Analysis for Dead Crow Data
34
The Dead Crow Data
  • In total, 1817 dead crow sightings, scattered
    within the three counties (red lines) and spanning
    28 weeks.
  • Covered area (after truncation): a rectangular
    area of 31.6 x 25.8 mi
  • Divide the covered area into 10x10 cells. Cell
    size: 3.16 x 2.58 mi
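The gridding step above can be sketched as follows. Coordinates are assumed to be miles measured from the south-west corner of the truncated rectangle and weeks to be numbered 0 to 27. These conventions, the function name `grid_counts` and the four sample points are hypothetical.

```python
import numpy as np

WIDTH_MI, HEIGHT_MI = 31.6, 25.8   # covered area from the slide
N_WEEKS, N_CELLS = 28, 10          # 28 weeks, 10x10 cells

def grid_counts(x_mi, y_mi, week):
    """Count sightings per (week, row, column) cell."""
    sample = np.column_stack([week, y_mi, x_mi])
    edges = [np.arange(N_WEEKS + 1),
             np.linspace(0.0, HEIGHT_MI, N_CELLS + 1),
             np.linspace(0.0, WIDTH_MI, N_CELLS + 1)]
    counts, _ = np.histogramdd(sample, bins=edges)
    return counts.astype(int)

# Four made-up sightings: (x, y) in miles and week index.
x = [0.5, 3.2, 3.5, 30.0]
y = [0.5, 0.1, 0.2, 25.0]
w = [0, 0, 0, 27]
counts = grid_counts(x, y, w)
cell_width = WIDTH_MI / N_CELLS     # 3.16 mi
cell_height = HEIGHT_MI / N_CELLS   # 2.58 mi
```

The result is a 28 x 10 x 10 array of counts, which is the lattice form that a STARMA analysis takes as input.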

35
Spatial Correlation Structure and Trends
  • Spatial correlation structure (uniform weighting)
  • Preprocessing
  • Remove spatio-temporal trend
  • Spatial trend: 4th-order polynomial regression
    trend surface
  • Temporal trend: averaging over space.
  • Remove mean
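A possible reading of this preprocessing is sketched below. The order of the steps, the fitting of the trend surface to the time-averaged map and the rescaling of cell coordinates to [-1, 1] are all assumptions, since the slide only names the ingredients.

```python
import numpy as np

def detrend(counts, degree=4):
    """counts: (T, R, C) array -> residuals with trends and mean removed."""
    T, R, C = counts.shape
    z = counts.astype(float)
    # Temporal trend: the average over all cells in each week.
    z = z - z.mean(axis=(1, 2), keepdims=True)
    # Spatial trend: polynomial surface fitted to the time-averaged map.
    r, c = np.meshgrid(np.linspace(-1, 1, R), np.linspace(-1, 1, C), indexing="ij")
    terms = [(r ** i * c ** j).ravel()
             for i in range(degree + 1) for j in range(degree + 1 - i)]
    X = np.column_stack(terms)
    coef, *_ = np.linalg.lstsq(X, z.mean(axis=0).ravel(), rcond=None)
    z = z - (X @ coef).reshape(1, R, C)
    # Remove whatever overall mean is left.
    return z - z.mean()

# Synthetic check: a pure trend (linear in time, quadratic in space)
# should be removed completely.
t = np.arange(6)[:, None, None]
row = np.arange(10)[None, :, None]
col = np.arange(10)[None, None, :]
trend_only = 2 * t + row ** 2 + 3 * col
residual = detrend(trend_only)
```

A surface of total degree 4 has 15 coefficients, so on a 10x10 grid it is fitted to 100 cell averages.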

36
Model Identification: STACF
[Figure: STACF of the dead crow data. STACF tails off.]
37
Model Identification: STPACF
38
Parameter Estimation
The parameters ($\phi_{ts}$) of this STAR model can be
estimated in IEAST by the linear maximum likelihood
estimator.
  • Values in dark blue are nominally significant at
    the 0.001 level.
  • Values in light blue are nominally significant
    at the 0.01 level.

39
Diagnostic Check
  • Statistical significance of parameters
  • The probabilities P that the $\phi_{ts}$ are not significant
    are
  • Residual autocorrelations

[Figure panels: STACF and STPACF of the residuals.]
40
Interpretations for the DCD Analysis
  • The STAR(3,4) model is the best-fitting one.
  • The largest spatial and temporal lags that are
    important are smaller still: S = 2 (or 6.4 km) and
    T = 2 weeks.
  • Compare S = 1 to S = 2: the value for S = 1 is much
    larger (cell boundary length effects).
  • The virus is not spreading very far very fast.
    Crows are not spreading the virus much spatially,
    though they probably are amplifying it locally.
  • Negative autoregressive effect at S = 1 and T = 2, 3.
  • Appears to be a real effect.
  • May be due to crow population depletion.
  • Suggests there is a mixture of two STAR
    processes, the dominant one reflecting
    probability of infection, the other an echo
    effect from depletion.

41
Additional Analyses and Results
  • Additional Analyses
  • Using 20x20 and other cell configurations
  • Using different lag structures: Pfeifer's vs.
    ring structure
  • Using various polynomials for spatial de-trending
  • Using a sub-sample of the data
  • Results
  • Consistent over various methods of spatial
    de-trending, except that high-order polynomials
    resulted in smaller AR values.
  • Consistent AR values using different lag
    structures and cell sizes.
  • Consistent implied spatial and temporal scales
    over which there are significant or substantial
    AR effects

42
Distances over Which There Is Significant Spatial
Correlation
  • Based on different cell configurations: 10x10,
    16x16, and 20x20
  • The effective correlated area in the modeling
    result is consistently about 10.75 km regardless
    of cell sizes.

43
Alternative Spatial Correlation Structures
[Figure panels: ring structure; Pfeifer's structure.]
44
Space-Time Analysis for Human Case Data
45
Human Case Data
  • Over 500 human cases spanning 13 weeks
  • Date of onset: converted to week
  • Home addresses (names stripped): converted to
    cell, same as for DCD.
  • Used the same arrays of cell sizes and spatial
    correlation structures as for DCD.
  • Used the same spatial and temporal de-trending method

46
Model Identification: STACF
47
Model Identification: STPACF
48
Parameter Estimation
[Table: parameter estimates indexed by spatial lag and temporal lag (weeks).]
  • Values in dark blue are nominally significant at
    the 0.001 level.
  • Values in light blue are nominally significant
    at the 0.01 level.

49
Diagnostic Check
  • Residual STACF and STPACF

[Figure panels: STACF and STPACF of the residuals.]
50
Interpretations for the HCD Analysis
  • Most people are getting infected at or near their
    homes.
  • The incidences are highly autocorrelated in space
    and time.
  • The distribution or probability of infection is
    highly localized.
  • The WNV load and probability of human infection
    are spreading slowly, in the sense of not
    spreading very far very fast.
  • Suggests localized spraying could reduce cases.
  • Without a depletion effect, the human case data
    show AR values that are positive and significantly
    above zero for T-lag = 2 and S-lag >= 1, especially
    at S-lag = 1.

51
Space-Time Cross Analysis for HCD and DCD
52
Space-Time Data HCD and DCD
  • The areas for cross analysis are the same for both
    datasets.
  • The configuration is again 10x10 cells, spanning 28
    weeks.
  • Cell size is 6.31x6.31 km.

53
Both Temporal Epidemic Curves
  • Dead crow reports lead human cases in time.

54
Space-Time Cross Correlations
[Figure: space-time cross correlations, with a marker at temporal lag -3.]
55
Interpretations for Space-Time Cross Correlations
  • Drop smoothly to zero spatially and temporally.
  • Very large (as high as 0.7).
  • Across all spatial lags, the max. cross
    correlations are aligned at 3 weeks.
  • The cross correlations at spatial lag 1 are
    slightly greater than at spatial lag 0.
  • When temporal lag decreases to 8 or below, the
    correlations between these two datasets are
    negligible (< 0.1).
  • When spatial lag increases up to 10, the cross
    correlations are reduced to as low as 0.2.
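How a lead time is read off such cross correlations can be sketched on synthetic data. The function below handles spatial lag 0 only, and the "crows" and "humans" arrays are random numbers built so that one series trails the other by three weeks. Nothing here reproduces the study's figures.

```python
import numpy as np

def cross_correlation(x, y, max_lag):
    """Correlation between x[t] and y[t + s] for s = 0..max_lag.

    x, y: (T, N) arrays defined on the same weeks and cells.
    """
    x = x - x.mean()
    y = y - y.mean()
    T = x.shape[0]
    scale = np.sqrt(np.mean(x * x) * np.mean(y * y))
    return np.array([np.mean(x[: T - s] * y[s:]) / scale
                     for s in range(max_lag + 1)])

# Synthetic example: the "humans" series repeats the "crows" series
# three weeks later, plus a little noise.
rng = np.random.default_rng(2)
base = rng.standard_normal((31, 100))
crows = base[3:]                                   # 28 weeks x 100 cells
humans = base[:-3] + 0.1 * rng.standard_normal((28, 100))
cc = cross_correlation(crows, humans, max_lag=6)
peak_lag = int(np.argmax(cc))
```

Here `peak_lag` recovers the built-in delay of three weeks. Other spatial lags could be examined by applying a weight matrix to one of the series before correlating.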

56
Are the Cross Correlations Spurious?
The autocorrelation of the DCD can spuriously
contribute to cross correlations. To eliminate
this effect, both datasets were pre-whitened
before calculating cross correlations.
  • The result shows that the real cross
    correlations are much larger than the spurious
    components.
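The slides do not say which filter was used for pre-whitening. A minimal sketch of the idea, assuming a single AR(1) filter fitted to one series and applied to both, is shown below on synthetic data. In the actual analysis a fitted STARMA model would presumably play this role.

```python
import numpy as np

def prewhiten(x, y):
    """Filter both (T, N) series with an AR(1) model fitted to x."""
    x = x - x.mean()
    y = y - y.mean()
    # Pooled least-squares estimate of the lag-1 coefficient of x.
    a = np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)
    return x[1:] - a * x[:-1], y[1:] - a * y[:-1], a

# Synthetic example: x is strongly autocorrelated, y is unrelated noise.
rng = np.random.default_rng(3)
n_steps, n_sites = 200, 50
x = np.zeros((n_steps, n_sites))
for t in range(1, n_steps):
    x[t] = 0.8 * x[t - 1] + rng.standard_normal(n_sites)
y = rng.standard_normal((n_steps, n_sites))
xw, yw, a = prewhiten(x, y)
# Lag-1 autocorrelation left in the filtered x series.
lag1 = np.corrcoef(xw[1:].ravel(), xw[:-1].ravel())[0, 1]
```

After filtering, the driving series is close to white noise, so any cross correlation that remains between `xw` and `yw` cannot be attributed to its autocorrelation.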

57
Summary for Modeling the Spread of WNV
  • Crows are not spreading the disease spatially
    very far very fast.
  • Spread is very localized, perhaps other animals
    or the mosquitoes themselves are spreading it
    spatially.
  • Humans are being infected largely at or near
    their homes.
  • Both crows and humans appear to be responding to
    local viral loads.
  • Dead crow findings precede human cases by two to
    three weeks. Dead crows can be a good indicator
    of human epidemics.

58
Conclusion
  • It appears that STARMA modeling could be an
    important tool for the early characterization of
    many emerging and re-emerging infectious disease
    epidemics.
  • During the course of an epidemic, it could be
    used (in principle) for forecasting, under
    existing conditions or under potential courses of
    action.
  • While not generally a mechanistic model, STARMA
    does inform spatial and temporal scales of
    spread, hence places constraints on mechanistic
    models (which otherwise may have too many
    parameters).

59
Funding Acknowledgements
  • Michigan Agricultural Experiment Station,
    Michigan State University.
  • Center for Emerging Infectious Diseases, Michigan
    State University.
  • Centers for Disease Control and Prevention, USA.

60
  • Thanks for your attention!
  • Questions?

61
References
  • C.J.P.A. Hoebe, H. de Melker, L. Spanjaard, J.
    Dankert, and N. Nagelkerke. Space-time cluster
    analysis of invasive meningococcal disease.
    Emerging Infectious Diseases, 10(9):1621-1626,
    2004.
  • C.N. Theophilides, S.C. Ahearn, S. Grady, and M.
    Merlino. Identifying West Nile virus risk areas:
    The dynamic continuous-area space-time system.
    American Journal of Epidemiology, 157:843-854,
    2003.
  • J. Watson, R. Jones, K. Gibbs, and W. Paul. Dead
    crow reports and location of human West Nile
    virus cases, Chicago, 2002. Emerging Infectious
    Diseases, 10(5):938-940, 2004.
  • F. Mostashari, M. Kulldorff, J.J. Hartman, J.R.
    Miller, and V. Kulasekera. Dead bird clustering: A
    potential early warning system for West Nile
    virus activity. Emerging Infectious Diseases,
    9:641-646, 2003.