Title: Surveillance and spatial analysis of Infectious Diseases
1 Surveillance and spatial analysis of Infectious Diseases
- Computer Science Colloquium
- The University of Iowa, 2/2/07
Uriel Kitron, Dept. of Pathobiology and Center for Zoonoses Research, University of Illinois
2 Elements of disease surveillance and control
- Data sources
- Data layers
- Data storage and management
- Data integration
- Data analysis
- Data visualization
- Application and dissemination
3 Sources and existing tools for spatial data
- Field data
- Surveillance data
- Environmental data
- Data Analysis
- Approaches
- GPS
- GIS
- Remote sensing
- Spatial statistics, time series, dynamic models
- Landscape ecology and epidemiology
- Metapopulation biology
- Ecological risk assessment
SCALE
4 Geographic Information Systems (GIS)
A system to capture, manage, manipulate, analyze, model and display spatially referenced data for research, management and planning
Human cases
Human settlements
Water bodies
Soil type
Vegetation data
Adult mosquitoes
5 Spatial Statistics and Geostatistics
- Global clustering (spatial autocorrelation, K function, join counts)
- Local clustering (hot spots)
- Interpolation: smoothing and kriging
- Spatial filtering (screening of spatial components)
- Spatial-temporal processes
6 First Law of Geography (Tobler 1979)
- "Everything is related to everything else, but near things are more related than distant things."
Calculation of Spatial Statistics
Based on giving weight to the distances between items of interest
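One way to make distance weighting concrete is a global spatial autocorrelation measure. The sketch below computes Moran's I with inverse-distance weights. The slides do not say which weighting scheme was used, so the inverse-distance choice, the function names and the example data are all illustrative assumptions.

```python
import numpy as np


def inverse_distance_weights(coords, power=1.0):
    """Weight each pair of locations by 1 / distance**power (0 on the diagonal)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    weights = np.zeros_like(dist)
    separated = dist > 0  # skips self-pairs and co-located points
    weights[separated] = 1.0 / dist[separated] ** power
    return weights


def morans_i(values, weights):
    """Global Moran's I: positive when nearby values tend to be similar."""
    x = np.asarray(values, dtype=float)
    z = x - x.mean()
    return (x.size / weights.sum()) * (z @ weights @ z) / (z @ z)


# Hypothetical data: two tight groups of sites, one with high counts, one with none.
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
counts = [10, 10, 10, 0, 0, 0]
print(morans_i(counts, inverse_distance_weights(coords)))
```

A positive value indicates that similar counts sit near each other; assessing significance would additionally need a permutation test or an analytical variance, which this sketch omits.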
7 Role of GIS, remote sensing and spatial analysis in VBD research
- Analysis of transmission dynamics on multiple scales
- Consideration of complex role of landscape and climate
- Development of predictive spatial models and risk maps
SCALE
8 Temporal and Spatial Scale and Resolution
- Geographic: ranging from the village/town to the continental level
- Temporal: ranging from the duration of an outbreak, through the seasonal, to multi-year models
- Multiple scales can be considered simultaneously or in succession, but with caution
9 Some examples: Vector-borne zoonoses
- 1. West Nile virus: introduction and distribution in an urban area
- 2. Chagas disease: distribution, habitat modification and role of various zoonotic hosts
- 3. Lyme disease: spread and distribution of vector and pathogen
- (4. Malaria: space-time association of cases in Trinidad)
10 Zoonotic Vector-borne diseases (VBD) transmission system
Humans (domestic animals)
Vector
Pathogen
Reservoir Host (wildlife)
Environment
11 Prerequisites for an active zoonotic VBD focus
- Vector survival
- Presence of reservoir hosts
- Pathogen transmission
- Opportunities for human/animal exposure
12 1. West Nile virus: Eco-epidemiology of disease emergence in urban areas
- Develop a spatial model and risk maps based on
- demographic and environmental risk factors for WNV and SLE in birds, mosquitoes and humans
- reservoir capacity and differential effects of WNV on various bird species
- anthropogenic features of the urban environment that support Culex mosquito production, mosquito-bird transmission and virus amplification
- Dynamics of viral transmission over space and time using molecular evolutionary and phylogeographic techniques
Funded by NSF/NIH Ecology of Infectious Disease Program
13 Research Team
- Co-Investigators
- University of Illinois
- Uriel Kitron
- Marilyn Ruiz
- Tony Goldberg
- Jeff Brawn
- Scott Loss
- Michigan State University
- Edward Walker
- Gabe Hammer
- Collaborators
- Audubon Chicago Region
- Karen Glennemeier
- Judy Pollack
- Illinois Department of Public Health
- Constance Austin
- Linn Haramis
- Illinois State Water Survey
- Kenneth Kunkel
14 Chicago
16 West Nile Virus in Illinois
- 2001: 123 positive bird specimens, 0 human cases
- 2002: 884 human cases, 66 deaths, more than any other state that year (U.S.: 4,156/284)
- Over 680 cases occurred in Chicago and surroundings
- 2003: 54 human cases, 1 death (U.S.: 9,862/264)
- 2004: 60 human cases, 4 deaths (U.S.: 2,539/100)
- 2005: 252 human cases, 12 deaths (U.S.: 3,000/119)
- 2006: 210 human cases, 9 deaths (U.S.: 4,180/149)
2002, 2005, 2006: hot and dry
(Figure panels: 2002, 2003)
17 (Figure panels: 2002, 2003, 2004, 2005, 2006)
18 Locations of human WNV cases in 2002 with land cover
Human WNV case rate per 10,000 people
19 Smoothed Map of Disease Cases summarized by 1,196 1.8-km hexagons
Local Spatial Autocorrelation of Cases: LISA statistic (Anselin)
Range: 0-15 cases/cell
20 WNV Human Cases with Housing Density
- Human cases tend to be outside of the more densely populated urban core.
- 3 areas with most cases (circled on map):
- 1) in the south, near Oak Lawn
- 2) in the north, around Skokie
- 3) southwest of Skokie
21 Vegetation
Physiographic Region
22 Dominant patterns in the Chicago urban landscape
- Each different colored area represents a place with a common set of factors related to housing, vegetation, socio-economics, and land use
Ruiz et al., Int'l J Health Geog 2005
23 Urban Type 5, dominated by 1940s, 1950s, and 1960s housing. Mostly white, moderate vegetation and moderate population density. 435 cases (64%) were in this group, 2.27 cases per 10,000 people (RR > 3.5). (All other types < 0.65 cases per 10,000)
24 2005 Field Sites
25 Site 3: Oak Lawn North
Green site: Saint Casimir's Cemetery
Residential site
26Avian Host Community
- Bird Surveys
- Line transect bird surveys during May and June
- Bird Mist-netting
- 6-8 nets/morning from sunrise to noon during May to October
- Seropositivity of Captured Birds
- ELISA
- Virus Detection in Captured Birds - RT-PCR
27 Overall Prevalence: 19.9% (n = 1,062)
28 Vector Community
- Adult Mosquito Trapping - MIR (minimum infection rate)
- Light trap, gravid trap, aspirator
- Quantification of Mosquito Productivity
- Catch basins, containers
- Index of Culex Density
- Ovitraps
- Mosquito Bloodmeal Analysis
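The MIR is conventionally the number of positive mosquito pools divided by the total number of mosquitoes tested, scaled to 1,000. The sketch below applies that definition; the pool sizes and test results are hypothetical and are not data from this study.

```python
def minimum_infection_rate(pool_sizes, pool_results, per=1000):
    """Return positive pools per `per` mosquitoes tested.

    The MIR assumes each positive pool contains exactly one infected mosquito,
    so it is a lower bound on the true infection rate.
    """
    if len(pool_sizes) != len(pool_results):
        raise ValueError("need one test result per pool")
    total_tested = sum(pool_sizes)
    if total_tested == 0:
        raise ValueError("no mosquitoes tested")
    positive_pools = sum(1 for result in pool_results if result)
    return per * positive_pools / total_tested


# Hypothetical week of trapping: four pools, one of them WNV-positive.
weekly_pool_sizes = [50, 50, 25, 25]
weekly_pool_results = [True, False, False, False]
print(minimum_infection_rate(weekly_pool_sizes, weekly_pool_results))
```

Grouping pools by week and by watershed before calling the function would give a weekly series of the kind plotted later in the talk.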
30 Ultimately, our goal is to be able to explain and predict
2005 outbreak
Mosquito pool WNV test results
Human WNV cases, 10/3/2005: 185/197 in greater Chicago
31 Weekly 2005 mosquito infection rate by watershed and cases of human illness in Cook and DuPage Counties, Illinois. Human illness cases are preliminary data and should not be considered authoritative.
33 Important Processes Behind the Cluster Patterns
- Ecological
- Mosquito and bird habitat suitability
- Housing, landscape and catch basins
- Socioeconomic
- Lifestyle
- Access to healthcare, biased reporting
- Race, income
- Mosquito Abatement Districts
- Control methods
- Geographic location
34 2. Eco-Epidemiology of Chagas Disease in northwest Argentina
- Univ. of Buenos Aires, Argentina
- National Vector Control Program, Argentina
- Instituto Fatala Chabén, Argentina
- CNRS-IRD, France
- Rockefeller University, NY, USA
- CDC, USA
- Univ. of Illinois
Supported by NIH/NSF EID Program through FIC
35 Life cycle of Trypanosoma cruzi
36 Eco-Epidemiology of Chagas Disease in Northwest Argentina: study area
Departamento Moreno
Landsat Thematic Mapper
Santiago del Estero Province
Amamá
37 Typical compound with home and multiple peridomestic structures
38 Peridomestic structures: refuge for bugs and sources for reinfestation
Pig corral
Storeroom
Goat corral
39 Mapping and geostatistical tools
Sketch maps made in the field during 1993-2002
Ikonos satellite imagery (1-4 m resolution)
Digital map for each village
Joining of attribute data to a GIS file
Clusters of high infestation and potential
sources of community reinfestation
SPATIAL STATISTICS
40 Georeferencing: relating infestation data to locations
41 Reinfestation by T. infestans (5 years post-spraying)
42 Gi(d) local spatial statistic
- $G_i(d) = \dfrac{\sum_j w_{ij}(d)\, x_j}{\sum_j x_j}$
- $w_{ij}(d)$ is a spatial weights matrix with values of one for all links within distance $d$ of a given $i$
- Concern about multiple comparisons (need to adjust significant z value)
We used Gi(d) to detect local and focal clustering of infestations (number of bugs per structure)
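The statistic can be sketched in a few lines. This is an illustration only, not the authors' code: it assumes binary weights within distance d, excludes site i from both sums (as in the Getis-Ord Gi form, in contrast to Gi*), and uses made-up coordinates and bug counts. The standardized z value and the multiple-comparison adjustment mentioned on the slide are not included.

```python
import numpy as np


def local_g(coords, x, d):
    """Gi(d) for every site: share of the total (excluding site i) found within d of i."""
    coords = np.asarray(coords, dtype=float)
    x = np.asarray(x, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    w = (dist <= d).astype(float)  # 1 for links within distance d, 0 otherwise
    np.fill_diagonal(w, 0.0)       # assumption: site i is not its own neighbor
    numerator = w @ x              # sum_j w_ij(d) * x_j
    denominator = x.sum() - x      # sum_j x_j over j != i (must be non-zero)
    return numerator / denominator


# Hypothetical structures (coordinates in meters) and bug counts per structure.
structures = [(0, 0), (1, 0), (10, 0)]
bugs = [5, 3, 2]
print(local_g(structures, bugs, d=2))
```

Large values of Gi(d) flag sites surrounded by high counts; in practice each value would be compared with its expectation under spatial randomness.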
43 FOCAL ANALYSIS OF REINFESTATION IN AMAMÁ
Primary source of T. infestans: 1993
Subsequent infestations were clustered around an initial focus at a distance of 450 m.
Potential secondary sources fell within the range of the clustering around the primary source.
Cecere et al. 2005. Am. J. Trop. Med. Hyg., 71(6): 803-810.
44 Moving upscale: including other villages. Internal and external sources of reinfestation.
Trinidad
Mercedes
External sources: villages not sprayed and located within 1,500 m of the treated villages.
Cecere et al. EID, 2006
45 RECOMMENDATION
An effective control program at the community level would entail residual spraying with insecticides of the colonized site and all sites within a radius of 450 m, and all communities within 1,500 m of the target community, in order to prevent the subsequent propagation of T. infestans.
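The two radii translate directly into a distance query. The sketch below assumes planar coordinates in meters (for example UTM); the site and village names, their positions and the helper function are hypothetical.

```python
import math

SITE_RADIUS_M = 450         # radius around a colonized site (from the recommendation)
COMMUNITY_RADIUS_M = 1500   # radius around the target community (from the recommendation)


def within_radius(origin, candidates, radius_m):
    """Return the names of candidates no farther than radius_m from origin."""
    ox, oy = origin
    return sorted(
        name
        for name, (x, y) in candidates.items()
        if math.hypot(x - ox, y - oy) <= radius_m
    )


# Hypothetical colonized site and surrounding sites, in meters.
colonized_site = (0.0, 0.0)
sites = {"goat corral A": (120.0, 80.0), "storeroom B": (400.0, 300.0), "house C": (300.0, 200.0)}
print(within_radius(colonized_site, sites, SITE_RADIUS_M))

# Hypothetical target community centroid and neighboring villages, in meters.
target_community = (0.0, 0.0)
villages = {"village X": (900.0, 1000.0), "village Y": (1400.0, 900.0)}
print(within_radius(target_community, villages, COMMUNITY_RADIUS_M))
```

In a GIS the same selection would normally be done with buffers on projected layers; the plain distance test is used here only to keep the example self-contained.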
46 STUDY AREAS OVER TWO DECADES: AMAMÁ (RED), CORE (BLUE), PERIPHERAL (GREEN), 2,500 SQ. KM.
(Figure labels. Years: 1985, 1988, 1992, 2002. Study sizes: 40, 130, 300 and 600 houses.)
47 Moreno Department
5,439 houses, 2,911 rural houses, 275 villages, 25,000 inhabitants.
Vazquez-Prokopec et al.
48 Department-level clustering of infestation
Vazquez-Prokopec et al.
Significantly associated with human population density, density of rural houses, density of houses with dirt floors
49 HETEROGENEITY AT VARIOUS LEVELS
- BUG ABUNDANCE AT DOMESTIC AND PERIDOMESTIC SITE LEVELS
- HOST INFECTION: 'INFECTED HOUSEHOLDS'
- HOST INFECTIVITY TO BUGS: 'SUPERSPREADERS'
- SPATIAL DISTRIBUTION OF INFECTED DOGS WITHIN AND AMONG VILLAGES => 'HOTSPOTS OF TRANSMISSION'
- => VERY FOCAL TRANSMISSION
50 3. Foci of tick-borne diseases: Predicting Lyme Disease Risk
- M Guerra, R Cortinas, C Jones, E Grijalva, U Kitron, UI
- E Walker, MSU; S Paskewitz, A Stancil, UW
- L Beck, M Bobo, B Wood, NASA CHAART
51 Background
- Lyme disease, caused by Borrelia burgdorferi, is the most common human vector-borne disease in the United States.
- In the eastern and central United States, the tick Ixodes scapularis is responsible for transmitting the bacteria to humans.
- On the Pacific Coast, the bacteria are transmitted to humans by Ixodes pacificus.
52 Biology of the vector
53 Reservoir Hosts (for ticks and bacteria) and dispersal agents
54 Question
- Where Lyme disease is absent:
- Is it the tick,
- the vertebrate host,
- or just a matter of time?
(Map labels: MI, WI, MN, IL, IN, IA)
55 Data sources
- Collections of questing and attached ticks
- Reservoir host studies (small mammals)
- Deer hunt check station surveys
- Canine serology
- Human case data
- Environmental databases
- Satellite images
56 Objectives
- To develop a model capable of predicting the habitat suitability for Ixodes scapularis in the eastern and central part of the United States, in order to create a platform that can be used to predict risk of transmission of Lyme disease.
- The specific aims are:
- To compile a spatial database of land cover, soil, climate and a range of host variables that can be used to conduct spatial statistical analysis related to the presence of the tick.
- To determine the variables that are associated with habitat suitability for I. scapularis ticks at the county level.
- To derive a statistical, spatial model capable of predicting the probability of habitat suitability for the tick vector.
57 Hypotheses
- Specific land cover characteristics as well as soil texture and soil order are major determinants of the vector habitat.
- The distribution of specific vertebrate host species affects tick establishment.
- Environmental factors such as temperature, precipitation and relative humidity are likely to regulate host and tick survival.
- County-level resolution can be used to capture the general patterns and determinants of the tick habitat.
58 Model Overview
- The model covers 2,023 counties in 26 states. The statistical analysis included a total of 1,696 counties in 23 states.
- The final map was generated using GIS, spatial analysis and geostatistical functions.
- 327 counties in North Dakota, Kentucky and North Carolina were not included in the analysis. KY and NC were selected randomly to validate the final prediction model.
60 Model Overview (Statistics)
- Discriminant analysis (DA) was used to explore the variables that best explain the characteristics of tick habitat.
- Logistic regression was used to generate the final prediction model.
- The analysis included a large number of independent variables:
- 10 land cover characteristics
- 12 variables for soil
- 12 climatic variables, grouped in four seasons
- 5 different host variables
61 Datasets
- Independent variables
- Vertebrate host data from National GAP (Geographic Approach to Planning) Analysis Program (mice, chipmunks, skinks, lizards).
- Land cover data from Land Cover Database of North America 2000 (created using SPOT)
- Soil order and soil texture obtained from the State Soil Geographic database
- Climate from National Climatic Data Center (NCDC) Climate Atlas of the United States database (CLIMAPS).
62 Results: DA per Group
- Significant variables
- Land cover: deciduous and evergreen forest, mixed needle leaf forests, mixed lands and proximity to water bodies.
- Soil: inceptisol, alfisol, sand, sandy-clay and sandy-loam
- Climate
- Spring: temperature and relative humidity
- Summer: precipitation, temperature and relative humidity
- Fall: precipitation and relative humidity
- Winter: precipitation
- Vertebrate host: eastern fence lizards, five-lined skinks and white-footed mice.
63 DA Classification Results
- Overall correct classification was 86.7%.
64 Logistic Regression
- The same set of significant variables obtained with DA was identified by forward logistic regression.
- The coefficients of the final model were determined using forward logistic regression; the results were verified using backward logistic regression.
Logistic regression relates the probability of a 1/0 (yes/no) answer to the values of a number of explanatory variables.
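As a reminder of how such a model turns explanatory variables into a probability, the sketch below evaluates the logistic function p = 1 / (1 + exp(-(b0 + b1*x1 + ... + bn*xn))). The intercept, coefficients and county values are invented for illustration and are not the fitted model from this study; only the 0.33 cut-off is taken from the talk.

```python
import math


def logistic_probability(intercept, coefficients, values):
    """Probability of a 'yes' (1) outcome for one observation."""
    if len(coefficients) != len(values):
        raise ValueError("need one value per coefficient")
    linear = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-linear))


def classify(probability, cutoff=0.33):
    """Label a county as suitable (True) when the probability reaches the cut-off."""
    return probability >= cutoff


# Hypothetical coefficients for two standardized predictors
# (say, forest cover and summer relative humidity).
intercept = -1.5
coefficients = [0.8, 0.4]
county_values = [1.2, 0.5]

p = logistic_probability(intercept, coefficients, county_values)
print(p, classify(p))
```

Forward or backward selection would decide which predictors enter the model; this sketch only shows the prediction step once coefficients are known.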
65 Logistic Regression Results
- The results from logistic regression analysis supported the discriminant analysis model.
- Eight variables were significant in the final logistic regression model.
- Contingency table analysis to assess the overall classification given by the logistic regression model using the 23 states.
- Fisher's exact test to assess positive association of counties with tick population with predicted probability from the logistic regression model using a cut-off value of 0.33.
66 Logistic Regression
- Predictive probability of tick establishment
- Mean: 0.22
- Median: 0.095
- Positively skewed distribution
- Overall classification: 85%
- OR (odds ratio) for the established tick group (true positives) is 16.50 (95% C.I.: 11.86, 22.97).
- ($\chi^2 = 362.6$, $p < 0.0001$)
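An odds ratio and its confidence interval can be obtained from a 2x2 table of observed versus predicted tick establishment. The sketch below uses the common Wald (log-scale) interval. The slides do not state how the reported interval was computed, and the counts here are hypothetical, so this is only an illustration of the calculation.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: observed present, predicted present   b: observed present, predicted absent
    c: observed absent,  predicted present   d: observed absent,  predicted absent
    z = 1.96 gives an approximate 95% interval. All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts of counties, not the study's contingency table.
print(odds_ratio_with_ci(a=20, b=10, c=10, d=20))
```

The interval is symmetric around the odds ratio on the log scale, which is why the reported bounds look asymmetric on the natural scale.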
67 Eastern Fence Lizards
68 Habitat Suitability of Ixodes scapularis at County Level
- Model validated in North Carolina
- Waiting for GAP data from key states
- Will add Borrelia infection data
Supported in part by CDC Ecology of Lyme disease contracts
69 Model evaluation
- Accuracy
- Model capable of predicting habitat suitability with an accuracy of 85%.
- More specific than sensitive, allowing the determination of locations/sites with low density or absence of ticks.
- Data Quality
- High data quality for all the independent variables.
- Accounts for variations across different geographic regions and can capture associations of variables that determine habitat suitability at different resolutions.
70 Habitat suitability for I. scapularis ticks in Wisconsin and Illinois (earlier model based on tick and host field data, RS data, soil texture and order, vegetation, geology, glacial history)
Data collected from 123 field sites
Chicago
Invasion along riparian corridors
Guerra MA, Walker ED, Jones CJ, Paskewitz S, Cortinas MR, Stancil A, Beck L, Bobo M, Kitron U. Emerging Infectious Diseases (2002)
71 Habitat Suitability Model
Model successfully predicted spread of ticks along Illinois River
Champaign-Urbana
72 4. Imported malaria and risk of malaria outbreaks in Trinidad
- Dave D. Chadee
- Insect Vector Control Division
- Ministry of Health, Trinidad
- Uriel Kitron
- College of Veterinary Medicine
- University of Illinois, Urbana, IL, USA
73 Mosquito and Imported Malaria Surveillance Program
Monitoring of breeding sites
Collection of Anopheles larvae
GIS-based Surveillance System
Collections of adult mosquitoes
Reports of imported malaria cases
Environmental data
Targeting of efforts, funds
Mosquito control
Case treatment
74 Malaria cases by source
(Chart: Malaria cases by parasite species: vivax, malariae, falciparum. Y-axis: No. of cases, 0-70. X-axis: 5-year period: 1968-1972, 1973-1977, 1978-1982, 1983-1987, 1988-1992, 1993-1997.)
75 Malaria cases by parasite species, Trinidad, 1968-1997
(Map: case locations for P. falciparum, P. malariae and P. vivax. Labelled places: Port-of-Spain, Sangre Grande, Nariva-Mayaro. Labelled cluster: P. vivax Icacos outbreak. Scale bar in kilometers.)
Precise spatial and temporal information about each case
76 Anopheles aquasalis larval site
Icacos: P. vivax transmission site
77 P. malariae cases in Trinidad, 1994-1995
(Map: case locations. Labelled towns: Port-of-Spain, Sangre Grande, San Fernando, Princes Town. Scale bar: 30 km.)
78 Residence nested into a cacao grove
Cacao trees shaded by immortelle trees
Vectors: Anopheles bellator, Anopheles homunculus
Bromeliads
79 Space-time Interaction
- Do nearby cases tend to occur at about the same time?
- Can occur whether or not spatial and/or temporal clustering is present
- Not biased by heterogeneous population density through space
- Many statistical tests for space-time interaction (Mantel's, Knox, Jacquez's, Kulldorff's scan)
80 Jacquez's k-NN test
- Avoids subjectivity
- Space and time nearest-neighbor relationships
- Null hypothesis: the space and time nearest-neighbor relationships are independent
- Point data (case locations and dates)
- Does not require controls
- Not biased by heterogeneous population density
- Biased by changing population size through time
81 K-Nearest Neighbor space-time statistic
- Cumulative test statistic
- k-specific test statistic
- NN: nearest neighbor
- k nearest neighbors: the set of cases as near or nearer to case i than its kth NN
- s_ijk: spatial NN measure; s_ijk = 1 if case j is a k NN of case i in space, 0 otherwise
- t_ijk: time NN measure; t_ijk = 1 if case j is a k NN of case i in time, 0 otherwise
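The formulas themselves did not survive in the transcript. In Jacquez's formulation, as usually stated, the cumulative statistic is J_k = sum over i and j of s_ijk * t_ijk, and the k-specific statistic is the increment J_k - J_(k-1). The sketch below implements that definition with a permutation test that shuffles case times over locations. The function names, the tie handling and the example cases are assumptions made for illustration.

```python
import numpy as np


def knn_indicator(dist, k):
    """Matrix with 1 where case j is as near or nearer to case i than i's kth neighbor."""
    d = np.array(dist, dtype=float)
    np.fill_diagonal(d, np.inf)              # a case is not its own neighbor
    kth = np.sort(d, axis=1)[:, k - 1]       # distance to the kth nearest neighbor
    return (d <= kth[:, None]).astype(int)   # ties at the kth distance are included


def jacquez_jk(coords, times, k):
    """Cumulative statistic J_k: number of pairs that are k-NN in both space and time."""
    coords = np.asarray(coords, dtype=float)
    times = np.asarray(times, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    space_dist = np.sqrt((diff ** 2).sum(axis=-1))
    time_dist = np.abs(times[:, None] - times[None, :])
    return int((knn_indicator(space_dist, k) * knn_indicator(time_dist, k)).sum())


def jacquez_test(coords, times, k, n_perm=999, seed=0):
    """Observed J_k and a one-sided permutation p-value (requires k < number of cases)."""
    rng = np.random.default_rng(seed)
    observed = jacquez_jk(coords, times, k)
    as_extreme = sum(
        jacquez_jk(coords, rng.permutation(times), k) >= observed
        for _ in range(n_perm)
    )
    return observed, (as_extreme + 1) / (n_perm + 1)


# Hypothetical cases: two pairs, each close in space and close in time.
case_coords = [(0, 0), (0, 1), (10, 0), (10, 1)]
case_days = [0, 1, 100, 101]
print(jacquez_test(case_coords, case_days, k=1, n_perm=99))
```

The k-specific statistic for a given k can be obtained as `jacquez_jk(coords, times, k) - jacquez_jk(coords, times, k - 1)`.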
82 k-specific test statistic (ΔJk): distribution of malaria cases by Plasmodium species, Trinidad, 1968-97
k     falciparum   vivax   malariae
1     6            13      13
2     30           30      25
3     49           43      26
4     62           57      23
5     67           61      26
6     100          68      26
7     41           80      84
8     117          76      30
9     97           70      28
10    135          86      63
Significance legend: p < 0.05; p < 0.01
83 Pattern and Process
- For P. vivax in Icacos: tight clustering in space and time suggests a common source and direct contact between cases (the Icacos outbreak)
- For P. malariae in Nariva-Mayaro: loose clustering in space and time suggests several independent epicenters
- For P. falciparum: lack of association of clustering in space with clustering in time suggests independent imported cases
84 Questions 1: Research
- Spatial determinants of disease transmission
- Spatial associations of risk factors with disease and interaction with temporal processes
- Origins of diseases and outbreaks
85 Questions 2: Surveillance/Control
- How do we most effectively plan and conduct surveillance/control programs based on disease/risk patterns?
- How do we evaluate control programs based on changes in disease patterns?
- How do we integrate spatial epidemiology research with surveillance/control programs?
86 Questions 3: Issues of scale
- How do we relate local changes (risk factors) to global processes (emergence/transmission of zoonoses)?
- How do we relate global changes to local processes?
- How do we choose the appropriate scale to study risk factors and emergence/transmission of zoonoses?
- What are the risks and advantages of interpolation and extrapolation for surveillance and control?
87 Upscale vs. Downscale
Spatial heterogeneity and processes that operate on the micro- and meso-scale may not be detected using high temporal but low spatial resolution
- "The detailed analysis of a model for purposes other than that which it was constructed may be as meaningless as studying a map under a microscope." Levins, 1968
Resolution: Spatial, Temporal, Spectral
Model: Generality, Precision, Realism