Statistical approaches for detecting unexplained clusters of disease.

1
  • Statistical approaches for detecting unexplained
    clusters of disease.
  • Spatial Aggregation
  • Thomas Talbot
  • New York State Department of Health
  • Environmental Health Surveillance Section
  • Albany School of Public Health
  • GIS Public Health Class
  • March 3, 2009

2
Cluster
  • A number of similar things grouped closely
    together
    - Webster's Dictionary
  • Unexplained concentrations of health events in
    space and/or time
    - Public Health Definition

3
  • Occupation
  • Sex, Age
  • Socioeconomic class
  • Behavior (smoking)
  • Race
  • Time
  • Space

4
Spatial Autocorrelation
Everything is related to everything else, but
near things are more related than distant
things.
- Tobler's first law of geography

  • Positive autocorrelation
  • Negative autocorrelation

5
Moran's I
  • A test for spatial autocorrelation in disease
    rates.
  • When nearby areas tend to have similar rates of
    disease, Moran's I is greater than 0 (positive
    spatial autocorrelation).
  • When nearby areas are dissimilar, Moran's I is less
    than 0 (negative spatial autocorrelation).
  • Values near 0 (more precisely, near the expected
    value -1/(n-1) for n areas) indicate no spatial
    autocorrelation.
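
A minimal sketch of the global Moran's I calculation, assuming a binary neighbor matrix (1 where two areas share a border) and NumPy. The function name, the four-area layout and the rates are hypothetical illustrations, not data from the presentation. Positive values indicate that neighboring areas have similar rates, and negative values indicate that they are dissimilar.

```python
import numpy as np

def morans_i(rates, weights):
    """Global Moran's I for area rates and a spatial weights matrix."""
    x = np.asarray(rates, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                    # deviations from the mean rate
    num = (w * np.outer(z, z)).sum()    # weighted cross-products of area pairs
    return len(x) / w.sum() * num / (z ** 2).sum()

# Hypothetical example: four areas in a row, adjacent areas are neighbors.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
clustered = morans_i([1.0, 1.0, 5.0, 5.0], w)    # similar neighbors, I > 0
alternating = morans_i([1.0, 5.0, 1.0, 5.0], w)  # dissimilar neighbors, I < 0
```

In practice significance is usually judged with a permutation test, which is omitted here to keep the sketch short.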

6
Detecting Clusters
  • Consider scale
  • Consider zone
  • Control for multiple testing

7
Talbot
14
Cluster Questions
  • Does a disease cluster in space?
  • Does a disease cluster in both time and space?
  • Where is the most likely cluster?
  • Where is the most likely cluster in both time and
    space?

15
More Cluster Questions
  • At what geographic or population scale do
    clusters appear?
  • Are cases of disease clustered in areas of high
    exposure?

16
Nearest Neighbor Analysis: Cuzick-Edwards Method
  • Count the number of cases whose nearest
    neighbors are cases and not controls.
  • When cases are clustered, the nearest neighbor to
    a case will tend to be another case, and the test
    statistic will be large.
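
The counting rule above can be sketched as follows. This assumes point coordinates for cases and controls, uses the k nearest neighbors of each case (k = 1 matches the slide), and judges significance by randomly relabeling cases and controls. The function name, parameters and coordinates are hypothetical.

```python
import numpy as np

def cuzick_edwards(coords, is_case, k=1, n_perm=999, seed=0):
    """Count case-case nearest-neighbor links and a permutation p-value."""
    coords = np.asarray(coords, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # a point is not its own neighbor
    nn = np.argsort(d, axis=1)[:, :k]    # indices of the k nearest neighbors

    def statistic(labels):
        # For each case, count how many of its k nearest neighbors are cases.
        return int(labels[nn][labels].sum())

    observed = statistic(is_case)
    rng = np.random.default_rng(seed)
    perms = [statistic(rng.permutation(is_case)) for _ in range(n_perm)]
    p_value = (1 + sum(t >= observed for t in perms)) / (n_perm + 1)
    return observed, p_value

# Hypothetical data: four cases close together, six controls elsewhere.
coords = [(0, 0), (0.1, 0), (0, 0.1), (0.1, 0.1),
          (5, 5), (5.1, 5), (5, 5.1), (5.1, 5.1), (9, 9), (9.1, 9)]
is_case = [True] * 4 + [False] * 6
t_obs, p = cuzick_edwards(coords, is_case)
```

Because the controls stand in for the background population, the relabeling step is what accounts for variation in population density.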

17
Nearest Neighbor Analyses
18
Advantages
  • Accounts for the geographic variation in
    population density
  • Accounts for confounders through judicious
    selection of controls
  • Can detect clustering with many small clusters

19
Disadvantages
  • Must have spatial locations of cases and controls
  • Doesn't show location of the clusters

20
Spatial Scan Statistic (Martin Kulldorff)
  • Determines the location with an elevated rate that
    is statistically significant.
  • Adjusts for multiple testing of the many possible
    locations and area sizes of clusters.
  • Uses Monte Carlo testing techniques
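
A simplified sketch of a circular scan for elevated rates, assuming a Poisson model, area centroids, and a cap of 50% of the population per window. It is not the SaTScan implementation; the function names, the 3 x 3 grid and the counts are hypothetical. Multiple testing is handled by comparing the single largest likelihood ratio with the largest ratio found in each simulated data set.

```python
import numpy as np

def poisson_llr(c, e, total):
    """Log likelihood ratio for a window with c observed and e expected cases."""
    if c <= e:
        return 0.0                       # only windows with elevated rates count
    inside = c * np.log(c / e)
    outside = (total - c) * np.log((total - c) / (total - e)) if c < total else 0.0
    return inside + outside

def scan(coords, cases, pop, max_pop_frac=0.5):
    """Return the largest log likelihood ratio and the areas in that window."""
    coords = np.asarray(coords, dtype=float)
    cases = np.asarray(cases, dtype=float)
    pop = np.asarray(pop, dtype=float)
    total = cases.sum()
    expected = pop * total / pop.sum()   # expected cases under a uniform rate
    d = np.linalg.norm(coords[:, None] - coords[None, :], axis=2)
    best_llr, best_zone = 0.0, ()
    for i in range(len(coords)):
        order = np.argsort(d[i])         # grow a circle around centroid i
        for m in range(1, len(order) + 1):
            zone = order[:m]
            if pop[zone].sum() > max_pop_frac * pop.sum():
                break
            llr = poisson_llr(cases[zone].sum(), expected[zone].sum(), total)
            if llr > best_llr:
                best_llr, best_zone = llr, tuple(sorted(zone.tolist()))
    return best_llr, best_zone

def scan_test(coords, cases, pop, n_sim=99, seed=0):
    """Monte Carlo p-value for the most likely cluster."""
    rng = np.random.default_rng(seed)
    observed_llr, zone = scan(coords, cases, pop)
    total = int(np.sum(cases))
    probs = np.asarray(pop, dtype=float) / np.sum(pop)
    sims = [scan(coords, rng.multinomial(total, probs), pop)[0]
            for _ in range(n_sim)]
    p_value = (1 + sum(s >= observed_llr for s in sims)) / (n_sim + 1)
    return observed_llr, zone, p_value

# Hypothetical 3 x 3 grid of equal-population areas with an excess in area 8.
grid = [(x, y) for y in range(3) for x in range(3)]
llr, zone, p = scan_test(grid, [2, 3, 2, 3, 2, 3, 2, 3, 20], [1000] * 9)
```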

29
The Space-Time Scan Statistic
  • Cylindrical window with a circular geographic
    base and a height corresponding to time.
  • The cylindrical window is moved in space and time.
  • A p-value is calculated for each cylinder.

30
Knox Method: test for space-time interaction
  • When space-time interaction is present, cases near
    in space will also be near in time, and the test
    statistic will be large.
  • Test statistic: the number of pairs of cases that
    are near in both time and space.
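
A minimal sketch of the Knox statistic, assuming user-chosen cutoffs for "near in space" and "near in time" and a permutation of the case times to obtain a p-value. The function name, the cutoffs and the eight cases are hypothetical.

```python
import numpy as np

def knox(coords, times, space_cut, time_cut, n_perm=999, seed=0):
    """Count case pairs close in both space and time, with a permutation p-value."""
    coords = np.asarray(coords, dtype=float)
    times = np.asarray(times, dtype=float)
    iu = np.triu_indices(len(times), k=1)        # each pair counted once
    dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=2)
    close_space = (dist <= space_cut)[iu]

    def statistic(t):
        close_time = (np.abs(t[:, None] - t[None, :]) <= time_cut)[iu]
        return int((close_space & close_time).sum())

    observed = statistic(times)
    rng = np.random.default_rng(seed)
    # Shuffling times over locations breaks any space-time interaction.
    perms = [statistic(rng.permutation(times)) for _ in range(n_perm)]
    p_value = (1 + sum(s >= observed for s in perms)) / (n_perm + 1)
    return observed, p_value

# Hypothetical cases: two groups, each tight in both space and time.
coords = [(0, 0), (0.5, 0), (0, 0.5), (0.5, 0.5),
          (10, 10), (10.5, 10), (10, 10.5), (10.5, 10.5)]
times = [1, 2, 3, 4, 50, 51, 52, 53]
n_pairs, p = knox(coords, times, space_cut=1.0, time_cut=5.0)
```

The result depends on the two cutoffs, so they should be chosen from what is known about the disease rather than tuned to the data.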

31
Focal tests for clustering
  • Cross-sectional or cohort approach: Is there a
    higher rate of disease in populations living in
    contaminated areas compared to populations in
    uncontaminated areas? (Relative risk)
  • Case/control approach: Are cases more likely than
    controls to live in a contaminated area? (Odds
    ratio)
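
The two measures can be computed from simple counts. The sketch below uses hypothetical function names and made-up counts, with "exposed" meaning living in the contaminated area.

```python
def relative_risk(cases_exposed, pop_exposed, cases_unexposed, pop_unexposed):
    """Ratio of disease rates: contaminated area vs. uncontaminated area."""
    return (cases_exposed / pop_exposed) / (cases_unexposed / pop_unexposed)

def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)

# Hypothetical cohort: 30 cases among 1,000 exposed, 60 among 4,000 unexposed.
rr = relative_risk(30, 1000, 60, 4000)

# Hypothetical case-control study: 20 of 100 cases and 10 of 100 controls exposed.
or_ = odds_ratio(20, 80, 10, 90)
```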

32
Focal Case-Control Design
(Figure labels: 250 m, 500 m; legend: Case, Control)
33
Regression Analysis
  • Control for known risk factors before analyzing
    for spatial clustering
  • Analyze for unexplained clusters.
  • Follow-up in areas with large regression
    residuals with traditional case-control or cohort
    studies
  • Obtain additional risk factor data to account for
    the large residuals.
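
As a simplified stand-in for a fitted regression model, the sketch below takes one known risk factor (age stratum), computes expected counts by applying reference rates to each area's population, and flags areas with large Pearson residuals for follow-up. The function name, the reference rates, the counts and the cutoff of 2 are assumptions for illustration only.

```python
import numpy as np

def pearson_residuals(observed, pop_by_stratum, stratum_rates):
    """(observed - expected) / sqrt(expected), expected from stratum rates."""
    observed = np.asarray(observed, dtype=float)
    # Expected cases per area: population in each stratum times its rate.
    expected = np.asarray(pop_by_stratum, dtype=float) @ np.asarray(stratum_rates, dtype=float)
    return (observed - expected) / np.sqrt(expected)

# Hypothetical data: three areas, two age strata (younger, older).
pop = np.array([[1000, 200],
                [500, 600],
                [800, 300]])
rates = np.array([0.001, 0.01])      # assumed reference rates per stratum
obs = np.array([3, 6, 12])

res = pearson_residuals(obs, pop, rates)
flagged = np.flatnonzero(res > 2)    # areas with unexplained excess
```

Area 2 has far more cases than its age structure predicts, so it is the one left as unexplained and passed on to the cluster analysis or a follow-up study.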

37
At what geographic or population scale do
clusters appear? Multiresolution mapping.
38
  • A cluster of cases in a neighborhood has a
    different epidemiological meaning than a cluster
    of cases across several adjacent counties.
  • Results can change dramatically with the scale of
    analysis.

39
1995-1999
43
Interactive Selections by rate, population and p
value
44
References
  • Talbot TO, Kulldorff M, Forand SP, Haley VB.
    Evaluation of Spatial Filters to Create Smoothed
    Maps of Health Data. Statistics in Medicine.
    2000; 19:2451-2467.
  • Forand SP, Talbot TO, Druschel C, Cross PK. Data
    Quality and the Spatial Analysis of Disease
    Rates: Congenital Malformations in New York.
    Health and Place. 2002; 8:191-199.
  • Haley VB, Talbot TO. Geographic Analysis of Blood
    Lead Levels in New York State Children Born
    1994-1997. Environmental Health Perspectives.
    2004; 112(15):1577-1582.
  • Kulldorff M, National Cancer Institute. SaTScan
    User Guide. www.satscan.org

45
Geographic Aggregation of Health Data
by Thomas Talbot
NYS Department of Health
Environmental Health Surveillance Section

46
Health data can be shown at different geographic
scales
  • Residential address
  • Census blocks, and tracts
  • Towns
  • Counties
  • State

47
Concerns about release of small area data
  • Risk of disclosure of confidential information.
  • Rates of disease are unreliable due to small
    numbers.

48
Rate maps with small numbers provide very little
information.
http://www.nyhealth.gov/statistics/ny_asthma/hosp/zipcode/hamil_t2.htm
http://www.nyhealth.gov/statistics/ny_asthma/hosp/zipcode/pdf/hamil_m2.pdf
49
Disclosure of confidential information
Census Blocks
50
Smoothed or Aggregated Count or Rate Maps
  • Protect confidentiality so data can be shared.
  • Reduce random fluctuations in rates due to small
    numbers.

51
Smoothed Rate Maps
  • Borrow data from neighboring areas to provide
    more stable rates of disease.
  • Shareware tools available
  • Empirical or Hierarchical Bayesian approaches
  • Adaptive Spatial Filters
  • Head banging
  • etc.
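
A much simplified sketch of the idea behind an adaptive spatial filter: for each area, the nearest areas are added until a minimum population is reached, and the rate is computed from the pooled counts. It is not the method evaluated in Talbot et al.; the function name, the population threshold and the data are hypothetical.

```python
import numpy as np

def adaptive_filter_rates(coords, cases, pop, min_pop):
    """Smoothed rate per area, borrowing nearest areas until min_pop is reached."""
    coords = np.asarray(coords, dtype=float)
    cases = np.asarray(cases, dtype=float)
    pop = np.asarray(pop, dtype=float)
    d = np.linalg.norm(coords[:, None] - coords[None, :], axis=2)
    rates = np.empty(len(coords))
    for i in range(len(coords)):
        order = np.argsort(d[i])               # area i first, then nearest areas
        cum_pop = np.cumsum(pop[order])
        # Number of areas needed to reach min_pop (all areas if never reached).
        m = min(int(np.searchsorted(cum_pop, min_pop)) + 1, len(order))
        rates[i] = cases[order[:m]].sum() / cum_pop[m - 1]
    return rates

# Hypothetical data: four areas in a row with unstable raw rates (0, 4%, 0, 4%).
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
smoothed = adaptive_filter_rates(coords, [0, 4, 0, 4], [100] * 4, min_pop=200)
```

Each smoothed rate here pools two areas, so the alternating raw rates all become 2%.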

53
from Talbot et al., Statistics in Medicine, 2000
54
Problems with smoothing
  • Does not provide counts or rates for defined
    geographic areas.
  • Not clear how to link risk factor data with
    smoothed health data.
  • Methods are sometimes difficult to understand
    ("black boxes").
  • Does not meet requirements of some recent New
    York policies and legislation.

55
Environmental Facilities Cancer Incidence Map
Law, 2008 (§ 3-0317)
  • Plot cancer cases by census block, except in
    cases where such plotting could make it possible
    to identify any cancer patient.
  • Census blocks shall be aggregated to protect
    confidentiality.

56
Environmental Justice Permitting: NYSDEC
Commissioner Policy 29
  • Incorporate existing human health data into the
    environmental review process.
  • Data will be made available at a fine geographic
    scale (ZIP code or ZIP Code Groups).

57
Aggregated Count or Rate Maps
  • Merge small areas with neighboring areas to
    provide more stable rates of disease and/or
    protect confidentiality.
  • Aggregation can be done manually.
  • Existing automated tools were difficult to use.

58
Original ZIP Codes: 3 Years Low Birth Weight
Incidence Ratios
59
Aggregated to 250 Births per ZIP Code Group
60
Goal
Our Tool Requirements
  • Aggregate small areas into larger ones.
  • User decides how much aggregation is needed.
  • Works with various levels of geography.
      • census blocks, tracts, towns, ZIP codes, etc.
      • can nest one level of geography in another
  • Uses software which is readily available in
    NYSDOH (SAS).
  • Can output results for use in mapping programs.

61
Aggregation Tool
Regions
Original Block Data
SAS Tool
Simulated data
62
Aggregation Process
  • Populated blocks with the fewest cases are merged
    first.
  • If there is a tie the program starts with the
    block with the fewest neighbors.
  • Selected block then is merged with the closest
    neighbor in the same census block group.
  • After merging the first block the list of
    neighbors is updated.
  • The process repeats until all regions have a
    minimum number of cases.
  • The program can also merge to a user-specified
    population.
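
The merging steps above can be sketched in Python. This is an illustration only, not the SAS tool described in the presentation: it assumes every block is populated, measures closeness between region centroids, and leaves out the census block group constraint. The function name, data layout and example blocks are hypothetical.

```python
import math

def aggregate(blocks, adjacency, min_cases):
    """blocks: {id: (x, y, cases)}; adjacency: {id: set of neighboring ids}."""
    regions = {b: {"members": [b], "x": x, "y": y, "cases": c}
               for b, (x, y, c) in blocks.items()}
    nbrs = {b: set(n) for b, n in adjacency.items()}
    while True:
        small = [r for r in regions
                 if regions[r]["cases"] < min_cases and nbrs[r]]
        if not small:
            break
        # Fewest cases first; ties go to the region with the fewest neighbors.
        r = min(small, key=lambda k: (regions[k]["cases"], len(nbrs[k]), k))
        a = regions[r]
        # Merge with the closest neighboring region (centroid distance).
        t = min(nbrs[r], key=lambda k: (math.dist((a["x"], a["y"]),
                                                  (regions[k]["x"], regions[k]["y"])), k))
        b = regions[t]
        n_a, n_b = len(a["members"]), len(b["members"])
        b["x"] = (a["x"] * n_a + b["x"] * n_b) / (n_a + n_b)
        b["y"] = (a["y"] * n_a + b["y"] * n_b) / (n_a + n_b)
        b["members"] += a["members"]
        b["cases"] += a["cases"]
        # Update the list of neighbors after the merge.
        for k in nbrs.pop(r):
            nbrs[k].discard(r)
            if k != t:
                nbrs[k].add(t)
                nbrs[t].add(k)
        del regions[r]
    return [sorted(v["members"]) for v in regions.values()]

# Hypothetical example: four blocks in a row, at least 3 cases per region.
blocks = {1: (0, 0, 1), 2: (1, 0, 2), 3: (2, 0, 5), 4: (3, 0, 3)}
adjacency = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
result = aggregate(blocks, adjacency, min_cases=3)
```

Block 1 has the fewest cases, so it is merged with its only neighbor, block 2. The remaining regions then all meet the minimum and the process stops.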

63
Special Situations
  • Tool tries to avoid merging blocks in different
    census areas
      • Census block groups
      • Census tracts (homogeneous population
        characteristics).
      • Counties
  • Tool tries to avoid merging blocks across major
    water bodies
      • e.g. Finger Lakes, Hudson River, Atlantic Ocean

64
Water
65
9 Cases 98 Population
Simulated data
82
New York State Descriptive Statistics
Year 2000 populated census blocks
NY number of cases: 470,000
NY population: 18,976,457
83
Performance Measures
  • Compactness
  • Homogeneity with respect to demographic factors
    (measured as index of dissimilarity)
  • Similar population sizes.
  • Number of aggregated areas.
  • Aggregated zones are completely contained within
    larger areas.
  • e.g. block aggregation areas contained within
    tracts

85
Index of dissimilarity: the percentage of one
group that would have to move to a different area
in order to have an even distribution
- Wikipedia
D = \frac{1}{2} \sum_{i} \left| \frac{b_i}{B} - \frac{w_i}{W} \right|
  • b_i: the minority population of the ith
    area, e.g. census tract
  • B: the total minority population of the large
    geographic entity for which the index of
    dissimilarity is being calculated
  • w_i: the non-minority population of the ith area
  • W: the total non-minority population of the large
    geographic entity for which the index of
    dissimilarity is being calculated
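
A minimal sketch of the index using the definitions above: half the sum, over areas, of the absolute difference between each area's share of the minority population and its share of the non-minority population. The function name and the two-area counts are hypothetical.

```python
import numpy as np

def index_of_dissimilarity(minority, non_minority):
    """0 means an even distribution, 1 means complete separation."""
    b = np.asarray(minority, dtype=float)
    w = np.asarray(non_minority, dtype=float)
    return 0.5 * np.abs(b / b.sum() - w / w.sum()).sum()

# Hypothetical counts for two areas.
even = index_of_dissimilarity([10, 20], [100, 200])   # same shares in each area
separated = index_of_dissimilarity([50, 0], [0, 50])  # no overlap at all
```

Multiplying the result by 100 gives the percentage described in the definition.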
86
The End