Title: Minimizing False Alarms in Syndromic Surveillance
1. Minimizing False Alarms in Syndromic Surveillance
- 2007 International Society for Disease
Surveillance 6th Annual Conference - Track 2: Analytical Methods, Multivariate
Detection Methods - 11 October 2007
- William Peter, PhD
- Amir Najmi, DPhil
- Howard Burkom, PhD
- The Johns Hopkins University Applied Physics
Laboratory
2. Should counts or proportions be used for
biosurveillance anomaly detection?
Can Mutual Information (M.I.) methods be used to
decide this?
Context example: Total daily emergency room
visit counts
Target: Daily emergency room respiratory (Resp) counts
- Monitor target syndrome visit counts or
target/context ratios for better signal-to-noise
properties?
- What kind of ratios (i.e., contexts) should be
used?
3. Introduction: False Alerts
Identification of public health threats or
disease outbreaks in the Public Health
Information Network relies on opportunistic data
streams that vary across geographic regions and
time periods and can be biased by data
collection, seasonality, etc.
Is this increase in respiratory counts of
epidemiological interest?
Maybe Not.
4. Proportions Can Clarify Outbreaks and Highlight
Alerts That Might Be Lost.
(Figure annotation: Alert)
- When a denominator variable such as total
syndrome counts is available, a proportion can
clarify an outbreak and prevent false alerts.
5. Example 2: Chief Complaint Gastro-Intestinal
Syndrome Data
Disease tracking systems in a health information
network that rely on detecting sudden increases
in a syndromic count are not robust to changes in
health care utilization, seasonality, data
collection, etc. (cf. Reis et al., this meeting).
Has an alert of epidemiological interest been
detected here?
6. False Alerts Can Arise from Changes in Baseline
Surveillance Monitoring.
Total syndrome visit counts increased due to
another county data stream coming online.
False alerts can arise when a surveillance system
fails to adjust to major shifts in monitored
health data streams or baseline healthcare
utilization.
Reis BY, Kohane IS, Mandl KD. An
epidemiological network model for disease
outbreak detection. PLoS Med 2007; 4:e210.
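The baseline-shift problem on this slide can be illustrated with a small simulation. This is a minimal sketch, not the authors' data or detector: the visit volumes, the 10% target share, the shift day, and the naive z-score are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed so the sketch is repeatable
days = 200
shift_day = 120                          # hypothetical day a new county feed starts

# Hypothetical total visits: about 400/day, then about 600/day after the new feed.
total_mean = np.where(np.arange(days) < shift_day, 400.0, 600.0)
total = rng.poisson(total_mean)

# The target syndrome is assumed to be a fixed 10% share of visits (no outbreak).
target = rng.binomial(total, 0.10)

def max_z_after_shift(series, start, baseline=56):
    """Largest z-score after `start`, scoring each day against the
    `baseline` days just before the shift (a deliberately naive detector)."""
    ref = series[start - baseline:start]
    return float(np.max((series[start:] - ref.mean()) / ref.std(ddof=1)))

z_counts = max_z_after_shift(target.astype(float), shift_day)
z_ratio = max_z_after_shift(target / total, shift_day)

print(f"max z, raw target counts : {z_counts:.1f}")
print(f"max z, target/total ratio: {z_ratio:.1f}")
```

In this toy setting the raw counts jump when the extra data stream comes online, while the target/total proportion stays near its baseline, which is the behavior the slide describes.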
7. Reis et al.'s Network Model: Use all possible
ratios ("relationships") of health data streams
instead of the streams themselves.
An epidemiological network containing 15 data
streams from 5 hospitals is shown responding to a
simulated outbreak in data stream T-5. Each data
stream appears twice in the network: the context
nodes on the left are used for interpreting the
activity of the target nodes on the right. Each
edge represents the ratio of the target node
divided by the context node, with a thicker edge
indicating that the ratio is anomalously high.
A consensus view is constructed for each node by
combining all perspectives.
For N health data series, this means a high
computational load of N(N-1) ratios.
Reis BY, Kohane IS, Mandl KD. PLoS Med 2007;
4:e210.
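To make the N(N-1) load concrete, the ordered (target, context) pairs can simply be enumerated. This is a small sketch; the stream labels are hypothetical placeholders, not the names used by Reis et al.

```python
from itertools import permutations

# Hypothetical labels for 15 streams; the names are for illustration only.
streams = [f"S-{i}" for i in range(1, 16)]

# Each ordered (target, context) pair with target != context is one ratio edge.
edges = list(permutations(streams, 2))

n = len(streams)
print(len(edges), n * (n - 1))   # both counts equal N(N-1)
```

For the 15-stream network on the slide this gives 210 ratios to monitor, and the count grows quadratically with the number of streams.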
8. Immediate Questions
- When should individual counts be used? When is
it a benefit to use ratios? What kind of ratios
between the health streams should be formed?
- Can the Reis model be computationally simplified
such that only important ratios are considered
instead of every possible ratio?
"match.com" Hypothesis: Choose a context
that has something in common with a given
target. Quantitatively, the target and
context should have sufficient mutual
information. Otherwise, they should not be
paired.
9. Mutual Information measures the information that
two random variables share.
Given two time series X = (x1, x2, ..., xn) and
Y = (y1, y2, ..., yn), their mutual information (M.I.)
is how well we can predict X given that we have
measured Y (and vice versa). I(X;Y) measures
how much knowing one of these variables reduces
our uncertainty about the other:
$$I(X;Y) = \iint p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}\,dx\,dy$$
Here p(x) is the probability density function (pdf) of
X = (x1, x2, ..., xn), p(y) is the pdf of the time
series Y = (y1, y2, ..., yn), and p(x,y) is the
joint probability density function of X and Y.
10. Mutual Information Examples
I(X;Y) measures how much knowing one random
variable reduces our uncertainty about the other
variable.
- X = (0, 0, 0, 1, 1, 0) and Y = (1, 1, 1, 0, 0, 1)
=> I(X;Y) = 1 (normalized).
- If X and Y are independent, p(x, y) = p(x)p(y), then
knowing X does not give any information
about Y and vice versa. Their mutual information
is zero.
- If X and Y are identical, then
knowing X determines Y and vice versa. As a
result, the mutual information is the same as the
uncertainty contained in Y (or X) alone, namely
the entropy of Y, H(Y).
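The three cases above can be checked numerically for short discrete sequences. The sketch below uses plug-in (frequency-count) entropies and normalizes by the geometric mean of the two entropies; the slide does not say which normalization it used, so that choice is an assumption.

```python
import math
from collections import Counter

def entropy(symbols):
    """Plug-in Shannon entropy (bits) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())

def mutual_information(x, y):
    """Plug-in I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(list(x)) + entropy(list(y)) - entropy(list(zip(x, y)))

def normalized_mi(x, y):
    """I(X;Y) / sqrt(H(X) H(Y)); assumes neither series is constant."""
    return mutual_information(x, y) / math.sqrt(entropy(list(x)) * entropy(list(y)))

X = [0, 0, 0, 1, 1, 0]
Y = [1, 1, 1, 0, 0, 1]            # Y is the complement of X
A = [0, 0, 1, 1]
B = [0, 1, 0, 1]                  # empirically independent of A

print(round(mutual_information(X, Y), 3))   # 0.918 bits, equal to H(X)
print(round(normalized_mi(X, Y), 3))        # 1.0
print(round(normalized_mi(X, X), 3))        # 1.0 for identical series
print(round(mutual_information(A, B), 3))   # 0.0 for independent series
```

Because Y is fully determined by X, the raw mutual information equals the entropy of either series, and the normalized value is 1.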
11. Mutual Information: Relation to Entropy
The entropy (uncertainty or "surprisability")
of a random variable X with probability density
function p(x) is
$$H(X) = -\int p(x)\,\log p(x)\,dx$$
I(X;Y) is the uncertainty in X removed by
knowing Y (or vice versa):
$$I(X;Y) = H(X) - H(X|Y)$$
where H(X) is the uncertainty in X and H(X|Y) is the
uncertainty in X after Y is known.
(Venn diagram labels: H(Y|X), I(X;Y), H(X|Y), H(X,Y))
12. Mutual Information Techniques are now being
applied to a wide variety of fields.
- Bioinformatics
- Financial/Stock Market Time Series Data
- Medical Device Time Series (EEG, EKG, etc.)
- Feature Selection and Image Processing
- Beat Detection In Music
13. Correlation and Mutual Information can both be
used to compare two time series.
Mutual information is similar to, but is a
generalization of, linear correlation.
The advantage of using correlation over mutual
information is faster and easier computation.
(Figure caption: 100-day moving window)
14. Why use M.I. instead of correlation?
(Figure: four panels labeled I(X;Y) = 0.51, 0.57, 0.65, and 0.49)
The Pearson correlation coefficient is linear. It is not
always relevant as a summary statistic!
Mutual information is more general. It includes
nonlinear relationships.
Anscombe's Quartet: The four y variables shown
have the same mean (m = 7.5), variance
(s^2 = 4.12), correlation (r = 0.81) and regression
line (y = 3 + 0.5x).
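The difference between the two measures shows up on any symmetric nonlinear relationship. The sketch below uses a parabola rather than Anscombe's data, and a simple equal-width histogram estimator with 10 bins; both are illustrative assumptions, not the estimator used for the values on the slide.

```python
import numpy as np

def hist_mi(x, y, bins=10):
    """Histogram (plug-in) estimate of I(X;Y) in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

x = np.linspace(-1.0, 1.0, 1001)
y_linear = 3.0 + 0.5 * x        # straight line
y_parabola = x ** 2             # symmetric curve with no linear trend

r_linear = np.corrcoef(x, y_linear)[0, 1]
r_parabola = np.corrcoef(x, y_parabola)[0, 1]
mi_linear = hist_mi(x, y_linear)
mi_parabola = hist_mi(x, y_parabola)

print(f"line:     r = {r_linear:+.3f}, MI = {mi_linear:.2f} nats")
print(f"parabola: r = {r_parabola:+.3f}, MI = {mi_parabola:.2f} nats")
```

The parabola has essentially zero Pearson correlation even though y is completely determined by x, while the mutual information estimate stays clearly above zero.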
15. Mutual Information: Numerical Calculation Issues
I. Histogram problem: How many bins should be
used to construct a histogram of X and Y to
calculate p(X), p(Y), and p(X,Y)?
Use Adaptive Binning or Kernel Density Estimation
(i.e., Parzen windows).
II. Because I(X;Y) varies from zero to
infinity, one should normalize I(X;Y) to be between
zero and unity.
Witten & Frank (2005)
Strehl & Ghosh (2002)
Dionísio et al. (2003)
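Both issues can be seen in a few lines. The sketch below estimates mutual information for two independent series with different bin counts, then applies three normalizations that are common in the literature. The slide does not show which formula each cited work uses, so the function names and their mapping to those references are assumptions.

```python
import numpy as np

def entropies_and_mi(x, y, bins):
    """Plug-in H(X), H(Y), I(X;Y) in nats from an equal-width 2-D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    h = lambda p: float(-np.sum(p[p > 0] * np.log(p[p > 0])))
    hx, hy, hxy = h(px), h(py), h(pxy.ravel())
    return hx, hy, max(hx + hy - hxy, 0.0)   # clip tiny negative rounding error

def normalized_variants(x, y, bins):
    """Raw MI plus three ways of mapping it into [0, 1]."""
    hx, hy, mi = entropies_and_mi(x, y, bins)
    return {
        "raw_nats": mi,
        "geometric": float(mi / np.sqrt(hx * hy)),
        "symmetric_uncertainty": 2.0 * mi / (hx + hy),
        "global_correlation": float(np.sqrt(1.0 - np.exp(-2.0 * mi))),
    }

rng = np.random.default_rng(3)
x = rng.normal(size=500)
y = rng.normal(size=500)          # independent of x, so the true I(X;Y) is 0

results = {bins: normalized_variants(x, y, bins) for bins in (5, 20, 50)}
for bins, values in results.items():
    print(bins, {k: round(v, 3) for k, v in values.items()})
```

With only 500 points, the raw estimate for independent data grows as the bin count increases, which is the histogram problem in item I. Adaptive binning or kernel density estimates are the remedies the slide names; they are not implemented here.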
16. Construct Monte Carlo Simulations
Target: Rash Syndrome Counts (panel: Rash Counts)
Context: Poisson Process (m = 50) (panel: Poisson Counts, I(X;Y) = 0.019)
Context: Mix Rash Counts and Poisson (panel: Poisson + 5 Rash, I(X;Y) = 0.336)
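The construction on this slide can be sketched as follows. The rash series itself is not in the slides, so a seasonal Poisson series stands in for it; the mixing weights, the bin count, and the normalization are likewise assumptions, and the printed values will not reproduce the 0.019 and 0.336 shown in the figure.

```python
import numpy as np

def hist_nmi(x, y, bins=8):
    """Histogram estimate of I(X;Y) / sqrt(H(X) H(Y)), between 0 and 1."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    h = lambda p: float(-np.sum(p[p > 0] * np.log(p[p > 0])))
    hx, hy = h(px), h(py)
    mi = max(hx + hy - h(pxy.ravel()), 0.0)
    return float(mi / np.sqrt(hx * hy))

rng = np.random.default_rng(1)
days = 1000

# Stand-in for the rash series: Poisson counts around a seasonal mean
# that swings between roughly 3 and 12 visits per day.
season = 7.5 + 4.5 * np.sin(2 * np.pi * np.arange(days) / 365.0)
rash = rng.poisson(season)

noise = rng.poisson(50, size=days)     # pure Poisson context with mean 50

# Context = Poisson noise + weight * target; weight 0 is the pure-noise context.
nmi_by_weight = {w: hist_nmi(rash, noise + w * rash) for w in (0, 1, 5, 20)}
for w, value in nmi_by_weight.items():
    print(f"weight {w:>2}: normalized MI = {value:.3f}")
```

The pure Poisson context shares almost no information with the target, and the estimate rises as more of the target is mixed into the context.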
17. Monte Carlo Simulations Demonstrate Significance
of Choosing Appropriate Context Series.
- Target: Rash syndrome counts
- Context: Mix Poisson noise with increasing Target signal
- Inject artificial outbreak signal into the
target.
- Calculate Number of Detections for constant False
Alarm Rate.
(Figure axes: Alert Detection Probability vs.
Target-Context Mutual Information)
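The steps listed on this slide can be sketched as a small Monte Carlo experiment. Everything numeric here is an assumption: the stand-in target series, the outbreak size, the 1% false alarm rate, and the use of a plain target/context ratio with an empirical threshold. The context is built from the outbreak-free target, which is one reading of "mix Poisson noise with increasing target".

```python
import numpy as np

rng = np.random.default_rng(2)
trials = 20000
inject = 10            # hypothetical outbreak size (extra target cases in a day)
far = 0.01             # fixed false alarm rate

# Stand-in target background: Poisson counts with a day-to-day utilization factor.
utilization = rng.lognormal(mean=0.0, sigma=0.4, size=trials)
background = rng.poisson(20.0 * utilization)
noise = rng.poisson(50, size=trials)

def detection_probability(weight):
    """P(alert | outbreak) for the target/context ratio at false alarm rate `far`.
    The context is Poisson noise plus `weight` times the outbreak-free target."""
    context = np.maximum(noise + weight * background, 1)   # guard against zero
    null_ratio = background / context
    outbreak_ratio = (background + inject) / context
    threshold = np.quantile(null_ratio, 1.0 - far)         # fixes the alarm rate
    return float(np.mean(outbreak_ratio > threshold))

pd_by_weight = {w: detection_probability(w) for w in (0, 1, 5)}
for w, p_detect in pd_by_weight.items():
    print(f"context weight {w}: detection probability = {p_detect:.3f}")
```

In this toy setup the detection probability at a fixed false alarm rate increases as the context carries more information about the target's background, which is the trend the slide's figure plots against target-context mutual information.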
18. ROC Curves from Monte Carlo Simulations: strong
outbreak
- Target: Rash syndrome counts.
- Context: Mix Poisson noise with increasing Target signal
- Inject artificial signal into the target.
- Calculate ROC curve.
(Figure curve labels: Correlation = 0.816; Correlation = -0.009)
19. Monte Carlo Simulations for actual neurological
syndromic complaint reports with different
contexts
- Target: Neuro syndrome counts.
- Context: Use different syndromic complaint time series
- Inject artificial signal into the target.
- Calculate ROC curve.
(Figure axes: Probability of Detection (pd) vs.
Probability of False Alarm (pfa))
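The "calculate ROC curve" step can be sketched generically: given detector scores from outbreak-free runs and from runs with an injected signal, sweep a threshold and record pfa and pd. The Gaussian scores below are synthetic placeholders, not output from the neurological or rash data.

```python
import numpy as np

def roc_curve(null_scores, outbreak_scores):
    """Empirical ROC: sweep a threshold over all observed scores.
    Returns (pfa, pd) arrays ordered from the strictest to the loosest threshold."""
    null_scores = np.asarray(null_scores, dtype=float)
    outbreak_scores = np.asarray(outbreak_scores, dtype=float)
    thresholds = np.sort(np.concatenate([null_scores, outbreak_scores]))[::-1]
    p_fa = np.array([(null_scores >= t).mean() for t in thresholds])
    p_det = np.array([(outbreak_scores >= t).mean() for t in thresholds])
    # Prepend the (0, 0) corner so the curve starts at "never alert".
    return np.concatenate([[0.0], p_fa]), np.concatenate([[0.0], p_det])

def auc(p_fa, p_det):
    """Area under the ROC curve by the trapezoid rule."""
    return float(np.sum(np.diff(p_fa) * (p_det[1:] + p_det[:-1]) / 2.0))

rng = np.random.default_rng(4)
null = rng.normal(0.0, 1.0, size=500)       # scores with no outbreak
strong = rng.normal(1.5, 1.0, size=500)     # hypothetical well-matched context
weak = rng.normal(0.3, 1.0, size=500)       # hypothetical poorly matched context

auc_strong = auc(*roc_curve(null, strong))
auc_weak = auc(*roc_curve(null, weak))
print(f"AUC strong: {auc_strong:.3f}, AUC weak: {auc_weak:.3f}")
```

Comparing contexts then amounts to comparing these curves, or their areas, for the same injected signal.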
20. An Example of a Time Series Mismatch for
Gastrointestinal Syndromic Data
Probably not of epidemiological significance.
Correlation between the two series: 100-day
moving window
This is the S&P 500 Stock Index time series from
3/1/2002 to 12/17/06
21. Loss of mutual information, $(-\Delta I/\Delta t)/(1-I)$, can
act as a robust outbreak alert detector
Loss of mutual information may act as a better alert
detector than simply using ratios.
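One way to compute such a statistic is sketched below: estimate a normalized mutual information I over a trailing window, take a backward difference in time, and divide by (1 - I). The 100-day window echoes the moving window used elsewhere in the slides; the histogram estimator, the bin count, the difference step, and the synthetic series that decouples on day 300 are all assumptions.

```python
import numpy as np

def hist_nmi(x, y, bins=6):
    """Histogram estimate of I(X;Y) / sqrt(H(X) H(Y)); 0 if a series is constant."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    h = lambda p: float(-np.sum(p[p > 0] * np.log(p[p > 0])))
    hx, hy = h(px), h(py)
    if hx == 0.0 or hy == 0.0:
        return 0.0
    return float(max(hx + hy - h(pxy.ravel()), 0.0) / np.sqrt(hx * hy))

def rolling_nmi(target, context, window=100):
    """Normalized MI over a trailing window ending on each day (NaN until full)."""
    out = np.full(len(target), np.nan)
    for t in range(window, len(target) + 1):
        out[t - 1] = hist_nmi(target[t - window:t], context[t - window:t])
    return out

def mi_loss_statistic(nmi, dt=1, eps=1e-6):
    """(-dI/dt) / (1 - I), using a backward difference over `dt` days."""
    stat = np.full(len(nmi), np.nan)
    d_i = nmi[dt:] - nmi[:-dt]
    stat[dt:] = -d_i / dt / np.maximum(1.0 - nmi[dt:], eps)
    return stat

rng = np.random.default_rng(5)
days, change = 500, 300
target = rng.poisson(10.0 * rng.lognormal(0.0, 0.5, size=days))
unrelated = rng.poisson(10.0 * rng.lognormal(0.0, 0.5, size=days))
noise = rng.poisson(50, size=days)

# The context tracks the target until day `change`, then follows an unrelated series.
driver = np.where(np.arange(days) < change, target, unrelated)
context = noise + 5 * driver

nmi = rolling_nmi(target, context)
stat = mi_loss_statistic(nmi)
print("mean I before change:", round(float(np.nanmean(nmi[150:change])), 3))
print("mean I after change :", round(float(np.nanmean(nmi[400:])), 3))
print("peak loss statistic :", round(float(np.nanmax(stat[change:400])), 3))
```

An alert rule would threshold `stat`; how to choose that threshold, and how the authors smooth or difference I, is not specified on the slide.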
22. Conclusions and Future Work
- Proportions or ratios for syndromic surveillance
are best applied when the context series and the
target series share sufficient mutual information
(I(X;Y) > 0.25).
- Streamline proportional modeling networks with
mutual information criteria to reduce the
computational load.
- Investigating a novel robust alert detector
obtained by thresholding the mutual information
loss $(-\Delta I/\Delta t)/(1-I)$ between the target and
context, instead of using the target-to-context
ratio.
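The first two conclusions suggest a pruning step: score every ordered (target, context) pair and keep only those above the 0.25 threshold. The sketch below shows one possible form. The four synthetic streams, their names, and the histogram estimator are hypothetical, and the 0.25 cutoff is applied to this sketch's normalization, which may differ from the authors'.

```python
from itertools import permutations
import numpy as np

def hist_nmi(x, y, bins=8):
    """Histogram estimate of I(X;Y) / sqrt(H(X) H(Y)), between 0 and 1."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    h = lambda p: float(-np.sum(p[p > 0] * np.log(p[p > 0])))
    hx, hy = h(px), h(py)
    if hx == 0.0 or hy == 0.0:
        return 0.0
    return float(max(hx + hy - h(pxy.ravel()), 0.0) / np.sqrt(hx * hy))

def select_pairs(streams, threshold=0.25, bins=8):
    """Keep ordered (target, context) pairs whose normalized MI exceeds threshold."""
    kept = []
    for target, context in permutations(streams, 2):
        score = hist_nmi(streams[target], streams[context], bins)
        if score > threshold:
            kept.append((target, context, score))
    return kept

rng = np.random.default_rng(6)
days = 1000
# Hypothetical shared seasonal utilization factor driving two of the streams.
u = 1.0 + 0.5 * np.sin(2 * np.pi * np.arange(days) / 365.0)
streams = {
    "resp": rng.poisson(100 * u),          # follows utilization
    "total": rng.poisson(400 * u),         # follows utilization
    "rash": rng.poisson(8, size=days),     # unrelated to the others
    "gi": rng.poisson(15, size=days),      # unrelated to the others
}

kept = select_pairs(streams)
all_pairs = len(streams) * (len(streams) - 1)
print(f"kept {len(kept)} of {all_pairs} candidate ratios")
for target, context, score in kept:
    print(f"  {target}/{context}: normalized MI = {score:.2f}")
```

Only the ratios between streams that share information survive, so the network to monitor shrinks from N(N-1) edges to a much smaller set.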