A Wavelet-based Anomaly Detector for Disease Outbreaks

1
A Wavelet-based Anomaly Detector for Disease
Outbreaks
  • Thomas Lotze
  • Galit Shmueli
  • University of Maryland College Park
  • Sean Murphy
  • Howard Burkom
  • Johns Hopkins University Applied Physics Lab

2
Outline
  • Motivation
  • Wavelet method
  • Difficulties
  • Preconditioning
  • Results

3
Related Work
  • Bakshi
  • Wavelets in Chemical SPC
  • Zhang
  • Baseline wavelets
  • Normalize syndromic baseline
  • Goldenberg et al.
  • Wavelets in syndromic surveillance

4
Motivation
  • Detecting disease outbreaks
  • Bioterrorist attacks
  • Virulent diseases
  • Early detection saves lives!
  • Syndromic Data will show outbreaks
  • Anomaly detection to find outbreaks faster

5
Wavelets
  • Models a series as a sum of wavelets
  • Wavelets are at different scales
  • Wavelets are local (change over time)

6
(No Transcript)
7
Difficulties
  • Holidays
  • Non-stationary
  • Day of week
  • Seasonal
  • Noisy
  • Outbreaks are not labeled
  • Outbreak pattern not known in advance

8
Preconditioning
  • Differs from Goldenberg et al.
  • Replace holidays
  • One week previous
  • Day-of-week
  • Ratio to moving average
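
The holiday step above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `replace_holidays` and the idea of passing holiday positions as a set of indices are assumptions made for the example.

```python
def replace_holidays(counts, holiday_idx):
    """Return a copy of counts with each holiday replaced by the value one week earlier.

    counts: daily counts, oldest first.
    holiday_idx: indices of days treated as holidays (hypothetical input format).
    """
    cleaned = list(counts)
    # Process in time order so a holiday can borrow an already-cleaned value.
    for i in sorted(holiday_idx):
        if i >= 7:  # days in the first week have no earlier value to borrow
            cleaned[i] = cleaned[i - 7]
    return cleaned
```

Holidays that fall in the first week of the series are left unchanged here, since there is no "one week previous" value to use.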

9
Evaluation: Simulated Outbreaks
  • Real data from 5 cities, respiratory (Resp) and gastrointestinal (GI) syndromes
  • Simulated outbreak patterns inserted
  • Specific pattern of additional syndromes over
    several days
  • Size is normalized by standard deviation of
    recent days
  • Inserted at different starting points within the
    sample data
  • Average detection rates vs. false alarm rates can
    be determined to create ROC curves
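
A rough sketch of this evaluation setup, under stated assumptions: the names `inject_outbreak` and `rates`, the `window` length used for "recent days", and the use of the population standard deviation are all choices made for illustration and are not taken from the slides.

```python
import statistics


def inject_outbreak(counts, start, shape, window=7):
    """Add a simulated outbreak to a copy of counts, beginning at index start.

    shape: relative extra counts per outbreak day (the outbreak pattern).
    The pattern is scaled by the standard deviation of the `window` days
    before the start, so outbreak size is relative to recent variability.
    """
    if start < window:
        raise ValueError("need at least `window` days before the outbreak")
    scale = statistics.pstdev(counts[start - window:start])
    out = list(counts)
    for k, s in enumerate(shape):
        if start + k < len(out):
            out[start + k] += s * scale
    return out


def rates(scores, is_outbreak, threshold):
    """Return (detection rate, false alarm rate) for one alarm threshold.

    Assumes at least one outbreak day and one non-outbreak day are present.
    """
    hits = [s > threshold for s, o in zip(scores, is_outbreak) if o]
    false_alarms = [s > threshold for s, o in zip(scores, is_outbreak) if not o]
    return sum(hits) / len(hits), sum(false_alarms) / len(false_alarms)
```

Sweeping `threshold` over a range of values and averaging the resulting pairs over many insertion points gives the points of an ROC curve.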

10
Results
  • Comparable to Holt-Winters
  • Not amazing

11
Results
  • Preconditioning is important
  • Detection is much better when preconditioned

12
Results
  • Easier to detect on some days than others
  • Days with low counts
  • Daily preconditioning not sufficient

13
Summary
  • Wavelets are a fairly good detection method
  • Preconditioning is very important
  • Day-of-week not fully accounted for

14
Questions?
  • More details on wavelets method?
  • Difficulties?
  • Other outbreak signals?
  • Future work?
  • Will Microsoft survive Bill Gates' stepping down?

15
Bonus: More on Wavelets
  • Level 1
  • Run the data through a low-pass filter. This
    gives the approximation coefficients
  • Run the data through a high-pass filter. This
    gives the detail coefficients
  • Down-sample
  • Reconstruct approximation and detail by
    up-sampling and running reconstruction filters.
  • Level 2 and on
  • Repeat the steps by applying them to the previous
    level approximation coefficients.
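
The filter, down-sample, and repeat steps above can be sketched with the Haar filters h = [1/sqrt(2), 1/sqrt(2)] and g = [1/sqrt(2), -1/sqrt(2)]. The function names are illustrative, and the sign convention of the detail coefficients is an assumption (some libraries use the opposite sign).

```python
import math

S = 1 / math.sqrt(2)


def haar_step(x):
    """One level: low-pass and high-pass filter each pair, keeping one value per pair."""
    if len(x) % 2:
        raise ValueError("length must be even")
    approx = [(x[i] + x[i + 1]) * S for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * S for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse_step(approx, detail):
    """Up-sample and apply the reconstruction filters to recover the input of haar_step."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) * S)
        x.append((a - d) * S)
    return x


def haar_decompose(x, levels):
    """Level 2 and on: repeat haar_step on the previous level's approximation."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details
```

Each level halves the number of coefficients, so a series of length 8 supports at most 3 levels in this sketch.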

16
Bonus: Wavelets on Cough Medication Sales
  • Haar wavelet
  • Decomposition filters: h = [1/sqrt(2), 1/sqrt(2)], g = [1/sqrt(2), -1/sqrt(2)], then downsample
  • Reconstruction: upsample, then h = [1/sqrt(2), 1/sqrt(2)], g = [-1/sqrt(2), 1/sqrt(2)]
  • In general: s = a5 + d1 + d2 + ... + d5

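
The additive form s = a5 + d1 + ... + d5 can be checked with a small sketch. For the Haar wavelet, the level-j approximation of a series is its average over consecutive blocks of 2^j points, and each detail is the difference between successive approximations. The function names and the requirement that the length be a multiple of 2^levels are assumptions of this example.

```python
def block_means(x, width):
    """Replace each consecutive block of `width` points by its mean."""
    out = []
    for start in range(0, len(x), width):
        block = x[start:start + width]
        out.extend([sum(block) / len(block)] * len(block))
    return out


def haar_components(x, levels):
    """Return (a_L, [d_1, ..., d_L]), full-length series with x = a_L + d_1 + ... + d_L."""
    if len(x) % 2 ** levels:
        raise ValueError("length must be a multiple of 2**levels")
    previous, details = list(x), []
    for j in range(1, levels + 1):
        smooth = block_means(x, 2 ** j)
        # Detail at level j: what is lost when smoothing from scale j-1 to scale j.
        details.append([p - s for p, s in zip(previous, smooth)])
        previous = smooth
    return previous, details
```

Adding the returned approximation and all detail series point by point recovers the original series, which is the decomposition the slide writes as s = a5 + d1 + ... + d5 for five levels.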
17
Bonus: Wavelet Prediction
  • Additional details
  • 5-level decomposition
  • Can be performed with more or fewer levels
  • SWT (stationary wavelet transform): fill in holes
  • Perform a decomposition for every possible
    position
  • Series are no longer independent
  • Edge issue
  • Prediction is not possible at all time steps
  • Solution: construct wavelets backwards from
    most recent observations

18
Bonus: Ratio-to-Moving-Average
  • Way of normalizing day-of-week effects
  • 1. Determine moving averages
  • a(i) = (x(i-3) + x(i-2) + ... + x(i+3)) / 7
  • 2. Determine ratio (raw seasonal) for each day
  • r(i) = x(i) / a(i)
  • 3. Determine avg. ratio for each day of the week
  • r(Mon) = sum(r(i) : i is Mon) / count(i is Mon)
  • 4. Normalize ratios to sum to 1
  • r'(Mon) = r(Mon) / (r(Mon) + ... + r(Sun))
  • 5. Divide each day by its ratio
  • x'(i) = x(i) / r(Mon), for i a Monday
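
The five steps can be sketched as below. Several details are assumptions of this example rather than statements from the slides: the function name, the `first_weekday` parameter, skipping the first and last three days (where the centered average is undefined), positive counts throughout, and dividing by the normalized ratio r' in step 5.

```python
def ratio_to_moving_average(x, first_weekday=0):
    """Return (normalized day-of-week ratios, adjusted series).

    x: daily counts, assumed positive; first_weekday: weekday index (0-6) of x[0].
    """
    n = len(x)
    sums, counts = [0.0] * 7, [0] * 7
    for i in range(3, n - 3):
        a = sum(x[i - 3:i + 4]) / 7           # step 1: centered 7-day moving average
        r = x[i] / a                          # step 2: raw ratio for day i
        dow = (first_weekday + i) % 7
        sums[dow] += r
        counts[dow] += 1
    if 0 in counts:
        raise ValueError("need interior observations for every day of the week")
    avg = [s / c for s, c in zip(sums, counts)]   # step 3: average ratio per weekday
    total = sum(avg)
    norm = [r / total for r in avg]               # step 4: ratios sum to 1
    adjusted = [x[i] / norm[(first_weekday + i) % 7] for i in range(n)]  # step 5
    return norm, adjusted
```

Because the ratios are normalized to sum to 1 rather than 7, the adjusted series is on the scale of a weekly total instead of a daily count. That rescaling is constant over time, so it does not affect the day-of-week correction itself.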

19
Bonus: Possible Extensions
  • Multivariate wavelets
  • Each day-of-week as a separate series
  • Different wavelet shapes
  • Different wavelet scale basis
  • Different preconditioning
  • Different sizes, lengths of outbreaks
  • Don't normalize outbreak by standard deviation of
    recent days
  • Show when outbreaks are harder to detect
  • Estimate confidence based on experience
  • Boosting

20
Bonus: Wavelet Prediction
  • Decompose into timescales
  • Use AR or EWMA to predict for each timescale
  • Reconstruct prediction from predicted timescales
  • Monitor deviations from prediction
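
A simplified sketch of these four steps follows. It is a stand-in under stated assumptions, not the authors' implementation: the timescales are built from backward-looking Haar-style averages (trailing means over 2^j days), each timescale is forecast with EWMA, and the names, `levels`, and `alpha` are illustrative choices.

```python
def trailing_mean(x, t, width):
    """Mean of the `width` most recent values up to and including x[t]."""
    window = x[max(0, t - width + 1):t + 1]
    return sum(window) / len(window)


def components_at(x, t, levels):
    """Split x[t] into [d_1, ..., d_L, a_L], which sum back to x[t]."""
    previous, parts = x[t], []
    for j in range(1, levels + 1):
        smooth = trailing_mean(x, t, 2 ** j)
        parts.append(previous - smooth)   # detail at timescale j
        previous = smooth
    parts.append(previous)                # coarsest approximation
    return parts


def ewma_forecast(values, alpha):
    """One-step-ahead EWMA forecast: the final smoothed level."""
    level = values[0]
    for v in values[1:]:
        level = alpha * v + (1 - alpha) * level
    return level


def predict_next(history, levels=3, alpha=0.3):
    """Forecast each timescale separately and sum the forecasts."""
    if len(history) < 2 ** levels:
        raise ValueError("history too short for the requested number of levels")
    rows = [components_at(history, t, levels)
            for t in range(2 ** levels - 1, len(history))]
    return sum(ewma_forecast([row[k] for row in rows], alpha)
               for k in range(levels + 1))


def deviations(x, warmup, levels=3, alpha=0.3):
    """Observed minus predicted value for each day from index `warmup` onward."""
    return [x[t] - predict_next(x[:t], levels, alpha)
            for t in range(warmup, len(x))]
```

An alarm rule would then compare each deviation against a threshold. How that threshold is set is not specified in the slides.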

21
Bonus: Alternative Preconditioning
  • Regression using day-of-week predictors
  • 7-day differencing
  • Holt-Winters as preconditioner
  • Seasonal preconditioning
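
Of these alternatives, 7-day differencing is the simplest to illustrate. The sketch below is an example only; the function name and the `lag` parameter are assumptions.

```python
def seven_day_difference(x, lag=7):
    """Subtract from each day the value on the same weekday one week earlier.

    The result is `lag` entries shorter than x; entry k corresponds to day k + lag.
    """
    return [x[i] - x[i - lag] for i in range(lag, len(x))]
```

A stable weekly pattern differences to zero. One side effect is that a single elevated day appears twice in the differenced series: once as a positive value and again a week later as a negative echo.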

22
Bonus: Other Outbreak Signals
  • Normalized by total size
  • Lognormal, exponential, step
  • Spike is much easier than the others