Title: BioSense Data Analyses and Anomaly Characterization
1 BioSense Data Analyses and Anomaly Characterization
- Gabriel Rainisch, MPH
- Colleen A. Martin, MSPH
- Division of Emergency Preparedness and Response
- National Center for Public Health Informatics
- Centers for Disease Control and Prevention
- "The findings and conclusions in this
presentation are those of the authors and do not
necessarily represent the views of the Centers
for Disease Control and Prevention."
2 Purpose and Outline
- Purpose: Summary of recent analyses using BioSense data
- To inform alerting and monitoring protocols
- To improve the utility of BioSense
- Outline
- BioSense overview
- Descriptive analyses
- Cluster detection methods and anomaly characterization
- Epidemiologic studies
3 BioSense Data Sources
- Department of Defense (DoD, n=466) and Veterans Affairs (VA, n=905) outpatient medical facilities
- Daily data
- ICD-9 diagnosis codes and CPT procedure codes
- Hospital real-time data (n=34)
- Chief complaints, diagnoses, demographics
- Patient class: emergency department, inpatient, outpatient
4 Disease Indicators: Syndromes (N=11)
- Botulism-like
- Fever
- Gastrointestinal
- Hemorrhagic illness
- Localized cutaneous lesion
- Lymphadenitis
- Neurological
- Rash
- Respiratory
- Severe illness/death
- Specific infection
- (http://www.bt.cdc.gov/surveillance/syndromedef/index.asp)
- Monitor critical bioterrorism-associated and natural infectious disease outbreaks
5 Disease Indicators: Sub-syndromes (N=78)
- Newly developed
- Monitor infectious, chronic, and other disease indicators
- More granular than the 11 syndromes
- Require ongoing evaluation and enhancement
- 46 sub-syndromes map to syndromes
- Syndrome: Gastrointestinal
- Sub-syndromes included: diarrhea, abdominal pain, gastrointestinal hemorrhage
- 32 sub-syndromes do not map to syndromes
- Examples: Allergy, Injury, Excessive heat
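The sub-syndrome-to-syndrome relationship on this slide can be pictured as a lookup table. The following is a minimal sketch in Python, not the BioSense implementation: the names `SUBSYNDROME_TO_SYNDROME` and `syndromes_for` are hypothetical, and only the six sub-syndromes named on the slide are listed (the full set of 78 is not reproduced here).

```python
from typing import Optional

# Hypothetical lookup table: sub-syndrome -> parent syndrome (None = unmapped).
# Only the examples named on the slide appear; the real table would have 78 rows.
SUBSYNDROME_TO_SYNDROME: dict[str, Optional[str]] = {
    "Diarrhea": "Gastrointestinal",
    "Abdominal pain": "Gastrointestinal",
    "Gastrointestinal hemorrhage": "Gastrointestinal",
    "Allergy": None,
    "Injury": None,
    "Excessive heat": None,
}


def syndromes_for(subsyndromes: list[str]) -> set[str]:
    """Return the syndromes implied by a visit's sub-syndromes.

    Unmapped or unknown sub-syndromes contribute nothing.
    """
    parents = (SUBSYNDROME_TO_SYNDROME.get(name) for name in subsyndromes)
    return {parent for parent in parents if parent is not None}
```

For example, a visit tagged with "Diarrhea" and "Injury" would roll up to the Gastrointestinal syndrome only, because "Injury" is one of the 32 sub-syndromes with no parent syndrome.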
6 Analyses
7 Descriptive Analyses
- Understand and compare syndrome data from long-term sources (VA/DoD) vs. newer sources (Hospital Real-Time)
- Explore distribution of new disease indicators (sub-syndromes)
- Understand the capabilities for disease monitoring using both existing and new data and indicators
8 Visits Assigned to a Syndrome, Feb-June 2006 (in thousands)
- Hospital facilities on average receive 7 times the mean monthly visit volume of VA/DoD facilities
- The percent of total visits meeting a syndrome definition is similar for all data sources
9 Percent of Visits Assigned to a Syndrome
- Respiratory and Gastrointestinal combined comprise 60-75% of binned visits
- The only marked difference between data sources is for the Gastrointestinal syndrome (Hospital, 30%; DoD, 18%; VA, 11%)
10 Percent of Visits Assigned to a Syndrome: Hospital Real-Time Data
11 Hospital Real-Time Final Diagnoses: 20 Most Common Sub-Syndromes (Feb-Aug 2006)
12 Sub-syndrome Demographics
- Age effect
- Older ages: more chronic conditions
- Younger ages: more infectious, acute conditions
- Variation among sub-syndrome distributions by hospital system
- Differences in patient population (e.g., ages)
- Variation in patient class mix
- Outpatient clinics, emergency department, inpatient
13 Descriptive Analyses
- Newer Hospital real-time data
- Expands capabilities for disease monitoring
- Is similar to VA/DoD data in many respects
- Combined analyses appear to be feasible
14 Cluster Detection and Anomaly Characterization
- Modified CuSum C2 statistic: W2
- 7-day average with 2-day lag
- Weekdays compared with weekdays; weekend days with weekend days
- Calculate a recurrence interval
- Reciprocal of a p-value
- Recurrence interval (RI): number of days between two observed similar residual values
- Better analytic method that produces anomalies at an appropriate frequency
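The slide names the ingredients of the W2 statistic but not its exact formula, so the following Python sketch is only an illustration under stated assumptions. It assumes the statistic is a z-score of the observed count against the 7 most recent baseline days of the same type (weekday or weekend), skipping the 2 lag days just before the target day. It also assumes a floor on the standard deviation and a normal upper-tail p-value, neither of which comes from the slides. The function names `w2_like_statistic` and `recurrence_interval` are hypothetical.

```python
import math
from datetime import date, timedelta
from statistics import mean, stdev
from typing import Optional


def is_weekend(d: date) -> bool:
    return d.weekday() >= 5  # Saturday = 5, Sunday = 6


def w2_like_statistic(counts: dict[date, int], target: date,
                      baseline_days: int = 7, lag_days: int = 2,
                      min_sd: float = 0.5) -> Optional[float]:
    """Z-score of the target day's count against a stratified baseline.

    Assumption: the baseline is the most recent `baseline_days` days of the
    same type (weekday vs. weekend) that fall before the lag window.
    Returns None when the history is too short to fill the baseline.
    """
    baseline: list[int] = []
    d = target - timedelta(days=lag_days + 1)  # skip the lag days before target
    earliest = min(counts)
    while len(baseline) < baseline_days and d >= earliest:
        if d in counts and is_weekend(d) == is_weekend(target):
            baseline.append(counts[d])
        d -= timedelta(days=1)
    if len(baseline) < baseline_days:
        return None
    spread = max(stdev(baseline), min_sd)  # assumed floor avoids division by zero
    return (counts[target] - mean(baseline)) / spread


def recurrence_interval(z: float) -> float:
    """Reciprocal of an upper-tail p-value (normal tail is an assumption)."""
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return math.inf if p == 0.0 else 1.0 / p
```

Under these assumptions, a z-score of 0 gives a p-value of 0.5 and a recurrence interval of 2 days, while larger z-scores give rapidly longer intervals, which is what makes the recurrence interval usable as a ranking metric.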
15 Anomaly Characterization
- What characteristics identify anomalies most
likely to be of potential importance for further
evaluation?
16 Ranking Data Anomalies
- Metrics already in use
- Recurrence Interval (RI)
- Observed vs. previous max count
- Observed / expected (relative risk)
- Observed - expected (residual)
- Number of consecutive days flagged
- Other anomalies at the same facility/area
- Metrics slated for evaluation
- Severity of illness (e.g. pneumonia vs. cough)
- Homogeneity of chief complaints/diagnoses, age,
gender, etc.
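The metrics already in use are simple functions of the observed and expected counts, and can be sketched as follows. This Python example is illustrative only: the `Anomaly` class, its field names, and the sort order in `rank_anomalies` are hypothetical, since the slides do not say how the metrics are combined into a ranking.

```python
from dataclasses import dataclass


@dataclass
class Anomaly:
    facility: str
    observed: int
    expected: float
    previous_max: int
    recurrence_interval: float
    consecutive_days_flagged: int

    @property
    def relative_risk(self) -> float:
        # Observed / expected
        return self.observed / self.expected if self.expected > 0 else float("inf")

    @property
    def residual(self) -> float:
        # Observed - expected
        return self.observed - self.expected

    @property
    def exceeds_previous_max(self) -> bool:
        # Observed vs. previous max count
        return self.observed > self.previous_max


def rank_anomalies(anomalies: list[Anomaly]) -> list[Anomaly]:
    """Order anomalies from most to least notable.

    Assumed ordering (not from the slides): longest recurrence interval first,
    then most consecutive flagged days, then largest residual.
    """
    return sorted(
        anomalies,
        key=lambda a: (a.recurrence_interval, a.consecutive_days_flagged, a.residual),
        reverse=True,
    )
```

A real ranking would likely also weigh the metrics slated for evaluation, such as severity and homogeneity, which are not modeled here.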
17 BioSense Epidemiologic Studies
- Influenza
- Neurological Syndrome
- Respiratory Syncytial Virus (RSV)
- Injuries following Hurricane Katrina
- Heat-Related Illnesses
18 Heat-Related Illness
- Goal: Explore utility of BioSense data for heat-related illness surveillance
- Hospital Real-Time Data (n=34 hospitals)
- Study period: June 1 - August 10, 2006
- Sub-syndrome: Heat, Excessive
- Diagnosis 1 ICD-9 Code E900.0 (Excessive heat due to weather conditions)
- Chief complaint suggestive of heat injury
19 Heat-Related Illness Visits, BioSense Application, ICD-9 Diagnosis (n=139 Visits)
20 Heat-Related Illness (HRI) Diagnosis, June 1 - August 10, 2006
- Several HRI diagnoses were identified that are not being used in BioSense
- Visit counts
- All HRI diagnoses: 217 visits
- HRI diagnoses identified by current BioSense methods: 139 visits
- Additional HRI diagnoses should be included in BioSense
21 Heat-Related Illness (HRI) Chief Complaints, June 1 - August 10, 2006
- 217 visits with HRI diagnoses
- 194 visits with HRI diagnosis but not an HRI chief complaint
- 125 of these visits had chief complaints such as:
- Malaise and fatigue: 24
- Syncope and collapse: 19
- Chest pain: 19
- Dizziness: 17
- Dehydration: 14
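The visit counts on the HRI diagnosis and chief-complaint slides imply a few derived figures that the slides do not state outright. The short Python sketch below works them out from the three reported numbers (217, 139, and 194); the function name `hri_capture_summary` and the dictionary keys are hypothetical labels chosen for illustration.

```python
def hri_capture_summary(all_dx_visits: int, captured_visits: int,
                        dx_without_hri_cc: int) -> dict[str, float]:
    """Derive gap figures from the reported HRI visit counts."""
    return {
        # Visits with an HRI diagnosis that current BioSense methods miss
        "missed_by_current_methods": all_dx_visits - captured_visits,
        # Fraction of HRI-diagnosis visits that current methods identify
        "share_captured": captured_visits / all_dx_visits,
        # Visits with both an HRI diagnosis and an HRI chief complaint
        "dx_with_hri_cc": all_dx_visits - dx_without_hri_cc,
    }


# Counts as reported on the slides
summary = hri_capture_summary(all_dx_visits=217, captured_visits=139,
                              dx_without_hri_cc=194)
print(f"Missed by current methods: {summary['missed_by_current_methods']}")
print(f"Share captured: {summary['share_captured']:.1%}")
print(f"Diagnosis and HRI chief complaint: {summary['dx_with_hri_cc']}")
```

Read this way, current methods capture roughly two thirds of the visits with an HRI diagnosis, and only a small minority of those visits also carry an HRI chief complaint, which is consistent with the recommendation to add diagnoses to BioSense.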
22 HRI Indicators
- Breakdown of 297 HRI visits
- Male:Female 66%:34%
- Age
- 0-3 yrs: 2%
- 4-11 yrs: 4%
- 12-19 yrs: 12%
- 20-49 yrs: 45%
- 50+ yrs: 37%
- 19% of patients with an HRI chief complaint also mapped to the Cerebrovascular Disease sub-syndrome
23 HRI Analyses
- Text parsing creates challenges
- HEAT MURMUR
- IRREGULAR HEAT BEAT
- rapid heat beat
- HEAT PROBLEMS ???
- Next steps
- Correlate with temperature data
- Examine distribution of chronic conditions among
historical visits
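The text-parsing problem listed above arises because "heat" is a common misspelling of "heart" in free-text chief complaints. One possible mitigation, sketched here in Python, is to flag a complaint as a heat candidate only when the word "heat" appears without nearby cardiac vocabulary. The term list in `CARDIAC_CONTEXT`, the category labels, and the function name `classify_heat_complaint` are assumptions made for illustration and are not BioSense's parsing rules.

```python
import re

# Whole-word match, so "heart" itself does not count as "heat"
HEAT_TERM = re.compile(r"\bheat\b", re.IGNORECASE)
# Assumed, non-exhaustive list of words that suggest "heart" was intended
CARDIAC_CONTEXT = re.compile(r"\b(murmur|beat|rate|palpitations?)\b", re.IGNORECASE)

NOT_HEAT = "no heat term"
LIKELY_CARDIAC = "likely cardiac (heart misspelled)"
HEAT_CANDIDATE = "heat candidate"


def classify_heat_complaint(text: str) -> str:
    """Sort a chief complaint into one of three illustrative categories."""
    if not HEAT_TERM.search(text):
        return NOT_HEAT
    if CARDIAC_CONTEXT.search(text):
        return LIKELY_CARDIAC
    return HEAT_CANDIDATE


for complaint in ["HEAT MURMUR", "IRREGULAR HEAT BEAT",
                  "rapid heat beat", "HEAT PROBLEMS ???"]:
    print(f"{complaint!r}: {classify_heat_complaint(complaint)}")
```

Under this rule the first three examples from the slide are set aside as likely cardiac, while "HEAT PROBLEMS ???" remains a heat candidate; it is genuinely ambiguous and would still need manual review.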
24 Summary
- Descriptive analyses
- Hospital Real-Time data adds new capabilities
- Disease indicator patterns similar to VA/DoD data
- Cluster detection methods: modified methods and evaluation of ranking criteria can help prioritize anomalies for further investigation
- Epidemiologic studies: ongoing to correct problems, expand capacities, and improve utility of BioSense
25 Acknowledgements
- Jerry Tokars, MD, MPH
- Colleen Martin, MSPH
- Data Quality Programming Team
- Roseanne English
- Paul McMurray, MDS
- Felicita David, MS
- Laura Hall, MPH
26 For more information
- Gabriel Rainisch: GRainisch@cdc.gov
- Colleen Martin: CMartin5@cdc.gov
- BioSense Website: http://www.cdc.gov/biosense
- Technical Help Desk: 1-800-532-9929
- Secure Data Network question
- BioSense technical support
- BioSense E-mail Address: BioSenseHelp@cdc.gov
- General questions
- Additional user requests
- Problem reporting
- Suggestions for enhancements
- All feedback!