Title: Public Health Information Network PHIN Series II
1Public Health Information Network (PHIN) Series II
- Outbreak Investigation Methods
- From Mystery to Mastery
3Access Series Files Online: http://www.vdh.virginia.gov/EPR/Training.asp
- Session slides
- Session activities (when applicable)
- Session evaluation forms
- Speaker biographies
- Alternate Web site: http://www.sph.unc.edu/nccphp/phtin/index.htm
4Site Sign-in Sheet
- Please submit your site sign-in sheet and session evaluation forms to:
- Suzi Silverstein
- Director, Education and Training
- Emergency Preparedness and Response Programs
- FAX: (804) 225-3888
5Series II, Session VI
6Series II Sessions
- Recognizing an Outbreak
- Risk Communication
- Study Design
- Designing Questionnaires
- Interviewing Techniques
- Data Analysis
- Writing and Reviewing Epidemiological
Literature
7Today's Presenters
- Amy Nelson, PhD
- Consultant
- NC Center for Public Health Preparedness
- Sarah Pfau, MPH
- Consultant
- NC Center for Public Health Preparedness
8Analyzing Data: Learning Objectives
- Upon completion of this session, you will:
- Understand what an analytic study contributes to an epidemiological outbreak investigation
- Understand the importance of data cleaning as a part of analysis planning
9Analyzing Data: Learning Objectives
- Know why and how to generate descriptive statistics to assess trends in your data
- Know how to generate and interpret epi curves to assess trends in your outbreak data
- Understand how to interpret measures of central tendency
10Analyzing Data: Learning Objectives (cont'd)
- Know why and how to generate measures of association for cohort and case-control studies
- Understand how to interpret measures of association (risk ratios, odds ratios) and corresponding confidence intervals
- Know how to generate and interpret selected descriptive and analytic statistics in Epi Info software
11Lecturer
- Amy Nelson, PhD
- Consultant,
- NC Center for Public Health Preparedness
12Analyzing Data: Session Overview
- Analysis planning
- Descriptive epidemiology
- Epi curves
- Spot maps
- Measures of central tendency
- Attack rates
- Analytic epidemiology
- Measures of association
- Case study analysis using Epi Info software
13Analysis Planning
14Analysis Planning
- An invaluable investment of time
- Helps you select the most appropriate epidemiologic methods
- Helps ensure that the work leading up to analysis yields the database structure and content that your preferred analysis software needs to run analysis programs successfully
15Analysis Planning
- Several factors influence, and sometimes limit, your approach to data analysis:
- Research question
- Exposure and outcome variables
- Study design
- Sample population
16Analysis Planning
- Three key considerations as you plan your analysis:
- Work backwards from the research question(s) to design the most efficient data collection instrument
- Study design will determine which statistical tests and measures of association you evaluate in the analysis output
- Consider the need to present, graph, or map data
17Analysis Planning
- Work backwards from the research question(s) to design the most efficient data collection instrument
- Develop a sound data collection instrument
- Collect pieces of information that can be counted, sorted, and recoded or stratified
- The analysis phase is not the time to realize that you should have asked questions differently!
18Analysis Planning
- Study design will determine which statistical tools you will use
- Use the risk ratio (RR) with cohort studies and the odds ratio (OR) with case-control studies; you need to know which to evaluate, because both are generated simultaneously in Epi Info and SAS
- Some sampling methods (e.g., matching in case-control studies) require special types of analysis
19Analysis Planning
- Consider the need to present, graph, or map data
- Even if you collect continuous data, you may later categorize it so you can generate a bar graph and assess frequency distributions
- If you plan to map data, you may need X- and Y-coordinate or denominator data
20Basic Steps of an Outbreak Investigation
- Verify the diagnosis and confirm the outbreak
- Define a case and conduct case finding
- Tabulate and orient data: time, place, person
- Take immediate control measures
- Formulate and test hypotheses
- Plan and execute additional studies
- Implement and evaluate control measures
- Communicate findings
21Descriptive Epidemiology
22Step 3: Tabulate and orient data: time, place, person
- Descriptive epidemiology
- Familiarizes the investigator with the data
- Comprehensively describes the outbreak
- Is essential for hypothesis generation (step 5)
23Data Cleaning
- Check for accuracy
- Outliers
- Check for completeness
- Missing values
- Determine whether or not to create or collapse data categories
- Get to know the basic descriptive findings
24Data Cleaning: Outliers
- Outliers can be cases at the very beginning and end that may not appear to be related
- First check to make certain they are not due to a collection, coding, or data entry error
- If they are not an error, they may represent:
- Baseline level of illness
- Outbreak source
- A case exposed earlier than the others
- An unrelated case
- A case exposed later than the others
- A case with a long incubation period
25Data Cleaning: Distribution of Variables
[Figure: distribution of a variable, with an outlier marked]
26Data Cleaning: Missing Values
- The investigator can check into missing values that are expected versus those that are due to problems in data collection or entry
- The number of missing values for each variable can also be learned from frequency distributions
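A frequency distribution that also counts missing entries might be sketched in Python as follows. This is only an illustration: the records and the helper name frequency_with_missing are hypothetical and are not taken from the outbreak data set or from Epi Info.

```python
from collections import Counter

def frequency_with_missing(records, field):
    """Count each value of `field`; absent, None, or blank entries are tallied as 'MISSING'."""
    counts = Counter()
    for record in records:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            counts["MISSING"] += 1
        else:
            counts[value] += 1
    return counts

# Hypothetical records, for illustration only.
records = [
    {"SEX": "F", "AGE": 11},
    {"SEX": "M", "AGE": None},
    {"SEX": "", "AGE": 52},
    {"SEX": "F"},
]
sex_counts = frequency_with_missing(records, "SEX")
age_counts = frequency_with_missing(records, "AGE")
```

A large MISSING count for one variable is a prompt to go back to the questionnaires or the data entry step before analysis begins.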
27Data Cleaning: Frequency Distributions
28Data Cleaning: Data Categories
- Which variables are continuous versus categorical?
- Collapse existing categories into fewer?
- Create categories from continuous variables? (e.g., age)
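Creating categories from a continuous variable such as age can be sketched as below. The cut points (5, 20, 40, 60) are an arbitrary assumption chosen for illustration; the ages are the seven example ages used later in this session.

```python
from collections import Counter

def age_group(age, cut_points=(5, 20, 40, 60)):
    """Return a label for the age band containing `age` (bands are assumed, not prescribed)."""
    lower = 0
    for upper in cut_points:
        if age < upper:
            return f"{lower}-{upper - 1}"
        lower = upper
    return f"{lower}+"

ages = [7, 10, 8, 5, 5, 37, 9]
group_counts = Counter(age_group(age) for age in ages)
```

The resulting counts per band are what a bar graph of age groups would display.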
29Descriptive Epidemiology
- Comprehensively describes the outbreak
- Time
- Place
- Person
30Descriptive Epidemiology
31Descriptive Epidemiology Time
32Descriptive Epidemiology: Time
- What is an epidemic curve and how can it help in an outbreak?
- An epidemic curve (epi curve) is a graphical depiction of the number of cases of illness by the date of illness onset
33Descriptive Epidemiology: Time
- An epi curve can provide information on the following characteristics of an outbreak:
- Pattern of spread
- Magnitude
- Outliers
- Time trend
- Exposure and / or disease incubation period
34Epidemic Curves
- The overall shape of the epi curve can reveal the type of outbreak (the pattern of spread):
- Common source
- Intermittent
- Continuous
- Point source
- Propagated
35Epidemic Curves: Common Source
- People are exposed to a common harmful source
- Period of exposure may be brief (point source), long (continuous), or intermittent
36Epi Curve: Common Source Outbreak with Intermittent Exposure (Pattern of Spread)
37Epi Curve: Common Source Outbreak with Continuous Exposure (Pattern of Spread)
38Epi Curve: Point Source Outbreak (Pattern of Spread)
39Epi Curve: Propagated Outbreak (Pattern of Spread)
40Epidemic Curves: Magnitude
41Epidemic Curves: Time Trend
- Provide information about the time trend of the outbreak
- Consider:
- Date of illness onset for the first case
- Date when the outbreak peaked
- Date of illness onset for the last case
42Epidemic Curves: Time Trend
43Epidemic Curves: Incubation Period
- If the timing of the exposure is known, epi curves can be used to estimate the incubation period of the disease
- The time between the exposure and the peak of the epi curve represents the median incubation period
44Epidemic Curves: Incubation Period
- In common source outbreaks with known incubation periods, epi curves can help determine the probable period of exposure
- Find the average incubation period for the organism and count backwards from the peak case on the epi curve
45Epidemic Curves
- This can also be done using the minimum incubation period
- Find the minimum incubation period for the organism and count backwards from the earliest case on the epi curve
46Exposure / Outbreak Incubation Period
- The exposure estimates obtained from the average and minimum incubation periods should be close and should represent the probable period of exposure
- Widen the estimated exposure period by 10 to 20%
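The counting-back procedure can be sketched in Python. This is a minimal sketch under stated assumptions: the dates and incubation periods are hypothetical, and widening each end by a fraction of the window's own width is one possible reading of "widen by 10 to 20%", not a rule given in the slides.

```python
from datetime import datetime, timedelta

def probable_exposure_period(earliest_onset, peak_onset,
                             min_incubation, avg_incubation, widen=0.10):
    """Count back from the earliest case (minimum incubation) and from the
    peak (average incubation), then widen the resulting window on both ends."""
    by_minimum = earliest_onset - min_incubation
    by_average = peak_onset - avg_incubation
    start, end = min(by_minimum, by_average), max(by_minimum, by_average)
    margin = (end - start) * widen   # assumption: margin is a fraction of the window width
    return start - margin, end + margin

# Hypothetical outbreak, for illustration only.
start, end = probable_exposure_period(
    earliest_onset=datetime(2024, 3, 2, 8, 0),
    peak_onset=datetime(2024, 3, 3, 12, 0),
    min_incubation=timedelta(hours=24),
    avg_incubation=timedelta(hours=48),
)
```

In this made-up example the two estimates fall four hours apart, so the widened window runs from 24 minutes before the earlier estimate to 24 minutes after the later one.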
47Calculating Incubation Period
Onset of illness among cases of E. coli O157:H7 infection, Massachusetts, December 1998.
48Creating an Epidemic Curve
- Provide a descriptive title
- Label each axis
- Plot the number of cases of disease reported during an outbreak on the y-axis
- Plot the time or date of illness onset on the x-axis
- Include the pre-epidemic period to show the baseline number of cases
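The counts behind an epi curve can be tabulated with a few lines of Python before any graphing tool is used. The onset dates below are hypothetical, and lead_days is an assumed parameter for showing a short pre-epidemic period; real baseline cases would appear only if they are in the data.

```python
from collections import Counter
from datetime import date, timedelta

def epi_curve_counts(onset_dates, lead_days=2):
    """Return (day, case count) pairs for every day from `lead_days` before
    the first onset through the last onset, including days with zero cases."""
    counts = Counter(onset_dates)
    day = min(onset_dates) - timedelta(days=lead_days)
    last = max(onset_dates)
    series = []
    while day <= last:
        series.append((day, counts.get(day, 0)))
        day += timedelta(days=1)
    return series

def render(series):
    """Draw a text histogram: one row per day, one X per case."""
    return "\n".join(f"{day.isoformat()} | {'X' * n}" for day, n in series)

# Hypothetical onset dates, for illustration only.
onsets = [date(2024, 5, 3), date(2024, 5, 4), date(2024, 5, 4), date(2024, 5, 4),
          date(2024, 5, 5), date(2024, 5, 5), date(2024, 5, 7)]
series = epi_curve_counts(onsets)
chart = render(series)
```

Keeping the zero-count days in the series matters: dropping them would compress the x-axis and distort the shape of the curve.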
49Epi Curve for a Common Source Outbreak with Continuous Exposure
[Figure: epi curve with the y-axis and x-axis labeled]
50Creating an Epidemic Curve
- X-axis considerations
- Choice of time unit for the x-axis depends upon the incubation period
- Begin with a unit approximately one quarter the length of the incubation period
- Example:
- 1. Mean incubation period for influenza = 36 hours
- 2. 36 x ¼ = 9
- 3. Use 9-hour intervals on the x-axis for an outbreak of influenza lasting several days
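The quarter-of-incubation rule, and the binning that follows from it, can be sketched as below. The onset times (hours since the first case) are hypothetical; only the 36-hour influenza figure comes from the example above.

```python
def interval_width(mean_incubation_hours):
    """X-axis unit: roughly one quarter of the incubation period."""
    return mean_incubation_hours / 4

def bin_index(hours_since_start, width):
    """Index of the x-axis interval that an onset time falls into."""
    return int(hours_since_start // width)

width = interval_width(36)                      # influenza example: 9-hour intervals
onset_hours = [0, 5, 9, 17, 18, 40]             # hypothetical onset times
bins = [bin_index(h, width) for h in onset_hours]
```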
51Creating an Epidemic Curve
- X-axis considerations
- If the incubation period is not known, graph several epi curves with different time units
- Usually the day of illness onset is the best unit for the x-axis
52Epi Curve: X-Axis Considerations
X-axis unit of time = 1 week
X-axis unit of time = 1 day
53Descriptive Epidemiology
54Descriptive Epidemiology Place
- Spot map
- Shows where cases live, work, spend time
- If population size varies between locations being
compared, use location-specific attack rates
instead of number of cases
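Why location-specific attack rates matter can be shown with a small sketch. The two locations and their counts are hypothetical: both report the same number of cases, but the populations differ fivefold.

```python
def location_attack_rates(cases_by_location, population_by_location):
    """Attack rate per location = cases / population at risk in that location."""
    return {
        location: cases_by_location.get(location, 0) / population
        for location, population in population_by_location.items()
        if population > 0
    }

# Hypothetical data, for illustration only.
cases = {"North": 10, "South": 10}
population = {"North": 100, "South": 500}
rates = location_attack_rates(cases, population)
```

A spot map of raw case counts would show the two areas as equally affected, while the rates (10% versus 2%) point to the northern location.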
55Descriptive Epidemiology: Place
Source: http://www.phppo.cdc.gov/PHTN/catalog/pdf-file/LESSON4.pdf
56Descriptive Epidemiology
57Descriptive Epidemiology: Person
- Data summarization for descriptive epidemiology of the population
- Line listings
- Graphs
- Bar graphs
- Histograms
58Line Listing
59Bar Graph
60Descriptive Epidemiology
- Measures of central tendency
- Mean
- Median
- Mode
- Range
61Measures of Central Tendency
- Mean (Average)
- The sum of all values divided by the number of values
- Example:
- Cases: 7, 10, 8, 5, 5, 37, 9 years old
- Mean = (7 + 10 + 8 + 5 + 5 + 37 + 9) / 7
- Mean = 11.6 years of age
62Measures of Central Tendency
- Median (50th percentile)
- The value that falls in the middle position when the measurements are ordered from smallest to largest
- Example:
- Ages: 7, 10, 8, 5, 5, 37, 9
- Ages sorted: 5, 5, 7, 8, 9, 10, 37
- Median age = 8
63Calculate a Median Value
- If the number of measurements is odd:
- Median = value with rank (n + 1) / 2
- Where n = the number of values
- 5, 5, 7, 8, 9, 10, 37
- n = 7, (n + 1) / 2 = (7 + 1) / 2 = 4
- The 4th value = 8
64Calculate a Median Value
- If the number of measurements is even:
- Median = average of the two values with
- rank of n / 2 and
- rank of (n / 2) + 1
- Where n = the number of values
- 5, 5, 7, 8, 9, 10
- n = 6, n / 2 = 3, so 7 is the first value
- (n / 2) + 1 = 4, so 8 is the second value
- (7 + 8) / 2 = 7.5
- The median value = 7.5
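The rank rules for the median can be checked with a short Python sketch. The function name median_by_rank is hypothetical; the seven ages are the example ages from this session, and the six ages are the values from the mean and median activity later in the session.

```python
import statistics

def median_by_rank(values):
    """Median via the rank rules: rank (n + 1) / 2 when n is odd; the average
    of the values ranked n / 2 and (n / 2) + 1 when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        rank = (n + 1) // 2
        return ordered[rank - 1]          # ranks are 1-based, list indexes 0-based
    lower_rank = n // 2
    upper_rank = lower_rank + 1
    return (ordered[lower_rank - 1] + ordered[upper_rank - 1]) / 2

odd_ages = [7, 10, 8, 5, 5, 37, 9]        # n = 7
even_ages = [5, 9, 7, 6, 8, 5]            # n = 6

odd_median = median_by_rank(odd_ages)
even_median = median_by_rank(even_ages)
mean_age = sum(odd_ages) / len(odd_ages)
```

The standard library's statistics.median applies the same rules, so it can serve as a cross-check. Note how the single age of 37 pulls the mean of the first list well above its median.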
65Measures of Central Tendency
- Mode (Modal Value)
- The value that occurs the most frequently
- Example: 5, 5, 7, 8, 9, 10, 37
- Mode = 5
- It is possible to have more than one mode
- Example: 5, 5, 7, 8, 10, 10, 37
- Modes = 5 and 10
66Measures of Central Tendency
- Mode (Modal Value)
- The value of the variable with the greatest frequency of records
- Epi Info limitation:
- If multiple values share the same highest frequency, Epi Info will identify only the first value it encounters as the mode as it scans the table in ascending order
67Measures of Central Tendency: Mode Software Limitation
Modal values: the ages 11, 17, 35, and 62 all qualify as modes, but Epi Info identifies age 11 as the mode in the analysis output for MEANS AGE in viewOswego.
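To see every mode rather than only the first, the values can be checked outside Epi Info. The sketch below uses Python's statistics.multimode (Python 3.8 or later) on the two-mode example list from this session; taking the smallest mode imitates the single-mode behavior described above and is an illustration, not a reproduction of Epi Info's internals.

```python
from statistics import multimode

ages = [5, 5, 7, 8, 10, 10, 37]

all_modes = sorted(multimode(ages))   # every value tied for the highest frequency
first_mode_only = all_modes[0]        # what a scan in ascending order would report
```

If all_modes holds more than one value, the single mode in the software output should be reported with that caveat.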
68Measures of Central Tendency
Min: 3
Max: 77
Mode: 11
Median (50th percentile): 36.0
Mean (average): 36.8
69Activity: Calculate Mean and Median
- Completion time 5 minutes
70Calculate Mean and Median Age
- For an even number of measurements,
- Median = the average of the two values ranked
- n / 2
- (n / 2) + 1
71Calculate Mean and Median Age
- Mean age
- 5 + 9 + 7 + 6 + 8 + 5 = 40
- 40 / 6 = 6.67 years
- Median age
- 5, 5, 6, 7, 8, 9
- Average of values ranked (n / 2) and (n / 2) + 1
- (6 / 2) = 3 and (6 / 2) + 1 = 4: average of 6 and 7
- (6 + 7) / 2 = 6.5 years
72Question & Answer Opportunity
73Five-minute break
74Attack Rates
75Attack Rates (AR)
- AR = (# of cases of a disease) / (# of people at risk, for a limited period of time)
- Food-specific AR = (# of people who ate a food and became ill) / (# of people who ate that food)
76Food-Specific Attack Rates
Source: CDC. Outbreak of foodborne streptococcal disease. MMWR 23:365, 1974.
This food is probably not the source of infection
77Stratified Attack Rates
Attack rate in women = 13 / 29 = 45%
Attack rate in men = 5 / 32 = 16%
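The attack rate arithmetic is simple enough to script. The sketch below recomputes the two stratified rates shown above; the function name attack_rate is hypothetical.

```python
def attack_rate(cases, at_risk):
    """Attack rate = number of cases / number of people at risk."""
    if at_risk <= 0:
        raise ValueError("at_risk must be positive")
    return cases / at_risk

ar_women = attack_rate(13, 29)
ar_men = attack_rate(5, 32)
summary = f"women {ar_women:.0%}, men {ar_men:.0%}"
```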
78Hypothesis Generation vs. Hypothesis Testing
79Hypothesis Generation vs. Hypothesis Testing
- Formulate hypotheses
- Occurs after having spoken with some case patients and public health officials
- Based on information from literature review
- Based on descriptive epidemiology (step 3)
- Test hypotheses
- Occurs after hypotheses have been generated
- Based on analytic epidemiology
81Analytic Epidemiology
82Analytic Epidemiology
- Measures of Association
- Risk Ratio (cohort study)
- Odds Ratio (case-control study)
83Cohort versus Case-Control Study
84Cohort versus Case-Control Study
85Analysis Output
86Cohort Study
87Risk Ratio
88Risk Ratio Example
RR = (43 / 54) / (3 / 21) = 5.6
89Interpreting a Risk Ratio
- RR = 1.0: no association between exposure and disease
- RR > 1.0: positive association
- RR < 1.0: negative association
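A risk ratio and its reading can be sketched in Python using the counts from the risk ratio example above (43 of 54 exposed people ill, 3 of 21 unexposed people ill). The function names are hypothetical.

```python
def risk_ratio(ill_exposed, total_exposed, ill_unexposed, total_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = ill_exposed / total_exposed
    risk_unexposed = ill_unexposed / total_unexposed
    return risk_exposed / risk_unexposed

def interpret(ratio):
    """Direction of association implied by a risk ratio (or odds ratio)."""
    if ratio > 1.0:
        return "positive association"
    if ratio < 1.0:
        return "negative association"
    return "no association"

rr = risk_ratio(43, 54, 3, 21)
reading = interpret(rr)
```

The direction alone says nothing about statistical significance; that requires the confidence interval.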
90Case-Control Study
91Odds Ratio
92Odds Ratio Example
OR = (60 / 18) / (25 / 55) = 7.3
93Interpreting an Odds Ratio
- The odds ratio is interpreted in the same way as a risk ratio
- OR = 1.0: no association between exposure and disease
- OR > 1.0: positive association
- OR < 1.0: negative association
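The odds ratio example can be recomputed as below. Because the 2 x 2 table itself is not reproduced in this transcript, the cell labels (a, b for cases; c, d for controls) are an assumption; the four numbers are those in the example.

```python
def odds_ratio(a, b, c, d):
    """Odds of exposure among cases (a / b) divided by odds among controls (c / d).

    Assumed layout: a = exposed cases, b = unexposed cases,
                    c = exposed controls, d = unexposed controls.
    """
    return (a / b) / (c / d)

or_example = odds_ratio(60, 18, 25, 55)
cross_product = (60 * 55) / (18 * 25)   # equivalent cross-product form, ad / bc
```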
94What to do with a Zero Cell
- Try to recruit more study participants
- Add 1 to each cell
- Remember to document / report this!
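The add-1 adjustment can be sketched as follows. The counts are hypothetical, the cell layout (a = exposed cases, b = unexposed cases, c = exposed controls, d = unexposed controls) is an assumption, and the function returns a flag so the adjustment can be documented in the report.

```python
def odds_ratio_zero_safe(a, b, c, d):
    """Cross-product odds ratio (ad / bc); if any cell is zero, add 1 to every
    cell first and report that the adjustment was made."""
    cells = [a, b, c, d]
    adjusted = any(cell == 0 for cell in cells)
    if adjusted:
        a, b, c, d = (cell + 1 for cell in cells)
    return (a * d) / (b * c), adjusted

# Hypothetical table with a zero cell, for illustration only.
estimate, was_adjusted = odds_ratio_zero_safe(10, 0, 5, 8)
```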
95Confidence Intervals
96Confidence Intervals
- Allow the investigator to
- Evaluate statistical significance
- Assess the precision of the estimate (the odds ratio or risk ratio)
- Consist of a lower bound and an upper bound
- Example: RR = 1.9, 95% CI 1.1-3.1
97Confidence Intervals
- Provide information on precision of estimate
- Narrow confidence intervals = more precise
- Wide confidence intervals = less precise
- Example: OR = 10, 95% CI 0.9-44.0 (wide, less precise)
- Example: OR = 10, 95% CI 9.0-11.0 (narrow, more precise)
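One common way to approximate a 95% confidence interval for a risk ratio is the log-based (Katz) method, sketched below. The slides do not state which method Epi Info uses, so this choice is an assumption; the counts are those of the risk ratio example earlier in this session (43 of 54 exposed ill, 3 of 21 unexposed ill).

```python
import math

def risk_ratio_ci(ill_exposed, total_exposed, ill_unexposed, total_unexposed, z=1.96):
    """Risk ratio with an approximate confidence interval on the log scale.

    z = 1.96 corresponds to a 95% interval.
    """
    rr = (ill_exposed / total_exposed) / (ill_unexposed / total_unexposed)
    se_log_rr = math.sqrt(
        1 / ill_exposed - 1 / total_exposed
        + 1 / ill_unexposed - 1 / total_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

rr, lower, upper = risk_ratio_ci(43, 54, 3, 21)
significant = lower > 1.0 or upper < 1.0   # the interval excludes 1.0
```

Under this approximation the interval comes out at roughly 1.9 to 16.0: it excludes 1.0, so the association is statistically significant, but it is wide, so the estimate is not very precise.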
98Plan and Execute Additional Studies
- To gather more specific info
- Example: Salmonella muenchen
- Intervention study
- Example: implement intensive hand-washing
99Question & Answer Opportunity
100Five-minute break
101Epi Info Analysis
- Case Study
- Download Epi Info software for free at http://www.cdc.gov/epiinfo
102Oswego Tutorial
- 1. Epi Info Main Menu
- 2. Help
- 3. Tutorials
- 4. Oswego Tutorial
103Case Study Overview
- Oswego County, New York, 1940
- 80 people attended a church supper on 4/18
- 46 people who attended the supper suffered from gastrointestinal illness beginning 4/18 and ending 4/19
- 75 people (ill and non-ill) interviewed
- Investigation focus: church supper as source of infection
104Church Supper
- Supper held in the church basement.
- Foods contributed by numerous families.
- Supper from 6:00 PM to 11:00 PM, so food consumed over a period of several hours.
105Case Study: Descriptive Epidemiology
- Investigators needed to determine:
- The type of outbreak occurring
- The pathogen causing the acute gastrointestinal illness
- The source of infection
106Data Cleaning
- Know your data! Know the
- Number of records
- Field formats and contents
- Special properties
- Table relationships
107Data Cleaning
Tell Epi Info which records to include in
analyses
Set command in Analyze Data
108Case Study: Line Listing
- Organize and review data about time, person, and place that were collected via hypothesis-generating interviews.
109Case Study Line Listing
Code for generating output
110Line Listing: Windows Commands
- 1. Read (viewOswego in Sample.MDB)
- 2. Sort (on AGE, in ascending order)
- 3. Select (only the cases where ILL = Yes)
- 4. List (generate a line listing with the fields AGE, SEX, and DATEONSET)
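For readers who think in code, the same sort, select, and list sequence can be sketched in Python. This is an analogy only, not Epi Info syntax; the three records and their values are hypothetical, with field names borrowed from the commands above.

```python
# Hypothetical records, for illustration only.
records = [
    {"AGE": 52, "SEX": "F", "ILL": "Yes", "DATEONSET": "04/19/1940 12:30:00 AM"},
    {"AGE": 11, "SEX": "M", "ILL": "No", "DATEONSET": None},
    {"AGE": 17, "SEX": "F", "ILL": "Yes", "DATEONSET": "04/18/1940 11:00:00 PM"},
]

def line_listing(records, fields=("AGE", "SEX", "DATEONSET")):
    """Keep only ill people, sort them by age, and list the chosen fields."""
    ill = [record for record in records if record["ILL"] == "Yes"]   # Select
    ill.sort(key=lambda record: record["AGE"])                       # Sort (ascending)
    return [tuple(record[field] for field in fields) for record in ill]  # List

listing = line_listing(records)
```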
111Case Study: Means
Windows command: Means (of AGE)
112Distribution: Frequency by Gender
Windows command: Frequencies (by SEX)
113Case Study: Epidemic Curve
- Variable of Interest
- DATEONSET (date of onset of illness)
- Entered into database: mm/dd/yyyy/hh/mm/ss/AM PM
114Case Study Epidemic Curve
115Point-Source Outbreak
[Figure: textbook distribution compared with the case study distribution]
116Case Study: Epidemic Curve
[Figure annotations: average incubation period, maximum incubation period, overlap, possible outlier]
117Using Epi Info to Create Epi Curves
- Step-by-Step Instructions
- Open the Analyze Data component
- Use the Read command to access your data table
- Click on the Graph command
- Choose Histogram as the Graph Type
- Choose your date / time of illness onset variable as the x-axis main variable
118Using Epi Info to Create Epi Curves
- Step-by-Step Instructions
- Choose count from the Show value of option beneath the y-axis option
- Choose weeks, days, hours, or minutes for the x-axis interval from the interval dropdown menu
- Type in the graph title where it says Page title
- Click OK
119Determine Incubation Period
- Alternative: create a temporary variable called INCUBATION in Analyze Data
- INCUBATION = DATEONSET - TIMESUPPER
- Where field format is identical
- Date / time: mm/dd/yyyy/hh/mm/ss/AM PM
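Outside Epi Info, the same subtraction can be sketched with Python's datetime module. The text layout in STAMP_FORMAT is an assumed rendering of the date / time fields, and the two supper and onset pairs are hypothetical rather than actual records from the case study.

```python
from datetime import datetime

STAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"   # assumed text layout of the date / time fields

def incubation_hours(time_supper, date_onset):
    """Incubation period in hours: time of onset minus time of the meal."""
    supper = datetime.strptime(time_supper, STAMP_FORMAT)
    onset = datetime.strptime(date_onset, STAMP_FORMAT)
    return (onset - supper).total_seconds() / 3600

# Hypothetical (supper, onset) pairs, for illustration only.
pairs = [
    ("04/18/1940 07:30:00 PM", "04/19/1940 12:30:00 AM"),
    ("04/18/1940 08:00:00 PM", "04/18/1940 11:00:00 PM"),
]
hours = [incubation_hours(supper, onset) for supper, onset in pairs]
mean_hours = sum(hours) / len(hours)
```

Both fields must share one format, as noted above; otherwise the parsing step fails before any subtraction takes place.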
120Means INCUBATION: Analysis Output
121Calculate Mean Incubation in Epi Info
122Identify the Pathogen. . .
123Identify the Pathogen. . .
- CDC's Foodborne Outbreak Response and Surveillance Unit
- Guide to Confirming the Diagnosis in Foodborne Diseases
- http://www.cdc.gov/foodborneoutbreaks/guide_fd.htm
124Case Study: Attack Rates
- Obtain the information that you need to calculate food-specific attack rates via:
- Stratified Frequency Tables
- Line Listings
- 2 x 2 Tables
- Food-specific AR = (# of people who ate a food and became ill) / (# of people who ate that food)
125Stratified Frequency Tables
40 people ate cake; 27 people who ate cake are ill.
AR for people who consumed cake = 27 / 40 = 67.5%
35 people did not eat cake; 19 of those people are ill.
AR for people who did not consume cake = 19 / 35 = 54.3%
Windows command: Frequencies CAKES, Stratify by ILL
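The cake attack rates can be reproduced from person-level records with a short Python sketch. The four cell counts follow from the figures on this slide (40 ate cake, 27 of them ill; 35 did not, 19 of them ill), which leaves 13 and 16 people not ill; the tuple layout and function name are assumptions for illustration.

```python
from collections import Counter

# One (ate_cake, ill) tuple per interviewed person, rebuilt from the slide's counts.
records = (
    [(True, True)] * 27 + [(True, False)] * 13
    + [(False, True)] * 19 + [(False, False)] * 16
)
cells = Counter(records)   # the four cells of the 2 x 2 table

def food_specific_ar(cells, ate):
    """Attack rate among people who did (ate=True) or did not (ate=False) eat the food."""
    ill = cells[(ate, True)]
    total = ill + cells[(ate, False)]
    return ill / total

ar_ate = food_specific_ar(cells, True)
ar_did_not_eat = food_specific_ar(cells, False)
```

The two rates are fairly close, which is the pattern expected when a food is not the vehicle of infection.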
126Line Listings
13 (not ill) + 27 (ill) = 40 people ate cake
27 people who ate cake are ill
AR for people who consumed cake = 27 / 40 = 67.5%
127Tables: Analysis Output
2 x 2 Table
Windows command: Tables (Exposure = CAKES, Outcome = ILL)
128Activity: Interpreting Output
What percentage of people who ate cake did not get ill?
129Activity: Interpreting Output
Answer: 32.5% of the people who ate cake did not get ill.
130Case Study Attack Rates
We should further investigate the association of
vanilla ice cream consumption and illness
131Generate and Test a Hypothesis!
- The epi curve is indicative of a point-source outbreak
- Based on the incubation period, we suspect Staphylococcus aureus as the pathogen
- The food-specific attack rates lead us to believe that vanilla ice cream may be the source of infection
132Case Study
133Tables: Analysis Output
Epi Info 2 x 2 Table
2 x 2 Table Shell
Windows command: Tables (for VANILLA)
134Tables Analysis Output
The risk of becoming ill was more than five
times greater for people who consumed vanilla ice
cream than for people who did not consume
vanilla ice cream.
135Case Study: Analytic Results
- Point-source outbreak
- Staphylococcus aureus: suspected pathogen, based on 4.3 hr average incubation period
- Vanilla ice cream: suspected source of infection (highest food-specific AR, 80%)
- Vanilla ice cream RR = 5.6
- Vanilla ice cream C.I. = 1.9-16.0
136Online Epi Info Instruction
- http://www.sph.unc.edu/nccphp/training/all_trainings/at_epi_info.htm
- 8 Self-Instructional Training Modules for various screen components, functions, and commands in Analyze Data
137Question & Answer Opportunity
138Next Session: December 1st, 1:00 p.m. - 3:00 p.m.
- Topic: Writing and Reviewing Epidemiological Literature
139Session VI Summary
- Analysis planning can be an invaluable investment of time: it helps you select the most appropriate epidemiologic methods and helps ensure that the work leading up to analysis yields the database structure and content that your preferred analysis software needs to run analysis programs successfully.
- As you plan your analysis: 1) work backwards from the research question(s) to design the most efficient data collection instrument; 2) consider your study design to guide which statistical tests and measures of association you evaluate in the analysis output; and 3) consider the need to present, graph, or map data.
140Session VI Summary
- Descriptive epidemiology 1) familiarizes the investigator with data about time, place, and person; 2) comprehensively describes the outbreak; and 3) is essential for hypothesis generation.
- Data cleaning is the first step in preparing to generate descriptive statistics, as it contributes to the accuracy and completeness of the data.
- Measures of central tendency provide a means of assessing the distribution of data. Measures include mean, median, mode, and range.
- Epi curves, spot maps, and line listings are ways to generate and review the time, place, and person elements, respectively, of descriptive statistics.
141Session VI Summary
- Attack rates are descriptive statistics that are useful for comparing the risk of disease in groups with different exposures (such as consumption of individual food items).
- Analytic epidemiology allows you to test the hypotheses generated via review of descriptive statistics and the medical literature.
- The measures of association for case-control and cohort analytic studies, respectively, are odds ratios and risk ratios.
- Confidence intervals that accompany measures of association let you evaluate the statistical significance of the measures and assess the precision of the estimates.
142References and Resources
- Centers for Disease Control and Prevention (1992). Principles of Epidemiology, 2nd ed. Atlanta, GA: Public Health Practice Program Office.
- Division of Public Health Surveillance and Informatics, Epidemiology Program Office, Centers for Disease Control and Prevention (January 2003). Epi Info Support Manual. Included with installation of the software, which can be found at http://www.cdc.gov/epiinfo/index.htm
- Gordis L. (1996). Epidemiology. Philadelphia: WB Saunders.
143References and Resources
- Rothman KJ. (2002). Epidemiology: An Introduction. New York: Oxford University Press.
- Stehr-Green, J. and Stehr-Green, P. (2004). Hypothesis Generating Interviews. Module 3 of a Field Epidemiology Methods course being developed in the NC Center for Public Health Preparedness, UNC Chapel Hill.
- Torok, M. (2004). FOCUS on Field Epidemiology: Epidemic Curves. Volume 1, Issue 5. NC Center for Public Health Preparedness.