Title: Selecting Evidence for Comparative Effectiveness Reviews: When to use Observational Studies
1Selecting Evidence for Comparative Effectiveness
Reviews: When to use Observational Studies
- Dan Jonas, MD, MPH
- Meera Viswanathan, PhD
- Karen Crotty, PhD, MPH
- RTI-UNC Evidence-based Practice Center
2Sources
- AHRQ Methods Guide, Chapters 4 and 8,
http://www.effectivehealthcare.ahrq.gov/repFiles/2007_10DraftMethodsGuide.pdf
- Draft manuscript, Norris et al., Observational
Studies in Systematic Reviews of Comparative
Effectiveness.
- Chou R, Aronson N, Atkins D, et al. Assessing
harms when comparing medical interventions: AHRQ
and the Effective Health Care Program. J Clin
Epidemiol 2008 Sep 25.
3Overview
- Why should reviewers consider including
observational studies (OS) in comparative
effectiveness reviews (CERs)?
- When should OS be included in CERs?
- What are the differences in considering inclusion
of OS for benefits as opposed to OS for harms?
4Current Perspective
- CERs should consider including observational
studies - this should be the default strategy
- Reviewers should explicitly state the rationale
for including or excluding OS
5Comparative Effectiveness Reviews (CERs)
- Systematic reviews that compare the relative
benefits and harms among a range of available
treatments or interventions for a given condition
6CER Process Overview
7Hierarchy of Evidence
8Danger of Over-reliance on RCTs
- May be unnecessary, inappropriate, inadequate, or
impractical
- May be too short in duration
- May report intermediate outcomes rather than main
health outcomes of interest
- Often not available for vulnerable populations
- Generally report efficacy rather than
effectiveness
- AHRQ Evidence-based Practice Centers include a wide
variety of study designs (not only RCTs)
9Observational Studies (OS)
- Definition: studies where the investigators did
not assign the exposure/intervention
- i.e., non-experimental studies
- Controlled clinical trials are quasi-experimental
studies, not OS
- We present considerations for including OS to
assess benefits and to assess harms separately
10OS to Assess Benefits
- Often insufficient evidence from trials to answer
all KQs in CERs (think PICOTS)
- Population: trials may not be available for
sub-populations and vulnerable populations
- Interventions: may not be able to assign
high-risk interventions randomly
- Outcomes: trials may report intermediate outcomes
rather than main health outcomes of interest
- Timing: trials may be too short in duration
- Setting: trials may not represent typical practice
11Group Exercise
- What should reviewers consider when deciding
whether or not to include observational studies
in CERs?
12OS to Assess Benefits
- Reviewers should consider 2 questions:
- Are there gaps in trial evidence for the review
questions under consideration?
- Will observational studies provide valid and
useful information to address key questions?
13Are there gaps in trial evidence? Will OS provide
valid and useful information?
14Group Exercise: Include OS?
- CER of PCI vs. CABG for coronary disease
identified 23 RCTs. Experts (TEP) raised
concerns that the studies enrolled patients with
a relatively narrow spectrum of disease relative
to those having the procedures in current
practice
- Review of antioxidant supplementation to prevent
heart disease found numerous large clinical
trials, including over 20,000 elevated-risk
subjects in the Heart Protection Study. No
beneficial effects were seen in CV outcomes,
including mortality. Findings were consistent
across trials with varying populations, sizes,
etc.
15Group Exercise: Include OS?
- CER of PCI vs. CABG: need to look for OS
- OS from 10 large cardiovascular registries were
identified
- These confirmed that the use of the procedures in
the community included patients with wider
variation in disease
- For patients similar to those enrolled in trials,
mortality results in the registries were similar
to trials (no difference between interventions)
- Relative benefits of the procedures varied
markedly with extent of disease, raising caution
about extending trial conclusions to patients
with greater or lesser disease than those in
trial populations
- Review of antioxidant supplementation to prevent
heart disease: trial data are sufficient
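The two questions reviewers are asked to consider (gaps in trial evidence, then validity and usefulness of OS) can be written down as a small decision sketch. This is only an illustration, not a method from the AHRQ guide: the class `ReviewQuestion`, the function `include_os_for_benefits`, and the judgments entered for the two exercise topics are assumptions made for the example. In practice both inputs are reviewer judgments, not computed values.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ReviewQuestion:
    """Hypothetical record of reviewer judgments for one review question."""
    topic: str
    # Question 1: PICOTS elements where trial evidence was judged insufficient.
    trial_gaps: List[str] = field(default_factory=list)
    # Question 2: reviewer judgment that available OS are valid and useful.
    os_valid_and_useful: bool = False


def include_os_for_benefits(q: ReviewQuestion) -> Tuple[bool, str]:
    """Return (include OS?, rationale to state explicitly in the review)."""
    if not q.trial_gaps:
        return False, "Trial evidence judged sufficient; no gaps identified."
    gaps = ", ".join(q.trial_gaps)
    if not q.os_valid_and_useful:
        return False, f"Gaps in trial evidence ({gaps}), but OS judged unlikely to be valid and useful."
    return True, f"Gaps in trial evidence ({gaps}); OS judged valid and useful."


# Illustrative inputs loosely based on the two exercise topics (assumed judgments).
pci_cabg = ReviewQuestion("PCI vs. CABG", trial_gaps=["Population"], os_valid_and_useful=True)
antioxidants = ReviewQuestion("Antioxidant supplementation")

for q in (pci_cabg, antioxidants):
    include, rationale = include_os_for_benefits(q)
    print(q.topic, "->", "include OS" if include else "exclude OS", "|", rationale)
```

Returning the rationale together with the decision mirrors the recommendation that reviewers explicitly state why OS were included or excluded.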
16Gaps in Trial Evidence: PICOTS
- Trial data may be insufficient for a number of
reasons
- PICOTS
- Populations included (missing certain groups)
- Interventions included
- Outcomes reported (only intermediate)
- Duration
- All trials may be efficacy studies
17Are Trial Data Sufficient? PICOTS and Beyond
- Risk of bias (internal validity)
- Degree to which the findings may be attributed to
factors other than the intervention under review
- Consistency
- Extent to which effect size and direction vary
within and across studies
- Inconsistency may be due to heterogeneity across
PICOTS
- Directness
- Degree to which outcomes that are important to
users of the CER (patients, clinicians, or
policymakers) are encompassed by trial data
- Health outcomes generally most important
18Are Trial Data Sufficient? PICOTS and Beyond
- Precision
- Includes sample size, number of studies, and
heterogeneity within or across studies
- Reporting bias
- Extent to which trial authors appear to have
reported all outcomes examined
- Applicability
- Extent to which the trial data are likely to be
applicable to populations, interventions, and
settings of interest to the user
- The review questions should reflect the PICOTS
characteristics of interest
19When to Identify Gaps in Trial Evidence
- Identification of gaps in trial evidence
available to answer review questions can occur at
a number of points in the review:
- When first scoping the review
- Consultation with Technical Expert Panel
- Initial review of titles and abstracts
- After detailed review of trial data
20CER Process Overview
21Gaps in Trial Evidence
- Operationally, may perform initial searches
broadly, to identify both OS and trials, or may
do searches sequentially and search for OS after
reviewing trials in detail to identify gaps in
evidence
22Question 2: Will observational studies provide valid and
useful information to address key questions?
- Reviewers should:
- Refocus the study question on gaps in trial
evidence
- Specify the PICOTS characteristics for gaps in
trial evidence
- Assess whether available OS may address the
review questions (applicable to PICOTS?)
- Assess suitability of OS to answer the review
questions
23Valid and Useful Information
- Assess suitability of OS to answer the review
questions
- Do this after gaps have been identified in the
trial literature and OS potentially fill those
gaps
- Consider the clinical context and natural history
of the condition under study
- Assess how potential biases may influence the
results of OS
24Clinical context
- Fluctuating or intermittent conditions are more
difficult to assess with OS
- Especially if there is no comparison group
- OS may be more useful for conditions with steady
progression or decline
25Group Exercise
- Here are two very different conditions:
- Acute low back pain
- Amyotrophic lateral sclerosis (ALS)
- How might the differences in these conditions
impact whether OS would provide useful
information?
26Group Exercise
- The main consideration here is the natural history
of the condition under study
- People with acute low back pain often recover
spontaneously
- A cohort study of treatments for acute low back
pain cannot establish, with any degree of
certainty, whether the treatments affected
patient outcomes
- ALS has a course of steady decline
- An uncontrolled cohort study of treatments for
ALS may well be able to demonstrate meaningful
effects
27Potential biases
- Selection bias (and confounding by indication)
- Performance bias
- Detection bias
- Attrition bias
28Group Exercise
- Suppose you're conducting a CER of medications
for rheumatoid arthritis (RA)
- You find several retrospective analyses of
administrative databases comparing outcomes of RA
patients taking etanercept vs. methotrexate
- Suppose that etanercept is restricted in many of
the health systems to patients with more severe
RA who have failed other treatments
- Should you include these OS?
- What considerations will influence your decision?
29Group Exercise
- Confounding by indication
- A type of selection bias
- Occurs when different diagnoses, severity of
illness, or comorbid conditions are important
reasons for physicians to assign different treatments
- Common problem in pharmacoepidemiology studies
comparing beneficial effects of interventions
- Generally would not include these studies due
to a high risk of bias (poor internal validity),
unless studies had a good way to adjust for
severity of disease
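A small worked example may help show why confounding by indication matters in the etanercept vs. methotrexate scenario. All counts below are invented for illustration (they come from no study): the two drugs are given identical response rates within each severity stratum, but etanercept is assumed to be prescribed mostly to patients with severe RA. The crude comparison then makes etanercept look worse, while the severity-stratified comparison shows no difference.

```python
# Hypothetical counts: (treatment, severity) -> (patients, patients with good outcome).
# These numbers are assumptions chosen only to illustrate the bias.
counts = {
    ("methotrexate", "mild"): (800, 480),    # 60% respond
    ("etanercept", "mild"): (200, 120),      # 60% respond
    ("methotrexate", "severe"): (200, 60),   # 30% respond
    ("etanercept", "severe"): (800, 240),    # 30% respond
}


def response_rate(treatment, severity=None):
    """Proportion with a good outcome, optionally within one severity stratum."""
    patients = responders = 0
    for (trt, sev), (n, good) in counts.items():
        if trt == treatment and (severity is None or sev == severity):
            patients += n
            responders += good
    return responders / patients


# Crude (unadjusted) comparison ignores severity.
crude_diff = response_rate("etanercept") - response_rate("methotrexate")

# Stratified comparison looks within each severity level.
stratified_diff = {
    sev: response_rate("etanercept", sev) - response_rate("methotrexate", sev)
    for sev in ("mild", "severe")
}

print("Crude difference:", round(crude_diff, 2))
print("Stratified differences:", {k: round(v, 2) for k, v in stratified_diff.items()})
```

Here the crude difference is about -0.18 even though the within-stratum differences are zero, which is why such studies are a concern unless they adjust credibly for disease severity.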
30Harms
- Assessing harms can be difficult
- Trials often focus on benefits, with little
effort to balance assessment of benefits and
harms
- OS are almost always necessary to assess harms
adequately
- There are tradeoffs between increasing
comprehensiveness of reviewing all possible harms
data and decreasing quality (increasing risk of
bias) for harms data
31Trials to Assess Harms
- Randomized controlled trials are the gold standard
for evaluating efficacy
- But relying solely on RCTs to evaluate harms in
CERs is problematic
- Most lack prespecified hypotheses for harms as
they are designed to evaluate benefits
- Assessment of harms is often a secondary
consideration
- Quality and quantity of reporting of harms are
frequently inadequate
- Few have sufficient sample sizes or duration to
adequately assess uncommon or long-term harms
32Trials to Assess Harms
- Most RCTs are efficacy trials
- They assess benefits and harms in ideal,
homogeneous populations and settings
- Patients who are more susceptible to harms are
often under-represented
- Few RCTs directly compare alternative treatment
strategies
- Publication bias and selective outcome reporting
bias
- RCTs may not be available
33Trials to Assess Harms
- Nevertheless, head-to-head RCTs provide the most
direct evidence on comparative harms
- In addition, placebo-controlled RCTs can provide
important information
- In general, CERs should routinely include both
head-to-head and placebo-controlled trials for
assessment of harms
- In lieu of placebo-controlled RCTs, CERs may
incorporate findings of well-conducted systematic
reviews if they evaluated the specific harms of
interest
34Unpublished Supplemental Trials Data
- Consider including results of completed or
terminated unpublished RCTs and unpublished
results from published trials
- FDA website, http://www.ClinicalTrials.gov, etc.
- Must contemplate ability to fully assess risk of
bias
- When a significant number of published trials fail
to report an important adverse event (AE), CER
authors should report this gap in the evidence and
consider efforts to obtain unpublished data
35OS to Assess Harms
- OS are almost always necessary to assess harms
adequately
- Exception is when there are sufficient data from
RCTs to reliably estimate harms
- May provide best or only data for assessing harms
in minority or vulnerable populations who are
under-represented in trials
- Types of OS included in a CER will vary:
different types of OS might be included or
rendered irrelevant by availability of data from
stronger study types
36Hypothesis Testing vs. Hypothesis Generating
- Important consideration in determining which OS
to include
- Case reports are hypothesis generating
- Cohort and case-control studies are well suited
for testing hypotheses of whether one
intervention is associated with a greater risk
for an adverse event than another and for
quantifying the risk
Chou et al, JCE 2008
37Hierarchy of Evidence
38OS to Assess Harms
- Cohort and case-control studies
- CERs should routinely search for and include,
except when RCT data are sufficient and valid
- OS based on patient registries
- OS based on analyses of large databases
- Case reports and post-marketing surveillance
- New medications
- Other OS
39OS to Assess Harms
- Criteria to select OS for inclusion
- There are often many more OS than trials;
evaluating a large number of OS can be
impractical when conducting a CER
- Several criteria are commonly used in CERs to
screen OS for inclusion (empirical data lacking):
- Minimum duration of follow-up
- Minimum sample size
- Defined threshold for risk of bias
- Study design (cohort and case-control)
- Specific population of interest
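The screening criteria listed above can be applied mechanically once a review team has fixed its thresholds. The sketch below is a minimal illustration under stated assumptions: the `ObservationalStudy` fields, the `screen_os` function, the three-level risk-of-bias scale, the candidate studies, and every threshold value are hypothetical. Since empirical data supporting particular cutoffs are lacking, the thresholds are passed in as explicit parameters, not hard-coded.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

# Assumed ordinal scale for risk of bias (lower is better).
ROB_ORDER = {"low": 0, "medium": 1, "high": 2}


@dataclass
class ObservationalStudy:
    """Hypothetical abstraction record for one candidate OS."""
    study_id: str
    design: str            # e.g. "cohort", "case-control", "case series"
    sample_size: int
    followup_months: float
    risk_of_bias: str      # "low", "medium", or "high"
    population: str


def screen_os(
    studies: Iterable[ObservationalStudy],
    *,
    min_followup_months: float,
    min_sample_size: int,
    max_risk_of_bias: str,
    allowed_designs: Set[str],
    population: str,
) -> List[ObservationalStudy]:
    """Keep only studies that meet every prespecified inclusion criterion."""
    return [
        s for s in studies
        if s.followup_months >= min_followup_months
        and s.sample_size >= min_sample_size
        and ROB_ORDER[s.risk_of_bias] <= ROB_ORDER[max_risk_of_bias]
        and s.design in allowed_designs
        and s.population == population
    ]


# Invented candidate studies and thresholds, for illustration only.
candidates = [
    ObservationalStudy("OS-1", "cohort", 2500, 24, "medium", "adults with RA"),
    ObservationalStudy("OS-2", "case series", 40, 6, "high", "adults with RA"),
    ObservationalStudy("OS-3", "case-control", 900, 12, "low", "adults with RA"),
    ObservationalStudy("OS-4", "cohort", 5000, 3, "low", "adults with RA"),
]

included = screen_os(
    candidates,
    min_followup_months=12,
    min_sample_size=500,
    max_risk_of_bias="medium",
    allowed_designs={"cohort", "case-control"},
    population="adults with RA",
)
print([s.study_id for s in included])
```

With these assumed thresholds, OS-2 fails on design, size, follow-up, and risk of bias, and OS-4 fails on follow-up alone, so only OS-1 and OS-3 remain.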
40Key Take-home Points
- Often insufficient evidence from trials to answer
all Key Questions in CERs
- CERs should consider including OS (default
strategy)
- Should explicitly state the rationale for
including or excluding OS
- For OS to assess benefits, reviewers should
consider 2 questions:
- Are there gaps in trial evidence for the review
questions under consideration?
- Will observational studies provide valid and
useful information to address key questions?
- For harms, should routinely search for and
include cohort and case-control studies