1
Randomized Controlled Trials
  • Angus Deaton
  • Research Program in Development Studies
  • Center for Health and Wellbeing
  • Princeton University
  • September 2009

2
Main messages
  • Randomized controlled trials are useful,
    sometimes uniquely so
  • There is no unique gold standard claim for RCTs
    in developing evidence for policy
  • There is no gold standard of any kind
  • No blanket exemption from scrutiny or skepticism
    for RCTs that does not apply to other methods
  • General program of finding out what works by
    routine RCTs is not well-adapted to deepen
    scientific understanding of development
  • Though it may be useful for other purposes, such
    as accountability or auditing
  • A progressive scientific research program
    generally requires the investigation of
    mechanisms, why things work, not what works

3
Outline of these remarks
  • Internal issues
  • Things to think about when you are doing an RCT
  • Using the results
  • RCTs in social science versus RCTs in medicine
  • The comparison is often made, with medicine seen
    as an example that we should follow
  • But the situation is a little more complicated
  • Useful to think about medicine versus what we do
  • External issues
  • What are RCTs good for, and not good for?
  • The investigation of mechanisms
  • Alternatives: project evaluation v. doing
    economics

4
Project evaluation
  • Think of a project, which might be a school
    reform, or building a clinic, a new incentive
    scheme, provision of some service, a new HYV, or
    a drug
  • If a unit (say person) gets the treatment
    there is a change in outcome by an amount that
    varies from person to person
  • Emphasize the heterogeneity: very little
    structure on this problem, and we want to assume
    as little as possible to get credibility
  • If we randomize across persons, and compare
    average outcome in treated group v. untreated
    group, we can estimate the average treatment
    effect
  • This requires essentially no assumptions,
    compared with econometrics, for example
  • Randomization guarantees that two groups come
    from the same probability law
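
A minimal sketch of the difference-in-means estimate described on this slide. Everything here is an illustrative assumption rather than anything from the talk: the function name simulate_trial, the simulated outcome scale, and the distribution of individual effects are all hypothetical. The point is only that random assignment lets the gap between group means recover the average effect even when individual effects vary widely.

```python
import random
from statistics import mean

def simulate_trial(n=10_000, seed=0):
    """Simulate a population with heterogeneous effects and randomize half of it."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        y0 = rng.gauss(10.0, 2.0)      # hypothetical outcome without treatment
        effect = rng.gauss(1.0, 3.0)   # hypothetical effect, varies by person, can be negative
        units.append((y0, y0 + effect))
    # The true average treatment effect is only knowable inside a simulation.
    true_ate = mean(y1 - y0 for y0, y1 in units)
    # Random assignment: each unit reveals only one of its two potential outcomes.
    treated = set(rng.sample(range(n), n // 2))
    treated_outcomes = [units[i][1] for i in range(n) if i in treated]
    control_outcomes = [units[i][0] for i in range(n) if i not in treated]
    estimate = mean(treated_outcomes) - mean(control_outcomes)
    return true_ate, estimate

true_ate, estimate = simulate_trial()
print(f"true ATE {true_ate:.3f}, difference in means {estimate:.3f}")
```

No model of the outcome is fitted anywhere; the only thing used is the random assignment itself.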

5
Testing effectiveness
  • If there is a net beneficial outcome, could it
    have happened by chance?
  • Always possible, obviously so with small numbers
    of people
  • We can calculate the probability of a favorable
    outcome having come about by chance using
    combinatorics, which is what R. A. Fisher did in
    the first randomized trials
  • So we have a method of finding out whether the
    project worked on average, which, in its
    assumptions, contrasts favorably with standard
    econometric practice
  • Selection, direction of causality, simultaneity,
    and so on
  • Benefits of randomization versus observational
    data
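
The combinatorial calculation mentioned above can be sketched as an exact randomization test. This is an illustration under stated assumptions: the function name randomization_p_value and the toy data are hypothetical, and the one-sided form (outcomes at least as favorable as observed) is one choice among several. It enumerates every way the treatment labels could have been assigned and asks how often chance alone does as well.

```python
from itertools import combinations
from statistics import mean

def randomization_p_value(treated, control):
    """One-sided exact p-value for the difference in means under the sharp null
    that treatment changes no one's outcome."""
    pooled = list(treated) + list(control)
    n, k = len(pooled), len(treated)
    observed = mean(treated) - mean(control)
    total = sum(pooled)
    as_favorable = 0
    assignments = 0
    # Enumerate every possible choice of which k units are labeled "treated".
    for idx in combinations(range(n), k):
        s = sum(pooled[i] for i in idx)
        diff = s / k - (total - s) / (n - k)
        if diff >= observed - 1e-12:   # small tolerance for floating point ties
            as_favorable += 1
        assignments += 1
    return as_favorable / assignments

# Toy data (hypothetical): the four treated units have the four highest outcomes.
p = randomization_p_value([5, 6, 7, 8], [1, 2, 3, 4])
print(p)  # 1 of the 70 possible assignments is this favorable
```

Full enumeration is only feasible for small samples; with larger ones, a random sample of assignments is the usual substitute.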

6
An important example
  • ISIS-2 trial of giving people aspirin immediately
    after an MCI (heart attack)
  • 17,000 patients over several studies (this size
    is needed)
  • Even bottom of confidence interval would save
    hundreds of thousands of lives worldwide
  • Doctors did not take this treatment seriously
    prior to the trial
  • Within two years of publication in 1988, use in
    UK went from 10% to 90% (much less in the US,
    where it is still a problem!)
  • Hard to see how these results would have been
    obtained in any other way
  • Effect is small enough to be a problem both for
    small RCTs, and for observational studies
  • Don't need an RCT for tobacco, because the effect
    is so large

7
What does an RCT not tell us?
  • Informative about the mean, not about any other
    characteristic of the distribution of treatment
    effects, e.g. the median, or the fraction of
    people who benefit, or lose
  • Policymakers are often interested in these
  • Does yield the full distribution of outcomes for
    both treatments and controls
  • For some purposes, this might be enough
  • If one distribution first-order stochastically
    dominates the other
  • Another aspirin example: low-dose regime
  • RCTs show a net reduction in mortality
  • But it kills some and saves some
  • Public health perspective says do it
  • Individual doctor or patient perspective is much
    less clear
  • If there are two groups like this, the RCT
    average applies to no one!
  • In the MCI aspirin example, the effects are in
    the same direction for everyone, or at least for
    broad classes of people
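
To make the "kills some and saves some" point concrete, here is a small sketch with two invented populations. All numbers and names (build, world_a, world_b, trial_view, share_harmed) are hypothetical. Each person is a pair (survives untreated, survives treated); a trial only ever sees one element of the pair, so it can recover the two survival rates but not how many individuals are harmed.

```python
def build(groups):
    """Expand [(count, y0, y1), ...] into one (y0, y1) pair per person."""
    return [(y0, y1) for count, y0, y1 in groups for _ in range(count)]

# World A: treatment saves 10 people in 100 and harms no one.
world_a = build([(70, 1, 1), (10, 0, 1), (20, 0, 0)])
# World B: treatment saves 20 people in 100 and kills 10.
world_b = build([(60, 1, 1), (10, 1, 0), (20, 0, 1), (10, 0, 0)])

def trial_view(world):
    """Survival rates a large trial would estimate: (control arm, treated arm)."""
    n = len(world)
    control_survival = sum(y0 for y0, _ in world) / n
    treated_survival = sum(y1 for _, y1 in world) / n
    return control_survival, treated_survival

def share_harmed(world):
    """Fraction who survive untreated but die if treated (never observed in a trial)."""
    return sum(1 for y0, y1 in world if y1 < y0) / len(world)

for name, world in (("A", world_a), ("B", world_b)):
    print(name, trial_view(world), share_harmed(world))
```

Both worlds give the same control and treated survival rates, so the trial data cannot tell them apart, yet the share harmed differs.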

8
Estimates and standard errors
  • The RCT gives us a mean treatment effect
  • This is not worth much without a standard error;
    the oft-heard reply, "the estimate is fine, only
    the standard error is a problem", is nonsense
  • This is not the same as the p-value for the null
    hypothesis that the treatment has no effect,
    which can be done without additional assumptions
  • Standard errors cannot be obtained
    non-parametrically
  • Unless we bootstrap the RCT!
  • We need to make the sort of assumptions that
    advocates of RCTs don't like when other methods
    use them
  • Not clear how much better off we are; at the
    least, this diminishes the benefit
  • Example, regressing outcomes on treatment dummy
    gives the wrong standard error
  • Heteroskedasticity correction gives a t-value
    that does not have the t-distribution
    (Fisher-Behrens problem)
  • Much more attention needs to be given to these
    issues, in medicine as well as economics
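
A sketch of the standard-error problem described on this slide, on invented data. The classical standard error from regressing the outcome on a treatment dummy is algebraically the pooled-variance formula, which assumes equal variance in both arms. The unequal-variance formula relaxes that, but its t-ratio then only approximately follows a t-distribution. The Welch-Satterthwaite degrees of freedom computed here are a standard textbook approximation that I am bringing in for illustration, not something proposed in the talk; the function name standard_errors is hypothetical.

```python
from math import sqrt
from statistics import variance

def standard_errors(treated, control):
    """Return (pooled SE, unequal-variance SE, Welch-Satterthwaite df)
    for the difference in means between two arms."""
    n1, n0 = len(treated), len(control)
    v1, v0 = variance(treated), variance(control)   # sample variances
    # Pooled variance: what classical OLS on a treatment dummy implies.
    pooled_var = ((n1 - 1) * v1 + (n0 - 1) * v0) / (n1 + n0 - 2)
    se_pooled = sqrt(pooled_var * (1 / n1 + 1 / n0))
    # Unequal-variance (heteroskedasticity-robust style) standard error.
    se_unequal = sqrt(v1 / n1 + v0 / n0)
    # Approximate degrees of freedom for the resulting t-ratio.
    df_welch = (v1 / n1 + v0 / n0) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v0 / n0) ** 2 / (n0 - 1)
    )
    return se_pooled, se_unequal, df_welch

# Hypothetical data: a small, noisy treated arm and a larger, quiet control arm.
treated = [0, 10, 20, 30]
control = [4, 5, 6] * 4
print(standard_errors(treated, control))
```

In this example the pooled formula understates the standard error, and the approximate degrees of freedom are far below the n - 2 that the regression output would report.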

9
Reducing standard errors
  • The effects RCTs are used to estimate tend to be
    relatively small
  • If they were large, we would not need a trial
  • Large trials are typically necessary
  • Which are expensive
  • Especially with saturation experiments, where
    units are schools or villages, and not
    individuals, because of interactions (typically
    not an issue with medical trials)
  • Reduce variance by using baseline information as
    covariates
  • E.g. regression on treatment dummy with
    covariates
  • This leads to bias in the treatment effect, which
    can be important with small numbers
  • Generally biases the standard errors
  • Opens up to charges of data mining: choose
    covariates until you get a significant (or
    insignificant) result, whichever you are looking
    for
  • Again, more work needs to be done here to guide
    practice
  • Pseudo-randomization, e.g. alphabetization, is
    likely to introduce bias and makes it impossible
    to calculate standard errors
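
Why small effects call for large trials can be sketched with the usual normal-approximation sample size formula for comparing two means. This is an illustration only: the formula is a textbook approximation, not taken from the talk, and the function name n_per_arm, the 5% two-sided test, and the 80% power are assumptions chosen for the example.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(effect, sd, alpha=0.05, power=0.8):
    """Approximate units needed in each arm to detect a mean difference `effect`
    when the outcome has standard deviation `sd` (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value of the two-sided test
    z_power = z.inv_cdf(power)           # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)

# Effect sizes in standard deviation units (hypothetical values).
for effect in (0.5, 0.1, 0.05):
    print(effect, n_per_arm(effect, sd=1.0))
```

The required size grows with the inverse square of the effect: an effect ten times smaller needs roughly a hundred times as many units, before any allowance for clustering by school or village.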

10
RCTs in medicine?
  • Most doctors will tell you yes, they are the gold
    standard; same at NIH
  • Yet there are serious concerns
  • Ethics and IRBs
  • Influence of money: selective publication of
    results, and other evils
  • Cost and timeliness
  • Public health versus individual perspectives
  • 80 percent of oncology trials are unfilled;
    people will not participate
  • Populations who participate in trials are strange
    in some way
  • Exclusion of co-morbidities makes extrapolation
    hazardous
  • Undoing of blinding, which we economists don't
    even try to do
  • Meta-studies sometimes contradicted by later
    trials
  • Argued that good observational studies produce
    the same results
  • Led to extensive funding for CER in ARRA in the
    US
  • Chairman of Stanford Medical School's Department
    of Medicine looks forward to a world in which
    there are no randomized clinical trials
  • Happy to talk further about these in the
    discussion period

11
Using the results from RCTs
  • External validity
  • Which I will come to
  • Using the results in the original experimental
    population
  • This would seem to be favorable
  • Yet mean effect is often not very useful for
    individuals, even those in the trial
  • Treatment for everyone might improve social
    welfare, but that does not imply that any
    individual should be treated
  • Subgroup analysis of trials is so prevalent (and
    makes sense) but it is subject to data mining
  • Astrological signs in ISIS-2
  • If all effects have the same sign, that helps,
    better still if they are all the same
  • Once again, we are trying to reduce
    heterogeneity, by some sort of modeling
  • Regime aspirin
  • If we confine it to those who are at risk of
    heart attacks on other grounds, we lose many of
    those who will die from it
  • Mechanism is thinning of blood, so we can
    understand how to divide
  • Can't learn this from the trial itself without
    some theoretical understanding of what is going on
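
The data-mining risk in subgroup analysis, of which the astrological signs are the standard illustration, can be sketched numerically. The setup is an assumption for illustration: twelve independent subgroups, a treatment that truly has no differential effect in any of them, and a 5% test in each. The function names are hypothetical, and the simulation draws subgroup z-statistics directly rather than simulating patients.

```python
import random

def chance_of_false_subgroup(n_subgroups, alpha=0.05):
    """Probability that at least one of several independent null tests rejects."""
    return 1 - (1 - alpha) ** n_subgroups

def simulate_false_subgroups(n_subgroups=12, reps=20_000, seed=0):
    """Monte Carlo check: draw a null z-statistic per subgroup and count how often
    at least one exceeds the two-sided 5% critical value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        if any(abs(rng.gauss(0.0, 1.0)) > 1.96 for _ in range(n_subgroups)):
            hits += 1
    return hits / reps

print(round(chance_of_false_subgroup(12), 4))
print(simulate_false_subgroups())
```

With twelve subgroups, the chance that some sign appears to respond differently purely by accident is close to one half, which is why a mechanism is needed to decide which subgroups to take seriously.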

12
Why might RCT results not apply?
  • General equilibrium effects are well-known, but
    there are other threats, even in the population
    as a whole
  • Population subject to randomization may be
    different
  • E.g. no elderly people included, or India is not
    the same as the Philippines
  • No one with co-morbidities included in the trial,
    but those who get the treatment often have
    co-morbidities (trade-off between internal and
    external validity)
  • Not everyone wants to participate in a trial,
    e.g. if there is risk involved, selects more risk
    averse
  • People running the trial almost always different
    from those who would administer it more generally
    (true even in agricultural crop research)
  • People in school now will be different from those
    in school if the educational system is changed
  • Would have to randomize from birth, or even
    before
  • Attrition or refusals, though there are various
    techniques for dealing with this
  • Randomization may have failed (and there are many
    practical difficulties in the field), and there
    is no way of testing that it did work within the
    RCT framework, because we can only check on
    observables
  • Need to apply exactly the same expert skepticism
    to RCTs as to other studies: no free pass!

13
The order of rigor is irrelevant
  • Comes from the philosopher Nancy Cartwright
  • If we are to use evidence in policy, we need to:
  • Develop the evidence, e.g. from an RCT
  • Argue that it applies to the population that we
    want to treat
  • The overall quality of the evidence depends on
    the weaker of these two steps; that the first
    step is rigorous and convincing does not help if
    the second one is weak
  • The second step is often argued by "like" or
    matching arguments, which are inherently tied to
    observables, not the unobservables that the RCT
    can control
  • So a matching estimator at the first step might
    do just as well in backing the policy, in spite
    of its inferiority to the RCT
  • "There is, at present, no basis for the popular
    belief that extrapolation from social experiments
    is less problematic than extrapolation from
    observational data. As we see it, the recent
    embrace of reduced-form social experimentation to
    the exclusion of structural evaluation based on
    observational data is not warranted." (Manski and
    Garfinkel, on training programs)

14
Alternatives?
  • I am not arguing for IV estimation, natural
    experiments, regression discontinuity designs,
    fixed effect estimation, or even OLS as better
    than an RCT for project evaluation
  • Indeed, these methods often see themselves as
    mimicking RCTs, but being inferior to them
  • I agree
  • What I doubt is whether project evaluation itself
    can lead to scientific progress on issues in
    economic development
  • At least if we confine ourselves to what works as
    opposed to why it works
  • For the latter we need theory, mechanisms, or
    whatever
  • This does not necessarily involve structural
    estimation as it is usually construed in
    econometrics, which has problems of its own
  • Just that we need a connection to a mechanism of
    some sort

15
More on mechanisms
  • RCTs are not very good at mechanisms: the results
    are what they are, and accounts of why are often
    fairy stories, ex post rationalizations with no
    evidence base
  • This makes them difficult to generalize, because
    without the mechanism, hard to assess external
    validity
  • Also hard to assess welfare, because there are
    often positive and negative accounts that give
    the same answer
  • An example from the World Bank research review
  • Excellent project on doctor behavior in India and
    elsewhere, by Jishnu Das, Jeff Hammer, Lant
    Pritchett and others
  • Attempt to find out why private and public
    doctors behave as they do, the constraints and
    incentives they face, and the welfare
    consequences
  • Reviewer argued that this work was worthless, and
    should be replaced by RCTs to find out what works
  • RCTs would be an excellent idea here, but at a
    later stage, when we have some mechanisms to
    test, and test why things work, and from that we
    can learn and possibly generalize

16
Moving to opportunity
  • Another persuasive example comes from the MTO
    experiment, in which people were randomly
    assigned in city centers in the US to move to
    better places
  • Analyzed by a large team of economists and other
    social scientists
  • Interchange in American Journal of Sociology 2008
  • Clampet-Lundquist and Massey argue that the
    results of MTO don't make sense and don't give
    recognition to what they know
  • Katz, Kling and Liebman say "you don't understand
    selection or what RCTs do", with some
    justification
  • Sampson cleans up in a beautiful paper that
    explains what is going on in one of the cities,
    Chicago, why the MTO experiment gets the results
    it gets, and why Massey is right in spirit if not
    in detail

17
Where do we go from here?
  • Economics is not project evaluation!
  • We need theories and tests of them
  • RCTs will often be an excellent way of testing
    theory
  • But there are other methods too
  • Such as
  • I think structural econometrics often has a role
  • Comprehensive, full-information test of a theory
  • Trouble is there is often much additional
    structure, often incredible
  • My ideal is the development of theory to the
    point where it is possible to develop acid tests
    in a simple non-parametric way
  • Hypothetico-deductive method, Popper and beyond
  • Requires close interaction between theorists and
    empiricists
  • Medicine may rely on RCTs, physics never uses them

18
Examples of positive progress
  • Taken from a paper I am writing for the Journal
    of Economic Perspectives
  • Saving and growth
  • Life-cycle theory predicts that growth drives
    saving
  • Confirmation by Modigliani in the 1970s: not
    obvious, and impressive
  • Later refutations in simple non-parametric tests
  • Which parts of the theory need to be abandoned,
    and which can we work on
  • Commodity prices
  • My work with Guy Laroque
  • Here structural econometrics helped elucidate
    non-parametric predictions that we were not smart
    enough to see in advance
  • These were falsified, but again provided
    suggestions of where to go
  • Nutrition in India
  • Work with Jean Drèze
  • Per capita calorie consumption is falling in
    spite of rapid growth and upward sloping Engel
    curves
  • Here we are starting from data, but have a
    mechanism in mind, that real income growth
    generates improvements in nutrition; this is
    unlikely to be abandoned but it needs to be
    supplemented