1
Randomization and Impact evaluation
2
The Types of Program Evaluation
  • (1) Process evaluation
  • Audit and monitoring
  • Did the intended policy actually happen?
  • (2) Impact evaluation
  • What effect (if any) did the policy have?

3
Why Impact Evaluation?
  • Knowledge is a global public good
  • Long-term credibility
  • Helps choose the best projects and build long-term
    support for development

4
The evaluation problem and alternative solutions
  • Impact is the difference between the relevant
    outcome indicator with the program and that
    without it.
  • However, we can never simultaneously observe
    someone in two different states of nature.
  • So, while a post-intervention indicator is
    observed, its value in the absence of the program
    is not, i.e., it is a counterfactual. (A simulated
    sketch of this problem follows below.)
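The missing counterfactual is easiest to see in a simulation where, unlike in real data, both states of nature are known. A minimal sketch (all names and numbers here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical potential outcomes: y0 without the program, y1 with it.
y0 = rng.normal(50, 10, n)        # outcome in the absence of the program
true_impact = 5.0
y1 = y0 + true_impact             # outcome with the program

# In reality only one state of nature is observed per person:
participates = rng.random(n) < 0.5
y_observed = np.where(participates, y1, y0)   # all we can ever see

# The counterfactual (y0 for participants) is missing in real data,
# but in this simulation we can compute the true average impact:
print(y1[participates].mean() - y0[participates].mean())   # exactly 5.0
```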

5
Problems when Evaluation is not Built in Ex-Ante
  • Need a reliable comparison group.
  • Before/after comparisons: other things may happen
    over time besides the policy.
  • Units with/without the policy may differ for
    reasons other than the policy (e.g. because the
    policy is placed in specific areas).

6
We observe an outcome indicator,
[Figure: outcome indicator over time, with the intervention date marked]
7
and its value rises after the program
[Figure: outcome rising after the intervention]
8
However, we need to identify the counterfactual
[Figure: counterfactual path shown alongside the observed outcome]
9
since only then can we determine the impact of
the intervention

10
How can we fill in the missing data on the
counterfactual?
  • Randomization
  • Matching
  • Propensity-score matching
  • Difference-in-difference
  • Matched double difference
  • Regression Discontinuity Design
  • Instrumental variables

11
1. Randomization: the randomized-out group reveals
the counterfactual.
  • Only a random sample participates.
  • As long as the assignment is genuinely random,
    impact is revealed in expectation.
  • Randomization is the theoretical ideal, and the
    benchmark for non-experimental methods.
    Identification issues are more transparent
    compared with other evaluation techniques
    (see the sketch below).
  • But there are problems in practice:
  • internal validity: selective non-compliance
  • external validity: difficult to extrapolate
    results from a pilot experiment to the whole
    population
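A minimal sketch of why randomization works, on simulated data (all values hypothetical): because assignment is independent of everything else, the mean outcome of the randomized-out group estimates the counterfactual, so a difference in means estimates the impact.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical unobserved heterogeneity and a true program impact of 5.
ability = rng.normal(0, 1, n)
treated = rng.random(n) < 0.5     # genuinely random assignment
outcome = 50 + 10 * ability + 5 * treated + rng.normal(0, 5, n)

# The randomized-out group reveals the counterfactual in expectation,
# so a simple difference in means estimates the impact.
impact = outcome[treated].mean() - outcome[~treated].mean()
print(f"estimated impact: {impact:.2f}")   # close to the true 5
```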

12
2. Matching: matched comparators identify the
counterfactual.
  • Propensity-score matching: match on the basis of
    the probability of participation.
  • Match participants to non-participants from a
    larger survey.
  • The matches are chosen on the basis of
    similarities in observed characteristics.
  • This assumes no selection bias based on
    unobservable heterogeneity.
  • Validity of matching methods depends heavily on
    data quality.

13
3. Propensity-score matching (PSM): match on the
probability of participation.
  • Ideally we would match on the entire vector X of
    observed characteristics. However, this is
    practically impossible: X could be huge.
  • Rosenbaum and Rubin: match on the basis of the
    propensity score, P(X) = Pr(participation | X).
  • This assumes that participation is independent of
    outcomes given X. If there is no bias given X,
    then there is no bias given P(X).

14
Steps in score matching
1. Conduct representative, highly comparable surveys
of the non-participants and participants.
2. Pool the two samples and estimate a logit (or
probit) model of program participation. The
predicted values are the propensity scores.
3. Restrict the samples to assure common support.
Failure of common support is an important source of
bias in observational studies (Heckman et al.).
(A sketch of steps 2-3 in code follows below.)
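A rough illustration of steps 2-3 on hypothetical data, using scikit-learn's logistic regression as the logit model (one of several possible tools):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 2_000

# Hypothetical observed characteristics X and program participation.
X = rng.normal(size=(n, 3))
participates = (X @ [0.8, -0.5, 0.3] + rng.logistic(size=n)) > 0

# Step 2: pool the samples and fit a logit of participation on X;
# the predicted probabilities are the propensity scores.
logit = LogisticRegression().fit(X, participates)
score = logit.predict_proba(X)[:, 1]

# Step 3: restrict to the region of common support, i.e. scores
# that occur in both groups.
lo = max(score[participates].min(), score[~participates].min())
hi = min(score[participates].max(), score[~participates].max())
on_support = (score >= lo) & (score <= hi)
print(f"{on_support.sum()} of {n} observations on common support")
```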
15
[Figure: density of propensity scores for participants]
16
[Figure: density of propensity scores for non-participants]
17
[Figure: density of propensity scores for non-participants]
18
Steps in score matching
4. For each participant, find a sample of
non-participants that have similar propensity
scores.
5. Compare the outcome indicators. The difference is
the estimate of the gain due to the program for
that observation.
6. Calculate the mean of these individual gains to
obtain the average overall gain.
(A sketch of steps 4-6 in code follows below.)
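A rough illustration of steps 4-6 on hypothetical data, using one-to-one nearest-neighbour matching on the propensity score (one of several matching schemes):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 2_000

# Hypothetical data: X drives both participation and the outcome,
# and the true program gain is 5.
X = rng.normal(size=(n, 3))
D = (X @ [0.8, -0.5, 0.3] + rng.logistic(size=n)) > 0
y = 50 + X @ [4.0, 2.0, 1.0] + 5 * D + rng.normal(0, 3, n)

score = LogisticRegression().fit(X, D).predict_proba(X)[:, 1]

# Step 4: for each participant, find the non-participant with the
# closest propensity score (nearest neighbour, with replacement).
comp_scores = score[~D]
comp_y = y[~D]
gains = []
for s, yi in zip(score[D], y[D]):
    j = np.argmin(np.abs(comp_scores - s))
    # Step 5: the outcome difference estimates this observation's gain.
    gains.append(yi - comp_y[j])

# Step 6: the mean of the individual gains is the average overall gain.
print(f"estimated average gain: {np.mean(gains):.2f}")   # near the true 5
```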
19
4. Difference-in-difference (double difference):
observed changes over time for non-participants
provide the counterfactual for participants.
  • Collect baseline data on non-participants and
    (probable) participants before the program.
  • Compare with data after the program.
  • Subtract the two differences, or use a regression
    with a dummy variable for participants.
  • This allows for selection bias, but the bias must
    be time-invariant and additive. (A sketch follows
    below.)
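A minimal sketch of the double difference on simulated data (all values hypothetical): participants start from a higher level, but because that selection bias is additive and time-invariant here, differencing it out recovers the true impact.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5_000

# Hypothetical setup: participants start 8 units higher (selection
# bias), both groups share a common time trend of +3, true impact 5.
participant = rng.random(n) < 0.5
baseline = 50 + 8 * participant + rng.normal(0, 5, n)
followup = baseline + 3 + 5 * participant + rng.normal(0, 5, n)

# Double difference: change for participants minus change for
# the comparison group.
dd = ((followup[participant] - baseline[participant]).mean()
      - (followup[~participant] - baseline[~participant]).mean())
print(f"diff-in-diff estimate: {dd:.2f}")   # close to the true 5
```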

20
Selection bias
[Figure: outcome paths over time for participants and comparisons; the vertical gap between them is the selection bias]
21
Diff-in-diff requires that the bias is additive
and time-invariant

22
The method fails if the comparison group is on a
different trajectory

23
  • Diff-in-diff: DD = (Y_after − Y_before | participants)
    − (Y_after − Y_before | comparisons) identifies the impact
  • if (i) the change over time for the comparison
    group reveals the counterfactual,
  • and (ii) the baseline is uncontaminated by the
    program.

24
5. Matched double difference: matching helps
control for bias in diff-in-diff.
  • Score-match participants and non-participants
    based on observed characteristics in the baseline.
  • Then do a double difference.
  • This deals with observable heterogeneity in
    initial conditions that can influence subsequent
    changes over time. (A sketch follows below.)
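A minimal sketch of the combination on hypothetical data: match each participant to the non-participant closest in a baseline characteristic that also drives subsequent change, then double-difference within matched pairs.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2_000

# Hypothetical data: a baseline characteristic x drives both
# participation and the growth of the outcome; true impact is 5,
# so an unmatched diff-in-diff would be biased here.
x = rng.normal(size=n)
D = (x + rng.logistic(size=n)) > 0
y0 = 50 + 4 * x + rng.normal(0, 3, n)               # baseline outcome
y1 = y0 + 3 + 2 * x + 5 * D + rng.normal(0, 3, n)   # follow-up outcome

# Match each participant to the non-participant closest in x,
# then take the double difference within matched pairs.
gains = []
for i in np.flatnonzero(D):
    j = np.flatnonzero(~D)[np.argmin(np.abs(x[~D] - x[i]))]
    gains.append((y1[i] - y0[i]) - (y1[j] - y0[j]))
print(f"matched double-difference estimate: {np.mean(gains):.2f}")
```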

25
6. Regression Discontinuity Design
  • The selection function is a discontinuous
    function of a score.
  • UPP in Indonesia: two similar kecamatan in the
    same kabupaten that have scores within the
    neighborhood of the cut-off score can be treated
    differently. (A sketch follows below.)
[Figure: selection jumps between 0 and 1 at the cut-off kecamatan score, dividing control from treatment]
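A minimal sketch of the discontinuity logic on simulated data (all values hypothetical): units just below and just above the cut-off are comparable, so comparing mean outcomes in a narrow window around the cut-off estimates the impact; a local-linear fit would remove the small remaining slope bias.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 50_000

# Hypothetical scores; units scoring below the cut-off get the program.
score = rng.uniform(0, 100, n)
cutoff = 50.0
treated = score < cutoff

# The outcome varies smoothly with the score; the true impact is 5.
outcome = 20 + 0.1 * score + 5 * treated + rng.normal(0, 4, n)

# Compare mean outcomes within a narrow window around the cut-off.
window = 1.0
below = outcome[treated & (score > cutoff - window)]
above = outcome[~treated & (score < cutoff + window)]
print(f"RDD estimate: {below.mean() - above.mean():.2f}")   # roughly 5
```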
26
7. Instrumental variables: identifying exogenous
variation using a third variable.
  • Outcome regression: Y = α + βD + ε
  • D ∈ {0, 1} indicates our program; it is not
    random.
  • The instrument (Z) influences participation, but
    does not affect outcomes given participation (the
    exclusion restriction).
  • This identifies the exogenous variation in
    outcomes due to the program.
  • Treatment regression: D = γ + δZ + ν
    (a sketch follows below)
27
Randomization: an example from Mexico
  • Progresa: grants to poor families, conditional on
    preventive health care and school attendance for
    children. Given to women.
  • The Mexican government wanted an evaluation: the
    order of community phase-in was random.
  • Results: child illness down 23%; height increased
    1-4 cm; a 3.4% increase in enrollment.
  • After the evaluation, PROGRESA was expanded within
    Mexico, and similar programs were adopted
    throughout other Latin American countries.

28
Randomization: an example from Kenya
  • School-based deworming: treat with a single pill
    every 6 months, at a cost of 49 cents per student
    per year.
  • 27% of treated students had moderate-to-heavy
    infection, versus 52% of the comparison group.
  • Treatment reduced school absenteeism by 25%, or 7
    percentage points.
  • Costs only $3 per additional year of school
    participation.

29
Lessons: randomized experiments
  • Randomized evaluations are often feasible
  • Have been conducted successfully
  • Are labor intensive and costly, but no more so
    than other data collection activities
  • Results from randomized evaluations can be quite
    different from those drawn from retrospective
    evaluations
  • NGOs are well-suited to conduct randomized
    evaluations in collaboration with academics and
    external funders

30
Lessons: randomized experiments
  • While randomization is a powerful tool:
  • Internal validity can be questionable if we do
    not allow properly for selective compliance with
    the randomized assignment.
  • Not always feasible beyond pilot projects, which
    raises concerns about external validity.
  • Contextual factors influence outcomes: a
    scaled-up program may work differently.

31
Matching Method Example: Piped water and child
health in rural India
  • Is a child less vulnerable to diarrhea if he/she
    lives in a household with piped water?
  • Do children in poor, or poorly educated,
    households realize the same health gains from
    piped water as others?
  • Does income matter independently of parental
    education?

32
The evaluation problem
  • There are observable differences between those
    households with piped water and those without it.
  •  And these differences probably also matter to
    child health.

33
Naïve comparisons can be deceptive
  • Common practice: compare villages with piped
    water, or some other infrastructure facility, and
    those without.
  • Failure to control for differences in village
    characteristics that influence infrastructure
    placement can severely bias such comparisons.

34
Model for the propensity scores for piped water
placement in India
  • Village variables: agricultural modernization,
    educational and social infrastructure.
  • Household variables: demographics, education,
    religion, ethnicity, assets, housing conditions,
    and state dummy variables.

35
More likely to have piped water if
  • Household lives in a larger village, with a high
    school, a pucca road, a bus stop, a telephone, a
    bank, and a market
  • it is not a member of a scheduled tribe
  • it is a Christian household
  • it rents rather than owns the home (this is not a
    perverse wealth effect, but is related to the
    fact that rental housing tends to be better
    equipped)
  • it is female-headed
  • it owns more land.

36
Impacts of piped water on child health
  • The results for mean impact indicate that access
    to piped water significantly reduces diarrhea
    incidence and duration.
  • Disease incidence amongst those with piped water
    would be 21% higher without it. Illness duration
    would be 29% higher.

37
Stratifying by income per capita
  • No significant child-health gains amongst the
    poorest 40% (roughly corresponding to the poor in
    India).
  • Very significant impacts for the upper 60%.
  • Without piped water there would be no difference
    in infant diarrhea incidence between the poorest
    quintile and the richest.

38
When we stratify by both income and education
  • For the poor, the education of female members
    matters greatly to achieving the child-health
    benefits from piped water.
  • Even in the poorest 40%, women's schooling
    results in lower incidence and duration of
    diarrhea among children with piped water.
  • Women's education matters much less for upper
    income groups.

39
Lessons on matching methods
  • When neither randomization nor a baseline survey
    is feasible, careful matching to control for
    observable heterogeneity is crucial.
  • This requires good data, to capture the factors
    relevant to participation.
  • Look for heterogeneity in impact: the average
    impact may hide important differences in the
    characteristics of those who gain or lose from
    the intervention.

40
Tracking participants and non-participants over
time
  • 1. Single-difference matching can still be
    contaminated by selection bias
  • Latent heterogeneity in factors relevant to
    participation
  • 2. Tracking individuals over time allows a double
    difference
  • This eliminates all time-invariant additive
    selection bias
  • 3. Combining double difference with matching
  • This allows us to eliminate observable
    heterogeneity in factors relevant to subsequent
    changes over time

41
Improving Evaluation Practice
  • When there is an impact evaluation:
  • Build in evaluation ex-ante.
  • Make a quality evaluation a primary
    responsibility of the manager of the program.
  • Allocate the necessary resources.
  • Encourage randomization whenever feasible
    (education, health, micro-finance, governance;
    not monetary policy).

42
Practical suggestions
  • Not every project needs an impact evaluation:
    select projects in priority areas, where
    knowledge is needed.
  • Take advantage of budget constraints and
    phase-in.
  • Require a pilot project before a large-scale
    project.
  • Finance pilot projects and evaluations with
    grants
  • Collaborate with others
  • Academics (e.g. Evaluation Based Policy Fund in
    UK)
  • NGOs

43
Evaluation: An Opportunity
  • Creating hard evidence of success will:
  • help spend future resources more effectively
  • influence other policymakers
  • build public support