Matching Estimators - PowerPoint PPT Presentation

About This Presentation
Title:

Matching Estimators

Description:

Differences-in-Differences and A (Very) Brief Introduction to Panel Data. Author: suntory. Last modified by: RLAB. Created Date: 2/5/2006 11:28:05 AM

Slides: 33
Provided by: sunt6

Transcript and Presenter's Notes


1
Matching Estimators
  • Methods of Economic Investigation
  • Lecture 11

2
Last Time
  • General theme: if you don't have an experiment,
    how do you get a control group?
  • Difference in Differences
  • How it works: compare before-after changes between two
    comparable entities
  • Assumptions: fixed differences over time
  • Tests to improve credibility of assumption
  • Pre-treatment trends
  • Ashenfelter Dip

3
Today's Class
  • Another way to get a control group: matching
  • Assumptions for identification
  • Specific form of matching called propensity
    score matching
  • Is it better than just a plain old regression?

4
The Counterfactual Framework
  • Counterfactual: what would have happened to the
    treated subjects, had they not received
    treatment?
  • Idea: individuals selected into treatment and
    nontreatment groups have potential outcomes in
    both states:
  • the one in which they are observed
  • the one in which they are not observed.

5
Reminder of Terms
  • For the treated group, we have the observed mean
    outcome under the condition of treatment,
    $E(Y_1 \mid T=1)$, and the unobserved mean outcome under
    the condition of nontreatment, $E(Y_0 \mid T=1)$.
  • For the control group, we have both the observed mean
    $E(Y_0 \mid T=0)$ and the unobserved mean $E(Y_1 \mid T=0)$.

6
What is matching?
  • Pairing treatment and comparison units that are
    similar in terms of observable characteristics
  • Can do this in regressions (with covariates) or
    prior to regression to define your treatment and
    control samples

7
Matching Assumption
  • Conditioning on observables (X), we can take
    assignment to treatment as if random, i.e.
    $(Y_0, Y_1) \perp T \mid X$
  • What is the implicit statement? Unobservables
    (stuff not in X) play no role in treatment
    assignment (T)

8
A matched estimator
  • $E(Y_1 - Y_0 \mid T=1)$
  • $= \{E(Y_1 \mid X, T=1) - E(Y_0 \mid X, T=0)\}$
    (matched treatment effect)
  • $- \{E(Y_0 \mid X, T=1) - E(Y_0 \mid X, T=0)\}$
    (assumed to be zero)
  • Key idea: all selection occurs only through
    observed X

9
Just do a regression
  • Regressions are flexible
  • If you only put in a main effect, the regression
    will estimate a purely linear specification
  • Interactions and fixed effects allow different
    slopes and intercepts for any combination of
    variables
  • Can include quadratic and higher order polynomial
    terms if necessary
  • But fundamentally you are specifying additively
    separable terms

10
Sometimes regression not feasible
  • The issue is largely related to dimensionality
  • Each time you add an observable characteristic,
    you partition your data into more bins.
  • Imagine all variables are zero-one variables
  • Then if you have k Xs, you have $2^k$ potential
    different combinations of values
  • Need enough observations in each combination to
    estimate the effect there precisely
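
To make the dimensionality point concrete, here is a small sketch in Python. The sample size of 10,000 is a made-up number for illustration, and the even spread across cells is a best-case assumption.

```python
def cells(k: int) -> int:
    """Number of distinct covariate cells with k zero-one covariates."""
    return 2 ** k


def avg_obs_per_cell(n: int, k: int) -> float:
    """Average observations per cell if n units were spread evenly (best case)."""
    return n / cells(k)


n = 10_000  # hypothetical sample size
for k in (5, 10, 15, 20):
    # With k = 20 there are over a million cells, so most cells are empty.
    print(k, cells(k), round(avg_obs_per_cell(n, k), 4))
```

Real data are never spread evenly, so some cells run out of observations well before the average drops below one.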

11
Reducing the Dimensionality
  • Use of propensity score: probability of receiving
    treatment, conditional on covariates
  • Key assumption: if $(Y_0, Y_1) \perp T \mid X$
  • and defining $p(X) = \Pr(T=1 \mid X)$, then
    $(Y_0, Y_1) \perp T \mid p(X)$
  • If this is true, can interpret estimate of
    differences in outcomes conditional on p(X) as
    causal effect

12
Why not control for X?
  • Matching is flexible in a different way
  • Avoid specifying a particular functional form for
    the outcome equation, decision process or
    unobservable term
  • Just need the right observables
  • Flexible in the form of how Xs affect treatment
    probability but inflexible in how treatment
    probability affects outcome

13
Participation decision
  • Remember our 3 groups
  • Always takers: take the treatment if offered AND
    take the treatment if not offered
  • We observe them if T=0 but R=1
  • Never takers: don't take the treatment if not
    offered AND don't take it even if it is offered
  • We observe them if T=1 but R=0
  • Compliers: just do what they're assigned to do
  • T=1, R=1 OR T=0, R=0

14
Conditions for Matching to Work
  • Take 1-sided non-compliance for ease: if not
    offered, can't take it, but some people don't
    take it even if offered

Error term for never takers
Error term for compliers
On average, conditional on X, the unobservables are the same
If it's zero, there is perfect compliance, so
conditioning on X replicates the experimental setting
15
Common Support
  • Matching can only work if there is a region of
    common support
  • People with the same X values are in both the
    treatment and the control groups
  • Let S be the set of all observables X, then
    $0 < \Pr(T=1 \mid X) < 1$ for X in some subset $S^*$ of S
  • Intuition: someone in control is close enough to
    match to each treatment unit, OR there is enough overlap
    in the distribution of treated and untreated individuals

16
Lots of common support
The area between the red and blue lines is the area of common support
17
Not so much common support
18
Trimming
  • Define min and max values of X for the region of
    overlap; drop all units not in that region
  • Remove regions which do not have strictly
    positive propensity score density in both the
    treatment and control distributions
  • (Smith and Todd, 2005)
  • Both give similar results in practice, but if
    sections in the middle of the distribution are
    missing, use the second option
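
The first (min/max) rule can be sketched as follows. This version applies the rule to the propensity score rather than to X itself, which is an assumption on my part; the function names and the scores are invented for illustration.

```python
def minmax_common_support(p_treated, p_control):
    """Return (lo, hi): the overlap of the two propensity-score ranges."""
    lo = max(min(p_treated), min(p_control))
    hi = min(max(p_treated), max(p_control))
    if lo > hi:
        raise ValueError("no common support")
    return lo, hi


def trim(scores, lo, hi):
    """Keep only the scores inside [lo, hi]."""
    return [p for p in scores if lo <= p <= hi]


# Hypothetical propensity scores, for illustration only.
p_treated = [0.35, 0.50, 0.62, 0.80, 0.95]
p_control = [0.05, 0.20, 0.40, 0.55, 0.70]

lo, hi = minmax_common_support(p_treated, p_control)
kept_treated = trim(p_treated, lo, hi)  # drops 0.80 and 0.95
kept_control = trim(p_control, lo, hi)  # drops 0.05 and 0.20
```

Note that this rule cannot detect gaps in the middle of the distribution, which is the case where the density-based rule is preferred.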

19
How do we match on p(X)?
  • Taken literally, should match on exactly p(Xi)
  • In practice this is hard to do, so the strategy is
    to match treated units to comparison units whose
    p-scores are sufficiently close to consider
  • Issues:
  • How many times can 1 unit be a match?
  • How many units to match to each treatment unit?
  • How to match if using more than 1 control unit
    per treatment unit?

20
Replacement
  • Issue: once control group person Z is a match for
    individual A, can she also be a match for
    individual B?
  • Trade-off between bias and precision
  • With replacement: minimizes the propensity score
    distance between the treated unit and its matched
    comparison unit
  • Without replacement

21
Are we doing a one-to-one match?
  • If 1-to-1 match: units are closely related but
    estimates may not be very precise
  • The more units you include in a match, the more the
    p-score of the control group will differ from the
    treatment group
  • Trade-off between bias and precision
  • Typically use 1-to-many match because 1-to-1 is
    extremely data intensive if X is multi-dimensional

22
Different matching algorithms-1
  • Can use nearest neighbor, which chooses the m closest
    comparison units
  • Implicitly weights these all the same
  • Get a fixed m but may end up with quite different p-scores
  • Can use caliper: a radius around a point
  • Again implicitly weights these the same
  • Fixed difference in p-scores, but may not be many
    units in radius
  • Stratify
  • Break sample up into intervals
  • Estimate treatment effect separately in each
    region
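
A minimal sketch of the nearest-neighbor and caliper rules in plain Python. The function names and the toy p-scores and outcomes are illustrative assumptions, not part of the lecture.

```python
def nearest_neighbors(p_i, p_controls, m=1):
    """Indices of the m controls whose p-scores are closest to p_i."""
    order = sorted(range(len(p_controls)), key=lambda j: abs(p_controls[j] - p_i))
    return order[:m]


def within_caliper(p_i, p_controls, radius):
    """Indices of all controls whose p-scores lie within `radius` of p_i."""
    return [j for j, p_j in enumerate(p_controls) if abs(p_j - p_i) <= radius]


def matched_mean(y_controls, idx):
    """Equal-weight mean outcome of the matched controls."""
    return sum(y_controls[j] for j in idx) / len(idx)


# Hypothetical data: five controls and one treated unit.
p_controls = [0.10, 0.30, 0.45, 0.52, 0.90]
y_controls = [1.0, 2.0, 3.0, 4.0, 5.0]
p_i, y_i = 0.50, 6.0

nn = nearest_neighbors(p_i, p_controls, m=2)         # fixed number of matches
cal = within_caliper(p_i, p_controls, radius=0.25)   # fixed p-score distance

effect_nn = y_i - matched_mean(y_controls, nn)
effect_cal = y_i - matched_mean(y_controls, cal)
```

Nearest neighbor always returns m matches however far away they are; the caliper bounds the distance but can return an empty list, which a real implementation would have to handle.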

23
Different Matching Algorithms-2
  • Can also use some type of distribution
  • Kernel estimator: puts some type of distribution
    (e.g. normal) around each treatment unit and
    weights closer control units more and farther
    control units less
  • Explicit weighting function can be used if you
    have some knowledge of how related units of
    certain distances are to each other
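
One way to picture kernel weighting, as a sketch only: a normal (Gaussian) kernel centred on the treated unit's p-score. The bandwidth value and the data are assumptions chosen for illustration.

```python
import math


def gaussian_kernel_weights(p_i, p_controls, bandwidth):
    """Normalized weights that decline with p-score distance from p_i."""
    raw = [math.exp(-0.5 * ((p_j - p_i) / bandwidth) ** 2) for p_j in p_controls]
    total = sum(raw)
    return [w / total for w in raw]


def kernel_counterfactual(p_i, p_controls, y_controls, bandwidth):
    """Weighted mean control outcome used as the treated unit's counterfactual."""
    w = gaussian_kernel_weights(p_i, p_controls, bandwidth)
    return sum(w_j * y_j for w_j, y_j in zip(w, y_controls))


# Hypothetical controls; the treated unit has p-score 0.50.
p_controls = [0.10, 0.30, 0.45, 0.52, 0.90]
y_controls = [1.0, 2.0, 3.0, 4.0, 5.0]

weights = gaussian_kernel_weights(0.50, p_controls, bandwidth=0.1)
y0_hat = kernel_counterfactual(0.50, p_controls, y_controls, bandwidth=0.1)
```

Every control gets some weight, so nothing is discarded, but the bandwidth plays the same role as m or the caliper radius in trading off bias against precision.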

24
How close is close enough?
  • No right answer in these choices; it will depend
    heavily on sample issues
  • How deep is the common support (i.e. are there
    lots of people in both control and treatment
    group at all the p-score values)?
  • Should all be the same asymptotically, but in
    finite samples (which is everything) may differ

25
Tradeoffs in different methods
Source: Caliendo and Kopeinig, 2005
26
How to estimate a p-score
  • Typically use a logit
  • Specific, useful functional form for estimating
    discrete choice models
  • You haven't learned these yet but you will
  • For now, think of running a regular OLS
    regression where the outcome is 1 if you got the
    treatment and zero if you didn't
  • Take $E(T \mid X)$ and that's your propensity score
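
For readers who want to see the logit version, here is a dependency-free sketch with a single covariate, fitted by Newton-Raphson. The data, the single-covariate restriction, and the function names are all assumptions made to keep the example short; in practice one would use a statistics package and many covariates.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(x, t, iters=25):
    """Fit Pr(T=1 | x) = sigmoid(a + b*x) by Newton-Raphson; returns (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        p = [sigmoid(a + b * xi) for xi in x]
        # Gradient of the log-likelihood with respect to (a, b).
        g0 = sum(ti - pi for ti, pi in zip(t, p))
        g1 = sum((ti - pi) * xi for ti, pi, xi in zip(t, p, x))
        # Negative Hessian, a symmetric 2x2 matrix [[h00, h01], [h01, h11]].
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: add (negative Hessian)^{-1} times the gradient.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical data: one covariate x and a treatment indicator t.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
t = [0, 0, 1, 0, 1, 0, 1, 1]

a, b = fit_logit(x, t)
pscores = [sigmoid(a + b * xi) for xi in x]  # estimated E(T | X) per unit
```

Unlike the OLS version, the fitted values here always lie strictly between 0 and 1. The sketch assumes the data are not perfectly separated; if they were, the iterations would not settle.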

27
The Treatment Effect
  • Assume CIA holds and there is a sufficient region
    of common support
  • Difference in outcome between treated individual
    i and weighted comparison group J, with weight
    generated by the p-score distribution in the
    common support region

J is the comparison group, with |J| the number of
comparison group units matched to i
N is the treatment group, and |N| is the size of
the treatment group
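
The formula these labels annotate is not in the transcript. A standard form that fits the definitions above is given here as an assumption, not as the slide's exact expression; $J_i$ denotes the slide's comparison group J for treated unit i, and $w(i,j)$ is a hypothetical name for the matching weights.

```latex
\hat{\tau} \;=\; \frac{1}{|N|} \sum_{i \in N}
  \Bigl( y_i \;-\; \sum_{j \in J_i} w(i,j)\, y_j \Bigr),
\qquad \sum_{j \in J_i} w(i,j) = 1 .
```

With equal weighting, as in nearest-neighbor or caliper matching, $w(i,j) = 1/|J_i|$; with kernel matching the weights decline with the p-score distance between i and j.
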
28
General Procedure
  • Run regression
  • Dependent variable: T = 1 if participate; T = 0
    otherwise.
  • Choose appropriate conditioning variables, X
  • Obtain propensity score: predicted probability
    (p)
  • 1-to-1 match
  • estimate difference in outcomes for each pair
  • Take average difference as treatment effect
  • 1-to-n Match
  • Nearest neighbor matching
  • Caliper matching
  • Nonparametric/kernel matching

Multivariate analysis based on new sample
29
Standard Errors
  • Problem: estimated variance of treatment effect
    should include additional variance from
    estimating p
  • Typically people bootstrap, which is a
    non-parametric approach: re-estimate your
    coefficients on resampled data over and over until
    you get a distribution of those coefficients, then
    use the variance from that
  • Will do this in a few weeks
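
A rough sketch of the bootstrap idea. The `estimator` argument stands for the whole pipeline; in a full application it would re-estimate the propensity score on every resample, which is what captures the extra variance from estimating p. Here the score is taken as given to keep the example short, and `radius_att`, the simulated data, and all parameter values are assumptions for illustration.

```python
import random
import statistics


def radius_att(rows, radius=0.1):
    """Stand-in estimator: mean over treated units of y_i minus the mean outcome
    of controls whose score lies within `radius`. Rows are (t, score, y)."""
    treated = [r for r in rows if r[0] == 1]
    controls = [r for r in rows if r[0] == 0]
    diffs = []
    for _, s_i, y_i in treated:
        matched = [y for _, s, y in controls if abs(s - s_i) <= radius]
        if matched:  # treated units without any match are dropped
            diffs.append(y_i - sum(matched) / len(matched))
    return sum(diffs) / len(diffs) if diffs else None


def bootstrap_se(rows, estimator, reps=200, seed=0):
    """Standard deviation of the estimator across bootstrap resamples of rows."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        resample = [rng.choice(rows) for _ in rows]  # sample rows with replacement
        est = estimator(resample)
        if est is not None:
            draws.append(est)
    return statistics.stdev(draws)


# Simulated data: treatment is more likely at higher scores.
rng = random.Random(1)
rows = []
for _ in range(200):
    s = rng.random()
    t = 1 if rng.random() < s else 0
    y = 2.0 * t + s + rng.gauss(0, 1)
    rows.append((t, s, y))

point = radius_att(rows)
se = bootstrap_se(rows, radius_att)
```

Passing the estimator in as a function keeps the resampling loop separate from the matching method, so the same loop works for any of the matching algorithms.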

30
Some concerns about Matching
  • Data intensive in propensity score estimation
  • May reduce dimensionality of treatment effect
    estimation but still need enough of a sample to
    estimate propensity score over common support
  • Need LOTS of Xs for this to be believable
  • Inflexible in how p-score is related to treatment
  • Worry about heterogeneity
  • Bias terms much more difficult to sign
    (non-linear p-score bias)

31
Matching Diff-in-Diff
  • Worry: unobservables may be causing selection
    because matching on X is not sufficient
  • Can combine this with difference-in-differences
    estimates
  • Take control group J for each individual i
  • Estimate difference before treatment
  • If the groups are truly as if random, it should be
    zero
  • If it's not zero, can assume fixed differences
    over time and take before-after difference in
    treatment and control groups
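
The steps above can be sketched for already-matched pairs as follows. The 1-to-1 pairing, the function names, and the numbers are illustrative assumptions.

```python
def pre_treatment_gap(pairs):
    """Average treated-minus-control outcome difference before treatment."""
    return sum(t[0] - c[0] for t, c in pairs) / len(pairs)


def matched_did(pairs):
    """Average over matched pairs of (treated change) minus (control change).

    Each pair is ((y_i_before, y_i_after), (y_j_before, y_j_after)).
    """
    diffs = [(ti_after - ti_before) - (cj_after - cj_before)
             for (ti_before, ti_after), (cj_before, cj_after) in pairs]
    return sum(diffs) / len(diffs)


# Hypothetical matched pairs: (treated before/after, control before/after).
pairs = [((10.0, 15.0), (8.0, 10.0)),
         ((12.0, 16.0), (11.0, 12.0)),
         ((9.0, 13.0), (7.0, 9.0))]

gap = pre_treatment_gap(pairs)  # nonzero, so matching alone looks insufficient
effect = matched_did(pairs)     # nets out the fixed pre-treatment difference
```

In this toy data the pre-treatment gap is positive, so a simple after-treatment comparison would overstate the effect relative to the difference-in-differences estimate.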

32
Next Time
  • Comparing Non-Experimental Methods to the
    experiments they are trying to replicate
  • Goal: see how well these techniques work to get
    the estimated experimental effect