Matching Method (PSM) - Mbarara. Toko

The document discusses matching methods, particularly propensity score matching (PSM), which is used to create a comparison group for evaluating program impacts when random assignment is not possible. It emphasizes the importance of observed characteristics in matching and highlights limitations such as the assumption of no unobserved differences that could bias results. Additionally, it outlines the steps involved in PSM and the challenges faced when trying to find suitable matches for treatment units.


Matching Method (PSM)
DME WDK KAMPALA
Jimmy Toko
[email protected]
0701422116/0782399762
Intro
• Matching methods can be applied in the context of almost any
program assignment rules, so long as a group exists that has not
participated in the program.
• Matching methods typically rely on observed characteristics to
construct a comparison group, and so the methods require the
strong assumption of no unobserved differences in the
treatment and comparison populations that are also associated
with the outcomes of interest.
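• In notation (the symbols are supplied here for clarity and are not from the slides), this is the standard "unconfoundedness" or conditional independence assumption: given the observed characteristics X, enrolment T is as good as random with respect to the potential outcomes, written (Y0, Y1) ⊥ T | X.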
Matching (cont.)
• Because of that strong assumption, matching methods are
typically most useful in combination with one of the other
methodologies such as the DiD.
• Matching uses statistical techniques to construct an artificial
comparison group by identifying for every possible observation
under treatment a non-treatment observation (or set of non-
treatment observations) that has the most similar characteristics
possible.
Key Concept:

• Matching uses large data sets and heavy statistical techniques to construct the best possible artificial comparison group for a given treatment group.
Example:
Consider a case in which you are attempting to evaluate the impact of a
program and have a data set that contains both households that enrolled in
the program and households that did not enrol, for example, the
Demographic and Health Survey.
• The program that you are trying to evaluate does not have any clear
assignment rules (such as randomized assignment or an eligibility index)
that explain why some households enrolled in the program and others did
not.
• In such a context, matching methods will enable you to identify the set of
non enrolled households that look most similar to the treatment
households, based on the characteristics that you have available in your
data set.
• These “matched” non-enrolled households then become the
comparison group that you use to estimate the counterfactual.
• Finding a good match for each program participant requires
approximating as closely as possible the variables or
determinants that explain that individual’s decision to enrol in
the program. Unfortunately, this is easier said than done.
• If the list of relevant observed characteristics is very large, or if
each characteristic takes on many values, it may be hard to
identify a match for each of the units in the treatment group.
• As you increase the number of characteristics or dimensions
against which you want to match units that enrolled in the
program, you may run into what is called “the curse of
dimensionality.”
• E.g. if you use only three important characteristics to identify
the matched comparison group, such as age, gender, and region
of birth, you will probably find matches for all program
enrolees in the pool of non-enrolees, but you run the risk of
leaving out other potentially important characteristics.
• The figure below illustrates matching based on four
characteristics:
• age,
• gender,
• months unemployed, and
• secondary school diploma.
• However, if you increase the list of variables, say, to include
number of children, number of years of education, age of the
mother, age of the father, and so forth, your database may not
contain a good match for most of the program enrolees, unless
it contains a very large number of observations.
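To make the curse of dimensionality concrete, the short sketch below (a hypothetical illustration, not from the slides; the DataFrame `df`, the 0/1 `enrolled` column, and the covariate names are assumptions) counts how many enrolled units still have at least one exact match among non-enrolled units as more characteristics are required for the match.

```python
# Hypothetical illustration of the curse of dimensionality in exact matching.
# Assumes a pandas DataFrame `df` with a 0/1 column `enrolled` and the covariates below.
import pandas as pd

def exact_match_rate(df: pd.DataFrame, covariates: list) -> float:
    """Share of enrolled units that have at least one non-enrolled unit
    with identical values on every covariate in `covariates`."""
    treated = df[df["enrolled"] == 1]
    controls = df[df["enrolled"] == 0]
    control_profiles = set(map(tuple, controls[covariates].to_numpy()))
    matched = treated[covariates].apply(tuple, axis=1).isin(control_profiles)
    return float(matched.mean())

# With three coarse characteristics, most enrolees usually find an exact match ...
few = ["age", "gender", "region_of_birth"]
# ... but the match rate collapses as more (or finer) characteristics are required.
many = few + ["months_unemployed", "secondary_diploma", "num_children",
              "years_education", "age_mother", "age_father"]
# print(exact_match_rate(df, few), exact_match_rate(df, many))
```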
• The curse of dimensionality can be quite easily solved using a method
called “propensity score matching” (Rosenbaum and Rubin 1983).
• In this approach, we no longer need to try to match each enrolled unit
to a non enrolled unit that has exactly the same value for all observed
control characteristics.
• Instead, for each unit in the treatment group and in the pool of non
enrolees we compute the probability that a unit will enrol in the
program based on the observed values of its characteristics, the so-
called propensity score.
• Once the propensity score has been computed for all units, then units
in the treatment group can be matched with units in the pool of non
enrolees that have the closest propensity score.
• These “closest units” become the comparison group and are used to
produce an estimate of the counterfactual.
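A minimal sketch of this idea, under the same assumptions as above (hypothetical DataFrame `df`, 0/1 `enrolled` column, illustrative covariate names): a logit model gives each unit a propensity score, and each enrolled unit is paired with the non-enrolled unit whose score is closest.

```python
# Minimal propensity-score sketch (column names and DataFrame `df` are assumed).
import pandas as pd
import statsmodels.api as sm

covariates = ["age", "gender", "months_unemployed", "secondary_diploma"]

# Estimate Pr(enrol = 1 | X) on the pooled sample with a logit model.
X = sm.add_constant(df[covariates])
logit = sm.Logit(df["enrolled"], X).fit(disp=0)
df["pscore"] = logit.predict(X)

# Pair each enrolled unit with the non-enrolled unit whose score is closest.
treated = df[df["enrolled"] == 1]
controls = df[df["enrolled"] == 0]
matches = {
    i: (controls["pscore"] - score).abs().idxmin()
    for i, score in treated["pscore"].items()
}
```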
• The propensity score matching method tries to mimic the
randomized assignment to treatment and comparison groups by
choosing for the comparison group those units that have similar
propensities to the units in the treatment group.
• This score is a single number ranging from 0 to 1 that summarizes all
of the observed characteristics of the units as they influence the
likelihood of enrolling in the program.
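• In standard notation (supplied here for clarity, not from the slides), with T the enrolment indicator and X the vector of observed characteristics, the propensity score is e(X) = Pr(T = 1 | X), a single number between 0 and 1 for each unit.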
• Since propensity score matching is not a real randomized
assignment method, but tries to imitate one, it belongs to the
category of quasi-experimental methods.
• The difference in outcomes (Y) between the treatment or
enrolled units and their matched comparison units produces the
estimated impact of the program.
• In summary, the program’s impact is estimated by comparing
the average outcomes of a treatment or enrolled group and the
average outcome among a statistically matched subgroup of
units, the match being based on observed characteristics
available in the data at hand.
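• Written out (notation supplied for clarity, not from the slides): if enrolled unit i has outcome Y_i and its matched comparison unit(s) have average outcome Y_i^M, the estimated impact, often called the average treatment effect on the treated (ATT), is approximately (1/N_T) × Σ_i (Y_i - Y_i^M), where N_T is the number of matched enrolled units.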
• For propensity score matching to produce externally valid
estimates of a program’s impact, all treatment or enrolled units
need to be successfully matched to a non-enrolled unit.
• It may happen that for some enrolled units, no units in the pool
of non-enrolees have similar propensity scores. In technical
terms, there may be a “lack of common support,” or lack of
overlap, between the propensity scores of the treatment or
enrolled group and those of the pool of non enrolees.
The figure shows the distribution of propensity scores separately for enrolees and non
enrolees.
• Crucially, these distributions do not overlap perfectly. In the
middle of the distribution, matches are relatively easy to find
because enrolees and non enrolees have similar characteristics.
• However, units with predicted propensity scores close to 1
cannot be matched to any non enrolees with similar propensity
scores.
• Intuitively, units who are highly likely to enrol in the program
are so dissimilar to non enrolling units that we cannot find a
good match for them.
• A lack of common support thus appears at the extremes, or
tails, of the distribution of propensity scores.
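A common way to operationalize this check is to keep only units whose scores fall inside the region where the two distributions overlap. The sketch below continues the hypothetical example above (it assumes `df["pscore"]` already holds the estimated propensity scores).

```python
# Restrict the sample to the region of common support (illustrative sketch;
# assumes `df["pscore"]` holds the propensity scores computed earlier).
treated_scores = df.loc[df["enrolled"] == 1, "pscore"]
control_scores = df.loc[df["enrolled"] == 0, "pscore"]

# Overlap region: above the larger of the two minima, below the smaller of the two maxima.
low = max(treated_scores.min(), control_scores.min())
high = min(treated_scores.max(), control_scores.max())

on_support = df[df["pscore"].between(low, high)]
print(f"Units dropped for lack of common support: {len(df) - len(on_support)}")
```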
Matched Comparison Design PSM
The logic behind matching:
• Matching finds, for each observation, a nearly identical observation in the control group based on observable characteristics.
• The project impact is the average of the differences in outcomes between matched pairs of observations.

  Intervention Group (Cases)        → Intervention    → Outcome = O1
        matched via PSM to
  Non-Intervention Group (Controls) → No Intervention → Outcome = O2

• Effect Size = O1 - O2
• Note: the counterfactual is O3, the outcome the intervention group would have experienced without the intervention.
Steps in PSM
• Jalan and Ravallion (2003a) summarize the steps to be taken
when applying propensity score matching.
• 1. You will need representative and highly comparable surveys to
identify the units that enrolled in the program and those that did
not.
• 2. You must pool the two samples and estimate the probability
that each individual enrols in the program, based on individual
characteristics observed in the survey. This step yields the
propensity score.
• 3. You restrict the sample to units for which common support
appears in the propensity score distribution.
Steps (cont.)
• 4. For each enrolled unit, you locate a subgroup of non-enrolled units that have similar propensity scores.
• 5. You compare the outcomes for the treatment or enrolled units and their matched comparison or non-enrolled units. The difference in average outcomes for these two subgroups is the measure of the impact that can be attributed to the program for that particular treated observation.
• 6. The mean of these individual impacts yields the estimated average treatment effect (a compact sketch of these steps follows below).
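Putting the six steps together, here is a compact end-to-end sketch. It is hypothetical: it assumes a pooled survey in a pandas DataFrame `data` with a 0/1 `enrolled` column, an outcome column `y`, and the illustrative covariate names shown; it illustrates the procedure described above rather than code from the slides.

```python
# Illustrative end-to-end PSM pipeline for steps 1-6 (DataFrame `data`, the outcome
# column `y`, and the covariate names are assumptions for this sketch).
import pandas as pd
import statsmodels.api as sm
from sklearn.neighbors import NearestNeighbors

covariates = ["age", "gender", "months_unemployed", "secondary_diploma",
              "num_children", "years_education"]

# Steps 1-2: pool enrolees and non-enrolees, estimate Pr(enrol | X) by logit.
X = sm.add_constant(data[covariates])
data["pscore"] = sm.Logit(data["enrolled"], X).fit(disp=0).predict(X)

treated = data[data["enrolled"] == 1]
controls = data[data["enrolled"] == 0]

# Step 3: keep enrolled units whose scores lie inside the range covered by non-enrolees.
treated = treated[treated["pscore"].between(controls["pscore"].min(),
                                            controls["pscore"].max())]

# Step 4: match each enrolled unit to its nearest non-enrolled unit on the score.
nn = NearestNeighbors(n_neighbors=1).fit(controls[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_controls = controls.iloc[idx[:, 0]]

# Steps 5-6: take the outcome difference for each matched pair, then average.
individual_impacts = treated["y"].to_numpy() - matched_controls["y"].to_numpy()
att = individual_impacts.mean()
print(f"Estimated average treatment effect on the treated: {att:.2f}")
```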
• Overall, it is important to remember two crucial issues about
matching.
• First, matching must be done using baseline characteristics.
• Second, the matching method is only as good as the
characteristics that are used for matching, so that having a large
number of background characteristics is crucial.
Limitations of the Matching Method

• Although matching procedures can be applied in many settings, regardless of a program’s assignment rules, they have several serious shortcomings.
• First, they require extensive data sets on large samples of units,
and even when those are available, a lack of common support
between the treatment or enrolled group and the pool of
nonparticipants may appear.
• Second, matching can only be performed based on observed
characteristics; by definition, we cannot incorporate
unobserved characteristics in the calculation of the propensity
score.
• So for the matching procedure to identify a valid comparison
group, we must be sure that no systematic differences in
unobserved characteristics between the treatment units and the
matched comparison units exist that could influence the
outcome (Y).
• Since we cannot prove that no such unobserved characteristics
that affect both participation and outcomes exist, we have to
assume that none exist.
• This is usually a very strong assumption. Although matching
helps to control for observed background characteristics, we
can never rule out bias that stems from unobserved
characteristics.
• In summary, the assumption that no selection bias has occurred stemming from unobserved characteristics is very strong and, most problematically, it cannot be tested.
• Matching is generally less robust than the other
evaluation methods.
• For instance, randomized selection methods do not
require the untestable assumption that there are no
unobserved variables that explain both participation in
the program and outcomes.
• They also do not require such large samples or as
extensive background characteristics as propensity score
matching.
• In practice, matching methods are typically used when
randomized selection, regression discontinuity design, and
difference-in-differences options are not possible.
• Many authors use so-called ex-post matching when no baseline
data are available on the outcome of interest or on background
characteristics.
• They use a survey that was collected after the start of the
program (that is, ex-post) to infer what people’s background
characteristics were at baseline (for example, age, marital
status), and then match the treated group to a comparison
group using those inferred characteristics.
• Generally we note that impact evaluations are best
designed before a program begins to be
implemented.
• Once the program has started, if one has no way to
influence how it is allocated and no baseline data
have been collected, very few, or no, solid options
for the evaluation will be available.
Discussion
• Consider a program that provides loans to poor farmers, so that they can buy
fertilizer to increase their maize production.
• In the year before the program started, the farmers who later enrolled in the
program harvested an average of 1,000 kg of maize per hectare (ha). One
year after the program started, their maize yields increased to 1200 kg/ha.
This increase over time is the before-and-after estimator of program impact:
200 (=1,200 – 1,000) kg/ha.
• Before enrolment: 1,000 kg/ha
• After enrolment: 1,200 kg/ha
• Before-and-after estimator of impact: 200 (= 1,200 – 1,000) kg/ha
• One year after the program started, we observe that farmers who enrolled in
the program harvested an average of 1,100 kg of maize per ha, while
farmers who did not enroll harvested an average of 1,000 kg/ha. The cross-section (enrolled versus non-enrolled) estimator of a program’s impact is the
difference in the yields of these two groups:
• 100 (=1,100 – 1,000) kg/ha.
• Enrolled farmers: before, no measurement taken; after, 1,100 kg/ha
• Non-enrolled farmers: before, no measurement taken; after, 1,000 kg/ha
• Cross-section (enrolled versus non-enrolled) estimator of program impact: 100 (= 1,100 – 1,000) kg/ha
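For reference, the two estimators above restated as plain arithmetic:

```python
# The two naive estimators restated as arithmetic (kg of maize per hectare).
before_after_impact = 1200 - 1000   # before-and-after: same farmers, over time -> 200
cross_section_impact = 1100 - 1000  # cross-section: enrolled vs non-enrolled, after -> 100
```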
• Questions for discussion:
• Is this a plausible estimate of the program impact?
• If not, is it likely to underestimate or overestimate the true impact? Why?
• To see the problem with the cross-sectional estimator, consider
the following two scenarios:
