The basics II – propensity scores
In this section, we will discuss propensity scores and how they are sometimes used to address the challenges that we encounter when using matching in multidimensional cases. Finally, we’ll demonstrate why you should not use propensity scores for matching, even if your favorite econometrician does so.
Matching in the wild
Let’s start with a thought experiment. Imagine that you received a new dataset to analyze. This dataset contains 1,000 observations. What are the chances that you’ll find at least one exact match for each row if there are 18 variables in your dataset?
The answer obviously depends on a number of factors. How many variables are binary? How many are continuous? How many are categorical? What’s the number of levels for categorical variables? Are variables independent or correlated with each other?
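One way to build intuition for these factors is a small simulation. The sketch below is only an illustration under strong simplifying assumptions that are not part of the original setup: all variables are independent, each is categorical with the same number of equally likely levels, and a row counts as matched whenever any other row is identical to it (treatment assignment is ignored). The function name share_with_exact_match and its parameters are hypothetical.

```python
from collections import Counter

import numpy as np


def share_with_exact_match(n_rows=1000, n_vars=18, n_levels=2, seed=0):
    """Return the fraction of rows that have at least one identical row elsewhere."""
    rng = np.random.default_rng(seed)
    # Assumption: every variable is independent and uniform over n_levels values
    data = rng.integers(0, n_levels, size=(n_rows, n_vars))
    rows = [tuple(row) for row in data.tolist()]
    counts = Counter(rows)
    # A row is "matched" if its exact combination of values occurs more than once
    return sum(counts[row] > 1 for row in rows) / n_rows


# Compare a few settings for the number of levels per variable
for levels in (2, 3, 5):
    print(levels, share_with_exact_match(n_levels=levels))
```

Even in the most favorable of these settings, 18 binary variables, there are 2^18 = 262,144 possible value combinations, far more than the 1,000 rows available, so under these assumptions exact matches are rare. Correlated variables would concentrate rows in fewer combinations and raise the share, while continuous variables would push it toward zero.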
To get an idea of what the answer might look like, let’s take a look at Figure 9.5:
