0% found this document useful (0 votes)

732 views40 pages

Propensity Score Matching

This document reviews the propensity score method (PSM) for estimating causal effects using observational data. The PSM has been widely used in fields like economics, medicine, and sociology but has not been widely adopted in management research. The summary reviews why estimating causal effects from observational data can be challenging due to endogeneity. It then introduces the counterfactual framework for causal inference and discusses how the PSM can help estimate average treatment effects by overcoming issues like selection bias that plague traditional regression analyses of observational data. The article proceeds to provide an example application of the PSM and discusses its potential use in management research.

Uploaded by

Mizter. A. Knights

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

732 views40 pages

Propensity Score Matching

Uploaded by

Mizter. A. Knights

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 40

Organizational Research Methods

http://orm.sagepub.com/ Using the Propensity Score Method to Estimate Causal Effects: A Review and Practical Guide
Mingxiang Li Organizational Research Methods published online 13 June 2012 DOI: 10.1177/1094428112447816

The online version of this article can be found at: http://orm.sagepub.com/content/early/2012/06/11/1094428112447816 Published by:
http://www.sagepublications.com

On behalf of:

The Research Methods Division of The Academy of Management

Additional services and information for Organizational Research Methods can be found at: Email Alerts: http://orm.sagepub.com/cgi/alerts Subscriptions: http://orm.sagepub.com/subscriptions Reprints: http://www.sagepub.com/journalsReprints.nav Permissions: http://www.sagepub.com/journalsPermissions.nav

>> OnlineFirst Version of Record - Jun 13, 2012 What is This?

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Using the Propensity Score Method to Estimate Causal Effects: A Review and Practical Guide
Mingxiang Li1

Organizational Research Methods 00(0) 1-39 The Author(s) 2012 Reprints and permission: sagepub.com/journalsPermissions.nav DOI: 10.1177/1094428112447816 http://orm.sagepub.com

Abstract Evidence-based management requires management scholars to draw causal inferences. Researchers generally rely on observational data sets and regression models where the independent variables have not been exogenously manipulated to estimate causal effects; however, using such models on observational data sets can produce a biased effect size of treatment intervention. This article introduces the propensity score method (PSM)which has previously been widely employed in social science disciplines such as public health and economicsto the management field. This research reviews the PSM literature, develops a procedure for applying the PSM to estimate the causal effects of intervention, elaborates on the procedure using an empirical example, and discusses the potential application of the PSM in different management fields. The implementation of the PSM in the management field will increase researchers ability to draw causal inferences using observational data sets. Keywords causal effect, propensity score method, matching

Management scholars are interested in drawing causal inferences (Mellor & Mark, 1998). One example of a causal inference that researchers might try to determine is whether a specific management practice, such as group training or a stock option plan, increases organizational performance. Typically, management scholars rely on observational data sets to estimate causal effects of the management practice. Yet, endogeneitywhich occurs when a predictor variable correlates with the error termprevents scholars from drawing correct inferences (Antonakis, Bendahan, Jacquart, & Lalive, 2010; Wooldridge, 2002). Econometricians have proposed a number of techniques to deal

Department of Management and Human Resources, University of Wisconsin-Madison, Madison, WI, USA

Corresponding Author: Mingxiang Li, Department of Management and Human Resources, University of Wisconsin-Madison, 975 University Avenue, 5268 Grainger Hall, Madison, WI 53706, USA Email: [email protected]

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Organizational Research Methods 00(0)

with endogeneityincluding selection models, fixed effects models, and instrumental variables, all of which have been used by management scholars. In this article, I introduce the propensity score method (PSM) as another technique that can be used to calculate causal effects. In management research, many scholars are interested in evidence-based management (Rynes, Giluk, & Brown, 2007), which derives principles from research evidence and translates them into practices that solve organizational problems (Rousseau, 2006, p. 256). To contribute to evidencebased management, scholars must be able to draw correct causal inferences. Cox (1992) defined a cause as an intervention that brings about a change in the variable of interest, compared with the baseline control model. A causal effect can be simply defined as the average effect due to a certain intervention or treatment. For example, researchers might be interested in the extent to which training influences future earnings. While field experiment is one approach that can be used to correctly estimate causal effects, in many situations field experiments are impractical. This has prompted scholars to rely on observational data, which makes it difficult for scholars to gauge unbiased causal effects. The PSM is a technique that, if used appropriately, can increase scholars ability to draw causal inferences using observational data. Though widely implemented in other social science fields, the PSM has generally been overlooked by management scholars. Since it was introduced by Rosenbaum and Rubin (1983), the PSM has been widely used by economists (Dehejia & Wahba, 1999) and medical scientists (Wolfe & Michaud, 2004) to estimate the causal effects. Recently, financial scholars (Campello, Graham, & Harvey, 2010), sociologists (Gangl, 2006; Grodsky, 2007), and political scientists (Arceneaux, Gerber, & Green, 2006) have implemented the PSM in their empirical studies. A Google Scholar search in early 2012 showed that over 7,300 publications cited Rosenbaum and Rubins classic 1983 article that introduced the PSM. An additional Web of Science analysis indicated that over 3,000 academic articles cited this influential article. Of these citations, 20% of the publications were in economics, 14% were in statistics, 10% were in methodological journals, and the remaining 56% were in health-related fields. Despite the widespread use of the PSM across a variety of disciplines, it has not been employed by management scholars, prompting Gerharts (2007) conclusion that to date, there appear to be no applications of propensity score in the management literature (p. 563). This article begins with an overview of a counterfactual model, experiment, regression, and endogeneity. This section illustrates why the counterfactual model is important for estimating causal effects and why regression models sometimes cannot successfully reconstruct counterfactuals. This is followed by a short review of the PSM and a discussion of the reasons for using the PSM. The third section employs a detailed example to illustrate how a treatment effect can be estimated using the PSM. The following section presents a short summary on the empirical studies that used the PSM in other social science fields, along with a description of potential implementation of the PSM in the management field. Finally, this article concludes with a discussion of the pros and cons of using the PSM to estimate causal effects.

Estimating Causal Effects Without the Propensity Score Method

Evidence-based practices use quantitative methods to find reliable effects that can be implemented by practitioners and administrators to develop and adopt effective policy interventions. Because the application of specific recommendations derived from evidence-based research is not costless, it is crucial for social scientists to draw correct causal inferences. As pointed out by King, Keohane, and Verba (1994), we should draw causal inferences where they seem appropriate but also provide the reader with the best and most honest estimate of the uncertainty of that inference (p. 76).

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Counterfactual Model
To better understand causal effect, it is important to discuss counterfactuals. In Rubins causal model (see Rubin, 2004, for a summary), Y1i and Y0i are potential earnings for individual i when i receives (Y1i ) or does not receive training (Y0i . The fundamental problem of making a causal inference is how to reconstruct the outcomes that are not observed, sometimes called counterfactuals, because they are not what happened. Conceptually, either the treatment or the nontreatment is not observed and hence is missing (Morgan & Winship, 2007). Specifically, if i received training at time t, the earnings for i at t 1 is Y1i . But if i also did not receive training at time t, the potential earnings for i at t 1 is Y0i . Then the effect of training can be simply expressed as Y1i Y0i . Yet, because it is impossible for i to simultaneously receive (Y1i ) and not receive (Y0i the training, scholars need to find other ways to overcome this fundamental problem. One can also understand this fundamental issue as the what-if problem. That is, what if individual i does not receive training? Hence, reconstructing the counterfactuals is crucial to estimate unbiased causal effects. The counterfactual model shows that it is impossible to calculate individual-level treatment effects, and therefore scholars have to calculate aggregated treatment effects (Morgan & Winship, 2007). There are two major versions of aggregated treatment effects: the average treatment effect (ATE) and the average treatment effect on the treated group (ATT). A simple definition of the ATE can be written as ATE EY1i jTi 1; 0 EY0i jTi 1; 0; 1:1a

where E(.) represents the expectation in the population. Ti denotes the treatment with the value of 1 for the treated group and the value of 0 for the control group. In other words, the ATE can be defined as the average effect that would be observed if everyone in the treated and the control groups received treatment, compared with if no one in both groups received treatment (Harder, Stuart, & Anthony, 2010). The definition of ATT can be expressed as ATT EY1i jTi 1 EY0i jTi 1: 1:1b

In contrast to the ATE, the ATT refers to the average difference that would be found if everyone in the treated group received treatment compared with if none of these individuals in the treated group received treatment. The value for the ATE will be the same as that for the ATT when the research design is experimental.1

Experiment
There are different ways to estimate treatment effects other than PSM. Of these, the experiment is the gold standard (Antonakis et al., 2010). If the participants are randomly assigned to the treated or the control group, then the treatment effect can simply be estimated by comparing the mean difference between these two groups. Experimental data can generate an unbiased estimator for causal effects because the randomized design ensures the equivalent distributions of the treated and the control groups on all observed and unobserved characteristics. Thus, any observed difference on outcome can be caused only by the treatment difference. Because randomized experiments can successfully reconstruct counterfactuals, the causal effect generated by experiment is unbiased.

Regression
In situations when the causal effects of training cannot be studied using an experimental design, scholars want to examine whether receiving training (T) has any effect on future earnings (Y). In this case, scholars generally rely on potentially biased observational data sets to investigate the causal

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Organizational Research Methods 00(0)

effect. For example, one can use a simple regression model by regressing future earnings (Y) on training (T) and demographic variables such as age (x1 ) and race (x2 ). Y b0 b1 x1 b2 x2 tT e: 1:2

Scholars then interpret the results by saying ceteris paribus, the effect due to training is t. They typically assume t is the causal effect due to management intervention. Indeed, regression or the structural equation models (SEM) (cf. Duncan, 1975; James, Mulaik, & Brett, 1982) is still a dominant approach for estimating treatment effect.2 Yet, regression cannot detect whether the cases are comparable in terms of distribution overlap on observed characteristics. Thus, regression models are unable to reconstruct counterfactuals. One can easily find many empirical studies that seek to estimate causal effects by regressing an outcome variable on an intervention dummy variable. The findings of these studies, which used observational data sets, could be wrong because they did not adjust for the distribution between the treated and control groups.

Endogeneity
In addition to the nonequivalence of distribution between the control and treated groups, another severe error that prevents scholars from calculating unbiased causal effects is endogeneity. This occurs when predictor T correlates with error term e in Equation 1.2. A number of review articles have described the endogeneity problem and warned management scholars of its biasing effects (e.g., Antonakis et al., 2010; Hamilton & Nickerson, 2003). As discussed previously, endogeneity manifests from measurement error, simultaneity, and omitted variables. Measurement error typically attenuates the effect size of regression estimators in explanatory variables. Simultaneity happens when at least one of the predictors is determined simultaneously along with the dependent variable. An example of simultaneity is the estimation of price in a supply and demand model (Greene, 2008). An omitted variable appears when one does not control for additional variables that correlate with explanatory as well as dependent variables. Of these three sources of endogeneity, the omitted variable bias has probably received the most attention from management scholars. Returning to the earlier training example, suppose the researcher only controls for demographic variables but does not control for an individuals ability. If training correlates with ability and ability correlates with future earnings, the result will be biased because of endogeneity. Consequently, omitting ability will cause a correlation between training dummy T and residuals e. This violates the assumption of strict exogeneity for linear regression models. Thus, the estimated causal effect (t in Equation 1.2 will be biased. If the omitted variable is time-invariant, one can use the fixed effects model to deal with endogeneity (Allison, 2009). Beck, Bruderl, and Woywodes (2008) simulation showed that the fixed effects model provided correction for biased estimation due to the omitted variable. One can also view nonrandom sample selection as a special case of the omitted variable problem. Taking the effect of training on earnings as an example, one can only observe earnings for individuals who are employed. Employed individuals could be a nonrandom subset of the population. One can write the nonrandom selection process as Equation 1.3, D aZ u; 1:3

where D is latent selection variable (1 for employed individuals), Z represents a vector of variables (e.g., education level) that predicts selection, and u denotes disturbances. One can call Equation 1.2 the substantive equation and Equation 1.3 the selection equation. Sample selection bias is likely to materialize when there is correlation between the disturbances for substantive (e) and selection equation (u) (Antonakis et al., 2010, p. 1094; Berk, 1983; Heckman, 1979). When there is a correlation between e and u, the Heckman selection model, rather than the PSM, should be used to calculate

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

causal effect (Antonakis et al., 2010). To correct for the sample selection bias, one can first fit the selection model using probit or logit model. Then the predicted values from the selection model will be saved to compute the density and distribution values, from which the inverse Mills ratio (l)the ratio for the density value to the distribution valuewill be calculated. Finally, the inverse Mills ratio will be included in the substantive Equation 1.2 to correct for the bias of t due to selection. For more information on two-stage selection models, readers can consult Berk (1983).

The Propensity Score Method

Having briefly reviewed existing techniques for estimating causal effects, I now discuss how PSM can help scholars to draw correct causal inferences. The PSM is a technique that allows researchers to reconstruct counterfactuals using observational data. It does this by reducing two sources of bias in the observational data: bias due to lack of distribution overlap and bias due to different density weighting (Heckman, Ichimura, Smith, & Todd, 1998). A propensity score can be defined as the probability of study participants receiving a treatment based on observed characteristics. The PSM refers to a special procedure that uses propensity scores and matching algorithm to calculate the causal effect. Before moving on, it is useful to conceptually differentiate PSM from Heckmans (1979) selection model. His selection model deals with the probability of treatment assignment indirectly from instrumental variables. Thus, the probability calculated using the selection model requires one or more variables that are not censored or truncated and that can predict the selection. For example, if one wanted to study how training affects future earnings, one must consider the self-selection problem, because wages can only be observed for individuals who are already employed. Using the predicted probability calculated from the first stage (Equation 1.3), one can compute the inverse Mills ratio and insert this variable to the wage prediction model to correct for selection bias. In contrast to the predicted probability calculated in the Heckman selection model, propensity scores are calculated directly only through observed predictors. Furthermore, the propensity scores and the predicted probabilities calculated using Heckman selection have different purposes in estimating causal effects: The probabilities estimated from the Heckman model generate an inverse Mills ratio that can be used to adjust for bias due to censoring or truncation, whereas the probabilities calculated in the PSM are used to adjust covariate distribution between the treated group and the control group.

Reasons for Using the PSM

Because there are many methods that can estimate causal effects, why should management scholars care about the PSM? One reason is that most publications in the management field rely on observational data. Such large data can be relatively inexpensive to obtain, yet they are almost always observational rather than experimental. By adjusting covariates between the treated and control groups, the PSM allows scholars to reconstruct counterfactuals using observational data. If the strongly ignorable assumption that will be discussed in the next section is satisfied, then the PSM can produce an unbiased causal effect using observational data sets. Second, mis-specified econometric models using observational data sometimes produce biased estimators. One source of such bias is that the two samples lack distribution overlap, and regression analysis cannot tell researchers the distribution overlap between two samples. Cochran (1957, pp. 265-266) illustrated this problem using the following example: Suppose that we were adjusting for differences in parents income in a comparison of private and public school children, and that the private-school incomes ranged from $10,000$12,000, while the public-school incomes ranged from $4,000$6,000. The covariance would adjust results so that they allegedly applied to a mean income of $8,000 in each group, although neither group has any observations in which incomes are

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Organizational Research Methods 00(0)

at or near this level. The PSM can easily detect the lack of covariate distribution between two groups and adjust the distribution accordingly. Third, linear or logistic models have been used to adjust for confounding covariates, but such models rely on assumptions regarding functional form. For example, one assumption required for a linear model to produce an unbiased estimator is that it does not suffer from the aforementioned problem of endogeneity. Although the procedure to calculate propensity scores is parametric, using propensity scores to compute causal effect is largely nonparametric. Thus, using the PSM to calculate the causal effect is less susceptible to the violation of model assumptions. Overall, when one is interested in investigating the effectiveness of a certain management practice but is unable to collect experimental data, the PSM should be used, at least as a robust test to justify the findings estimated by parametric models.

Overview of the PSM

The concept of subclassification is helpful for understanding the PSM. Simply comparing the mean difference of the outcome variables in two groups typically leads to biased estimators, because the distributions of the observational variables in the two groups may differ. Cochrans (1968) subclassification method first divides an observational variable into n subclasses and then estimates the treatment effect by comparing the weighted means of the outcome variable in each subclass. He used two approaches to demonstrate the effectiveness of subclassification in reducing bias in observational studies. First, he used an empirical example (death rate for smoking groups with country of origin and age as covariates) to show that when age was divided into two classes more than half the effect of the age bias was removed. Second, he used a mathematical model to derive the proportion of bias that can be removed through subclassification. For different distribution functions, using five or six subclasses will typically remove 90% or more of the bias shown in the raw comparison. With more than six subclasses, only small amounts of additional bias can be removed. Yet, subclassification is difficult to utilize if many confounding covariates exist (Rubin, 1997). To overcome the difficulty of estimating the treatment effects using Cochrans technique, Rosenbaum and Rubin (1983) developed the PSM. The key objective of the PSM is to replace the many confounding covariates in an observational study with one function of these covariates. The function (or the propensity score) captures the likelihood of study participants receiving a treatment based on observed covariates. The estimated propensity score is then used as the only confounding covariate to adjust for all of the covariates that go into the estimation. Since the propensity score adjusts for all covariates using a simple variable and Cochran found that five blocks can remove 90% of bias due to raw comparison, stratifying the propensity score into five blocks can generally remove much of the difference due to the non-overlap of all observed covariates between the treated group and the control group. Central to understanding the PSM is the balancing score. Rosenbaum and Rubin (1983) defined the balancing score as a function of observable covariates such that the conditional distribution of X given the balancing score is the same for the treated group and the control group. Formally, the balancing score bX satisfies X ?T jbX , where X is a vector of the observed covariates, T represents the treatment assignment, and ? refers to independence. Rosenbaum and Rubin argued that the propensity score is a type of balancing score. They further proved that the finest balancing score is b X X , the coarsest balancing score is the propensity score, and any score that is finer than the propensity score is the balancing score. Rosenbaum and Rubin (1983) also introduced the strongly ignorable assumption, which implies that given the balancing scores, the distributions of the covariates between the treated and the control groups are the same. They further showed that treatment assignment is strongly ignorable if it satisfies the condition of unconfoundedness and overlap. Unconfoundedness means that conditional on

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

observational covariates X, potential outcomes (Y1 and Y0) are not influenced by treatment assignment (Y1 ; Y0 ?T jX ). This assumption simply asserts that the researcher can observe all variables that need to be adjusted. The overlap assumption means that given covariates X, the person with the same X values has positive and equal opportunity of being assigned to the treated group or the control group 0 < prT 1jX < 1. Strongly ignorable assumption rules out the systematic, pretreatment, and unobserved differences between the treated and the control subjects that participate in the study (Joffe & Rosenbaum, 1999). Given the strongly ignorable assumption, the ATT defined in Equation 1.1b can be estimated using the balancing score. Because the propensity score e(x) is one form of balancing score, one can estimate the ATT by subtracting the average treatment effect of the treated group from that of the control group at a particular propensity score. Thus, Equation 1.1b could be rewritten as ATT EfY jT 1; e xg EfY jT 0; e xg. If there are unobserved variables that simultaneously affect the treatment assignment and the outcome variable, the treatment assignment is not strongly ignorable. One can compare the failure of the strongly ignorable assumption with endogeneity in the mis-specified econometric models. One can view this as the omitted or unmeasured variable problem (cf. James, 1980). Specifically, when one calculates the propensity scores, one or more variables that may affect treatment assignment and outcomes are omitted. For example, suppose an unobserved variable partially determines treatment assignment. In this case, two individuals with the same values of observed covariates will receive the same propensity score, despite the fact that they have different values of unobserved covariates and, thus, should receive different propensity scores. If the strongly ignorable assumption is violated, the PSM will produce biased causal effects.

Estimating Causal Effects With the Propensity Score Method

If the treatment assignment is strongly ignorable, scholars can use the PSM to remove the difference in the covariates distributions between the treated and the control groups (Imbens, 2004). This section details how scholars can apply the PSM to compute causal effects. Generally speaking, four major steps need to take place to estimate causal effect (Figure 1): (1) Determine observational covariates and estimate the propensity scores, (2) stratify the propensity scores into different strata and test the balance for each stratum, (3) calculate the treatment effect by selecting appropriate methods such as matched sampling (or matching) and covariance adjustment, and (4) conduct a sensitivity test to justify that the estimated ATT is robust. To demonstrate how scholars can use the proposed procedure listed in Figure 1 to gauge causal effect, I analyze three sources of data sets that have been widely used by economists (Dehejia & Wahba, 1999, 2002; Heckman & Hotz, 1989; Lalonde, 1986; Simith & Todd, 2005). These data sets include both experimental and observational data. Given that the unbiased treatment effect can be computed from the experimental design, it is possible to compare the discrepancy between the estimated ATT using observational data and the unbiased ATT calculated from the experimental design. The National Supported Work Demonstration (NSW) data were collected using an experimental design in which individuals were randomly chosen to provide data on work experience for a period of around 6 to 18 months in the years from 1975 to 1977. This federally funded program randomly selected qualified individuals for training positions so that they could get paying jobs and accumulate work experience. The other set of qualified individuals was randomly assigned to the control group, where they had no opportunity to receive the benefit of the NSW program. To ensure that the earnings information from the experiment included calendar year 1975 earnings, Lalonde (1986) chose participants who were assigned to treatment after December 1975. This procedure reduced the NSW sample to 297 treated individuals and 425 control individuals for the male

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Organizational Research Methods 00(0)

Determine observational covariates

Step 1: estimating propensity score

Estimate PScore: High-order covariates

1. 2. 3. 4. Logit/probit Ordinal probit Multinomial logit Hazard

Stratify PScore to different strata

Step 2: stratifying and balancing propensity score

Covariate is not balanced

Test for balance of covariate

Covariate is balanced

Estimate causal effect: 1. Matched sampling

1) Stratified matching 2) Nearest neighbor matching 3) Radius matching 4) Kernel matching Covariate adjustment

Step 3: estimating causal effect

Sensitivity test: 1. Multiple comparison groups 2. Specification 3. Instrumental variables 4. Rosenbaum bounds

Step 4: sensitivity test

Figure 1. Steps for estimating treatment effects

Note: PScore propensity scores.

participants. Dehejia and Wahba (1999, 2002) reconstructed Lalondes original NSW data by including individuals who attended the program early enough to obtain retrospective 1974 earning information. The final NSW sample includes 185 treated and 265 control individuals. Lalondes (1986) observational data consisted of two distinct comparison groups in the years between 1975 and 1979: the Population Survey of Income Dynamics (PSID-1) and the Current Population SurveySocial Security Administration File (CPS-1). Initiated in 1968, the PSID is a nationally representative longitudinal database that interviewed individuals and families for information on dynamics of employment, income, and earnings. The CPS, a monthly survey conducted by Bureau of the Census for the Bureau of Labor Statistics, provides comprehensive information on the unemployment, income, and poverty of the nations population. Lalonde further extracted four data sets (denoted as PSID-2, PSID-3, CPS-2, and CPS-3) that represent the treatment group based on

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Table 1a. Description of Data Sets and Definition of Variables Data Sets NSW Treated Sample Size 185 Description National Supported Work Demonstration (NSW) data were collected using experimental design, where qualified individuals were randomly assigned to the training position to receive pay and accumulate experience. Experimental control group: The set of qualified individuals were randomly assigned to this control group so that they have no opportunity to receive the benefit of NSW program. Nonexperimental control group: 1975-1979 Population Survey of Income Dynamics (PSID) where all male household heads under age 55 who did not classify as retired in 1975. Data set was selected from PSID-1 who were not working in the spring of 1976. Data set was selected from PSID-2 who were not working in the spring of 1975. Nonexperimental control group: 1975-1979 Current Population Survey (CPS) where all participants with age under 55. Data set was selected from CPS-1 where all men who were not working when surveyed in March 1976. Data set was selected from CPS-2 where all unemployed men in 1976 whose income in 1975 was below the poverty line. Definition Set to 1 if the participant comes from NSW treated data set, 0 otherwise The age of the participants (in years) Number of years of schooling Set to 1 for Black participants, 0 otherwise Set to 1 for Hispanic participants, 0 otherwise Set to 1 for married participants, 0 otherwise Set to 1 for the participants with no high school degree, 0 otherwise Earnings in 1974 Earnings in 1975 Earnings in 1978, the outcome variable

NSW Control

260

PSID-1

2,490

PSID-2 PSID-3 CPS-1 CPS-2 CPS-3 Variables Treatment Age Education Black Hispanic Married Nodegree RE74 RE75 RE78

253 128 15,992 2,369 429

simple pre-intervention characteristics (e.g., age or employment status; see Table 1a for details). Table 1a reports details of data sets and the definitions of the variables.

Step 1: Estimating the Propensity Scores

To calculate a propensity score, one first needs to determine the covariates. Heckman, Ichimura, and Todd (1997) demonstrated that the quality of the observational variables has a significant impact on the estimated results. Having knowledge of relevant theory, institutional settings, and previous research is beneficial for scholars to specify which variables should be included in the model (Simith & Todd, 2005). To appropriately represent the theory, scholars need to specify not only the observational covariates but also the high-order covariates such as quadratic effects and interaction effects. From a methodological perspective, researchers need to add high-order covariates to achieve strata balance. The process of adding high-order covariates will be discussed in the section detailing how to obtain a balance of propensity scores in each stratum. A recent development called boosted regression can also be implemented to calculate propensity scores (McCaffrey, Ridgeway, &

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Organizational Research Methods 00(0)

Table 1b. Summary Statistics

Sample Statistics Age Education 10.35 2.01 10.09 1.61 14.12 12.12 3.08 68.05 11.14 2.59 34.29 49.61 Black 0.84 0.36 0.83 0.38 4.39 0.25 0.43 147.98 0.70 0.49 33.13 77.61 Hispanic Married Nodegree 0.06 0.24 0.11 0.31 17.46 0.03 0.18 12.86 0.05 0.22 3.48 72.94 0.19 0.39 0.15 0.36 9.36 0.87 0.34 184.23 0.45 0.42 64.37 65.06 0.71 0.46 0.83 0.37 30.40 0.31 0.46 87.92 0.41 0.49 62.44 28.98 RE74 RE75 N

NSW Treated M 25.82 SD 7.16 NSW Control M 25.05 SD 7.06 SB 10.73 M 34.85 PSID-1a SD 10.44 SB 100.94 M 30.96 PSID-1Mb SD 9.46 SB 61.35 Percentage 39.22 reduction in SB

2,095.57 1,532.06 185 4,886.62 3,219.25 2,107.03 1,266.91 260 5,687.91 3,102.98 0.22 8.39 19,428.75 19,063.34 2,490 13,406.88 13,596.95 171.78 177.44 1,1386.48 9,528.64 1,103 9,326.64 8,222.72 124.79 128.07 27.36 27.82

Note: SB standardized bias estimated using Formula 2.1; N number of cases. a PSID-1: All male house heads under age 55 who did not classify as retired. b PSID-1M is the subsample of PSID-1 that is matched to the treatment group (NSW treated).

Morral, 2004). Boosted regression can simplify the process of achieving balance in each stratum. Appendix A provides further discussion on this technique. Steiner, Cook, Shadish, and Clark (2010) replicated a prior study to show the importance of appropriately selecting covariates. They summarized three strategies for covariates selection: First, select covariates that are correctly measured and modeled. Second, choose covariates that reduce selection bias. These will be covariates that are highly correlated with the treatment (best predicted treatment) and with the outcomes (best predicted outcomes). Finally, if there was no prior theoretically or empirically sound guidance for the covariates selection (e.g., the research question is very new), scholars can measure a rich set of covariates to increase the likelihood of including covariates that satisfy the strongly ignorable assumption. After specifying the observational covariates, the propensity scores can be estimated using these observational variables. This article summarizes four different approaches that can be used to estimate the propensity scores. If there is only one treatment (e.g., training), then one can use a logistic model, probit model, or prepared program.3 If treatment has more than two versions (e.g., individuals receive several doses of medicine), then an ordinal logistic model can be used (Joffe & Rosenbaum, 1999). The treatment must be ordered based on certain threshold values. If there is more than one treatment and the treatments are discrete choices (e.g., Group 1 receives payment, Group 2 receives training), the propensity scores can be estimated using a multinomial logistic model. Receiving treatment does not need to happen at the same time. For many treatments, a decision needs to be made regarding whether to treat now or to wait and treat later. The decision to treat now versus later is driven by the participants preferences. Under this condition, one can use the Cox proportional hazard model to compute the propensity scores. Li, Propert, and Rosenbaum (2001) demonstrated that the hazard model has properties similar to those of propensity scores. Except for the Cox model that uses partial likelihood (PL) and does not require us to specify the baseline hazard function, the estimating technique used in the aforementioned models is maximum likelihood estimation (MLE) (see Greene, 2008, Chapter 16, for more information on MLE). The logistic models and the hazard model all assume a latent variable (Y*) that represents an underlying propensity or probability to receive treatment. Long (1997) argues that one can view a binary outcome variable as a latent variable. When the estimated probability is greater than a certain threshold or cut point (t), one observes the treatment (Y* > t; T 1). For an ordinal logistical model, one can
Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

understand the latent variable with multiple thresholds and observe the treatment according to the thresholds (e.g., t1 < Y* < t2; T 2). The multinomial logistical model can simply be viewed as the model that simultaneously estimates a binary model for all possible comparisons among outcome categories (Long, 1997), but it is more efficient to use a multinomial logistical model than using multiple binary models. It is somewhat tricky to generate the predicted probability from the Cox model because it is semiparametric with no assumption of the distribution of baseline. Two alternative choices can be used to better derive probability for survival model: (1) One can rely on a parametric survival model that specifies the baseline model; (2) one can transform the data in order to use the discrete-time model. To illustrate how to calculate propensity scores, this study employed treatment group data from the NSW and control group data from the observational data extracted from the PSID-2. Following Dehejia and Wahba (1999), I selected age, education, no degree, Black, Hispanic, RE74, RE75, age square, RE74 square, RE75 square, and RE74 Black as covariates to calculate propensity scores. To compute propensity scores, one can first run a logistic or probit model using a treatment dummy (whether an individual received training) as the dependent variable and the aforementioned covariates as the independent variables. Propensity scores can be obtained by calculating the fitted value from the logistic or probit models (use predict mypscore, p in STATA). Readers can refer Hoetker (2007) for more information on calculating probability from logit or probit models. After calculating propensity scores, Appendix B includes a randomly selected sample (n 50) from the combined data set NSW and PSID-2. Readers can obtain data for Appendix B, NSW treated, and PSID-2 from the author.

Step 2: Stratifying and Balancing the Propensity Scores

After estimating the propensity scores, the next step is to subclassify them into different strata such that these blocks are balanced on propensity scores. The number of balanced propensity score blocks depends on the number of observations in the data set. As discussed previously, five blocks are a good starting point to stratify the propensity scores (Rosenbaum & Rubin, 1983). One then can test the balance of each block by examining the distribution of covariates and the variance of propensity scores. The t test and the test for standardized bias (SB) are two widely used techniques to ensure the balance of the strata (Rosenbaum & Rubin, 1985). The t-test compares whether the means of covariates differ between the treated and the matched control groups. The SB approach calculates the difference of sample means in the treated and the matched control groups as a percentage of the square root of the average sample variance in both groups. To conduct the SB test, scholars need to compare values calculated before and after matching. The formula used to calculate the SB value can be written as jX1M X0M j SBmatch 100 p ; 0:5V1M X V0M X 2:1

where X1M V1M and X0M V0M are the means (variance) for the treated group and the matched control group. In addition to these two widely used tests, the Kolmogorov-Smirnovs two-sample test can also be used to investigate the overlap of the covariates between the treated and the control groups. Balanced strata between the treated and the matched control group ensure the minimal distance in the marginal distributions of the covariates. If any pretreatment variable is not balanced in a particular block, one needs to subclassify the block into additional blocks until all blocks are balanced. To obtain strata balance, researchers sometimes need to add high-order covariates and recalculate the propensity scores. Rosenbaum and Rubin (1984) detailed the process of cycling between checking for balance within strata and reformulating the propensity model. Two guidelines for adding high-order covariates have been proposed: (1) When the variances of a critical covariate are found

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Organizational Research Methods 00(0)

to differ dramatically between the treatment and the control group, the squared terms of the covariate need to be included in the revised propensity score model and (2) when the correlation between two important covariates differs greatly between the groups, the interaction of the covariates can be added to the propensity score model. Appendix B shows a simple example of stratifying data into five blocks after calculating the propensity scores. For this illustration, I stratified the 50 cases into five groups. I first identified the cases with propensity scores smaller than 0.05, which were classified as unmatched. When the propensity scores were smaller than 0.2 but larger than 0.05, I coded this as block 1 (Block ID 1). When the propensity scores were smaller than 0.4 but larger than 0.2, this was coded as block 2. This process was repeated until I had created five blocks, and then I conducted the t-test within each block to detect any significant difference of propensity scores between the treated and control groups. Tvalues for each block were added in the columns next to the column of Block ID. Overall, the t-test reveals that the difference of propensity scores between the treated and control groups is statistically insignificant. If the t-test shows that there are statistically significant differences in propensity scores, one should either change threshold value of propensity scores in each block or change the covariates to recalculate the propensity scores. When the propensity scores in each stratum are balanced, all covariates in each stratum should also achieve equivalence of distribution. To confirm this, one can conduct the t-test for each observational variable. To illustrate how balance of propensity scores within strata helps to achieve distribution overlap for other covariates, Appendix B reports the values for one continuous variable, age. One can conduct the t-test to ensure that there is no age difference between the treated and control groups within each stratum. The column Tage reports the t-test for age within the strata. After balancing each blocks propensity scores, the age difference between the treated and control groups in each block became statistically insignificant. I recommend that readers use a prepared statistic package to stratify propensity scores, as a program can simultaneously categorize propensity scores and conduct balance tests. For instance, one can use the -pscore- program in STATA (Becker & Ichino, 2002) to estimate, stratify, and test the balance of propensity scores. To further illustrate how the PSM can achieve strata balance, I replicated the aforementioned two procedures for the combined experimental data set and each of the observational data sets in Table 1a. Following Dehejia and Wahbas (1999) suggestions on choice of covariates, I first computed propensity scores for each data set. Then, the propensity scores were stratified and tested for the balance within each stratum. When the propensity scores achieved balance within each stratum, I plotted the means of propensity scores in each stratum for each matched data set. Figure 2 provides evidence that the means of the propensity scores are almost the same for each sample within each balanced block. To demonstrate the effectiveness of the PSM in adjusting for the balance of other covariates, Table 1b summarizes the means, standard errors, and SB of the matched sample. Comparing the results between the matched and unmatched samples, one can see that the difference of most observed characteristics between the experimental design and the nonexperimental design reduces dramatically. For instance, PSID-1 of Table 1b reports that the absolute SB values range from 12.86 to 184.23 (before using propensity score matching), but PSID-1M of Table 1b shows that the absolute minimum value of SB is 3.48 and the absolute maximum value of SB is 128.07. Furthermore, the t-test and the Kolmogorov-Smirnov sample test were conducted to examine the balance of each variable. As reported from Table 2, for the PSID-1 sample, except for RE74 in Block 3, one cannot see a p value smaller than 0.1. For simplicity, Table 2 uses only continuous variables that have been included for estimating the propensity scores to illustrate the effectiveness of the PSM in increasing the distribution overlap between the treated group and the matched control group. Overall, Table 2 shows strong evidence that after obtaining balance of propensity scores within a stratum, the covariates achieve overlap in terms of distribution. To preserve space, Table 1b and

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

0 .2 .4 .6 .8 1

Mean of Propensity Score

1 2 3 4 5 6 7

0 .2 .4 .6 .8 1 2 3 4 5 6 7 8

1 2 3 4 5 6 7 8

PSID-1: Control group PSID-1: Treated group

0 .2 .4 .6 .8 1 0 .2 .4 .6 .8 1

CPS-1: Control group CPS-1: Treated group

1 2 3 4 5 6 7

PSID-2: Control group

0 .2 .4 .6 .8 1

PSID-2: Treated group

0 .2 .4 .6 .8 1

CPS-2: Control group

CPS-2: Treated group

1 2 3 4 5 6 7 8 9

PSID-3: Control group PSID-3: Treated group

CPS-3: Control group

CPS-3: Treated group

Block ID
Figure 2. Means of propensity scores in balanced strata
Note: PSID Population Survey of Income Dynamics (PSID-1); CPS Current Population SurveySocial Security Administration File (CPS-1).

Table 2. Test of Strata Balance t-test for Matched Sample PSID-1 Block ID 1 2 3 4 5 6 7 Age 0.800 0.856 0.834 0.853 0.341 0.353 0.603 Education 0.995 0.319 0.765 0.378 0.816 0.196 0.574 RE74 0.283 0.632 0.077 0.744 0.711 0.888 0.791 RE75 0.685 0.627 0.641 0.874 0.113 0.956 0.747 Age 0.566 0.998 0.832 0.954 0.613 0.950 0.280 KS Test for Matcheda Education 1.000 0.894 1.000 0.999 0.844 0.942 0.828 RE74 0.697 0.983 0.044 0.949 0.512 0.466 1.000 RE75 0.984 0.998 0.851 0.754 0.026 0.878 1.000

Note: The table reports the p value of each variable for each stratum between National Supported Work Demonstration (NSW) Treated and matched control groups. PSID-1 1975-1979 Population Survey of Income Dynamics (PSID) where all male household heads under age 55 who did not classify as retired in 1975. a KS (Kolmogorov-Smirnov) two-sample test between NSW Treated and matched control groups.

Table 2 report statistics only for PSID-1. Readers can get a full version of these two tables by contacting the author. The aforementioned evidences generally support that the covariates are balanced for the treated and control groups.

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Organizational Research Methods 00(0)

Step 3: Estimating the Causal Effect

Because the data sets include an experimental design, one can compute the unbiased causal effect. Table 3 shows the estimated results of training on earnings in 1978 (RE78). The first row of Table 3 reports the benchmark values calculated using the experimental data. The unadjusted result ($1,794.34) was calculated by subtracting the mean of RE78 in the treated group (NSW Treated) from the mean of RE78 in the control group (NSW Control). The adjusted estimation ($1,676.34) was computed by using regression, controlling for all observational covariates. Because the experimental data compiled by Lalonde (1986) does not achieve the same distribution between the treated and control groups (Table 1b), this article uses the causal effect value calculated by the adjusted estimation as the benchmark value. From Table 3 column 1, it is obvious that if there are substantial differences among the pretreatment variables (as shown in Table 1b), using the mean difference to estimate the causal effect is strongly biased (it ranges from $15,204.78 to $1,069.85). In Table 3 column 2, a simple linear regression model was used to gauge the adjusted training effects. Column 2 shows that the estimated treatment effects (with a range from $699.13 to $1,873.77) are more reliable than those calculated using the mean differences. In addition to mean difference and regression, PSM can also be used to effectively estimate the ATT. When the propensity scores are balanced in all strata, one can use two standard techniques to compute the ATT: matched sampling (e.g., stratified matching, nearest neighbor matching, radius matching, and kernel matching) and covariance adjustment. Matched sampling or matching is a technique used to sample certain covariates from the treated group and the control group to obtain a sample with similar distributions of covariates between the two groups.4 Rosenbaum (2004) concluded that propensity score matching can increase the robustness of the model-based adjustment and avoid unnecessarily detailed description. The quality of the matched samples depends on the covariate balance and the structure of the matched sets (Gu & Rosenbaum, 1993). Ideally, exact matching on all confounding variables is the best matching approach because the sample distribution of all confounding variables would be identical in the treated and control groups. Unfortunately, exact matching on a single confounding variable will reduce the number of final matched cases. Supposing that there are k confounding variables and each variable has three levels, there will be 3k patterns of levels to get perfectly matched samples. Thus, it is impractical to use the exact matching technique to get the identical distribution of confounding variables between the two groups. The PSM is more appropriate than exact matching because it reduces the covariates from k-dimensional to one-dimensional. Rosenbaum and Rubin (1983) also showed that the PSM not only simplified the matching algorithm, but also increased the quality of the matches.

Stratified Matching
After achieving strata balance, one can apply stratified matching to calculate the ATT. In each balanced block, the average differences in the outcomes of the treated group and the matched control group are calculated. The ATT will be estimated by the mean difference weighted by the number of treated cases in each block. The ATT can be expressed as P P C Q T X i2I q YiT Nq j2I q Yj ATT T; 2:2 T C Nq Nq N q1
T C where Q denotes the number of blocks with balanced propensity scores, Nq and Nq refer to the numT C ber of cases in the treated and the control groups for matched block q, Yi andYj represent the observational outcomes for case i in the matched treated group q and case j in the matched control group q, respectively, and N T stands for the total number of cases in the treated group.

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Table 3. Estimation Results Matching Stratified ATTc 3 ATT 5 Nd 4 Nd 6 ATTe 7 Nd 8 ATTf 9 Nd 10 Neighbor Radius Kernel Covariate Adjustment ATTg 11 Nd 12

ATT

Unadjusteda 1

Adjustedb 2

NSW 1,288 308 250 4,563 1,438 508 273 271 79 53 280 102 217 167 231 77 248 37

1,794.34

PSID-1h

15,204.78

1153 297 245 4,144 1,416 493

1,288 308 250 4,563 1,438 508

PSID-2h

3,646.81

PSID-3h

1,069.85

CPS-1i

8,497.52

CPS-2i

3,821.97

CPS-3i

635.03

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Meanj Variancej

5,122.71 3,5078,950.9

1,676.34 (638.68) 751.95 (915.26) 1,873.77 (1,060.56) 1,833.13 (1,159.78) 699.13 (547.64) 1,172.70 (645.86) 1,548.24 (781.28) 1,313.15 270,327.32 1,637.43 (805.43) 1,467.04 (1,461.75) 1,843.20 (981.42) 1,488.29 (716.79) 1,676.43 (796.62) 1,505.49 (1,065.52) 1,602.98 21,084.82 1,654.57 (1,174.63) 1,604.09 (1,092.40) 1,522.23 (1,920.24) 1,600.74 (957.05) 1,638.74 (1,014.64) 1,376.65 (1,129.24) 1,566.17 10,712.11 1,871.44 (5,837.10) 1,519.60 (2,110.71) 1,632.74 (1,598.12) 1,890.13 (1,993.50) 1,775.99 (2,286.23) 1,307.63 (2,821.56) 1,544.47 45,779.09 1,507.10 (826.11) 1,712.18 (1,226.90) 1,776.37 (1,425.32) 1,513.78 (726.47) 1,590.49 (736.85) 1,166.93 (864.38) 1,666.26 51,101.52

1,952.23 (791.45) 1,593.32 (1,476.54) 1,583.41 (1,866.46) 1,634.81 (515.58) 1,550.90 (625.04) 1,572.09 (943.65) 1,647.80 23,016.46

Note: Bootstrap with 100 replications was used to estimate standard errors for the propensity score matching; standard errors in parentheses. a The mean difference between treatment group (NSW Treated) and corresponding control groups (NSW Control, PSID-1 to CSP-3). b Least squares regression: regress RE78 (earning in 1978) on age, treatment dummy, education, no degree, Black, Hispanic, RE74 (earning in 1974), and RE75 (earning in 1975). c Stratifying blocks based on propensity scores, and then use Formula 2.2 to estimate ATT (average treatment effect on treated). d The total number of observations, including observations in NSW Treated and corresponding matched control groups. e For Kernel matching, when the number of cases is small, use narrower bandwidth (.01) instead of .06. f Radius value ranges from .0001 to .0000025. g Use regression, take weights, which are defined by the number of treated observations in each balanced propensity score block. h Observational covariates: age, treatment dummy, education, no degree, Black, Hispanic, RE74, and RE75. Higher order covariates: age2, RE742, RE752, RE74 Black. i Observational covariates: same as h; high-order covariates: age2, education2, RE742, RE752, Education RE74. j Mean and variance are calculated using estimated ATT for each technique.

Organizational Research Methods 00(0)

After stratifying data into different blocks, one can calculate the ATT using data listed in AppenP T Yi (the summation of the outcome variable in each block for each dix B. First, one can compute of the treated cases, denoted as YiT in Appendix B) and
i2I 1

able in each block for each of the control cases, denoted as YjC in Appendix B). For example, in block 1 the summation of the outcome for two treated cases is 49,237.66, and the summation of the T outcome for five control cases is 31,301.69. The number of cases in the treatment (N1 ) and the conC trol group (N1 ) for matched block 1 is 2 and 5, respectively. One then can calculate the ATT for each block. For instance, ATTq1 (for block 1) 49,237.66/2 31,301.69/5 18,388.492. After computing the ATT for each block, one can get weighted ATTs using the weight given by the fraction of treated cases in each block. For example, the weight for block 1 is 0.08 (two treated cases in block 1 divided by 25 treated cases in total). The final ATT is estimated by taking a summation of the weighted ATT ($1,702.321), which means that individuals who received training will, on average, earn around $1,702.321 more per year than their counterparts who did not obtain governmental training. The estimated ATT using simple regression is $2,316.414. Comparing this with the true treatment effect in Table 3 ($1,676.34), one can see that the PSM produces an ATT substantively similar to the actual casual effect, given that the propensity scores of every block are balanced. I also conducted another simulation with 200 randomly selected cases from NSW and PSID-2 for 50 times. The average ATT calculated by the PSM is $1,376.713, whereas the average ATT computed by regression analysis is $709.039. Clearly, the PSM produces an ATT closer to the true causal effects than does the ordinary least squares (OLS). I further examined the balance test for each of these 50 randomly drawn data sets. Thirteen of 50 data sets did not achieve strata balance. The average ATT calculated by the PSM was $979.612, and the average ATT calculated by OLS was $697.626. For the remaining 37 data sets that achieved strata balance, the average ATT calculated by the PSM was $1,516.23, and the average ATT calculated by OLS was $713.04. Therefore, achieving balance of propensity scores in each stratum is very important for obtaining a less biased estimator of causal effect. I also provided SPSS code in Appendix C and STATA code in Appendix D, which readers can adjust appropriately to other statistical packages for stratified matching. The codes show how to fit the model with the logit model, calculate propensity scores, stratify propensity scores, conduct the balance test, and compute the ATT using stratified matching. It is also convenient to implement the procedure in Excel after calculating the propensity scores using other statistical packages. Readers who are interested in Excel calculation can contact the author directly to obtain the original file for the calculation in Appendix B. Moreover, Appendix E also presents a table that reports the PSM prewritten software in R, SAS, SPSS, and STATA for readers to conveniently find appropriate statistical packages. Combining NSW Treated with other observational data sets, column 3 of Table 3 further details the estimated ATT using stratified matching. Column 3 shows that the lowest estimated result is $1,467.04 (PSID-2) and the highest estimation of the treatment effect is $1,843.20 (PSID-3). Overall, stratified matching produces an ATT relatively close to the unbiased ATT ($1,676.34).

j2I 1

YjC (the summation of the outcome vari-

Nearest Neighbor and Radius Matching

Nearest neighbor (NN) matching computes the ATT by selecting n comparison units whose propensity scores are nearest to the treated unit in question. In radius matching, the outcome of the control units matches with the outcome of the treated units only when the propensity scores fall in the predefined radius of the treated units. A simplified formula to compute the estimated treatment effect using the nearest neighbor matching or the radius matching technique can be written as

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

ATT

1 X T 1 X C Yi C Y ; N T i2T Ni j2C j

2:3

where NT is the number of cases in the treated group and NiC is a weighting scheme that equals the number of cases in the control group using a specific algorithm (e.g., nearest neighbor matching, NiC , will be the n comparison units with the closest propensity scores). For more information, readers can consult Heckman et al. (1997). For NN matching, one can randomly draw either backward or forward matches. For example, in Appendix B, for case 7 (propensity score 0.101), one can draw forward matches and find the control case (case 2) with the closest propensity score (0.109). Drawing backward matches, one can find case 1 with the closest propensity score (0.076). After repeating this for each treated case, one can calculate the ATT using Formula 2.3. For radius matching, one needs to specify the radius first. For example, suppose one sets the radius at 0.01, then the only matched case for case 7 is case 2, because the absolute value of the difference of the propensity scores between case 7 and case 2 is 0.008 (|0.101 0.109|), smaller than the radius value 0.01. One can repeat this matching procedure for each of the treated cases and use Formula 2.3 to estimate the ATT. In Table 3, column 5 reports the estimated ATT using NN matching, which produced an ATT with a range from $1,376.65 (CPS-3) to $1,654.57 (PSID-1). Column 7 describes the estimated ATT using the radius matching, which generated an ATT with a range from $1,307.63 (CPS-3) to $1,890.13 (CPS-1).

Kernel Matching
Kernel matching is another nonparametric estimation technique that matches all treated units with a weighted average of all controls. The weighting value is determined by distance of propensity scores, bandwidth parameter hn, and a kernel function K(.). Scholars can specify the Gaussian kernel and the appropriate bandwidth parameter to estimate the treatment effect using the Formula 2.4 ej x ei x X 1 X T X C ek x ei x fYi Yj K K = g; 2:4 ATT T N i2T hn hn j2C k2C where ej x denotes the propensity score of case j in the control group and ei x denotes the propensity score of case i in the treated group, and ej x ei x represents the distance of the propensity scores. When one applies kernel matching, one downweights the case in the control group that has a long distance from the case in the treated group. The weight function K : in Equation 2.4 takes large values when ej x is close to ei x. To show how it happens, suppose one chooses Gaussian density ej x ei x 1 2 and hn 0.005, and wants to match treated function K z p ez =2 where z hn 2p case 14 with control cases 10 and 11 (Appendix B). One then can compute z values for case 10 ([0.282 0.312]/0.05 0.6) and case 11 ([0.313 0.312]/0.05 0.02). The weights for case 10 and 11 are 0.33 (k(0.6)) and 0.40 (k(0.02)), respectively. Clearly, the weight is low for case 10 (0.33) that has a long distance of propensity score with treated case 14 (0.282 0.312 0.04), whereas the weight is relatively large for case 11 (0.40) that has a short distance of propensity score with case 14 (0.313 0.312 0.001). For more information on kernel matching, readers can refer to Heckman et al. (1998). In Table 3, column 9 shows the results for the kernel matching. The estimated ATT using the kernel matching technique ranges from $1,166.93 (CPS-3) to $1,776.37 (PSID-3).

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Organizational Research Methods 00(0)

Covariance Adjustment
Covariance adjustment is a type of regression adjustment that weights the regression using propensity scores. The matching process does not consider the variance in the observational variables because the PSM can balance the difference in the pretreatment variables in each block. Therefore, the observational variables in the balanced strata do not contribute to the treatment assignment and the potential outcome. Although each block has a balanced propensity score, the pretreatment variables may not have exactly the same distributions between the treatment group and the control group. Table 2 provides evidence that although the propensity scores are balanced in each stratum, the distributions of some variables do not fully overlap. For example, RE74 are statistically different between the treated and the matched control group for PSID-1. Covariate adjustment is achieved by using a matched sample to regress the treatment outcome on the covariates with appropriate weights for unmatched cases and duplicated cases. Dehejia and Wahba (1999) estimated the causal effect by conducting within-stratum regression, taking a weighted sum over the strata. Imbens (2000) proposed that one can use the inverse of one minus the propensity scores as the weight for each control case and the inverse of propensity scores as the weight for each treated case. Rubin (2001) provided additional discussion on covariate adjustment. Unlike matched sapling, covariance adjustment is a hybrid technique that combines nonparametric propensity matching with parametric regression. Column 11 of Table 3 reports the results of the covariance adjustment, which were produced by regressing RE78 on all observational variables, weighted by number of treated cases in each block. This approach generates an ATT ranging from $1,550.90 (CPS-2) to $1,925.23 (PSID-1). Researchers have suggested two ways to calculate the variance of the nonparametric estimators of the ATT. First, Imbens (2004) suggested that one can estimate the variance by calculating each of five components5 included in the variance formula. The asymptotic variance can generally be estimated consistently using kernel methods, which can consistently compute each of these five components. The bootstrap is the second nonparametric approach to calculate variance (Efron & Tibshirani, 1997). Efron and Tibshirani (1997) argued that 50 bootstrap replications can produce a good estimator for standard errors, yet a much larger number of replications are needed to determine the bootstrap confidence interval. In Table 3, 100 bootstrap replications were used to calculate the standard errors for the matching technique. In addition to calculating the variance nonparametrically, one can also compute it parametrically if covariance adjustment is used to produce the ATT. In Table 3, for the covariate adjustment technique, the standard errors in Column 11 of Table 3 were generated by linear regression.

Choosing Techniques
This article has reviewed different techniques for gauging the ATT. The performance of these strategies differs case by case and depends on data structure. Dehejia and Wahba (2002) demonstrated that when there is substantial overlap in the distribution of propensity scores (or balanced strata) between the treated and control groups, most matching techniques will produce similar results. Imbens (2004) remarked that there are no fully applicable versions of tools that do not require applied researchers to specify smoothing parameters. Specifically, little is still known about the optimal bandwidth, radius, and number of matches. That being said, scholars still need to consider particular issues in choosing the techniques that their research will employ. For nearest neighbor matching, it is important to determine how many comparison units match each treated unit. Increasing comparison units decreases the variance of the estimator but increases the bias of the estimator. Furthermore, one needs to choose between matching with replacement and

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Number of matched neighbor ; Bias ; Variance Nearest neighbor Match without replacement; Bias ; Variance Radius matching Matched sampling Weighting: kernel function (e.g., Gaussian) Kernel matching Bandwidth ; Bias ; Variance Balanced strata Stratified matching Weighting: fraction of treated cases within strata Maximum value of radius ; Bias ; Variance

Number of treated cases in each stratum Covariate adjustment Weighting Inverse of propensity score for treated case

Figure 3. Choosing techniques

matching without replacement (Dehejia & Wahba, 2002). When there are few comparison units, matching without replacement will force us to match treated units to the comparison ones that are quite different in propensity scores. This enhances the likelihood of bad matches (increase the bias of the estimator), but it could also decrease the variance of the estimator. Thus, matching without replacement decreases the variance of the estimator at the cost of increasing the estimation bias. In contrast, because matching with replacement allows one comparison unit to be matched more than once with each nearest treatment unit, matching with replacement can minimize the distance between the treatment unit and the matched comparison unit. This will reduce bias of the estimator but increase variance of the estimator. In regard to radius matching, it is important to choose the maximum value of the radius. The larger the radius is, the more matches can be found. More matches typically increase the likelihood of finding bad matches, which raises the bias of the estimator but decreases the variance of the estimator. As far as kernel matching is concerned, choosing an appropriate bandwidth is also crucial because a wider bandwidth will produce a smoother function at the cost of tracking data less closely. Typically, wider bandwidth increases chance of bad matches so that the bias of the estimator will also be high. Yet, more comparison units due to wider bandwidth will also decrease the variance of the estimator. Figure 3 summarizes the issues that scholars need to consider before choosing appropriate techniques. For organizational scholars, I recommend using stratified matching and covariate adjustment for the following reasons: First, these two techniques do not require scholars to choose specific smoothing parameters. The estimation of the ATT from these two techniques requires minimum statistical knowledge. Second, the weighting parameters can be easily constructed from the data. One can use a similar version of weighting parameters (the number of treated cases in each block) for both techniques. For stratified matching, one calculates the number of treated cases in each stratum, and then the proportion of treated cases will be computed. For covariate adjustment, one can use the number of treated cases as weights in the regression model. Finally, the performance of these two approaches (Table 3) is relatively close to other matching techniques. Overall, these two techniques are not only relatively simple, but can also produce a reliable ATT.

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Organizational Research Methods 00(0)

Step 4: Sensitivity Test

The sensitivity test is the final step used to investigate whether the causal effect estimated from the PSM is susceptible to the influence of unobserved covariates. Ideally, when an unbiased causal effect is available (e.g., the benchmark ATT estimated from the experimental design), scholars can compare the ATT generated by the PSM with the unbiased ATT to assure the accuracy of the PSM. However, in most empirical settings, an unbiased ATT is not available. Rosenbaum (1987) proposed that multiple comparison groups are valuable in detecting the existence of important unobserved variables. For example, one can use multiple control groups to match the treated group to calculate multiple treatment effects. One can have a sense of the reliability of the estimated ATT comparing the effect size of different treatment effects. Table 3 reports results for such sensitivity test by drawing on multiple groups. One can compare the ATT for between PSID-1 and other data sets to confirm the effectiveness of stratified matching. Alternatively, one can match two control groups. If the results show that causal effects are statistically different between these two control groups, then one can conclude that the strongly ignorable assumption is violated. In practice, however, scholars will ordinarily not have multiple comparison groups or unbiased causal effect gauged from experimental data. How then can one conduct a sensitivity test? Three approacheschanging the specification in the equation, using the instrumental variable, and Rosenbaum Bounding (RB)can be implemented. To conduct a sensitivity test by changing the specification in the equation, scholars first need to change the specification by dropping or adding highorder covariates such as quadratic or interaction terms. After changing the specification, scholars should recalculate the propensity scores and the causal effect. Comparison of the newly calculated causal effect and the originally computed causal effect will reveal how reliable the originally computed causal effect is. This technique is similar to Dehejia and Wahbas (1999) suggestion of selecting based on observables. Selecting based on observables informs researchers whether the treatment assignment is strongly ignorable, the precondition for the PSM to produce an unbiased estimation. Table 4a shows the sensitivity analysis when I dropped higher-order pretreatment variables. By using only the observational variables, column 1 demonstrates that the estimated results of stratifying matching range from $813.20 (PSID-2) to $1,348.56 (CPS-1). Column 3 summarizes the estimated results by using the nearest neighbor technique. The lowest estimated result of the casual effect is $996.59 (PSID-2) and the highest estimated result of the causal effect is $1,855.61 (PSID-3). Column 5 reports the results of radius matching with a range from $835.68 (PSID-1) to $2,110.03 (PSID-2). In column 7 of Table 4a, the estimated ATTs range from $831.12 (PSID-1) to $1,778.12 (PSID-2). Finally, covariate adjustment shows the treatment effects ranging from $1,342.50 (CPS-1) to $2,328.20 (PSID-1). It is important to emphasize that after dropping the high-order covariates, the balancing property is not satisfied for all the matched control samples. When one lacks an unbiased estimator and multiple comparison groups, the instrumental variable (IV) method is another technique that can be used to assess the bias of the causal effects estimated by the PSM. DiPrete and Gangl (2004) argued that the IV estimation can produce a consistent and unbiased estimation of the causal effect when the IVs are appropriately chosen, but this method generally reduces the efficiency of the causal estimators and introduces some uncertainty because of its reliance on additional assumptions. Usually, for public policy studies, a grouping variable that divides samples into a number of disjointed groups can be selected as an instrumental variable.6 For example, Angrist, Imbens, and Rubin (1996) used the lottery number as the instrumental variable to estimate the causal effect of Vietnam War veteran status on mortality. The rationale behind using lottery numbers is that they correlate with the treatment variable (whether to serve in the military) because a low lottery number would potentially get called to serve in the military. On the other hand, a lottery number is a random number that does not correlate with the error term. Thus the lottery number serves as a good instrument for the endogenous variableserving in the Vietnam War. One

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Table 4a. Sensitivity Test Matching Stratified ATT 1 PSID-1 N 2 Neighbor ATT 3 N 4 257 232 229 380 297 284 Radius ATT 5 N 6 Kernel ATT 7 N 8 ATT 9 N 10 1,345 369 270 5,961 1,747 557 Covariate Adjustment

1,342.40 1,345 1,545.52 (763.09) (1,093.77) PSID-2 813.20 369 996.59 (1,081.68) (,1643.11) PSID-3 1,035.09 270 1,855.61 (1,091.28) (1,703.87) CPS-1 1,348.56 5,961 1,765.35 (651.14) (869.69) CPS-2 1,301.86 1,747 1,108.86 (714.36) (995.48) CPS-3 1,077.56 557 1,346.78 (707.68) (1,019.54) Mean 1,153.11 1,436.45 Variance 46,267.12 120,918.64

835.68 21 831.12 1,260 2,328.20 (3,877.08) (805.65) (693.69) 2,110.03 17 1,778.12 357 2,145.41 (2,999.31) (1,000.81) (1,143.55) 1,764.55 219 1,724.97 269 1,535.83 (1,269.51) (1,283.44) (1,400.24) 1,194.55 129 1,186.89 5,851 1,342.50 (1,855.94) (578.68) (470.60) 1,296.92 79 1,049.00 1,742 1,570.37 (2,341.93) (654.90) (478.94) 868.22 53 1,269.21 554 1,357.84 (2,752.29) (704.80) (685.77) 1,306.55 1,344.99 1,713.36 141,108.36 254,592.59 176,117.80

Note: All the sensitivity tests used only observational covariates: age, education, no degree (no high school degree), Black, Hispanic, RE74 (earning in 1974), and RE75 (earning in 1975). No high-order covariates are included; bootstrap with 100 replications was used to estimate standard errors for the propensity score matching; ATT: average treatment effect on treated. Standard errors in parentheses.

Table 4b. Sensitivity Test PSID-1 G 1.00 1.05 1.10 1.15 1.20 1.25 1.30 p-criticala 0.042 0.074 0.119 0.177 0.246 0.325 0.409 Lower Bound 216.997 57.226 26.215 188.640 343.541 455.599 621.988 Upper Bound 1,752.880 1,941.530 2,090.720 2,293.670 2,478.540 2,627.530 2,778.500 p-criticala 0.006 0.013 0.025 0.044 0.072 0.110 0.157 CPS-2 Lower Bound 641.387 468.296 320.627 196.642 43.579 4.340 112.684 Upper Bound 2,089.060 2,262.150 2,413.840 2,545.930 2,741.260 2,894.800 3,039.860

Note: G The odds ratio that individuals will receive treatment. a Wilcoxon signed-rank gives the significance test for upper bound.

can compare the estimate of the causal effect from the PSM with the IV estimators to determine the accuracy of the estimators calculated by the PSM. Unfortunately, the limited number of covariates in these data sets prevents me from using the IV approach to conduct the sensitivity analysis. Readers who are interested in this topic can find examples from Angrist et al. (1996) and DiPrete and Gangl (2004). Wooldridge (2002) provides further theoretical background on how IV can be used when one suspects the failure of a strongly ignorable assumption. Finally, Rosenbaum (2002, Chapter 4) proposed a bounding approach to test the existence of hidden bias, which potentially arises to make the estimated treatment effect biased. Suppose u1i and u0j are unobserved characteristics for individuals i and j in the treated group and the control group. G

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Organizational Research Methods 00(0)

refers to the effect of these unobserved variables on treatment assignment. The odds ratio that individuals receive treatment can be simply written as G exp(u1i u0j). If the unobserved variables u1i and u0j are uninformative, then the assignment process is random (G 1) and the estimated ATT and confidence intervals are unbiased. When the unobserved variables are informative, then the confidence intervals of the ATT become wider and the likelihood of finding support for the null hypothesis increases. Rosenbaum Bounding sensitivity test changes the effect of the unobserved variables on the treatment assignment to determine the end point of the significant test that leads one to accept the null hypothesis. Diprete and Gangl (2004) implemented the procedure in STATA for testing the continuous outcomes, however, their program only works for one to one matching. Becker and Caliendo (2007) also implemented this method in STATA but for testing the dichotomous outcome. Table 4b presents an example of using the RB test. The table reports only the test for PSID-1 and CPS-2 because the t-values for the ATT estimated using stratified matching show strong evidence of treatment effect. By varying the value of G, Table 4b reports the p value as well as the upper and lower bounds of the ATT. The Wilcoxon signed-rank test generates a significance test at a given level of hidden bias specified by parameter G (DiPrete & Gangl, 2004). As reported from Table 4b, the estimated ATT is very sensitive to hidden bias. As far as PSID-1 is concerned, when the critical value of G is between 1.05 and 1.10 (the unobserved variables cause the odds ratio of being assigned to the treated group or the control group to be about 1.10), one needs to question the conclusion of the positive effect of training on salary in the year 1978. In regards to the CPS-2 sample, when the critical value of G is between 1.20 and 1.25, one should question the positive effect of training on future salary. Yet, a value for G of 1.25 in CPS-2 does not mean that one will not observe the positive effect of training on future earnings; it only means that when unobserved variables determine the treatment assignment by a ratio of 1.25, it will be so strong that the salary effect would include zero and that unobserved covariates almost perfectly determine the future salary in each matched case. RB presents a worst-case scenario that assumes treatment assignment is influenced by unobserved covariates. This sensitivity test conveys important information about how the level of uncertainty involved in matching estimators will undermine the conclusions of matched sampling analyses. The simple test in Table 4b generally reveals that the causal effect of training is very sensitive to hidden biases that could influence the odds of treatment assignment.

Future Applications of the Propensity Score Method

To my knowledge, no publications in the management field have implemented the PSM in an empirical setting, yet other social science fields have empirically applied the PSM. Thus, before offering suggestions for applying the PSM to the field, I will provide an overview of how scholars in relevant social science fields (e.g., economics, finance, and sociology) employ the PSM in their empirical studies. Most applications of the PSM come from the evaluation of public policy by economists (e.g., Dehejia & Wahba, 1999; Lechner, 2002). Early implementation of the PSM intended to examine whether this technique effectively reduces bias stemming from the heterogeneity of participants. Economists generally agreed that the PSM is appropriate for examining causal effects using observational data. Recent application by Couch and Placzek (2010), for example, used the PSM to calculate the ATT without any concern regarding the legitimacy of the technique. Combining the PSM and the average difference-in-difference approaches, Couch and Placzek (2010) found that mass layoff decreased earnings at 33%. To provide a concise overview of the PSM in other social science fields, I conducted a Web of Science search calling up articles that cited Rosenbaum and Rubins 1983 paper. Because most citations came from health-related fields, I limited the search to fields such as economics, sociology, and business finance that are relevant to management. Overall, in early 2012, I found 674 articles in these three fields that have cited Rosenbaum and Rubins article. Fewer than 100 articles were

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

published before 2002, yet around 300 articles were published between 2009 and 2011. I first randomly selected one to two empirical studies from these top economics journals: American Economic Review, Econometrica, Quarterly Journal of Economics, and Review of Economic Studies. I then randomly selected one to two empirical articles from two top sociology journals: American Journal of Sociology and American Sociological Review. I finally randomly selected one to two studies from three top financial journals: Journal of Finance, Journal of Financial Economics, and Review of Financial Studies. Table 5 summarizes the data, analytical techniques, and key findings of these empirical articles employing the PSM in their fields. Given that management scholars have relied on observational data sets, using the PSM will be fundamentally helpful in discovering the effectiveness of management interventions, including areas such as strategy, entrepreneurship, and human resource management. For strategy scholars, future research can use the PSM to examine whether firms that adopt long-term incentive plans (e.g., stock options and stock ownership) can increase overall performance. Apparently, the data used in this type of study are not experimental. Future research can use the PSM to adjust the distribution between firms using long-term incentive policies and ones that have not adopted such policies. Indeed, the PSM can be widely used by strategy scholars who want to examine the outcomes of certain strategies. For example, one can examine whether duality (the practice of the CEO also being the Chairman of the Board) has real implications for stock price and longterm performance. The PSM can also be used in entrepreneurship research. Wasserman (2003) documented the paradox of success in that founders were more likely to be replaced by professional managers when founders led firms to an important breakthrough (e.g., the receipt of additional funding from an external resource). Future research can further explore this question by investigating which types of funding lead to turnover in the top management team in newly founded firms. For example, scholars can examine whether funding received from venture capitalists (VCs) has a different effect on executive turnover than that obtained from a Small Business Innovative Research (SBIR) program. Similarly, using the PSM, scholars can examine how other interventions, such as a business plan, can affect entrepreneurial performance. Like strategy scholars, entrepreneurship researchers can implement the PSM in many other questions. The PSM can also be widely implemented by strategic human resource management (SHRM) scholars. A major interest in SHRM literature is whether HR practices contribute to firm performance. One can implement the PSM to investigate whether HR practices (e.g., downsizing) contribute to firm performance. When the strongly ignorable assumption is satisfied, the PSM provides an opportunity for HR scholars to document a less biased effect size between HR practices and firm performance. HR researchers can adjust the distributions of the observational variables and then estimate the ATT of the HR practices on firm performance. In conclusion, the PSM is an effective technique for scholars to reconstruct counterfactuals using observational data sets.

Discussion
Research in other academic fields has documented the effectiveness of the PSM. Yet, like other methods, the PSM has its strength and weakness. The first advantage in using the PSM is that it simplifies the matching procedure. The PSM can reduce k-dimension observable variables into one dimension. Therefore, scholars can match observational data sets with k-dimensional covariates without sacrificing many observations or worrying about computational complexity. Second, the PSM eliminates two sources of bias (Heckman et al., 1998): bias from nonoverlapping supports and bias from different density weighting. The PSM increases the likelihood of achieving distribution overlap between the treated and control groups. Moreover, this technique reweights nonparticipant

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

24 Data Because of the nonrandom selection issues in the labor market, the propensity score matching technique and instrumental variables were used to examine the voluntary military service on earnings. Analytical Technique Key Findings Soldiers serving in the military in the early 1980s were paid more than comparable civilians. Military service increased the employment rate for veterans after service. Military service led to only a modest long-run increase in earnings for non-White veterans, but reduced the civilian earnings of White veterans. Credit constrained firms burned more cash, sold more assets to fund their operation, drew more heavily on lines of credit, and planned deeper cuts in spending. In addition, inability to borrow forced many firms to bypass lucrative investment opportunities. CFOs were asked to report whether their firms were credit constrained or not. Demographics of asset size, ownership form, and credit ratings were used to predict propensity scores. Average treatment effects of constrained credit were estimated by comparing the difference of spending between constrained and unconstrained firms. Propensity score matching on observable variables was used to reduce individual heterogeneity. Propensity score estimators calculating average treatment effects on treated (ATT) and the average difference-indifference showed that earning losses were 33% at the time of mass layoff and 12% 6 years later. (continued)

Table 5. Empirical Studies Applying the Propensity Score Method (PSM)

Author(s)

Angrist (1998), Econometrica

Military data come from Defense Manpower Data Center. Earnings data come from Social Security Administration.

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Campello, Graham, and Harvey (2010), Journal of Financial Economics

1,050 Chief Financial Officers (CFOs) were surveyed

Couch and Placzek (2010), American Economic Review

State administrative files from Connecticut

Table 5. (continued) Data Propensity score matching was used to match non-current loans to currents loans. Propensity score is calculated using observational variables including credit rating, firm industry, and other variables. Analytical Technique Key Findings

Author(s)

Drucker and Puri (2005), Journal of Finance

They combined data sets from multiple databases. They collected data on seasoned equity issuers, including credit rating, stock return, lending history, and insurance history.

Frank, Akresh, and Lu (2010), American Sociological Review

Data were collected from New Immigrant Survey with around 1,000 cases.

They used ordinal logistic model to calculate propensity scores, which were used to estimate the effect of skin color on earnings.

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Gangl (2006), American Sociological Review

Survey of Income and Program Participation (SIPP) and European Community Household Panel (ECHP)

Difference-in-difference propensity score matching

Grodsky (2007), American Journal of Sociology

Data came from a number of sources, including the representative samples of students who completed high school in 1972, 1982, and 1992.

In the first stage, propensity score was used to adjust for selection on observational variables. In the second stage, the author examined the type of college a student will attend controlling for propensity scores.

Overall, underwriters (commercial banks and investment banks) engaged in concurrent lending and provide discounts. In addition, concurrent lending helped underwriters build relationships, which help underwriters increase the probability of receiving current and future business. They found an average difference of $2,435.63 difference between lighter and darker skinned individuals. In other words, darker skin individuals earn around $2,500 less per year than counterparts. Gangl found strong evidence that postunemployment losses are largely permanent, and such effect is particularly significant for older and high-wage workers as well as for female employees. The author found the evidence that a wide range of institutions engage in affirmative action for African American students as well as for Hispanic students. (continued)

26 Data Propensity matching, nonparametric conditional difference-in-difference Analytical Technique Key Findings Multinomial model was used to estimate propensity scores of discrete choices (basic training, further training, employment program, and temporary wage subsidy). After decomposing program evaluation bias into a number of components, it was found that selection bias due to unobservable variable is less important than other components. Matching technique can potentially eliminate much of the bias. The empirical evidence revealed support for the fact that the propensity score matching can be an informative tool to adjust for individual heterogeneity when individuals have multiple programs to be selected. They found that award-winning CEOs underperform over the 3 years following the award: Relative underperformance is between 15% to 26%. Specialist CEOs, defined as CEOs who have promoted from a certain divisions of their firm, negatively affect segment investment efficiency. They used propensity score matching to create counterfactual sample for nonwinning CEOs. Nearest neighbor matching technique, both with and without bias adjustment, was used to identify the counterfactual sample. Ordinary least square was used as the major technique. The propensity score method was used as robust check to address the issue of endogenous selection of CEO.

Table 5. (continued)

Author(s)

Heckman, Ichimura, and Todd (1997), Review of Economic Studies

The National Job Training Partnership Act (JTPA) and Survey of Income and Program Participation (SIPP)

Lechner (2002), Review of Economic Studies

Unemployed individuals in Zurich, a region of Switzerland, in periods 1997-1999.

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Malmendier and Tate (2009), Quarterly Journal of Economics

Hand-collected list of the winners of CEO awards between 1975 and 2002

Xuan (2009), Review of Financial Studies

S&Ps executive compensation between 1993 and 2002

data to obtain equal distribution between the treated and control groups. Third, if treatment assignment is strongly ignorable, scholars can use the PSM on observational data sets to estimate an ATT that is reasonably close to the ATT calculated from experiments. Fourth, the matching technique, by its nature, is nonparametric. Like other nonparametric approaches, this technique will not suffer from problems that are prevalent in most parametric models, such as the assumption of distribution. It generally outperforms simple regression analysis when the true functional form for the regression is nonlinear (Morgan & Harding, 2006). Finally, the PSM is an intuitively sounder method for dealing with covariates than is traditional regression analysis. For example, the idea that covariates in both the treated group and the control group have the same distributions is much easier to understand than the interpretation using control all other variables at mean or ceteris paribus. Even for regression, without appropriately adjusting for the covariate distribution, one can get an ATT with the regression technique despite the fact that no meaningful ATT exists. Despite its many advantages, the PSM also has its limitations. Like other nonparametric techniques, the PSM generally has no test statistics. Although the bootstrap technique can be used to estimate the variance, such techniques are not fully justified or widely accepted by researchers (Imbens, 2004). Hence, the use of the PSM may be limited because while it can help scholars draw causal inferences, it cannot help with drawing statistical inferences. Another key hurdle of this method is that there are currently no established procedures to investigate whether treatment assignment is strongly ignorable. Heckman et al. (1998) demonstrated that the PSM cannot eliminate bias due to unobservable differences across groups. The PSM can reweight observational covariates, but it cannot deal with unobservable variables. Some unobservable variables (e.g., environmental context, region) can increase the bias of the ATT estimated using the PSM. Third, even when the treatment assignment is strongly ignorable, the accuracy of the ATT estimated by the PSM depends on the quality of the observational data. Thus, measurement error (cf. Gerhart, Wright, & McMahan, 2000) and nonrandom missing values can affect the estimated ATT. Finally, although there are a few propensity score matching techniques, one can find little guidance on which types of matching techniques work best for different applications. Overall, despite its shortcomings, the PSM can be employed by management scholars to investigate the ATT of management interventions. Appropriately used, the PSM can eliminate bias due to nonoverlapping distributions between the treatment and the control groups. The PSM can also reduce the problem of unfair comparison. However, scholars must be careful about the quality of the data because the effectiveness of the PSM depends on the observational covariates. Research using objective measures will be an optimal setting for using the PSM. In empirical settings with low quality data, scholars can implement nonparametric PSM as a robust test to justify the parametric findings generated by traditional econometric models. To draw meaningful and honest causal inferences, one must appropriately choose the technique that works best for testing the causal relationship. When one has collected panel data and believes that omitted variable is time-invariant, then the fixed effects model is the best choice for estimating bias due to an omitted variable (Allison, 2009; Beck et al., 2008). When one finds one or more perfect instrumental variables, using two-stage least-squares (2SLS) can also address the bias of causal effects calculated through conventional regression techniques. When the endogenous variable suffers only from measurement error and when one knows the reliability coefficient, one can use regression analysis and correct the bias using the reliability coefficient. Almost no technique is perfect in drawing an unbiased causal inference, including experimental design. Heckman and Vytlacil (2007) remarked that explicitly manipulating treatment assignment cannot always represent the real-world problem because experimentation naturally discards information contained in a real-world context that includes dropout, self-selection, and noncompliance. Sometimes a combination of techniques is also recommended. For example, to alleviate the extrapolation bias in the regression models Imbens and Wooldridge (2009) recommend using matching to

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Organizational Research Methods 00(0)

generate a balanced sample. Similarly, Rosenbaum and Rubin (1983) suggested that differences due to unobserved heterogeneity should be addressed after balancing the observed covariates. Additionally, the PSM can also be incorporated in studies using the longitudinal design. Readers who are interested in estimating the ATT using longitudinal data can also refer to the nonparametric conditional difference-in-difference model (Heckman et al., 1997) and the semiparametric conditional difference-in-difference model (Heckman et al., 1998). To conclude, to draw the best causal inference, one needs to choose the appropriate methods. Of various techniques, the PSM should be a potential choice.

Conclusion
The purpose of this article is to introduce the PSM to the management field. This article makes several contributions to organizational research methods literature. First, it not only advances management scholars understanding of a neglected method to estimate causal effects, but also discusses some of the techniques limitations. Second, by integrating previous work on the PSM, it provides a step-by-step flowchart that management scholars can easily implement in their empirical studies. The attached data set with SPSS and STATA stratified matching codes help management scholars to calculate the ATT. Readers can make context-dependent decisions and choose a matching algorithm that is most beneficial for their objectives. Finally, a brief review of the applications of the PSM in other social science fields and a discussion of potential usage of the PSM in the management field provides an overview of how management scholars can employ the PSM in future empirical studies.

Appendix A Boosted Regression

Boosted regression (or boosting) is a general, automated, data-mining technique that has shown considerable success in using a large number of covariates to predict treatment assignment and fit a nonlinear surface (McCaffrey, Ridgeway, & Morral, 2004). Boosting relies on a regression tree using a recursive algorithm to estimate the function that describes the relationship between a set of covariates and the dependent variable. The regression tree begins with a complete data set and then partitions the data set into two regions by a series of if-then statements (Schonlau, 2005). For example, if age and race are covariates, the algorithm can first split the data set into two regions based on the condition of either of these two variables. The splitting algorithm continues recursively until the regression tree reaches the allowable number of splits. Friedman (2001) has shown that boosted regression outperforms other methods in reducing prediction error. McCaffrey et al. (2004) summarized three important advantages of the boosting technique. First, regression trees are easy and fast to fit. Second, regression trees can handle different types of covariates including continuous, nominal, ordinal, and missing variables. When boosted logistic regression is used to predict propensity scores, the use of different forms of covariates generally produces exactly the same propensity score adjustment. Finally, the boosting technique is capable of handling many covariates, even those unrelated to treatment assignment or correlated with one another. Schonlau (2005) listed factors that favor the use of the boosting technique. These factors include a large data set, suspected nonlinearities, more variables than observations, suspected interactions, correlated data, and ordered categorical covariates. He concludes that the boosting technique does not require scholars to specify interactions and nonlinearities. Thus, the boosting technique can simplify the procedure of computing propensity scores by reducing the burden of adding high-order covariates such as interactions.

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Appendix B. A Small Data Set for Manually Calculating Average Treatment Effect on the Treated Group (ATT)
Step 2 Block ID 1.32 0.167 49,237.660 31,401.687 2 5 18,338.4926 Tpscore Tage YiT YjC NqT NqC ATTq15 Weight 0.08 Step 3: Estimate Causal Effect ATT Weight 1,467.079

Step 1

Case ID

Outcome

Treatment

Age

PScore

1.00

0.136

42,465.315

44,775.121

1,661.30455

0.16

265.809

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

1.86

0.025

33,313.704

39,898.620

348.702

0.16

55.792

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 0.80 1.561 21,607.032 31,978.005 6 3 7,058.163

1,0048.54 0 2,0688.17 0 664.977 36,646.95 12,590.71 24,642.57 10,344.09 9,788.461 0 0 13,167.52 4,321.705 12,558.02 12,418.07 0 17,732.72 4,433.18 0 17,732.72 7,284.986 5,522.788 20,505.93 0 2,364.363 22,165.9 7,447.742 2,164.022 11,141.39 3,462.564 559.443 4,279.613 0

0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 1 0 0 0 0 0 1 1 1 1 0 0 0 1 1 1 1 1 1

50 19 26 44 39 35 33 32 44 41 33 20 22 26 46 46 40 26 30 21 20 41 17 24 27 41 23 24 21 23 29 20 19 23

0.076 0.109 0.128 0.14 0.177 0.075 0.101 0.265 0.268 0.282 0.313 0.365 0.261 0.312 0.361 0.392 0.412 0.456 0.481 0.513 0.558 0.511 0.525 0.547 0.59 0.678 0.727 0.746 0.654 0.739 0.758 0.759 0.764 0.768

1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4

0.24

1,693.959

(continued)

30
Step 2 Block ID 1.23 0.552 65,459.109 5,615.361 9 2 Tpscore Tage YiT YjC NqT NqC Step 3: Estimate Causal Effect ATTq15 4,465.553833 Weight 0.36 ATT Weight 1,607.599 0.923 0.954 0.913 0.948 0.954 0.959 0.961 0.965 0.966 0.97 0.987 0.001 0.003 0.009 0.013 0.016 5 5 5 5 5 5 5 5 5 5 5 Unmatched cases ATT 1,702.321

Appendix B. (continued)

Step 1

Case ID

Outcome

Treatment

Age

PScore

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50

5,615.361 0 13,385.86 8,472.158 0 6,181.88 289.79 17,814.98 9,265.788 1,923.938 8,124.715 11,821.81 24,825.81 33,987.71 33,987.71 54,675.88

0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0

28 23 18 27 18 17 21 37 17 25 25 53 52 28 41 38

Note: PScore propensity scores; Tage/Tpscore t-test for age and propensity scores in each balanced block; YiT summation of outcome variable for treated cases in each block; YiC summation of outcome variable for control cases in each block; NqT total number of treated cases in each block; NqC total number of control cases in each block; ATTq15 YiT/NqT YiC/NqC; average treatment effect for each balanced block; weight total number of treated cases in each block divided by total number of treated cases in the sample.

Appendix C SPSS Code for Stratified Matching

*Step 1: Calculate propensity score. LOGISTIC REGRESSION VARIABLES TREATMENT /METHODENTER X1 X2 X3 /SAVEPRED /CRITERIAPIN(.05) POUT(.10) ITERATE(20) CUT(.5). RENAME VARIABLES (PRE_1pscore).

The above code calculates predicted probability using a number of observation variables (e.g. X1, X2, and X3). Readers can change their variables correspondingly.
*Step 2: Stratify into five blocks. compute blockid. if (pscore< .2) & (pscore > .05) blockid1. if (pscore< .4) & (pscore > .2) blockid2. if (pscore< .6) & (pscore > .4) blockid3. if (pscore< .8) & (pscore > .6) blockid4. if ( pscore > .8) blockid5. execute. *Perform t test for each block. *Split file first, and then excute t test. SORT CASES BY blockid. SPLIT FILE SEPARATE BY blockid. T-TEST GROUPStreatment(0 1) /MISSINGANALYSIS /VARIABLESage pscore /CRITERIACI(.95).

The above code first stratifies variables into five blocks, and then carries on the t-test for each of the blocks. SPSS has no if option for t-test, thus it is important to split the data based on block ID, and then conduct the t-test.
*Step 3: Perform Stratification Matching Procedure. *Caclulate YiT and YjC in Appendix B. AGGREGATE /OUTFILE* MODEADDVARIABLES /BREAKblockid treatment /outcome_sumSUM(outcome). *Calculate NqT and NqC in Appendix B. AGGREGATE /OUTFILE* MODEADDVARIABLES /BREAKblockid treatment /N_BREAKN. *Calculate total number of treatment cases. AGGREGATE /OUTFILE* MODEADDVARIABLES /BREAK /N_Treatmentsum(treatment).

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

32
COMPUTE ATTQoutcome_sum/N_BREAK. EXECUTE. DATASET DECLARE agg_all. AGGREGATE /OUTFILEagg_all /BREAKtreatment blockid /N_Block_TMEAN(N_BREAK) /ATTQ_TMEAN(ATTQ) /N_TreatmentMEAN(N_Treatment). DATASET ACTIVATE agg_all. DATASET COPY agg_treat. DATASET ACTIVATE agg_treat. FILTER OFF. USE ALL. SELECT IF (treatment 1). EXECUTE. DATASET ACTIVATE agg_all. DATASET COPY agg_control. DATASET ACTIVATE agg_control. FILTER OFF. USE ALL. SELECT IF (treatment0&blockid<6). EXECUTE. DATASET ACTIVATE agg_control.

Organizational Research Methods 00(0)

RENAME VARIABLES (N_Block_T ATTQ_T N_Block_C ATTQ_C ). MATCH FILES /FILE* /FILEagg_treat /RENAME (blockid N_Treatment treatment d0 d1 d2) /DROP d0 d1 d2. EXECUTE. COMPUTE ATTQATTQ_T-ATTQ_C. EXECUTE. COMPUTE weightN_Block_T/N_Treatment. EXECUTE. COMPUTE ATTxweightATTQ*weight. EXECUTE. AGGREGATE /OUTFILE* MODEADDVARIABLES OVERWRITEVARSYES /BREAK /ATTxweight_sumSUM(ATTxweight). DATASET CLOSE agg_all. DATASET CLOSE agg_control. DATASET CLOSE agg_treat.

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

This step computes each of the components in Equation 2.2. For example, it first calculates the number of treated cases and the number of control cases in each matched block. Then, it also gauges the summation of outcome in each balanced blocks. The code then extracts each of the necessary components into two different data sets: agg_control and agg_treat. Finally, the code matches these two data sets based on block ID and estimates the ATT. The final result will be displayed in the variable called ATTxweight.

Appendix D STATA Code for Stratified Matching

*STEP 1: get the propensity scores using logistical regression *Choose covariates appropriately logit treatment X1 X2 X3 *Calculate propensity scores predict pscore, p *STEP 2: subclassification gen blockid. replace blockid1 if replace blockid2 if replace blockid3 if replace blockid4 if replace blockid5 if

pscore<.2 pscore<.4 pscore<.6 pscore<.8 pscore>.8

& & & &

pscore>.05 pscore> .2 pscore> .4 pscore> .6

*STEP 2: t test for balance in each block foreach var of varlist age pscore f forvalues i1/5 f ttest var if blockid i, by(treatment) g g *STEP 3: Estimate causal effects using stratified matching sort blockid treatment gen YTQ. *Yic in Appendix B table gen TTN1 *Nqt in Appendix B table gen YCQ. *Yjc in Appendix B table gen TCN1 *Nqc in Appendix B table forvalues i1/5 f *Get sum for outcome in each treated block sum outcome if treatment1 & blockidi replace YTQr(sum) if blockidi *Number of treated cases in each block sum TTN if treatment1 & blockidi replace TTNr(sum) if blockidi *Get sum for outcome in each control block sum outcome if treatment0 & blockidi replace YCQr(sum) if blockidi *Number of treated cases in each block sum TCN if treatment0 & blockidi

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Organizational Research Methods 00(0)

replace TCNr(sum) if blockidi g gen ATTQYTQ/TTN-YCQ/TCN *Weights for ATT sum treatment gen WTTN/r(sum) *Weighted ATT gen ATTATTQ*W bysort blockid: gen id_n sum ATT if id1 display "The ATT is r(sum)"

Appendix E. Software Packages for Applying the Propensity Score Method (PSM)
Environment Software Name R Matching Authors Sekhon (2007) Function and Download Sources Relies on an automated procedure to detect matches based on a number of univariate and multivariate metrics. It performs propensity matching, primarily 1:M matching. The package also allows matching with and without replacement. Download source: http://sekhon.berkeley.edu/matching/ Document: http://cran.r-project.org/web/packages/ Matching/Matching.pdf Provides enriched graphical tools to test within strata balance. It also provides graphical tools to detect covariate distributions across strata. Download source: http://cran.r-project.org/web/packages/ PSAgraphics/index.html Includes propensity score estimating and weighting. Generalized boosted regression is used to estimate propensity scores thus simplifying the procedure to estimate propensity scores. Download source: http://cran.r-project.org/web/packages/ twang/index.html Performs 1:1 nearest neighbor matching. Download source: http:// mayoresearch.mayo.edu/mayo/research/ biostat/upload/gmatch.sas Allows users to specify the propensity score matching from 1:1 or 1:M. Download source: http://www2.sas.com/proceedings/sugi29/ 165-29.pdf (continued)

PSAgraphics

Helmreich and Pruzek (2009)

Twang

Ridgeway, McCaffrey, and Morral (2006)

SAS

Greedy matching

Kosanke and Bergstralh (2004)

OneToManyMTCH

Parsons (2004)

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Appendix E. (continued) Environment Software Name SPSS SPSS Macro for P score matching Authors Painter (2004) Function and Download Sources

STATA

Pscore

Psmatch2

Performs nearest neighbor propensity score matching. It seems to solely do 1:1 matching without replacement. Download source: http://www.unc.edu/*painter/SPSSsyntax/ propen.txt Becker and Ichino (2002) Estimates propensity scores and conducts a number of matching such as radius, nearest neighbor, kernel, and stratified. Download source: http://www.lrz.de/*sobecker/pscore.html Leuven and Sianesi Allows a number of matching procedures, (2003) including kernel matching and k:1 matching. It also supports common support graphs and balance testing. Download source: http://ideas.repec.org/c/boc/bocode/ s432001.html

Acknowledgments
Special thanks to Barry Gerhart for his invaluable support and to Associate Editor James LeBreton and anonymous reviewers for their constructive feedbacks. This article has also benefited from suggestions by Russ Coff, Jose Cortina, Cindy Devers, Jon Eckhardt, Phil Kim, and seminar participants at 2011 AOM conference.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.

Notes
1. Harder, Stuart, and Anthony (2010) argued that propensity score method (PSM) can be used to estimate the average treatment effect on the treated group (ATT), and subclassifying the propensity score can be used to calculate the average treatment effect (ATE). However, economists typically viewed the PSM as a technique to estimate the ATT (Dehejia & Wahba, 1999, 2002). Following Dehejia and Wahba (1999, 2002), the remaining section regards the PSM as a way to calculate the ATT. The remaining sections use causal effects, treatment effects, and ATT interchangeably. 2. Psychology scholars also extended this to develop the causal steps approach to draw mediating causal inference (e.g., Baron & Kenny, 1986). It is beyond the scope of this article to fully discuss mediation. Interested readers can read LeBreton, Wu, and Bing (2008) and Wood, Goodman, Beckmann, and Cook (2008) for surveys. 3. Becker and Ichino (2002) have written a nice STATA program (pscore) to estimate the propensity score. The convenience of using pscore is that the program can stratify propensity scores to a specified number of blocks and test the balance of propensity scores in each block. However, when there is more than one treatment, it is inappropriate to use pscore to estimate the propensity score.

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Organizational Research Methods 00(0)

4. Propensity score matching is one technique of many matched sampling technique. One can use exact matching simply based on one or more covariates. For example, scholars may match sample based on standard industry classification (SIC) and firm size rather than matching using propensity scores. 5. These components are: the variance of the covariates in the control groups, the variance of the covariates in the treated groups, the mean of the covariates in the control groups, the mean of the covariates in the treated groups, and the estimated propensity score. The variance of the covariates in the treated and the control groups are weighted by the propensity score. 6. Instrumental variable (IV) is typically used by scholars under the condition of simultaneity. Because of the difficulty in finding an IV, it is not viewed as a general remedy for endogeneity issues.

References
Allison, P. (2009). Fixed effects regression models. Newbury Park, CA: Sage. Angrist, J. (1998). Estimating the labor market impact of voluntary military service using social security data on military applicants. Econometrica, 66, 249-288. Angrist, J. D., Imbens, G. W., & Rubin, D. B. (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 9, 444-455. Antonakis, J., Bendahan, S., Jacquart, P., & Lalive, R. (2010). On making causal claims: A review and recommendations. The Leadership Quarterly, 21(6), 1086-1120. Arceneaux, K., Gerber, A., & Green, D. (2006). Comparing experimental and matching methods using a largescale voter mobilization experiment. Political Analysis, 14, 1-26. Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173-1182. Beck, N., Bruderl, J., & Woywode, M. (2008). Momentum or deceleration? Theoretical and methodolo gical reflections on the analysis of organizational change. Academyof Management Journal, 51(3), 413-435. Becker, S., & Caliendo, M. (2007). Sensitivity analysis for average treatment effects. Stata Journal, 7(1), 71-83. Becker, S., & Ichino, A. (2002). Estimation of average treatment effects based on propensity scores. The Stata Journal, 2, 358-377. Berk, R. A. (1983). An introduction to sample selection bias in sociological data. American Sociological Review, 48(3), 386-398. Campello, M., Graham, J., & Harvey, C. (2010). The real effects of financial constraints: Evidence from a financial crisis. Journal of Financial Economics, 97, 470-487. Cochran, W. (1957). Analysis of covariance: Its nature and uses. Biometrics, 13(3), 261-281. Cochran, W. (1968). The effectiveness of adjustment by subclassification in removing bias in observational studies. Biometrics, 24, 295-313. Couch, K. A., & Placzek, D. W. (2010). Earnings losses of displaced workers revisited. American Economic Review, 100, 572-589. Cox, D. (1992). Causality: Some statistical aspects. Journal of the Royal Statistical Society, Series A (Statistics in Society), 155, 291-301. Dehejia, R., & Wahba, S. (1999). Causal effects in nonexperimental studies: Reevaluating the evaluation of training programs. Journal of the American Statistical Association, 94, 1053-1062. Dehejia, R., & Wahba, S. (2002). Propensity score-matching methods for nonexperimental causal studies. Review of Economics and Statistics, 84, 151-161. DiPrete, T. A., & Gangl, M. (2004). Assessing bias in the estimation of causal effects: Rosenbaum bounds on matching estimators and instrumental variables estimation with imperfect instruments. Sociological Methodology, 34, 271-310. Duncan, O. D. (1975). Introduction to structural equation models. San Diego, CA: Academic Press.

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Drucker, S., & Puri, M. (2005). On the benefits of concurrent lending and underwriting. Journal of Finance, 60(6), 2763-2799. Efron, B., & Tibshirani, R. (1997). An introduction to the bootstrap. London: Chapman & Hall. Frank, R., Akresh, I. R., & Lu, B. (2010). Latino Immigrants and the US racial order: How and where do they fit in? American Sociological Review, 75(3), 378-401. Friedman, J. H. (2001). Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29, 1189-1232. Gangl, M. (2006). Scar effects of unemployment: An assessment of institutional complementarities. American Sociological Review, 71(6), 986-1013. Gerhart, B. (2007). Modeling human resource management and performance linkages. In P. Boxall, J. Purcell, & P. Wright (Eds.), The Oxford handbook of human resource management (pp. 552-580). Oxford: Oxford University Press. Gerhart, B., Wright, P., & McMahan, G. (2000). Measurement error in research on the human resources and firm performance relationship: Further evidence and analysis. Personnel Psychology, 53, 855-872. Greene, W. (2008). Econometric analysis (6th ed.). Upper Saddle River, NJ: Prentice Hall. Grodsky, E. (2007). Compensatory sponsorship in higher education. American Journal of Sociology, 112(6), 1662-1712. Gu, X., & Rosenbaum, P. (1993). Comparison of multivariate matching methods: Structures, distances, and algorithms. Journal of Computational and Graphical Statistics, 2, 405-420. Harder, V. S., Stuart, E. A., & Anthony, J. C. (2010). Propensity score techniques and the assessment of measured covariate balance to test causal associations in psychological research. Psychological Methods, 15, 234-249. Hamilton, B. H., & Nickerson, J. A. (2003). Correcting for endogeneity in strategic management research. Strategic Organization, 1, 51-78. Heckman, J. (1979). Sample selection bias as a specification error. Econometrica, 47, 153-161. Heckman, J., & Hotz, V. (1989). Choosing among alternative nonexperimental methods for estimating the impact of social programs: The case of manpower training. Journal of the American Statistical Association, 84, 862-874. Heckman, J., Ichimura, H., Smith, J., & Todd, P. (1998). Characterizing selection bias using experimental data. Econometrica, 66, 1017-1098. Heckman, J., Ichimura, H., & Todd, P. E. (1997). Matching as an econometric evaluation estimator: Evidence from evaluating job training program. Review of Economic Studies, 64, 605-654. Heckman, J. J., & Vytlacil, E. J. (2007). Econometric evaluation of social programs, part II: Using the marginal treatment effect to organize alternative econometric estimators to evaluate social programs, and to forecast their effects in new environments. Handbook of Econometrics, 6, 4875-5143. Helmreich, J. E., & Pruzek, R. M. (2009). PSAgraphics: An R package to support propensity score analysis. Journal of Statistical Software, 29, 1-23. Hoetker, G. (2007). The use of logit and probit models in strategic management research: Critical issues. Strategic Management Journal, 28(4), 331-343. Imbens, G. (2000). The role of the propensity score in estimating dose-response functions. Biometrika, 87(3), 706-710. Imbens, G. W. (2004). Nonparametric estimation of average treatment effects under exogeneity: A review. The Review of Economics and Statistics, 86, 4-29. Imbens, G. W., & Wooldridge, J. M. (2009). Recent developments in the econometrics of program evaluation. Journal of Economic Literature, 47(1), 5-86. James, L. R. (1980). The unmeasured variables problem in path analysis. Journal of Applied Psychology, 65(4), 415-421. James, L. R., Mulaik, S. A., & Brett, J. M. (1982). Causal analysis: Assumptions, models, and data. Thousand Oaks, CA: Sage.

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Organizational Research Methods 00(0)

Joffe, M. M., & Rosenbaum, P. R. (1999). Invited commentary: Propensity scores. American Journal of Epidemiology, 150, 327-333. King, G., Keohane, R. O., & Verba, S. (1994). Designing social inquiry: Scientific inference in qualitative research. Princeton, NJ: Princeton University Press. Kosanke, J., & Bergstralh, E. (2004). gmatch: Match 1 or more controls to cases using the GREEDY algorithm. Retrieved from http://mayoresearch.mayo.edu/mayo/research/biostat/upload/gmatch.sas (accessed May 15, 2012) Lalonde, R. J. (1986). Evaluating the econometric evaluations of training programs with experimental data. American Economic Review, 76, 604-620. LeBreton, J. M., Wu, J., & Bing, M. N. (2008). The truth(s) on testing for mediation in the social and organizational sciences. In C. E. Lance, & R. J. Vandenberg (Eds.), Statistical and methodological myths and urban legends (pp. 107-140). New York, NY: Routledge. Lechner, M. (2002). Program heterogeneity and propensity score matching: An application to the evaluation of active labor market policies. Review of Economics and Statistics, 84, 205-220. Leuven, E., & Sianesi, B. (2003). PSMATCH2: Stata module to perform full Mahalanobis and propensity score matching, common support graphing, and covariate imbalance testing [Statistical software components]. Boston, MA: Boston College. Li, Y., Propert, K., & Rosenbaum, P. (2001). Balanced risk set matching. Journal of the American Statistical Association, 96, 870-882. Long, J. S. (1997). Regression models for categorical and limited dependent variables. Thousand Oaks, CA: Sage. Malmendier, U., & Tate, G. (2009). Superstar CEOs. The Quarterly Journal of Economics, 124(4), 1593-1638. McCaffrey, D. F., Ridgeway, G., & Morral, A. R. (2004). Propensity score estimation with boosted regression for evaluating causal effects in observational studies. Psychological Methods, 9, 403-425. Mellor, S., & Mark, M. M. (1998). A quasi-experimental design for studies on the impact of administrative decisions: Applications and extensions of the regression-discontinuity design. Organizational Research Methods, 1(3), 315-333. Morgan, S. L., & Harding, D. J. (2006). Matching estimators of causal effectsProspects and pitfalls in theory and practice. Sociological Methods & Research, 35, 3-60. Morgan, S. L., & Winship, C. (2007). Counterfactuals and causal inference: Methods and principles for social research. Cambridge, UK: Cambridge University Press. Painter, J. (2004). SPSS Syntax for nearest neighbor propensity score matching. Retrieved from http://www. unc.edu/~painter/SPSSsyntax/propen.txt (accessed May 15, 2012) Parsons, L. (2004). Performing a 1: N case-control match on propensity score. Proceedings of the 29th Annual SAS Users Group International Conference, SAS Institute, Montreal, Canada. Ridgeway, G., McCaffrey, D., & Morral, A. (2006). Toolkit for weighting and analysis of nonequivalent groups: A tutorial for the twang package. Santa Monica, CA: RAND Corporation. Rosenbaum, P. (1987). The role of a second control group in an observational study. Statistical Science, 2, 292-306. Rosenbaum, P. (2002). Observational studies. New York, NY: Springer-Verlag. Rosenbaum, P. (2004). Matching in observational studies. In A. Gelman & X. Meng (Eds.), Applied Bayesian modeling and causal inference from an incomplete-data perspective (pp. 15-24). New York, NY: Wiley. Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of propensity score in observational studies for causal effects. Biometrika, 70, 41-55. Rosenbaum, P., & Rubin, D. (1984). Reducing bias in observational studies using subclassification on the propensity score. Journal of the American Statistical Association, 79, 516-524. Rosenbaum, P., & Rubin, D. (1985). Constructing a control group using multivariate matched sampling methods that incorporate the propensity score. American Statistician, 39, 33-38.

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

Rousseau, D. (2006). Is there such a thing as evidence-based management. Academy of Management Review, 31, 256-269. Rubin, D. (1997). Estimating causal effects from large data sets using propensity scores. Annals of Internal Medicine, 127, 757-763. Rubin, D. (2001). Using propensity scores to help design observational studies: Application to the tobacco litigation. Health Services and Outcomes Research Methodology, 2(3), 169-188. Rubin, D. (2004). Teaching statistical inference for causal effects in experiments and observational studies. Journal of Educational and Behavioral Statistics, 29, 343-367. Rynes, S., Giluk, T., & Brown, K. (2007). The very separate worlds of academic and practitioner periodicals in human resource management: Implications for evidence-based management. Academy of Management Journal, 50(5), 987-1008. Schonlau, M. (2005). Boosted regression (boosting): An introductory tutorial and a Stata plugin. Stata Journal, 5, 330-354. Sekhon, J. S. (2007). Multivariate and propensity score matching software with automated balance optimization: The matching package for R. Journal of Statistical Software, 10(2), 1-51. Simith, J., & Todd, P. E. (2005). Does matching overcome Lalondes critique of nonexperimental estimators. Journal of Econometrics, 125, 305-353. Steiner, P. M., Cook, T. D., Shadish, W. R., & Clark, M. H. (2010). The importance of covariate selection in controlling for selection bias in observational studies. Psychological Methods, 15, 250-267. Wasserman, N. (2003). Founder-CEO succession and the paradox of entrepreneurial success. Organization Science, 14(2), 149-172. Wolfe, F., & Michaud, K. (2004). Heart failure in rheumatoid arthritis: Rates, predictors, and the effect of anti-tumor necrosis factor therapy. American Journal of Medicine, 116, 305-311. Wood, R. E., Goodman, J. S., Beckmann, N., & Cook, A. (2008). Mediation testing in management research: A review and proposals. Organizational Research Methods, 11(2), 270-295. Wooldridge, J. (2002). Econometric analysis of cross section and panel data. Cambridge, MA: MIT Press. Xuan, Y. (2009). Empire-building or bridge-building? Evidence from new CEOs internal capital allocation decisions. Review of Financial Studies, 22, 4919-4918.

Bio
Mingxiang Li is a doctoral candidate at the Wisconsin School of Business, University of Wisconsin-Madison. In addition to research methods, his current research interests include corporate governance, social network, and entrepreneurship.

Downloaded from orm.sagepub.com at SUNY BINGHAMTON on August 8, 2012

An Introduction To Linear Algebra For Science and Engineering - 3rd Ed - Norman
100% (16)
An Introduction To Linear Algebra For Science and Engineering - 3rd Ed - Norman
592 pages
Bayesian Statistical Methods
100% (10)
Bayesian Statistical Methods
288 pages
Mathematical Statistics With Applications PDF
100% (16)
Mathematical Statistics With Applications PDF
644 pages
Probability and Statistical Inference-CRC (2021)
89% (9)
Probability and Statistical Inference-CRC (2021)
444 pages
Epidemiology - An Introduction
100% (13)
Epidemiology - An Introduction
281 pages
(Cambridge Series in Statistical and Probabilistic Mathematics) Gerhard Tutz, Ludwig-Maximilians-Universität Munchen - Regression For Categorical Data-Cambridge University Press (2012)
100% (3)
(Cambridge Series in Statistical and Probabilistic Mathematics) Gerhard Tutz, Ludwig-Maximilians-Universität Munchen - Regression For Categorical Data-Cambridge University Press (2012)
574 pages
Statistics Books With Answers
80% (10)
Statistics Books With Answers
210 pages
2013 Book BayesianAndFrequentistRegressi PDF
No ratings yet
2013 Book BayesianAndFrequentistRegressi PDF
700 pages
Modelos Lineales Generalizados Con Ejemplos en R
No ratings yet
Modelos Lineales Generalizados Con Ejemplos en R
573 pages
2.marketing Analytics Notes Book
No ratings yet
2.marketing Analytics Notes Book
231 pages
Missing and Modified Data in Nonparametric Estimation
100% (2)
Missing and Modified Data in Nonparametric Estimation
465 pages
Statistical Regression Modeling With R: Ding-Geng (Din) Chen Jenny K. Chen
No ratings yet
Statistical Regression Modeling With R: Ding-Geng (Din) Chen Jenny K. Chen
239 pages
Essentials of Probability Theory For Statisticians
67% (3)
Essentials of Probability Theory For Statisticians
419 pages
Chapter-1 Introduction To Research Methodology.
89% (9)
Chapter-1 Introduction To Research Methodology.
42 pages
School of Money by Olumide Emmanuel
93% (14)
School of Money by Olumide Emmanuel
701 pages
Anderson F. Survival Analysis by Example. Hands On Approach Using R 2016
No ratings yet
Anderson F. Survival Analysis by Example. Hands On Approach Using R 2016
42 pages
Survival Analysis-Debby Raden
No ratings yet
Survival Analysis-Debby Raden
98 pages
1 Statistics Introduction
No ratings yet
1 Statistics Introduction
36 pages
Multiple Logistic Regression
No ratings yet
Multiple Logistic Regression
71 pages
Analysis of Epidemiological Data Using R
No ratings yet
Analysis of Epidemiological Data Using R
285 pages
4.1 Point and Interval Estimation
No ratings yet
4.1 Point and Interval Estimation
11 pages
Survival Analysis
No ratings yet
Survival Analysis
13 pages
STATA Basics Regression and Panal Data
100% (1)
STATA Basics Regression and Panal Data
26 pages
Statistics For Health Data Science An Organic Approach
No ratings yet
Statistics For Health Data Science An Organic Approach
238 pages
Stata - Tips PDF
100% (1)
Stata - Tips PDF
114 pages
Survival Analysis Dengan Pendekatan R
No ratings yet
Survival Analysis Dengan Pendekatan R
32 pages
CT3 Past Exams 2005 - 2009
No ratings yet
CT3 Past Exams 2005 - 2009
175 pages
Rhandbookprogramme Evaluation PDF
100% (1)
Rhandbookprogramme Evaluation PDF
775 pages
Sophia Rabe-Hesketh, Brian S. Everitt, - A Handbook of Statistical Analyses Using Stata, Fourth Edition-Chapman and Hall - CRC (2006)
No ratings yet
Sophia Rabe-Hesketh, Brian S. Everitt, - A Handbook of Statistical Analyses Using Stata, Fourth Edition-Chapman and Hall - CRC (2006)
345 pages
Understanding and Interpreting Educational Research PDF Ebook With Full Chapters
No ratings yet
Understanding and Interpreting Educational Research PDF Ebook With Full Chapters
16 pages
Reliability and Survival Analysis
100% (2)
Reliability and Survival Analysis
259 pages
Statistical Inference
No ratings yet
Statistical Inference
148 pages
Me Mba 1ST Sem
No ratings yet
Me Mba 1ST Sem
230 pages
List of Books
No ratings yet
List of Books
92 pages
Statistics and Basic Distribution - Mabe
No ratings yet
Statistics and Basic Distribution - Mabe
103 pages
22supply Results Dec-2023
No ratings yet
22supply Results Dec-2023
13 pages
Epi Concepts Cheat Sheet
No ratings yet
Epi Concepts Cheat Sheet
8 pages
Analysis of Survival Data - LN - D Zhang - 05
100% (1)
Analysis of Survival Data - LN - D Zhang - 05
264 pages
AnalyticsEdge Rmanual PDF
100% (1)
AnalyticsEdge Rmanual PDF
44 pages
Mark Crane, Michael C. Newman, Peter F. Chapman
No ratings yet
Mark Crane, Michael C. Newman, Peter F. Chapman
192 pages
Business Statistics' Important Questions - Dr. Vishal Saxena Sir
No ratings yet
Business Statistics' Important Questions - Dr. Vishal Saxena Sir
6 pages
Econometrics in R: Grant V. Farnsworth October 26, 2008
No ratings yet
Econometrics in R: Grant V. Farnsworth October 26, 2008
50 pages
PROBABILITY It Is A Numerical Measure Which Indicates The Chance
No ratings yet
PROBABILITY It Is A Numerical Measure Which Indicates The Chance
20 pages
Confounding and Effect Measure Modification
No ratings yet
Confounding and Effect Measure Modification
49 pages
Quality Management Practices of Food Man
No ratings yet
Quality Management Practices of Food Man
26 pages
Comparison of Advanced Semi Probabilistic Methods For Design and Assessment of Concrete Structures - Novák L
No ratings yet
Comparison of Advanced Semi Probabilistic Methods For Design and Assessment of Concrete Structures - Novák L
17 pages
Introduction To Survival Analysis: Lecture Notes
No ratings yet
Introduction To Survival Analysis: Lecture Notes
28 pages
Investment On Land Report
No ratings yet
Investment On Land Report
27 pages
Session 7
No ratings yet
Session 7
27 pages
Agresti Ordinal Tutorial
No ratings yet
Agresti Ordinal Tutorial
75 pages
Lesson 7
No ratings yet
Lesson 7
74 pages
IE306 Systems Simulation: Ali Rıza Kaylan Kaylan@boun - Edu.tr
No ratings yet
IE306 Systems Simulation: Ali Rıza Kaylan Kaylan@boun - Edu.tr
39 pages
Introduction To Survival Analysis: BIOST 515 February 26, 2004
No ratings yet
Introduction To Survival Analysis: BIOST 515 February 26, 2004
30 pages
Maths Class Xi Chapter 13 Statistics Practice Paper 19 2024 Answers
No ratings yet
Maths Class Xi Chapter 13 Statistics Practice Paper 19 2024 Answers
7 pages
Survival Analysis
100% (1)
Survival Analysis
15 pages
Probability Distribution
No ratings yet
Probability Distribution
33 pages
Project Proposal
No ratings yet
Project Proposal
40 pages
Implementing Propensity Score Matching Estimators With STATA
100% (1)
Implementing Propensity Score Matching Estimators With STATA
15 pages
Michael N. Mitchell - Data Management Using Stata - A Practical Handbook-STATA Press (2010)
100% (1)
Michael N. Mitchell - Data Management Using Stata - A Practical Handbook-STATA Press (2010)
405 pages
Gaver and Utke (2019)
No ratings yet
Gaver and Utke (2019)
35 pages
IIARF CBOK Interacting With Audit Committees FEB 2016 PDF
No ratings yet
IIARF CBOK Interacting With Audit Committees FEB 2016 PDF
16 pages
Lecture - Final Survival Analysis)
No ratings yet
Lecture - Final Survival Analysis)
86 pages
Logistics Regression
No ratings yet
Logistics Regression
14 pages
Propensity Scores: A Practical Introduction Using R
No ratings yet
Propensity Scores: A Practical Introduction Using R
21 pages
ML - Full Slides Srikanth Allamshatty
No ratings yet
ML - Full Slides Srikanth Allamshatty
369 pages
Steam and - Leaf Display
100% (1)
Steam and - Leaf Display
25 pages
Hangal - Frailty Models
No ratings yet
Hangal - Frailty Models
307 pages
Aldrich - R. A. Fisher On Bayes and Bayes' Theorem
No ratings yet
Aldrich - R. A. Fisher On Bayes and Bayes' Theorem
10 pages
Causal Inference Intro
No ratings yet
Causal Inference Intro
16 pages
East West University: Computer Science and Engineering
No ratings yet
East West University: Computer Science and Engineering
8 pages
Challenges, Opportunities, Attitude, and Implementation of Blended Learningin Basic Education in The New Normal
No ratings yet
Challenges, Opportunities, Attitude, and Implementation of Blended Learningin Basic Education in The New Normal
25 pages
On The Theory of Scales of Measurement - S. S. Stevens
100% (3)
On The Theory of Scales of Measurement - S. S. Stevens
5 pages
SAS Training and Certification 2.0 - Curriculum PDF
0% (1)
SAS Training and Certification 2.0 - Curriculum PDF
6 pages
Regression Analysis Project
100% (1)
Regression Analysis Project
4 pages
Aiken & West (1991) Chap07 PDF
No ratings yet
Aiken & West (1991) Chap07 PDF
14 pages
Regression Analysis
No ratings yet
Regression Analysis
7 pages
Data Analytics Engineering Ms
No ratings yet
Data Analytics Engineering Ms
6 pages
Guide To PHD S in Economics
No ratings yet
Guide To PHD S in Economics
18 pages
4 32e
No ratings yet
4 32e
2 pages
Bruno Lecture Notes PDF
No ratings yet
Bruno Lecture Notes PDF
251 pages
Statistics
100% (7)
Statistics
897 pages
Survival Analysis Presentation
No ratings yet
Survival Analysis Presentation
18 pages
Master IELTS General Training Volume 1&2 Question
No ratings yet
Master IELTS General Training Volume 1&2 Question
44 pages
Introduction To Cox Regression: Kristin Sainani Ph.D. Stanford University Department of Health Research and Policy
No ratings yet
Introduction To Cox Regression: Kristin Sainani Ph.D. Stanford University Department of Health Research and Policy
62 pages
DDE 602 Research Method
No ratings yet
DDE 602 Research Method
9 pages
Multivariate Statistics Made Simple A Practical Approach by K. v. S. Sarma, R. Vishnu Vardhan
100% (1)
Multivariate Statistics Made Simple A Practical Approach by K. v. S. Sarma, R. Vishnu Vardhan
259 pages
Performance Analysis
0% (1)
Performance Analysis
2 pages
Hypothesis Testing: Cee 3040 - Uncertainty Analysis in Engineering
No ratings yet
Hypothesis Testing: Cee 3040 - Uncertainty Analysis in Engineering
1 page
Cox Regression
No ratings yet
Cox Regression
51 pages
Stochastic Epidemic Modelling
No ratings yet
Stochastic Epidemic Modelling
15 pages
3 Aplication of Matrice Operations (Larson)
No ratings yet
3 Aplication of Matrice Operations (Larson)
14 pages
Cox Proportional Hazard Model
No ratings yet
Cox Proportional Hazard Model
34 pages
An Evolutionary Approach To Management Control Systems Research
No ratings yet
An Evolutionary Approach To Management Control Systems Research
4 pages
Lectures
No ratings yet
Lectures
766 pages
Statistics and Probability (Topic 5) Revision Answers
No ratings yet
Statistics and Probability (Topic 5) Revision Answers
4 pages
Essentials of Statistics
No ratings yet
Essentials of Statistics
272 pages
Presentation 2
No ratings yet
Presentation 2
39 pages
What Is A Cox Model?: Sponsored by An Educational Grant From Aventis Pharma
No ratings yet
What Is A Cox Model?: Sponsored by An Educational Grant From Aventis Pharma
8 pages
Survival Analysis Overview
No ratings yet
Survival Analysis Overview
23 pages
Learning Statistics
100% (27)
Learning Statistics
408 pages
Linear Regression and Correlation - A Beginner's Guide
100% (5)
Linear Regression and Correlation - A Beginner's Guide
220 pages
A Statistical Process Control Case Study: Thomas K. Ross, PHD
No ratings yet
A Statistical Process Control Case Study: Thomas K. Ross, PHD
16 pages
Hypothesis Testing - A Visual Introduction To Statistical Significance
100% (4)
Hypothesis Testing - A Visual Introduction To Statistical Significance
137 pages
STEP SPSS ANALYSIS COHEN KAPPA and ICC
No ratings yet
STEP SPSS ANALYSIS COHEN KAPPA and ICC
5 pages
A Concise Introduction To Statistical Inference
100% (5)
A Concise Introduction To Statistical Inference
231 pages
Statistical Data Analysis Explained
93% (27)
Statistical Data Analysis Explained
359 pages
Advance Statistical Methods in Data Science Chen
100% (5)
Advance Statistical Methods in Data Science Chen
229 pages
Research Methods & Statistics For Public & Nonprofit Administrators-Practical Guide - Nishishiba 2014 PDF
92% (13)
Research Methods & Statistics For Public & Nonprofit Administrators-Practical Guide - Nishishiba 2014 PDF
393 pages
Strategic Analysis for Healthcare: Concepts and Practical Applications
From Everand
Strategic Analysis for Healthcare: Concepts and Practical Applications
Michael S. Wayland
3/5 (2)
Exploring What It Effectively Means to Manage Carpal Tunnel Syndrome’S: Physical, Social, and Emotional Crucibles in a Return to Work Program
From Everand
Exploring What It Effectively Means to Manage Carpal Tunnel Syndrome’S: Physical, Social, and Emotional Crucibles in a Return to Work Program
Dr. Stella Marie Rostkowski
No ratings yet
Health Systems Engineering: Building A Better Healthcare Delivery System
From Everand
Health Systems Engineering: Building A Better Healthcare Delivery System
Mbuso Mabuza
No ratings yet
Intervention Set Selection
From Everand
Intervention Set Selection
Simone G. Symonette
No ratings yet
Statistical Analysis of Adverse Impact: A Practitioner’S Guide
From Everand
Statistical Analysis of Adverse Impact: A Practitioner’S Guide
Stephanie R. Thomas Ph.D.
No ratings yet
Environmental Health Noncompliance: A Sanitarian's Search for a New System
From Everand
Environmental Health Noncompliance: A Sanitarian's Search for a New System
David Mikkola
No ratings yet
Glossary of Research Methods
From Everand
Glossary of Research Methods
Dr. Awadhesh Kishore
No ratings yet
Healthcare Simulation in Practice
From Everand
Healthcare Simulation in Practice
Mark Hellaby
5/5 (1)
Clinical Trials Design and Methodology: Clinical Trials Mastery Series, #3
From Everand
Clinical Trials Design and Methodology: Clinical Trials Mastery Series, #3
Dr. Nilesh Panchal
No ratings yet
Ethics, Qualitative And Quantitative Methods In Public Health Research
From Everand
Ethics, Qualitative And Quantitative Methods In Public Health Research
Mbuso Mabuza
No ratings yet
Social Research Methods. A Complete Guide
From Everand
Social Research Methods. A Complete Guide
Mutea Rukwaru
5/5 (1)