In observational studies the assignment of units to treatments is not under the investigator's control. Consequently, estimating and comparing treatment effects from the empirical distribution of the responses under the various treatments can be biased, since units exposed to one treatment may differ from units exposed to other treatments in unknown characteristics that are related to the response.
In this article we study the plausibility of analyzing observational data by deriving the parametric distribution of the observed response under a given treatment as a function of two components: the distribution that would be obtained under a strongly ignorable assignment, and the assignment process, which is modelled as a function of the observed data (the response and covariate values). We justify this approach by showing that the sample distribution of the observed responses is identifiable under some general conditions. Because this distribution refers to the observed data, its goodness of fit can be tested using standard test statistics; we also develop a new test.
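The relationship described above can be sketched with Bayes' rule. The notation here is an assumption introduced for illustration and is not taken from the article: y denotes the response, x the covariates, T the treatment indicator, f_p the distribution of the response that would be obtained under a strongly ignorable assignment, and f_s the sample distribution of the observed response under treatment t.

```latex
% Sample distribution of the observed response under treatment t
% (illustrative notation; symbols are assumptions, not the article's own)
\[
  f_s(y \mid x, T = t)
  = \frac{\Pr(T = t \mid y, x)\, f_p(y \mid x)}{\Pr(T = t \mid x)},
  \qquad
  \Pr(T = t \mid x) = \int \Pr(T = t \mid y, x)\, f_p(y \mid x)\, dy .
\]
% If the assignment does not depend on the response given the covariates,
% i.e. \Pr(T = t \mid y, x) = \Pr(T = t \mid x), the ratio equals one and
% f_s(y \mid x, T = t) = f_p(y \mid x).
```

In this form the sample distribution is determined by a model for f_p and a model for the assignment probabilities, and it differs from f_p only when the assignment depends on the response after conditioning on the covariates.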
We assess the performance of this approach and compare it to existing approaches using data collected in the year 2000 by the OECD for the Programme for International Student Assessment (PISA). In the present application we compare students' scores in mathematics between public and private schools in Ireland and conclude, somewhat surprisingly, that the public schools perform better than the private schools. A similar conclusion is reached using instrumental variables.
KEY WORDS: Average treatment effect; Goodness of fit; Identifiability; Instrumental variables; Propensity scores; Sample distribution.
* Joint work with Victoria Landsman (Hebrew University of Jerusalem, Israel)