Often, inverse probability weighting is described as making “pseudo-observations.” Here is a different intuition, using change of measure, as in importance sampling.
Let $Y$ be the outcome, $X$ the confounders, and $A$ a binary treatment. Assume no unmeasured confounders, or that $X$ contains everything that impacts $A$ and $Y$. Let

$$p(Y, A, X) = p(Y \mid A, X)\,\pi(A \mid X)\,p(X).$$

Now, focus on $\pi(A \mid X)$. This is a treatment distribution—it is how treatments $A$ are assigned based on baseline covariates $X$. It is the “true” treatment distribution—in other words, it generates the data. For example, in a randomized trial, $\pi(A \mid X)$ does not depend on $X$ at all; e.g., $\pi(A = 1 \mid X) = 1/2$. The distribution $\pi(A \mid X)$ is often called a “policy,” since it maps covariates to treatments.
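To make the “policy generates the data” point concrete, here is a minimal simulation sketch under an assumed data-generating process (none of these modeling choices come from the post): a standard-normal covariate $X$, a logistic propensity $\pi(A = 1 \mid X)$, and a linear outcome model with treatment effect 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

X = rng.normal(size=n)                       # baseline covariate
pi = 1 / (1 + np.exp(-X))                    # true propensity pi(A = 1 | X)
A = rng.binomial(1, pi)                      # treatment drawn from the policy pi
Y = 1.0 * A + 0.5 * X + rng.normal(size=n)   # outcome depends on A and X

# Naive contrast E[Y | A = 1] - E[Y | A = 0]: confounded by X.
print(Y[A == 1].mean() - Y[A == 0].mean())
```

Because larger $X$ makes treatment more likely and also raises $Y$, the naive contrast overstates the effect of $A$.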
Now consider a different treatment policy, which we would like to evaluate in terms of the expected outcome it gives. I am going to subscript expectations with respect to these policies; i.e., $E_{\pi}[Y]$ is the expected outcome when treatments are drawn from $\pi$. Define

$$\pi_a(A \mid X) = 1\{A = a\}.$$

Now, imagine the distribution $p(Y \mid A, X)\,\pi_a(A \mid X)\,p(X)$. In other words, imagine a policy in which we always draw treatment $A = a$, regardless of $X$. This corresponds to a mutilated graph, where we break the edge between $X$ and $A$. This is of interest in observational data analysis. We ask: what would the (counterfactual) expected outcome be, if we had always chosen treatment $a$?
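Still within the hypothetical simulation above, the policy $\pi_a$ can be evaluated directly by doing exactly what the mutilated graph describes: fix $A = a$ for everyone, ignore $X$ when assigning treatment, and redraw the outcome. A sketch, reusing the assumed outcome model, which gives a ground-truth value of $E_{\pi_a}[Y]$ to check the estimator against later:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
X = rng.normal(size=n)   # baseline covariate, as in the sketch above

def counterfactual_mean(a):
    # Break the X -> A edge: everyone receives treatment a, regardless of X.
    Y_a = 1.0 * a + 0.5 * X + rng.normal(size=n)   # same assumed outcome model
    return Y_a.mean()

# Contrast of counterfactual means: the true effect under this model, ~1.0.
print(counterfactual_mean(1) - counterfactual_mean(0))
```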
Note that $E_{\pi_a}[Y] = E[Y \mid \mathrm{do}(A = a)]$. Now, let’s show how we can obtain the inverse-probability estimator for $E_{\pi_a}[Y]$ by using change of measure. Note,

$$E_{\pi_a}[Y] = \sum_{a'} \int y\, p(y \mid a', x)\, \pi_a(a' \mid x)\, p(x)\, \mathrm{d}y\, \mathrm{d}x = \sum_{a'} \int y\, \frac{\pi_a(a' \mid x)}{\pi(a' \mid x)}\, p(y \mid a', x)\, \pi(a' \mid x)\, p(x)\, \mathrm{d}y\, \mathrm{d}x = E_{\pi}\!\left[\frac{1\{A = a\}}{\pi(a \mid X)}\, Y\right],$$

which can be written in terms of the observed data, $(Y_i, A_i, X_i)$ for $i = 1, \dots, n$, as

$$\frac{1}{n} \sum_{i=1}^{n} \frac{1\{A_i = a\}}{\hat{\pi}(A_i \mid X_i)}\, Y_i,$$

which is an inverse-probability estimator with estimated propensity $\hat{\pi}(A \mid X)$ plugged in for $\pi(A \mid X)$.
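As a sanity check on this identity, here is a sketch of the resulting estimator applied to the same hypothetical observational data, with the propensity estimated by logistic regression (scikit-learn is used here only for convenience; it is not part of the post):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 100_000

# Same assumed observational data-generating process as in the first sketch.
X = rng.normal(size=n)
pi = 1 / (1 + np.exp(-X))
A = rng.binomial(1, pi)
Y = 1.0 * A + 0.5 * X + rng.normal(size=n)

# Estimate the propensity pi(A = 1 | X) with a logistic regression.
pi_hat = LogisticRegression().fit(X.reshape(-1, 1), A).predict_proba(X.reshape(-1, 1))[:, 1]

def ipw_mean(a):
    """IPW / change-of-measure estimate of E_{pi_a}[Y]."""
    p_a = pi_hat if a == 1 else 1 - pi_hat   # estimated pi(a | X_i)
    return np.mean((A == a) * Y / p_a)       # (1/n) sum_i 1{A_i = a} Y_i / pi_hat(a | X_i)

print(ipw_mean(1) - ipw_mean(0))   # should be close to the interventional contrast, ~1.0
```

The contrast `ipw_mean(1) - ipw_mean(0)` should land near the interventional contrast from the previous sketch, around 1, whereas the naive contrast in the first sketch overshoots it.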
In my opinion, this interpretation of inverse-probability weighting is present in *Causality*.¹ In particular, it appears in Section 1.3.1, Definition 1.3.1, item (ii). However, from that point on, it is not mentioned again.
Either way, notation aside, I think we still need some kind of causal model in the background to grapple with this problem.
- Pearl, Judea. *Causality*. Cambridge University Press, 2009.