Inverse probability weighting as change of measure

Inverse probability weighting is often described as creating “pseudo-observations.” Here is a different intuition, based on a change of measure, as in importance sampling.

Let $Y$ be the outcome, $X$ the confounders, and $A$ a binary treatment. Assume no unmeasured confounding; that is, $X$ contains everything that affects both $Y$ and $A.$ Let $(Y,A,X) \sim P_0(Y,A,X)=P(Y|A,X)P_0(A|X)P(X).$ Now, focus on $P_0(A|X).$ This is a treatment distribution: it describes how treatments $A$ are assigned based on baseline covariates $X.$ It is the “true” treatment distribution; in other words, it generates the data. For example, in a randomized trial with 1:1 allocation, $P_0(A=1|X)=0.5.$ The distribution $P_0(A|X)$ is often called a “policy,” since it maps covariates to treatments.

Now consider a different treatment policy, $\pi(A|X),$ which we would like to evaluate in terms of the expected outcome it yields. I will subscript expectations by the policy in force; i.e., $E_{\pi(A|X)}Y.$ Define $P(Y,A,X)=P(Y|A,X)\pi(A|X)P(X),$ which differs from $P_0$ only in the treatment factor. Now, fix a treatment level $a$ and take the degenerate policy $\pi(A=a'|X=x)=I(a'=a).$ In other words, imagine a policy under which we always assign treatment $a,$ regardless of $X.$ This corresponds to a mutilated graph, in which we remove the edge from $X$ to $A.$ This is the policy of interest in observational data analysis, where we ask: what would the (counterfactual) expected outcome be if we had always chosen treatment $a$?
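To make the two policies concrete, here is a minimal simulation sketch. Everything numerical in it is an assumption made for illustration: a single binary confounder, an observed policy with $P_0(A=1|X)$ equal to 0.2 or 0.8 depending on $X$, and a linear outcome model. It draws data once under $P_0(A|X)$ and once under the always-treat policy $\pi,$ so that the naive treated-group mean can be compared with the expected outcome under $\pi.$

```python
import numpy as np


def simulate(n, policy, rng):
    """Draw (Y, A, X) from P(Y|A,X) * policy(A|X) * P(X)."""
    x = rng.binomial(1, 0.5, size=n)            # P(X): one binary confounder (assumed)
    a = rng.binomial(1, policy(x))              # policy(x) returns P(A=1 | X=x)
    y = 1.0 * a + 2.0 * x + rng.normal(size=n)  # P(Y|A,X): hypothetical outcome model
    return y, a, x


def observed_policy(x):
    # P_0(A=1|X): treatment is more likely when X = 1 (assumed values)
    return np.where(x == 1, 0.8, 0.2)


def always_treat(x):
    # pi(A=1|X) = 1 for every X: the edge from X to A is removed
    return np.ones_like(x, dtype=float)


rng = np.random.default_rng(0)
n = 200_000

y_obs, a_obs, x_obs = simulate(n, observed_policy, rng)
y_pi, a_pi, x_pi = simulate(n, always_treat, rng)

naive = y_obs[a_obs == 1].mean()  # E[Y | A=1] under P_0, confounded by X
target = y_pi.mean()              # E_pi[Y], i.e. E[Y(1)] in this simulation

print(f"naive treated mean under P_0: {naive:.3f}")
print(f"mean outcome under pi:        {target:.3f}")
```

Under these assumed numbers, the expected outcome under $\pi$ is $1 + 2 \cdot 0.5 = 2,$ while the treated-group mean under $P_0$ is about $2.6,$ because treated units are more likely to have $X=1.$ In real observational data we can only draw from $P_0,$ which is why a reweighting argument is needed.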

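Note that $EY(a)=E_{\pi(A|X)}Y=\int \sum_{a'} y\, P(y|a',x)\,\pi(a'|x)\,P(x)\,dy\,dx=\int y\, P(y|a,x)\,P(x)\,dy\,dx,$ where the last step uses $\pi(a'|x)=I(a'=a).$ Now, let’s show how we can obtain the inverse-probability estimator for $EY(a)$ by a change of measure. Assume positivity, $P_0(a|X)>0,$ so that the ratio below is well defined. Then $E_{\pi(A|X)}Y=E_{\pi(A|X)}\frac{P_0(Y,A,X)}{P_0(Y,A,X)} Y = E_{P_0(A|X)}\frac{P(Y|A,X)\pi(A|X)P(X)}{P(Y|A,X)P_0(A|X)P(X)} Y = E_{P_0(A|X)}\frac{\pi(A|X)}{P_0(A|X)} Y = E_{P_0(A|X)}\frac{I(A=a)}{P_0(A|X)} Y,$ since the factors $P(Y|A,X)$ and $P(X)$ are shared by both distributions and cancel. The sample analogue, in terms of the observed data $(Y_i,A_i,X_i), i=1,\dots, n,$ is $\frac{1}{n} \sum_{i=1}^n \frac{I(A_i=a)}{P_{0n}(A_i|X_i)} Y_i,$ which is an inverse-probability estimator with the estimated propensity $P_{0n}(A_i|X_i)$ plugged in for $P_0(A_i|X_i).$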
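The estimator above can be checked numerically. The following is a sketch under an assumed data-generating process (one binary confounder, a hypothetical linear outcome model, and a made-up observed policy), not a general implementation. The estimated propensity $P_{0n}$ is taken to be the empirical treatment frequency within each stratum of $X,$ which is one simple choice among many.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Observed data drawn from P_0(Y,A,X) = P(Y|A,X) P_0(A|X) P(X); all values assumed.
x = rng.binomial(1, 0.5, size=n)
p0 = np.where(x == 1, 0.8, 0.2)             # true P_0(A=1 | X)
a = rng.binomial(1, p0)
y = 1.0 * a + 2.0 * x + rng.normal(size=n)  # true E[Y(1)] = 2, E[Y(0)] = 1


def ipw_mean(y, a, prob_a1, target_a=1):
    """Compute (1/n) * sum_i I(A_i = target_a) / P(A_i | X_i) * Y_i."""
    # Probability of the treatment each unit actually received.
    prob_obs = np.where(a == 1, prob_a1, 1.0 - prob_a1)
    weights = (a == target_a) / prob_obs    # pi(A_i|X_i) / P_0(A_i|X_i)
    return np.mean(weights * y)


# Estimated propensity P_0n: treated fraction within each stratum of X.
p_hat = np.empty(n)
for v in (0, 1):
    p_hat[x == v] = a[x == v].mean()

naive = y[a == 1].mean()
ipw_true = ipw_mean(y, a, p0)      # weights from the true P_0
ipw_est = ipw_mean(y, a, p_hat)    # weights from the estimated P_0n
ipw_est_a0 = ipw_mean(y, a, p_hat, target_a=0)

print(f"naive treated mean:         {naive:.3f}")
print(f"IPW, true propensity:       {ipw_true:.3f}")
print(f"IPW, estimated propensity:  {ipw_est:.3f}")
print(f"IPW for a = 0:              {ipw_est_a0:.3f}")
```

In this setup both weighted averages should land near the assumed truth of 2 for $a=1$ (and near 1 for $a=0$), while the unweighted treated mean sits near 2.6. The weights are exactly the ratio $\pi(A|X)/P_0(A|X)$ from the change of measure.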

In my opinion, this interpretation of inverse-probability weighting is already present in Pearl’s Causality.¹ In particular, it appears in Section 1.3.1, Definition 1.3.1, item (ii). However, it is not mentioned again after that point.

Either way, and notation aside, I think we always need some kind of causal model in the background to grapple with this problem.

¹ Pearl, Judea. Causality. Cambridge University Press, 2009.
