The added utility from ordering a test

I think you can choose whether to order a medical test by comparing the expected utility, $E_{\pi}R,$ under the best policy that sees the test result, $\pi(A=a|X,test),$ and under the best policy that does not, $\pi(A=a|X).$ If the former is higher, order the test.

Suppose the test measures the patient’s systolic blood pressure, which is either high, 130, or normal, 120 (I know – not super realistic). Let’s call the result of the blood pressure test $S_0.$ Suppose the prevalence of high systolic blood pressure in the population is 5%.

Let $S_1$ be the patient’s post-treatment blood pressure. The reward is 1 if $S_1$ is 120 and 0 otherwise (it is 0 if $S_1<120$ or $S_1>120$ – I know – also not super realistic). We have a treatment, $A,$ that lowers the patient’s systolic blood pressure by 10 units.

If we don’t use the test, i.e., we don’t condition on $S_0$ in the policy, we have the policy $\pi(A).$ In this case, the policy that maximizes $E_{\pi} R$ is $\pi^*(A=treat)=0,$ which means never treat.

This is true because only 5% of the people in the population have high $S_0.$ If we never treat, the 95% with normal blood pressure stay at 120 and the 5% with high blood pressure stay at 130, so $E_{\pi}R=0.95.$

If we instead choose to always treat, we do treat the 5% with high blood pressure, and they improve, but we also treat the 95% with normal blood pressure, and they all end up with $S_1$ that is too low, so $E_{\pi}R=0.05.$
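As a rough check on the reasoning above, here is a minimal Python sketch of the expected reward for a policy that ignores $S_0$ and treats with some fixed probability. The names (`PREVALENCE_HIGH`, `reward`, `expected_reward_blind`) are made up for illustration, and the numbers are just the toy values from this example.

```python
# Toy values from the example: 5% prevalence, 130 = high, 120 = normal,
# treatment lowers blood pressure by 10, reward is 1 only if S_1 == 120.
PREVALENCE_HIGH = 0.05
HIGH, NORMAL, TARGET, DROP = 130, 120, 120, 10


def reward(s0, treat):
    """Reward for one patient with pre-treatment pressure s0."""
    s1 = s0 - DROP if treat else s0
    return 1.0 if s1 == TARGET else 0.0


def expected_reward_blind(p_treat):
    """Expected reward when everyone is treated with probability p_treat,
    regardless of S_0 (the policy does not see the test)."""
    total = 0.0
    for s0, p_s0 in ((HIGH, PREVALENCE_HIGH), (NORMAL, 1 - PREVALENCE_HIGH)):
        total += p_s0 * (
            p_treat * reward(s0, True) + (1 - p_treat) * reward(s0, False)
        )
    return total


for p in (0.0, 0.05, 1.0):
    print(p, round(expected_reward_blind(p), 4))
```

Under these assumptions the expected reward works out to $0.95 - 0.9p$ for treatment probability $p,$ so it is largest at $p=0.$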

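(Note that it might seem best to set $\pi(A=treat)=0.05,$ but this would not treat the correct people: treatment is assigned without looking at $S_0,$ so the expected reward is $0.05\cdot 0.05+0.95\cdot 0.95=0.905,$ which is below the 0.95 from never treating.) How do we treat the correct people?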

If, however, we use the policy $\pi(A|S_0),$ we can set $\pi^*(A=treat|S_0=high)=1$ and $\pi^*(A=treat|S_0=normal)=0.$ Then we will only treat the 5% with high blood pressure and leave the other 95% alone, and 100% of the people will have $S_1$ at 120, so $E_{\pi}R=1.$
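Here is a small sketch of the same toy example with a policy that is allowed to depend on $S_0.$ The function and variable names (`expected_reward`, `informed`, `never_treat`) are hypothetical, and a policy is represented simply as a dict from $S_0$ to the probability of treating.

```python
# Toy values from the example: 5% prevalence, 130 = high, 120 = normal,
# treatment lowers blood pressure by 10, reward is 1 only if S_1 == 120.
PREVALENCE_HIGH = 0.05
HIGH, NORMAL, TARGET, DROP = 130, 120, 120, 10
P_S0 = {HIGH: PREVALENCE_HIGH, NORMAL: 1 - PREVALENCE_HIGH}


def reward(s0, treat):
    """Reward for one patient with pre-treatment pressure s0."""
    s1 = s0 - DROP if treat else s0
    return 1.0 if s1 == TARGET else 0.0


def expected_reward(policy):
    """Expected reward for a policy given as {s0: P(treat | s0)}."""
    total = 0.0
    for s0, p_s0 in P_S0.items():
        p_treat = policy[s0]
        total += p_s0 * (
            p_treat * reward(s0, True) + (1 - p_treat) * reward(s0, False)
        )
    return total


informed = {HIGH: 1.0, NORMAL: 0.0}     # treat only when S_0 is high
never_treat = {HIGH: 0.0, NORMAL: 0.0}  # best policy that ignores S_0

gain = expected_reward(informed) - expected_reward(never_treat)
print(round(expected_reward(informed), 4), round(gain, 4))
```

In this toy setup the gain from conditioning on $S_0$ equals the prevalence of high blood pressure, since those are exactly the patients the blind policy gets wrong.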

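More generally, if $X$ are covariates, we have $\pi^*(A|X,test)=\arg\max_{\pi} E_{\pi}R=\arg\max_{\pi}E(E_{\pi}(R|X,test)),$ where the maximum is over policies of the form $\pi(A|X,test).$ Similarly, $\pi^*(A|X)=\arg\max_{\pi} E_{\pi}R=\arg\max_{\pi} E(E_{\pi}(R|X)),$ where the maximum is over policies of the form $\pi(A|X).$ We then order the test if $E(E_{\pi^*(A|X,test)}(R|X,test)) > E(E_{\pi^*(A|X)}(R|X)).$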
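The general comparison can be sketched for a small discrete problem. This is only an illustration under strong assumptions: the joint distribution of $(X, test)$ and the expected reward for each action are known exactly, and everything is finite. The names `best_value`, `joint`, `reward`, and `observe` are made up for this sketch.

```python
from collections import defaultdict


def best_value(joint, reward, actions, observe):
    """Expected reward of the best policy that only sees observe(x, t).

    joint:   dict mapping (x, t) -> probability
    reward:  function (x, t, a) -> expected reward
    actions: iterable of available actions
    observe: function (x, t) -> the information the policy conditions on
    """
    # For each observed value, accumulate the probability-weighted reward
    # of each action, then take the best action for that observed value.
    totals = defaultdict(lambda: defaultdict(float))
    for (x, t), p in joint.items():
        for a in actions:
            totals[observe(x, t)][a] += p * reward(x, t, a)
    return sum(max(by_action.values()) for by_action in totals.values())


# The blood pressure example, with a single dummy covariate value.
joint = {("any", 130): 0.05, ("any", 120): 0.95}


def reward(x, t, a):
    # a = 1 means treat (lowers pressure by 10), a = 0 means do not treat.
    return 1.0 if t - 10 * a == 120 else 0.0


with_test = best_value(joint, reward, (0, 1), lambda x, t: (x, t))
without_test = best_value(joint, reward, (0, 1), lambda x, t: x)
order_test = with_test > without_test
print(round(with_test, 4), round(without_test, 4), order_test)
```

Because the best action is chosen separately for each observed value, a policy that sees more information can never do worse here, so the comparison reduces to whether the test changes the best action for at least some patients.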