Decision analysis (a probabilistic treatment)

In the following post, I will list the steps for using probability to describe medical decisions. I call it a “probabilistic treatment,” because very few texts on decision analysis define random variables and write expected utility as a function of them. In dynamic treatment regimes and reinforcement learning, they do this quite a bit, and I think that it streamlines the exposition and showcases the expressive power of the framework.

Gather information. Suppose we have some patient covariates, S, which take value s (i.e., S is a random variable and s its realization). Patient covariates contain information about the patient, such as the patient’s blood pressure and heart rate. So, for example, if S is blood pressure (e.g., mean arterial pressure), it might take the value s, which is 60. The random variable S is called “random,” because it might randomly assume a value of 60 with some probability, a value of 63 with some probability, etc. Ideally, we have some notion of what these probabilities are.
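To make this concrete, here is a minimal sketch of drawing S from an assumed distribution. Everything here is hypothetical: I assume baseline mean arterial pressure is roughly normal with mean 70 mmHg and standard deviation 10, purely for illustration.

```python
import random

def draw_baseline_bp():
    """Draw a value s of the random variable S (baseline MAP, in mmHg).
    Hypothetical model: normal with mean 70 and standard deviation 10."""
    return random.gauss(70, 10)

random.seed(0)
s = draw_baseline_bp()  # one realization of S
```

Each call produces a different realization s, with values near 70 being more probable than values far from it; that is what "S is random, with some notion of its probabilities" means operationally.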

Decide how to treat. Suppose we have a binary treatment, A (for example, vasopressors), which will be given according to s. In other words, we give a vasopressor according to the patient’s baseline pressure. This treatment is “binary,” because it can be either 1 or 0 (e.g., one can give a vasopressor or not). Draw A from the conditional distribution \pi(A|S=s), which we call a “treatment policy.” This notation is a probabilistic way of saying that our propensity to give vasopressors is based on what we know about the patient (the vertical line symbol means “conditional upon”). For example, we might be more likely to give a vasopressor when the patient’s blood pressure is low.
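A treatment policy \pi(A|S=s) can be sketched as a function mapping s to a probability of treating. The numbers below (a 60 mmHg threshold, treatment probabilities of 0.9 and 0.1) are assumptions for illustration, not clinical guidance.

```python
import random

def policy(s):
    """pi(A=1 | S=s): hypothetical propensity to give a vasopressor,
    higher when the baseline MAP s (in mmHg) is low."""
    return 0.9 if s < 60 else 0.1

def draw_action(s):
    """Draw A ~ pi(A | S=s); returns 1 (treat) or 0 (don't treat)."""
    return 1 if random.random() < policy(s) else 0
```

Note that the policy is itself a conditional distribution: for a hypotensive patient (s < 60), it treats 90% of the time, which encodes "we are more likely to give a vasopressor when blood pressure is low."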

Describe how baseline covariates and treatments affect the patient outcomes. Suppose we have some future, post-treatment covariates, S', which depend on the initial covariates and the treatment. These post-treatment covariates are sometimes known as “outcomes.” For example, what will the patient’s final blood pressure be, given their initial blood pressure and whether they received a vasopressor? Draw S' from p(S'|A,S=s). This notation means that a patient’s outcome depends in a probabilistic way on their initial characteristics and the treatment they received.
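The transition distribution p(S'|A,S=s) can likewise be sketched as a sampler. I assume, hypothetically, that a vasopressor raises mean arterial pressure by about 15 mmHg on average, with some noise.

```python
import random

def draw_next_bp(s, a):
    """Draw S' ~ p(S' | A=a, S=s). Hypothetical model: a vasopressor
    (a=1) raises MAP by about 15 mmHg on average, plus Gaussian noise."""
    return random.gauss(s + 15 * a, 5)
```

The outcome is probabilistic: two identical patients given the same treatment can end up with different final pressures, but treated patients end up higher on average.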

Think about the patient’s wellbeing. Define a reward, R(S,A,S'). This is the absolute key: thinking not just about outcomes but also about their meaning with respect to the patient. For example, R might pick out one covariate, which is an informal mathematical way of saying that the final blood pressure is what really matters. We might also write a function that maps a final blood pressure to a degree of “goodness,” such as an indicator that blood pressure is in a certain healthy range.

We can use a weighted sum of different outcomes to express a more complex reward. For example, we might want higher blood pressure while avoiding side effects. The reward (a term taken from reinforcement learning) is also known as “utility” in the decision analysis literature.
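Both ideas above, an indicator for a healthy range and a weighted penalty for side effects, can be combined in one small reward function. The healthy range (65 to 100 mmHg) and the penalty weight (0.1) are assumptions for illustration.

```python
def reward(s, a, s_prime):
    """Hypothetical R(S, A, S'): 1 if the final MAP s_prime lands in an
    assumed healthy range (65-100 mmHg), minus a small assumed penalty
    for the side effects of giving a vasopressor (a=1)."""
    in_range = 1.0 if 65 <= s_prime <= 100 else 0.0
    return in_range - 0.1 * a
```

This is the weighted-sum idea: achieving a healthy pressure is worth 1, and treating costs 0.1, so treatment is only "worth it" when it meaningfully changes the chance of a good outcome.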

Relate the treatment policy, \pi, to the reward, R. Write the expectation

E_{\pi}(R|S=s) = \int_{s'} \sum_a R(s,a,s')\, p(s'|a,s)\, \pi(a|s)\, ds'.

This expression, the average¹ reward, is where things come together; in other words, it is a single equation in which we see all the pieces that were described above: the baseline covariates, S=s, the treatment policy, \pi, the outcome, S', and the reward function, R.

The average reward is what one would expect if one acted according to the treatment policy over many decisions, for many different patients, and then averaged the results. The equation essentially says that the patient's average reward, or utility, will depend on their characteristics, how those characteristics lead to a treatment, and the outcome, which in turn depends on both their characteristics and the treatment.

The integrals and sums are effectively saying that the average reward should depend on the potential rewards for different possible combinations of initial blood pressures, vasopressor administrations, and final blood pressures, where these combinations are weighted according to how likely they are. Concretely, if it is extremely likely that a vasopressor increases blood pressure, then the expected (average) reward will look most like the reward that occurs with an increased blood pressure.

Now, given the equation above, if you give me a treatment policy, I can tell you its average reward. This can be useful for the next step.
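The expectation above can be approximated by simulation: draw a from \pi(a|s), draw s' from p(s'|a,s), evaluate R, and average over many repetitions. Below is a self-contained Monte Carlo sketch; the policy, transition model, and reward are all hypothetical choices made for illustration.

```python
import random

def policy_prob(a, s):
    """pi(A=a | S=s), hypothetical: more likely to treat when MAP s is low."""
    p1 = 0.9 if s < 60 else 0.1
    return p1 if a == 1 else 1 - p1

def draw_next_bp(s, a):
    """p(S' | A=a, S=s), hypothetical: vasopressor raises MAP ~15 mmHg."""
    return random.gauss(s + 15 * a, 5)

def reward(s, a, s_prime):
    """R(S, A, S'), hypothetical: 1 if final MAP is in 65-100 mmHg,
    minus a small penalty for treating."""
    return (1.0 if 65 <= s_prime <= 100 else 0.0) - 0.1 * a

def expected_reward(s, n=100_000):
    """Monte Carlo estimate of E_pi(R | S=s): draw a ~ pi(.|s),
    then s' ~ p(.|a,s), evaluate R, and average."""
    total = 0.0
    for _ in range(n):
        a = 1 if random.random() < policy_prob(1, s) else 0
        s_prime = draw_next_bp(s, a)
        total += reward(s, a, s_prime)
    return total / n

random.seed(0)
value = expected_reward(55)  # average reward for a hypotensive patient
```

Hand me any policy_prob and I can report its average reward this way, which is exactly the "give me a policy, I'll tell you its value" step.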

Find the treatment policy that gives the most reward, on average. Estimate \pi^*, where

\pi^*=\arg\max_{\pi} E_{\pi} (R|S=s).

This is the “best” treatment policy, because it will give the highest reward on average. In other words, we are saying that the best policy, \pi^*, is the policy that gives the highest “expected” reward. Since I can tell you the expected reward for any policy, we can use optimization to flip the problem around and determine the policy that gives the best expected reward.
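For a small policy class, the \arg\max can be found by brute force: evaluate the expected reward of each candidate policy and keep the best. The sketch below searches over hypothetical policies that treat with a fixed probability q; the model and reward are the same illustrative assumptions as before.

```python
import random

def draw_next_bp(s, a):
    """Hypothetical p(S' | A=a, S=s): vasopressor raises MAP ~15 mmHg."""
    return random.gauss(s + 15 * a, 5)

def reward(s, a, s_prime):
    """Hypothetical R: 1 if final MAP in 65-100 mmHg, minus a treatment penalty."""
    return (1.0 if 65 <= s_prime <= 100 else 0.0) - 0.1 * a

def expected_reward(treat_prob, s, n=50_000):
    """Monte Carlo estimate of E_pi(R | S=s) for a policy that treats
    with probability treat_prob, regardless of s."""
    total = 0.0
    for _ in range(n):
        a = 1 if random.random() < treat_prob else 0
        total += reward(s, a, draw_next_bp(s, a))
    return total / n

# Brute-force arg-max over a small grid of candidate policies.
random.seed(0)
s = 55  # a hypotensive patient
best_q = max([0.0, 0.25, 0.5, 0.75, 1.0],
             key=lambda q: expected_reward(q, s))
```

Under these assumptions, treating the hypotensive patient is always worth the small side-effect penalty, so the search lands on the always-treat policy; richer policy classes (e.g., thresholds on s) are handled the same way, or with gradient-based optimization.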

Now you know the steps to use decision analysis to describe a medical decision and arrive at a policy that will output an optimal medical decision, on average. The expected utility framework has been criticized at times. However, when one considers utility as a function of random variables, it is so flexible that it can describe almost any decision-making process.

Hypotension was the example above, but decision analysis can be applied to many different problems. For example, for a patient with cellulitis, the initial state might represent the rash morphology (e.g., purulent vs. non-purulent), sepsis status, and the patient's baseline kidney function. The action might be whether to prescribe vancomycin or ceftriaxone. The outcomes might be whether the patient has resolution of the rash and any side effects of the chosen antibiotic. Finally, the reward might be a function describing how these outcomes impact the patient.

For more on using probability to describe medical decisions, see papers on dynamic treatment regimes (e.g., Murphy, Robins), reinforcement learning (e.g., Sutton), or decision theory/analysis (e.g., Dawid, Pauker).

  1. The average is just one choice (one might also consider the median reward, for example), but the average is convenient, because we know a lot about maximizing averages.
