Should chatbots make medical decisions?

I wrote a new manuscript on language models and medical decision-making! Below are some takeaways.

Overall, medical decision-making must start with optimizing patient utility. In practice, this requires balancing experience, knowledge, imitation, and evidence, where evidence can come from trials or from observational studies.
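
To make that concrete, here is one rough formalization of the treatment problem (the notation is mine, not necessarily the manuscript's): for a patient with features $x$, pick the treatment $a$ that maximizes expected utility under intervention,

$$
a^\star(x) = \arg\max_{a \in \mathcal{A}} \mathbb{E}\left[\, U \mid X = x,\ \operatorname{do}(A = a) \,\right].
$$

The $\operatorname{do}(\cdot)$ marks that what matters is the causal effect of actually giving the treatment; trials estimate exactly this quantity, while imitating conversation does not.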

Beware: chatting, which is mostly about imitating conversation, is fundamentally different from treating disease, which is mostly about optimizing patient utility. That said, chatbots could still do useful things in medicine if their engineers pursue them, such as helping providers search the literature or supporting patients. These uses, however, do not directly address the treatment problem.

Designing a system to solve the true treatment problem is complex (imitating medical notes will not work). The barriers are less technological than ethical: learning which treatment is best ultimately requires randomizing patients, and randomization on humans is tightly constrained for good reason. Observational data might offer a workaround, but drawing causal conclusions from observational data is an already-established research area within evidence-based medicine. We should investigate how language models might help here, but we cannot let the hype around chatbots, and around artificial intelligence more generally, overshadow other problems in evidence-based medicine, many of which are quite pressing.
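
As an illustration of the observational-data route, here is a minimal sketch of inverse propensity weighting, one standard technique from that literature. Everything here (the synthetic data, the variable names, the numbers) is my illustrative assumption, not the manuscript's method:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy observational dataset (all names illustrative): X = patient covariates,
# A = treatment received, Y = outcome, higher is better.
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 3))
p_treat = 1 / (1 + np.exp(-2 * X[:, 0]))            # treatment assignment depends on X[:, 0]
A = rng.binomial(1, p_treat)
Y = 0.5 * A + X[:, 0] + rng.normal(0, 0.5, size=n)  # X[:, 0] also drives the outcome: a confounder
                                                    # true treatment effect: +0.5

# Naive comparison is confounded: treated patients differ at baseline.
naive = Y[A == 1].mean() - Y[A == 0].mean()

# Inverse propensity weighting: model P(A = 1 | X), then reweight outcomes.
ps = LogisticRegression().fit(X, A).predict_proba(X)[:, 1]
ate_ipw = np.mean(A * Y / ps) - np.mean((1 - A) * Y / (1 - ps))

print(f"naive difference in means: {naive:.2f}")    # biased away from 0.5
print(f"IPW estimate of effect:    {ate_ipw:.2f}")  # close to 0.5
```

Even in this toy setting, the naive comparison is badly confounded, which is exactly why this corner of evidence-based medicine is a research area in its own right.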

For more detail, see my substack post and manuscript.
