Treatment Policy Learning in Multiobjective Settings with Fully Observed Outcomes

© 2020 Owner/Author. In several medical decision-making problems, such as antibiotic prescription, laboratory testing can provide precise indications for how a patient will respond to different treatment options. This enables us to "fully observe" all potential treatment outcomes, but whil...

Bibliographic Details
Main Authors: Boominathan, Soorajnath (Author), Oberst, Michael (Author), Zhou, Helen (Author), Kanjilal, Sanjat (Author), Sontag, David (Author)
Format: Article
Language: English
Published: ACM, 2021-11-08T16:46:37Z.
Online Access: https://hdl.handle.net/1721.1/137708
LEADER 02070 am a22002053u 4500
001 137708
042 |a dc 
100 1 0 |a Boominathan, Soorajnath  |e author 
700 1 0 |a Oberst, Michael  |e author 
700 1 0 |a Zhou, Helen  |e author 
700 1 0 |a Kanjilal, Sanjat  |e author 
700 1 0 |a Sontag, David  |e author 
245 0 0 |a Treatment Policy Learning in Multiobjective Settings with Fully Observed Outcomes 
260 |b ACM,   |c 2021-11-08T16:46:37Z. 
856 |z Get fulltext  |u https://hdl.handle.net/1721.1/137708 
520 |a © 2020 Owner/Author. In several medical decision-making problems, such as antibiotic prescription, laboratory testing can provide precise indications for how a patient will respond to different treatment options. This enables us to "fully observe" all potential treatment outcomes, but while present in historical data, these results are infeasible to produce in real-time at the point of the initial treatment decision. Moreover, treatment policies in these settings often need to trade off between multiple competing objectives, such as effectiveness of treatment and harmful side effects. We present, compare, and evaluate three approaches for learning individualized treatment policies in this setting: First, we consider two indirect approaches, which use predictive models of treatment response to construct policies optimal for different trade-offs between objectives. Second, we consider a direct approach that constructs such a set of policies without intermediate models of outcomes. Using a medical dataset of Urinary Tract Infection (UTI) patients, we show that all approaches learn policies that achieve strictly better performance on all outcomes than clinicians, while also trading off between different objectives. We demonstrate additional benefits of the direct approach, including flexibly incorporating other goals such as deferral to physicians on simple cases. 
546 |a en 
655 7 |a Article 
773 |t 10.1145/3394486.3403245 
773 |t Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining