Patient risk stratification with time-varying parameters: A multitask learning approach

The proliferation of electronic health records (EHRs) frames opportunities for using machine learning to build models that help healthcare providers improve patient outcomes. However, building useful risk stratification models presents many technical challenges including the large number of factors...

Bibliographic Details
Main Authors: Horvitz, Eric (Author), Wiens, Jenna Anne Marleau (Contributor), Guttag, John V (Contributor)
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory (Contributor), Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor)
Format: Article
Language: English
Published: JMLR, Inc., 2018-07-02T17:08:19Z.
Online Access: Get fulltext (http://hdl.handle.net/1721.1/116717)
LEADER 02430 am a22002053u 4500
001 116717
042 |a dc 
100 1 0 |a Horvitz, Eric  |e author 
100 1 0 |a Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory  |e contributor 
100 1 0 |a Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science  |e contributor 
100 1 0 |a Wiens, Jenna Anne Marleau  |e contributor 
100 1 0 |a Guttag, John V  |e contributor 
700 1 0 |a Wiens, Jenna Anne Marleau  |e author 
700 1 0 |a Guttag, John V  |e author 
245 0 0 |a Patient risk stratification with time-varying parameters: A multitask learning approach 
260 |b JMLR, Inc.,   |c 2018-07-02T17:08:19Z. 
856 |z Get fulltext  |u http://hdl.handle.net/1721.1/116717 
520 |a The proliferation of electronic health records (EHRs) frames opportunities for using machine learning to build models that help healthcare providers improve patient outcomes. However, building useful risk stratification models presents many technical challenges including the large number of factors (both intrinsic and extrinsic) influencing a patient's risk of an adverse outcome and the inherent evolution of that risk over time. We address these challenges in the context of learning a risk stratification model for predicting which patients are at risk of acquiring a Clostridium difficile infection (CDI). We take a novel data-centric approach, leveraging the contents of EHRs from nearly 50,000 hospital admissions. We show how, by adapting techniques from multitask learning, we can learn models for patient risk stratification with unprecedented classification performance. Our model, based on thousands of variables, both time-varying and time-invariant, changes over the course of a patient admission. On a held-out set of approximately 25,000 patient admissions, the model achieves an area under the receiver operating characteristic curve of 0.81 (95% CI 0.78-0.84). The model has been integrated into the health record system at a large hospital in the US, and can be used to produce daily risk estimates for each inpatient. While more complex than traditional risk stratification methods, the widespread development and use of such data-driven models could ultimately enable cost-effective, targeted prevention strategies that lead to better patient outcomes. 
655 7 |a Article 
773 |t Journal of Machine Learning Research
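
Note on the reported metric: the abstract gives an area under the receiver operating characteristic curve (AUROC) with a 95% confidence interval. This record does not say how that interval was computed. The sketch below is only an illustration of one common way to obtain such numbers, using a rank-based AUROC and a percentile bootstrap; the function names `auroc` and `bootstrap_ci`, the resample count, and the toy data are all assumptions made for this example and are not taken from the article.

```python
import random
from typing import List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 1/2)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the block of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    # Mann-Whitney U statistic, normalised by the number of positive/negative pairs.
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,   # assumed resample count, chosen for illustration
    alpha: float = 0.05,  # 0.05 gives a 95% interval
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for AUROC (an assumed method, not the article's)."""
    auroc(labels, scores)  # validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost one of the classes
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


if __name__ == "__main__":
    # Toy data only: 1 = adverse outcome, scores = hypothetical risk estimates.
    y = [0, 0, 1, 1, 0, 1, 0, 1, 0, 0]
    s = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9, 0.3, 0.6]
    print("AUROC:", auroc(y, s))
    print("95% CI:", bootstrap_ci(y, s))
```

With a held-out set as large as the one described in the abstract, the resampling loop would normally be vectorised or replaced by a library routine; the pure-Python version is kept here only so the example has no dependencies.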