Machine Learning to Predict Mortality and Critical Events in a Cohort of Patients With COVID-19 in New York City: Model Development and Validation
Background: COVID-19 has infected millions of people worldwide and is responsible for several hundred thousand fatalities. The COVID-19 pandemic has necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods to meet th...
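Main Authors: | Vaid, Akhil; Somani, Sulaiman; Russak, Adam J; De Freitas, Jessica K; Chaudhry, Fayzan F; Paranjpe, Ishan; Johnson, Kipp W; Lee, Samuel J; Miotto, Riccardo; Richter, Felix; Zhao, Shan; Beckmann, Noam D; Naik, Nidhi; Kia, Arash; Timsina, Prem; Lala, Anuradha; Paranjpe, Manish; Golden, Eddye; Danieletto, Matteo; Singh, Manbir; Meyer, Dara; O'Reilly, Paul F; Huckins, Laura; Kovatch, Patricia; Finkelstein, Joseph; Freeman, Robert M.; Argulian, Edgar; Kasarskis, Andrew; Percha, Bethany; Aberg, Judith A; Bagiella, Emilia; Horowitz, Carol R; Murphy, Barbara; Nestler, Eric J; Schadt, Eric E; Cho, Judy H; Cordon-Cardo, Carlos; Fuster, Valentin; Charney, Dennis S; Reich, David L; Bottinger, Erwin P; Levin, Matthew A; Narula, Jagat; Fayad, Zahi A; Just, Allan C; Charney, Alexander W; Nadkarni, Girish N; Glicksberg, Benjamin S |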
---|---|
Format: | Article |
Language: | English |
Published: | JMIR Publications, 2020-11-01 |
Series: | Journal of Medical Internet Research |
Online Access: | https://www.jmir.org/2020/11/e24018 |
id |
doaj-d26bd88c400844cfbae830c0f468bbfa |
---|---|
record_format |
Article |
spelling |
doaj-d26bd88c400844cfbae830c0f468bbfa; 2021-04-02T21:35:58Z; eng; JMIR Publications; Journal of Medical Internet Research; ISSN 1438-8871; 2020-11-01; volume 22, issue 11, e24018; DOI 10.2196/24018; https://www.jmir.org/2020/11/e24018 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Vaid, Akhil; Somani, Sulaiman; Russak, Adam J; De Freitas, Jessica K; Chaudhry, Fayzan F; Paranjpe, Ishan; Johnson, Kipp W; Lee, Samuel J; Miotto, Riccardo; Richter, Felix; Zhao, Shan; Beckmann, Noam D; Naik, Nidhi; Kia, Arash; Timsina, Prem; Lala, Anuradha; Paranjpe, Manish; Golden, Eddye; Danieletto, Matteo; Singh, Manbir; Meyer, Dara; O'Reilly, Paul F; Huckins, Laura; Kovatch, Patricia; Finkelstein, Joseph; Freeman, Robert M.; Argulian, Edgar; Kasarskis, Andrew; Percha, Bethany; Aberg, Judith A; Bagiella, Emilia; Horowitz, Carol R; Murphy, Barbara; Nestler, Eric J; Schadt, Eric E; Cho, Judy H; Cordon-Cardo, Carlos; Fuster, Valentin; Charney, Dennis S; Reich, David L; Bottinger, Erwin P; Levin, Matthew A; Narula, Jagat; Fayad, Zahi A; Just, Allan C; Charney, Alexander W; Nadkarni, Girish N; Glicksberg, Benjamin S |
author_sort |
Vaid, Akhil |
title |
Machine Learning to Predict Mortality and Critical Events in a Cohort of Patients With COVID-19 in New York City: Model Development and Validation |
publisher |
JMIR Publications |
series |
Journal of Medical Internet Research |
issn |
1438-8871 |
publishDate |
2020-11-01 |
description |
Background: COVID-19 has infected millions of people worldwide and is responsible for several hundred thousand fatalities. The COVID-19 pandemic has necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods to meet these needs are lacking.
Objective: The aims of this study were to analyze the electronic health records (EHRs) of patients who tested positive for COVID-19 and were admitted to hospitals in the Mount Sinai Health System in New York City; to develop machine learning models for making predictions about the hospital course of the patients over clinically meaningful time horizons based on patient characteristics at admission; and to assess the performance of these models at multiple hospitals and time points.
Methods: We used Extreme Gradient Boosting (XGBoost) and baseline comparator models to predict in-hospital mortality and critical events at time windows of 3, 5, 7, and 10 days from admission. Our study population included harmonized EHR data from five hospitals in New York City for 4098 COVID-19–positive patients admitted from March 15 to May 22, 2020. The models were first trained on patients from a single hospital (n=1514) before or on May 1, externally validated on patients from four other hospitals (n=2201) before or on May 1, and prospectively validated on all patients after May 1 (n=383). Finally, we established model interpretability to identify and rank variables that drive model predictions.
Results: Upon cross-validation, the XGBoost classifier outperformed baseline models, with an area under the receiver operating characteristic curve (AUC-ROC) for mortality of 0.89 at 3 days, 0.85 at 5 and 7 days, and 0.84 at 10 days. XGBoost also performed well for critical event prediction, with an AUC-ROC of 0.80 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. In external validation, XGBoost achieved an AUC-ROC of 0.88 at 3 days, 0.86 at 5 days, 0.86 at 7 days, and 0.84 at 10 days for mortality prediction. Similarly, the unimputed XGBoost model achieved an AUC-ROC of 0.78 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. Trends in performance on prospective validation sets were similar. At 7 days, acute kidney injury on admission, elevated lactate dehydrogenase (LDH), tachypnea, and hyperglycemia were the strongest drivers of critical event prediction, while higher age, anion gap, and C-reactive protein were the strongest drivers of mortality prediction.
Conclusions: We externally and prospectively trained and validated machine learning models for mortality and critical events for patients with COVID-19 at different time horizons. These models identified at-risk patients and uncovered underlying relationships that predicted outcomes. |
url |
https://www.jmir.org/2020/11/e24018 |
_version_ |
1721545108237582336 |
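
The Methods summary in this record describes a three-way cohort split (single-hospital training, four-hospital external validation, and prospective validation after May 1, 2020) and reports performance as AUC-ROC. The following is a minimal illustrative sketch of that split rule and of a rank-based AUC-ROC calculation, not the authors' code. The hospital label `hospital_A`, the function names, and the toy scores are hypothetical; only the May 1 cutoff and the cohort sizes come from the abstract.

```python
from datetime import date

# From the abstract: training and external validation use admissions
# "before or on May 1"; prospective validation uses admissions after May 1.
CUTOFF = date(2020, 5, 1)

# Hypothetical label for the single training hospital (not named in the record).
TRAINING_HOSPITAL = "hospital_A"


def assign_split(hospital: str, admitted: date) -> str:
    """Assign one admission to 'train', 'external', or 'prospective'."""
    if admitted > CUTOFF:
        return "prospective"      # all hospitals, after May 1
    if hospital == TRAINING_HOSPITAL:
        return "train"            # single hospital, on or before May 1
    return "external"             # the other hospitals, on or before May 1


def auc_roc(labels, scores):
    """Rank-based AUC-ROC: the probability that a randomly chosen positive
    case scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# The three cohort sizes reported in the abstract account for every patient.
assert 1514 + 2201 + 383 == 4098

if __name__ == "__main__":
    print(assign_split("hospital_A", date(2020, 4, 10)))
    print(assign_split("hospital_B", date(2020, 5, 1)))
    print(assign_split("hospital_A", date(2020, 5, 2)))
    # Toy labels and scores, invented purely for illustration.
    print(auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise AUC is quadratic in cohort size and is used here only because it makes the definition explicit; a library routine would be the practical choice for a cohort of several thousand patients.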