Leveraging electronic health records data to predict multiple sclerosis disease activity

Abstract. Objective: No relapse risk prediction tool is currently available to guide treatment selection for multiple sclerosis (MS). Leveraging electronic health record (EHR) data readily available at the point of care, we developed a clinical tool for predicting MS relapse risk. Methods: Using data f...

Bibliographic Details
Main Authors: Yuri Ahuja, Nicole Kim, Liang Liang, Tianrun Cai, Kumar Dahal, Thany Seyok, Chen Lin, Sean Finan, Katherine Liao, Guergana Savova, Tanuja Chitnis, Tianxi Cai, Zongqi Xia
Format: Article
Language: English
Published: Wiley 2021-04-01
Series: Annals of Clinical and Translational Neurology
Online Access: https://doi.org/10.1002/acn3.51324
id doaj-b8d3ddfae31444d8848d2b7bb21c45cb
record_format Article
spelling doaj-b8d3ddfae31444d8848d2b7bb21c45cb
2021-08-09T12:00:31Z
eng
Wiley
Annals of Clinical and Translational Neurology
2328-9503
2021-04-01
Volume 8, issue 4, pages 800-810
10.1002/acn3.51324
Leveraging electronic health records data to predict multiple sclerosis disease activity
Authors and affiliations (indexed in record order):
0 Yuri Ahuja: Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
1 Nicole Kim: Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
2 Liang Liang: Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
3 Tianrun Cai: Division of Rheumatology, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
4 Kumar Dahal: Division of Rheumatology, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
5 Thany Seyok: Division of Rheumatology, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
6 Chen Lin: Clinical Natural Language Processing Program, Boston Children’s Hospital, Boston, MA, USA
7 Sean Finan: Clinical Natural Language Processing Program, Boston Children’s Hospital, Boston, MA, USA
8 Katherine Liao: Division of Rheumatology, Department of Medicine, Brigham and Women’s Hospital, Boston, MA, USA
9 Guergana Savova: Clinical Natural Language Processing Program, Boston Children’s Hospital, Boston, MA, USA
10 Tanuja Chitnis: Department of Neurology, Brigham and Women’s Hospital, Boston, MA, USA
11 Tianxi Cai: Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
12 Zongqi Xia: Department of Neurology and Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Abstract. Objective: No relapse risk prediction tool is currently available to guide treatment selection for multiple sclerosis (MS). Leveraging electronic health record (EHR) data readily available at the point of care, we developed a clinical tool for predicting MS relapse risk. Methods: Using data from a clinic‐based research registry and linked EHR system between 2006 and 2016, we developed models predicting relapse events from the registry in a training set (n = 1435) and tested model performance in an independent validation set of MS patients (n = 186). This iterative process identified prior 1‐year relapse history as a key predictor of future relapse, but ascertaining relapse history through labor‐intensive chart review is impractical. We pursued two‐stage algorithm development: (1) L1‐regularized logistic regression (LASSO) to phenotype past 1‐year relapse status from contemporaneous EHR data, and (2) LASSO to predict future 1‐year relapse risk using imputed prior 1‐year relapse status and other algorithm‐selected features. Results: The final model, comprising age, disease duration, and imputed prior 1‐year relapse history, achieved a predictive AUC and F score of 0.707 and 0.307, respectively. The performance was significantly better than the baseline model (age, sex, race/ethnicity, and disease duration) and noninferior to a model containing actual prior 1‐year relapse history. The predicted risk probability declined with disease duration and age. Conclusion: Our novel machine‐learning algorithm predicts 1‐year MS relapse with accuracy comparable to other clinical prediction tools and has applicability at the point of care. This EHR‐based two‐stage approach of outcome prediction may be applicable to neurological diseases beyond MS.
https://doi.org/10.1002/acn3.51324
collection DOAJ
language English
format Article
sources DOAJ
author Yuri Ahuja
Nicole Kim
Liang Liang
Tianrun Cai
Kumar Dahal
Thany Seyok
Chen Lin
Sean Finan
Katherine Liao
Guergana Savova
Tanuja Chitnis
Tianxi Cai
Zongqi Xia
author_sort Yuri Ahuja
title Leveraging electronic health records data to predict multiple sclerosis disease activity
publisher Wiley
series Annals of Clinical and Translational Neurology
issn 2328-9503
publishDate 2021-04-01
description Abstract. Objective: No relapse risk prediction tool is currently available to guide treatment selection for multiple sclerosis (MS). Leveraging electronic health record (EHR) data readily available at the point of care, we developed a clinical tool for predicting MS relapse risk. Methods: Using data from a clinic‐based research registry and linked EHR system between 2006 and 2016, we developed models predicting relapse events from the registry in a training set (n = 1435) and tested model performance in an independent validation set of MS patients (n = 186). This iterative process identified prior 1‐year relapse history as a key predictor of future relapse, but ascertaining relapse history through labor‐intensive chart review is impractical. We pursued two‐stage algorithm development: (1) L1‐regularized logistic regression (LASSO) to phenotype past 1‐year relapse status from contemporaneous EHR data, and (2) LASSO to predict future 1‐year relapse risk using imputed prior 1‐year relapse status and other algorithm‐selected features. Results: The final model, comprising age, disease duration, and imputed prior 1‐year relapse history, achieved a predictive AUC and F score of 0.707 and 0.307, respectively. The performance was significantly better than the baseline model (age, sex, race/ethnicity, and disease duration) and noninferior to a model containing actual prior 1‐year relapse history. The predicted risk probability declined with disease duration and age. Conclusion: Our novel machine‐learning algorithm predicts 1‐year MS relapse with accuracy comparable to other clinical prediction tools and has applicability at the point of care. This EHR‐based two‐stage approach of outcome prediction may be applicable to neurological diseases beyond MS.
url https://doi.org/10.1002/acn3.51324
_version_ 1721214054371950592
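
Illustrative sketch (not part of the catalog record): the two‐stage approach summarized in the abstract can be outlined roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. All data are synthetic and standardized, the three EHR features are hypothetical stand-ins, and the names `fit_l1_logistic`, `X_ehr`, `age_z`, and `dur_z` are invented for illustration. A simple proximal gradient solver is used in place of whatever LASSO software the authors used, and the penalty `lam` is fixed rather than tuned.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def predict_proba(X, w, b):
    """Predicted probability of the positive class for each row of X."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) for row in X]


def fit_l1_logistic(X, y, lam=0.02, lr=0.2, n_iter=300):
    """L1-penalized logistic regression via proximal gradient descent.

    The intercept b is left unpenalized; lam, lr and n_iter are
    illustrative defaults, not values from the paper.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        errs = [pi - yi for pi, yi in zip(predict_proba(X, w, b), y)]
        grad_b = sum(errs) / n
        grad_w = [sum(e * row[j] for e, row in zip(errs, X)) / n for j in range(p)]
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Synthetic cohort (assumption: nothing here comes from the registry or EHR).
random.seed(0)
n = 300
prior_relapse = [1 if random.random() < 0.3 else 0 for _ in range(n)]
# Hypothetical EHR features: two that track prior relapse, one pure noise.
X_ehr = [[random.gauss(1.5 * r, 1.0), random.gauss(1.0 * r, 1.0), random.gauss(0.0, 1.0)]
         for r in prior_relapse]
age_z = [random.gauss(0.0, 1.0) for _ in range(n)]   # standardized age
dur_z = [random.gauss(0.0, 1.0) for _ in range(n)]   # standardized disease duration
# Assumed outcome model: risk rises with prior relapse, falls with age and duration.
future_relapse = [
    1 if random.random() < sigmoid(-1.0 + 1.5 * r - 0.5 * a - 0.5 * d) else 0
    for r, a, d in zip(prior_relapse, age_z, dur_z)
]

# Stage 1: phenotype prior 1-year relapse status from EHR features.
w1, b1 = fit_l1_logistic(X_ehr, prior_relapse)
imputed_prior = predict_proba(X_ehr, w1, b1)

# Stage 2: predict future 1-year relapse from age, duration, and the imputed status.
X_stage2 = [[a, d, p] for a, d, p in zip(age_z, dur_z, imputed_prior)]
w2, b2 = fit_l1_logistic(X_stage2, future_relapse)
risk = predict_proba(X_stage2, w2, b2)

print("stage 1 weights:", w1)
print("stage 2 weights (age, duration, imputed prior relapse):", w2)
```

The point of the design is that stage 2 never sees the chart-reviewed relapse label at prediction time; it only uses the stage 1 probability, which can be computed from EHR data alone. A real analysis would fit both stages on a training set, choose `lam` by cross-validation, and report AUC on a held-out validation set.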