Centralized and distributed learning methods for predictive health analytics

The U.S. health care system is considered costly and highly inefficient, devoting substantial resources to the treatment of acute conditions in a hospital setting rather than focusing on prevention and keeping patients out of the hospital. The potential for cost savings is large; in the U.S. more th...

Bibliographic Details
Main Author:	Brisimi, Theodora
Language:	en_US
Published:	2018
Subjects:	Computer science; Centralized and distributed methods; Data analytics; Diabetes hospitalizations; Heart hospitalizations; Machine learning; Predictive health analytics
Online Access:	https://hdl.handle.net/2144/27007

id	ndltd-bu.edu-oai-open.bu.edu-2144-27007
record_format	oai_dc
spelling	ndltd-bu.edu-oai-open.bu.edu-2144-27007 2019-04-02T06:54:49Z Centralized and distributed learning methods for predictive health analytics Brisimi, Theodora Computer science Centralized and distributed methods Data analytics Diabetes hospitalizations Heart hospitalizations Machine learning Predictive health analytics 2018-02-13T16:27:28Z 2018-02-13T16:27:28Z 2017 2017-11-02T22:14:40Z Thesis/Dissertation https://hdl.handle.net/2144/27007 en_US Attribution 4.0 International http://creativecommons.org/licenses/by/4.0/
collection	NDLTD
language	en_US
sources	NDLTD
topic	Computer science; Centralized and distributed methods; Data analytics; Diabetes hospitalizations; Heart hospitalizations; Machine learning; Predictive health analytics
description	The U.S. health care system is considered costly and highly inefficient, devoting substantial resources to the treatment of acute conditions in a hospital setting rather than focusing on prevention and keeping patients out of the hospital. The potential for cost savings is large; in the U.S., more than $30 billion is spent each year on hospitalizations deemed preventable, 31% of which is attributed to heart diseases and 20% to diabetes. Motivated by this, our work focuses on developing centralized and distributed learning methods to predict future heart- or diabetes-related hospitalizations based on patient Electronic Health Records (EHRs). We explore a variety of supervised classification methods and present a novel likelihood-ratio-based method (K-LRT) that predicts hospitalizations and offers interpretability by identifying the K most significant features that lead to a positive prediction for each patient. Next, assuming that the positive class consists of multiple clusters (patients hospitalized for different reasons), while the negative class is drawn from a single cluster (non-hospitalized patients healthy in every aspect), we present an alternating optimization approach that jointly discovers the clusters in the positive class and optimizes the classifiers that separate each positive cluster from the negative samples. We establish the convergence of the method and characterize its VC dimension. Last, we develop a decentralized cluster Primal-Dual Splitting (cPDS) method for large-scale problems that is computationally efficient and privacy-aware. Such a distributed learning scheme is relevant for multi-institutional collaborations or peer-to-peer applications, allowing the agents to collaborate while keeping every participant's data private. cPDS is proved to have an improved convergence rate compared to existing centralized and decentralized methods. We test all methods on real EHR data from the Boston Medical Center and compare results in terms of prediction accuracy and interpretability.
author	Brisimi, Theodora
title	Centralized and distributed learning methods for predictive health analytics
publishDate	2018
url	https://hdl.handle.net/2144/27007

Similar Items