Maintaining proper health records improves machine learning predictions for novel 2019-nCoV

Abstract Background An ongoing outbreak of a novel coronavirus (2019-nCoV) pneumonia continues to affect the whole world including major countries such as China, USA, Italy, France and the United Kingdom. We present outcome (‘recovered’, ‘isolated’ or ‘death’) risk estimates of 2019-nCoV over ‘early...

Bibliographic Details
Main Authors: Koffka Khan, Emilie Ramsahai
Format: Article
Language: English
Published: BMC 2021-05-01
Series: BMC Medical Informatics and Decision Making
Subjects: 2019-nCoV; Pneumonia; Machine learning; AdaBoost; Bagging; Classifiers
Online Access: https://doi.org/10.1186/s12911-021-01537-3
id doaj-2c6b742a098a46b79bbb6eea7291316f
record_format Article
spelling doaj-2c6b742a098a46b79bbb6eea7291316f | 2021-05-30T11:44:11Z | eng | BMC | BMC Medical Informatics and Decision Making | 1472-6947 | 2021-05-01 | 21 | 1 | 1 | 13 | 10.1186/s12911-021-01537-3 | Koffka Khan [0] | Emilie Ramsahai [1] | [0] Department of Computing and Information Technology, The University of the West Indies | [1] UWI School of Business & Applied Studies Ltd (UWI-ROYTEC)
collection DOAJ
language English
format Article
sources DOAJ
author Koffka Khan
Emilie Ramsahai
author_sort Koffka Khan
title Maintaining proper health records improves machine learning predictions for novel 2019-nCoV
title_sort maintaining proper health records improves machine learning predictions for novel 2019-ncov
publisher BMC
series BMC Medical Informatics and Decision Making
issn 1472-6947
publishDate 2021-05-01
description Abstract. Background: An ongoing outbreak of a novel coronavirus (2019-nCoV) pneumonia continues to affect the whole world, including major countries such as China, the USA, Italy, France and the United Kingdom. We present outcome (‘recovered’, ‘isolated’ or ‘death’) risk estimates of 2019-nCoV over ‘early’ datasets. A major consideration is the likelihood of death for patients with 2019-nCoV. Method: Accounting for the impact of variations in the reporting rate of 2019-nCoV, we used machine learning techniques (AdaBoost, bagging, extra-trees, decision trees and k-nearest neighbour classifiers) on two 2019-nCoV datasets obtained from Kaggle on March 30, 2020. We used ‘country’, ‘age’ and ‘gender’ as features to predict the outcome for both datasets. We also included the patient’s ‘disease’ history (present only in the second dataset) to predict the outcome for the second dataset. Results: The use of a patient’s ‘disease’ history improves the prediction of ‘death’ by more than sevenfold. The models ignoring a patient’s ‘disease’ history performed poorly in test predictions. Conclusion: Our findings indicate the potential of using a patient’s ‘disease’ history as part of the feature set in machine learning techniques to improve 2019-nCoV predictions. This development can have a positive effect on predictive patient treatment and can help ease currently overburdened healthcare systems worldwide, especially with the increasing prevalence of second and third wave re-infections in some countries.
topic 2019-nCoV
Pneumonia
Machine learning
AdaBoost
Bagging
Classifiers
url https://doi.org/10.1186/s12911-021-01537-3