Performance improvement of machine learning techniques predicting the association of exacerbation of peak expiratory flow ratio with short term exposure level to indoor air quality using adult asthmatics clustered data.

Large-scale data sources, remote sensing technologies, and superior computing power have greatly benefited environmental health research. Recently, various machine-learning algorithms have been introduced to provide mechanistic insights into the heterogeneity of clustered data pertaining to the sy...


Bibliographic Details
Main Authors: Wan D Bae, Sungroul Kim, Choon-Sik Park, Shayma Alkobaisi, Jongwon Lee, Wonseok Seo, Jong Sook Park, Sujung Park, Sangwoon Lee, Jong Wook Lee
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2021-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0244233
id doaj-601e9822b0cc4c1385fe25f8bdb56bb2
record_format Article
spelling doaj-601e9822b0cc4c1385fe25f8bdb56bb2 | 2021-04-27T04:30:29Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2021-01-01 | 16(1): e0244233 | 10.1371/journal.pone.0244233
collection DOAJ
language English
format Article
sources DOAJ
author Wan D Bae
Sungroul Kim
Choon-Sik Park
Shayma Alkobaisi
Jongwon Lee
Wonseok Seo
Jong Sook Park
Sujung Park
Sangwoon Lee
Jong Wook Lee
author_sort Wan D Bae
title Performance improvement of machine learning techniques predicting the association of exacerbation of peak expiratory flow ratio with short term exposure level to indoor air quality using adult asthmatics clustered data.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2021-01-01
description Large-scale data sources, remote sensing technologies, and superior computing power have greatly benefited environmental health research. Recently, various machine-learning algorithms have been introduced to provide mechanistic insights into the heterogeneity of clustered data pertaining to the symptoms of each asthma patient and potential environmental risk factors. However, there is limited information on the performance of these machine-learning tools. In this study, we compared the performance of ten machine-learning techniques. Using an advanced method of imbalanced sampling (IS), we improved the performance of nine conventional machine-learning techniques in predicting the association between exposure level to indoor air quality and change in patients' peak expiratory flow rate (PEFR). We then proposed a deep learning method of transfer learning (TL) for further improvement in prediction accuracy. Our selected final prediction techniques (TL1_IS or TL2_IS) achieved a median (interquartile range) balanced accuracy of 66% (56-76%) for TL1_IS and 68% (63-78%) for TL2_IS. Precision was 68% (62-72%) for TL1_IS and 66% (62-69%) for TL2_IS, while sensitivity was 58% (50-67%) and 59% (51-80%), respectively, across 25 patients. Compared with NN_IS, these results correspond to an improvement of approximately 1.08 times (accuracy, precision) to 1.28 times (sensitivity). Our results indicate that transfer learning with imbalanced sampling is a powerful tool for predicting the change in PEFR due to exposure to indoor air, including the concentrations of particulate matter of 2.5 μm and carbon dioxide. This modeling technique is applicable even to small or imbalanced datasets, which represent a personalized, real-world setting.
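The description reports balanced accuracy, precision, and sensitivity as a median with interquartile range across patients, after rebalancing the classes. The sketch below shows one way such figures could be computed. It is an illustration only: this record does not say which imbalanced-sampling method the authors used, so plain random oversampling stands in for it, and every function name and all data here are hypothetical.

```python
import random
from statistics import quantiles


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def metrics(y_true, y_pred):
    """Balanced accuracy, precision, and sensitivity for one patient's predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return {
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "precision": precision,
        "sensitivity": sensitivity,
    }


def random_oversample(rows, labels, seed=0):
    """Assumed stand-in for IS: duplicate minority rows until classes are equal."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def median_iqr(values):
    """Return (median, (q1, q3)) using inclusive quartiles."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)


# Made-up labels for a single hypothetical patient (1 = PEFR exacerbation).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
print(metrics(y_true, y_pred))
```

In a per-patient evaluation, `metrics` would be called once per patient and `median_iqr` applied to the resulting list of scores; oversampling would be applied to training data only, so that test-set class proportions stay untouched.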
url https://doi.org/10.1371/journal.pone.0244233