Proposition of New Ensemble Data-Intelligence Models for Surface Water Quality Prediction
An accurate prediction of water quality (WQ) parameters is considered a pivotal decision-making tool in sustainable water resources management. In this study, five different ensemble machine learning (ML) models, including Quantile Regression Forest (QRF), Random Forest (RF), radial support vector...
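Main Authors: | Ali Omran Al-Sulttani, Mustafa Al-Mukhtar, Ali B. Roomi, Aitazaz Ahsan Farooque, Khaled Mohamed Khedher, Zaher Mundher Yaseen |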
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Access |
Subjects: | Semi-arid region; river water quality; biochemical oxygen demand; principal component analysis |
Online Access: | https://ieeexplore.ieee.org/document/9497111/ |
id |
doaj-c3411a181de944a5a5b557c8922024e7 |
---|---|
record_format |
Article |
spelling |
doaj-c3411a181de944a5a5b557c8922024e7; 2021-08-09T23:00:58Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2021-01-01; vol. 9, pp. 108527-108541; DOI 10.1109/ACCESS.2021.3100490; document 9497111
Proposition of New Ensemble Data-Intelligence Models for Surface Water Quality Prediction
Authors (index, ORCID): Ali Omran Al-Sulttani (0, https://orcid.org/0000-0001-8734-5287); Mustafa Al-Mukhtar (1, https://orcid.org/0000-0002-8850-0899); Ali B. Roomi (2, https://orcid.org/0000-0002-5107-5550); Aitazaz Ahsan Farooque (3, https://orcid.org/0000-0002-5353-6752); Khaled Mohamed Khedher (4, https://orcid.org/0000-0002-4167-1690); Zaher Mundher Yaseen (5, https://orcid.org/0000-0003-3647-7137)
Affiliations, in source order: Department of Water Resources Engineering, College of Engineering, University of Baghdad, Baghdad, Iraq; Civil Engineering Department, University of Technology, Baghdad, Iraq; Ministry of Education, Directorate of Education Thi-Qar, Thi-Qar, Iraq; Faculty of Sustainable Design Engineering, University of Prince Edward Island, Charlottetown, Canada; Department of Civil Engineering, College of Engineering, King Khalid University, Abha, Saudi Arabia; New Era and Development in Civil Engineering Research Group, Scientific Research Center, Al-Ayen University, Thi-Qar, Iraq
https://ieeexplore.ieee.org/document/9497111/
Keywords: Semi-arid region; river water quality; biochemical oxygen demand; principal component analysis |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Ali Omran Al-Sulttani, Mustafa Al-Mukhtar, Ali B. Roomi, Aitazaz Ahsan Farooque, Khaled Mohamed Khedher, Zaher Mundher Yaseen |
title |
Proposition of New Ensemble Data-Intelligence Models for Surface Water Quality Prediction |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2021-01-01 |
description |
An accurate prediction of water quality (WQ) parameters is considered a pivotal decision-making tool in sustainable water resources management. In this study, five different ensemble machine learning (ML) models, including Quantile Regression Forest (QRF), Random Forest (RF), radial support vector machine (SVM), Stochastic Gradient Boosting (GBM), and Gradient Boosting Machines (GBM_H2O), were developed to predict the monthly biochemical oxygen demand (BOD) of the Euphrates River, Iraq. To this end, monthly average data of water temperature (T), turbidity, pH, electrical conductivity (EC), alkalinity (Alk), calcium (Ca), chemical oxygen demand (COD), sulfate (SO<sub>4</sub>), total dissolved solids (TDS), total suspended solids (TSS), and BOD, measured over a ten-year period, were used. The performance of these standalone models was compared with that of integrative models developed by coupling the applied ML models with two different feature extraction algorithms, i.e., Genetic Algorithm (GA) and Principal Component Analysis (PCA). The reliability of the applied models was evaluated using the statistical performance criteria of determination coefficient (R<sup>2</sup>), root mean square error (RMSE), mean absolute error (MAE), Nash-Sutcliffe model efficiency coefficient (NSE), Willmott index (d), and percent bias (PBIAS). Among the developed models, the QRF model attained the best performance. The integrative PCA-QRF model performed much better than the standalone models and those integrated with GA, with R<sup>2</sup>, RMSE, MAE, NSE, d, and PBIAS values of 0.94, 0.12, 0.05, 0.93, 0.98, and 0.3, respectively. |
topic |
Semi-arid region; river water quality; biochemical oxygen demand; principal component analysis |
url |
https://ieeexplore.ieee.org/document/9497111/ |
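
The description field of this record names six evaluation criteria (R<sup>2</sup>, RMSE, MAE, NSE, d, and PBIAS) without defining them. The sketch below shows one common way to compute them for paired observed and predicted values. It is an illustration under stated assumptions, not the authors' code: the function name `evaluate` and the sample values are hypothetical, R<sup>2</sup> is taken as the squared Pearson correlation, and PBIAS uses the sign convention 100 * sum(observed - predicted) / sum(observed). The record does not say which conventions the paper used.

```python
import math


def evaluate(obs, sim):
    """Return R2, RMSE, MAE, NSE, Willmott d and PBIAS for paired series.

    Assumptions (not stated in the record): R2 is the squared Pearson
    correlation; PBIAS = 100 * sum(obs - sim) / sum(obs).
    """
    if len(obs) != len(sim) or len(obs) < 2:
        raise ValueError("obs and sim must have equal length of at least 2")
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n
    pairs = list(zip(obs, sim))

    sq_err = sum((o - s) ** 2 for o, s in pairs)    # sum of squared errors
    abs_err = sum(abs(o - s) for o, s in pairs)     # sum of absolute errors
    var_o = sum((o - mean_o) ** 2 for o in obs)     # spread of observations
    var_s = sum((s - mean_s) ** 2 for s in sim)     # spread of predictions
    if var_o == 0 or var_s == 0:
        raise ValueError("constant series: R2, NSE and d are undefined")
    cov = sum((o - mean_o) * (s - mean_s) for o, s in pairs)
    # Willmott's "potential error" denominator
    potential = sum((abs(s - mean_o) + abs(o - mean_o)) ** 2 for o, s in pairs)

    return {
        "R2": cov ** 2 / (var_o * var_s),
        "RMSE": math.sqrt(sq_err / n),
        "MAE": abs_err / n,
        "NSE": 1 - sq_err / var_o,
        "d": 1 - sq_err / potential,
        "PBIAS": 100 * sum(o - s for o, s in pairs) / sum(obs),
    }


if __name__ == "__main__":
    # Hypothetical monthly BOD values (mg/L), invented for illustration only.
    observed = [2.1, 2.4, 1.9, 2.8, 3.0, 2.6]
    predicted = [2.0, 2.5, 2.0, 2.7, 3.1, 2.4]
    for name, value in evaluate(observed, predicted).items():
        print(f"{name}: {value:.3f}")
```

Under these conventions a perfect prediction gives R2 = NSE = d = 1 and RMSE = MAE = PBIAS = 0, which is the reference point for reading the reported PCA-QRF values.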