The Comparison of LightGBM and XGBoost Coupling Factor Analysis and Prediagnosis of Acute Liver Failure
This paper compares the dimensionality reduction effect of LightGBM with that of XGBoost-FA. Unlike XGBoost, LightGBM has dimensionality reduction built in through both the Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) algorithms, while XGBo...
Main Authors: | Dongyang Zhang, Yicheng Gong |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2020-01-01 |
Series: | IEEE Access |
Subjects: | LightGBM, XGBoost, factor analysis, prediagnosis, acute liver failure |
Online Access: | https://ieeexplore.ieee.org/document/9284642/ |
id |
doaj-d1f4250375794cadb8a29873aa4fed81 |
---|---|
record_format |
Article |
spelling |
doaj-d1f4250375794cadb8a29873aa4fed81; 2021-03-30T04:46:08Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8, pp. 220990-221003; DOI 10.1109/ACCESS.2020.3042848; document 9284642; The Comparison of LightGBM and XGBoost Coupling Factor Analysis and Prediagnosis of Acute Liver Failure; Dongyang Zhang (https://orcid.org/0000-0002-9546-307X) and Yicheng Gong, both of the Department of Mathematics and Statistics, Science College, Wuhan University of Science and Technology, Wuhan, China; https://ieeexplore.ieee.org/document/9284642/; LightGBM, XGBoost, factor analysis, prediagnosis, acute liver failure |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Dongyang Zhang Yicheng Gong |
author_sort |
Dongyang Zhang |
title |
The Comparison of LightGBM and XGBoost Coupling Factor Analysis and Prediagnosis of Acute Liver Failure |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2020-01-01 |
description |
This paper compares the dimensionality reduction effect of LightGBM with that of XGBoost-FA. Unlike XGBoost, LightGBM has dimensionality reduction built in through two algorithms, Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB), whereas XGBoost may obtain a similar effect when coupled with a traditional dimensionality reduction tool, Factor Analysis (XGBoost-FA). For the empirical comparison, the prediagnosis dataset from the 2018 Kaggle competition Acute Liver Failure was chosen as the research object, and pairwise comparisons were conducted among XGBoost, LightGBM, XGBoost-FA and LightGBM-FA. On the test set, the vectors (accuracy, log loss, training time) of these four prediagnostic models are (0.75014, 0.569707, 10.5 s), (0.75811, 0.576059, 15.1 s), (0.67786, 0.663924, 5.7 s) and (0.67274, 0.676019, 4.1 s), respectively. The training time of XGBoost-FA (external dimensionality reduction) is therefore shorter than that of LightGBM (built-in dimensionality reduction). An algorithm published on Kaggle (abbreviated as K2a), with (accuracy, training time) of (0.82, 3.1 s), is better than both XGBoost-FA and LightGBM in training time and accuracy. However, K2a removes more than 50% of the samples (those with missing values) and only performs binary classification. For multi-class classification, or for data with a large number of missing values, XGBoost-FA is recommended when shorter running time is required, while LightGBM is preferred when higher predictive accuracy is required. With XGBoost-FA or LightGBM employed in AI medical services, doctors can be more productive in diagnosis and treatment thanks to more data support and a lighter workload. The two complement each other. |
topic |
LightGBM XGBoost factor analysis prediagnosis acute liver failure |
url |
https://ieeexplore.ieee.org/document/9284642/ |
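
The trade-off stated in the description (XGBoost-FA for speed, LightGBM for accuracy) can be checked directly against the reported test-set figures. The sketch below is only an illustration: it assumes the four result tuples map to the models in the order the abstract lists them, and the names `Result`, `REPORTED`, `best_by` and `speedup` are hypothetical helpers, not part of the paper or of any library.

```python
from typing import NamedTuple


class Result(NamedTuple):
    """One (accuracy, log loss, training time) tuple from the abstract."""
    accuracy: float
    log_loss: float
    train_seconds: float


# Test-set figures copied from the abstract. Assumption: the tuples follow
# the order in which the abstract names the four models.
REPORTED = {
    "XGBoost": Result(0.75014, 0.569707, 10.5),
    "LightGBM": Result(0.75811, 0.576059, 15.1),
    "XGBoost-FA": Result(0.67786, 0.663924, 5.7),
    "LightGBM-FA": Result(0.67274, 0.676019, 4.1),
}


def best_by(metric: str, higher_is_better: bool) -> str:
    """Return the model name with the best reported value for `metric`."""
    pick = max if higher_is_better else min
    return pick(REPORTED, key=lambda name: getattr(REPORTED[name], metric))


def speedup(slow: str, fast: str) -> float:
    """Ratio of reported training times, slow model over fast model."""
    return REPORTED[slow].train_seconds / REPORTED[fast].train_seconds


if __name__ == "__main__":
    print("highest accuracy:", best_by("accuracy", higher_is_better=True))
    print("lowest log loss:", best_by("log_loss", higher_is_better=False))
    print("shortest training:", best_by("train_seconds", higher_is_better=False))
    print("LightGBM / XGBoost-FA time ratio:",
          round(speedup("LightGBM", "XGBoost-FA"), 2))
```

Under that ordering assumption, LightGBM has the highest reported accuracy, plain XGBoost the lowest log loss, and LightGBM-FA the shortest training time. XGBoost-FA trains faster than LightGBM at the cost of roughly eight percentage points of accuracy, which matches the recommendation in the description.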