The Comparison of LightGBM and XGBoost Coupling Factor Analysis and Prediagnosis of Acute Liver Failure

This paper compares the dimensionality reduction effect of LightGBM and XGBoost-FA. Unlike XGBoost, LightGBM has dimensionality reduction built in through its Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) algorithms, while XGBo...

Bibliographic Details
Main Authors: Dongyang Zhang, Yicheng Gong
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: LightGBM; XGBoost; factor analysis; prediagnosis; acute liver failure
Online Access: https://ieeexplore.ieee.org/document/9284642/
id doaj-d1f4250375794cadb8a29873aa4fed81
record_format Article
spelling doaj-d1f4250375794cadb8a29873aa4fed81
2021-03-30T04:46:08Z
eng
IEEE
IEEE Access, ISSN 2169-3536, 2020-01-01, vol. 8, pp. 220990-221003
DOI 10.1109/ACCESS.2020.3042848, document 9284642
The Comparison of LightGBM and XGBoost Coupling Factor Analysis and Prediagnosis of Acute Liver Failure
Dongyang Zhang (https://orcid.org/0000-0002-9546-307X), Department of Mathematics and Statistics, Science College, Wuhan University of Science and Technology, Wuhan, China
Yicheng Gong, Department of Mathematics and Statistics, Science College, Wuhan University of Science and Technology, Wuhan, China
collection DOAJ
language English
format Article
sources DOAJ
author Dongyang Zhang
Yicheng Gong
title The Comparison of LightGBM and XGBoost Coupling Factor Analysis and Prediagnosis of Acute Liver Failure
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description This paper compares the dimensionality reduction effect of LightGBM and XGBoost-FA. Unlike XGBoost, LightGBM has dimensionality reduction built in through its Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) algorithms, while XGBoost coupled with the traditional dimensionality reduction tool Factor Analysis (XGBoost-FA) may also achieve a dimensionality reduction effect. For the empirical comparison, the prediagnosis dataset from the 2018 Kaggle competition Acute Liver Failure was chosen as the research object, and pairwise comparisons were conducted among XGBoost, LightGBM, XGBoost-FA and LightGBM-FA. On the test set, the (accuracy, log loss, training time) vectors of these four prediagnostic models are (0.75014, 0.569707, 10.5 s), (0.75811, 0.576059, 15.1 s), (0.67786, 0.663924, 5.7 s) and (0.67274, 0.676019, 4.1 s), respectively. The training time of XGBoost-FA (external dimensionality reduction) is found to be shorter than that of LightGBM (built-in dimensionality reduction). An algorithm published on Kaggle (abbreviated as K2a), with (accuracy, training time) of (0.82, 3.1 s), is better than the four models above, including XGBoost-FA and LightGBM, in both training time and accuracy. However, K2a removes more than 50% of the samples with missing values and only performs binary classification. For multi-class classification, or for data with a large number of missing values, XGBoost-FA is recommended if a shorter running time is required, while LightGBM is preferred if higher predictive accuracy is required. When XGBoost-FA or LightGBM is employed in AI medical services, doctors can be more productive in diagnosis and treatment thanks to more data support and a lighter workload. Both complement each other.
topic LightGBM
XGBoost
factor analysis
prediagnosis
acute liver failure
url https://ieeexplore.ieee.org/document/9284642/
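
The test-set figures quoted in the description can be tabulated to make the reported trade-off easier to check. The following is only a minimal sketch: the `Result` type and the variable names are illustrative assumptions, not taken from the paper, and the numbers are copied from the abstract rather than recomputed from the Kaggle data.

```python
from typing import NamedTuple


class Result(NamedTuple):
    """Hypothetical container for one model's reported test-set figures."""
    accuracy: float
    log_loss: float
    train_seconds: float


# Figures exactly as quoted in the abstract, in the order the models are listed.
REPORTED = {
    "XGBoost": Result(0.75014, 0.569707, 10.5),
    "LightGBM": Result(0.75811, 0.576059, 15.1),
    "XGBoost-FA": Result(0.67786, 0.663924, 5.7),
    "LightGBM-FA": Result(0.67274, 0.676019, 4.1),
}

# Rank the four models on each reported metric.
most_accurate = max(REPORTED, key=lambda name: REPORTED[name].accuracy)
lowest_log_loss = min(REPORTED, key=lambda name: REPORTED[name].log_loss)
fastest = min(REPORTED, key=lambda name: REPORTED[name].train_seconds)

# Trade-off highlighted in the abstract: XGBoost-FA versus LightGBM.
speedup = REPORTED["LightGBM"].train_seconds / REPORTED["XGBoost-FA"].train_seconds
accuracy_gap = REPORTED["LightGBM"].accuracy - REPORTED["XGBoost-FA"].accuracy

print(f"most accurate:   {most_accurate}")
print(f"lowest log loss: {lowest_log_loss}")
print(f"fastest:         {fastest}")
print(f"LightGBM / XGBoost-FA training time ratio: {speedup:.2f}")
print(f"LightGBM - XGBoost-FA accuracy gap:        {accuracy_gap:.5f}")
```

Read this way, the reported numbers put LightGBM first on accuracy, plain XGBoost first on log loss, and LightGBM-FA first on training time. XGBoost-FA trains in well under half the time of LightGBM at a cost of about eight percentage points of accuracy, which is consistent with the recommendation stated in the abstract.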