Personal Credit Default Discrimination Model Based on Super Learner Ensemble
Assessing the default of customers is an essential basis for personal credit issuance. This paper considers developing a personal credit default discrimination model based on Super Learner heterogeneous ensemble to improve the accuracy and robustness of default discrimination. First, we select six k...
Main Authors: | Gang Li, Mengdi Shen, Meixuan Li, Jingyi Cheng |
---|---|
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2021-01-01 |
Series: | Mathematical Problems in Engineering |
Online Access: | http://dx.doi.org/10.1155/2021/5586120 |
id |
doaj-4e8b187750344273aa8f902a84c7749f |
---|---|
record_format |
Article |
spelling |
doaj-4e8b187750344273aa8f902a84c7749f; 2021-04-12T01:23:49Z; eng; Hindawi Limited; Mathematical Problems in Engineering; 1563-5147; 2021-01-01; 2021; 10.1155/2021/5586120; Personal Credit Default Discrimination Model Based on Super Learner Ensemble; Gang Li, Mengdi Shen, Meixuan Li, Jingyi Cheng (all School of Business Administration); http://dx.doi.org/10.1155/2021/5586120 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Gang Li Mengdi Shen Meixuan Li Jingyi Cheng |
author_sort |
Gang Li |
title |
Personal Credit Default Discrimination Model Based on Super Learner Ensemble |
publisher |
Hindawi Limited |
series |
Mathematical Problems in Engineering |
issn |
1563-5147 |
publishDate |
2021-01-01 |
description |
Assessing the default of customers is an essential basis for personal credit issuance. This paper develops a personal credit default discrimination model based on a Super Learner heterogeneous ensemble to improve the accuracy and robustness of default discrimination. First, we select six kinds of single classifiers, such as logistic regression and SVM, and three kinds of homogeneous ensemble classifiers, such as random forest, to build a base classifier candidate library for Super Learner. Then, we use ten-fold cross-validation to train the base classifiers and improve their robustness. We compute each base classifier’s total loss from the difference between the predicted and actual values and establish a base classifier-weighted optimization model to solve for the optimal weights of the base classifiers, which minimize the weighted total loss of all base classifiers. Thus, we obtain the heterogeneous ensembled Super Learner classifier. Finally, we use three real credit datasets in the UCI database (Australian, Japanese, and German) and the large credit dataset GMSC published on the Kaggle platform to test the ensembled Super Learner model’s effectiveness. We employ four commonly used evaluation indicators: the accuracy rate, type I error rate, type II error rate, and AUC. Compared with the base classifiers and with heterogeneous ensemble models such as Stacking and Bstacking, the ensembled Super Learner model shows higher discrimination accuracy and robustness. |
url |
http://dx.doi.org/10.1155/2021/5586120 |
_version_ |
1714683080533344256 |
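
The description in this record says the base classifiers are combined by solving for weights that minimize the weighted total loss. The record does not give the loss function, the constraints, or the solver, so the following is only a minimal sketch under stated assumptions: squared loss, non-negative weights that sum to one, and an exponentiated-gradient solver chosen here for simplicity. The function names and the six rows of out-of-fold predictions are hypothetical and made up for illustration; they are not the paper's code or data.

```python
import math
from typing import List, Sequence


def squared_loss(weights: Sequence[float],
                 oof_preds: Sequence[Sequence[float]],
                 labels: Sequence[int]) -> float:
    """Total squared difference between the weighted prediction and the label."""
    total = 0.0
    for row, y in zip(oof_preds, labels):
        blended = sum(w * p for w, p in zip(weights, row))
        total += (blended - y) ** 2
    return total


def fit_simplex_weights(oof_preds: Sequence[Sequence[float]],
                        labels: Sequence[int],
                        steps: int = 2000,
                        lr: float = 0.5) -> List[float]:
    """Minimize the squared loss over weights >= 0 that sum to 1.

    Assumption: exponentiated-gradient updates stand in for whatever
    optimizer the paper uses. Each step rescales the weights by
    exp(-lr * gradient) and renormalizes, so they stay on the simplex.
    """
    n = len(labels)
    k = len(oof_preds[0])
    w = [1.0 / k] * k  # start from equal weights
    for _ in range(steps):
        grad = [0.0] * k
        for row, y in zip(oof_preds, labels):
            resid = sum(wj * pj for wj, pj in zip(w, row)) - y
            for j in range(k):
                grad[j] += 2.0 * resid * row[j] / n
        w = [wj * math.exp(-lr * gj) for wj, gj in zip(w, grad)]
        norm = sum(w)
        w = [wj / norm for wj in w]
    return w


# Hypothetical out-of-fold default probabilities from three base classifiers
# (one column each); in practice these would come from cross-validation.
labels = [0, 1, 0, 1, 1, 0]
oof_preds = [
    [0.3, 0.5, 0.1],
    [0.9, 0.5, 0.6],
    [0.1, 0.5, 0.4],
    [0.6, 0.5, 0.9],
    [0.9, 0.5, 0.6],
    [0.4, 0.5, 0.1],
]

weights = fit_simplex_weights(oof_preds, labels)
single_losses = [
    squared_loss([1.0 if i == j else 0.0 for i in range(3)], oof_preds, labels)
    for j in range(3)
]
print("weights:", [round(w, 3) for w in weights])
print("single-classifier losses:", [round(v, 3) for v in single_losses])
print("ensemble loss:", round(squared_loss(weights, oof_preds, labels), 3))
```

In this toy data the first and third classifiers make complementary errors, so the fitted blend has a lower squared loss than either alone, while the uninformative constant predictor in the middle column receives almost no weight.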