Personal Credit Default Discrimination Model Based on Super Learner Ensemble

Assessing customer default risk is an essential basis for issuing personal credit. This paper develops a personal credit default discrimination model based on a Super Learner heterogeneous ensemble to improve the accuracy and robustness of default discrimination. First, we select six k...

Bibliographic Details
Main Authors: Gang Li, Mengdi Shen, Meixuan Li, Jingyi Cheng
Format: Article
Language: English
Published: Hindawi Limited 2021-01-01
Series: Mathematical Problems in Engineering
Online Access: http://dx.doi.org/10.1155/2021/5586120
id doaj-4e8b187750344273aa8f902a84c7749f
record_format Article
spelling doaj-4e8b187750344273aa8f902a84c7749f; 2021-04-12T01:23:49Z; eng; Hindawi Limited; Mathematical Problems in Engineering; 1563-5147; 2021-01-01; 2021; 10.1155/2021/5586120; Personal Credit Default Discrimination Model Based on Super Learner Ensemble; Gang Li (School of Business Administration); Mengdi Shen (School of Business Administration); Meixuan Li (School of Business Administration); Jingyi Cheng (School of Business Administration); http://dx.doi.org/10.1155/2021/5586120
collection DOAJ
language English
format Article
sources DOAJ
author Gang Li
Mengdi Shen
Meixuan Li
Jingyi Cheng
title Personal Credit Default Discrimination Model Based on Super Learner Ensemble
publisher Hindawi Limited
series Mathematical Problems in Engineering
issn 1563-5147
publishDate 2021-01-01
description Assessing customer default risk is an essential basis for issuing personal credit. This paper develops a personal credit default discrimination model based on a Super Learner heterogeneous ensemble to improve the accuracy and robustness of default discrimination. First, we select six kinds of single classifiers, such as logistic regression and SVM, and three kinds of homogeneous ensemble classifiers, such as random forest, to build a candidate library of base classifiers for the Super Learner. Then, we train each base classifier with ten-fold cross-validation to improve its robustness. We compute each base classifier's total loss from the difference between its predicted values and the actual values, and we establish a weight optimization model that solves for the base classifier weights minimizing the weighted total loss over all base classifiers. This yields the heterogeneous ensembled Super Learner classifier. Finally, we test the effectiveness of the ensembled Super Learner model on three real credit datasets from the UCI database (Australian, Japanese, and German) and on the large credit dataset GMSC published on the Kaggle platform. We employ four commonly used evaluation indicators: accuracy rate, type I error rate, type II error rate, and AUC. Compared with the individual base classifiers and with heterogeneous ensemble models such as Stacking and Bstacking, the ensembled Super Learner model shows higher discrimination accuracy and robustness.
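The weighting and evaluation steps named in the description can be illustrated with a small sketch. It is not the authors' implementation, and several details are assumptions because the record does not state them: the loss is taken to be squared error, the weights are constrained to be non-negative and to sum to 1, the solver is projected gradient descent, label 1 means default, the decision threshold is 0.5, and a type I error is taken to mean a non-defaulter flagged as a defaulter (the paper may use the opposite convention). The out-of-fold scores in `oof` are made-up toy values standing in for ten-fold cross-validated predictions of three hypothetical base classifiers.

```python
# Sketch only: Super Learner-style weighting from out-of-fold (OOF) scores,
# followed by the four evaluation indicators. All names and data are
# illustrative assumptions, not taken from the paper.

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1}."""
    theta, running = 0.0, 0.0
    for i, u in enumerate(sorted(v, reverse=True), start=1):
        running += u
        t = (running - 1.0) / i
        if u - t > 0:
            theta = t  # keep the threshold from the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def combine(weights, row):
    """Weighted average of the base classifiers' scores for one sample."""
    return sum(w * p for w, p in zip(weights, row))


def total_loss(oof, y, weights):
    """Total squared loss of the weighted ensemble on the OOF scores."""
    return sum((t - combine(weights, row)) ** 2 for row, t in zip(oof, y))


def fit_weights(oof, y, steps=2000):
    """Minimise the mean squared loss over the simplex by projected gradient."""
    n, k = len(y), len(oof[0])
    step = 1.0 / (2.0 * k)  # safe step size when every score lies in [0, 1]
    w = [1.0 / k] * k       # start from equal weights
    for _ in range(steps):
        grad = [0.0] * k
        for row, t in zip(oof, y):
            resid = combine(w, row) - t
            for j in range(k):
                grad[j] += 2.0 * resid * row[j] / n
        w = project_to_simplex([wj - step * g for wj, g in zip(w, grad)])
    return w


def evaluate(y, scores, threshold=0.5):
    """Accuracy, type I / type II error rates, and AUC (assumed conventions)."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y, pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y, pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y, pred) if t == 1 and p == 0)
    pos = [s for t, s in zip(y, scores) if t == 1]
    neg = [s for t, s in zip(y, scores) if t == 0]
    # AUC as the share of (defaulter, non-defaulter) pairs ranked correctly;
    # ties count as half a correct pair.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return {
        "accuracy": (tp + tn) / len(y),
        "type_I_error": fp / (fp + tn),   # non-defaulters flagged as default
        "type_II_error": fn / (fn + tp),  # defaulters missed
        "auc": wins / (len(pos) * len(neg)),
    }


# Toy data: 6 samples, OOF scores from 3 hypothetical base classifiers.
y = [0, 0, 1, 1, 0, 1]
oof = [
    [0.1, 0.4, 0.6],
    [0.2, 0.6, 0.5],
    [0.9, 0.5, 0.4],
    [0.8, 0.6, 0.5],
    [0.1, 0.5, 0.7],
    [0.9, 0.4, 0.3],
]
weights = fit_weights(oof, y)
ensemble_scores = [combine(weights, row) for row in oof]
metrics = evaluate(y, ensemble_scores)
print(weights)
print(metrics)
```

In a real pipeline the `oof` matrix would come from ten-fold cross-validation of the base classifier library, and the fitted weights would then be applied to the base classifiers retrained on the full training set. The simplex constraint is one common choice for Super Learner weights; the optimization model in the paper may differ.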
url http://dx.doi.org/10.1155/2021/5586120