Feature Selection in a Credit Scoring Model

This paper applies four classification algorithms (logistic regression, support vector machine, K-nearest neighbors, and random forest) to identify which candidates are likely to default in a credit scoring model. Three feature selection methods are used to mitigate...

Bibliographic Details
Main Authors: Juan Laborda, Seyong Ryoo
Format: Article
Language: English
Published: MDPI AG 2021-03-01
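Series: Mathematics
Subjects: operational research in banking; machine learning; credit scoring; classification algorithms; feature selection methods
Online Access: https://www.mdpi.com/2227-7390/9/7/746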
id doaj-e8199b5b3e6f4b9b97a13c1875d5fcf2
record_format Article
spelling doaj-e8199b5b3e6f4b9b97a13c1875d5fcf2
2021-03-31T23:03:30Z
eng
MDPI AG
Mathematics
2227-7390
2021-03-01
Volume 9, Issue 7, Article 746
10.3390/math9070746
Feature Selection in a Credit Scoring Model
Juan Laborda (Department of Business Administration, University Carlos III, 28903 Madrid, Spain)
Seyong Ryoo (Leuven Statistics Research Centre, KU Leuven, 3000 Leuven, Belgium)
collection DOAJ
language English
format Article
sources DOAJ
author Juan Laborda
Seyong Ryoo
title Feature Selection in a Credit Scoring Model
publisher MDPI AG
series Mathematics
issn 2227-7390
publishDate 2021-03-01
description This paper applies four classification algorithms (logistic regression, support vector machine, K-nearest neighbors, and random forest) to identify which candidates are likely to default in a credit scoring model. Three feature selection methods are used to mitigate the overfitting that the curse of dimensionality causes in these classification algorithms: one filter method (Chi-squared test and correlation coefficients) and two wrapper methods (forward stepwise selection and backward stepwise selection). The performance of these three methods is assessed using two measures: the mean absolute error and the number of selected features. The methodology is applied to a valuable database from Taiwan. The results suggest that forward stepwise selection yields superior performance for each of the classification algorithms used. The conclusions are related to those in the literature, and their managerial implications are analyzed.
topic operational research in banking
machine learning
credit scoring
classification algorithms
feature selection methods
url https://www.mdpi.com/2227-7390/9/7/746