Correlation Assessment of the Performance of Associative Classifiers on Credit Datasets Based on Data Complexity Measures

One of the four basic machine learning tasks is pattern classification. The selection of the proper learning algorithm for a given problem is a challenging task, formally known as the algorithm selection problem (ASP). In particular, we are interested in the behavior of the associative classifiers d...

Bibliographic Details
Published in: Mathematics
Main Authors: Francisco J. Camacho-Urriolagoitia, Yenny Villuendas-Rey, Itzamá López-Yáñez, Oscar Camacho-Nieto, Cornelio Yáñez-Márquez
Format: Article
Language: English
Published: MDPI AG 2022-04-01
Subjects: supervised classification; meta-learning; associative classification; finances
Online Access: https://www.mdpi.com/2227-7390/10/9/1460
author Francisco J. Camacho-Urriolagoitia
Yenny Villuendas-Rey
Itzamá López-Yáñez
Oscar Camacho-Nieto
Cornelio Yáñez-Márquez
collection DOAJ
container_title Mathematics
description One of the four basic machine learning tasks is pattern classification. The selection of the proper learning algorithm for a given problem is a challenging task, formally known as the algorithm selection problem (ASP). In particular, we are interested in the behavior of the associative classifiers derived from Alpha-Beta models applied to the financial field. In this paper, the behavior of four associative classifiers was studied: the One-Hot version of the Hybrid Associative Classifier with Translation (CHAT-OHM), the Extended Gamma (EG), the Naïve Associative Classifier (NAC), and the Assisted Classification for Imbalanced Datasets (ACID). To establish the performance, we used the area under the curve (AUC), F-score, and geometric mean measures. The four classifiers were applied over 11 datasets from the financial area. Then, the performance of each one was analyzed, considering their correlation with the measures of data complexity, corresponding to six categories based on specific aspects of the datasets: feature, linearity, neighborhood, network, dimensionality, and class imbalance. The correlations that arise between the measures of complexity of the datasets and the measures of performance of the associative classifiers are established; these results are expressed with Spearman’s Rho coefficient. The experimental results correctly indicated correlations between data complexity measures and the performance of the associative classifiers.
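The abstract states that the relationship between data complexity measures and classifier performance is expressed with Spearman's Rho coefficient. The following is a minimal sketch of that coefficient, computed as the Pearson correlation of average ranks; it is not the authors' code. The function names and the five data points are hypothetical placeholders chosen for illustration and are not values from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values, one entry per dataset (NOT results from the paper):
# a complexity measure and the AUC one classifier obtained on that dataset.
complexity = [0.10, 0.35, 0.20, 0.80, 0.55]
auc = [0.91, 0.78, 0.85, 0.60, 0.70]

rho = spearman_rho(complexity, auc)
print(f"Spearman's rho = {rho:.3f}")
```

In a study of this kind, one such coefficient would be computed for each pairing of complexity measure, performance measure, and classifier, with one data point per dataset. Because the coefficient works on ranks, it captures monotonic relationships and does not assume linearity.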
format Article
id doaj-art-ebe11466a2c2483e8803f3e4e3f13b97
institution Directory of Open Access Journals
issn 2227-7390
language English
publishDate 2022-04-01
publisher MDPI AG
record_format Article
spelling doaj-art-ebe11466a2c2483e8803f3e4e3f13b97
2025-08-19T23:15:53Z
eng
MDPI AG
Mathematics
2227-7390
2022-04-01
Volume 10, Issue 9, Article 1460
10.3390/math10091460
Correlation Assessment of the Performance of Associative Classifiers on Credit Datasets Based on Data Complexity Measures
Francisco J. Camacho-Urriolagoitia (0), Yenny Villuendas-Rey (1), Itzamá López-Yáñez (2), Oscar Camacho-Nieto (3), Cornelio Yáñez-Márquez (4)
(0-3) Instituto Politécnico Nacional, Centro de Innovación y Desarrollo Tecnológico en Cómputo, Av. Juan de Dios Bátiz s/n, Nueva Industrial Vallejo, GAM, Mexico City 07700, Mexico
(4) Instituto Politécnico Nacional, Centro de Investigación en Computación, Av. Juan de Dios Bátiz s/n, Nueva Industrial Vallejo, GAM, Mexico City 07738, Mexico
https://www.mdpi.com/2227-7390/10/9/1460
supervised classification
meta-learning
associative classification
finances
title Correlation Assessment of the Performance of Associative Classifiers on Credit Datasets Based on Data Complexity Measures
topic supervised classification
meta-learning
associative classification
finances
url https://www.mdpi.com/2227-7390/10/9/1460