nDeepSMOTEc: An Oversampling Method for Structured Datasets


Bibliographic Details
Published in: IEEE Access
Main Authors: Wen-Lin Fan, Chun-Chi Chuang, Chia-Wen Hsu, Chieh-Chi Huang, Ting-Rui Guo, Chung-Chian Hsu
Format: Article
Language: English
Published: IEEE 2025-01-01
Online Access: https://ieeexplore.ieee.org/document/11151264/
Description
Summary: The problem of imbalanced data is pervasive in the field of machine learning, particularly in applications such as medical diagnosis and fraud detection, where minority class samples are crucial for accurate model predictions. To improve classifier performance on imbalanced datasets, this study proposes an enhanced oversampling method based on the DeepSMOTE framework and examines its effects across different classifiers. DeepSMOTE is specifically designed for image data. We modify DeepSMOTE to make it applicable to structured tabular data and compare its performance with other oversampling techniques using Support Vector Machines and Neural Networks. According to experiments on 26 public datasets, the results demonstrate that the proposed method, nDeepSMOTEc, consistently outperforms state-of-the-art oversampling approaches across both classification models, highlighting its practical utility for structured datasets. These findings provide valuable insights and a reference point for selecting appropriate oversampling strategies when addressing imbalanced data in future research.
ISSN: 2169-3536
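
Illustrative note: this record does not describe the internals of nDeepSMOTEc, so the sketch below is not the authors' method. As general background on what oversampling structured data involves, it shows only the classic SMOTE-style interpolation step on a tabular minority class: each synthetic row is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbors. The function name `smote_like_oversample` and its parameters (`n_new`, `k`, `seed`) are hypothetical, chosen for this example, and it assumes purely numeric features.

```python
import numpy as np


def smote_like_oversample(X_min, n_new, k=5, seed=0):
    """Generate n_new synthetic rows by SMOTE-style interpolation.

    Hypothetical helper for illustration only; assumes X_min holds the
    numeric feature rows of the minority class.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)  # cannot have more neighbors than other samples

    # Pairwise squared Euclidean distances between minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor

    # Indices of the k nearest minority neighbors of each sample.
    neighbors = np.argsort(d2, axis=1)[:, :k]

    # Pick a base sample and one of its neighbors for each synthetic row.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbors[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position between base and neighbor.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])


# Usage: four minority rows with two features, ten synthetic rows.
X_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like_oversample(X_minority, n_new=10, k=2, seed=1)
print(synthetic.shape)
```

Because every synthetic row is a convex combination of two existing rows, it always falls inside the per-feature range of the original minority samples. Categorical columns would need separate handling, which this sketch does not attempt.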