The Dependent-variable SMOTE for learning imbalanced data-set

Master's === National Cheng Kung University === Department of Industrial and Information Management === 106 === Data mining on imbalanced data sets has received increasing attention in recent years. The class imbalance problem occurs when one class has far fewer instances than the others. SMOTE (Synthetic Minority Over-Sampling Technique) i...


Bibliographic Details
Main Authors: Yen-Chun Chen, 陳彥均
Other Authors: Der-Chiang Li
Format: Others
Language: zh-TW
Published: 2018
Online Access: http://ndltd.ncl.edu.tw/handle/ccyreb
id ndltd-TW-106NCKU5041009
record_format oai_dc
spelling ndltd-TW-106NCKU5041009 2019-05-16T01:07:58Z http://ndltd.ncl.edu.tw/handle/ccyreb The Dependent-variable SMOTE for learning imbalanced data-set 以相依變量增生小類別樣本技術學習不平衡資料 Yen-Chun Chen 陳彥均 Master's National Cheng Kung University Department of Industrial and Information Management 106 Der-Chiang Li 利德江 2018 thesis 49 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Cheng Kung University === Department of Industrial and Information Management === 106 === Data mining on imbalanced data sets has received increasing attention in recent years. The class imbalance problem occurs when one class has far fewer instances than the others. SMOTE (Synthetic Minority Over-Sampling Technique) is an effective method for improving recognition of the minority class in class-imbalanced problems. It is an over-sampling method that generates new synthetic instances from the minority class, and it provides a standard procedure on which later methods such as Borderline SMOTE, Safe-Level SMOTE, and Local-Neighborhood SMOTE build. In this study we propose an extension of SMOTE that considers the relations between different attributes and decides the location of synthetic samples using a fuzzy technique. The study develops a new sample-generating procedure that determines a range on both sides of the line connecting pairs of population samples. Three data sets taken from the UCI Machine Learning Repository are used in the experiments. We compare the proposed method with SMOTE and its extensions, including Borderline SMOTE, Safe-Level SMOTE, and Local-Neighborhood SMOTE. The results show that, when the data sets are classified with C4.5 decision trees, the proposed method achieves better classifier performance on the minority class than the other methods.
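For readers unfamiliar with the baseline the abstract builds on, the following is a minimal sketch of plain SMOTE only: a synthetic point is placed at a random position on the line segment between a minority sample and one of its nearest minority-class neighbours. It does not implement the thesis's dependent-variable or fuzzy procedure, which the abstract does not specify. The function name `smote`, its parameters (`n_synthetic`, `k`, `seed`), and the toy data are assumptions made for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Plain SMOTE sketch (hypothetical helper, not the thesis's method).

    Each synthetic point is a random interpolation between a minority
    sample and one of its k nearest minority-class neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the connecting segment.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class with two attributes (illustrative values only).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(minority, n_synthetic=6, k=2)
```

Because every synthetic point lies on a segment between two existing minority samples, plain SMOTE never leaves the region those samples span. The abstract indicates that the proposed method instead determines a range on both sides of that connecting line.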
author2 Der-Chiang Li
author_facet Der-Chiang Li
Yen-Chun Chen
陳彥均
author Yen-Chun Chen
陳彥均
title The Dependent-variable SMOTE for learning imbalanced data-set
publishDate 2018
url http://ndltd.ncl.edu.tw/handle/ccyreb