The Dependent-variable SMOTE for learning imbalanced data-set
Master's === National Cheng Kung University === Department of Industrial and Information Management === 106 === Data mining on imbalanced data sets has received increasing attention in recent years. The class imbalance problem occurs when one class has far fewer instances than the other classes. SMOTE (Synthetic Minority Over-sampling Technique) i...
Main Authors: | Yen-Chun Chen (陳彥均) |
---|---|
Other Authors: | Der-Chiang Li (利德江) |
Format: | Others |
Language: | zh-TW |
Published: | 2018 |
Online Access: | http://ndltd.ncl.edu.tw/handle/ccyreb |
id |
ndltd-TW-106NCKU5041009 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-106NCKU5041009 2019-05-16T01:07:58Z http://ndltd.ncl.edu.tw/handle/ccyreb The Dependent-variable SMOTE for learning imbalanced data-set 以相依變量增生小類別樣本技術學習不平衡資料 Yen-Chun Chen 陳彥均 Master's National Cheng Kung University Department of Industrial and Information Management 106 Data mining on imbalanced data sets has received increasing attention in recent years. The class imbalance problem occurs when one class has far fewer instances than the other classes. SMOTE (Synthetic Minority Over-sampling Technique) is an effective method for improving recognition of the minority class in class-imbalanced problems. SMOTE is an over-sampling method that generates new synthetic instances from the minority class, and it provides a standard procedure for further research such as Borderline SMOTE, Safe-Level SMOTE, and Local-Neighborhood SMOTE. In this paper we propose an extension of SMOTE that considers the relations between different attributes and decides the location of synthetic samples using a fuzzy technique. This study develops a new sample-generating procedure that determines the range on both sides of the line connecting pairs of population samples. Three data sets taken from the UCI Machine Learning Repository are used in the experiments. We compare the proposed method with SMOTE and its extensions, including Borderline SMOTE, Safe-Level SMOTE, and Local-Neighborhood SMOTE. The results show that, when the data sets are evaluated with C4.5 decision trees, the proposed method achieves better classifier performance for the minority class than the other methods. Der-Chiang Li 利德江 2018 Degree thesis ; thesis 49 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Cheng Kung University === Department of Industrial and Information Management === 106 === Data mining on imbalanced data sets has received increasing attention in recent years. The class imbalance problem occurs when one class has far fewer instances than the other classes. SMOTE (Synthetic Minority Over-sampling Technique) is an effective method for improving recognition of the minority class in class-imbalanced problems. SMOTE is an over-sampling method that generates new synthetic instances from the minority class, and it provides a standard procedure for further research such as Borderline SMOTE, Safe-Level SMOTE, and Local-Neighborhood SMOTE. In this paper we propose an extension of SMOTE that considers the relations between different attributes and decides the location of synthetic samples using a fuzzy technique. This study develops a new sample-generating procedure that determines the range on both sides of the line connecting pairs of population samples. Three data sets taken from the UCI Machine Learning Repository are used in the experiments. We compare the proposed method with SMOTE and its extensions, including Borderline SMOTE, Safe-Level SMOTE, and Local-Neighborhood SMOTE. The results show that, when the data sets are evaluated with C4.5 decision trees, the proposed method achieves better classifier performance for the minority class than the other methods. |
author2 |
Der-Chiang Li |
author_facet |
Der-Chiang Li Yen-Chun Chen 陳彥均 |
author |
Yen-Chun Chen 陳彥均 |
spellingShingle |
Yen-Chun Chen 陳彥均 The Dependent-variable SMOTE for learning imbalanced data-set |
author_sort |
Yen-Chun Chen |
title |
The Dependent-variable SMOTE for learning imbalanced data-set |
title_short |
The Dependent-variable SMOTE for learning imbalanced data-set |
title_full |
The Dependent-variable SMOTE for learning imbalanced data-set |
title_fullStr |
The Dependent-variable SMOTE for learning imbalanced data-set |
title_full_unstemmed |
The Dependent-variable SMOTE for learning imbalanced data-set |
title_sort |
dependent-variable smote for learning imbalanced data-set |
publishDate |
2018 |
url |
http://ndltd.ncl.edu.tw/handle/ccyreb |
_version_ |
1719173386286923776 |
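
The abstract describes SMOTE as generating synthetic minority instances along the line connecting pairs of samples. A minimal sketch of that baseline interpolation step is given below for orientation. It is not the dependent-variable method proposed in the thesis, whose fuzzy placement rule is not specified in this record. The function name `smote` and the parameters `n_synthetic`, `k`, and `seed` are assumptions chosen for illustration.

```python
import numpy as np


def smote(minority, n_synthetic, k=5, seed=0):
    """Baseline SMOTE sketch: interpolate between minority-class neighbours.

    minority    : array-like of shape (n_samples, n_features), minority class only
    n_synthetic : number of synthetic samples to generate (hypothetical parameter)
    k           : number of nearest neighbours considered per sample
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(minority, dtype=float)
    n = len(X)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)

    # Pairwise squared Euclidean distances; a sample is never its own neighbour.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a base sample and one of its k nearest neighbours at random.
    base = rng.integers(0, n, size=n_synthetic)
    mate = neighbours[base, rng.integers(0, k, size=n_synthetic)]

    # Place each synthetic point at a random position on the connecting segment.
    gap = rng.random((n_synthetic, 1))
    return X[base] + gap * (X[mate] - X[base])


if __name__ == "__main__":
    pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
    print(smote(pts, n_synthetic=3, k=2))
```

Because every synthetic point lies on a segment between two existing minority samples, it always falls inside their bounding box. According to the abstract, the thesis instead determines a range on both sides of that connecting line.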