Learning from small datasets containing nominal attributes

PhD === National Cheng Kung University === Institute of Information Management === 107 === In many small-data-learning problems, owing to the incomplete data structure, explicit information for decision makers is limited. Although machine learning algorithms are extensively applied to extract knowledge, most of them are developed without considering w...

Bibliographic Details
Main Authors: Hung-Yu Chen, 陳泓佑
Other Authors: Der-Chiang Li
Format: Others
Language: en_US
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/y2qgaw
id ndltd-TW-107NCKU5396008
record_format oai_dc
spelling ndltd-TW-107NCKU5396008 2019-10-26T06:24:11Z http://ndltd.ncl.edu.tw/handle/y2qgaw Learning from small datasets containing nominal attributes 具名屬性的小樣本學習 Hung-Yu Chen 陳泓佑 PhD National Cheng Kung University Institute of Information Management 107 Der-Chiang Li 利德江 2019 thesis (學位論文) 43 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description PhD === National Cheng Kung University === Institute of Information Management === 107 === In many small-data learning problems, the data structure is incomplete, so the explicit information available to decision makers is limited. Although machine learning algorithms are widely applied to extract knowledge, most are developed without considering whether the training sets fully represent the properties of the population. Focusing on small datasets that contain nominal inputs and continuous outputs, this paper develops an effective sample-generating procedure based on fuzzy theory that tackles the learning problem through data preprocessing. From the derived fuzzy relations between categories and continuous outputs, the possibilities of combinations of categories (virtual samples) are aggregated for given continuous outputs. Suitable virtual samples are then selected by applying a fuzzy alpha-cut to the possibility distributions and are added to the training sets to form new ones. In the experiment, sixteen datasets from the UC Irvine Machine Learning Repository are examined with back-propagation neural networks and support vector regression. The results show that the forecasting accuracy of both models improves significantly when they are built with the proposed new training sets. The results also indicate that the proposed method outperforms bootstrap aggregating (bagging) and the Synthetic Minority Over-sampling Technique-Nominal Continuous (SMOTE-NC), with the greatest amount of statistical support.
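The selection step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's procedure: the triangular membership built from the (min, mean, max) of the outputs seen with each category, the use of `min` to aggregate possibilities across attributes, and the names `triangular`, `fit_fuzzy_relations`, and `virtual_samples` are all assumptions made here for illustration. Only the overall idea (score category combinations for a given continuous output, then keep those that pass an alpha-cut) comes from the abstract.

```python
from itertools import product


def triangular(y, low, mode, high):
    """Membership of y in the triangular fuzzy number (low, mode, high)."""
    if y == mode:
        return 1.0
    if y <= low or y >= high:
        return 0.0
    if y < mode:
        return (y - low) / (mode - low)
    return (high - y) / (high - mode)


def fit_fuzzy_relations(rows, outputs):
    """Map (attribute index, category) to (min, mean, max) of its outputs.

    Assumption: this triple stands in for the fuzzy relation between a
    category and the continuous output.
    """
    grouped = {}
    for row, y in zip(rows, outputs):
        for j, category in enumerate(row):
            grouped.setdefault((j, category), []).append(y)
    return {
        key: (min(ys), sum(ys) / len(ys), max(ys))
        for key, ys in grouped.items()
    }


def virtual_samples(rows, outputs, y_target, alpha):
    """Return (combination, y_target, possibility) triples passing the alpha-cut."""
    relations = fit_fuzzy_relations(rows, outputs)
    n_attrs = len(rows[0])
    categories = [sorted({row[j] for row in rows}) for j in range(n_attrs)]
    selected = []
    for combo in product(*categories):
        # Assumption: aggregate per-attribute possibilities with min (fuzzy AND).
        possibility = min(
            triangular(y_target, *relations[(j, c)]) for j, c in enumerate(combo)
        )
        if possibility >= alpha:  # alpha-cut on the possibility distribution
            selected.append((combo, y_target, possibility))
    return selected


# Toy data (hypothetical): two nominal attributes and one continuous output.
rows = [("red", "S"), ("red", "L"), ("blue", "S"), ("blue", "L")]
outputs = [1.0, 3.0, 5.0, 7.0]
candidates = virtual_samples(rows, outputs, y_target=2.5, alpha=0.5)
```

In this sketch a higher `alpha` keeps fewer, more plausible combinations; the selected triples would then be appended to the training set before fitting a regressor.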
author2 Der-Chiang Li
author_facet Der-Chiang Li
Hung-Yu Chen
陳泓佑
author Hung-Yu Chen
陳泓佑