Hierarchical Classification and Regression with Feature Selection

Master's thesis === National Central University === Department of Information Management === Academic Year 107 === The goal of the big data era is to find suitable ways of organizing large, messy datasets and extracting valuable information from them. As data grow in quantity and complexity, data scientists focus less on the strengths and weaknesses of individual model-training procedures and more on using different computational methods and processing architectures to find structure in the data, with the ultimate aim of improving numerical prediction accuracy. For numerical prediction, common regression models include linear regression, neural networks, and support vector regression. Tuning model parameters is necessary to obtain better predictions; beyond this, feature selection can be used to remove weakly relevant or redundant features, and clustering can organize the data into groups. In this study, a hierarchical structure is used as the experimental prototype and extended to handle datasets with many features and many instances. To obtain clear comparisons, the study combines different clustering, feature selection, and regression algorithms to train models and measure numerical prediction error on datasets from different domains with varying sizes and numbers of features. The experimental results show that hierarchical classification and regression with feature selection improves root mean square error (RMSE) and mean absolute error (MAE) compared with regression alone or with hierarchical classification and regression without feature selection. In addition, the hierarchical structure combining clustering (K-means and C-means), feature selection (Mutual Information and Information Gain), and regression (Multi-layer Perceptron) achieves the best average performance across datasets.
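The following is a minimal sketch of the hierarchical pipeline the abstract describes, not the thesis's actual code: cluster the training data, select features per cluster with mutual information, fit a multi-layer perceptron regressor per cluster, route test samples to their nearest cluster, and score with RMSE and MAE. It assumes scikit-learn and a synthetic dataset standing in for the thesis's benchmarks; the cluster count, number of selected features, and MLP size are illustrative assumptions, not the thesis's settings.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, mutual_info_regression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the thesis's benchmark datasets (assumption).
X, y = make_regression(n_samples=2000, n_features=30, n_informative=10,
                       noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Step 1: split the data into groups via clustering (K-means here).
k = 4  # number of clusters is a tunable assumption
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_train_s)
train_labels = kmeans.labels_
test_labels = kmeans.predict(X_test_s)

# Steps 2-3: per-cluster feature selection (mutual information) and an
# MLP regressor trained on the selected features of that cluster only.
models = {}
for c in range(k):
    idx = train_labels == c
    selector = SelectKBest(mutual_info_regression, k=10).fit(X_train_s[idx],
                                                             y_train[idx])
    mlp = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
    mlp.fit(selector.transform(X_train_s[idx]), y_train[idx])
    models[c] = (selector, mlp)

# Step 4: route each test sample to its cluster's model and predict.
y_pred = np.empty_like(y_test, dtype=float)
for c in range(k):
    idx = test_labels == c
    if idx.any():
        selector, mlp = models[c]
        y_pred[idx] = mlp.predict(selector.transform(X_test_s[idx]))

# Evaluation metrics used in the thesis: RMSE and MAE.
rmse = mean_squared_error(y_test, y_pred) ** 0.5
mae = mean_absolute_error(y_test, y_pred)
print(f"RMSE: {rmse:.3f}  MAE: {mae:.3f}")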


Bibliographic Details
Main Author: Chi-Wei Yeh (葉奇瑋)
Other Authors: Shih-Wen Ke (柯士文)
Format: Others (thesis, 87 pages)
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/zsq7g5