Hierarchical Classification and Regression with Feature Selection
Master's === National Central University === Department of Information Management === 107 === The vision of the big data era is to find suitable methods for organizing numerous, messy data and extracting valuable information from them. As data grow in quantity and complexity, data scientists no longer focus on the strengths and weaknesses of model...
Main Author: Chi-Wei Yeh (葉奇瑋)
Other Authors: Shih-Wen Ke (柯士文)
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/zsq7g5
id |
ndltd-TW-107NCU05396075 |
record_format |
oai_dc |
spelling |
ndltd-TW-107NCU053960752019-10-24T05:20:20Z http://ndltd.ncl.edu.tw/handle/zsq7g5 Hierarchical Classification and Regression with Feature Selection Chi-Wei Yeh 葉奇瑋 Shih-Wen Ke 柯士文 2019 學位論文 (degree thesis) ; thesis 87 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === National Central University === Department of Information Management === 107 === The vision of the big data era is to find suitable methods for organizing numerous, messy data and extracting valuable information from them. As data grow in quantity and complexity, data scientists no longer focus on the strengths and weaknesses of model training; instead, they concentrate on using different computational methods and operating architectures to find clues in the data, and from these findings they look for ways to improve numerical prediction accuracy.
For numerical prediction on data sets, the common model construction methods in regression training are linear regression, neural networks, and support vector regression. To pursue better numerical prediction results, tuning the models' parameters is necessary. In addition, feature selection is used to remove less relevant or redundant features, and clustering is used to organize the data into different groups.
In this study, a hierarchical structure is used as an experimental prototype and extended to handle datasets with large numbers of features and instances.
To obtain clear comparisons, this study combines different clustering, feature selection, and regression algorithms to train models and measures numerical prediction errors on datasets that differ in domain, size, and number of features. From analyzing and comparing the experimental results of multiple algorithms, the study finds that hierarchical classification and regression with feature selection yields lower root mean square error and mean absolute error than regression alone or hierarchical classification and regression without feature selection. In addition, the hierarchical structure using clustering (K-means and C-means), feature selection (Mutual Information and Information Gain), and regression (Multi-layer Perceptron) achieves better average performance across the different datasets.
|
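The pipeline the abstract describes can be illustrated with a minimal sketch: cluster the training data (K-means), select features within each cluster (mutual information), fit a per-cluster regressor (multi-layer perceptron), route each test instance to its cluster's model, and score with RMSE and MAE. This is not the thesis's actual code; the synthetic dataset, `k = 3` clusters, and 10 selected features are illustrative assumptions.

```python
# Hypothetical sketch of hierarchical classification and regression with
# feature selection, using scikit-learn. All parameter values are assumed.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, mutual_info_regression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in for the thesis's benchmark datasets.
X, y = make_regression(n_samples=600, n_features=20, noise=5.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

k = 3  # assumed number of clusters
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_tr)

# Per cluster: mutual-information feature selection, then an MLP regressor.
models = {}
for c in range(k):
    mask = km.labels_ == c
    sel = SelectKBest(mutual_info_regression, k=10).fit(X_tr[mask], y_tr[mask])
    mlp = MLPRegressor(hidden_layer_sizes=(50,), max_iter=2000, random_state=0)
    mlp.fit(sel.transform(X_tr[mask]), y_tr[mask])
    models[c] = (sel, mlp)

# Route each test instance through its cluster's selector and regressor.
test_clusters = km.predict(X_te)
y_pred = np.empty_like(y_te)
for c in range(k):
    mask = test_clusters == c
    if not np.any(mask):
        continue  # no test points fell into this cluster
    sel, mlp = models[c]
    y_pred[mask] = mlp.predict(sel.transform(X_te[mask]))

rmse = mean_squared_error(y_te, y_pred) ** 0.5
mae = mean_absolute_error(y_te, y_pred)
```

Swapping K-means for fuzzy C-means, mutual information for information gain, or the MLP for support vector regression reproduces the other configurations the study compares.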
author2 |
Shih-Wen Ke |
author |
Chi-Wei Yeh 葉奇瑋 |
title |
Hierarchical Classification and Regression with Feature Selection |
publishDate |
2019 |
url |
http://ndltd.ncl.edu.tw/handle/zsq7g5 |