A Hybrid Classification Method Based on Binary Partition of Instances

Bibliographic Details
Main Authors: Guo-Hong Chen, 陳國鴻
Other Authors: Tzu-Tsung Wong
Format: Others
Language: zh-TW
Published: 2017
Online Access: http://ndltd.ncl.edu.tw/handle/r3v68z
Description
Summary: Master's thesis === National Cheng Kung University === Institute of Information Management === 105 === Classification is an essential task in data mining. Preprocessing techniques are generally used to improve data quality and thereby enhance the performance of class prediction. Data preprocessing techniques operate either on attributes or on instances. When a classification algorithm is trained on data that have been processed by another algorithm, the approach is called hybrid classification. This study presents a hybrid classification algorithm that first uses one classification algorithm to divide a training set into two subsets. A second algorithm then learns a model from each of the two subsets and also from the whole training set. Every test instance is classified by exactly one of the three models. The proposed algorithm is tested on 20 data sets to analyze its prediction accuracy and computational efficiency. The experimental results show that it significantly outperforms the naïve Bayesian classifier and decision tree learning on most data sets, although it needs more time to learn its models. Compared with two hybrid classification algorithms proposed in other studies, it can achieve not only significantly higher accuracy but also relatively lower computational cost.
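
The three-model structure described in the summary can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the summary does not say how the partition is decided or how a test instance is matched to a model, so both are left as caller-supplied functions (`split` and `route`, hypothetical names). The `majority_learner` and the threshold rules in the demo are toy stand-ins for the real classification algorithms.

```python
from collections import Counter


def majority_learner(X, y):
    """Toy learner (assumption): always predicts the most frequent training label."""
    most_common = Counter(y).most_common(1)[0][0]
    return lambda x: most_common


def fit_hybrid(X, y, split, learner):
    """Learn three models: one per subset and one from the whole training set.

    split(x, label) -> 0 or 1 decides which subset a training instance joins.
    In the thesis this decision comes from a classification algorithm; here it
    is a hypothetical hook supplied by the caller.
    """
    subsets = ([], []), ([], [])  # (instances, labels) for subset 0 and subset 1
    for x, label in zip(X, y):
        side = split(x, label)
        subsets[side][0].append(x)
        subsets[side][1].append(label)
    whole = learner(list(X), list(y))
    # Fall back to the whole-set model if a subset happens to be empty.
    models = [learner(sx, sy) if sx else whole for sx, sy in subsets]
    return models + [whole]


def predict_hybrid(models, route, x):
    """route(x) -> 0, 1, or 2 picks exactly one of the three models (hypothetical hook)."""
    return models[route(x)](x)


# Toy demo: ten one-dimensional instances with three classes.
X = [[0.05], [0.10], [0.15], [0.20], [0.25], [0.75], [0.80], [0.85], [0.90], [0.95]]
y = ["a", "a", "a", "b", "b", "b", "b", "c", "c", "c"]

# Assumed stand-ins: a threshold rule partitions the training set, and a
# second rule sends borderline test instances to the whole-set model.
split = lambda x, label: 0 if x[0] < 0.5 else 1
route = lambda x: 0 if x[0] < 0.3 else (1 if x[0] > 0.7 else 2)

models = fit_hybrid(X, y, split, majority_learner)
predictions = [predict_hybrid(models, route, x) for x in ([0.1], [0.9], [0.5])]
```

Keeping `split` and `route` as parameters makes the sketch independent of any particular pair of algorithms. Swapping `majority_learner` for a naïve Bayesian or decision tree learner would not change the surrounding structure.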