Improved Approaches for Attribute Clustering Based on the Group Genetic Algorithm

Master's === National Sun Yat-sen University === Department of Computer Science and Engineering === 99 === Feature selection is a pre-processing step in data mining and machine learning, and plays an important role in analyzing high-dimensional data. Appropriately selected features can not only reduce the complexity of the mining or learning process, but also impro...

Bibliographic Details
Main Authors:	Feng-Shih Lin, 林峰世
Other Authors:	Tzung-Pei Hong
Format:	Others
Language:	en_US
Published:	2011
Online Access:	http://ndltd.ncl.edu.tw/handle/29467867150684907085

id	ndltd-TW-099NSYS5392074
record_format	oai_dc
spelling	ndltd-TW-099NSYS5392074 2015-10-19T04:03:35Z http://ndltd.ncl.edu.tw/handle/29467867150684907085 Improved Approaches for Attribute Clustering Based on the Group Genetic Algorithm 基於群組基因演算法之屬性分群改良方法 Feng-Shih Lin 林峰世 Master's, National Sun Yat-sen University, Department of Computer Science and Engineering, 99 Tzung-Pei Hong 洪宗貝 2011 thesis 78 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's === National Sun Yat-sen University === Department of Computer Science and Engineering === 99 === Feature selection is a pre-processing step in data mining and machine learning, and plays an important role in analyzing high-dimensional data. Appropriately selected features can not only reduce the complexity of the mining or learning process, but also improve the accuracy of the results. The idea of performing feature selection by attribute clustering has been proposed previously. If similar attributes are clustered into groups, an attribute can easily be replaced by another in the same group when some of its values are missing. Hong et al. also proposed several genetic algorithms for finding appropriate attribute clusters. Their approaches, however, suffer from the weakness that multiple chromosomes can represent the same attribute clustering result (feasible solution) due to the combinatorial property, which makes the search space larger than needed. In this thesis, we therefore attempt to improve the performance of the GA-based attribute-clustering process by using the grouping genetic algorithm (GGA). Two GGA-based attribute clustering approaches are proposed. In the first approach, the general GGA representation and operators are used to reduce the redundancy of the chromosome representation for attribute clustering. In the second approach, a new encoding scheme with corresponding crossover and mutation operators is designed, and an improved fitness function is proposed to achieve faster convergence and provide more flexible alternatives than the first approach. Finally, experiments are conducted to compare the efficiency and accuracy of the proposed approaches with those of the previous ones.
author2	Tzung-Pei Hong
author	Feng-Shih Lin 林峰世
title	Improved Approaches for Attribute Clustering Based on the Group Genetic Algorithm
publishDate	2011
url	http://ndltd.ncl.edu.tw/handle/29467867150684907085
_version_	1718094100652097536
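
The redundancy mentioned in the description (several chromosomes encoding the same attribute clustering) can be illustrated with a small sketch. This record does not specify the encodings used in the thesis, so the integer-label chromosome below, along with the helper names `canonical` and `count_encodings`, is a hypothetical illustration rather than the authors' method. It assumes each gene holds the group label of one attribute and counts how many such chromosomes collapse onto each distinct partition.

```python
from itertools import product


def canonical(labels):
    """Relabel groups in order of first appearance, e.g. [2, 2, 0] -> (0, 0, 1).

    Two label chromosomes describe the same clustering exactly when
    their canonical forms are equal.
    """
    mapping = {}
    out = []
    for group in labels:
        if group not in mapping:
            mapping[group] = len(mapping)
        out.append(mapping[group])
    return tuple(out)


def count_encodings(n_attributes, k_groups):
    """Return (number of label chromosomes using all k groups,
    number of distinct clusterings they encode).

    Assumed encoding: gene i is the group label of attribute i.
    """
    chromosomes = [
        c
        for c in product(range(k_groups), repeat=n_attributes)
        if len(set(c)) == k_groups  # every group must be non-empty
    ]
    partitions = {canonical(c) for c in chromosomes}
    return len(chromosomes), len(partitions)


if __name__ == "__main__":
    # [1, 1, 0] and [0, 0, 1] put the same attributes together.
    print(canonical([1, 1, 0]) == canonical([0, 0, 1]))
    # Chromosome count versus distinct clusterings for 5 attributes, 3 groups.
    print(count_encodings(5, 3))
```

Under this assumed encoding, each clustering into k non-empty groups is represented by k! different chromosomes, which is the kind of enlarged search space a group-oriented representation aims to avoid.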

Similar Items