An effective and efficient approach to detect arbitrary patterns in clusters with noises in very large databases

Bibliographic Details
Main Authors: Tzu-Ping Wang, 王子評
Other Authors: Cheng-Fa Tsai
Format: Others
Language: zh-TW
Published: 2006
Online Access: http://ndltd.ncl.edu.tw/handle/17182283599418965293
Description
Summary: Master's === National Pingtung University of Science and Technology === Department of Management Information Systems === 94 ===
Student ID: N9356001
Title of Thesis: An effective and efficient approach to detect arbitrary patterns in clusters with noises in very large databases
Pages: 80
Name of Institute: Graduate Institute of Management Information Systems, National Pingtung University of Science and Technology
Graduate Date: June 2006
Degree Conferred: Master
Name of Student: Tzu-Ping Wang
Advisor: Dr. Cheng-Fa Tsai
Abstract: In data mining, there are many well-known data clustering algorithms. However, these algorithms suffer from limitations such as 1) scalability, 2) the ability to handle noise, and 3) the ability to detect clusters with arbitrary patterns. This thesis proposes a new clustering algorithm named GDH that addresses these three difficulties, which various well-known algorithms encounter. The thesis integrates theories proposed by many scholars and introduces 1) the concept of gradient decrease and 2) a sliding window technique to detect clusters with arbitrary patterns more effectively. The thesis compares the time complexity of the proposed GDH algorithm with that of K-MEANS (partitioning-based method), CLIQUE (grid-based method), and DBSCAN (density-based method), which were published in well-known SCI journals. Furthermore, it compares the clustering quality and effectiveness of GDH with those of BIRCH and CURE (hierarchical methods) and CLIQUE. According to the simulation results, the proposed GDH approach deals effectively with clusters with arbitrary patterns through its fine-tuning technique and gradient decrease method. In addition, it outperforms K-MEANS, BIRCH, CURE, CLIQUE, and DBSCAN in both the effectiveness and efficiency comparisons.
Keywords: data mining, data clustering, gradient decrease, sliding window