HPPD: A Hybrid Parallel Framework of Partition-based and Density-based Clustering Algorithms in Data Streams



Bibliographic Details
Published in: Al-Rafidain Journal of Computer Sciences and Mathematics
Main Author: Ammar Abd Alazeez
Format: Article
Language: English
Published: Mosul University 2020-05-01
Subjects:
Online Access: https://csmj.mosuljournals.com/article_164677_0a5555fea55524ae3cdceefc2d92be01.pdf
Description
Summary: Data stream clustering is the process of grouping continuously arriving data chunks into continuously changing groups to enable dynamic analysis of segmentation patterns. However, research on clustering methods has so far focused mainly on adapting methods designed for static datasets and on modifying the existing adaptations. Such methods produce only one type of final output cluster, i.e. either convex or non-convex clusters. This paper presents a novel two-phase parallel hybrid clustering algorithm (HPPD) that identifies convex and non-convex groups in the online stage and mixed groups in the offline stage of a data stream. The method first receives the data stream and applies a pre-processing step to identify convex and non-convex clusters. Second, it applies a modified EINCKM to produce the online convex clusters and a modified EDDS to produce the online non-convex clusters, running the two in parallel. Third, it applies an adaptive merging strategy in the offline stage to produce the final combined output groups. The method is assessed on a synthetic dataset. The experimental results confirm the efficiency and effectiveness of the method.
ISSN: 1815-4816
2311-7990
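
The three steps named in the summary (pre-processing, two online clusterers run in parallel, offline merging) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record does not describe the modified EINCKM, the modified EDDS, or the adaptive merging strategy, so `partition_clusters`, `density_clusters`, and `merge_clusters` are simple stand-ins, and the `route` predicate, `radius`, `eps`, and `merge_dist` are hypothetical names and parameters. Threads are used only to show the parallel structure; they do not imply any speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from math import dist


def centroid(points):
    """Mean of a non-empty list of equal-length tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def partition_clusters(points, radius):
    """Stand-in for the partition-based branch (NOT EINCKM): put each point in
    the cluster with the nearest centroid if within `radius`, else open a new one."""
    clusters = []
    for p in points:
        best = min(clusters, key=lambda c: dist(centroid(c), p), default=None)
        if best is not None and dist(centroid(best), p) <= radius:
            best.append(p)
        else:
            clusters.append([p])
    return clusters


def density_clusters(points, eps):
    """Stand-in for the density-based branch (NOT EDDS): connected components
    of the graph that links points at most `eps` apart."""
    unvisited = list(points)
    clusters = []
    while unvisited:
        frontier = [unvisited.pop()]
        cluster = []
        while frontier:
            p = frontier.pop()
            cluster.append(p)
            near = [q for q in unvisited if dist(p, q) <= eps]
            for q in near:
                unvisited.remove(q)
            frontier.extend(near)
        clusters.append(cluster)
    return clusters


def merge_clusters(clusters, merge_dist):
    """Stand-in for the offline merging step: greedily fuse clusters whose
    centroids lie within `merge_dist` of each other."""
    merged = []
    for c in clusters:
        for m in merged:
            if dist(centroid(m), centroid(c)) <= merge_dist:
                m.extend(c)
                break
        else:
            merged.append(list(c))
    return merged


def process_chunk(chunk, route, radius=1.0, eps=1.0, merge_dist=1.0):
    """Route each point to one branch (hypothetical pre-processing), run both
    branches in parallel, then merge their outputs."""
    convex_pts = [p for p in chunk if route(p)]
    other_pts = [p for p in chunk if not route(p)]
    with ThreadPoolExecutor(max_workers=2) as pool:
        convex = pool.submit(partition_clusters, convex_pts, radius)
        non_convex = pool.submit(density_clusters, other_pts, eps)
        online = convex.result() + non_convex.result()
    return merge_clusters(online, merge_dist)
```

A call such as `process_chunk(chunk, route=lambda p: p[0] < 5)` would process one chunk of 2-D points; in a streaming setting the function would be applied to each arriving chunk, with the routing rule replaced by whatever pre-processing the paper actually uses.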