VLSI Architectures for Clustering Algorithms

Bibliographic Details
Main Authors: Hui-Ya Li, 李惠雅
Other Authors: Wen-Jyi Hwang
Format: Others
Language: en_US
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/15049414863456041584
Description
Summary: Doctoral dissertation, National Taiwan Normal University (國立臺灣師範大學), Graduate Institute of Computer Science and Information Engineering, academic year 100.
In this dissertation, several hardware architectures are proposed for clustering algorithms, including the c-means, competitive learning, fuzzy c-means, and fuzzy c-means with spatial constraint algorithms. All of these architectures have been implemented on field programmable gate array (FPGA) devices to construct system on programmable chip (SOPC) systems for clustering.
In the proposed c-means architecture, both the partitioning and centroid computation operations are fully pipelined, so multiple training vectors can be processed concurrently. A lookup table based divider is employed to reduce the area cost and latency of centroid computation.
Two hardware realizations of the competitive learning algorithm with k-winners-take-all (kWTA) activation are presented. In the first architecture, the k winners associated with an input vector are identified by a module performing partial distance search (PDS) in the wavelet domain. The neuron updating process is based on a hardware divider with simple table lookup operations. Both the PDS module and the hardware divider adopt finite precision calculation to reduce area cost. Subspace search and multiple-coefficient accumulation techniques are also employed to reduce the computation latency of the PDS. The second architecture is based on an efficient pipeline that allows kWTA competition processes associated with different training vectors to be performed concurrently. The pipeline employs a novel codeword swapping scheme so that neurons failing the competition for one training vector are immediately available for the competitions for subsequent training vectors.
The proposed fuzzy c-means architecture is an efficient parallel solution. It reduces the area cost and computational complexity of membership coefficient and centroid computation by employing lookup table based dividers.
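The abstract does not describe how the lookup table based dividers are built. As a rough illustration only, one common approach is to store fixed-point reciprocals in a table so that a division becomes one table read, one multiplication, and one shift. In the sketch below, the names `FRAC_BITS`, `MAX_COUNT`, `RECIP`, `lut_divide`, and `centroid`, as well as the chosen precision and table size, are assumptions made for this example and are not taken from the dissertation.

```python
FRAC_BITS = 16    # assumed fixed-point precision of the stored reciprocals
MAX_COUNT = 1024  # assumed largest divisor (e.g. cluster size) the table covers

# RECIP[n] holds round(2**FRAC_BITS / n); index 0 is a placeholder and never read.
RECIP = [0] + [round((1 << FRAC_BITS) / n) for n in range(1, MAX_COUNT + 1)]


def lut_divide(numerator: int, divisor: int) -> int:
    """Approximate numerator // divisor for a non-negative numerator.

    Uses one table read, one multiply, and one right shift instead of a
    true division. The result can differ slightly from the exact quotient
    because the reciprocal is stored with finite precision.
    """
    if not 1 <= divisor <= MAX_COUNT:
        raise ValueError("divisor outside the range covered by the table")
    return (numerator * RECIP[divisor]) >> FRAC_BITS


def centroid(vector_sum, count):
    """Divide each accumulated component by the number of member vectors."""
    return [lut_divide(s, count) for s in vector_sum]
```

The table trades memory for logic: a larger `MAX_COUNT` or `FRAC_BITS` costs more storage but widens the range of divisors and tightens the approximation.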
The usual iterative operations for updating the membership matrix and the cluster centroids are merged into a single updating process to avoid the large storage requirement. This architecture is also extended to implement fuzzy c-means with spatial constraint. In that architecture, lookup table based root operators are adopted to relax the restriction on the degree of fuzziness. Experimental results show that the proposed architectures are cost-effective and can attain high speedup over other hardware or software implementations for large data sets and/or large numbers of clusters.
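To make the idea of a merged update concrete, the following software sketch shows one plausible way to combine the membership and centroid updates of standard fuzzy c-means: the membership coefficients of each data point are computed, used at once to accumulate the centroid numerators and denominators, and then discarded, so the full membership matrix is never stored. This is a sketch of the general technique under that assumption, not the dissertation's hardware design; the function name `fcm_single_pass` and the parameters `m` (degree of fuzziness) and `eps` are chosen for this example.

```python
def fcm_single_pass(data, centroids, m=2.0, eps=1e-12):
    """Run one fuzzy c-means iteration without storing the membership matrix.

    data:      list of points, each a list of floats
    centroids: list of current centroids, same dimensionality as the points
    m:         degree of fuzziness, must be greater than 1
    eps:       lower bound on squared distances to avoid division by zero
    """
    c = len(centroids)
    dim = len(centroids[0])
    num = [[0.0] * dim for _ in range(c)]  # running sums of u**m * x
    den = [0.0] * c                        # running sums of u**m
    exponent = 1.0 / (m - 1.0)

    for x in data:
        # Squared Euclidean distance from x to every centroid.
        d2 = [max(sum((xi - vi) ** 2 for xi, vi in zip(x, v)), eps)
              for v in centroids]
        # Membership u_i = (1/d2_i)^(1/(m-1)) / sum_j (1/d2_j)^(1/(m-1)).
        inv = [(1.0 / d) ** exponent for d in d2]
        total = sum(inv)
        for i in range(c):
            w = (inv[i] / total) ** m  # weight u_i**m, used once and discarded
            den[i] += w
            for k in range(dim):
                num[i][k] += w * x[k]

    # New centroids are the weighted means accumulated above.
    return [[num[i][k] / den[i] for k in range(dim)] for i in range(c)]
```

Only `c` numerator vectors and `c` denominators are kept between data points, so the working storage is independent of the number of data points.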