Study effectiveness of k-means clustering of functional data: functional principal component scores feature and mean curve feature

Bibliographic Details
Main Authors: Chia-Tung Chiang, 江家彤
Other Authors: Hung Chen
Format: Others
Language: en_US
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/95180268728241580134
Description
Summary: Master's === National Taiwan University === Graduate Institute of Mathematics === 99 === Organizing functional data into sensible groupings is one of the most fundamental ways of understanding and learning the underlying mechanism that generates the data. Cluster analysis is often employed to search for homogeneous subgroups of individuals in a data set. Abraham et al. (2003, Scandinavian Journal of Statistics) start with feature extraction on the mean function and use the k-means clustering procedure to determine the clusters. Peng and Muller (2008, Annals of Applied Statistics) assume a common mean function for all units and start with feature extraction on the covariance function. However, the clusters found by the k-means clustering procedure can be explained through the characteristics of the mean function of each unit. This motivates a theoretical study comparing the utility of these two approaches for densely observed functional data. We present only the case of two clusters and analyze the loss of efficiency incurred by feature extraction on the covariance function. Chiou and Li (2007, Journal of the Royal Statistical Society, Series B) proposed an iterative functional clustering algorithm that applies the method of Peng and Muller in the initial clustering stage. We advocate using the mean function in the initial stage instead, and provide an analysis to support this recommendation.