Summary: | Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 105 === The problem of time series classification has been studied for decades. In time series data, it is important to determine which part of a sequence contains the most significant information for classification. The main issue is whether features that represent a time series as a set of characteristic values can be extracted from the data. We propose an efficient procedure that extracts unit patterns from a time series as a set of segmented time sequences. Unit patterns are subsequences that appear repeatedly in time series data; they may differ in length but share the same shape. The procedure is faster and has a lower computational cost than subsequence matching. Once extracted, the unit patterns can be regarded as features of the time series for further classification. Approaches to time series classification fall into three categories: distance-based, model-based, and feature-based. In this paper, we focus on the feature-based approach, which represents a time series as a set of characteristic values. We apply a time series representation, the envelope, to the extraction result. Because the envelope represents the pattern shape, extracting the proper unit patterns from the time series data is a critical task. The proposed unit pattern extraction method captures the pattern shape well. We elastically scale each segmented time series to the same length by interpolation. A set of well-synchronized, equal-length time series can therefore be represented as an envelope, which is the profile of this set of time series. We can use the envelope to obtain a sparse representation of each time series in the segmented data. Moreover, the result of the envelope transformation is sparse, which is an essential property for applying compressed sensing.
To achieve good time series classification performance, we have to build the envelope that best preserves the shape of the unit pattern. Finally, we demonstrate on real-world datasets that the unit pattern extraction method enables the envelope to work reasonably well.
|
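
The scaling and envelope steps described in the summary can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not define the envelope or the encoding, so the sketch assumes the envelope is a pointwise band of mean plus or minus `k` standard deviations, and that a series is encoded as 0 inside the band and +1 or -1 outside it. The function names `rescale`, `build_envelope`, and `encode`, and the parameter `k`, are hypothetical.

```python
import numpy as np


def rescale(segment, target_len):
    """Linearly interpolate a 1-D segment onto target_len evenly spaced points."""
    segment = np.asarray(segment, dtype=float)
    old_x = np.linspace(0.0, 1.0, num=len(segment))
    new_x = np.linspace(0.0, 1.0, num=target_len)
    return np.interp(new_x, old_x, segment)


def build_envelope(segments, target_len, k=1.0):
    """Assumed envelope: pointwise mean -/+ k standard deviations of the rescaled segments."""
    scaled = np.vstack([rescale(s, target_len) for s in segments])
    mean = scaled.mean(axis=0)
    std = scaled.std(axis=0)
    return mean - k * std, mean + k * std


def encode(segment, lower, upper):
    """Assumed encoding: -1 below the band, 0 inside it, +1 above it."""
    scaled = rescale(segment, len(lower))
    return np.where(scaled > upper, 1, np.where(scaled < lower, -1, 0))


# Three unit patterns of different lengths that share the same triangular shape.
segments = [
    [0.0, 1.0, 0.0],
    [0.0, 0.5, 1.0, 0.5, 0.0],
    [0.0, 0.25, 0.5, 0.75, 1.0, 0.75, 0.5, 0.25, 0.0],
]
lower, upper = build_envelope(segments, target_len=5)

# A segment that follows the shared shape encodes to all zeros (sparse);
# one that leaves the band produces nonzero entries.
inside = encode([0.0, 1.0, 0.0], lower, upper)
outside = encode([0.0, 2.0, 0.0], lower, upper)
```

Under these assumptions, segments that match the shared shape stay inside the band and encode mostly to zeros, which is the kind of sparsity the summary links to compressed sensing.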