Approximately Mining Recently Repeating Patterns on Data Streams

Master's === National Taiwan Normal University === Department of Information and Computer Education === 94 === Repeating patterns represent temporal relations among data items, which could be used for data summarization and data prediction. More and more data of various applications is generated as a data stream. Accordingly, the traditional strategies for mining repeati...


Bibliographic Details
Main Author:	周蓓旻
Other Authors:	柯佳伶
Format:	Others
Language:	zh-TW
Published:	2006
Online Access:	http://ndltd.ncl.edu.tw/handle/20533636393226266446

id	ndltd-TW-094NTNU5395028
record_format	oai_dc
spelling	ndltd-TW-094NTNU5395028 2016-06-01T04:21:13Z http://ndltd.ncl.edu.tw/handle/20533636393226266446 Approximately Mining Recently Repeating Patterns on Data Streams 近似探勘資料流中最近重覆樣式方法之研究 周蓓旻 Master's National Taiwan Normal University Department of Information and Computer Education 94 柯佳伶 2006 thesis 53 zh-TW
collection	NDLTD
language	zh-TW
format	Others
sources	NDLTD
description	Master's === National Taiwan Normal University === Department of Information and Computer Education === 94 === Repeating patterns represent temporal relations among data items and can be used for data summarization and data prediction. More and more application data is generated as a data stream, so the traditional strategies for mining repeating patterns in a static database are not suitable in a data stream environment. Moreover, in the dynamic environment of a data stream, mining repeating patterns from the whole historical data sequence does not capture the newest pattern trends in the stream. For this reason, this thesis proposes two algorithms for efficiently mining recently repeating patterns in a data stream: the appearing-bit-sequence-based incremental mining algorithm and the basic-patterns estimating-based algorithm. The incremental mining approach uses appearing bit sequences to compute the frequencies of patterns efficiently within the sliding window. By maintaining the appearing bit sequences of maximal repeating patterns, it mines newly generated recently repeating patterns from the maintained information, which reduces the processing cost when the window slides. The estimating-based method maintains the repeating patterns, potential repeating patterns, and 2-item patterns, and uses a partition-based scheme to count pattern frequencies. A data structure is constructed to support efficient access to the retained patterns, and the frequency of an unretained pattern is estimated from the frequencies of its maximum prefix-subpattern and suffix-subpattern. The experimental results show that the incremental mining method is an efficient way to mine recently repeating patterns correctly, and that the estimating-based method provides an even faster way to discover recently repeating patterns from a data stream approximately.
author2	柯佳伶
author_facet	柯佳伶 周蓓旻
author	周蓓旻
spellingShingle	周蓓旻 Approximately Mining Recently Repeating Patterns on Data Streams
author_sort	周蓓旻
title	Approximately Mining Recently Repeating Patterns on Data Streams
title_short	Approximately Mining Recently Repeating Patterns on Data Streams
title_full	Approximately Mining Recently Repeating Patterns on Data Streams
title_fullStr	Approximately Mining Recently Repeating Patterns on Data Streams
title_full_unstemmed	Approximately Mining Recently Repeating Patterns on Data Streams
title_sort	approximately mining recently repeating patterns on data streams
publishDate	2006
url	http://ndltd.ncl.edu.tw/handle/20533636393226266446
work_keys_str_mv	AT zhōubèimín approximatelyminingrecentlyrepeatingpatternsondatastreams AT zhōubèimín jìnshìtànkānzīliàoliúzhōngzuìjìnzhòngfùyàngshìfāngfǎzhīyánjiū
_version_	1718289950433083392
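
The abstract in this record mentions computing pattern frequencies within a sliding window from "appearing bit sequences" but gives no details. The following is only a minimal sketch of one plausible reading, not the thesis's actual algorithm: it assumes that bit i of a pattern's bit sequence is set when the pattern starts at window position i, and that a longer pattern's bit sequence can be obtained by shifting and AND-ing the bit sequences of its items. All function names are hypothetical.

```python
def item_bits(window, item):
    """Return an int whose bit i is set when window[i] == item."""
    bits = 0
    for i, x in enumerate(window):
        if x == item:
            bits |= 1 << i
    return bits


def pattern_bits(window, pattern):
    """Return an int whose bit i is set when `pattern` starts at window[i].

    Assumes `pattern` is non-empty.
    """
    bits = item_bits(window, pattern[0])
    for offset, item in enumerate(pattern[1:], start=1):
        # An occurrence starting at i needs `item` at position i + offset,
        # so shift that item's bits down by `offset` before AND-ing.
        bits &= item_bits(window, item) >> offset
    return bits


def frequency(bits):
    """Count the occurrences recorded in a bit sequence."""
    return bin(bits).count("1")


def slide(bits, tracked_item, new_item, window_size):
    """Update one item's bit sequence when the window slides by one position.

    The oldest position (bit 0) is dropped and the new item is recorded at
    the last position (bit window_size - 1).
    """
    bits >>= 1
    if new_item == tracked_item:
        bits |= 1 << (window_size - 1)
    return bits
```

Under these assumptions, for the window "ABCABCAB" the pattern "AB" has frequency 3 and "ABC" has frequency 2. Representing each bit sequence as a Python integer keeps the shift and AND steps to single operations, which is the kind of cost saving the abstract attributes to the bit-sequence approach.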


Similar Items