Updating Frequent Itemsets Without Rescanning the Original Database in the Incremental Environment

Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 97 === Today, information is important and valuable, and it helps people make decisions, such as on market strategy. People depend more and more on useful information. For this reason, many methods have been invented to extract useful information from the huge da...

Bibliographic Details
Main Authors:	Pai-Yu Lin, 林柏佑
Other Authors:	Bi-Ru Dai
Format:	Others
Language:	en_US
Published:	2009
Online Access:	http://ndltd.ncl.edu.tw/handle/25629406614318367928

id	ndltd-TW-097NTUS5392054
record_format	oai_dc
spelling	ndltd-TW-097NTUS5392054 2016-05-02T04:11:39Z http://ndltd.ncl.edu.tw/handle/25629406614318367928 Updating Frequent Itemsets Without Rescanning the Original Database in the Incremental Environment 以不需重新存取資料庫的方式來有效探勘動態資料庫中的頻繁項目集 Pai-Yu Lin 林柏佑 Master's, National Taiwan University of Science and Technology, Department of Computer Science and Information Engineering, 97 Bi-Ru Dai 戴碧如 2009 學位論文 ; thesis 61 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's === National Taiwan University of Science and Technology === Department of Computer Science and Information Engineering === 97 === Today, information is important and valuable, and it helps people make decisions, such as on market strategy. People depend more and more on useful information. For this reason, many methods have been invented to extract useful information from huge volumes of data, and data mining technology has grown rapidly in recent years. Many helpful algorithms and applications have been proposed, and researchers are still working to develop more efficient algorithms. Frequent pattern mining plays an important role in the data mining community, since it is usually a fundamental step in various mining tasks. However, maintaining frequent patterns in an incremental database is very expensive. In addition, the status of a pattern changes over time: a frequent pattern may become infrequent, and vice versa. To find all frequent patterns exactly, most algorithms have to scan the original database completely whenever an update occurs. In this work, we propose two new algorithms, iTM and ECEM, which mine frequent itemsets in the incremental environment without rescanning the whole database. These algorithms use a compressed structure and quickly project the transaction dataset into it. Because the structure has a good compression ratio, we are able to preserve the frequencies of all items. Furthermore, these algorithms do not need to rescan the database when the user-defined threshold changes. We also design several experiments, using various transaction databases, to verify the performance of our algorithms. The results demonstrate that our algorithms can extract the exact frequent itemsets from the transaction database at low cost, and we obtain similar results on huge databases. In this study, our algorithms reduce the cost of the scanning step and guarantee that the response time is acceptable.
author2	Bi-Ru Dai
author_facet	Bi-Ru Dai Pai-Yu Lin 林柏佑
author	Pai-Yu Lin 林柏佑
author_sort	Pai-Yu Lin
title	Updating Frequent Itemsets Without Rescanning the Original Database in the Incremental Environment
publishDate	2009
url	http://ndltd.ncl.edu.tw/handle/25629406614318367928