Mining of Up-to-date Effective Knowledge


Bibliographic Details
Main Authors: Yi-Ying Wu, 吳怡穎
Other Authors: Tzung-Pei Hong
Format: Others
Language: en_US
Published: 2008
Online Access: http://ndltd.ncl.edu.tw/handle/87798052201184256421
Description
Summary: Master's === National University of Kaohsiung === Master's Program, Department of Electrical Engineering === 96 === The most important task in data mining is to find interesting and meaningful frequent patterns in a database. Temporal data mining, in turn, is concerned with the analysis of temporal data and the discovery of temporal patterns and regularities. This thesis therefore proposes a new concept, the up-to-date pattern, which is a hybrid of frequent pattern mining and temporal mining. An up-to-date pattern consists of an itemset and its corresponding up-to-date lifetime, within which the user-defined minimum support threshold must be satisfied. An itemset may not be frequent (large) over the entire database yet may be large up to date, because items that seldom occurred early on may occur often recently. Two approaches are proposed in this thesis. The first can mine more useful large itemsets than conventional approaches, which discover only large itemsets valid for the entire database. It first translates the log database into an item-oriented bitmap representation to speed up the later mining process, and then extracts the large itemsets that are valid with the longest lifetime from the past to the current time. The second approach extends the first by maintaining up-to-date patterns when new transactions are added; for this incremental mining approach, a new variable is defined to record the extra lifetime of an itemset in the log database. Finally, experimental results show that the proposed algorithm is more effective than traditional ones in discovering such up-to-date patterns, especially when the minimum support threshold is high.
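The idea of an up-to-date lifetime can be illustrated with a minimal Python sketch. It is based only on the summary above and is not the thesis's algorithm: the function names `build_bitmaps` and `longest_up_to_date_lifetime`, the list-of-bits bitmap layout, and the backward scan are all assumptions made for illustration. The sketch builds one bitmap per item over a time-ordered log and then finds the earliest starting transaction from which an itemset meets the minimum support through to the current (last) transaction.

```python
from typing import Dict, Hashable, Iterable, List, Optional, Sequence, Set


def build_bitmaps(transactions: Sequence[Set[Hashable]]) -> Dict[Hashable, List[int]]:
    """Item-oriented bitmaps: bit i is 1 if the item occurs in transaction i.

    Transactions are assumed to be ordered from oldest to newest.
    """
    n = len(transactions)
    bitmaps: Dict[Hashable, List[int]] = {}
    for idx, transaction in enumerate(transactions):
        for item in transaction:
            bitmaps.setdefault(item, [0] * n)[idx] = 1
    return bitmaps


def longest_up_to_date_lifetime(
    bitmaps: Dict[Hashable, List[int]],
    itemset: Iterable[Hashable],
    n: int,
    min_sup: float,
) -> Optional[int]:
    """Return the earliest start index s such that the itemset's support
    within transactions s..n-1 is at least min_sup, or None if no such
    suffix of the log exists.
    """
    rows = [bitmaps.get(item) for item in itemset]
    if not rows or any(row is None for row in rows):
        return None  # empty itemset, or an item that never occurs

    best: Optional[int] = None
    count = 0
    # Scan from the newest transaction back to the oldest, keeping a
    # running count of transactions that contain every item (bitwise AND).
    for start in range(n - 1, -1, -1):
        if all(row[start] for row in rows):
            count += 1
        if count / (n - start) >= min_sup:
            best = start  # a longer qualifying lifetime was found
    return best


# Hypothetical log: item "c" is absent early but appears in every recent transaction.
log = [{"a", "b"}, {"a"}, {"a", "b"}, {"b", "c"}, {"a", "c"}, {"c"}]
bitmaps = build_bitmaps(log)
start_c = longest_up_to_date_lifetime(bitmaps, ["c"], len(log), 0.7)
start_a = longest_up_to_date_lifetime(bitmaps, ["a"], len(log), 0.7)
```

In this hypothetical log, "c" appears in only 3 of 6 transactions, so it is not large over the whole database at a 0.7 threshold, but it is large within the lifetime that runs from the third transaction (index 2) to the present. Item "a" has no qualifying lifetime at that threshold, so the function returns `None` for it.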