Mining Fault-Tolerant Frequent Patterns in Large Databases

Master's === National Chiao Tung University === Department of Computer Science and Information Engineering === 90 === Real-world data may be corrupted by noise and therefore contain faults, so previously proposed data mining methods may not be applicable. We may also want the discovered knowledge to be more general so that it can be applied to find m...

Bibliographic Details
Main Authors: Sheng-Shun Wang, 王聖舜
Other Authors: Suh-Yin Lee
Format: Others
Language: en_US
Published: 2002
Online Access: http://ndltd.ncl.edu.tw/handle/84272332857502794908
id ndltd-TW-090NCTU0392051
record_format oai_dc
spelling ndltd-TW-090NCTU0392051 2016-06-27T16:08:59Z http://ndltd.ncl.edu.tw/handle/84272332857502794908 Mining Fault-Tolerant Frequent Patterns in Large Databases 在大型資料庫中探勘容錯頻繁樣式 Sheng-Shun Wang 王聖舜 Master's National Chiao Tung University Department of Computer Science and Information Engineering 90 Suh-Yin Lee 李素瑛 2002 thesis 46 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Chiao Tung University === Department of Computer Science and Information Engineering === 90 === Real-world data may be corrupted by noise and therefore contain faults, so previously proposed data mining methods may not be applicable. We may also want the discovered knowledge to be more general so that it can be applied to find more interesting information. FT-Apriori was proposed for this purpose: fault-tolerant data mining over large real-world data. However, FT-Apriori generates and tests candidates based on the Apriori property and is therefore not very efficient. In this paper, we develop a memory-based algorithm, FTP-mine, which uses pattern growth to mine fault-tolerant frequent patterns efficiently. In FTP-mine, a table called STable counts the item support and FT-support of all length-k patterns that share the same prefix of length k-1, comparing each transaction only once. For a database too large to fit in memory, FTP-mine can still be applied by partitioning the database. In addition, since there may be a large number of fault-tolerant frequent patterns, some of which are contained in others, we extend the FTP-mine algorithm to find maximal FT-frequent patterns. Our empirical evaluation shows that FTP-mine scales linearly and outperforms FT-Apriori in discovering FT-frequent patterns across all tested parameter settings, including varying support thresholds, fault tolerance, and scalability tests.
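The abstract uses the terms item support, FT-support, and maximal FT-frequent pattern without defining them. The sketch below illustrates one common reading, stated here as an assumption: a transaction "FT-contains" a pattern if it misses at most `tolerance` of the pattern's items, FT-support is the number of such transactions, and item support is counted only among those transactions. It is a brute-force illustration of the definitions, not FTP-mine or its STable; all function names and the toy database are hypothetical.

```python
def ft_contains(transaction, pattern, tolerance):
    """Assumed definition: the transaction may miss at most `tolerance` items."""
    return len(pattern - transaction) <= tolerance


def ft_stats(db, pattern, tolerance):
    """Return (FT-support, per-item support) for one pattern by a full scan."""
    body = [t for t in db if ft_contains(t, pattern, tolerance)]
    item_support = {item: sum(item in t for t in body) for item in pattern}
    return len(body), item_support


def is_ft_frequent(db, pattern, tolerance, min_ft_sup, min_item_sup):
    """A pattern qualifies if both thresholds hold (assumed criterion)."""
    ft_sup, item_sup = ft_stats(db, pattern, tolerance)
    return ft_sup >= min_ft_sup and all(
        s >= min_item_sup for s in item_sup.values()
    )


def maximal(patterns):
    """Keep only patterns that are not proper subsets of another pattern."""
    return [p for p in patterns if not any(p < q for q in patterns)]


# Toy database: only one transaction contains {a, b, c} exactly,
# but four contain it when one missing item is tolerated.
db = [set("abc"), set("ab"), set("ac"), set("bc"), set("d")]
pattern = frozenset("abc")

ft_sup, item_sup = ft_stats(db, pattern, tolerance=1)
print(ft_sup, sorted(item_sup.items()))
```

With a tolerance of 0 the same pattern is supported by a single transaction, which is the sense in which tolerating faults can recover patterns that exact matching misses on noisy data.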
author2 Suh-Yin Lee
author_facet Suh-Yin Lee
Sheng-Shun Wang
王聖舜
author Sheng-Shun Wang
王聖舜
title Mining Fault-Tolerant Frequent Patterns in Large Databases
publishDate 2002
url http://ndltd.ncl.edu.tw/handle/84272332857502794908