A Fast Frequent Pattern Split Approach on Native XML Database for Mining Association Rules

Master's === Chaoyang University of Technology === Master's Program, Department of Information Management === 93 === The popularity of XML has produced huge numbers of XML documents, and native XML databases have been developed to store them. However, most association rule mining techniques can only be applied to transaction d...

Bibliographic Details
Main Authors: Tsung-Hsien Shen, 沈宗憲
Other Authors: Chin-Feng Lee
Format: Others
Language: zh-TW
Published: 2005
Online Access: http://ndltd.ncl.edu.tw/handle/7a3rcp
id ndltd-TW-093CYUT5396013
record_format oai_dc
spelling ndltd-TW-093CYUT5396013 2019-05-15T19:19:45Z http://ndltd.ncl.edu.tw/handle/7a3rcp A Fast Frequent Pattern Split Approach on Native XML Database for Mining Association Rules 快速的高頻型樣分割法於原生型XML資料庫探勘關聯規則 Tsung-Hsien Shen 沈宗憲 Master's, Chaoyang University of Technology, Master's Program, Department of Information Management, 93 Chin-Feng Lee 李金鳳 2005 thesis 90 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === Chaoyang University of Technology === Master's Program, Department of Information Management === 93 === The popularity of XML has produced huge numbers of XML documents, and native XML databases have been developed to store them. However, most association rule mining techniques, such as the Apriori, H-mine, and FP-tree algorithms, can only be applied to transaction databases. Developing an approach to association rule mining on native XML databases is therefore an important research topic. Currently, FP-growth, which is based on the FP-tree, performs more efficiently than other association rule mining methods, but it cannot be applied to native XML databases. Hence, we adapt an improved FP-tree algorithm, called the Frequent Pattern Split method (FP-split), for fast association rule mining from native XML databases. The proposed FP-split method explores association rules among character data and tags in XML documents by parsing the DTD or XML schema. Unlike XQuery, FP-split helps users extract important and complete information from XML documents without having to understand either the structure of the documents or the corresponding query syntax. In this work, we show experimentally that FP-split is time-efficient for mining association rules from native XML databases under various parameters, including different minimum supports, different numbers of items, and large amounts of data. In addition, extensive experiments show that the proposed method performs better than the FP-tree construction algorithm on transaction databases.
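The following is a minimal Python sketch of the kind of task the abstract describes: turning XML character data and tags into transactions and mining association rules above a minimum support. It is not the FP-split algorithm, whose details are not given in this record. It uses a plain brute-force count over small itemsets, and the sample document, the tag names (`orders`, `order`, `item`), and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each <order> element is treated as one transaction.
XML_DOC = """
<orders>
  <order><item>bread</item><item>milk</item></order>
  <order><item>bread</item><item>butter</item><item>milk</item></order>
  <order><item>butter</item><item>milk</item></order>
  <order><item>bread</item><item>butter</item></order>
</orders>
"""

def xml_to_transactions(xml_text, record_tag):
    """Build one item set per record; each item is 'tag=character data'."""
    root = ET.fromstring(xml_text.strip())
    transactions = []
    for record in root.iter(record_tag):
        items = {
            f"{child.tag}={child.text.strip()}"
            for child in record
            if child.text and child.text.strip()
        }
        transactions.append(frozenset(items))
    return transactions

def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force support counting (an assumption; not FP-tree or FP-split)."""
    counts = Counter()
    for transaction in transactions:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(transaction), size):
                counts[frozenset(combo)] += 1
    total = len(transactions)
    return {s: c / total for s, c in counts.items() if c / total >= min_support}

def association_rules(freq, min_confidence):
    """Derive rules A -> B from frequent pairs, keeping those above min_confidence."""
    rules = []
    for itemset, support in freq.items():
        if len(itemset) != 2:
            continue
        for item in itemset:
            antecedent = frozenset([item])
            consequent = itemset - antecedent
            # Every subset of a frequent itemset is frequent, so this lookup is safe.
            confidence = support / freq[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules

transactions = xml_to_transactions(XML_DOC, "order")
freq = frequent_itemsets(transactions, min_support=0.5)
rules = association_rules(freq, min_confidence=0.6)
for antecedent, consequent, support, confidence in rules:
    print(set(antecedent), "->", set(consequent), support, round(confidence, 2))
```

Brute-force counting is chosen here only for brevity; it enumerates every candidate itemset, which is exactly the cost that tree-based methods such as FP-growth are designed to avoid on large data.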
author2 Chin-Feng Lee
author_facet Chin-Feng Lee
Tsung-Hsien Shen
沈宗憲
author Tsung-Hsien Shen
沈宗憲
spellingShingle Tsung-Hsien Shen
沈宗憲
A Fast Frequent Pattern Split Approach on Native XML Database for Mining Association Rules
author_sort Tsung-Hsien Shen
title A Fast Frequent Pattern Split Approach on Native XML Database for Mining Association Rules
publishDate 2005
url http://ndltd.ncl.edu.tw/handle/7a3rcp