Improved Association Rules Algorithm Using Matrix in Distributed System

Master's === Ling Tung University === Master's Program, Department of Information Technology === 107 === With the development of cloud computing, intelligent mobile applications, and the IoT, data sets have grown to very large volumes, and how to use association rule methods to obtain useful information from big data has become a hot research topic. In order...

Bibliographic Details
Main Authors: CHEN,Yi-YUN, 陳依芸
Other Authors: CHANG,TSUI-PING
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/mfv24u
id ndltd-TW-107LTC00396007
record_format oai_dc
spelling ndltd-TW-107LTC00396007 2019-07-23T03:37:29Z http://ndltd.ncl.edu.tw/handle/mfv24u Improved Association Rules Algorithm Using Matrix in Distributed System 利用矩陣在分散式系統中改善關聯式規則演算法 CHEN,Yi-YUN 陳依芸 Master's Ling Tung University Master's Program, Department of Information Technology 107 With the development of cloud computing, intelligent mobile applications, and the IoT, data sets have grown to very large volumes, and how to use association rule methods to obtain useful information from big data has become a hot research topic. To obtain more accurate and useful information, high-performance association rule methods have become an important issue. The Apriori and Frequent Pattern Growth (FP-Growth) algorithms are the most common association rule algorithms and are used to extract useful information from big data. Many studies have proposed algorithms to improve the performance of the Apriori algorithm. In 2018, one study proposed the concept of a multi-tree to improve the performance of the Apriori algorithm. The proposed algorithm reduces the time cost of generating candidate sets and the number of database scans. However, for big data, its time cost is still not acceptable to users. On the other hand, some studies improve the performance of the FP-Growth algorithm by using a parallel framework (i.e., Hadoop MapReduce). However, Hadoop MapReduce cannot provide suitable memory to store and process the FP-tree. In this paper, the concept of a matrix is used to improve the performance of association rule algorithms on the Spark framework. Spark-based algorithms can provide suitable memory to construct the matrix in parallel and increase the processing speed. We also report experimental results of a prototype implementation on a Spark system and propose a popular implementation of the Spark framework. CHANG,TSUI-PING 張翠蘋 2019 學位論文 ; thesis 45 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === Ling Tung University === Master's Program, Department of Information Technology === 107 === With the development of cloud computing, intelligent mobile applications, and the IoT, data sets have grown to very large volumes, and how to use association rule methods to obtain useful information from big data has become a hot research topic. To obtain more accurate and useful information, high-performance association rule methods have become an important issue. The Apriori and Frequent Pattern Growth (FP-Growth) algorithms are the most common association rule algorithms and are used to extract useful information from big data. Many studies have proposed algorithms to improve the performance of the Apriori algorithm. In 2018, one study proposed the concept of a multi-tree to improve the performance of the Apriori algorithm. The proposed algorithm reduces the time cost of generating candidate sets and the number of database scans. However, for big data, its time cost is still not acceptable to users. On the other hand, some studies improve the performance of the FP-Growth algorithm by using a parallel framework (i.e., Hadoop MapReduce). However, Hadoop MapReduce cannot provide suitable memory to store and process the FP-tree. In this paper, the concept of a matrix is used to improve the performance of association rule algorithms on the Spark framework. Spark-based algorithms can provide suitable memory to construct the matrix in parallel and increase the processing speed. We also report experimental results of a prototype implementation on a Spark system and propose a popular implementation of the Spark framework.
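The abstract does not spell out how the matrix is built or used, so the following is only a minimal single-machine sketch of one common way a Boolean item-by-transaction matrix can replace repeated database scans in an Apriori-style search. It is not the thesis's algorithm and does not use Spark. The function names (build_matrix, support, frequent_itemsets), the bitmask representation of matrix rows, and the absolute min_support threshold are all assumptions made for illustration.

```python
from itertools import combinations


def build_matrix(transactions):
    """Scan the transactions once and build one Boolean row per item.

    Each row is stored as an integer bitmask: bit t is set when
    transaction t contains the item (hypothetical representation).
    """
    rows = {}
    for t, items in enumerate(transactions):
        for item in items:
            rows[item] = rows.get(item, 0) | (1 << t)
    return rows


def support(rows, itemset):
    """Count transactions containing every item of a non-empty itemset.

    The rows are ANDed together and the remaining set bits are counted,
    so no further pass over the original transactions is needed.
    """
    items = iter(itemset)
    mask = rows.get(next(items), 0)
    for item in items:
        mask &= rows.get(item, 0)
    return bin(mask).count("1")


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search driven by the matrix rows.

    min_support is an absolute transaction count (an assumption here).
    Returns a dict mapping each frequent itemset to its support.
    """
    rows = build_matrix(transactions)
    current = [frozenset([item]) for item in sorted(rows)
               if support(rows, [item]) >= min_support]
    result = {}
    k = 1
    while current:
        for itemset in current:
            result[itemset] = support(rows, itemset)
        k += 1
        # Join frequent (k-1)-itemsets whose union has exactly k items.
        candidates = {a | b for a, b in combinations(current, 2)
                      if len(a | b) == k}
        current = [c for c in candidates
                   if support(rows, c) >= min_support]
    return result
```

In this sketch the transactions are read only once, when the matrix is built; every later support count is a bitwise AND over rows. In a distributed setting the rows could plausibly be built per partition and merged, but that step is not shown and is not taken from the record.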
author2 CHANG,TSUI-PING
author_facet CHANG,TSUI-PING
CHEN,Yi-YUN
陳依芸
author CHEN,Yi-YUN
陳依芸
spellingShingle CHEN,Yi-YUN
陳依芸
Improved Association Rules Algorithm Using Matrix in Distributed System
author_sort CHEN,Yi-YUN
title Improved Association Rules Algorithm Using Matrix in Distributed System
title_short Improved Association Rules Algorithm Using Matrix in Distributed System
title_full Improved Association Rules Algorithm Using Matrix in Distributed System
title_fullStr Improved Association Rules Algorithm Using Matrix in Distributed System
title_full_unstemmed Improved Association Rules Algorithm Using Matrix in Distributed System
title_sort improved association rules algorithm using matrix in distributed system
publishDate 2019
url http://ndltd.ncl.edu.tw/handle/mfv24u
work_keys_str_mv AT chenyiyun improvedassociationrulesalgorithmusingmatirxindistributedsystem
AT chényīyún improvedassociationrulesalgorithmusingmatirxindistributedsystem
AT chenyiyun lìyòngjǔzhènzàifēnsànshìxìtǒngzhōnggǎishànguānliánshìguīzéyǎnsuànfǎ
AT chényīyún lìyòngjǔzhènzàifēnsànshìxìtǒngzhōnggǎishànguānliánshìguīzéyǎnsuànfǎ
_version_ 1719229218314780672