Location-Based Parallel Sequential Pattern Mining Algorithm

Bibliographic Details
Main Authors: Byoungwook Kim, Gangman Yi
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/8826432/
Description
Summary: Sequential pattern mining, which finds frequent sequential patterns in a set of data sequences, is an important data mining problem. However, existing sequential pattern mining considers only the order in which items are purchased, not the location where each item is purchased. In this paper, we develop a location-based sequential pattern mining algorithm using Apache Spark. The proposed algorithm finds frequent sequential patterns in parallel by distributing the data across several machines. We conduct a comprehensive performance study of the proposed algorithm on synthetic data sets, varying several parameter values. The experimental results show that the speedup of the proposed algorithm scales linearly with the number of machines.
ISSN: 2169-3536
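
Illustration: The idea in the summary (count location-aware patterns on separate partitions of the data, then merge the counts) can be sketched in plain Python. This is not the authors' Spark implementation. The event format `(item, location)`, the restriction to length-2 patterns, and the names `local_counts`, `mine_pairs`, `num_partitions`, and `min_support` are all assumptions made for this example, and the partitions are processed in a simple loop rather than on separate machines.

```python
from collections import Counter
from itertools import combinations


def local_counts(partition):
    """Count, within one partition, how many sequences contain each
    ordered pair of (item, location) events."""
    counts = Counter()
    for seq in partition:
        # combinations() keeps the original event order; the set ensures
        # a pattern is counted at most once per sequence.
        counts.update(set(combinations(seq, 2)))
    return counts


def mine_pairs(sequences, num_partitions, min_support):
    """Split the sequences into partitions, count patterns per partition,
    merge the counts, and keep patterns meeting min_support."""
    partitions = [sequences[i::num_partitions] for i in range(num_partitions)]
    total = Counter()
    for part in partitions:  # on a cluster, each partition would go to a worker
        total += local_counts(part)
    return {pattern: n for pattern, n in total.items() if n >= min_support}


# Hypothetical toy data: each sequence is one customer's ordered purchases.
sequences = [
    [("milk", "store_A"), ("bread", "store_B"), ("eggs", "store_A")],
    [("milk", "store_A"), ("eggs", "store_A")],
    [("milk", "store_B"), ("bread", "store_B")],
]
frequent = mine_pairs(sequences, num_partitions=2, min_support=2)
```

In this toy data, milk bought at store_A followed by eggs bought at store_A is the only pair that appears in at least two sequences. Milk followed by bread does not qualify, because the milk was bought at different locations in the two sequences that contain it. Because each sequence lies entirely within one partition, the merged counts do not depend on how many partitions are used.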