The Research of Customer Purchase Behavior Using Constraint-based Sequential Pattern Mining Approach

Bibliographic Details
Main Authors: Ya-Han Hu, 胡雅涵
Other Authors: 陳彥良
Format: Others
Language: en_US
Published: 2007
Online Access: http://ndltd.ncl.edu.tw/handle/68336666379019185920
Description
Summary: Ph.D. === National Central University === Graduate Institute of Information Management === 95 === Sequential pattern mining is an important data-mining method for discovering time-related behavior in sequence databases. The information it yields can be applied in marketing, medical record analysis, sales analysis, and other areas. Existing methods focus only on frequency, because they assume that the behavior of sequences does not change over time. Business sales environments, however, are highly dynamic and complicated, so sequence behavior may change over time. In this study, we first divide the problem into two sub-problems, sequential pattern mining in business-to-business (B2B) environments and in business-to-customer (B2C) environments, because each has its own sequence characteristics. Three new concepts, recency, repetition, and compactness, are then incorporated into traditional sequential pattern mining to discover meaningful patterns in these two environments. Recency makes patterns adapt quickly to the latest behavior in the sequence database. Repetition requires that the number of occurrences of a pattern in a data sequence exceed user-specified thresholds. Compactness ensures that discovered patterns have reasonable time spans. This dissertation presents two new pattern types together with efficient algorithms for mining them, and reports thorough empirical evaluations. The results show that the proposed methods are computationally efficient and offer advantages over traditional methods when sequence behavior changes over time.
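
The three constraints named in the summary can be illustrated with a minimal Python sketch. This record does not give the dissertation's formal definitions or algorithms, so everything below is an assumption made for illustration: the data model (one timestamped item per event, sorted by time), the function names, the parameters `max_span`, `min_repeat`, and `recent_since`, and the greedy left-to-right counting of non-overlapping occurrences. It is not the author's method.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical simplified data model: one (timestamp, item) pair per event,
# with each customer's events sorted by timestamp.
Event = Tuple[int, str]


def match_from(events: Sequence[Event], pattern: Sequence[str], start: int) -> Optional[int]:
    """Earliest match of `pattern` as a subsequence that begins exactly at events[start].

    Returns the index of the event matching the last pattern item, or None.
    """
    if events[start][1] != pattern[0]:
        return None
    pos = start
    for item in pattern[1:]:
        pos += 1
        while pos < len(events) and events[pos][1] != item:
            pos += 1
        if pos == len(events):
            return None
    return pos


def compact_occurrences(events: Sequence[Event], pattern: Sequence[str], max_span: int) -> List[Tuple[int, int]]:
    """Greedily collect non-overlapping occurrences whose time span is <= max_span.

    Assumed reading of 'compactness': first and last matched events lie close in time.
    """
    found: List[Tuple[int, int]] = []
    i = 0
    while i < len(events):
        end = match_from(events, pattern, i)
        if end is not None and events[end][0] - events[i][0] <= max_span:
            found.append((events[i][0], events[end][0]))
            i = end + 1  # continue after this occurrence so occurrences do not overlap
        else:
            i += 1
    return found


def satisfies(events: Sequence[Event], pattern: Sequence[str],
              max_span: int, min_repeat: int, recent_since: int) -> bool:
    """Check the three assumed constraints for one customer's sequence."""
    occ = compact_occurrences(events, pattern, max_span)
    if len(occ) < min_repeat:  # assumed reading of 'repetition'
        return False
    # Assumed reading of 'recency': the latest occurrence ends at or after recent_since.
    return bool(occ) and occ[-1][1] >= recent_since


def support(db: Dict[str, List[Event]], pattern: Sequence[str],
            max_span: int, min_repeat: int, recent_since: int) -> int:
    """Number of sequences in the database that satisfy all three constraints."""
    return sum(satisfies(ev, pattern, max_span, min_repeat, recent_since) for ev in db.values())


# Made-up example data, not taken from the dissertation.
example_db: Dict[str, List[Event]] = {
    "c1": [(1, "a"), (2, "b"), (10, "a"), (30, "b"), (40, "a"), (41, "b")],
    "c2": [(5, "a"), (6, "b")],
}
```

With the made-up `example_db`, the pattern `["a", "b"]` under `max_span=5`, `min_repeat=2`, and `recent_since=35` is supported only by customer `c1`: it has two compact occurrences, the later one ending at time 41, while `c2` has a single occurrence. A frequency-only miner would count both customers, which is the gap the three constraints are meant to address.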