MOVIE SALES PATTERN CLUSTERING FOR RECOMMENDATION SYSTEM

Bibliographic Details
Main Authors: Ghilman Fatih, 吉雷曼
Other Authors: Chao-Lung Yang
Format: Others
Language: en_US
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/63006215275454988404
Description
Summary: Master's === National Taiwan University of Science and Technology === Department of Industrial Management === 103 === A recommender system (RS), in which consumers are presented with items relevant to them, has attracted considerable attention in e-commerce. By using the explicit feedback that consumers give to the system, recommendations can be made more accurate. The mathematics behind an RS relies on a matrix whose size is the number of users multiplied by the number of available items. Computing with such a large matrix is expensive and inefficient. In this research, the divide-and-conquer concept was borrowed: items are clustered into several groups to speed up the matrix computation in the RS. Twitter users' movie rating data were used to generate the matrix, and IMDb movie data were used to cluster the movies. A two-step clustering was proposed. The first step clusters the movies based on their internal attributes; the second step clusters them by the sales pattern of each movie. When clustering movies by sales pattern, the period during which a movie is shown in theaters can be regarded as its product life. To better cluster the time-series sales patterns, the discrete sales information was transformed into functional data. Functional data clustering was then performed, and the accuracy, computation time, and recommendations of a traditional RS and our pre-clustered RS were compared. We found that clustering the items before matrix factorization yields more accurate predicted ratings and a shorter computation time. Moreover, the resulting recommendations are based on a combination of latent features and item similarity.
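
The pre-clustering idea in the summary can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes the item cluster labels are already available (for example, from the two-step clustering), and it uses plain stochastic gradient descent for the factorization, which the record does not specify. All function names, parameters, and default values (`factorize`, `factorize_per_cluster`, `predict`, `n_factors`, `lr`, `reg`) are hypothetical.

```python
import random
from collections import defaultdict


def factorize(ratings, n_factors=2, n_epochs=200, lr=0.01, reg=0.02, seed=0):
    """Fit latent factors to (user, item, rating) triples with plain SGD.

    Returns (user_factors, item_factors), each a dict of id -> list of floats.
    SGD is an assumption here; the source does not name the solver.
    """
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    # Small random initial factors for every user and item in this block.
    p = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(n_epochs):
        for u, i, r in ratings:
            pu, qi = p[u], q[i]
            err = r - sum(a * b for a, b in zip(pu, qi))
            for k in range(n_factors):
                pu_k, qi_k = pu[k], qi[k]
                # Gradient step on squared error with L2 regularization.
                pu[k] += lr * (err * qi_k - reg * pu_k)
                qi[k] += lr * (err * pu_k - reg * qi_k)
    return p, q


def factorize_per_cluster(ratings, item_cluster, **kwargs):
    """Split ratings by the cluster of their item and factorize each block."""
    blocks = defaultdict(list)
    for u, i, r in ratings:
        blocks[item_cluster[i]].append((u, i, r))
    # Each block is a smaller user-by-item matrix, factorized independently.
    return {c: factorize(block, **kwargs) for c, block in blocks.items()}


def predict(models, item_cluster, user, item):
    """Predict a rating from the model of the item's cluster, or None if unseen."""
    p, q = models[item_cluster[item]]
    if user not in p or item not in q:
        return None
    return sum(a * b for a, b in zip(p[user], q[item]))
```

Because each cluster's ratings form a smaller matrix, each factorization touches fewer users and items than one over the full matrix. A user who has rated nothing in a cluster has no factors there, so `predict` returns `None` for that pair in this sketch.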