A Multi-node Data Mining System with Cloud Technology - Using Decision Tree


Bibliographic Details
Main Authors: Hsuan-Lin Liu, 劉宣麟
Other Authors: Nian-Ze Hu
Format: Others
Language: zh-TW
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/4psz37
Description
Summary: Master's thesis === National Formosa University (國立虎尾科技大學) === Graduate Institute of Information Management === 102 === With advances in information technology, data mining can be used to analyze many kinds of data. The parameters of a mining method directly affect the quality of its results, so researchers repeatedly spend a great deal of computational time searching for the optimal parameter set. However, current commercial mining tools cannot handle multiple data models at one time, and processing a mining model on a large data set takes a long time. This study proposes a new architecture built on the open-source statistical language R, with the decision tree model chosen as the evaluation method. C# is used to build the user interface and a work server, which run alongside the R script program. The system applies cloud service concepts to form a multi-node processing architecture in which the mining process coordinates all available hosts to improve solving performance. The system reduces computation time and attempts to find the best combination of parameters for each model. Finally, we report system limit tests (data size, RAM usage) in comparison with some commercial mining software, and evaluate the feasibility of the architecture.
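The multi-node parameter search described in the summary can be illustrated with a minimal sketch. It is not the thesis implementation, which uses C# and R; it only shows the general idea of expanding a parameter grid, splitting it across available nodes, and merging each node's best result. Every name here is hypothetical: the parameter names `maxdepth` and `cp` are chosen for illustration, threads stand in for remote hosts, and `placeholder_score` stands in for training and validating a decision tree.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def build_grid(param_ranges):
    """Expand {name: [values]} into a list of parameter dicts."""
    names = sorted(param_ranges)
    value_lists = [param_ranges[n] for n in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


def partition(grid, n_nodes):
    """Round-robin split so every node gets a near-equal share."""
    return [grid[i::n_nodes] for i in range(n_nodes)]


def placeholder_score(params):
    """Stand-in for fitting and validating a decision tree (assumption).

    A real worker would run the mining script and return a quality metric.
    This toy function simply peaks at maxdepth=5, cp=0.01.
    """
    return -abs(params["maxdepth"] - 5) - 100 * abs(params["cp"] - 0.01)


def run_node(chunk, score_fn):
    """Evaluate one node's share; return its best (score, params) or None."""
    scored = [(score_fn(p), p) for p in chunk]
    return max(scored, key=lambda sp: sp[0]) if scored else None


def search(param_ranges, n_nodes, score_fn=placeholder_score):
    """Distribute the grid over n_nodes workers and merge the local bests."""
    chunks = partition(build_grid(param_ranges), n_nodes)
    # Threads simulate hosts here; a real system would dispatch over a network.
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        local_bests = list(pool.map(lambda c: run_node(c, score_fn), chunks))
    candidates = [b for b in local_bests if b is not None]
    return max(candidates, key=lambda sp: sp[0])


best_score, best_params = search(
    {"maxdepth": [3, 5, 7], "cp": [0.001, 0.01, 0.1]}, n_nodes=4
)
```

Because each parameter combination is evaluated independently, the work divides cleanly: adding nodes shrinks each node's share of the grid, and only the small per-node best result has to be sent back for merging.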