Using of Rough Set Theory to Increase the Performance of Big Data with Uncertainty

Master's === Ling Tung University === Master's Program, Department of Information Management === 104 === Since the advent of the Big Data era, large amounts of data have been constantly produced and stored. The research firm International Data Corporation has also pointed out that the global amount of data is growing at an annual rate of 50%, and forecasts that over the ne...

Bibliographic Details
Main Authors: CHEN, YEN-AN, 陳彥安
Other Authors: HUANG, KUANG-YU
Format: Others
Language: zh-TW
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/94520591221420675302
id ndltd-TW-104LTC00395009
record_format oai_dc
spelling ndltd-TW-104LTC00395009 2017-05-27T04:35:41Z http://ndltd.ncl.edu.tw/handle/94520591221420675302 Using of Rough Set Theory to Increase the Performance of Big Data with Uncertainty 運用粗集合理論以俾利含不精準資料之大數據分析 CHEN, YEN-AN 陳彥安 Master's, Ling Tung University, Master's Program, Department of Information Management, 104
HUANG, KUANG-YU 黃光宇 2016 學位論文 ; thesis 52 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === Ling Tung University === Master's Program, Department of Information Management === 104 === Since the advent of the Big Data era, large amounts of data have been constantly produced and stored. The research firm International Data Corporation has also pointed out that the global amount of data is growing at an annual rate of 50%, and forecasts that it will grow tenfold over the next six years. However, ever since the term "big data" appeared, the government has saved all data without filtering any of it, which results in relatively low availability of information; how to improve the availability of information has become one of the most closely watched topics. Second, processing large volumes of data without parallel processing runs into limits on processing speed, memory, and storage space. Therefore, this study proposes integrating data mining with distributed parallel computing in order to reduce large-volume data and simplify its analysis. The study uses Data Reduction (DR) techniques from data mining and considers both Record Reduction and Value Reduction. The data mining component uses Fuzzy C-Means (FCM) and Rough Set Theory (RST) to streamline the data: (1) FCM first clusters the data; (2) RST is then used to derive more condensed knowledge rules; and (3) RST's approximation-set concept is used to single out the imprecise data from the original data set. Using these two techniques to achieve the DR goals makes the resulting information clearer and more effective and significantly increases the availability of information, which in turn supports the analysis and application of large amounts of data on the parallel computing platform used, Hadoop Cloud.
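Step (3) of the abstract, singling out imprecise records through RST approximation sets, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function name `approximations`, the toy decision table, and the attribute names `temp`, `load`, and `fault` are all hypothetical and chosen only for illustration. The sketch assumes the textbook definitions: records that agree on every condition attribute form an indiscernibility class, the lower approximation collects the classes lying entirely inside the target concept, the upper approximation collects the classes that overlap it, and the boundary region (upper minus lower) holds the records that cannot be classified with certainty.

```python
from collections import defaultdict


def approximations(records, condition_attrs, decision_attr, target):
    """Return (lower, upper, boundary) sets of record indices for the
    concept "decision_attr == target" under the given condition attributes."""
    # Group record indices into indiscernibility classes: records that agree
    # on every condition attribute cannot be told apart.
    classes = defaultdict(set)
    for idx, rec in enumerate(records):
        key = tuple(rec[a] for a in condition_attrs)
        classes[key].add(idx)

    # The target concept: all records carrying the target decision value.
    concept = {i for i, rec in enumerate(records) if rec[decision_attr] == target}

    lower, upper = set(), set()
    for members in classes.values():
        if members <= concept:   # class lies entirely inside the concept
            lower |= members
        if members & concept:    # class overlaps the concept
            upper |= members
    return lower, upper, upper - lower


# Hypothetical toy decision table (not data from the thesis).
records = [
    {"temp": "high", "load": "heavy", "fault": "yes"},
    {"temp": "high", "load": "heavy", "fault": "no"},
    {"temp": "low", "load": "light", "fault": "no"},
    {"temp": "high", "load": "light", "fault": "yes"},
    {"temp": "low", "load": "light", "fault": "no"},
]

lower, upper, boundary = approximations(records, ["temp", "load"], "fault", "yes")
print(sorted(lower))     # [3]
print(sorted(upper))     # [0, 1, 3]
print(sorted(boundary))  # [0, 1]
```

In this toy table, records 0 and 1 share the same condition values but disagree on the decision, so they fall in the boundary region; these are the kind of records the abstract describes as imprecise.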
author2 HUANG, KUANG-YU
author_facet HUANG, KUANG-YU
CHEN, YEN-AN
陳彥安
author CHEN, YEN-AN
陳彥安
spellingShingle CHEN, YEN-AN
陳彥安
Using of Rough Set Theory to Increase the Performance of Big Data with Uncertainty
author_sort CHEN, YEN-AN
title Using of Rough Set Theory to Increase the Performance of Big Data with Uncertainty
title_short Using of Rough Set Theory to Increase the Performance of Big Data with Uncertainty
title_full Using of Rough Set Theory to Increase the Performance of Big Data with Uncertainty
title_fullStr Using of Rough Set Theory to Increase the Performance of Big Data with Uncertainty
title_full_unstemmed Using of Rough Set Theory to Increase the Performance of Big Data with Uncertainty
title_sort using of rough set theory to increase the performance of big data with uncertainty
publishDate 2016
url http://ndltd.ncl.edu.tw/handle/94520591221420675302
work_keys_str_mv AT chenyenan usingofroughsettheorytoincreasetheperformanceofbigdatawithuncertainty
AT chényànān usingofroughsettheorytoincreasetheperformanceofbigdatawithuncertainty
AT chenyenan yùnyòngcūjíhélǐlùnyǐbǐlìhánbùjīngzhǔnzīliàozhīdàshùjùfēnxī
AT chényànān yùnyòngcūjíhélǐlùnyǐbǐlìhánbùjīngzhǔnzīliàozhīdàshùjùfēnxī
_version_ 1718454016940179456