Study of Performance Optimization Scheme for Hadoop MapReduce Architecture

Doctoral === Chung Cheng Institute of Technology, National Defense University === Graduate School of Defense Science === 104 === As the use of cloud computing increases rapidly, the volume of big data continues to grow quickly. The performance of big data processing has become an important research issue. This thesis discusses performance measurement methods together with performance tun...


Bibliographic Details
Main Authors: LO,HSIANG-FU, 羅祥福
Other Authors: LIU,CHIANG-LUNG
Format: Others
Language: zh-TW
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/55720702234492522318
id ndltd-TW-104CCIT0584003
record_format oai_dc
spelling ndltd-TW-104CCIT0584003 2017-09-15T04:40:14Z http://ndltd.ncl.edu.tw/handle/55720702234492522318 Study of Performance Optimization Scheme for Hadoop MapReduce Architecture 啟發式雲端平台自我效能優化機制之研究 LO,HSIANG-FU 羅祥福 博士 國防大學理工學院 國防科學研究所 104 LIU,CHIANG-LUNG LIU,FONG-HAO CHANG,KO-CHIN 劉江龍 劉豐豪 張克勤 2016 學位論文 ; thesis 79 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Doctoral === Chung Cheng Institute of Technology, National Defense University === Graduate School of Defense Science === 104 === As the use of cloud computing increases rapidly, the volume of big data continues to grow quickly. The performance of big data processing has become an important research issue. This thesis discusses performance measurement methods and performance tuning schemes for Hadoop MapReduce, and proposes corresponding performance improvement methods. To provide a performance measurement scheme for Hadoop information hiding applications, a Performance AnalysiS Scheme for MapReduce Information Hiding (PASS-MIH) model is proposed to analyze and measure the performance impact factors of such applications. Experimental results show that the PASS-MIH model can estimate four levels of performance impact factors for an MR-based LSB test case and achieve a 53.8% performance improvement rate when integrated with an existing Hadoop parameter tuning method. In addition, a Comprehensive Performance Rating (CPR) model is used to identify nine principal components, drawn from workload history and Hadoop configuration, that strongly affect Hadoop performance. Experimental results indicate that tuning the principal components of the Hadoop configuration can produce non-linear performance results. Finally, an ACO-based Hadoop Configuration Optimization (ACO-HCO) scheme is proposed to optimize Hadoop performance by automatically tuning its configuration parameter settings. ACO-HCO first employs gene expression programming to build an objective function from historical job running records; this function represents the correlation among the Hadoop configuration parameters. It then employs ant colony optimization, which uses the objective function to search for optimal or near-optimal parameter settings. Experimental results verify that the ACO-HCO scheme significantly enhances Hadoop performance compared with the default settings. Moreover, it outperforms both rule-of-thumb settings and the Starfish model in Hadoop performance optimization.
author2 LIU,CHIANG-LUNG
author_facet LIU,CHIANG-LUNG
LO,HSIANG-FU
羅祥福
author LO,HSIANG-FU
羅祥福
spellingShingle LO,HSIANG-FU
羅祥福
Study of Performance Optimization Scheme for Hadoop MapReduce Architecture
author_sort LO,HSIANG-FU
title Study of Performance Optimization Scheme for Hadoop MapReduce Architecture
title_short Study of Performance Optimization Scheme for Hadoop MapReduce Architecture
title_full Study of Performance Optimization Scheme for Hadoop MapReduce Architecture
title_fullStr Study of Performance Optimization Scheme for Hadoop MapReduce Architecture
title_full_unstemmed Study of Performance Optimization Scheme for Hadoop MapReduce Architecture
title_sort study of performance optimization scheme for hadoop mapreduce architecture
publishDate 2016
url http://ndltd.ncl.edu.tw/handle/55720702234492522318
work_keys_str_mv AT lohsiangfu studyofperformanceoptimizationschemeforhadoopmapreducearchitecture
AT luóxiángfú studyofperformanceoptimizationschemeforhadoopmapreducearchitecture
AT lohsiangfu qǐfāshìyúnduānpíngtáizìwǒxiàonéngyōuhuàjīzhìzhīyánjiū
AT luóxiángfú qǐfāshìyúnduānpíngtáizìwǒxiàonéngyōuhuàjīzhìzhīyánjiū
_version_ 1718533656726732800
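
Illustrative note on the ACO-HCO search step described in the abstract: the record does not include the thesis's objective function or its list of tuned parameters, so the following Python sketch only shows the general shape of an ant colony search over a discrete grid of configuration values. The three parameter names are real Hadoop 2.x settings, but selecting them, the candidate values, the `estimated_runtime` formula, and the `aco_search` helper are all assumptions made for illustration and are not taken from the thesis.

```python
import random

# Hypothetical search space: candidate values per Hadoop parameter.
# The names are real Hadoop 2.x settings, but choosing them (and these
# value grids) is an assumption made for illustration only.
SEARCH_SPACE = {
    "mapreduce.task.io.sort.mb": [50, 100, 200, 400],
    "mapreduce.task.io.sort.factor": [10, 25, 50, 100],
    "mapreduce.job.reduces": [1, 2, 4, 8, 16],
}


def estimated_runtime(config):
    """Stand-in objective (lower is better).

    In the thesis this role is played by a function built with gene
    expression programming from historical job records. The formula
    below is invented purely so that the sketch can run.
    """
    sort_mb = config["mapreduce.task.io.sort.mb"]
    factor = config["mapreduce.task.io.sort.factor"]
    reduces = config["mapreduce.job.reduces"]
    return abs(sort_mb - 200) / 50 + abs(factor - 50) / 25 + abs(reduces - 8)


def aco_search(space, objective, ants=10, iterations=30,
               evaporation=0.2, seed=0):
    """Search a discrete configuration grid with a basic ant colony scheme."""
    rng = random.Random(seed)
    # One pheromone value per (parameter, candidate value) pair.
    pheromone = {name: [1.0] * len(values) for name, values in space.items()}
    best_config, best_cost = None, float("inf")

    for _ in range(iterations):
        trails = []
        for _ in range(ants):
            # Each ant picks one value index per parameter, weighted by pheromone.
            picks = {
                name: rng.choices(range(len(values)), weights=pheromone[name])[0]
                for name, values in space.items()
            }
            config = {name: space[name][i] for name, i in picks.items()}
            cost = objective(config)
            trails.append((cost, picks))
            if cost < best_cost:
                best_config, best_cost = config, cost

        # Evaporate all pheromone, then reinforce this iteration's best trail.
        for name in pheromone:
            pheromone[name] = [p * (1.0 - evaporation) for p in pheromone[name]]
        top_cost, top_picks = min(trails, key=lambda t: t[0])
        for name, i in top_picks.items():
            pheromone[name][i] += 1.0 / (1.0 + top_cost)

    return best_config, best_cost


if __name__ == "__main__":
    config, cost = aco_search(SEARCH_SPACE, estimated_runtime)
    print(config, cost)
```

Reinforcing only the best trail of each iteration is one common ACO variant, chosen here for brevity; the thesis may use a different pheromone update rule.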