Improving Fair Scheduling Performance on Hadoop

Master's thesis === National Dong Hwa University === Department of Computer Science and Information Engineering === 103 === Cloud computing and big data are both prominent topics worldwide. Cloud computing not only offers a storage platform through which big data can be accessed, but also provides a practical technique for processing very large amounts of data at the same time. Therefore,...

Bibliographic Details
Main Authors: Ya-Wen Cheng, 鄭雅文
Other Authors: Shou-Chih Lo
Format: Others
Published: 2015
Online Access: http://ndltd.ncl.edu.tw/handle/16860934128679350588
id ndltd-TW-103NDHU5392031
record_format oai_dc
spelling ndltd-TW-103NDHU5392031 2016-07-31T04:22:08Z http://ndltd.ncl.edu.tw/handle/16860934128679350588 Improving Fair Scheduling Performance on Hadoop 改善Hadoop公平排程器之效能 Ya-Wen Cheng 鄭雅文 Master's thesis National Dong Hwa University Department of Computer Science and Information Engineering 103 Shou-Chih Lo 羅壽之 2015 thesis 68
collection NDLTD
format Others
sources NDLTD
description Master's thesis === National Dong Hwa University === Department of Computer Science and Information Engineering === 103 === Cloud computing and big data are both prominent topics worldwide. Cloud computing not only offers a storage platform through which big data can be accessed, but also provides a practical technique for processing very large amounts of data at the same time. This thesis therefore chooses the open-source Hadoop platform as its subject. Our study focuses on improving Hadoop performance through fair scheduling. Our goal is to consult a number of real-time parameters and use them to decide which job should receive system resources first. In addition, we dynamically adjust related parameters, such as job priority and delay time, in order to shorten job run time and improve system performance. This thesis presents five mechanisms: job classification, pool resource assignment, job sorting (FIFO-based and fairness-based), dynamic delay time adjustment, and dynamic job priority adjustment. These strategies consult the actual system status so that their decisions affect system performance. The experiments show that the proposed mechanisms improve fair scheduling performance and that our method outperforms the original Hadoop fair scheduler by a substantial margin.
author2 Shou-Chih Lo
author_facet Shou-Chih Lo
Ya-Wen Cheng
鄭雅文
author Ya-Wen Cheng
鄭雅文
title Improving Fair Scheduling Performance on Hadoop
publishDate 2015
url http://ndltd.ncl.edu.tw/handle/16860934128679350588