Improving the Hadoop System Performance through Activity-Aware Data Access
Master's === Fu Jen Catholic University === Master's Program, Department of Computer Science and Information Engineering === 107 === As cloud computing grows more popular, cloud systems have been widely adopted to store and share information among users. Apache Hadoop is one of the most popular platforms in the cloud community. It could consist of a large number of comp...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Others |
Language: | zh-TW |
Published: | 2019 |
Online Access: | http://ndltd.ncl.edu.tw/handle/9nusyp |
id |
ndltd-TW-107FJU00396039 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-107FJU00396039 2019-09-17T03:40:08Z http://ndltd.ncl.edu.tw/handle/9nusyp Improving the Hadoop System Performance through Activity-Aware Data Access 藉由資料存取位置篩選來改進Hadoop系統效能 CHEN, YU-LIN 陳宥霖 Master's Fu Jen Catholic University Master's Program, Department of Computer Science and Information Engineering 107 As cloud computing grows more popular, cloud systems have been widely adopted to store and share information among users. Apache Hadoop is one of the most popular platforms in the cloud community. A Hadoop cluster can consist of a large number of computing nodes and keeps replicas of its data across those nodes. As a result, jobs based on the MapReduce model in Hadoop can be divided into smaller tasks and distributed to multiple computing nodes to speed up their execution. However, a MapReduce job can be delayed when it reads data from computing nodes that are under heavy disk I/O at the time of access. This research aims to mitigate that delay by helping MapReduce jobs read data from computing nodes with less disk I/O instead of from the busy ones, which in turn accelerates the progress of the jobs. In addition, because our approach always reads from the disks with less activity, the real-time disk load across the Hadoop cluster also becomes more balanced. YEH, TSO-ZEN 葉佐任 2019 degree thesis ; thesis 53 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's === Fu Jen Catholic University === Master's Program, Department of Computer Science and Information Engineering === 107 === As cloud computing grows more popular, cloud systems have been widely adopted to store and share information among users.
Apache Hadoop is one of the most popular platforms in the cloud community. A Hadoop cluster can consist of a large number of computing nodes and keeps replicas of its data across those nodes. As a result, jobs based on the MapReduce model in Hadoop can be divided into smaller tasks and distributed to multiple computing nodes to speed up their execution. However, a MapReduce job can be delayed when it reads data from computing nodes that are under heavy disk I/O at the time of access.
This research aims to mitigate that delay by helping MapReduce jobs read data from computing nodes with less disk I/O instead of from the busy ones, which in turn accelerates the progress of the jobs. In addition, because our approach always reads from the disks with less activity, the real-time disk load across the Hadoop cluster also becomes more balanced.
|
author2 |
YEH, TSO-ZEN |
author_facet |
YEH, TSO-ZEN CHEN, YU-LIN 陳宥霖 |
author |
CHEN, YU-LIN 陳宥霖 |
spellingShingle |
CHEN, YU-LIN 陳宥霖 Improving the Hadoop System Performance through Activity-Aware Data Access |
author_sort |
CHEN, YU-LIN |
title |
Improving the Hadoop System Performance through Activity-Aware Data Access |
title_short |
Improving the Hadoop System Performance through Activity-Aware Data Access |
title_full |
Improving the Hadoop System Performance through Activity-Aware Data Access |
title_fullStr |
Improving the Hadoop System Performance through Activity-Aware Data Access |
title_full_unstemmed |
Improving the Hadoop System Performance through Activity-Aware Data Access |
title_sort |
improving the hadoop system performance through activity-aware data access |
publishDate |
2019 |
url |
http://ndltd.ncl.edu.tw/handle/9nusyp |
work_keys_str_mv |
AT chenyulin improvingthehadoopsystemperformancethroughactivityawaredataaccess AT chényòulín improvingthehadoopsystemperformancethroughactivityawaredataaccess AT chenyulin jíyóuzīliàocúnqǔwèizhìshāixuǎnláigǎijìnhadoopxìtǒngxiàonéng AT chényòulín jíyóuzīliàocúnqǔwèizhìshāixuǎnláigǎijìnhadoopxìtǒngxiàonéng |
_version_ |
1719250852378574848 |
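
The replica-selection idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation and does not use any Hadoop API: the `Replica` type, the `pick_least_busy_replica` function, the node names, and the "busy score" metric (assumed here to be the fraction of time a disk spends on I/O) are all hypothetical, chosen only to show the general shape of choosing the replica on the least active disk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One copy of a data block (hypothetical type, not a Hadoop class)."""
    block_id: str
    node: str


def pick_least_busy_replica(replicas, disk_busy):
    """Return the replica whose node reports the lowest disk activity.

    disk_busy maps a node name to a busy score (assumed: fraction of time
    the disk spent on I/O, from 0.0 to 1.0). A node with no reading is
    treated as fully busy, so nodes with known measurements are preferred.
    """
    if not replicas:
        raise ValueError("no replicas to choose from")
    return min(replicas, key=lambda r: disk_busy.get(r.node, 1.0))


# Illustrative data: three replicas of one block and made-up busy scores.
replicas = [
    Replica("blk_1", "node-a"),
    Replica("blk_1", "node-b"),
    Replica("blk_1", "node-c"),
]
disk_busy = {"node-a": 0.92, "node-b": 0.15, "node-c": 0.40}

chosen = pick_least_busy_replica(replicas, disk_busy)
print(chosen.node)
```

With the made-up scores above, the read would be directed to the replica on `node-b`. How disk activity is actually measured and how often it is refreshed are not stated in the abstract, so both are left as assumptions here.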