Summary: | Master's === Chang Gung University === Department of Computer Science and Information Engineering === 104 === In the second version of Hadoop MapReduce, YARN
(Yet Another Resource Negotiator) was introduced to improve
performance. One of
the essential improvements is that the dependency between tasks
is no longer required in YARN. As a result,
tasks sit idle inside a node because reduce tasks are allocated
prematurely, even though a reduce task should be processed only once
all map tasks of the same job have been processed completely.
Consequently, it takes longer to finish all of the submitted
jobs. Meanwhile, the cost of allocating tasks to nodes should
be considered as well, and allocation of tasks to higher-performance
nodes should be prioritized. In this article, a novel
scheduling policy comprising task selection and task
scheduling is proposed to address these drawbacks by minimizing
idle time and maximizing resource utilization.
The performance of the proposed algorithm was compared with that of
the FIFO, Fair, and Capacity schedulers, the built-in and pluggable
scheduling policies in Hadoop YARN, for verification. The
experimental results showed that the proposed policy achieved
better average CPU utilization, job completion
time, and task completion time than the compared
schedulers.
|