LSTPD: Least Slack Time-Based Preemptive Deadline Constraint Scheduler for Hadoop Clusters
Big data refers to large, complex datasets of many forms that require specialized computational platforms for analysis. Hadoop is one of the popular frameworks for big data analytics. In Hadoop, a large job is split into many small tasks, which are then distributed to worker n...
Main Authors: | Ihsan Ullah, Muhammad Sajjad Khan, Muhammad Amir, Junsu Kim, Su Min Kim |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
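Subjects: | Hadoop; MapReduce; distributed system; parallel computing; preemptive job scheduling; queuing theory |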
Online Access: | https://ieeexplore.ieee.org/document/9117131/ |
id |
doaj-926a64a116994203bf38414abd6fb940 |
---|---|
record_format |
Article |
spelling |
doaj-926a64a116994203bf38414abd6fb940; 2021-03-30T02:46:48Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8, pp. 111751-111762; DOI 10.1109/ACCESS.2020.3002565; document 9117131; LSTPD: Least Slack Time-Based Preemptive Deadline Constraint Scheduler for Hadoop Clusters; Ihsan Ullah (0), Muhammad Sajjad Khan (1; https://orcid.org/0000-0003-3238-0434), Muhammad Amir (2), Junsu Kim (3), Su Min Kim (4); (0) Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, South Korea; (1) Department of Electronics Engineering, Korea Polytechnic University, Siheung, South Korea; (2) Department of Electrical Engineering, International Islamic University at Islamabad, Islamabad, Pakistan; (3) Department of Electronics Engineering, Korea Polytechnic University, Siheung, South Korea; (4) Department of Electronics Engineering, Korea Polytechnic University, Siheung, South Korea; https://ieeexplore.ieee.org/document/9117131/; Hadoop; MapReduce; distributed system; parallel computing; preemptive job scheduling; queuing theory |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Ihsan Ullah, Muhammad Sajjad Khan, Muhammad Amir, Junsu Kim, Su Min Kim |
title |
LSTPD: Least Slack Time-Based Preemptive Deadline Constraint Scheduler for Hadoop Clusters |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2020-01-01 |
description |
Big data refers to large, complex datasets of many forms that require specialized computational platforms for analysis. Hadoop is one of the popular frameworks for big data analytics. In Hadoop, a large job is split into many small tasks, which are then distributed to worker nodes and executed in parallel using MapReduce to speed up computation. Improving throughput is therefore important. MapReduce jobs require quick responses from the worker nodes in order to complete within their deadlines. The existing Hadoop schedulers, such as the FIFO, fair, and capacity schedulers, cannot guarantee a response quick enough to meet a deadline set in advance. Thus, the Hadoop system needs to improve response time and completion time for heterogeneous MapReduce jobs. In this paper, we propose an efficient preemptive deadline constraint scheduler based on least slack time and data locality. To achieve better task allocation and load balancing, we first analyze the task scheduling behavior of the Hadoop platform. Based on that analysis, we propose a novel preemptive approach that considers the remaining execution time of the currently executing job when deciding whether to preempt it. The experimental results show that the proposed scheme significantly reduces job execution time and queue waiting time compared to existing schemes. |
topic |
Hadoop; MapReduce; distributed system; parallel computing; preemptive job scheduling; queuing theory |
url |
https://ieeexplore.ieee.org/document/9117131/ |