Deadline Constraint Aware Scheduler for Executing High Performance Computing Application on Hadoop MapReduce Framework
D C Vinutha1, G T Raju2

1D C VINUTHA, Research Scholar, Department of CSE, RNS Institute of Technology, Bengaluru, Associate Professor, Dept. of ISE, Vidyavardhaka College of Engineering, Mysuru, Visvesvaraya Technological University, Belagavi, Karnataka.
2G T RAJU, Professor, Department of CSE, RNS Institute of Technology, Bengaluru, Visvesvaraya Technological University, Belagavi, Karnataka.

Manuscript received on 02 June 2019 | Revised Manuscript received on 10 June 2019 | Manuscript published on 30 June 2019 | PP: 3263-3271 | Volume-8 Issue-8, June 2019 | Retrieval Number: H7434068819/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: MapReduce (MR) is a parallel programming framework used to execute scientific and data-intensive High-Performance Computing (HPC) applications in parallel on the Hadoop platform. Certain jobs come with Service Level Agreement (SLA) prerequisites for their computation. State-of-the-art SLA-aware MR schedulers do not consider the problems of dynamic makespan and varying virtual computing machine performance. Hence, this paper presents the Deadline Constraint Aware Scheduler (DCAS) for the Hadoop MapReduce (HMR) framework. DCAS can obtain optimum results when scheduling deadline-constrained problems by using Hadoop job history logs. The DCAS makespan model is designed for a heterogeneous Hadoop framework and assumes that some of the virtual computing machines cannot guarantee the SLA prerequisite. Further, DCAS takes data locality into consideration when allocating resources, to reduce the time consumed by data access. However, the available resources cannot guarantee the SLA prerequisites of all jobs. Hence, the proposed DCAS attempts to meet the SLA prerequisites of all jobs. From an extensive analysis, it can be seen that no prior work has considered dynamic scientific and data-intensive computing on the HMR framework. Further, no prior work has considered long-read genomic sequence alignment. The DCAS model offers parallel execution of the gene sequence alignment process in a multi-core environment, thereby helping to improve resource utilization. The proposed approach is evaluated experimentally on gene sequence alignment using BWA-SW, CAP3 assembly, and text mining applications. Experimental results show that DCAS achieves an average makespan performance (resource utilization) improvement of 43.95%, 42.33%, and 52.52% over the HMR framework for CAP3, gene sequence alignment, and text mining applications, respectively.
Keywords: Big data, Cloud computing, Hadoop, High performance computing, MapReduce, SLA, Task scheduler, QoS.
Scope of the Article: Computer Science and Its Applications.