Hierarchically Distributed Data Matrix Scheme for Big Data Processing
G. Sirichandana Reddy1, Ch. Mallikarjuna Rao2

1G. Sirichandana Reddy, P.G. Scholar, M.Tech Computer Science and Engineering, Department of Computer Science and Engineering, GRIET, Hyderabad, India.
2Dr. Ch. Mallikarjuna Rao, Professor, Department of Computer Science and Engineering, GRIET, Hyderabad, India.

Manuscript received on September 16, 2019. | Revised Manuscript received on 24 September, 2019. | Manuscript published on October 10, 2019. | PP: 4161-4165 | Volume-8 Issue-12, October 2019. | Retrieval Number: L36581081219/2019©BEIESP | DOI: 10.35940/ijitee.L3658.1081219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: MapReduce is a programming model and an associated implementation for processing and generating large data sets. It runs on large clusters of commodity machines and is highly scalable. Over the past years, frameworks such as MapReduce and Spark have been introduced to ease the task of developing big data programs and applications. However, jobs in these frameworks are only roughly defined and are packaged as executable jars without any of their functionality being exposed or described. As a result, deployed jobs are not natively composable or reusable for subsequent development. This also hampers the ability to apply optimizations to the data flow of job sequences and pipelines. In this article, we present the Hierarchically Distributed Data Matrix (HDM), a functional, strongly-typed data representation for writing composable big data applications. Along with HDM, a runtime framework is provided to support the execution of HDM applications on distributed infrastructures. Based on the functional data dependency graph of HDM, multiple optimizations are applied to improve the performance of executing HDM jobs. The experimental results show that our optimizations achieve improvements of between 10% and 60% in Job-Completion-Time for different types of applications when compared with the current state of the art, Apache Spark.
Keywords: MapReduce, Apache Spark, HDM, Big Data Processing, Distributed Systems.
Scope of the Article: Big Data Analytics
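
Illustrative sketch: the abstract describes HDM as a functional data representation whose data dependency graph can be optimized before execution. The following minimal Python sketch shows the general idea only. It is not the HDM API: the names `Node`, `source`, `fuse_maps`, `depth`, and `compute` are hypothetical, and map fusion is chosen here as one example of an optimization that such a graph makes possible (the abstract does not say which optimizations HDM applies).

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Node:
    """One node of a lazy dependency graph (hypothetical, not the HDM API)."""
    op: str                                       # "source" or "map"
    func: Optional[Callable[[Any], Any]] = None   # function applied by a map node
    parent: Optional["Node"] = None               # upstream dependency
    data: Optional[tuple] = None                  # payload of a source node

    def map(self, f: Callable[[Any], Any]) -> "Node":
        # Composing returns a new node; nothing is computed yet.
        return Node("map", func=f, parent=self)


def source(items) -> Node:
    """Wrap in-memory items as the root of a graph."""
    return Node("source", data=tuple(items))


def fuse_maps(node: Node) -> Node:
    """Rewrite the graph so that consecutive map nodes become a single map."""
    if node.op == "source":
        return node
    parent = fuse_maps(node.parent)
    if parent.op == "map":
        f, g = parent.func, node.func
        # Compose the two functions and skip the intermediate node.
        return Node("map", func=lambda x, f=f, g=g: g(f(x)), parent=parent.parent)
    return Node("map", func=node.func, parent=parent)


def depth(node: Optional[Node]) -> int:
    """Number of nodes on the chain from this node back to the source."""
    count = 0
    while node is not None:
        count += 1
        node = node.parent
    return count


def compute(node: Node) -> list:
    """Evaluate the graph; each map node materializes one intermediate list."""
    if node.op == "source":
        return list(node.data)
    return [node.func(x) for x in compute(node.parent)]


# Build a small pipeline, then optimize it before running it.
pipeline = source([1, 2, 3]).map(lambda x: x + 1).map(lambda x: x * 2)
optimized = fuse_maps(pipeline)
print(depth(pipeline), depth(optimized), compute(optimized))
```

Because each job here is a value describing its own dependencies rather than an opaque executable, it can be extended with further `map` calls and rewritten by a planner before anything runs, which is the property the abstract contrasts with jobs packaged as jars.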