Dimensionality Reduction using Machine Learning and Big Data Technologies
S. Ranga Swamy1, P S V Srinivasa Rao2, J.V N Raju3, M. Nagavamsi4

1Dr Ranga Swamy Sirisati, Associate Professor, Department of Computer Science & Engineering, Vignan’s Institute of Management and Technology for Women, Kondapur, Ghatkesar.
2Dr PSV Srinivasa Rao, Professor, Department of Computer Science & Engineering, Vignan’s Institute of Management and Technology for Women, Kondapur, Ghatkesar.
3JVN Raju, Assistant Professor, Department of Computer Science & Engineering, Sri Vasavi Institute of Engineering & Technology, Nandamuru.
4Mireyala Nagavamsi, Assistant Professor, Department of Computer Science & Engineering, Sri Vasavi Institute of Engineering & Technology, Nandamuru.

Manuscript received on November 15, 2019. | Revised Manuscript received on 20 November, 2019. | Manuscript published on December 10, 2019. | PP: 1740-1745 | Volume-9 Issue-2, December 2019. | Retrieval Number: B7580129219/2019©BEIESP | DOI: 10.35940/ijitee.B7580.129219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Machine learning and big data models are among the most useful components of software technologies. However, these systems need to handle as little data as possible at processing time, while data dimensionality grows day by day as technology advances. Any algorithm applied to high-dimensional data requires more processing time and storage resources. The curse of dimensionality refers to the problems that arise when working with data in higher dimensions and that do not exist in lower dimensions. This paper addresses the issue of information safety at low dimensionality. Addressing this problem is equivalent to addressing the safety problem of the hardware and software platform. A decision tree (DT) machine learning (ML) model is helpful for these dimensionality and clustering problems. The DT-ML model reduced the size of the duplicate data, and clustering achieved an efficiency of 94.3% and a reduction ratio of 32.4%.
Keywords: Machine Learning, Big Data, Dimensionality Reduction, Software Technologies, HDFS, Petabyte, Reduplications
Scope of the Article: Machine learning