Outlier Detection in High Dimensional Data Based on Elastic Net Regression
Ch. Anuradha1, M. Ramesh2, Patnala S.R. Chandra Murty3

1Ch. Anuradha*, Dept. of CSE, ANU, Guntur, Andhra Pradesh, India.
2Dr. M. Ramesh, Dept. of CSE, RVR & JC College of Engineering, Guntur, Andhra Pradesh, India.
3Dr. Patnala S.R. Chandra Murty, Dept. of CSE, ANU, Guntur, Andhra Pradesh, India.
Manuscript received on September 16, 2019. | Revised Manuscript received on September 25, 2019. | Manuscript published on October 10, 2019. | PP: 325-328 | Volume-8 Issue-12, October 2019. | Retrieval Number: L34791081219/2019©BEIESP | DOI: 10.35940/ijitee.L3479.1081219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Outlier detection in large datasets is an active research area in computer science, spanning fields such as data mining, database systems, and distributed systems. It faces many challenges because data samples from the outlier class are scarce or absent. Many algorithms have been proposed to address these challenges and to improve the efficiency of regression approaches on large datasets. Currently, however, no regression technique has been designed specifically for efficient outlier detection. In this research, we propose an ElasticNet regression model for detecting outliers in high-dimensional data. To validate the efficiency and competence of the proposed algorithm, it is implemented in the open-source software Weka Explorer. On the annthyroid dataset, the model yields a mean absolute error (MAE) of 0.0022, a root mean squared error (RMSE) of 0.0387, a relative absolute error (RAE) of 0.4562, and a root relative squared error (RSE) of 7.8722. The ElasticNet model consumes less computational time, converges quickly, and correctly classifies 98.25% of instances.
Keywords: Outlier Detection, Elastic Net Regression, High-Dimensional Data, Weka Explorer, Annthyroid.
Scope of the Article: Data Mining
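
For readers unfamiliar with the four error measures quoted in the abstract, the following is a minimal sketch of how they are conventionally defined. It is not the authors' Weka implementation: the function name regression_errors and the toy values are hypothetical, and the baseline used for the two relative measures is assumed to be the mean of the actual values passed in. Tools such as Weka usually report the relative measures multiplied by 100, as percentages.

```python
import math


def regression_errors(actual, predicted):
    """Return MAE, RMSE, RAE and RRSE for two paired numeric sequences.

    Hypothetical helper for illustration only. The relative measures
    compare the model against a baseline that always predicts the mean
    of `actual` (an assumption of this sketch).
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(actual)
    mean_actual = sum(actual) / n

    # Total absolute and squared error of the model's predictions.
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))

    # Total absolute and squared error of the mean-predicting baseline.
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)
    if base_abs == 0:
        raise ValueError("actual values are constant; relative errors are undefined")

    return {
        "mae": abs_err / n,                   # mean absolute error
        "rmse": math.sqrt(sq_err / n),        # root mean squared error
        "rae": abs_err / base_abs,            # relative absolute error
        "rrse": math.sqrt(sq_err / base_sq),  # root relative squared error
    }


if __name__ == "__main__":
    # Toy labels (1 = outlier, 0 = normal); not taken from the annthyroid dataset.
    toy_actual = [0, 0, 1, 1]
    toy_predicted = [0, 0, 1, 0]
    for name, value in regression_errors(toy_actual, toy_predicted).items():
        print(f"{name}: {value:.4f}")
```

A relative error below 1 (below 100% when expressed as a percentage) means the model does better than the mean-predicting baseline on that measure.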