Ensembling Coalesce of Logistic Regression Classifier for Heart Disease Prediction using Machine Learning
Shakila Basheer1, Rincy Merlin Mathew2, M. Shyamala Devi3
1Shakila Basheer*, Assistant Professor, Department of Information Systems, College of Computer and Information Sciences, Princess Nourah Bint Abdul Rahman University, Riyadh, Saudi Arabia.
2Rincy Merlin Mathew, Lecturer, Department of Computer Science, College of Science and Arts, Khamis Mushayt, King Khalid University, Abha, Asir, Saudi Arabia.
3M. Shyamala Devi, Associate Professor, Computer Science and Engineering, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Avadi, Chennai, Tamil Nadu, India.
Manuscript received on September 16, 2019. | Revised Manuscript received on 24 September, 2019. | Manuscript published on October 10, 2019. | PP: 127-133 | Volume-8 Issue-12, October 2019. | Retrieval Number: L34731081219/2019©BEIESP | DOI: 10.35940/ijitee.L3473.1081219
Open Access | Ethics and Policies | Cite | Mendeley | Indexing and Abstracting
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Inn today’s modern world, the world population is affected with some kind of heart diseases. With the vast knowledge and advancement in applications, the analysis and the identification of the heart disease still remain as a challenging issue. Due to the lack of awareness in the availability of patient symptoms, the prediction of heart disease is a questionable task. The World Health Organization has released that 33% of population were died due to the attack of heart diseases. With this background, we have used Heart Disease Prediction dataset extracted from UCI Machine Learning Repository for analyzing and the prediction of heart disease by integrating the ensembling methods. The prediction of heart disease classes are achieved in four ways. Firstly, The important features are extracted for the various ensembling methods like Extra Trees Regressor, Ada boost regressor, Gradient booster regress, Random forest regressor and Ada boost classifier. Secondly, the highly importance features of each of the ensembling methods is filtered from the dataset and it is fitted to logistic regression classifier to analyze the performance. Thirdly, the same extracted important features of each of the ensembling methods are subjected to feature scaling and then fitted with logistic regression to analyze the performance. Fourth, the Performance analysis is done with the performance metric such as Mean Squared error (MSE), Mean Absolute error (MAE), R2 Score, Explained Variance Score (EVS) and Mean Squared Log Error (MSLE). The implementation is done using python language under Spyder platform with Anaconda Navigator. Experimental results shows that before applying feature scaling, the feature importance extracted from the Ada boost classifier is found to be effective with the MSE of 0.04, MAE of 0.07, R2 Score of 92%, EVS of 0.86 and MSLE of 0.16 as compared to other ensembling methods. Experimental results shows that after applying feature scaling, the feature importance extracted from the Ada boost classifier is found to be effective with the MSE of 0.09, MAE of 0.13, R2 Score of 91%, EVS of 0.93 and MSLE of 0.18 as compared to other ensembling methods.
Keywords: Machine Learning, Classification, MAE, MSE, MSLE, EVS and R2 Score.
Scope of the Article: Classification