Performance Evaluation of Machine Learning Models for Diabetes Prediction
Aishwarya Jakka1, Vakula Rani J2
1Dr. Vakula Rani J, Dept. of MCA, CMRIT, Bangalore, India.
2Aishwarya Jakka, Graduate Student, Information Science at Pittsburgh, USA.
Manuscript received on 27 August 2019. | Revised Manuscript received on 02 September 2019. | Manuscript published on 30 September 2019. | PP: 1976-1980 | Volume-8 Issue-11, September 2019. | Retrieval Number: K21550981119/2019©BEIESP | DOI: 10.35940/ijitee.K2155.0981119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Diabetes is a prevalent disease all over the world. According to the International Diabetes Federation (IDF) report for 2017, diabetes affects about 8.8% of the Indian adult population and is one of the top ten causes of death in India. Untreated and undiagnosed diabetes can cause fluctuations in blood sugar levels and, in extreme cases, damage organs such as the kidneys, the eyes, and the arteries of the heart. Using machine learning algorithms to predict the disease at an early stage from relevant datasets could save lives. The purpose of this investigation is to assess which classifiers predict the probability of the disease in patients with the greatest precision and accuracy. Experiments were carried out with the classification algorithms K-Nearest Neighbor (KNN), Decision Tree (DT), Naive Bayes (NB), Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF) on the Pima Indians Diabetes dataset, which has nine attributes and is available online from the UCI Repository. The performance of each classifier is evaluated in terms of precision, recall, and accuracy, estimated over the correctly and incorrectly classified instances. The results show that Logistic Regression (LR) performs best, with an accuracy of 77.6%, in comparison to the other algorithms.
Keywords: Classification, K-Nearest Neighbor (KNN), Decision Tree (DT), Naïve Bayes (NB), Support Vector Machine (SVM), Logistic Regression (LR), International Diabetes Federation (IDF).
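The evaluation criteria named in the abstract (accuracy, precision, and recall, estimated over correct and incorrect instances) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `evaluate`, the `Metrics` container, and the toy labels are hypothetical and chosen only for illustration, and the numbers it prints are not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision and recall for a binary classifier.

    Hypothetical helper: counts correct and incorrect instances and
    derives the three metrics from those counts.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("no instances to evaluate")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    tn = len(pairs) - tp - fp - fn                                    # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return Metrics(accuracy, precision, recall)


# Toy labels invented for illustration only (1 = diabetic, 0 = non-diabetic).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

In a comparison like the one described, the same function would be applied to the held-out predictions of each classifier (KNN, DT, NB, SVM, LR, RF) so that all models are scored on identical terms.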
Scope of the Article: