Constituent Depletion and Divination of Hypothyroid Prevalence using Machine Learning Classification
M. Shyamala Devi1, Ankita Shil2, Prakhar Katyayan3, Tanmay Surana4

1M. Shyamala Devi, Associate Professor, Computer Science and Engineering, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Avadi, Chennai, Tamil Nadu, India.
2Ankita Shil, III Year B.Tech Student, Computer Science and Engineering, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Avadi, Chennai, Tamil Nadu, India.
3Prakhar Katyayan, III Year B.Tech Student, Computer Science and Engineering, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Avadi, Chennai, Tamil Nadu, India.
4Tanmay Surana, III Year B.Tech Student, Computer Science and Engineering, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Avadi, Chennai, Tamil Nadu, India.

Manuscript received on September 16, 2019. | Revised Manuscript received on September 24, 2019. | Manuscript published on October 10, 2019. | PP: 1607-1612 | Volume-8 Issue-12, October 2019. | Retrieval Number: L31501081219/2019©BEIESP | DOI: 10.35940/ijitee.L3150.1081219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: With the rapid growth of technology, the world is moving toward instant-food habits that lead to irregular functioning of the body's organs. One resulting problem is hypothyroidism, the underactive thyroid condition in which the thyroid gland does not produce the required amount of essential hormones. Predicting hypothyroidism remains a challenging task because the condition lacks exact symptoms. With this in mind, this paper focuses on predicting hypothyroidism from clinical parameters. The hypothyroid dataset from the UCI Machine Learning Repository is used to predict the presence of hypothyroidism with machine learning classification algorithms. The work is carried out in four stages. First, the raw dataset is fitted with various classification algorithms to predict the presence of hypothyroidism. Second, the AdaBoost Regressor algorithm is applied to the dataset to extract its important features, and the dataset reduced to those features is then fitted to the various classification algorithms. Third, the dataset is subjected to dimensionality reduction using principal component analysis (PCA), and the PCA-reduced dataset is fitted with the classification algorithms. Fourth, the raw dataset, the AdaBoost feature-importance dataset and the PCA-reduced dataset are compared on the performance metrics precision, recall, F-score and accuracy. The work is implemented with Python scripts in the Spyder IDE under Anaconda Navigator. Experimental results show that Random Forest, Naive Bayes and Logistic Regression reach an accuracy of 99.5% on the raw dataset and on the feature-importance-reduced dataset, and an accuracy of 99.8% on the five-component PCA-reduced dataset.
Keywords: Machine Learning, Feature Extraction, PCA, MSE, MAE, R2 Score.
Scope of the Article: Machine Learning
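
The abstract compares the three dataset variants on precision, recall, F-score and accuracy. As a minimal sketch of how those four metrics follow from predicted and true labels, the Python snippet below computes them for a binary case. The function name binary_metrics, the Metrics container and the ten sample labels are hypothetical and chosen only for illustration; they are not the authors' code or data.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f_score: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F-score for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the chosen positive label.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-score here is the harmonic mean of precision and recall (F1).
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return Metrics(accuracy, precision, recall, f_score)


# Hypothetical labels (1 = hypothyroid, 0 = negative); not data from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

m = binary_metrics(y_true, y_pred)
print(m)
```

When positive cases are rare, accuracy alone can look high even if many positives are missed, which is one reason to report precision and recall alongside it.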