Feature Selection for Breast Cancer Detection using Machine Learning Algorithms
Sreyam Dasgupta1, Ronit Chaudhuri2, Swarnalatha Purushotham3

1Sreyam Dasgupta, Department of Computer Science, Vellore Institute of Technology, Kolkata, India.
2Ronit Chaudhuri, Department of Computer Science, Vellore Institute of Technology, Kolkata, India.
3Swarnalatha Purushotham, Department of Computer Science, Vellore Institute of Technology, Vellore, India.

Manuscript received on 29 June 2019 | Revised Manuscript received on 05 July 2019 | Manuscript published on 30 July 2019 | PP: 2080-2083 | Volume-8 Issue-9, July 2019 | Retrieval Number: I8723078919/19©BEIESP | DOI: 10.35940/ijitee.I8723.078919

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Cancer is a heterogeneous disease comprising a wide range of subtypes. Early diagnosis of the cancer type is important for determining the course of treatment a patient requires. The significance of classifying cells as benign or malignant has driven many research studies in the biomedical and bioinformatics fields. In recent years, researchers have increasingly used machine learning (ML) techniques for cancer detection and for predicting survivability and recurrence. ML tools can also identify key features in complex datasets and reveal their importance. A variety of these techniques, including Artificial Neural Networks (ANNs), Bayesian Networks (BNs), Random Forest methods (RFs) and Decision Trees (DTs), have been widely used in cancer research to develop predictive models that support effective and accurate decision making. Although ML techniques can clearly improve our understanding of cancer detection, progression, recurrence and survivability, an adequate level of accuracy is required before these methods can be considered for routine clinical practice. The predictive models discussed here are based on different supervised ML techniques and on various input features and data samples. We used a Naïve Bayes classifier, a Neural Network, a Decision Tree and Logistic Regression to detect the type of breast cancer (benign or malignant) and to select the features that are most relevant for prediction. We carried out a comparative study to determine which of these four algorithms best predicts the cancer type. With a high level of accuracy, any of these methods can be used to predict the type of breast cancer of a particular patient.
Keywords: Breast Cancer, Feature Selection, Logistic Regression, Naïve Bayes, Decision Tree, Neural Network, Machine Learning

Scope of the Article: Machine Learning
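
Illustrative sketch: The abstract describes selecting the most relevant features and then classifying samples as benign or malignant, but it does not state the selection criterion or the dataset. The Python sketch below is therefore only an illustration under stated assumptions: it ranks features with a Fisher score (an assumed criterion, not necessarily the authors'), keeps the top two, and fits a small Gaussian Naïve Bayes classifier. The names `fisher_score`, `rank_features` and `GaussianNaiveBayes`, and the toy measurements, are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Squared gap between class means divided by the summed class variances."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = pvariance(pos) + pvariance(neg)
    return (mean(pos) - mean(neg)) ** 2 / (spread + 1e-12)


def rank_features(X, y):
    """Return column indices ordered from most to least discriminative."""
    scores = [fisher_score([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes: one mean and variance per class and feature."""

    def fit(self, X, y):
        self.stats = {}
        for c in set(y):
            rows = [r for r, label in zip(X, y) if label == c]
            cols = list(zip(*rows))
            self.stats[c] = (
                math.log(len(rows) / len(X)),            # log prior
                [mean(col) for col in cols],             # per-feature means
                [pvariance(col) + 1e-9 for col in cols], # smoothed variances
            )
        return self

    def _log_posterior(self, row, c):
        prior, mus, variances = self.stats[c]
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(row, mus, variances)
        )
        return prior + log_lik

    def predict(self, X):
        return [max(self.stats, key=lambda c: self._log_posterior(r, c)) for r in X]


# Made-up toy data (NOT from the paper): 3 features, label 0 = benign, 1 = malignant.
X_toy = [
    [1.0, 5.0, 2.0], [1.2, 3.0, 2.5], [0.9, 4.0, 1.8], [1.1, 6.0, 3.0],
    [3.0, 5.5, 3.2], [3.2, 3.5, 2.9], [2.9, 4.5, 3.8], [3.1, 6.5, 3.5],
]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]

ranking = rank_features(X_toy, y_toy)
top = ranking[:2]  # keep the two highest-scoring features
X_selected = [[row[j] for j in top] for row in X_toy]

model = GaussianNaiveBayes().fit(X_selected, y_toy)
train_predictions = model.predict(X_selected)
new_predictions = model.predict([[1.0, 2.2], [3.0, 3.3]])
print("feature ranking:", ranking)
print("training predictions:", train_predictions)
```

On this toy data the first feature separates the two classes almost perfectly, so it ranks first and dominates the prediction. A comparative study like the one in the abstract would repeat the same select-then-fit step with each classifier and evaluate on held-out samples rather than on the training set.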