Multivariate Classification of Drugs using Parametric and Nonparametric Machine Learning Models
N. Priya¹, G. Shobana²

¹ Dr. N. Priya*, Associate Professor, PG Department of Computer Science, SDNB Vaishnav College for Women (Affiliated to University of Madras), Chennai.
² G. Shobana, Assistant Professor, Department of Computer Applications, Madras Christian College (Affiliated to University of Madras), Chennai.
Manuscript received on December 16, 2019. | Revised Manuscript received on December 22, 2019. | Manuscript published on January 10, 2020. | PP: 2021-2027 | Volume-9 Issue-3, January 2020. | Retrieval Number: C8740019320/2020©BEIESP | DOI: 10.35940/ijitee.C8740.019320
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: In pharmaceutical research, the traditional drug discovery process is time-consuming and expensive because many compounds must be tested experimentally for their biological activity. A series of laboratory experiments is conducted to analyze a newly synthesized drug's pharmaceutical activity and its biological effects on humans. For each newly discovered drug, the required clinical properties can instead be determined using machine learning models, which greatly reduces experimental cost. This paper explores parametric and non-parametric machine learning models for classifying the administration properties of drugs and their toxicity. The multinomial classification of drugs was based on their physicochemical and ADMET (absorption, distribution, metabolism, excretion and toxicity) properties. Balanced data samples were drawn from ChEMBL and pre-processed. Features were reduced using Recursive Feature Elimination, and the attributes were ranked by importance to reduce the number of highly correlated attributes. The performance of parametric and non-parametric machine learning models was analyzed on cheminformatics data comprising physicochemical, biological and pharmaceutical properties of the drug molecules. Selecting a potent drug candidate together with its administration properties greatly reduces wet-lab experimental time and cost. Multiclass classification can be performed efficiently using non-parametric machine learning models. Optimal feature engineering, hyperparameter tuning and the adoption of hybrid algorithms would yield more accurate predictions for cheminformatics data in the future.
Keywords: Parametric, Machine Learning, Drug Discovery, Cheminformatics.
Scope of the Article: Machine Learning