Machine Learning Based Malware Detection: A Boosting Methodology
Tejaswini Ghate¹, Chetan Pathade², Chaitanya Nirhali³, Krunal Patil⁴, Nilesh Korade⁵

¹Tejaswini Ghate*, Student, Department of Computer Engineering, Pimpri Chinchwad College of Engineering & Research, Ravet, India.
²Chetan Pathade, Student, Department of Computer Engineering, Pimpri Chinchwad College of Engineering & Research, Ravet, India.
³Chaitanya Nirhali, Student, Department of Computer Engineering, Pimpri Chinchwad College of Engineering & Research, Ravet, India.
⁴Krunal Patil, Student, Department of Computer Engineering, Pimpri Chinchwad College of Engineering & Research, Ravet, India.
⁵Nilesh Korade, Assistant Professor, Department of Computer Engineering, Pimpri Chinchwad College of Engineering & Research, Ravet, India.
Manuscript received on January 12, 2020. | Revised Manuscript received on January 22, 2020. | Manuscript published on February 10, 2020. | PP: 2241-2245 | Volume-9 Issue-4, February 2020. | Retrieval Number: D1717029420/2020©BEIESP | DOI: 10.35940/ijitee.D1717.029420

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Malware damages computers without the user's consent and can introduce a range of threats unnoticed, so detecting it is crucial. In this study, we propose detecting the presence of malware using classification, a Machine Learning technique. Classification requires the output variable to be categorical: a model is constructed from a labeled training set and is then used to predict categorical class labels for new observations. In our work, we classify the presence of malware using two principal classification algorithms, Support Vector Machine and Logistic Regression. The data set first used for this purpose was not satisfactory. Consequently, we sought a data set that met our requirements and applied Logistic Regression to it; moreover, we plotted a scatter plot for visualization and incorporated XGBoost to enhance performance. This study assists in analyzing the presence of malware by adopting a suitable dataset and identifying the pivotal attributes that drive the classification.
Keywords: Machine learning, Cyber security, Visualization, Malware, Classification, XGBoost.
Scope of the Article: Machine learning
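
Illustrative sketch: The classification step summarized in the abstract can be illustrated with a small, self-contained example. The code below is only a sketch under stated assumptions, not the authors' implementation: it trains a logistic regression classifier by batch gradient descent on synthetic two-feature data, where label 1 stands for malware and label 0 for benign. The data, the function names (make_synthetic_samples, train_logistic_regression, predict, accuracy), the learning rate, and the number of epochs are all hypothetical choices made for illustration. The Support Vector Machine and XGBoost steps mentioned in the abstract are not reproduced here.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_samples(n_per_class, seed=0):
    # Hypothetical data, NOT the dataset used in the paper:
    # two numeric features per file; label 1 = malware, 0 = benign.
    rng = random.Random(seed)
    samples = []
    for _ in range(n_per_class):
        samples.append(([rng.gauss(-1.0, 0.5), rng.gauss(-1.0, 0.5)], 0))
        samples.append(([rng.gauss(1.0, 0.5), rng.gauss(1.0, 0.5)], 1))
    rng.shuffle(samples)
    return samples


def train_logistic_regression(samples, lr=0.1, epochs=200):
    # Batch gradient descent on the mean log-loss.
    n_features = len(samples[0][0])
    n = len(samples)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in samples:
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # derivative of the log-loss w.r.t. the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    # Predict the categorical class label using a 0.5 probability threshold.
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


def accuracy(weights, bias, samples):
    correct = sum(1 for x, y in samples if predict(weights, bias, x) == y)
    return correct / len(samples)


data = make_synthetic_samples(100)
train, test = data[:150], data[150:]  # simple hold-out split
weights, bias = train_logistic_regression(train)
test_accuracy = accuracy(weights, bias, test)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Evaluating on a held-out split rather than on the training data gives a fairer estimate of how the classifier would behave on unseen files. On real malware data the features would typically need scaling before gradient descent converges this easily.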