A Phishing URL Classification Technique using Machine Learning Approach
Manish Tiwari1, Tripti Arjariya2

1Manish Tiwari, Computer Science from Gargi Institute of Science & Technology, Bhopal, (Madhya Pradesh), India.
2Dr. Tripti Arjariya, Professor, Department of Computer Science Engineering, Rajiv Gandhi Technical University, Bhopal, (Madhya Pradesh), India.

Manuscript received on December 16, 2020. | Revised Manuscript received on January 03, 2020. | Manuscript published on January 10, 2021. | PP: 73-79 | Volume-10 Issue-3, January 2021 | Retrieval Number: 100.1/ijitee.C83380110321| DOI: 10.35940/ijitee.C8338.0110321
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Phishing is one of the most common attacks carried out through social engineering. The attacker tricks the victim into disclosing personal and sensitive information, which can result in financial loss and damage to social reputation. This work investigates phishing techniques and the approaches used to detect them. First, recently published URL-based phishing detection and prevention techniques are reviewed. Then, building on the most suitable techniques, a new data mining based model is proposed for implementation. The proposed model is trained on URLs from the PhishTank database and then classifies URLs with similar patterns into two classes, legitimate and phishing. The dataset is first preprocessed and features are computed. The computed features are then transformed into a transactional database, from which association rules are generated using the Apriori and FP-Tree algorithms. In the conducted experiments, the FP-Tree based classification technique was more efficient and accurate than the Apriori-based one, because the Apriori algorithm is considerably more time-consuming than FP-Tree. Finally, future extensions of the work are suggested.
Keywords: Phishing detection, URL classification, Association rule mining, Rule based classification, Apriori algorithm, FP-Tree algorithm.
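
The pipeline summarized in the abstract (URL features, then transactions, then class association rules) can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names (`has_at`, `many_dots`, `hyphen_in_host`, `long_url`), the length threshold of 54 characters, and the helper names `url_to_items`, `apriori`, and `class_rules` are all assumptions made for illustration. Only Apriori is shown; an FP-Tree based miner would be expected to return the same frequent itemsets without the repeated candidate-counting passes.

```python
from itertools import combinations
from urllib.parse import urlparse


def url_to_items(url):
    """Map a URL to a set of categorical items (hypothetical features)."""
    parsed = urlparse(url)
    host = parsed.netloc
    items = {"https" if parsed.scheme == "https" else "no_https"}
    if "@" in url:
        items.add("has_at")
    if host.count(".") >= 3:
        items.add("many_dots")
    if "-" in host:
        items.add("hyphen_in_host")
    if len(url) > 54:  # assumed threshold, not taken from the paper
        items.add("long_url")
    return frozenset(items)


def apriori(transactions, min_support):
    """Return {itemset: support} for itemsets with support >= min_support.

    min_support is a fraction in (0, 1].
    """
    n = len(transactions)
    candidates = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def class_rules(frequent, min_confidence, labels=("phishing", "legitimate")):
    """Derive rules of the form (feature items) -> class label."""
    rules = []
    for itemset, support in frequent.items():
        for label in labels:
            if label in itemset and len(itemset) > 1:
                antecedent = itemset - {label}
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, label, confidence))
    return rules


# Toy labelled data (made up for illustration, not from PhishTank).
labelled_urls = [
    ("http://paypal-secure.example.com/login", "phishing"),
    ("http://bank-update.example.net/verify", "phishing"),
    ("https://www.example.org/", "legitimate"),
    ("https://docs.example.org/guide", "legitimate"),
]
# Each transaction is the URL's feature items plus its class label.
transactions = [url_to_items(u) | {label} for u, label in labelled_urls]
frequent = apriori(transactions, min_support=0.5)
for antecedent, label, confidence in class_rules(frequent, min_confidence=0.9):
    print(sorted(antecedent), "->", label, f"(confidence {confidence:.2f})")
```

In this sketch the class label is added to each transaction as an ordinary item, so a rule is simply a frequent itemset that contains a label; a new URL would be classified by the highest-confidence rule whose antecedent is a subset of its items.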