Spam Detection in Social Networking Sites using Artificial Intelligence Technique
Amit Pratap Singh1, Maitreyee Dutta2

1Amit Pratap Singh, Lecturer, Department of Information Technology, RKGITM, Ghaziabad, India. 

2Dr. Maitreyee Dutta, Professor & Head, Information Management and Coordination Unit, NITTTR, Chandigarh (U.T.), India.

Manuscript received on 7 June 2019 | Revised Manuscript received on 12 June 2019 | Manuscript Published on 08 July 2019 | PP: 20-25 | Volume-8 Issue-8S3 June 2019 | Retrieval Number: H10070688S319/19©BEIESP

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open-access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Social networks provide a way for users to remain in contact with their friends. The increasing popularity of social networks allows their users to gather large amounts of personal information about their friends. Among numerous sites, Twitter is the fastest growing website. Its popularity has also attracted many spammers, who use large volumes of spam to penetrate legitimate users' accounts. In this research work, a spam detection system for social sites is designed to detect spammers using a machine learning approach. Initially, data is collected from the H-Spam14 site, and preprocessing steps such as lowercase conversion and stop-word removal are applied. The data then enters the feature extraction phase, in which tokenization divides each sentence into a group of words so that features can be extracted from the raw data. To select an appropriate subset of the extracted features, the Artificial Bee Colony (ABC) algorithm is applied as an optimization algorithm to determine the optimal feature sets from spam as well as non-spam data. Classification is then performed using an Artificial Neural Network (ANN) to distinguish spam from non-spam data. Finally, performance metrics are computed and the proposed work is compared with existing work to validate it. The proposed spam detection system obtains higher accuracy, precision, recall, and F-measure than existing classifiers such as naïve Bayes and support vector machine (SVM).

Keywords: Spam detection, Twitter, Artificial Bee Colony (ABC), Artificial Neural Network (ANN).
Scope of the Article: Artificial Intelligence
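
The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the tokenization pattern, and the function names `preprocess` and `precision_recall_f1` are assumptions made for illustration, and the ABC feature selection and ANN classification stages are omitted.

```python
import re

# Hypothetical, deliberately tiny stop-word list; the abstract does not
# say which list the authors used.
STOP_WORDS = {"a", "an", "the", "to", "is", "and", "of", "in", "for", "your"}


def preprocess(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    # Assumed token pattern: runs of lowercase letters, digits, or apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F-measure for the positive (spam) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    print(preprocess("Click the LINK to win a FREE prize"))
    # Labels: 1 = spam, 0 = non-spam (made-up values for illustration only).
    print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))
```

In a full pipeline, the token lists would be turned into feature vectors, a feature subset would be chosen by the optimizer, and the classifier's predictions would be passed to `precision_recall_f1` alongside the true labels.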