An Efficient Classifier for Spam Detection in Social Network
Amit Pratap Singh1, Maitreyee Dutta2
1Amit Pratap Singh*, Asst. Prof. Dept. of Computer Science and Engineering, KEC Ghaziabad, India.
2Dr. Maitreyee Dutta, Professor & Head, Information Management and Coordination Unit, NITTTR Chandigarh(U.T.), India.
Manuscript received on October 11, 2019. | Revised Manuscript received on 24 October, 2019. | Manuscript published on November 10, 2019. | PP: 2323-2328 | Volume-9 Issue-1, November 2019. | Retrieval Number: A5218119119/2019©BEIESP | DOI: 10.35940/ijitee.A5218.119119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Social networks provide a way for users to stay connected with their friends. Their popularity is increasing day by day, and they enable users to collect a huge volume of information about their friends. Among the various social networking sites available, Twitter is the fastest growing. Because of its popularity, it attracts spammers, who use large amounts of spam to tamper with authorised users' accounts. In this paper, a "Spam detection system in social sites" is developed to detect spammers using machine learning techniques. The work begins with the collection of data from the H-Spam 14 site, followed by pre-processing steps such as conversion of the data to lowercase and removal of stop words. The pre-processed data then enters the feature extraction phase, which uses tokenization to split entire sentences into groups of words so that the best features can be extracted from the raw data. To select optimized values from the extracted set of features, the Artificial Bee Colony (ABC) optimization algorithm is used to obtain the optimal feature sets from both spam and non-spam data. An Artificial Neural Network (ANN) is then used to differentiate spam from non-spam data. In the final step, performance measures are computed, and the proposed work is compared with existing work to check for improvement. The proposed spam detection system achieved higher accuracy, precision, recall and F-measure than the previously used classifiers, naïve Bayes and Support Vector Machine (SVM).
Keywords: Spam Detection, Twitter, ABC, ANN, Tokenization.
Scope of the Article: Classification
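The pre-processing, tokenization and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the tokenization pattern and the function names `preprocess` and `precision_recall_f` are assumptions made for illustration, and the ABC feature selection and ANN classification stages are omitted.

```python
import re

# Hypothetical, deliberately small stop-word list; the paper does not specify one.
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "of", "in", "for", "on"}


def preprocess(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    # Assumed tokenization rule: runs of lowercase letters, digits or apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def precision_recall_f(y_true, y_pred, positive=1):
    """Compute precision, recall and F-measure for the positive (spam) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    print(preprocess("Win a FREE prize, click the link!"))
    print(precision_recall_f([1, 1, 0, 0], [1, 0, 1, 0]))
```

The labels passed to `precision_recall_f` are made-up toy values, not results from the paper; in a full pipeline the predictions would come from the trained classifier.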