Performance Evaluation of Several Machine Learning Classification Algorithms with Combined Feature Selection Methods for Sentiment Analysis
Premnarayan Arya1, Amit Bhagat2
1Premnarayan Arya, Department of Computer Applications, Maulana Azad National Institute of Technology, Bhopal (M.P), India.
2Dr. Amit Bhagat, Department of Computer Applications, Maulana Azad National Institute of Technology, Bhopal (M.P), India.
Manuscript received on 07 April 2019 | Revised Manuscript received on 20 April 2019 | Manuscript published on 30 April 2019 | PP: 703-710 | Volume-8 Issue-6, April 2019 | Retrieval Number: F3769048619/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Sentiment analysis (SA) is widely studied as a way to extract opinions from online reviews, and several methods have been proposed in recent work. SA algorithms classify reviews as positive or negative. Applying SA or machine learning classification algorithms directly to online review data sets, without feature selection methods (FSMs), leads to poor performance. To deal with this problem, we propose a model that improves the performance of sentiment classification methods. This paper investigates the performance of five machine learning classification algorithms, namely Naïve Bayes, k-Nearest Neighbor, Support Vector Machine, Logistic Regression, and Random Forest (RF), with five FSMs: Unigram, Bigram, Information Gain (IG), Chi-Square (CS), and Gini Index (GI). The method is applied to two data sets: an electronics product data set and a movie review data set. First, the FSMs are applied individually to extract features; then the FSMs are combined to generate a feature vector score, and features are selected according to their ranking by this score. Finally, the classification algorithms use the selected feature vector to classify the reviews as positive or negative. Classifier performance is measured by precision, recall, and F-measure. All classification algorithms achieve their best results with the combination of FSMs (Unigram, Bigram, IG, GI, CS), and the RF algorithm achieves the highest F-score.
Keywords: Sentiment Analysis, Machine Learning Classification Algorithm, Feature Selection Method.
Scope of the Article: Classification
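
The combined feature selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the individual FSM scores are merged, so the code below assumes the combined score is the mean of the min-max normalized Chi-Square, Information Gain, and Gini scores, computed over unigram and bigram presence features. The function names, the toy reviews, and the normalization scheme are all hypothetical choices made for illustration and are not taken from the paper.

```python
import math
from collections import Counter

def ngrams(text, n):
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def extract_features(text):
    # Unigram and bigram presence features for one review.
    return set(ngrams(text, 1)) | set(ngrams(text, 2))

def entropy(pos, neg):
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

def gini(pos, neg):
    total = pos + neg
    if total == 0:
        return 0.0
    return 1.0 - (pos / total) ** 2 - (neg / total) ** 2

def score_feature(a, b, c, d):
    # a, b: positive / negative reviews containing the term.
    # c, d: positive / negative reviews not containing the term.
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    with_term, without_term = (a + b) / n, (c + d) / n
    # Information gain: class entropy minus entropy after splitting on the term.
    ig = entropy(a + c, b + d) - with_term * entropy(a, b) - without_term * entropy(c, d)
    # Gini score: reduction in Gini impurity after the same split.
    gi = gini(a + c, b + d) - with_term * gini(a, b) - without_term * gini(c, d)
    return chi2, ig, gi

def min_max(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

def rank_features(docs, labels, top_k):
    # labels: 1 for a positive review, 0 for a negative review.
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_df, neg_df = Counter(), Counter()
    for doc, label in zip(docs, labels):
        (pos_df if label == 1 else neg_df).update(extract_features(doc))
    vocab = sorted(set(pos_df) | set(neg_df))
    raw = [
        score_feature(pos_df[t], neg_df[t], n_pos - pos_df[t], n_neg - neg_df[t])
        for t in vocab
    ]
    # Assumption: combine the three scores by averaging their normalized values.
    normalized = [min_max(column) for column in zip(*raw)]
    combined = [sum(scores) / len(scores) for scores in zip(*normalized)]
    ranked = sorted(zip(vocab, combined), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]

# Toy reviews, invented for illustration only.
docs = [
    "great phone great battery",
    "poor screen bad battery",
    "great movie loved it",
    "bad movie poor plot",
]
labels = [1, 0, 1, 0]

for term, score in rank_features(docs, labels, top_k=3):
    print(f"{term}\t{score:.3f}")
```

In a full pipeline of the kind the abstract outlines, the top-ranked terms would then define the feature vector passed to each classifier, and precision, recall, and F-measure would be computed on held-out reviews.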