An Efficient Approach to Reduce Text Dimension for Precise Text Classification for Big Data
Akshada Bhanushali1, Pravin Rahate2

1Akshada Bhanushali, PG Student, Department of Computer Engineering, Datta Meghe College of Engineering, Navi Mumbai (Maharashtra), India.
2Pravin Rahate, Assistant Professor, Department of Computer Engineering, Datta Meghe College of Engineering, Navi Mumbai (Maharashtra), India.
Manuscript received on 15 August 2018 | Revised Manuscript received on 27 August 2018 | Manuscript published on 30 November 2018 | PP: 1-4 | Volume-7 Issue-11, August 2018 | Retrieval Number: K25150871118/18©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Today, a few well-known news websites, such as Google and Sina, serve information to users. With the continuous development of information technology, however, the volume of unorganized data keeps growing, and classifying and organizing text has become a significant task. Traditional manual classification of news text consumes considerable human and financial resources and is slow. This paper studies news text classification and proposes a classification model based on Latent Dirichlet Allocation (LDA) and Domain Word Filtering. Because the dimension of news texts is too high, the model uses a topic model to reduce the text dimension and obtain good features. It reduces the feature dimension of news text effectively and achieves good classification results.
Keywords: Topic Model, LDA, Domain Word Filtering, News Website, Text Classification
Scope of the Article: Classification