Information Filtering Model Based on Topic Pattern for Document Modeling
Chinnu C. George1, Abdul Ali2
1Chinnu C. George, PG Scholar, Department of Computer Science, Ilahia College of Engineering, Muvattupuzha (Kerala), India.
2Abdul Ali, Assistant Professor, Department of Computer Science, Ilahia College of Engineering, Muvattupuzha (Kerala), India.
Manuscript received on 13 October 2015 | Revised Manuscript received on 22 October 2015 | Manuscript Published on 30 October 2015 | PP: 39-43 | Volume-5 Issue-5, October 2015 | Retrieval Number: E2212105515/15©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Topic modelling is widely used in machine learning and text mining. It generates models that discover the hidden topics in a collection of documents, where each topic is represented by a distribution of words. Many term-based and pattern-based approaches exist in the field of information filtering, and patterns are more discriminative than single words. However, many pattern-based methods consider only the presence or absence of a pattern in a document, so a pattern that occurs multiple times in a document to be filtered is given the same importance as one that occurs once. Existing pattern-based methods also ignore the semantics of the terms in the patterns and give no importance to the distribution of the patterns. To address these limitations, this paper presents a new ranking method that uses pattern frequency, pattern distribution, and a semantic-based pattern representation to estimate the relevance of documents to the user's information needs. This helps to filter out irrelevant documents effectively. Extensive experiments are conducted on the TREC data collection Reuters Corpus Volume 1 to evaluate the effectiveness of the proposed method. The results show that the proposed model outperforms the pattern-based topic model for document modeling in information filtering.
Keywords: Topic Modelling, Information Filtering, User Interest Modeling, Semantic Based Relevance Ranking.
Scope of the Article: Music Modelling and Analysis
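
The contrast drawn in the abstract between presence-only and frequency-aware pattern matching can be illustrated with a minimal Python sketch. It is an illustration only and not the ranking method of the paper: the function names, the pattern weights, and the choice to count a pattern once per paragraph that contains all of its terms are assumptions made here, and the semantic and distribution components mentioned in the abstract are not modeled.

```python
from typing import Dict, FrozenSet, List

# Hypothetical representation: a pattern is a set of terms, and a document
# is a list of paragraphs, each paragraph being a list of terms.
Pattern = FrozenSet[str]
Document = List[List[str]]


def pattern_count(pattern: Pattern, document: Document) -> int:
    """Number of paragraphs that contain every term of the pattern."""
    return sum(1 for paragraph in document if pattern <= set(paragraph))


def presence_score(patterns: Dict[Pattern, float], document: Document) -> float:
    """Presence-only relevance: each matched pattern adds its weight once."""
    return sum(
        weight
        for pattern, weight in patterns.items()
        if pattern_count(pattern, document) > 0
    )


def frequency_score(patterns: Dict[Pattern, float], document: Document) -> float:
    """Frequency-aware relevance: each occurrence of a pattern adds its weight."""
    return sum(
        weight * pattern_count(pattern, document)
        for pattern, weight in patterns.items()
    )
```

Under these assumptions, two documents that match the same set of patterns receive identical presence scores, while the frequency score ranks the document in which the patterns recur more often higher.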