An Efficient Term Weighting Approach for Document Classification using Knn Classifier
Aijazahamed Qazi1, R. H. Goudar2, P.S.Hiremath3
1Aijazahamed Qazi, Department of CSE, SDMCET, Dharwad, India.
2R.H.Goudar, Department of CNE, Center for PG Studies Visvesvaraya Technological University, Belgaum, India.
3P.S.Hiremath, Department of MCA, KLE Technological University, Hubli, India.
Manuscript received on 02 June 2019 | Revised Manuscript received on 10 June 2019 | Manuscript published on 30 June 2019 | PP: 3127-3130 | Volume-8 Issue-8, June 2019 | Retrieval Number: H7270068819/19©BEIE
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: With the substantial growth in the volume of digital information, document classification has become an emerging area of research. A well-established approach to this task is to apply machine learning methods. The traditional term frequency and inverse document frequency (TF-IDF) weighting approach considers only statistical information. This paper introduces a new approach that weights a term by calculating the semantic similarity between the category label and the term. The weight of a term also incorporates its co-occurrence computation. Experiments were carried out on the Reuters-21578 benchmark dataset. The results indicate that the proposed method outperforms the traditional method with the kNN classifier.
Keywords: Frequency, kNN, Similarity, Ontology.
Scope of the Article: Classification.
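
Illustrative sketch (not the authors' implementation): the abstract contrasts purely statistical TF-IDF weights with weights that also reflect the semantic similarity between a term and its category label. The Python below is a minimal sketch of that idea under stated assumptions. The similarity table, the multiplicative rule `(1 + similarity)`, the toy documents, and all function names are hypothetical, since the abstract does not give the actual formula. The co-occurrence component is omitted because the abstract does not describe it.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document (raw tf * log(N / df))."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {term: count * math.log(n / df[term]) for term, count in Counter(doc).items()}
        for doc in docs
    ]


def reweight(vector, label, similarity):
    """Scale each weight by (1 + similarity of the term to the category label).

    Assumption: this combination rule is for illustration only; terms missing
    from the similarity table keep their plain tf-idf weight.
    """
    return {t: w * (1.0 + similarity.get((t, label), 0.0)) for t, w in vector.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(query, train_vectors, train_labels, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical, not taken from Reuters-21578).
docs = [
    ["oil", "price", "crude"],
    ["oil", "barrel", "crude"],
    ["wheat", "grain", "harvest"],
    ["wheat", "crop", "grain"],
]
labels = ["crude", "crude", "grain", "grain"]

# Hypothetical term-to-label similarity scores in [0, 1].
similarity = {
    ("oil", "crude"): 0.8,
    ("crude", "crude"): 1.0,
    ("wheat", "grain"): 0.7,
    ("grain", "grain"): 1.0,
}

vectors = tf_idf(docs)
weighted = [reweight(v, lab, similarity) for v, lab in zip(vectors, labels)]

query = {"oil": 1.0, "crude": 1.0}
prediction = knn_predict(query, weighted, labels, k=3)
print(prediction)
```

In this sketch only the training vectors are reweighted, because the category label of a query document is unknown at prediction time; how the proposed method treats unseen documents is not stated in the abstract.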