A Keyword Based Educational and Non-Educational Website Recognition Tool
Sangita Modi1, Sudhir B. Jagtap2

1Sangita Modi, Swami Ramanand Teerth Marathwada University, Nanded, India.

2Sudhir B. Jagtap, Swami Ramanand Teerth Marathwada University, Nanded, India.

Manuscript received on 05 September 2019 | Revised Manuscript received on 29 September 2019 | Manuscript Published on 29 June 2020 | PP: 451-419| Volume-8 Issue-10S2 August 2019 | Retrieval Number: J107708810S19//2019©BEIESP | DOI: 10.35940/ijitee.J1077.08810S19

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open-access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Most daily activities now depend on the internet: booking hotels and air tickets, finding places, travelling, cooking, education, banking, and so on. Finding specific content quickly requires filtering tools. E-learning is a new and rapidly growing medium in the modern education system, and it depends entirely on the internet. While browsing, students may be distracted by offensive or irrelevant websites, and filters play a vital role in avoiding such distractions. This paper proposes a filter tool that performs web scraping of text data, data cleaning, natural language processing (NLP), and filtering of non-learning sites in real time. Text is collected from paragraph, image, and video tags. The extracted text is in the form of sentences, which are part-of-speech (POS) tagged using NLP. Word sense disambiguation (WSD) is then used to find the correct sense, in the context of the current sentence, of words that may have multiple meanings. The tool builds a knowledge base of student-related sites using NLP and SVM classification, and we have created a keyword database of all learning sites. Finally, the tool classifies sites into two categories, learning and non-learning, using a Support Vector Machine (SVM).
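
The scraping and keyword-matching stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword set, the stopword list, the helper names (`TextScraper`, `keyword_ratio`, `classify`) and the 0.2 threshold are all assumptions made for illustration, and a plain keyword-ratio threshold stands in for the POS tagging, WSD and SVM stages, which the abstract does not specify in detail.

```python
import re
from html.parser import HTMLParser

# Hypothetical keyword database; the paper's actual keyword list is not given here.
LEARNING_KEYWORDS = {"course", "lecture", "tutorial", "student",
                     "exam", "syllabus", "university"}
# Small illustrative stopword list used for data cleaning.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "for",
             "in", "on", "is", "are", "this", "with"}


class TextScraper(HTMLParser):
    """Collects text inside <p> and <video> tags, plus the alt/title
    attributes of <img> and <video> tags."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting depth inside <p> or <video>
        self.chunks = []  # collected text fragments

    def handle_starttag(self, tag, attrs):
        if tag in ("img", "video"):
            for name, value in attrs:
                if name in ("alt", "title") and value:
                    self.chunks.append(value)
        if tag in ("p", "video"):
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in ("p", "video") and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower())
            if t not in STOPWORDS]


def keyword_ratio(html):
    """Fraction of cleaned tokens that appear in the keyword database."""
    scraper = TextScraper()
    scraper.feed(html)
    tokens = tokenize(" ".join(scraper.chunks))
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in LEARNING_KEYWORDS)
    return hits / len(tokens)


def classify(html, threshold=0.2):
    """Stand-in for the SVM stage: label a page by its keyword ratio.
    The threshold value is an assumption, not taken from the paper."""
    return "learning" if keyword_ratio(html) >= threshold else "non-learning"


edu_page = ('<html><body><p>This course offers a lecture and tutorial '
            'for every student.</p>'
            '<img src="a.png" alt="university exam hall"></body></html>')
shop_page = '<p>Buy cheap shoes and watches today with free shipping.</p>'

print(classify(edu_page))
print(classify(shop_page))
```

In a fuller pipeline, the keyword ratio would be only one feature among several, and the fixed threshold would be replaced by a trained classifier such as the SVM the abstract names.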

Keywords: E-learning, NLP, Web Content Mining, SVM, POS, WSD.
Scope of the Article: Image Processing and Pattern Recognition