An Improved LSA Model for Electronic Assessment of Free Text Document
Rufai M. M1, Afolabi Adeolu2, Fenwa O. D3, Ajala F.A4

1Rufai Mohammed Mutiu*, Computer Technology Department, Yaba College of Technology, Yaba, Lagos, Nigeria.
2Prof. A. O. Afolabi, Computer Science Department, Ladoke Akintola University of Technology, Ogbomoso, Oyo State, Nigeria.
3Dr. (Mrs.) O. D. Fenwa, Computer Science Department, Ladoke Akintola University of Technology, Ogbomoso, Oyo State, Nigeria.
4Dr. (Mrs.) F. A. Ajala, Computer Science Department, Ladoke Akintola University of Technology, Ogbomoso, Oyo State, Nigeria. 

Manuscript received on February 05, 2021. | Revised Manuscript received on February 17, 2021. | Manuscript published on February 28, 2021. | PP: 152-159 | Volume-10 Issue-4, February 2021 | Retrieval Number: 100.1/ijitee.D85360210421 | DOI: 10.35940/ijitee.D8536.0210421
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Latent Semantic Analysis (LSA) is a statistical approach designed to capture the semantic content of a document, which forms the basis for its application in the electronic assessment of free-text documents in an examination context. The students' submitted answers are transformed into a Document Term Matrix (DTM) and approximated using SVD-LSA for noise reduction. However, it has been shown that the LSA semantic representation still retains residual noise, which ultimately reduces the accuracy of the assessment result when compared with human grading. In this work, the LSA model is formulated as an optimization problem using Non-negative Matrix Factorization (NMF) and Ant Colony Optimization (ACO). The factors of LSA are used to initialize the NMF factors for quick convergence. ACO iteratively searches for the values of the decision variables in NMF that minimize the objective function and uses these values to construct a reduced DTM. The results obtained show a better approximation of the DTM representation and an improved assessment result: 91.35% accuracy, a mean divergence of 0.0865 from human grading, and a Pearson correlation coefficient of 0.632, which is reported to be better than existing results.
Keywords: Ant Colony Optimization, Electronic Assessment, Latent Semantic Analysis, Non-Negative Matrix Factorization.
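
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy matrix, the function names, the rank k = 2, and the use of absolute values to make the LSA factors non-negative are all assumptions made for illustration. The ACO search is also not reproduced; standard Lee-Seung multiplicative updates stand in for it, simply to show an iterative procedure that lowers the same kind of NMF objective (the squared Frobenius reconstruction error is assumed here).

```python
import numpy as np

def lsa_factors(dtm, k):
    """Rank-k truncated SVD of a document-term matrix (the LSA step)."""
    u, s, vt = np.linalg.svd(dtm, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]

def init_nmf_from_lsa(dtm, k, eps=1e-9):
    """Seed non-negative factors W, H from the LSA factors.

    Assumption: absolute values enforce non-negativity; the abstract
    does not state the exact initialization rule.
    """
    u, s, vt = lsa_factors(dtm, k)
    root = np.sqrt(s)
    w = np.abs(u * root) + eps            # documents x k
    h = np.abs(root[:, None] * vt) + eps  # k x terms
    return w, h

def nmf_objective(dtm, w, h):
    """Squared Frobenius reconstruction error ||A - WH||_F^2 (assumed objective)."""
    return float(np.linalg.norm(dtm - w @ h, "fro") ** 2)

def refine(dtm, w, h, iters=200, eps=1e-9):
    """Lee-Seung multiplicative updates, a stand-in for the ACO search."""
    for _ in range(iters):
        h = h * (w.T @ dtm) / (w.T @ w @ h + eps)
        w = w * (dtm @ h.T) / (w @ h @ h.T + eps)
    return w, h

# Hypothetical toy DTM: 4 student answers (rows) x 5 terms (columns).
dtm = np.array([
    [2, 1, 0, 0, 1],
    [1, 2, 0, 1, 0],
    [0, 0, 3, 1, 0],
    [0, 1, 2, 2, 0],
], dtype=float)

w0, h0 = init_nmf_from_lsa(dtm, k=2)
w, h = refine(dtm, w0, h0)
reduced_dtm = w @ h  # reduced (approximated) DTM built from the refined factors

print("objective at LSA-based start:", nmf_objective(dtm, w0, h0))
print("objective after refinement:  ", nmf_objective(dtm, w, h))
```

Seeding W and H from the SVD factors starts the search near a good low-rank approximation instead of at a random point, which is one plausible reading of the "quick convergence" the abstract attributes to the LSA-based initialization.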