Systematic Classification of Historical Handwritten Tamil Palm Leaf Manuscript using CART algorithm and RBF Network
M. Sornam¹, Poornima Devi. M²
¹ M. Sornam, Department of Computer Science, University of Madras, Chennai, India.
² Poornima Devi. M, Department of Computer Science, University of Madras, Chennai, India.
Manuscript received on 11 September 2019. | Revised Manuscript received on 25 September 2019. | Manuscript published on 30 September 2019. | PP: 185-194 | Volume-8 Issue-11, September 2019. | Retrieval Number: K12780981119/2019©BEIESP | DOI: 10.35940/ijitee.K1278.0981119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Character classification in handwritten Tamil palm-leaf manuscripts is more challenging than in other documents because the leaves are degraded and the script contains ancient character forms. In this work, an RBF (Radial Basis Function) network and the CART (Classification and Regression Tree) algorithm were used to classify segmented characters from Tamil palm-leaf manuscripts. The work consists of two phases. In the first phase, the scanned palm-leaf images were preprocessed by converting them to grayscale and removing noise with a median filter. In the second phase, the GLCM (Gray Level Co-occurrence Matrix) method was used to extract statistical features from the segmented characters, and these features were used to train the RBF network and the CART algorithm. For the RBF network, the Nguyen-Widrow technique was used to initialize the weights instead of random initialization. The dataset used in this work is Kuzhanthai Pini Maruthuvam (Medicine for child-related disease). In the comparison for character classification, the RBF network with Nguyen-Widrow initialization yielded 98.4% accuracy, whereas CART produced 98.8% accuracy. Digitizing Tamil palm-leaf manuscripts will help preserve historical secrets, traditional medicine for curing disease, guidance on healthy living, and more. The approach can be used by archaeological departments and Tamil libraries that hold palm-leaf scripts to protect the manuscripts from degradation.
Keywords: RBF, Nguyen-Widrow, CART, Gini, Entropy, GLCM, Adaptive Mean Threshold.
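The building blocks named in the abstract and keywords can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The 3x3 median window, the horizontal symmetric GLCM offset, the two features shown (contrast and angular second moment), and all function names are assumptions chosen for illustration. The Nguyen-Widrow routine follows the commonly cited textbook form (scale factor 0.7 * H^(1/n) for H hidden units and n inputs), and the Gini and entropy functions are the usual impurity measures used as CART split criteria.

```python
import math
import random
from collections import Counter


def median_filter3(img):
    """3x3 median filter on a 2D list; border pixels are left unchanged (assumption)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            window = [img[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = sorted(window)[4]  # middle of the 9 sorted values
    return out


def glcm(img, levels, dr=0, dc=1):
    """Normalized symmetric co-occurrence matrix for the pixel offset (dr, dc)."""
    h, w = len(img), len(img[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(h):
        for c in range(w):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < h and 0 <= c2 < w:
                i, j = img[r][c], img[r2][c2]
                counts[i][j] += 1  # pair (i, j)
                counts[j][i] += 1  # mirrored pair makes the matrix symmetric
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Two statistical features of a normalized GLCM (illustrative subset)."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    contrast = sum(p[i][j] * (i - j) ** 2 for i, j in cells)
    asm = sum(p[i][j] ** 2 for i, j in cells)  # angular second moment
    return {"contrast": contrast, "asm": asm}


def nguyen_widrow(n_inputs, n_hidden, rng):
    """Nguyen-Widrow initialization: random directions rescaled to norm beta."""
    beta = 0.7 * n_hidden ** (1.0 / n_inputs)
    weights, biases = [], []
    for _ in range(n_hidden):
        w = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
        norm = math.sqrt(sum(x * x for x in w))
        weights.append([beta * x / norm for x in w])
        biases.append(rng.uniform(-beta, beta))
    return weights, biases, beta


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((k / n) ** 2 for k in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())
```

In this sketch a character image would pass through `median_filter3`, be quantized to a small number of gray levels, and then be summarized by `glcm_features(glcm(...))` before classification. How the paper quantizes gray levels and which GLCM features it uses are not stated in the abstract.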
Scope of the Article: