Deep Learning Classifier for Gene Expression Datasets using a Hybrid LSTM Network
Immaculate Mercy A1, Chidambaram M2

1Immaculate Mercy A.*, pursuing Ph.D., Computer Science, Bharathidasan University, Tamil Nadu, India.
2Chidambaram M., Assistant Professor, Computer Science, Rajah Serfoji Government College (RSGC), Thanjavur, India.
Manuscript received on January 17, 2020. | Revised Manuscript received on January 29, 2020. | Manuscript published on February 10, 2020. | PP: 1081-1089 | Volume-9 Issue-4, February 2020. | Retrieval Number: D1562029420/2020©BEIESP | DOI: 10.35940/ijitee.D1562.029420
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: A deep learning model, Long Short-Term Memory (LSTM), is used to classify differentially expressed genes that cause certain abnormalities in the human body. The LSTM is combined with the K-Nearest Neighbour (KNN) algorithm to improve the precision of the classification. Feature selection plays a vital role, as some existing algorithms tend to neglect the features of concern. The classification in turn leads to an enhanced prediction method. The K-Nearest Neighbour method is used to filter the degree of correlation between each value and the target value. This hybrid algorithm has a clear advantage over the existing methods. The work is supported by a feature selection stage that combines Principal Component Analysis (PCA) with the Chi-square test. This hybrid approach yields a good feature selection, which aids the flow of the process towards classification and prediction. The eigenvalues and eigenvectors are computed, which leads to the identification of the principal components. The Chi-square test is applied to calculate the scores. The features obtained are ranked by these scores, and the highest-scoring data are taken forward for training. The algorithms employed in this work have a clear advantage over Bayesian networks, as Bayesian networks are prone to errors within the layers that may cause the values to explode or vanish. The accuracy achieved for classification and prediction surpasses that of the existing methods.
Keywords: Deep learning, LSTM, KNN, PCA, Chi-square
Scope of the Article: Deep learning
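
The feature selection stage summarised in the abstract (Chi-square scoring and ranking, plus eigenvalue and eigenvector computation for the principal components) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not state how the two steps are combined, so the order used here (rank features by Chi-square score, keep the top ones, then project onto principal components) is an assumption, as are the function names, the parameters top_k and n_components, and the synthetic data.

```python
import numpy as np


def chi_square_scores(X, y):
    """Chi-square score of each non-negative feature against the class labels."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    if (X < 0).any():
        raise ValueError("chi-square scoring needs non-negative features")
    classes = np.unique(y)
    Y = (y[:, None] == classes[None, :]).astype(float)  # one-hot labels, (n, c)
    observed = Y.T @ X                                   # per-class feature sums, (c, f)
    expected = np.outer(Y.mean(axis=0), X.sum(axis=0))   # sums expected under independence
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = (observed - expected) ** 2 / expected
    return np.nansum(terms, axis=0)                      # all-zero features score 0


def principal_components(X, n_components):
    """Leading eigenvalues and eigenvectors of the sample covariance matrix."""
    X = np.asarray(X, dtype=float)
    cov = np.atleast_2d(np.cov(X - X.mean(axis=0), rowvar=False))
    eigvals, eigvecs = np.linalg.eigh(cov)               # returned in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]     # largest variance first
    return eigvals[order], eigvecs[:, order]


def select_and_project(X, y, top_k, n_components):
    """Assumed hybrid: keep the top_k chi-square features, then apply PCA."""
    X = np.asarray(X, dtype=float)
    scores = chi_square_scores(X, y)
    keep = np.argsort(scores)[::-1][:top_k]              # indices ranked by score
    X_sel = X[:, keep]
    eigvals, eigvecs = principal_components(X_sel, n_components)
    projected = (X_sel - X_sel.mean(axis=0)) @ eigvecs
    return keep, eigvals, projected


# Synthetic stand-in for an expression matrix: 40 samples, 6 features, 2 classes.
rng = np.random.default_rng(0)
X_demo = rng.poisson(5.0, size=(40, 6)).astype(float)
y_demo = np.array([0] * 20 + [1] * 20)
X_demo[y_demo == 1, 2] += 20.0                           # feature 2 depends on the class

keep, eigvals, Z = select_and_project(X_demo, y_demo, top_k=3, n_components=2)
```

In this sketch keep holds the indices of the retained features in descending score order, and Z is the reduced matrix that would be passed on to a downstream classifier. Chi-square scoring is applied before PCA because it requires non-negative inputs, which principal component projections do not guarantee.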