Incorporating Forgetting Mechanism in Q-learning Algorithm for Locomotion of Bipedal Walking Robot
Rashmi Sharma1, Inder Singh2, Deepak Bharadwaj3, Manish Prateek4

1Rashmi Sharma, Research, Department of Computer Science Engineering, University of Petroleum & Energy Studies, Dehradun (Uttarakhand) India.
2Dr. Inder Singh, Department of Computer Science Engineering, University of Petroleum & Energy Studies, Dehradun (Uttarakhand) India.
3Deepak Bharadwaj, Department of Mechanical Engineering, University of Petroleum & Energy Studies, Dehradun (Uttarakhand) India.
4Manish Prateek, Department of Computer Science Engineering, University of Petroleum & Energy Studies, Dehradun (Uttarakhand) India.

Manuscript received on 01 May 2019 | Revised Manuscript received on 15 May 2019 | Manuscript published on 30 May 2019 | PP: 1782-1787 | Volume-8 Issue-7, May 2019 | Retrieval Number: G5312058719/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: A bipedal walking robot is a kind of humanoid that resembles a human. Bipedal robots are programmed for specific tasks. This work studies biped walking, using the zero moment point (ZMP) to control the balance mechanism with reinforcement learning (RL). The proposed forgetting Q-learning algorithm helps the biped learn to walk without any prior knowledge of the dynamics model of the system. The study examines an improvement to the RL algorithm that lets it cope with a continuously changing environment. Bipedal navigation is studied by implementing a forgetting mechanism in the traditional Q-learning algorithm. Simulations were performed on each of the six joints of both legs of the biped to evaluate the feasibility of the proposed algorithm, and the optimal policy for navigation was evaluated. Incorporating the forgetting mechanism improves the learning time of the RL agent to a certain extent in a dynamic environment. The learning architecture was developed to solve complex control problems. It uses different modules that consist of simple controllers with the RL forgetting Q-learning algorithm.
Keywords: Bipedal, Reinforcement Learning, Q-Learning Algorithm, Walking Robot, Optimal Policy, Forgetting Mechanism.
Scope of the Article: Web Algorithms.
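
The abstract does not spell out the forgetting rule, so the following is only a minimal sketch of one possible way to add forgetting to tabular Q-learning, not the authors' method. It assumes that, after the standard Q-learning update, every other stored Q-value decays toward zero by a small factor, so that stale estimates fade in a changing environment. The function name forgetting_q_update, the parameter forget, and the state and action encodings are all hypothetical.

```python
def forgetting_q_update(q, state, action, reward, next_state, actions,
                        alpha=0.1, gamma=0.9, forget=0.01):
    """One tabular Q-learning step, then decay all other stored entries.

    q is a dict mapping (state, action) pairs to values; missing pairs
    are treated as 0.0. The decay step is an assumed forgetting rule,
    not one taken from the paper.
    """
    # Standard Q-learning update for the visited (state, action) pair.
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

    # Assumed forgetting rule: shrink every other stored value toward zero.
    for key in q:
        if key != (state, action):
            q[key] *= (1.0 - forget)
    return q
```

With forget set to 0.0 this reduces to ordinary Q-learning, which makes it easy to compare learning time with and without forgetting in a simulated environment.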