A Minecraft Agent Based on a Hierarchical Deep Reinforcement Learning Model
Arjun Panwar

Arjun Panwar, Researcher, Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America (USA).

Manuscript received on 30 September 2025 | Revised Manuscript received on 08 October 2025 | Manuscript Accepted on 15 October 2025 | Manuscript published on 30 October 2025 | PP: 8-12 | Volume-14 Issue-11, October 2025 | Retrieval Number: 100.1/ijitee.K115414111025 | DOI: 10.35940/ijitee.K1154.14111025

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Open-world games such as Minecraft pose significant challenges for reinforcement learning (RL) systems due to their long-horizon objectives, sparse rewards, and demand for compositional skill learning. This study investigates how a Hierarchical Deep Reinforcement Learning (HDRL) approach can improve agent performance and sample efficiency in such complex environments. We develop a hierarchical agent composed of three interconnected levels: (i) a high-level planner that decomposes tasks into subtasks using the options framework for temporal abstraction, (ii) mid-level controllers that manage reusable subtasks such as resource gathering, crafting, and smelting, and (iii) a low-level visuomotor policy that interacts with the environment through human-like keyboard and mouse inputs. The agent’s learning pipeline integrates pretraining on human demonstration datasets (MineRL) and large-scale video pretraining (VPT) to establish behavioural priors before reinforcement learning fine-tuning. This design leverages modern hierarchical algorithms such as Option-Critic, FeUdal Networks (FuN), HIRO, and Hierarchical Actor-Critic (HAC), enabling the agent to operate across multiple temporal scales. Evaluation is conducted using ObtainDiamond-style benchmarks and BASALT “reward-free” tasks to measure generalization and human alignment. Ablation studies assess the effect of each hierarchical layer, the inclusion of demonstrations, and large-scale video-based priors on overall performance. Results indicate that HDRL substantially improves task completion rates and sample efficiency compared to monolithic RL agents, particularly in long-horizon, reward-sparse scenarios. This research addresses the limitations of existing RL systems in complex, open-ended worlds and explores how hierarchical structures can bridge the gap between low-level control and high-level planning. The findings demonstrate that hierarchical reinforcement learning provides a scalable and interpretable framework for developing agents capable of long-term reasoning and adaptive skill composition. The proposed model advances the state of the art in game-based AI, offering insights applicable both to Minecraft research and to broader domains involving open-ended task learning and autonomous decision-making.
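To make the three-level decomposition concrete, the Python sketch below outlines how a high-level planner, mid-level option controllers, and a low-level visuomotor policy could be wired together. This is a minimal structural sketch only: every class, method, and subtask name (HighLevelPlanner, Option, make_option, gather_logs, and so on) is a hypothetical placeholder introduced for illustration, not the paper's implementation and not the MineRL or VPT APIs.

# Illustrative sketch of the three-level hierarchy (placeholder names).
from dataclasses import dataclass
from typing import Callable, Dict, List

Action = str          # stands in for primitive keyboard/mouse events
Observation = dict    # stands in for pixel + inventory observations


@dataclass
class Option:
    """A temporally extended subtask in the options framework."""
    name: str
    policy: Callable[[Observation], Action]        # mid-level controller
    termination: Callable[[Observation], bool]     # beta(s): stop condition


class LowLevelPolicy:
    """Visuomotor policy mapping observations to primitive inputs.
    In the paper's agent this role is played by a network pretrained on
    MineRL demonstrations and VPT video data; here it is a stub."""
    def act(self, obs: Observation, subgoal: str) -> Action:
        return f"primitive_input_for:{subgoal}"


class HighLevelPlanner:
    """Decomposes a task into an ordered sequence of option names.
    A fixed decomposition stands in for the learned planner."""
    def plan(self, task: str) -> List[str]:
        return ["gather_logs", "craft_pickaxe", "mine_ore", "smelt_iron"]


def make_option(name: str, low_level: LowLevelPolicy, horizon: int = 3) -> Option:
    """Builds a toy option that terminates after a fixed number of steps."""
    state = {"t": 0}

    def policy(obs: Observation) -> Action:
        return low_level.act(obs, subgoal=name)

    def termination(obs: Observation) -> bool:
        state["t"] += 1
        return state["t"] > horizon

    return Option(name, policy, termination)


def run_episode(task: str) -> None:
    """Top-level loop: plan, then execute each option until it terminates."""
    low_level = LowLevelPolicy()
    planner = HighLevelPlanner()
    options: Dict[str, Option] = {
        n: make_option(n, low_level) for n in planner.plan(task)
    }
    obs: Observation = {"inventory": {}, "pixels": None}
    for name in planner.plan(task):
        opt = options[name]
        while not opt.termination(obs):
            action = opt.policy(obs)
            # A real environment step (e.g. a MineRL env) would go here.
            print(f"[{name}] -> {action}")


if __name__ == "__main__":
    run_episode("obtain_diamond")

The key design point the sketch captures is the separation of temporal scales: the planner reasons over whole subtasks, each option owns its own termination condition, and only the low-level policy touches primitive actions, which is what allows demonstration-pretrained controllers to be reused across tasks.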

Keywords: Hierarchical Deep Reinforcement Learning, Minecraft Agent, Reinforcement Learning Model.
Scope of the Article: Artificial Intelligence and Methods