A Minecraft Agent Based on a Hierarchical Deep Reinforcement Learning Model
Arjun Panwar

Arjun Panwar, Researcher, Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America (USA).

Manuscript received on 30 September 2025 | Revised Manuscript received on 08 October 2025 | Manuscript Accepted on 15 October 2025 | Manuscript published on 30 October 2025 | PP: 8-12 | Volume-14 Issue-11, October 2025 | Retrieval Number: 100.1/ijitee.K115414111025 | DOI: 10.35940/ijitee.K1154.14111025

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Open-world games such as Minecraft pose significant challenges for reinforcement learning (RL) systems due to their long-horizon objectives, sparse rewards, and requirement for compositional skill learning. This study investigates how a Hierarchical Deep Reinforcement Learning (HDRL) approach can improve agent performance and sample efficiency in such complex environments. We develop a hierarchical agent composed of three interconnected levels: (i) a high-level planner responsible for decomposing tasks into subtasks using the options framework for temporal abstraction, (ii) mid-level controllers that manage reusable subtasks such as resource gathering, crafting, and smelting, and (iii) a low-level visuomotor policy that interacts with the environment through human-like keyboard and mouse inputs. The agent’s learning pipeline integrates pretraining from human demonstration datasets (MineRL) and large-scale video pretraining (VPT) to establish behavioural priors before reinforcement learning fine-tuning. This design leverages modern hierarchical algorithms such as Option-Critic, FeUdal Networks (FuN), HIRO, and Hierarchical Actor-Critic (HAC), enabling the agent to operate across multiple temporal scales. Evaluation is conducted using ObtainDiamond-style benchmarks and BASALT “reward-free” tasks to measure generalization and human alignment. Ablation studies assess the effect of each hierarchical layer, the inclusion of demonstrations, and large-scale video-based priors on overall performance. Results indicate that HDRL substantially enhances task completion rates and sample efficiency compared to monolithic RL agents, particularly in long-horizon and reward-sparse scenarios. This research was conducted to address the limitations of existing RL systems in complex, open-ended worlds and to explore how hierarchical structures can bridge the gap between low-level control and high-level planning. The findings demonstrate that hierarchical reinforcement learning provides a scalable and interpretable framework for developing agents capable of long-term reasoning and adaptive skill composition. The proposed model advances the state of the art in game-based AI, offering insights applicable to both Minecraft research and broader domains involving open-ended task learning and autonomous decision-making.
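For readers unfamiliar with the three-level design the abstract describes, the following minimal Python sketch illustrates how a high-level planner, mid-level controllers, and a low-level visuomotor policy can be composed under the options framework. All class names, the subtask decomposition, and the stub policies are illustrative assumptions for exposition, not the paper's implementation; a real agent would replace the stubs with learned policies (e.g., trained with Option-Critic or HIRO) acting in a MineRL environment.

```python
# Structural sketch only: every name below is a hypothetical stand-in,
# not the paper's actual code or the MineRL API.
import random
from dataclasses import dataclass


@dataclass
class Option:
    """A subtask in the options-framework sense: a named, temporally
    extended action with a crude termination condition (a step limit)."""
    name: str
    max_steps: int = 50


class HighLevelPlanner:
    """Decomposes a long-horizon goal into a sequence of options."""

    def plan(self, goal: str) -> list:
        # Hypothetical decomposition for an ObtainDiamond-style goal.
        return [Option("gather_wood"), Option("craft_pickaxe"),
                Option("mine_stone"), Option("smelt_iron"),
                Option("mine_diamond")]


class MidLevelController:
    """Maps the active option to short-horizon subgoals; a random
    choice stands in for a learned subtask policy."""

    def act(self, option: Option, obs) -> str:
        return random.choice(["move", "look", "attack", "use"])


class LowLevelPolicy:
    """Visuomotor stub: turns a subgoal label into human-like
    keyboard/mouse events."""

    def to_inputs(self, subgoal: str) -> dict:
        return {"keys": [subgoal[0]], "mouse_dx": random.uniform(-5.0, 5.0)}


def run_episode(goal: str) -> None:
    planner, mid, low = HighLevelPlanner(), MidLevelController(), LowLevelPolicy()
    for option in planner.plan(goal):
        for _ in range(option.max_steps):
            subgoal = mid.act(option, obs=None)  # observation omitted in this stub
            _inputs = low.to_inputs(subgoal)     # would be sent to the game client
        print(f"finished option: {option.name}")


if __name__ == "__main__":
    run_episode("obtain_diamond")
```

The separation of concerns is the point of the sketch: only the planner reasons about the full task, only the controllers know when a subtask ends, and only the low-level policy touches keyboard and mouse, which is what lets each level learn at its own temporal scale.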

Keywords: Hierarchical Deep Reinforcement Learning, Minecraft Agent, Reinforcement Learning Model.
Scope of the Article: Artificial Intelligence and Methods