Learning to imitate tasks in Minecraft

The goal of this work was to identify algorithms that use readily available gameplay data to learn to perform tasks in Minecraft. The MineRL competition involves solving tasks in complex, hierarchical, sparse-reward environments under tight constraints on training time and compute, which necessitates sample-efficient exploration techniques and training with human priors. To address this, I trained algorithms from the "from Demonstrations" (fD) family on processed data from the MineRL dataset, a collection of 60 million state-action samples of human players performing tasks (essentially, training an RL algorithm paired with demonstration data).
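
As a rough illustration, pre-loading these demonstrations might look like the sketch below. It assumes the `minerl` data API (`minerl.data.make` / `batch_iter`) shipped with the competition toolkit, whose exact signatures vary across versions, and uses a plain list as an illustrative stand-in for the prioritized replay buffer used in practice:

```python
# A rough sketch of streaming the human demonstrations, assuming the
# minerl data API from the competition toolkit (signatures vary across
# versions); `demo_transitions` is an illustrative stand-in for a real
# prioritized replay buffer.
import minerl

data = minerl.data.make("MineRLTreechop-v0", data_dir="data")

demo_transitions = []
for obs, action, reward, next_obs, done in data.batch_iter(
        batch_size=32, seq_len=1, num_epochs=1):
    # Each item is a batch of (s, a, r, s', done) tuples recorded from
    # human gameplay; fD methods keep these in the replay buffer
    # permanently and mix them with the agent's own experience.
    demo_transitions.append((obs, action, reward, next_obs, done))
```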

Specifically, I used Deep Q-learning from Demonstrations (DQfD) to learn to chop trees and mine diamonds in Minecraft, rather than learning behavior purely through RL methods. On the tree-chopping task, DQfD outperformed Deep Q-Networks (DQN, the vanilla RL equivalent) tenfold; a sketch of the DQfD objective follows this paragraph. Slides for the project are available here. The codebase is available here. Please feel free to reach out if you would like to know more about my experience with the competition :)
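
To make the method concrete, here is a minimal PyTorch sketch of the core DQfD objective: a one-step TD loss plus a large-margin supervised loss that pushes the demonstrator's action above all others. The function name, the hyperparameter values, and the omission of DQfD's n-step return and L2 regularization terms are my simplifications, not the competition code:

```python
import torch
import torch.nn.functional as F

def dqfd_loss(q_values, target_q_next, actions, rewards, dones,
              is_demo, gamma=0.99, margin=0.8, lambda_e=1.0):
    """One-step DQfD loss: TD error plus a large-margin supervised
    term applied only to demonstration transitions (the n-step and
    L2 terms of the full objective are omitted here)."""
    # Q(s, a) for the actions actually taken.
    q_taken = q_values.gather(1, actions.unsqueeze(1)).squeeze(1)

    # Standard one-step TD target from a target network.
    td_target = rewards + gamma * (1 - dones) * target_q_next.max(dim=1).values
    td_loss = F.smooth_l1_loss(q_taken, td_target.detach())

    # Large-margin loss: require Q(s, a_E) to exceed every other
    # action's value by at least `margin`; the margin is zero at the
    # demonstrated action itself.
    margins = torch.full_like(q_values, margin)
    margins.scatter_(1, actions.unsqueeze(1), 0.0)
    supervised = (q_values + margins).max(dim=1).values - q_taken
    supervised_loss = (is_demo * supervised).mean()

    return td_loss + lambda_e * supervised_loss
```

The supervised term is what lets the demonstrations shape the Q-function before the agent has collected any useful reward of its own, which matters in sparse-reward settings like tree chopping.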



MineRL