Reinforcement Learning for Battlesnake

UVic AI’s primary project is using reinforcement learning to build a top-ranked bot for the game Battlesnake. Battlesnake is an internationally played game in which four snake bots fight for survival in a digital environment. Each turn, every snake’s code is sent the state of the environment and given 500 ms to compute and submit its next move. To train in this multi-agent, synchronous game environment, we have built a Monte Carlo Tree Search (MCTS) algorithm that trains via self-play (inspired by AlphaZero). After only three training epochs, our current model plays at an intermediate level and shows no signs of diminishing returns.
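For readers new to AlphaZero-style search, here is a minimal sketch of the PUCT selection rule that AlphaZero uses to decide which move to explore next inside the tree. It is an illustration only and not our actual implementation: the function names, the dictionary layout, and the `c_puct` value are assumptions, and it ignores the simultaneous-move aspect of Battlesnake.

```python
import math

def puct_score(q_value, prior, parent_visits, child_visits, c_puct=1.5):
    """AlphaZero-style PUCT score: value estimate plus an exploration bonus.

    c_puct is a hypothetical exploration constant chosen for illustration.
    """
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q_value + exploration

def select_move(children, c_puct=1.5):
    """Pick the move with the highest PUCT score.

    children maps a move name to a dict with keys "q" (mean value),
    "prior" (policy network probability) and "visits" (visit count).
    This layout is an assumption made for the sketch.
    """
    parent_visits = sum(child["visits"] for child in children.values())
    return max(
        children,
        key=lambda move: puct_score(
            children[move]["q"],
            children[move]["prior"],
            parent_visits,
            children[move]["visits"],
            c_puct,
        ),
    )

# Example: an unvisited move with a high prior gets a large exploration bonus.
children = {
    "up": {"q": 0.5, "prior": 0.1, "visits": 10},
    "left": {"q": 0.0, "prior": 0.8, "visits": 0},
}
chosen = select_move(children)
```

The bonus term shrinks as a move is visited more often, so the search gradually shifts from following the network's prior to trusting the values it has measured.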

Our plan for the summer is to improve the interpretability of this RL model, use Random Network Distillation to improve MCTS exploration, and scale up training until we’re in first place on the leaderboard.

We also help those with less programming experience write their first Battlesnake bot using basic heuristics during the lead-up to seasonal Battlesnake tournaments.
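As a rough idea of what a basic heuristic bot can look like, the sketch below avoids walls and snake bodies, then steps toward the nearest food. It assumes the game state is a dictionary shaped like the public Battlesnake API request (`board`, `you`, `food`, `snakes`, with `x`/`y` coordinates and the origin at the bottom-left); the function names are our own and purely illustrative.

```python
import random

# Direction offsets, assuming (0, 0) is the bottom-left corner so "up" increases y.
MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

def safe_moves(state):
    """Return the moves that stay on the board and avoid every snake body."""
    board = state["board"]
    head = state["you"]["head"]
    # Conservative: treats tails as blocked even though they usually move away.
    occupied = {
        (segment["x"], segment["y"])
        for snake in board["snakes"]
        for segment in snake["body"]
    }
    safe = []
    for name, (dx, dy) in MOVES.items():
        x, y = head["x"] + dx, head["y"] + dy
        on_board = 0 <= x < board["width"] and 0 <= y < board["height"]
        if on_board and (x, y) not in occupied:
            safe.append(name)
    return safe

def choose_move(state):
    """Pick a safe move, preferring the one that ends closest to any food."""
    options = safe_moves(state)
    if not options:
        return "up"  # no safe move exists, so any answer loses
    food = state["board"]["food"]
    if not food:
        return random.choice(options)
    head = state["you"]["head"]

    def distance_after(move):
        dx, dy = MOVES[move]
        x, y = head["x"] + dx, head["y"] + dy
        # Manhattan distance to the nearest piece of food after moving.
        return min(abs(f["x"] - x) + abs(f["y"] - y) for f in food)

    return min(options, key=distance_after)

# Example state: a two-segment snake in the middle of a 3x3 board.
me = {"head": {"x": 1, "y": 1}, "body": [{"x": 1, "y": 1}, {"x": 1, "y": 0}]}
example_state = {
    "board": {"width": 3, "height": 3, "food": [{"x": 2, "y": 1}], "snakes": [me]},
    "you": me,
}
example_move = choose_move(example_state)
```

A bot like this loses to anything that plans ahead, but it is a useful first step before adding rules such as avoiding head-on collisions with longer snakes.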


Vector-quantized variational autoencoders (VQ-VAEs) are a class of generative models that learn to produce a compact, discrete representation of data while maintaining reconstruction quality. This is a new, beginner-friendly project that we will work through step by step in workshops, starting with a basic introduction to autoencoders (see the Events page).
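To make "discrete representation" concrete, the sketch below shows the quantization step at the heart of a VQ-VAE: an encoder output vector is replaced by its nearest entry in a codebook, and only the index of that entry needs to be stored. It is plain Python for readability; the codebook values and function name are made up for illustration, and a real model would learn the codebook during training.

```python
def quantize(vector, codebook):
    """Return (index, entry) of the codebook entry nearest to the vector."""

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(
        range(len(codebook)),
        key=lambda i: squared_distance(vector, codebook[i]),
    )
    return index, codebook[index]

# Hypothetical 2-D codebook with three entries.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]

# A continuous encoder output is snapped to the closest codebook entry.
index, quantized = quantize([0.9, 1.2], codebook)
```

The decoder then reconstructs the input from the quantized vectors alone, which is what forces the model to make its small set of codes informative.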