MadcoreTom

Learning snakes

This project was about using gradient descent to train different models to play Snake.

This isn't the most effective approach, and it's unlikely to come up with novel solutions, but it was a fun toy project.

The Premise

Two models, each with its own weights, are chosen to play a game against each other. Each has different inputs, but the outputs are always the same three: turn left, turn right, or continue straight. Each output gets a score, and the snake makes the highest-scoring move among the valid options.

For example, with left: 0.1, straight: 0.5, right: 0.8, the snake will turn right if it's a valid move, otherwise go straight, otherwise turn left. If none of the three is valid, it's game over.
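The selection rule above can be sketched in a few lines of Python. This is only an illustration: the function name choose_move and the move labels are my own, not taken from the project's code.

```python
from typing import Dict, Optional, Set


def choose_move(scores: Dict[str, float], valid: Set[str]) -> Optional[str]:
    """Return the highest-scoring move that is valid, or None for game over."""
    candidates = [move for move in scores if move in valid]
    if not candidates:
        return None  # no valid move left: game over
    return max(candidates, key=lambda move: scores[move])


scores = {"left": 0.1, "straight": 0.5, "right": 0.8}
print(choose_move(scores, {"left", "straight", "right"}))
print(choose_move(scores, {"left", "straight"}))
print(choose_move(scores, set()))
```

With every move valid the snake turns right; with right blocked it falls back to straight; with nothing valid the function returns None.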

The Game

It's a mix between Tron and two-player Snake. The snakes play on a grid, and on each move their head advances one tile. Their tail grows every second frame, so each snake leaves a trail behind it that either it or its opponent may crash into. The level also includes a few random solid tiles, just to keep things interesting.
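As a rough sketch of the movement rule, the snake's body can be kept in a deque with the head at the front. The function name advance and the choice of which frames count as growth frames are assumptions for illustration; the real game may differ.

```python
from collections import deque
from typing import Deque, Tuple

Tile = Tuple[int, int]


def advance(body: Deque[Tile], direction: Tile, frame: int) -> None:
    """Move the head one tile; keep the last tail tile on growth frames."""
    head_x, head_y = body[0]
    dx, dy = direction
    body.appendleft((head_x + dx, head_y + dy))
    # Assumption: the snake grows on even frames and keeps its length on odd ones.
    if frame % 2 == 1:
        body.pop()


snake: Deque[Tile] = deque([(0, 0)])
for frame in range(4):
    advance(snake, (1, 0), frame)
print(list(snake))
```

Every tile still in the deque is part of the trail, so a collision check would only need to test whether the new head position is in either snake's body or on a solid tile.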

[Image: Gameplay]

Training

Training is done by

  • Picking a model at random
  • Tweaking the weights
  • Making it play every other model n times (which may include the same model with different weights)

If it won more games than it lost, adopt the new weights. If it lost more than it won, modify the weights in the opposite direction.
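One training iteration could look roughly like the sketch below. The names training_step and toy_play, the step_size parameter, and the convention that play returns 1, -1 or 0 are all assumptions of mine. I've also assumed a model never plays against its own slot in the pool, and that a tie leaves the weights untouched.

```python
import random
from typing import Callable, List

Weights = List[float]
# Hypothetical game runner: 1 if the first player wins, -1 if it loses, 0 for a draw.
PlayFn = Callable[[Weights, Weights], int]


def training_step(models: List[Weights], play: PlayFn, n: int,
                  step_size: float, rng: random.Random) -> int:
    """Tweak one randomly chosen model, then keep, reverse, or discard the tweak."""
    index = rng.randrange(len(models))
    old = models[index]
    delta = [rng.uniform(-step_size, step_size) for _ in old]
    candidate = [w + d for w, d in zip(old, delta)]

    wins = losses = 0
    for other_index, opponent in enumerate(models):
        if other_index == index:
            continue  # assumption: a model does not play its own slot
        for _ in range(n):
            result = play(candidate, opponent)
            if result > 0:
                wins += 1
            elif result < 0:
                losses += 1

    if wins > losses:
        models[index] = candidate  # the tweak helped: adopt it
    elif losses > wins:
        # The tweak hurt: step the same distance in the opposite direction.
        models[index] = [w - d for w, d in zip(old, delta)]
    return index


def toy_play(a: Weights, b: Weights) -> int:
    """Stand-in for a real game: the larger weight sum wins."""
    return (sum(a) > sum(b)) - (sum(a) < sum(b))


rng = random.Random(0)
pool = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
for _ in range(10):
    training_step(pool, toy_play, n=3, step_size=0.1, rng=rng)
print(pool)
```

toy_play only exists so the sketch runs on its own. In the real project, play would run a full game between the two snakes.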

I just let it run through a few thousand iterations, and that's usually enough to see some interesting behaviour emerge.

Here are 10 training iterations:

[Image: Training]

The Models

Random

Random output values mean random choices. Not very intelligent.

Random with weights

Output values are random numbers multiplied by their weight. The choices are still random, but the weights determine the probability of each one.
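Here is a small sketch of both random models. The function names and the example weights are made up for illustration, and I'm assuming the random values are uniform between 0 and 1.

```python
import random
from typing import Dict

MOVES = ("left", "straight", "right")


def random_scores(rng: random.Random) -> Dict[str, float]:
    """Random model: every output is just a random number."""
    return {move: rng.random() for move in MOVES}


def random_weighted_scores(weights: Dict[str, float],
                           rng: random.Random) -> Dict[str, float]:
    """Random with weights: each random number is scaled by that output's weight."""
    return {move: rng.random() * weights[move] for move in MOVES}


rng = random.Random(0)
weights = {"left": 1.0, "straight": 1.0, "right": 3.0}  # illustrative values only
picks = {move: 0 for move in MOVES}
for _ in range(1000):
    scores = random_weighted_scores(weights, rng)
    picks[max(scores, key=scores.get)] += 1
print(picks)
```

With these example weights, right should be picked far more often than the other two, but left and straight still come up sometimes.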

Basic Weights

Each output is bias + weight * x, where x is the number of solid tiles around the snake's head. There's no indication of where the obstacles are, so this model doesn't stand much of a chance.
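A minimal sketch of this model follows. I'm assuming "around the head" means the 8 neighbouring tiles and that anything off the edge of the grid counts as solid; the function names and the example bias and weight values are hypothetical.

```python
from typing import Dict, List, Tuple

Grid = List[List[bool]]  # grid[y][x] is True for a solid tile


def count_solid_neighbours(grid: Grid, head: Tuple[int, int]) -> int:
    """Count solid tiles among the 8 neighbours of the head."""
    head_x, head_y = head
    count = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # skip the head itself
            x, y = head_x + dx, head_y + dy
            outside = not (0 <= y < len(grid) and 0 <= x < len(grid[0]))
            # Assumption: tiles outside the grid behave like walls.
            if outside or grid[y][x]:
                count += 1
    return count


def basic_scores(params: Dict[str, Tuple[float, float]], x: int) -> Dict[str, float]:
    """Each output is bias + weight * x, with its own (bias, weight) pair."""
    return {move: bias + weight * x for move, (bias, weight) in params.items()}


grid = [[False] * 3 for _ in range(3)]
grid[0][0] = True
x = count_solid_neighbours(grid, (1, 1))
params = {"left": (0.5, -0.1), "straight": (0.2, 0.3), "right": (0.0, 0.1)}
print(x, basic_scores(params, x))
```

Since x is the same for all three outputs, the weights can only shift which move is preferred in crowded versus open areas, which fits the model's weak results.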

Better Weights

This one is more complex, and wins the most.

Each output is calculated as

bias[0] + 
weight[0] * x +
weight[1] * y +
weight[2] * r +
weight[3] * e +
weight[4] * c +
weight[5] * g

where:

  • x = the number of solid tiles around the snake's head.
  • y = the number of solid tiles around the snake's head, if it chose to go straight.
  • r = whether turning left or right will bring the player closer to their own tail.
  • e = whether turning left or right will bring the player closer to their opponent's head.
  • c = whether turning left or right will bring the player closer to the centre of the game space.
  • g = how close they are getting to walls (a gradient from 0-1).
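The formula is a bias plus a weighted sum of the six inputs, which can be sketched as below. The name linear_score and all the numbers are placeholders of mine, and I've assumed the inputs arrive as plain numbers in the order x, y, r, e, c, g. How r, e and c are encoded as numbers isn't covered here.

```python
from typing import Sequence


def linear_score(bias: float, weights: Sequence[float],
                 features: Sequence[float]) -> float:
    """Return bias + weights[0] * features[0] + ... for one output."""
    if len(weights) != len(features):
        raise ValueError("expected one weight per feature")
    return bias + sum(w * f for w, f in zip(weights, features))


# Placeholder inputs in the order x, y, r, e, c, g.
features = [3.0, 2.0, 1.0, 0.0, 1.0, 0.25]
# Placeholder bias and weights for a single output, e.g. "straight".
bias = 0.1
weights = [-0.2, -0.4, -0.3, 0.1, 0.2, -0.5]
print(linear_score(bias, weights, features))
```

Each of the three outputs would get its own score this way, and the highest valid one decides the move.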

What's next?

Some inputs seemed useful, and others didn't. I think the snake needs to know more to make an intelligent choice. Maybe it will learn to play more aggressively.

I think some form of "ripple" effect from movement might be useful.

I also want to add another piece of gameplay, where the snakes can "jump" other snakes, adding a z axis to the game.