AlphaBoop - The Strongest AI for the Board Game called Boop

Robin Wu

Taiji Kitten
I once introduced Tenley to a two-player abstract-strategy board game from 2022 called Boop.

At first glance, the game looks childish due to its theme of cats 'booping' each other on a bed, but the mechanics of this game are very "thinky".

The game is played on a 6x6 board, and each player gets 8 kittens in their pool. Below, I overlaid a heatmap that shows which spots the AI thinks are good moves to play. As you can see, any kitten you place on the board indiscriminately pushes ('boops') neighboring kittens away vertically, horizontally, and diagonally. You can even push kittens off the board itself, returning the fallen kittens to their owners' pools.

Note that a kitten does not get 'booped' away if there is already another piece directly behind it, blocking the push.
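To make the mechanic concrete, here is a minimal sketch of the boop rule in Python. The board representation and function names are my own illustration, not AlphaBoop's actual code:

```python
# A 6x6 grid of None or (owner, kind) tuples, where kind is "kitten" or "cat".
SIZE = 6
DIRS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

def place_and_boop(board, r, c, piece):
    """Place `piece` at (r, c), then push each neighbor one cell outward.

    Returns the list of pieces booped off the board (back to the pool).
    """
    board[r][c] = piece
    fallen = []
    for dr, dc in DIRS:
        nr, nc = r + dr, c + dc
        if not (0 <= nr < SIZE and 0 <= nc < SIZE):
            continue
        neighbor = board[nr][nc]
        if neighbor is None:
            continue
        if piece[1] == "kitten" and neighbor[1] == "cat":
            continue                      # kittens cannot boop Cats
        br, bc = nr + dr, nc + dc         # cell directly behind the neighbor
        if 0 <= br < SIZE and 0 <= bc < SIZE:
            if board[br][bc] is None:     # empty behind: push one cell
                board[br][bc] = neighbor
                board[nr][nc] = None
            # otherwise blocked: the neighbor stays put
        else:                             # pushed off the edge of the board
            fallen.append(neighbor)
            board[nr][nc] = None
    return fallen
```

A blocked neighbor simply stays put, which is exactly the exception noted above.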

[Image: JpMgnBc.png — heatmap of the AI's move evaluations]


When you get three kittens in a row, they are taken off the board and promoted to Cats. If you get 2 kittens and 1 Cat in a row, all three are also taken off the board, but only the 2 kittens get promoted (the Cat is already a Cat). Likewise, with 1 kitten and 2 Cats in a row, all three come off the board and only the kitten gets promoted.

[Image: XrJMdyP.png]


As you can see, the Cats are chonky and fat. Therefore, while Cats can boop any piece, kittens are unable to boop Cats.

In situations where all 8 of your pieces are on the board, you may take any one of your pieces off the board, and if that piece is a kitten, it is automatically promoted to a Cat.

[Image: NDUYgjA.png]


There are two ways to win this game:
  1. Get 3 Cats in a row.
  2. Have 8 Cats (all of your pieces) on the board.
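For illustration, checking the first win condition is just a small scan over every line of length three. This assumes the same kind of 6x6 grid where each cell is None or an (owner, kind) tuple — my own representation, not AlphaBoop's:

```python
SIZE = 6

def has_three_cats(board, owner):
    """Return True if `owner` has three Cats in a row (any direction)."""
    for r in range(SIZE):
        for c in range(SIZE):
            # right, down, down-right, down-left cover all 8 directions
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                cells = [(r + i * dr, c + i * dc) for i in range(3)]
                if all(0 <= rr < SIZE and 0 <= cc < SIZE
                       and board[rr][cc] == (owner, "cat")
                       for rr, cc in cells):
                    return True
    return False
```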

So, this childish-looking game is actually very tactical. According to the rules (I am not joking here): "boop" or "meow" sound effects are encouraged when you boop.

So I used the AlphaZero algorithm to master the game of Boop. AlphaZero was created by Google's DeepMind as the successor to AlphaGo, the program that mastered the game of Go and defeated the top professional player, Lee Sedol, back in 2016.

The game of Boop can be played online, and I used AlphaBoop to absolutely obliterate the top human players, so I can confidently say that this AI has achieved superhuman playing strength.

I shared my work on Reddit, and the creator of this game shared my work on Facebook.

The AlphaZero algorithm is actually very elegant. To put it simply, it's two algorithms complementing each other, creating a positive feedback loop: 1) Monte Carlo Tree Search and 2) a neural network.

Step 1: The AI starts with a tree search to generate training data. At this point the AI is very stupid, but not completely stupid: it runs simulations of the game, and on each turn it decides what move to make by tallying up the win rate for each option while balancing high-value moves against exploring novel moves. So it's slightly better than complete stupidity, and that is important.
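That balance between high win-rate moves and exploring novel moves is what the UCT (Upper Confidence bound for Trees) formula handles in plain Monte Carlo Tree Search; a sketch:

```python
import math

def uct_score(wins, visits, parent_visits, c=1.4):
    """UCT: average win rate plus an exploration bonus that shrinks
    as a move gets visited more often. `c` tunes the trade-off."""
    if visits == 0:
        return float("inf")  # always try unvisited moves first
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)
```

During the search, the move with the highest score is expanded next, so rarely-tried moves still get their turn even when another move looks strong.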

Step 2: The neural network trains on the training data that the tree search generated. It predicts two things:
  1. The probability distribution over the most promising moves.
  2. The likelihood of winning or losing from a given position.
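A minimal stand-in for those two outputs, assuming the 6x6 Boop board (36 placement squares). Everything here is illustrative, not AlphaBoop's actual network:

```python
import math

def dummy_policy_value(state, n_moves=36):
    """Illustrative stand-in for the trained network (hypothetical names).

    Returns (policy, value): a probability distribution over the 36
    squares of the 6x6 board, and a win estimate in [-1, 1].
    """
    logits = [0.0] * n_moves                 # a trained net computes these from the state
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    policy = [e / total for e in exps]       # softmax over moves
    value = math.tanh(0.0)                   # scalar win/loss estimate
    return policy, value
```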
Step 3: The tree search generates more training data, guided by the neural network which says: "Hey! I think you should focus your search on these moves instead of those other moves."

Step 4: The neural network now trains on the newly generated training data, which is slightly less stupid than before.

And the chicken-and-egg cycle continues. The 'Zero' in AlphaZero means it learns entirely from self-play without human-provided game data, starting with no prior knowledge and building its understanding through iterative self-improvement. And this algorithm can be applied to virtually any board game that does not involve luck or hidden information.
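Putting the four steps together, the whole loop can be sketched like this. The names are hypothetical stand-ins (a real implementation batches self-play and training across many workers):

```python
from dataclasses import dataclass

@dataclass
class ToyNetwork:
    """Stand-in for the real policy/value network."""
    seen: int = 0

    def fit(self, examples):
        self.seen += len(examples)   # a real net runs gradient descent here

def self_play_game(network):
    """Stub: a real version runs MCTS-guided self-play and returns
    (state, mcts_policy, final_outcome) training triples."""
    return [("state", {"move": 1.0}, +1)]

def train_alphazero(network, iterations=3, games_per_iter=2):
    for _ in range(iterations):
        examples = []
        for _ in range(games_per_iter):
            examples += self_play_game(network)  # Steps 1/3: generate data
        network.fit(examples)                    # Steps 2/4: learn from it
    return network
```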

Lately, I have also tried to apply AlphaZero to another two-player abstract game from 2003 called YINSH, which is played on a hexagonal board with 85 intersection points.

The players take turns placing their 5 rings anywhere on the board.

On each turn, you place a marker of your color in any of your 5 rings. Then you move that ring in a straight line along any of the three lines running through the point where you placed your marker.

When a ring jumps over markers, those markers indiscriminately flip color: white becomes black, and black becomes white. Rings cannot jump over other rings.
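A sketch of the flip rule, with an assumed representation (a dict from board points to marker colors; none of this is from my actual implementation):

```python
def move_ring(markers, path):
    """Flip every marker the ring jumps over.

    `markers` maps board points to "white"/"black"; `path` is the ordered
    list of points the ring passes over, with the destination point last.
    The destination is empty (it will hold the ring), so it never flips.
    """
    flip = {"white": "black", "black": "white"}
    for point in path[:-1]:
        if point in markers:
            markers[point] = flip[markers[point]]
```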

When you get 5 markers of your color in a row, those markers are taken off the board, and you must then remove any one of your rings and place it on the score track. The objective of the game is to get 3 rings removed, meaning you must get 5 in a row three times. But each time you remove a ring, you have fewer rings on the board, so there is an inherent "catch-up" mechanism where "winning" can put you at a disadvantage.

I have been training on the Blitz version of YINSH, in which you only need to score 5 in a row once. I need a proof of concept before I move on to the complete game.

Here is Iteration 116; the training is not yet complete (I am only providing sample games):

[Image: cO2CMWk.png — sample game from training Iteration 116]


I am also the one who programmed the UI for this: https://kiwimaru.github.io/YINSH/

While there have been other attempts at implementing an AI for YINSH, it appears to me that people just gave up when it came to placing the rings on the board. Human players have different styles of play: some like to play along the edges, some play the center, and some believe in a balance of both.

The AI seems to like forming its rings in a straight line. I am surprised that the first and second players do not even try to fight for the same region of the board; I was expecting more of a fight. I did try playing against my own AI, disrupting its ring placement so it could not form a line, but the AI still kicked my butt.

If I were to train on the complete game of YINSH, the opening ring placement would likely be different. But I have always wondered what the best ring placement is for this game, because nobody knows.
 