Porting the optimal yahtzee algorithm to typescript
A while back I had developed a love for the dice game yahtzee and built a web app to play the game in an individual format. I was interested in a player's ability to intuit the probabilistic math at work and push for higher scores. The game I built is based on the standard hasbro yahtzee rules without yahtzee bonuses.
After playing the game a few hundred times I thought I had a good sense for an optimal play. I would prioritize the top section ensuring that I got the 35 bonus points and drop low probability scores like "Yahtzee" and "Large Straight" if I ever got into trouble. I started looking into algorithms for playing yahtzee better and discovered some research by James Glenn at Loyola in 2006 to break down the game and build an optimal strategy.
The paper is from 2006 when computers were a lot slower than they are now. Without the memory and processing constraints, I was able bring the algorithm into a browser app to guide players along the optimal route as they go.
The optimal yahtzee algorithm
Each turn (a player gets random dice then two re-rolls then must select a score slot) can be broken down into an independent widget that links to other future turn widgets. Without using yahtzee bonuses, a widget is defined by the sum of all the top section scores and whether or not each category has been scored yet. A game starts at the first widget (the sum of the top sections is zero and no categories have been scored yet) and ends in ones of 63 final states (the sum of the top sections is between 0 and 63 and all categories are scored). There are only 63 final states because the bonus applied to the top section sum occurs when the sum is 63 or greater. The optimal strategy and expected score value can be computed by computing three mappings within each widget.
The first is the mapping of outcomes of the final roll to the best score slot available. This is pretty simple to compute as there are 252 possible rolls. We calculate the "best score slot" as the one with the highest score of the dice + expected score value of the widget that scoring that slot would lead us to. Now we can calculate of all the possible second re-roll decisions, which have the expected score value by considering the likelihood of an outcome and the expected score value of that outcome. From there we can work backward again to get another mapping of outcomes to the optimal first reroll decisions.
Ultimately we are left with the three mappings that give us the optimal strategy for a widget. 1. Given an initial roll, which re-roll decision is optimal. 2. Given the result of the first reroll, which re-roll decision is optimal. 3. Given the result of fthe second re-roll, which scoring decision is optimal.
Because each widget depends on knowing the expected values of "downstream" widgets or widgets with more categories filled out, we structure the order in which we process widgets from back to front starting with final game state widgets and ending with the initial game widget. Final game states really only consider the 35 bonus points for the top section sum and have an expected value of 35 only if that sum is 63, otherwise the widget expected value is 0. We can process a widget when we have processed all the potential widgets that the current widget could point to after scoring.
Computation of the game graph
Building the entire game graph means processing the expected score value of every widget. Once we compute the score value of the initial widget we know what score we can expected on average for any game we play. Building the game graph with the current rules (no yahtzee bonus) means processing every possible state of a structure with an integer 0-63 and an array of boolean values representing whether a category has been scored. This number is 64 * (2^13) = 524,288. Thanks to some cleverness recommended in the paper, we can remove all the states where the top section sum is not divisible by the top section scored categories ie. the top sum cannot be 17 when only the "Ones" category is scored. After processing out all impossible states we are left with 363,008.
Another clever improvement of the algorithm is that when playing the game we only need to keep track of the total expected score value of each widget not the inner data. When playing games of yahtzee we can compute each widget and it's optimal strategy quickly using this expected score value map.
Currently the graph takes around 35 minutes to build with each widget taking around 7-8ms to process. There is likely potential in multi-threading to accelerate things.
Memory and data encoding
Once the game graph is built, it's exported to a json file in the shape of a string to number hash map where the widget state is the key and the expected score number is the value. Currently this json file is XMb. I could likely greatly reduce the size of this file by removing the keys and keeping an implicit ordering to expected score value numbers or reducing the precision of the expected score values.
Reaction to playing with the optimal solution
It feels very strange playing with the optimal solution. It feels like the dice are naturally doing what you want when in reality you are just playing a statistical advantage. Certainly the scores are higher and even if you miss one or two recommended moves you're still on the highest possible path moving forward.
Some of the interesting things I've noticed the algorithm recommending that either affirm or go against my intuition are:
- Score full houses, three-of-a-kinds, and four-of-a-kinds when you get them. I had thought they were fairly common but optimal algorithm likes to score these early in games when they can be found.
- Fill a zero in the yahtzee column if you're stuck, if you can't then zero out the ones and twos.
Contributing
I have some features I'd like to build out to make the game more interesting, like having the optimal solution not visible but checking with a users actions, or evaluating how many optimal moves a player made after a game.
If you're interested in contributing make a github issue here.
The game can be viewed here.