Teaching a Computer to Fight Itself: Implementing an Artificial Intelligence Solver for Ogre
by Will Hutton
January 25, 2019
The game of Ogre was first published in 1977, but my introduction to the game was the 2012 Ogre Designer's Edition Kickstarter campaign. My wife was my primary opponent, so I let her play the Ogre because it was only one unit to manage . . . and it was a steamroller. After losing a few games as the defender, the complexity of the game became painfully obvious to me. Which defensive units should I take? Where should I place those units? How should I move each unit? Should units fight together or separately? Which Ogre systems should I target first; the main battery, the missiles, the secondaries batteries, or the treads? No matter what strategy I tried, each game always ended in an Ogre victory for my wife.
Then I thought, "What better way to learn how to win a game about a massive AI tank than by creating my own AI solver to play thousands of games and tell me how to best play the game!" The genesis of this idea was a Python software course at Washington State University. Each student needed a semester project. When I sought approval for mine, I discovered the professor, Dr. Bob Lewis, was very interested, as he was a longtime Ogre fan!
The challenge in programming a computer is that it only does exactly what it is programmed to do. How could I program a computer to find a solution if I did not already know the solution myself? There are numerous artificial intelligence techniques and most learn by using a fitness function to evaluate the quality of their solution. I used an AI technique known as a genetic algorithm that would begin with a random selection of defensive units. 100 random training games were played, and then evaluated using a fitness function. Just like in nature, the games with the best outcomes were selected to be parents of future games. After many generations, I hoped for a strategy that would perform better than random chance.
Before I could implement the genetic algorithm, I had to employ a rules engine. The rules engine is responsible for enforcing the game rules of Ogre, for example; Ogre and Infantry units may cross ridge lines, but no other unit may cross a ridge line; no unit may move into a crater hex; etc. Abstract games like chess are played on a grid of rows and columns that can be easily represented as a data structure known as a 2-dimensional array. The Ogre map is a very specific type of grid called an "offset, point up hex map". This type of map is not well suited to representation using a 2-dimensional array because a lot of space (i.e. computer memory) is wasted and calculating distances between two hexes is difficult.
In a genetic algorithm, a sequence similar to DNA is used to represent one solution to a given problem. This potential solution is evaluated by a fitness function. Additional solutions are created from the most successful previous solutions. The first challenge to overcome was how to represent each of the many possible combinations of defensive units.
Each defensive unit is assigned a symbol from a finite alphabet. In the case of biological DNA, that alphabet is very small; just four letters A, C, G, and T. The actual symbols used in a genetic algorithm do not matter as long as they are used consistently. We could use the first seven symbols of the English alphabet to represent each of the seven defensive units in Table 1, but I1, I2, I3, Gv, Hv, Hz, Mt are more descriptive than A, B, C, D, E, F, G.
|Defense Unit Type||DNA|
Initially defensive units are chosen at random according to the rules of the Basic (Mark III Attack) Scenario; 20 points of infantry and 12 armor units. A sample selection may be: I2, I2, I3, I1, I1, I2, I2, I3, I1, I1, I2, Gv, Gv, Hz, Hz, Hz, Hz, Gv, Mt. Interestingly, this selection was generated at random for this article, but by pure chance it contains four Howitzers. In The Ogre Book (2001), Chester Hendrix's "The Four Howitzer Defense in Ogre" describes a valid strategy incorporating a similar selection of defensive units. The odds of randomly selecting at least four Howitzers is approximately 4 in 1,000! Placement of each Howitzer is key and it may take a genetic algorithm many games to learn the optimal placement given there are 222 potential hexes (0101 - 1516) for three of the four Howitzers and 90 hexes (0101-1507) for placing the remaining Howitzer. How the Howitzers fight together; what they target, and in what order, and at what odds to combine firepower, must also be learned dynamically. Again, this may take a genetic algorithm many games to learn.
One obvious advantage of computation and artificial intelligence is the vast number of games that can be played in a short amount of time. Computer science uses the worst-case run-time scenario for analyzing the efficiency of an algorithm (i.e. asymptotic analysis). A two-player game of Ogre can be played by humans in 30-60 minutes, so we use 60 minutes for calculations. The AI solver can play about five games of Ogre in just under one second! In the same amount of time it takes humans to play just one game of Ogre, the AI solver can play over 21,000 games. That is over 21,000 chances to learn something new about optimal strategy compared to just one chance to learn for the humans.
I programmed each defense unit to make tactical decisions based on one of several predefined behaviors. How would they move in relation to the Ogre? Would they work together with nearby units to combine firepower? What Ogre systems should be targeted first, the main battery, the missiles, or the treads? Inspired by Aristotle, I chose to implement three basic behaviors for the defense units; cowardly, disciplined, and aggressive. Aristotle was concerned with how much of a particular thing was the correct amount. In Nicomachean Ethics, written for his son, we find Aristotle's renowned idea, "moderation in all things." Not enough aggression results in cowardice which may be a bad thing because soldiers will not fulfill their duty. Likewise, too much aggression is also bad thing if a soldier charges foolheartedly into overwhelming odds and certain death. The correct amount of aggression results in a disciplined soldier that will not abandon his position needlessly and fight when necessary as a member of an integrated team.
|Defense Unit Behavior||DNA|
Cowardly units flee the Ogre and move towards the command post. If they are still in range of the Ogre after their movement phase, they fire at a random Ogre system and never combine their attacks with other defense units. Aggressive defense units pursue the Ogre. They combine their attacks whenever possible and target the Ogre probabilistically, with a 50% chance of targeting the main battery or missiles, a 20% chance of targeting a secondary battery, and a 30% chance of targeting the treads. Disciplined units stand their ground and do not move. Disciplined units combine their attacks with nearby units for 3:1 odds whenever possible and always target Ogre systems based on decreasing threat level; first the missiles, then the main battery, followed by the secondary batteries, and finally the treads.
This programmed behavior for aggressive units illustrates a potential pitfall in artificial intelligence. The programmer must be careful not to allow their own bias, include incomplete or incorrect assumptions of the problem to influence the solution. As a new player, I implemented what I thought was the best strategy for the defense. However, veteran players will state that 1:1 odds optimizes their chances. Veteran players may also prioritize attacking treads over secondary batteries to slow or stop an Ogre. After all, an immobile Ogre is harmless, provided any scenario objectives are outside of the range of its remaining weapons. This problem is further addressed in the Future Work section.
The defense unit behavior DNA becomes a prefix of the defense unit DNA. For example, an aggressive GEV would be represented by the sequence AGv, and a disciplined Heavy Tank as DHv. When all of the defense units' DNA is concatenated together, the result is a single string of characters that defines all the defensive units and their behaviors for a single game.
When the game is over, the AI solver uses a fitness function to evaluate the effectiveness of the selected defense units and their behaviors. The fitness function I used is simply the victory conditions of the Basic (Mark III Attack) Scenario:
Complete Ogre victory: CP and all defending units are destroyed.
Ogre victory: CP is destroyed, and the Ogre flees the south edge of the map.
Marginal Ogre victory: CP and Ogre are both destroyed.
Defense victory: CP survives, and the Ogre is destroyed.
Complete defense victory: CP survives, the Ogre is destroyed, and 30 attack points of defender units remain.
When ties occur, they are broken by counting the remaining attack power of the defense units. If a tie still remains, games with fewer turns are preferred.
Two of the top ten games are chosen at random to become parents of the next game. Each parent contributes half of their DNA to the next game, with a slight probability of a genetic mutation. To illustrate this process, we will use a simplified DNA sequence using just two defense units for each game.
|1||DHvAI3||Disciplined Heavy Tank, Aggressive Infantry-3|
|2||CMtDGv||Cowardly Missile Tank, Disciplined GEV|
The DNA of Game 1 is split in half, resulting in two strands of DNA; DHv and AI3. Both pieces of DNA, which we will call "Parent 1 Left" and "Parent 1 Right," have a 50% chance of being passed on to Game 3. This process is repeated for Game 2. Table 4 shows the probability of each DNA sequence for Game 3 using Game 1 and Game 2 as parents.
|Game 3 DNA||Source||Probability|
|DHvCMt||Parent 1 Left and Parent 2 Left||25%|
|DHvDGv||Parent 1 Left and Parent 2 Right||25%|
|AI3CMt||Parent 1 Right and Parent 2 Left||25%|
|AI3DGv||Parent 1 Right and Parent 2 Right||25%|
Each symbol has a small probability (e.g., 5%) of mutation. For example, using the DNA sequence AI3DGv, there is a 5% chance the aggressive behavior of the Infantry-3 unit mutates and becomes either cowardly or disciplined. Then there is a 5% chance the unit type mutates from Infantry-3 to some other type of defense unit, a GEV perhaps? The rest of the DNA sequence is checked for mutations, then finally Game 3 is played, and the result is evaluated with the fitness function above. Successful games pass on their DNA to future games, and the AI solver continues to make better defense unit selections and employ better defense unit behavior. Actual DNA sequences for each game played by the AI solver are much longer than those in Table 4.
If we assume Steve Jackson designed Ogre to be balanced and both players play optimally, we should expect a Nash Equilibrium. The details of a Nash Equilibrium are not important to this article but may an interesting diversion for some readers (esp. fans of the movie A Beautiful Mind). In a nutshell, we should expect the Ogre to win 50% of the time and the defender to win 50% of the time. Games that are unbalanced tend to not be fun to play. Thus, we should expect a winning AI solver to win approximately 50% of its games. When choosing units randomly, the AI solver wins an average of just 5 in 100 games. This is a pitiful 5%-win rate, far from the goal of 50%. However, after 100,000 games using the genetic algorithm, the AI solver has learned enough to improve its outcome 64% in the span of time it would take us humans to play six real games of Ogre.
The current AI solver does not account for defense unit placement. Recall "The Four Howitzer Defense in Ogre" discussion. The placement of each Howitzer is key to the strategy. If the initial placement hexes were also recorded as part of a defender unit's DNA, the genetic algorithm could learn that it won more games when the Howitzers are placed in hexes that allow for mutual fire support (e.g., 1506, 1007, 0904, and 0706) as opposed to in a single row at north end of the map (e.g., 0101, 0102, 0103, and 0104).
Odds on the Combat Results Table are also important. Luck may play a part in an individual game, but it has no place informing strategy over many games. Therefore, there may be an optimal attack ratio to be found. The AI solver could be modified to learn what this optimum ratio is. This would require less programmed behaviors (e.g., aggressive and cowardly), and a framework to allow the genetic algorithm to discover one or more approaches to defensive unit movement, target selection, and attack strength (i.e. cooperation with other defensive units).
Local maxima appear after playing 100,000 games. At this point, we must play tens of thousands of additional games, over much longer periods of time, to derive only fractional improvements. If the AI solver could experiment with additional defense unit behaviors, this local maxima may likely be improved. Use of AI solvers can also result in a complicated strategy emerging that a human might have difficulty seeing in advance.
The game engine implements a trivial AI for the Ogre. The Ogre always starts in the same hex (i.e. 1222), takes the shortest path to the Command Post, reserves one missile for destroying the CP, destroys the CP with the longest-range available weapon, then immediately flees to the south of the map. This is not an optimal strategy and will always preclude a complete Ogre victory because the Ogre will not hunt down and destroy any remaining defense units. The AI solver risks overfitting its strategy to this preprogrammed Ogre strategy. Overfitting is when an AI finds an adequate solution to a particular situation that does not generalize well to other situations.
AI failures are a source of inspiration in Ogre fiction and scenarios. In "The Black Knight" (Ogre Scenario Book 1), an Ogre forgets its goal after suffering from combat stress. In "Run for the Border" (also in Ogre Scenario Book 1), an Ogre abandons its initial programming to defect to the enemy. Finally, in "Wounded Company" (Ogre Scenario Book 2), a Vulcan decides that saving the AI brain of an immobilized Mark V is more important than its programmed combat objective.
I hope that sharing my experience with programming an AI solver for Ogre has inspired you to learn more about artificial intelligence, game theory, philosophy, computer science, or even create your own fan fiction and Ogre scenarios. This endeavor has taught me a lot, but I still lose often when my wife and I play Ogre!
The Ogrezine II PDF, combining all of these articles with additional new material, will be available on Warehouse 23.