Dominion > Simulation

Dominion Evolutionary Algorithm - Finding the optimum strategy on a given board

(1/4) > >>

BeXXsoR:
Hi everyone,

this is my first post on this forum, so first of all I'd like to say Hello to all of you.

I have used Geronimoos Simulator a lot lately, and its an amazing tool to test the effect of cards and strategies on a given board. What I have in mind now is to extend its functionality to not only let fixed strategies play against each other, but to figure out the optimal (or near-optimal) strategy on a given board on its own!

I want to use an Evolutionary Algorithm (EA) to achieve this. An EA is a heuristic algorithms that tries to solve optimization problems by imitating nature's behaviour of evolution.

Here's a quick roundup of how an EA works:
An EA starts with a random population of possible solutions (in our case, random strategies), and applies the following steps:

* Reproduction: Two selected strategies are combined to create a new solution, meaning that the new solutions takes some parameters of the first and some of the second strategy.
* Mutation: A random subset of the population is mutated, meaning that a random parameter of their strategy is altered randomly.
* Evaluation: Each solution is assigned a fitness value that acts as a measure of its quality.
* Selection: Out of all of the strategies of the starting population and the ones created in steps 2 and 3, a new population is formed by kicking out as many strategies as needed to get the population down to a given size. Strategies with a lower fitness value are more likely to be kicked out.
* These steps are iterated until a given stopping condition is reached. The crucial parts of implementing a good Dominion EA are:
a) The representation of Dominion strategies by as few parameters as possible, and
b) The definition of a quick to evaluate fitness function to measure the quality of a strategy

Here are my thoughts on these points:
a) The representation of Dominion strategies by as few parameters as possible
To define a strategy, I like to use Geronimoos buying rules. But these can get as complex as you like, so I have to make some restrictions:

* Engines vs. BM are so different that it's hard to define a template that may generate both types of strategies. But as these are the two main strategies used in todays games, I like to focus on them first and ignore Slogs, Rushes, Combos for the time being.
* I like to ignore Alt-VP cards for the time being, because they would make the template even more complex. I might add them later when the base EA works.
* I like to ignore tax, debt, events, landmarks and some special cards as well for the same reasonMy representation of a strategie looks like this:
Each strategy assigns values to 17 parameters k_1 to k_17, which are:

* k_1, k_2, ... , k_10: Maximum number of copies the strategy wants to gain of board card 1, 2, ..., 10, resp. Valid values are 0, 1, 2 and no restriction
* k_11: Same as k_1 to k_10 for the bane card if its in play
* k_12, k_13: Same as k_1 to k_10 for Gold and Silver
* k_14: Card that is check for start of greening. Valid choices are "Most-expensive card with k_i = no restriction" and "Gold"
* k_15: Number of copies of k_14 to have in deck before greening starts. Valid values are 1, 2, 3, 4
* k_16: Number of Provinces left in the supply before buying Duchys. Valid values are 4, 5, 7
* k_17: What to go for if the greening condition is fulfilled, but the player doesnt have enough money to buy the next desired green card. Valid values are "Follow your usual strat" or "Treasure"The Buy strategy with these parameters looks like this:

* Buy Provinces if [Count in Deck(k_14) >= k_15]
* Buy Duchys if [Count in Supply(Provinces) <= k_16]
* Buy Estates if [Count in Supply(Provinces) <= 2]
* Buy Gold and Silver respecting the values k_12 and k_13 if [Count in Deck(k_14) >= k_15 AND k_17 = "Treasure"]
* Buy the most expensive kingdom card you can afforf respecting the values of k_1 to k_11. If you have the choice between two or more cards with the same cost, buy the one you have less copies of in your deck.
* Buy Gold or Silver respecting the values k_12 and k_13So for example a simple Big Money strategy without any actions would look like this:

* k_1 to k_11 = 0
* k_12 to k_13 = No Restrictions
* k_14 = "Gold"
* k_15 = 2
* k_16 = 5
* k_17 = "Treasure"And Festival/Smithy without any treausre would look this way (lets say Festival is k_1 and Smithy is k_4):

* k_1 and k_4 = No restrictions
* k_2, k_3, k_5 to k_13 = 0
* k_14 = "Festival"
* k_15 = 4
* k_16 = 7
b) The definition of a quick to evaluate fitness function to measure the quality of a strategy
The hard part in this statement is the "quick to evaluate". By nature, strategies are there to compete against each other, so you cant measure their quality in a solitaire game, but have to let them play real (i.e. simulated) games vs. other strategies. I though about letting all current strategies play a littel tournament, but that would consume way to much CPU time. Therefore I came up with the idea of letting the user define 1 to 4 benchmark strategies that the new strategies have to play against. For example, these strategies could be Smithy BM ("How good is new strategy against BM), Witch & Militia ("How good does it stand against attacks) and Village & Smithy ("How does it perform vs. a base engine). The sum of the win rates against all benchmark strategies defines the fitness value of a strategy.

Before I start the implementation, I lilke to hear your opinions on this. How can i optimize the representation of a strategy? How can I generate a better and/or quicker fitness function?

And last but not least, I want to thank you, Geronimoo, for creating such a great simulator and making it open source for the community! I only can imagine how much time and energy you have invested in this tool!

Regards
BeXXsoR

JW:
Have you seen the Provincial AI for dominion? The write-up for that may contain useful ideas. https://graphics.stanford.edu/~mdfisher/DominionAI.html

DG:
If you dig back through the forums you might also find a thesis which used Dominion as a basis for AI development using mathematical methods. I'm assuming that was theoretical whereas you will be looking for practical results.

ben_king:

--- Quote from: BeXXsoR on April 24, 2017, 01:41:59 pm ---Before I start the implementation, I lilke to hear your opinions on this. How can i optimize the representation of a strategy? How can I generate a better and/or quicker fitness function?

--- End quote ---

Ultimately any fitness function is going to be an approximation.  The best approximation is going to be simulating playouts, probably at least a few thousand games.  Using a simulator like Geronimoo's that has fixed rules for how each card is played, this can actually be pretty fast to do.

Testing it against a number of benchmark strategies might be an okay approximation, but it will fail miserably in some cases.  Suppose that your kingdom has (among other cards), Enchantress and Hunting Grounds.  Maybe one of your candidates likes to play something that resembles Hunting Grounds-BM.  That might play really well against all the benchmarks, but it will fail big-time against anyone who knows to pick up Enchantress.  Sure you can over-terminal to try to fight that, but the benchmarks aren't likely to select a candidate that over-terminals.  The point, I guess, is that there are way too many possible scenarios (especially in post-Empires Dominion) for the benchmarks to select good strategies.  More than ever, it depends on the board.

Unfortunately, I think even with the best genetic algorithm and fitness function, this type of approach will not come close to being able to beat good humans.  Maybe on some weak big money boards, it could get good shuffle luck and sometimes win.  But deciding what cards to buy is at best only half the skill in Dominion.  I don't know if a human-level AI is what you're aiming for, but the proposed AI will probably fall short.

If you are looking to build something human-level, you might want to look into Monte-Carlo tree search.  This is more or less what AlphaGo, plus a number of other top AIs for other games use.

trivialknot:
Good luck with the evolutionary algorithm!  Even if it isn't the best AI ever, it could be a fun project, and I hope to hear more about it.

A few comments on the strategy representation:

I don't understand k_17.  For the simple Big Money strategy, it seems like k_17 doesn't matter, since either "follow your strategy" or "treasure" both lead to buying treasure.  For the Engine strategy, it doesn't usually make sense for the Engine strategy to suddenly veer into treasure during the greening phase.

One problem with your strategy parameters is that it doesn't handle card priorities well. A strategy that wants to buy Chapel may never get around to buying it because it's given lowest priority.  I'm also not sure how multiple buys are handled.  Provincial offers some ideas on how to handle card priorities, but Provincial's method is imperfect too.  You could probably go a bit crazier with the number of parameters.