Topic: What's stopping AI from mastering Dominion? (Read 11849 times)

J Reggie · « **on:** March 01, 2018, 06:06:27 pm »

After AlphaGo’s defeat of Lee Sedol in 2016, it has been considered inevitable that computers, specifically artificial neural networks, can beat the top humans at any board game given enough preparation. Now, obviously that preparation is the key; since stef doesn't have an in with the Google DeepMind team that I'm aware of, it'll be a long time until we get a neural network that can beat Mic Qsenoch in a league match. But even if a supercomputer were thrown at the problem, there are a number of other challenges it would need to overcome. I'll outline a few that I can think of, but please add more.

1. Lack of training data
AlphaGo analyzed hundreds of thousands of professional games online before it reached a professional level. We have nowhere near that amount of data, not even taking into account the rating of the players. For one, there's really no such thing as a “professional dominion player”; there are a handful of highly-rated players who regularly place well in tournaments, plus stef himself (who doesn't participate in tournaments or rated games anymore). A neural network trained on data from all rated games would probably develop reasonably strong money-based strategies and beat lower-ranked players, but would lose to anyone who frequents these forums. Furthermore, there's no good bot to play hundreds of thousands of games against in order to get a basis for good play. An AI that trains against Lord Rattington is going to end up marginally harder to beat than Lord Rattington, which isn't saying much.

2. Sheer number of starting game states
In games like chess, checkers, and go, there are a total of 2 possible starting game states for any given player: either you go first or you go second. In Dominion, that number is at least in the realm of quadrillions. An algorithm would need to analyze each game on the fly; it can't have a bank of games it's played with that same kingdom to fall back on. Even if a neural network could get to the point of basic kingdom analysis, small things like what the Bane is or what Obelisk is on can completely change a game. This type of intricacy is pretty common, but specific instances of it are rare enough that a neural network may not have a point of reference to determine whether, say, opening Young Witch is worth it in a given kingdom. It will also be bad at recognizing 2-card combos on its own. Something like Donate/Market Square may be obvious to a human, but how would a computer ever come up with that?
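To put a rough number on "quadrillions", here is a minimal sketch that counts kingdoms as unordered choices of 10 piles from a pool of kingdom cards. The pool sizes are hypothetical round numbers, not official card counts, and the function name `kingdom_count` is made up for illustration. It ignores Bane choices, Events, Landmarks and the like, which would only push the number higher.

```python
from math import comb

def kingdom_count(num_kingdom_cards: int, kingdom_size: int = 10) -> int:
    """Number of distinct kingdoms: unordered choices of kingdom_size piles."""
    return comb(num_kingdom_cards, kingdom_size)

# Hypothetical pool sizes (assumptions, not official counts).
for pool in (25, 100, 200, 300):
    print(f"{pool} cards in the pool -> {kingdom_count(pool):.2e} kingdoms")
```

Under these assumptions, a pool of a couple hundred cards is already enough to pass a quadrillion (10^15) kingdoms.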

3. Randomness
Computers that play board games will often form a tree of possible moves when deciding what move to make next. The only variable they have to take into account is their opponent, and they can form a tree of moves for their opponent as easily as they can form one for themselves. However, it wouldn't be time-efficient for a computer to calculate each possible arrangement of its cards after a shuffle every time it plays or buys a card; it would need to develop some more abstract way of determining its choices. That's a level that AI as of yet tends to struggle with. This hypothetical dominion bot wouldn't be able to calculate its exact percentage chance of winning with each decision, so what would it go off of?

4. Number of small choices
For a while, Lord Rattington had a bug where if you played Vault until it had 0 cards in hand, it would freeze. This is an example of a bot not understanding a simple decision of whether or not to discard. Even after the bug was fixed, the bot continued to discard until it had 0 cards in hand. An AI would have to learn how to respond to every little prompt, and the difference between each one. It can't just choose not to discard to Militia, but it has to eventually choose not to discard to Vault. This is made harder by the fact that it won't necessarily train against an opponent who plays 5 Vaults per turn. It's going to have to somehow learn what the best response is to each little prompt, and it's going to have to do so quickly, as it may not get many other chances to go over that situation. This isn't how neural networks work; they need lots of reinforcement in order to be able to make good decisions.

5. Unique cards
An AI, if it got to this point, would likely be able to determine that Village and Worker's Village are pretty similar, and might group cards into categories much like humans do. But it could have more trouble with unique cards like Lurker or Villa that seem similar to other cards but play very differently. It could misjudge a card’s function based on a small number of games. I don't have a good idea of how this would play out; it's possible the AI could be better at this than I'd expect.

And that brings me to…

What advantages does an AI have?

All is not hopeless! I think there are a few things computers have going for them when it comes to Dominion, assuming they get over these many hurdles:
Deck tracking - a computer should have a perfect memory of its deck and discard pile at all times
Balancing deck contents - it would probably find a good balance for building an engine in the early game
Forced wins - the counting and math blunders that plague high-level humans wouldn't be an issue for a computer

So what do you think? Is it possible to create an AI that consistently beats the best humans? Is a neural network the best way to go? Will we even play Dominion anymore when the AIs take over the Earth? This is just the beginning.

ConMan · « **Reply #1 on:** March 01, 2018, 06:21:06 pm »

I think that you could have some limited success under controlled conditions, in particular a fixed Kingdom and maybe fixed opening hands (I wouldn't go so far as to require identical random seeds for shuffling or anything like that). Under those conditions, I'd consider some kind of genetic algorithm that has some kind of heuristic information about the game state, and specific information about what decisions it gets to make. Possibly even some kind of hybrid genetic / neural network algorithm? It wouldn't even necessarily need to know what each card does, because it would figure out on its own that in particular game states it should buy or play particular cards (for example, it would hopefully learn that it usually wants to play enough of its Treasures at the start of the Buy phase to afford stuff).

I suspect that it would take a long time to train, not necessarily because of the number of choices but because of the stochastic nature of the game - in Chess or Go when you move or place a piece it does what it does, and if it's the "right" move you will get the "right" result; in Dominion, if you have $7 and a Moat in hand and your draw deck is two Estates and a Copper then even if playing the Moat is the correct move you won't necessarily get the result you need. So you'd want the AI to "forget" things more slowly, I think - you don't want it to make a good move, have a run of bad luck, and thus decide to never make that move again.
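One way to picture "forgetting more slowly" is a running estimate of how good a move is, nudged toward each new result by a step size. This is only a sketch of that idea, not how any existing Dominion bot works; `update_estimate`, `run_updates` and the step sizes are made up for illustration.

```python
def update_estimate(estimate: float, outcome: float, step: float) -> float:
    """Move the estimate a fraction `step` of the way toward the latest outcome."""
    return estimate + step * (outcome - estimate)

def run_updates(estimate: float, outcomes: list, step: float) -> float:
    """Apply a sequence of game outcomes (1.0 = win, 0.0 = loss) in order."""
    for outcome in outcomes:
        estimate = update_estimate(estimate, outcome, step)
    return estimate

# A good move (believed to win 60% of the time) followed by three unlucky losses.
bad_luck = [0.0, 0.0, 0.0]
print("large step:", run_updates(0.6, bad_luck, step=0.5))   # forgets quickly
print("small step:", run_updates(0.6, bad_luck, step=0.02))  # forgets slowly
```

With the large step, three bad shuffles are enough to make the move look terrible; with the small step the estimate barely moves, at the cost of needing many more games to learn anything.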

weesh · « **Reply #2 on:** March 01, 2018, 06:37:41 pm »

Quote from: J Reggie on March 01, 2018, 06:06:27 pm

Lack of training data
AlphaGo analyzed hundreds of thousands of professional games online before it reached a professional level. We have nowhere near that amount of data, not even taking into account the rating of the players....

This is a reasonable point, but not technically accurate.

There were not enough professional games to seed AlphaGo, and they actually used ZERO professional games.
The library was made entirely of the games of very strong amateurs, of which there are vast libraries, and the play is actually not that much weaker than pro games.

Furthermore, AlphaGo played against itself many times (millions? I can't recall), and MOST of the analysis it did was based on THOSE games, rather than the seed games.

It's possible that if all the games played online were analysed, that would be enough to kickstart the process, and allow the computer to play against itself and analyse its own games.

But, if the "alphaDominion" requires the same use of pattern recognition to follow the correct branches as AlphaGo used, it could be very possible that there are not enough strong games to perform the analysis.

After all, it doesn't survey the same board every time. It would not be able to find a single kingdom thousands of times played by strong players, let alone for every possible kingdom.

---

My theory is that there is not the MOTIVE (financially, computationally, newsworthiness, and time-wise) for people to develop a dominion playing computer.
You could work really hard on such a program, spending quite a bit of money, and achieve notoriety only among a small number of dominion fans, and not make waves like the poker playing, chess playing, and go playing programs can.

If you want to make waves, the next big challenge is a poker-playing program that can win against more than one opponent.
Right? No one did that while I was looking the other way did they?

DG · « **Reply #3 on:** March 01, 2018, 08:24:00 pm »

First of all, we have one-off strategies such as Counting House and Travelling Fair that will probably not be found by any measuring algorithm unless a kingdom is played out through all possible states to a victory. Playing games through all possible states is going to take many iterations, and when you move to another kingdom, with Minions maybe, you need to start the assessment from scratch again.

Secondly, it can be hard to separate out important factors. Do you change your card play to suit a buying strategy or do you change your buying to match your card play? That's a block on the current level of AI but a supercomputer should crack that. However what if your card play and purchases depend upon your opponent's hand next turn? What if your opponent plays differently based on your card plays and purchases? This is beginning to sound like a rapidly expanding problem.

Awaclus · « **Reply #4 on:** March 01, 2018, 08:35:15 pm »

I don't know, he hasn't been very active in over two years.

ConMan · « **Reply #5 on:** March 01, 2018, 10:09:24 pm »

Quote from: DG on March 01, 2018, 08:24:00 pm

Secondly, it can be hard to separate out important factors. Do you change your card play to suit a buying strategy or do you change your buying to match your card play? That's a block on the current level of AI but a supercomputer should crack that. However what if your card play and purchases depend upon your opponent's hand next turn? What if your opponent plays differently based on your card plays and purchases? This is beginning to sound like a rapidly expanding problem.

My suspicion is that you might get some mileage out of having two separate AIs - one that handles the Action phase, and one that handles the Buy phase (assuming that the kingdom is chosen such that there are no meaningful decisions to make involving Night, Clean-up or other players' turns). That might simplify the structure, and you can randomly pair them up to form a "full player" in each iteration of the training.

It will always be a question of how much abstraction and simplification you can use to make the problem more easily solvable, and how much of an effect that has on the skill of the AI. For example, instead of feeding in the complete game state, down to the probabilities of what card is on top of the opponent's deck, is it enough to just have a few simple measures like "Provinces left in Supply", "VP differential between players" and "Coin value of Treasures in hand"? (Answer: The list probably needs to be longer, but probably not as long as you might think.)
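As a sketch of what that kind of simplified input could look like, the snippet below boils a game state down to the three measures named above. The `GameState` fields, the `extract_features` name and the treasure table are assumptions made up for illustration; a real bot would need a longer list.

```python
from dataclasses import dataclass
from typing import List

# Coin values of the basic Treasures; anything else counts as 0 here.
TREASURE_VALUE = {"Copper": 1, "Silver": 2, "Gold": 3}

@dataclass
class GameState:
    """Hypothetical, heavily simplified view of the game."""
    provinces_left: int
    my_vp: int
    opponent_vp: int
    hand: List[str]

def extract_features(state: GameState) -> List[float]:
    """Return [Provinces left in Supply, VP differential, coin value of Treasures in hand]."""
    coin_in_hand = sum(TREASURE_VALUE.get(card, 0) for card in state.hand)
    return [
        float(state.provinces_left),
        float(state.my_vp - state.opponent_vp),
        float(coin_in_hand),
    ]

state = GameState(provinces_left=8, my_vp=3, opponent_vp=6,
                  hand=["Copper", "Gold", "Silver", "Estate", "Moat"])
print(extract_features(state))
```

Whatever learner sits on top (genetic, neural or otherwise) would only ever see this short vector, which is exactly the abstraction trade-off in question.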

terminalCopper · « **Reply #6 on:** March 02, 2018, 02:58:33 am »

Quote from: J Reggie on March 01, 2018, 06:06:27 pm

1. Lack of training data
AlphaGo analyzed hundreds of thousands of professional games online before it reached a professional level. We have nowhere near that amount of data, not even taking into account the rating of the players. For one, there's really no such thing as a “professional dominion player”; there are a handful of highly-rated players who regularly place well in tournaments, plus stef himself (who doesn't participate in tournaments or rated games anymore). A neural network trained on data from all rated games would probably develop reasonably strong money-based strategies and beat lower-ranked players, but would lose to anyone who frequents these forums. Furthermore, there's no good bot to play hundreds of thousands of games against in order to get a basis for good play. An AI that trains against Lord Rattington is going to end up marginally harder to beat than Lord Rattington, which isn't saying much.

2. Sheer number of starting game states
In games like chess, checkers, and go, there are a total of 2 possible starting game states for any given player: either you go first or you go second. In Dominion, that number is at least in the realm of quadrillions. An algorithm would need to analyze each game on the fly; it can't have a bank of games it's played with that same kingdom to fall back on. Even if a neural network could get to the point of basic kingdom analysis, small things like what the Bane is or what Obelisk is on can completely change a game. This type of intricacy is pretty common, but specific instances of it are rare enough that a neural network may not have a point of reference to determine whether, say, opening Young Witch is worth it in a given kingdom. It will also be bad at recognizing 2-card combos on its own. Something like Donate/Market Square may be obvious to a human, but how would a computer ever come up with that?

3. Randomness
Computers that play board games will often form a tree of possible moves when deciding what move to make next. The only variable they have to take into account is their opponent, and they can form a tree of moves for their opponent as easily as they can form one for themselves. However, it wouldn't be time-efficient for a computer to calculate each possible arrangement of its cards after a shuffle every time it plays or buys a card; it would need to develop some more abstract way of determining its choices. That's a level that AI as of yet tends to struggle with. This hypothetical dominion bot wouldn't be able to calculate its exact percentage chance of winning with each decision, so what would it go off of?

4. Number of small choices
For a while, Lord Rattington had a bug where if you played Vault until it had 0 cards in hand, it would freeze. This is an example of a bot not understanding a simple decision of whether or not to discard. Even after the bug was fixed, the bot continued to discard until it had 0 cards in hand. An AI would have to learn how to respond to every little prompt, and the difference between each one. It can't just choose not to discard to Militia, but it has to eventually choose not to discard to Vault. This is made harder by the fact that it won't necessarily train against an opponent who plays 5 Vaults per turn. It's going to have to somehow learn what the best response is to each little prompt, and it's going to have to do so quickly, as it may not get many other chances to go over that situation. This isn't how neural networks work; they need lots of reinforcement in order to be able to make good decisions.

5. Unique cards
An AI, if it got to this point, would likely be able to determine that Village and Worker's Village are pretty similar, and might group cards into categories much like humans do. But it could have more trouble with unique cards like Lurker or Villa that seem similar to other cards but play very differently. It could misjudge a card’s function based on a small number of games. I don't have a good idea of how this would play out; it's possible the AI could be better at this than I'd expect.

These are all good points about what makes a Dominion AI difficult. But I believe the most important reason is missing:

6. By now, no one is willing to invest millions of dollars

I am pretty sure that if a couple of top players and a dozen brilliant guys from DeepMind worked together for a year, the resulting AI would beat us all.

Titandrake · « **Reply #7 on:** March 02, 2018, 04:27:29 am »

Okay, so I've seen a few threads about Dominion AI, which I have tried to ignore, but I don't think I can anymore.

I'm doing machine learning stuff right now, and in the past have messed around with game AIs. If you want you can read some blog posts I've written about the subject, here and here and here and here. None of those posts will be necessary, I'm linking them just to prove I have thought about this stuff before.

This is a subject where I can probably rant for a long time, and I'm not looking to rant, so my hot takes are

1. CouncilRoom has logged all games from Isotropic days, if you're interested I'm sure you could ask the current runner of CouncilRoom for the database.

2. I'm not that concerned by the number of starting states. It makes your problem harder, but is not a deal-breaker by itself. Chess / Checkers / Go only have 1 starting state, but it quickly branches into several different situations. If you take a state-of-the-art AI and throw it into the middle of a random Chess / Checkers / Go game, we'd expect it to perform well. A good AI should generalize to different game states - a good Dominion AI should do the same. That being said, it is definitely easier if you stick to a fixed board, but it isn't as interesting of a problem.

If anything I'm concerned more about the number of unique things cards do. Stuff like Young Witch or Landmarks or Events, where their mere existence drastically changes the landscape, and each one is super different. Game difficulty is guided by a mix of branching factor and ability to generalize across states.

3. Randomness is definitely an issue, but this doesn't have any relation to whether a Dominion bot can calculate an exact win chance. To be pedantic, it can, it's just not guaranteed to be accurate or to have low-variance estimates. But sometimes that's fine, as long as the right move is given higher value than all the wrong moves. Although this does affect search trees, I think credit assignment is a more important problem - it's a lot harder to attribute winning / losing moves when the game is random.

4. I'm not sure you need to learn all the small choices. I could see some handcoded heuristics taking you very far for that.

5. Basically I think 5 is the same as point 2.

AFAIK, the project Dan is on is focusing on Base-only. To me, that seems doable but also non-trivial. Superhuman performance in full Dominion sounds really, really hard. I think difficulty-wise, full Dominion is harder than multiplayer Texas Hold'Em, because there's so much fiddly stuff / uniqueness in all the different cards. I don't think this is a "12 ML researchers for 1 year" project. I think full Dominion is more like a 30-50 ML researchers project, and even with that many people, they only have about a 20% chance of doing it in a year.

Edit: realistically, the biggest thing stopping it is that there aren't enough people interested. The Computer Go community wasn't that big, but was around for long enough to make several very fast Go engines (~thousands of games/sec on a single CPU thread running highly optimized C) + hold Computer Go tournaments + get servers to support letting bots play against humans. The Dominion community is a lot newer and not as many people care about the problem.

Cave-o-sapien · « **Reply #8 on:** March 02, 2018, 11:47:51 am »

What is a realistic target skill level to shoot for, given the resources available?

crj · « **Reply #9 on:** March 02, 2018, 12:15:39 pm »

A useful summary of a number of important points.

I'd like to make a few minor observations. But first a big one: the reasons Dominion is harder for an AI to play than Go or Chess are very nearly exactly the reasons I enjoy playing it more.

Quote from: J Reggie on March 01, 2018, 06:06:27 pm

Lack of training data

A lack of training data is also an issue for humans, and it's not clear whether humans or AIs are more disadvantaged by the relative lack in Dominion's case. On the one hand, humans can intuit things AIs would have to deduce from data; on the other, an AI is better at generating reams of new training data quickly.

A major disadvantage AIs face is that Dominion is more prone to "Prisoner's Dilemma" situations than something like Chess. How can an AI assess whether a game will be a mirror, and/or whether it's going to have to fight to win the split on some pile, without anticipating how its opponent will play? And how will it learn to do things a human opponent won't correctly anticipate?

Conversely, I can see two other advantages AIs have which you didn't mention:

An AI can correctly evaluate probabilities in the blink of an eye. If it's playing the odds, it can play them right. Humans have to rely on coarser heuristics.

An AI will not be distracted by the card names. If some future expansion contained two cards called Turd and Diamond, a human would instinctively expect Diamond to be better than Turd before they've so much as glanced at what each does. Even if some strategy article demonstrated why Turds were better than Diamonds, it would be really hard for humans to shake their preconceptions.

ben_king · « **Reply #10 on:** March 02, 2018, 12:24:05 pm »

Quote from: crj on March 02, 2018, 12:15:39 pm

An AI can correctly evaluate probabilities in the blink of an eye. If it's playing the odds, it can play them right. Humans have to rely on coarser heuristics.

An AI will be using heuristics as well -- there's no feasible way even for a computer system to enumerate the possibilities in a game like Dominion. Even a much simpler game with less hidden information like No-Limit Texas Hold 'em is intractable for a computer to calculate exact strategies. Even the best chess bots use heuristics -- enumerating the game tree is impossible.

crj · « **Reply #11 on:** March 02, 2018, 12:54:59 pm »

Quote from: ben_king on March 02, 2018, 12:24:05 pm

An AI will be using heuristics as well

Agreed. That's why I said "coarser heuristics".

The related question of whether a human "teaches" a computer the heuristics it uses, or whether the AI derives them itself, is somewhat deeper and more interesting. As is the question of whether a human can understand any heuristics the AI comes up with.

Dingan · « **Reply #12 on:** March 02, 2018, 01:47:58 pm »

Does Dominion change too much and too rapidly for an AI to have time to develop? Or rather, could a neural network be built to be able to learn completely new mechanics that hadn't yet been programmed into it? My hunch is not generally, depending on how different the new mechanics are. There's 2 extremes: (1) "$4 | Action | This is just like Village but with some vanilla add-on", (2) "$4 | Action | Play a game of Go with your opponent; if you won, +$3". If needing to handle such drastically new mechanics requires a complete overhaul of the AI, then the problem is more than there not being enough motive for people now to build an AI; there is also the problem of having to update the AI every 12 months or so, which may or may not require a complete start-from-scratch overhaul, and may or may not require more than 12 months. Chess, Texas Hold Em, etc. have the luxury of a static ruleset.

dedicateddan · « **Reply #13 on:** March 02, 2018, 03:18:55 pm »

Dominion is a hard game. In the original AlphaGo paper, games are described by two parameters: the number of decisions, d, and the typical number of legal moves, b. For chess, d~80 and b~35, whereas for dominion, d~200 and b~10.

The decisions in dominion games are also quite different. In chess, each move is played on the same board, whereas in dominion, deciding which action to play is much different from deciding which card to buy or which card to trash. The difficulty of the problem is further increased by the presence of randomness and hidden information.

Fortunately, the state space can quickly be made tractable with a few simplifying assumptions. Instead of enumerating all possible states of a deck, the mean-field approximation can be used to describe the average card in a deck. For example, the starting deck produces an average of 0.7 coins/card and 3.5 coins/turn. With this model a payload function consisting of coins, buys, attacks, gains, cards trashed, and VP gained can be maximized, which allows for an analytical solution for most decisions.
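The starting-deck numbers can be reproduced with a few lines. This is only a sketch of the "average card" idea for coins, assuming a 5-card hand and counting only the basic Treasures; the function names are made up for illustration and nothing here models draw, actions or the rest of the payload function.

```python
from collections import Counter

# Coin values of the basic Treasures; every other card contributes 0 coins.
COIN_VALUE = {"Copper": 1, "Silver": 2, "Gold": 3}

def mean_coins_per_card(deck: Counter) -> float:
    """Average coin value of a card drawn at random from the deck."""
    total_cards = sum(deck.values())
    total_coins = sum(COIN_VALUE.get(card, 0) * count for card, count in deck.items())
    return total_coins / total_cards

def mean_coins_per_turn(deck: Counter, hand_size: int = 5) -> float:
    """Mean-field estimate: hand size times the average card."""
    return hand_size * mean_coins_per_card(deck)

starting_deck = Counter({"Copper": 7, "Estate": 3})
print(mean_coins_per_card(starting_deck))   # 7 coins over 10 cards
print(mean_coins_per_turn(starting_deck))   # five "average cards" per hand
```

Adding a Silver to the starting deck, for instance, changes the average card from 7/10 to 9/11 coins, which is the sort of comparison this model makes cheap.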

Of course, some decisions are hard. These questions are best approached through the lens of machine learning. Difficult single decisions, such as how to use a remodel, can be approached with Monte Carlo Tree Search (MCTS). Each possible option is enumerated, play continues, and the choice that led to the best outcome, on average, is selected. Parameters in the global model, such as the payload function, can be optimized through a series of bot tournaments on a particular board.
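The "enumerate each option, play on, keep the best average" step can be sketched as below. Strictly speaking this is flat Monte Carlo evaluation, the simplest building block of MCTS, with no search tree or exploration bonus. The Remodel options and their payoffs are invented toy numbers, and `toy_simulate` stands in for a real game simulator.

```python
import random
from typing import Callable, Dict, List, Tuple

def choose_by_rollouts(
    options: List[str],
    simulate: Callable[[str, random.Random], float],
    n_rollouts: int,
    rng: random.Random,
) -> Tuple[str, Dict[str, float]]:
    """Play each option out n_rollouts times and pick the best average result."""
    averages = {}
    for option in options:
        total = sum(simulate(option, rng) for _ in range(n_rollouts))
        averages[option] = total / n_rollouts
    best = max(averages, key=averages.get)
    return best, averages

# Toy stand-in for "play the rest of the game": invented average VP margins.
TOY_MEAN_MARGIN = {"Estate->Silver": 2.0, "Gold->Province": 5.0, "Copper->nothing": 0.0}

def toy_simulate(option: str, rng: random.Random) -> float:
    """Return a noisy final VP margin for one simulated game (toy model)."""
    return TOY_MEAN_MARGIN[option] + rng.uniform(-1.0, 1.0)

best, averages = choose_by_rollouts(list(TOY_MEAN_MARGIN), toy_simulate, 50, random.Random(0))
print(best, averages)
```

In a real bot the expensive part is `simulate`, since every rollout is a full game played by some default policy, which is why fast simulators matter so much.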

The greatest advantage that AI has in dominion is its speed of play. Being able to play multiple games a second means that an AI can check for patterns and interactions far faster than a human can. With sufficient computational power, "perfect" dominion play is possible, although in practice only finite resources are ever available. The trick then is to determine which simplifying assumptions can be made to construct a bot that plays "very good" dominion.

Titandrake · « **Reply #14 on:** March 02, 2018, 10:16:35 pm »

Quote from: Cave-o-sapien on March 02, 2018, 11:47:51 am

What is a realistic target skill level to shoot for, given the resources available?

My understanding is that the current Dominion AI projects are from a group of grad students working on it in their spare time.

These predictions have a habit of being very wrong, but I think a group of grad students working in their spare time have a solid chance of hitting Base-only level 50 pretty quickly, assuming that money strategies are enough to get there. If they work on it for months, they have a reasonable (~50%) chance of hitting level 55, which is about the point where I'd expect you'd need to learn non-trivial engines (the kind that only beat Smithy-BM by 1-2 turns.)

Provincial was able to get pretty far with genetic algorithms to learn buy rules + hardcoded play rules, and that included Intrigue and Seaside, so I do think it's definitely possible.

Quote from: Dingan on March 02, 2018, 01:47:58 pm

Does Dominion change too much and too rapidly for an AI to have time to develop? Or rather, could a neural network be built to be able to learn completely new mechanics that hadn't yet been programmed into it? My hunch is not generally, depending on how different the new mechanics are. There's 2 extremes: (1) "$4 | Action | This is just like Village but with some vanilla add-on", (2) "$4 | Action | Play a game of Go with your opponent; if you won, +$3". If needing to handle such drastically new mechanics requires a complete overhaul of the AI, then the problem is more than there not being enough motive for people now to build an AI; there is also the problem of having to update the AI every 12 months or so, which may or may not require a complete start-from-scratch overhaul, and may or may not require more than 12 months. Chess, Texas Hold Em, etc. have the luxury of a static ruleset.

This is another argument in favor of sticking to Base-only for a proof of concept. In principle, it is possible that a bot can get good at evaluating new cards. I believe Provincial showed this - the author of that bot implemented a custom card, and showed the genetic algorithm learned how to incorporate it into its strategy.

It just depends on how close it is to existing cards. You see a similar thing in card evaluations. In Cornucopia previews, people thought Jester was going to be nuts, but it turned out it was just good, not insane. So there was a lot of error there, but Jester was doing something very different from the things before it. In contrast, everyone knew Junk Dealer was going to be a good card, because they had experience with Upgrade, and Junk Dealer is pretty close to Upgrade effect-wise.

trivialknot · « **Reply #15 on:** March 02, 2018, 10:54:42 pm »

What Dominion AI projects currently exist?

How does the Lord Rattington AI work? How did the AI work in other Dominion clients?

FemurLemur · « **Reply #16 on:** March 03, 2018, 12:02:02 pm »

I've actually been working on utilizing Reinforcement Learning to make a strong Dominion AI. IMO, lack of training data is not an issue. I'm having it generate its own data using self-play. The sky's the limit there. If you're wanting an AI to be better than the best human Dominion player, you wouldn't want to take the Supervised Learning approach anyway. Then it picks up on our bad habits.

I don't personally think Randomness is an issue either. It can learn to make predictions about the hidden information and build a model of the expected State.

As for this idea that I've seen a couple people throw around that it would need to be prepared for every little possibility, that's really just not how Machine Learning works. We use ML so that it won't have to experience every possibility. The goal is that it becomes good at discovering the patterns indicative of a strong Action or State. If you combine a strong learned intuition with MCTS, it should be far better than humans at spotting and adjusting for edge cases.

It's not so much that the variety of cards or openings is too much conceptually, it's just that the amount of needed training time seems like it will be outrageous. That's the biggest struggle as far as I can tell. You can build something great that can become better than the best human in theory, but you're going to need a lot of clever tricks to speed up its rate of learning.

FemurLemur · « **Reply #17 on:** March 03, 2018, 12:09:59 pm »

Quote from: trivialknot on March 02, 2018, 10:54:42 pm

How does the Lord Rattington AI work?

DXV said recently that the best way to think of ShuffleIt is that it basically doesn't have AI right now. Rattington is a placeholder that's just meant to be better than nothing. The logic is probably a combination of making random choices, with some very minor heuristics thrown in (I'm talking things as simple as "Don't buy a curse")

Dingan · « **Reply #18 on:** March 04, 2018, 09:19:20 pm »

Someone please make an AI

ipofanes · « **Reply #19 on:** March 05, 2018, 04:09:26 am »

Quote from: dedicateddan on March 02, 2018, 03:18:55 pm

Dominion is a hard game. In the original AlphaGo paper, games are described by two parameters: the number of decisions, d, and the typical number of legal moves, b. For chess, d~80 and b~35, whereas for dominion, d~200 and b~10.

I would think b would be much smaller than that. Sure, you have more than 10 different cards to buy if you have the money, but most of the 200 decisions have a much lower b (discard the curse to Mountebank? discard or get $2 with Minion? upgrade the Disciple? discard Fool's Gold when opponent buys Province?). Remember that your estimation of b should roughly be a geometric mean, which means a dichotomous decision would have to be evened out by a decision of space 50.
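The reason it's a geometric mean: the size of the game tree is roughly the product of the branching factors at each decision, and b is the single number with b^d equal to that product. A quick sketch to check the "2 versus 50" example; the 150/50 split of decisions at the end is an invented illustration, not measured data.

```python
import math
from typing import Sequence

def effective_branching(branching_factors: Sequence[int]) -> float:
    """Geometric mean of per-decision branching factors (computed via logs)."""
    log_sum = sum(math.log(b) for b in branching_factors)
    return math.exp(log_sum / len(branching_factors))

# One yes/no decision is balanced out by one 50-way decision.
print(effective_branching([2, 50]))

# Invented mix: 150 yes/no prompts and 50 buy decisions with 30 options each.
print(effective_branching([2] * 150 + [30] * 50))
```

Under that invented mix the effective b comes out well below 10, which is the point: lots of two-way prompts drag the average down hard.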

JThorne · « **Reply #20 on:** March 06, 2018, 12:32:24 pm »

Quote

Provincial was able to get pretty far with genetic algorithms to learn buy rules + hardcoded play rules, and that included Intrigue and Seaside, so I do think it's definitely possible.

I agree. I still practice games against Provincial occasionally just to remind myself how good BM is sometimes. And I still occasionally lose if I try to get too fancy. But it's also a good reminder when a disciplined engine is dominant and to be patient with greening.

There are quite a few kingdoms where the best strategy really is a BM-variant, and it's quite good at those; interestingly, it peppers in several cards, not just a couple of terminal draws. The genetic algorithm plays so many games it can optimize what the best balance is for maximum points-per-turn. It also plays action/draw engines surprisingly well, and you'll find yourself buried under an avalanche of Torturer plays whenever it's in the kingdom with extra actions. It also seems to make a fairly reasonable decision about what to go for based on how much trashing is available, because all of the simulations will eventually show whether it's possible to thin quickly enough for the engine to fire reliably. It also happily plays slogs, easily learning that a strong attack will end the game on piles with plenty of duchies, curses and misery to go around. It also does pay attention to piles and piles out with a lead if it can.

What it doesn't do well are the fiddly cases. It doesn't ever play a rush, even in the presence of something like Ironworks/Gardens. It doesn't play for a megaturn/pileout, even in the presence of King's Court and Grand Market. It doesn't play a points engine (rather than greening) even in the presence of Goons or Monument.

However, I strongly suspect that what's going on there is that the programmers weren't high-level players to begin with, so they never explored these more advanced concepts (though the missing rush surprises me.) If the deck archetypes and play commands were updated to include these options as possible win states, I'm quite certain that the genetic algorithm would correctly identify when they were the fastest road to victory, and a good order and priority for buying cards.

What's interesting, of course, is that for any given kingdom, the Provincial AI genetic algorithm needs to "chew" on it for a while, playing hundreds of simulated games, before it settles on an optimal strategy. It can only play pre-randomized kingdoms. For a really great Shuffle-IT bot AI that could play any random kingdom, it would likely have to say "please wait, thinking" for ten minutes or so before the game started while it ran hundreds of simulations! (Totally worth it, by the way.)
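For anyone curious what that "chewing" might look like, here is a toy sketch. It is not Provincial's actual code: the solitaire money game, the four-card buy list, and every function name are invented for illustration, and it uses mutation only (no crossover) to stay short.

```python
import random

COST = {"Province": 8, "Gold": 6, "Duchy": 5, "Silver": 3}
COINS = {"Copper": 1, "Silver": 2, "Gold": 3}
POINTS = {"Estate": 1, "Duchy": 3, "Province": 6}
CARDS = list(COST)

def play_solitaire(priority, turns=20, rng=random):
    """Toy one-player game: each turn, buy the first affordable card in `priority`."""
    deck = ["Copper"] * 7 + ["Estate"] * 3
    rng.shuffle(deck)
    discard = []
    for _ in range(turns):
        hand = []
        for _ in range(5):
            if not deck:  # draw pile empty: reshuffle the discard pile
                deck, discard = discard, []
                rng.shuffle(deck)
            if deck:
                hand.append(deck.pop())
        coins = sum(COINS.get(card, 0) for card in hand)
        discard.extend(hand)
        for card in priority:
            if COST[card] <= coins:
                discard.append(card)
                break
    return sum(POINTS.get(card, 0) for card in deck + discard)

def fitness(priority, games=30, seed=0):
    """Average points over simulated games; same seed for every candidate."""
    rng = random.Random(seed)
    return sum(play_solitaire(priority, rng=rng) for _ in range(games)) / games

def mutate(priority, rng):
    """Swap two entries of the buy-priority list."""
    child = list(priority)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child

def evolve(generations=15, pop_size=12, seed=1):
    """Keep the better half each generation and refill with mutated copies."""
    rng = random.Random(seed)
    population = [rng.sample(CARDS, len(CARDS)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        population = survivors + [mutate(rng.choice(survivors), rng) for _ in survivors]
    return max(population, key=fitness)

best = evolve()
print(best, fitness(best))
```

Even this tiny version plays thousands of simulated games per run, which gives a feel for why a real bot with a full kingdom and actual opponents needs time per kingdom.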

Incidentally, a little-known feature of Provincial is a whole bunch of fan-cards. You can enable those by editing the .ini file manually, and you can run the Provincial kingdom-generating script for hours and it will generate dozens of new Kingdoms, many of which have cards that don't exist in the real game. Fun!

The bot plays some of them pretty well. Pauper (permanent duration, gain a copper to hand now and at the beginning of each turn) is a card it will happily buy with Gardens in the kingdom, for example. However, because it lacks the ability to know that it's a thing, it never plays a points-only starvation deck with Benefactor or Heiress or Promised Land (degenerate cards that Donald X would never make.) It's reasonably good at recognizing the power of Gambler and Haunted Village. It knows enough to avoid Cursed Land at any price. It will pull off surprise wins with Champion.

Oh, right. Champion. There are several fan-cards that were created for Provincial that predate many of the current expansions, so you'll see Champion and Squire and Hex and Ruins and they don't mean the same thing as the real cards. Just FYI.

I've run the generator and made about 400 kingdoms. If anyone wants to try it out and try playing the fan cards against the AI just for kicks, if you can't figure out the fiddly bits, let me know and I'll post a link to the files.

FemurLemur · « **Reply #21 on:** March 06, 2018, 02:25:37 pm »

Quote from: JThorne on March 06, 2018, 12:32:24 pm

What's interesting, of course, is that for any given kingdom, the Provincial AI genetic algorithm needs to "chew" on it for a while, playing hundreds of simulated games, before it settles on an optimal strategy. It can only play pre-randomized kingdoms. For a really great Shuffle-IT bot AI that could play any random kingdom, it would likely have to say "please wait, thinking" for ten minutes or so before the game started while it ran hundreds of simulations! (Totally worth it, by the way.)

I'd want to avoid this if I were them. Especially for an offline version/mobile app. I can't fault anybody for doing it that way, as it's an effective way to eliminate one of the biggest challenges. But I think that if it's possible to train a bot such that it learns an intuition for the game as a whole rather than needing to be trained for each specific Kingdom, I'd want to pursue that.

Pre-generating Kingdoms is inconvenient. I think casual players would think the wait time is unreasonable. If it's done server-side then I would think that'd get costly for ShuffleIt. If it's done client-side then you have issues with users who have slower hardware (imagine the wait on mobile). You can't even save yourself time on future calculations by storing a result in a database when you're done, because there are just too many possible starting positions in Dominion.
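For a sense of scale on the "too many starting positions" point, here is a back-of-the-envelope count. The pool of 300 kingdom cards is an assumed round number, not an official count, and the 10 minutes per kingdom is just the guess quoted above. It also ignores Events, Landmarks, Bane choices and the like, which would only make the number bigger.

```python
import math

ASSUMED_CARD_POOL = 300   # hypothetical round number, not an official count
KINGDOM_SIZE = 10

def kingdom_count(pool, size=KINGDOM_SIZE):
    """Number of distinct unordered kingdoms drawn from `pool` cards."""
    return math.comb(pool, size)

def years_to_precompute(pool, minutes_per_kingdom=10):
    """Time to pre-train every kingdom on one machine, in years."""
    minutes = kingdom_count(pool) * minutes_per_kingdom
    return minutes / (60 * 24 * 365)

print(f"{kingdom_count(ASSUMED_CARD_POOL):.2e} kingdoms")
print(f"{years_to_precompute(ASSUMED_CARD_POOL):.2e} machine-years at 10 min each")
```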

ConMan · « **Reply #22 on:** March 06, 2018, 05:30:13 pm »

I'd be fine with a play-against-bots option that had good AI but required you played on one of a few dozen fixed kingdoms, especially if game updates added new sets over time. I don't know how the AI is stored, but I assume you could keep it in a fairly compact format to make it manageable; and presumably you could also grab versions of the AI at different stages of training to provide different difficulties (but with the proviso that the "easier" bots are more likely to make weird moves).

Fuu · « **Reply #23 on:** March 06, 2018, 10:41:28 pm »

I think the main bottleneck is resources. Dominion is a complex game with many possible choices at each moment, but it can still be broken down into discrete decisions and measurable variables (including approximations, e.g. what my opponent can possibly have in hand). With a sufficiently detailed implementation and enough compute power to run it, and to run it against itself many, many times, you could in principle achieve something quite good. Such a model should be able to approximate decisions like whether to play an action if it will trigger a shuffle, whether to buy a province if there are only two remaining, and so on. It wouldn't be necessary for the system to learn a human-like intuition about how new cards would work (e.g. a hypothetical vanilla +6 cards) for it to be able to make sensible decisions using a finite set of cards it has learned how to play.
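As one example of such an approximation, the chance that the opponent's next hand contains a given card can be estimated with a hypergeometric count. This is only a sketch under a simplifying assumption: the hand is drawn from a freshly shuffled full deck, ignoring whatever is already known to be in the discard pile. The function name and the example numbers are hypothetical.

```python
import math

def prob_at_least_one(deck_size, copies, hand_size=5):
    """Chance a random hand from a shuffled deck holds at least one copy of a card."""
    if copies <= 0:
        return 0.0
    none = math.comb(deck_size - copies, hand_size)  # hands with zero copies
    total = math.comb(deck_size, hand_size)          # all possible hands
    return 1.0 - none / total

# Hypothetical opponent: 20-card deck containing 2 copies of an attack card.
print(prob_at_least_one(deck_size=20, copies=2))
```

A bot could feed a number like this into decisions such as whether a Moat or other reaction is worth holding.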

yed · « **Reply #24 on:** March 07, 2018, 11:30:36 am »

My guess is that it would be possible to create an AI that learns Dominion purely through self-play. They have done it with Dota 2:
https://blog.openai.com/dota-2/
