Dominion Strategy Forum


Author Topic: DeepMind for Dominion  (Read 1394 times)


segura

  • Minion
  • *****
  • Offline Offline
  • Posts: 648
  • Respect: +289
    • View Profile
DeepMind for Dominion
« on: April 24, 2020, 07:05:47 am »
0

I read yesterday that Google's self-learning AI project, which has been mainly known for excelling at chess and Go, is also pretty good at an RTS game like Starcraft.
My hunch is that this is far more complex than a turn-based game like chess, which features an average of roughly 30 to 35 legal moves per position and games of around 80 plies.

So if that AI is brilliant at dealing with the giant decision space of a real-time game with hidden information, shouldn't it be able to tackle a turn-based game with randomness like Dominion?

Is this feasible, is it likely that Google would be interested in that ... and would this hypothetical scenario ruin the game or actually imply a possible balancing tool for new cards (not that this is necessary)?
« Last Edit: April 24, 2020, 07:06:59 am by segura »
Logged

faust

  • Margrave
  • *****
  • Offline Offline
  • Posts: 2672
  • Shuffle iT Username: faust
  • Respect: +3737
    • View Profile
Re: DeepMind for Dominion
« Reply #1 on: April 24, 2020, 09:41:06 am »
+2

I imagine you can train an AI to play a particular kingdom really well. The trouble comes from variability in the setup. Chess and Go always have the same moves available; I don't know much about Starcraft, but I imagine that the pool of units is still significantly more limited than the pool of Dominion cards.

As neural networks have no way of "understanding" card texts, they could only "figure out" that two cards are similar through extensive training. Or I don't know, maybe you can write a more clever program that starts with a small number of cards, then checks new cards for similarities and uses its preexisting training to go on. So you could possibly do something, but it wouldn't be easy.
Logged
Since the number of points is within a constant factor of the number of city quarters, in the long run we can get (4 - ε) ↑↑ n points in n turns for any ε > 0.

pubby

  • Tactician
  • *****
  • Offline Offline
  • Posts: 418
  • Respect: +718
    • View Profile
Re: DeepMind for Dominion
« Reply #2 on: April 24, 2020, 08:37:49 pm »
+1

I'd assume machine learning could be used to estimate a general game plan (especially regarding attacks), but it wouldn't solve most of the problems with creating a Dominion AI. Mostly it'd just feed heuristics into traditional AI code. I think the more relevant problem is creating a fast way to estimate what a deck can do each turn, which AFAIK isn't a great fit for neural nets.
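To make the "estimate what a deck can do each turn" problem concrete, here is a minimal Monte Carlo sketch in Python. It only counts treasure coins in a random 5-card hand and ignores Action cards entirely, which is exactly the part that makes the real problem hard. The function name `estimate_hand_coins` and the `COIN_VALUE` table are illustrative assumptions, not taken from any existing Dominion simulator.

```python
import random

# Coin value of each card; non-treasures count as 0 coins (simplifying assumption).
COIN_VALUE = {"Copper": 1, "Silver": 2, "Gold": 3, "Estate": 0}

def estimate_hand_coins(deck, hand_size=5, trials=20000, seed=0):
    """Estimate the average coins in a random hand by repeated sampling."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        hand = rng.sample(deck, hand_size)  # draw without replacement
        total += sum(COIN_VALUE[card] for card in hand)
    return total / trials

# Standard starting deck: 7 Copper and 3 Estate.
starting_deck = ["Copper"] * 7 + ["Estate"] * 3
average = estimate_hand_coins(starting_deck)
# The exact expectation for this deck is 5 * 7/10 = 3.5, so the estimate should land close to that.
print(f"average coins per hand: {average:.2f}")
```

Sampling like this is cheap for money-only decks, but once villages and draw cards are involved each trial has to simulate play decisions, which is where a fast estimator gets difficult.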

i imagine if they were going to do this for a card game, they'd use MtG since that's been shown to be NP-complete ("does the game end", i believe, was their qualifier problem), whereas dominion (for all non-trivial play examples, ie, player A, player B both draw and discard 5 cards, next turn) should be P, provided none of the truly unbounded vp cards (Monument, Plunder, Chariot Race) are present.

additionally, if they programmed their bot with the goal of "end the game while you have the most points" rather than "get the most points", Dominion should be solvable in P
Huh I think you mean "turing complete" instead of "NP-complete"?
Logged

heron

  • Saboteur
  • *****
  • Offline Offline
  • Posts: 1040
  • Shuffle iT Username: heron
  • Respect: +1160
    • View Profile
Re: DeepMind for Dominion
« Reply #3 on: April 24, 2020, 09:15:51 pm »
0

i imagine if they were going to do this for a card game, they'd use MtG since that's been shown to be NP-complete ("does the game end", i believe, was their qualifier problem), whereas dominion (for all non-trivial play examples, ie, player A, player B both draw and discard 5 cards, next turn) should be P, provided none of the truly unbounded vp cards (Monument, Plunder, Chariot Race) are present.

additionally, if they programmed their bot with the goal of "end the game while you have the most points" rather than "get the most points", Dominion should be solvable in P
Huh I think you mean "turing complete" instead of "NP-complete"?

it's that too but given that each "knob" (whether the deck will run out, whether life will run out, etc) is unbounded (Gaea's Blessing, Lifelink, etc), you end up dealing with infinities, so given each deck is doing non-trivial things (draw a card, end your turn being the trivial play) it's impossible given two of all possible 60 card decks to say whether a game will end. ergo NP-complete.

This is not what NP-Complete means.
Perhaps you mean noncomputable? I don't really know anything about magic though.
Logged

ftl

  • Mountebank
  • *****
  • Offline Offline
  • Posts: 2056
  • Shuffle iT Username: ftl
  • Respect: +1336
    • View Profile
Re: DeepMind for Dominion
« Reply #4 on: April 24, 2020, 10:14:13 pm »
+1

At this point, with the number of different things AI's been made to do, I think it's highly likely that you could make an AI to do Dominion too.

"But at what cost" is the relevant question. A hobbyist tinkering at some code on the weekends - yeah, probably not. Google gives the project $50 million dollars and a team of AI PhDs? Sure, of course, I don't see why not.

I have no idea whether Google's "DeepMind" is the AI to do it or if you need a different one, I don't really have enough technical know-how about how exactly DeepMind works to judge how well it would fit Dominion.
Logged

crj

  • Saboteur
  • *****
  • Offline Offline
  • Posts: 1403
  • Respect: +1559
    • View Profile
Re: DeepMind for Dominion
« Reply #5 on: April 25, 2020, 12:43:30 pm »
0

Yeah, it feels like there are three separate AI challenges, here:
  • Learn the rules of the game
  • Learn how each card works
  • Learn to play well
You could sidestep the first two by baking the rules into the AI. Or you could bring in some natural-language-processing AI and feed it the rulebooks and card texts. Or you could have it figure them out by trial and error.

Baking in a strategy for playing well would clearly defeat the point of the exercise. And I'm assuming one would want it to figure out strategy by trial and error, rather than by knowing how to read the dominionstrategy wiki. (-8
Logged

ghostofmars

  • Moneylender
  • ****
  • Offline Offline
  • Posts: 160
  • Respect: +70
    • View Profile
Re: DeepMind for Dominion
« Reply #6 on: April 27, 2020, 05:39:32 am »
0

I guess it comes down to what level of AI you expect. I think a hobby project could come up with an AI that is challenging to the average Dominion player, perhaps even to the average player on this forum.

Coming up with an AI that is challenging to the best players would be considerably harder, and something on the level of AlphaZero (consistently better than the best professional players) is not something you can do without professional support.

Furthermore, I think supervised learning might give you results faster, because you narrow the scope of what an AI has to learn. You could e.g. have a categorizing AI judge whether an engine is viable and then have two separate AIs for Engine and Big Money. All these additional constraints will reduce the learning time, but also lead to a weaker AI because it cannot explore the whole space of the game.

One final point: Because the Dominion ruleset still frequently changes, you need to be careful not to paint yourself into a corner. If a new Way to play the game is released, how do you ensure your AI can cope?
Logged

segura

  • Minion
  • *****
  • Offline Offline
  • Posts: 648
  • Respect: +289
    • View Profile
Re: DeepMind for Dominion
« Reply #7 on: April 27, 2020, 09:25:36 am »
0

I wonder how the AI could deal with different setups. Starcraft has hidden information but the options are identical in all games. A self teaching AI might have a hard time to meta-learn (judge Kingdoms).
Logged

ghostofmars

  • Moneylender
  • ****
  • Offline Offline
  • Posts: 160
  • Respect: +70
    • View Profile
Re: DeepMind for Dominion
« Reply #8 on: April 27, 2020, 03:39:52 pm »
0

Actually the decision whether to play engine or BM is probably easier than I thought. You just let the AIs play a few games against each other, and the instance that performs best on average gets to decide. That still doesn't solve the problem of how to come up with a good engine player in the first place.

I wonder how the AI could deal with different setups. Starcraft has hidden information but the options are identical in all games. A self teaching AI might have a hard time to meta-learn (judge Kingdoms).
Well ultimately every card is a source of a limited set of resources (Action, Card, Buy, $, VP, ...). So the AI just needs to learn what the expected gain of playing a certain card is and cards with similar expected gains can replace each other.
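A rough sketch of that "cards are bundles of resources" idea, in Python. Each card is reduced to a vector of (+Actions, +Cards, +Buys, +$) using the printed values of a few base-set cards, and similarity is plain Euclidean distance. The `CARDS` table layout and the helper names are assumptions made for illustration, and the encoding deliberately ignores attacks, trashing and anything conditional.

```python
import math

# Resource vectors in the order (+Actions, +Cards, +Buys, +$) for a few base-set cards.
CARDS = {
    "Village":    (2, 1, 0, 0),
    "Smithy":     (0, 3, 0, 0),
    "Laboratory": (1, 2, 0, 0),
    "Market":     (1, 1, 1, 1),
    "Festival":   (2, 0, 1, 2),
    "Woodcutter": (0, 0, 1, 2),
}

def distance(a, b):
    """Euclidean distance between the resource vectors of two cards."""
    return math.dist(CARDS[a], CARDS[b])

def most_similar(card):
    """Return the other card whose resource vector is closest."""
    others = [name for name in CARDS if name != card]
    return min(others, key=lambda other: distance(card, other))

print(most_similar("Smithy"))    # Laboratory
print(most_similar("Festival"))  # Market
```

An encoding like this says nothing about cards whose effect is not a simple resource gain, so it would need extra features (or a learned embedding) before it could place something like Possession anywhere sensible.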

What I am most worried about would be play decisions like which card to topdeck with Harbinger or which Pawn choices you should take. The other thing that will be hard is exceptional cards that warp how the game is played (Tournament, Possession).
Logged

ftl

  • Mountebank
  • *****
  • Offline Offline
  • Posts: 2056
  • Shuffle iT Username: ftl
  • Respect: +1336
    • View Profile
Re: DeepMind for Dominion
« Reply #9 on: April 27, 2020, 10:49:29 pm »
0

I guess it comes down to what level of AI you expect. I think a hobby project could come up with an AI that is challenging to the average Dominion player, perhaps even to the average player on this forum.

I think at this point, there's just so. many. cards. that even a halfway competent AI is beyond the scope of a hobby project. There's just a lot of weird interactions that aren't hard to program but take time. Just implementing the rules of all the cards - and getting them to work right - is nontrivial. It's even more work than it seems because they have to be implemented in a way that they can be manipulated and reasoned about, even for simple internal questions like "if I play this card, will it trigger a reshuffle".
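Even the "will it trigger a reshuffle" question needs the game state in a queryable form. A minimal sketch in Python, assuming the engine can report the draw pile size and how many cards an effect will draw (both parameter names are hypothetical):

```python
def triggers_reshuffle(draw_pile_size, cards_to_draw):
    """True if drawing this many cards runs the draw pile out and forces a reshuffle.

    In Dominion you only shuffle when you need a card and the deck is empty,
    so drawing exactly the last card does not yet trigger a reshuffle.
    """
    return cards_to_draw > draw_pile_size

# Playing Smithy (+3 Cards) with 2 cards left in the draw pile forces a reshuffle.
print(triggers_reshuffle(draw_pile_size=2, cards_to_draw=3))  # True
# With exactly 3 cards left, Smithy draws them all without reshuffling.
print(triggers_reshuffle(draw_pile_size=3, cards_to_draw=3))  # False
```

The fixed-draw case is the easy one. Cards that draw a variable amount (Library draws until you have 7 cards in hand) or that reveal and discard cards need their own logic, which is where the implementation effort piles up.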

There is a heck of a lot of work that would have to go into an AI where the "AI" part of it is absolutely minimal, just a bot that picks one BM+X strategy per board.

And that stuff is "table stakes", so to speak. Price of entry, before you actually get to the interesting questions.
Logged

ftl

  • Mountebank
  • *****
  • Offline Offline
  • Posts: 2056
  • Shuffle iT Username: ftl
  • Respect: +1336
    • View Profile
Re: DeepMind for Dominion
« Reply #10 on: April 27, 2020, 11:04:21 pm »
+3

By the way, this thread led me down a rabbit hole of reading about how AlphaGo and AlphaZero work. I don't really know what resources are the best, but it was fun to google and read about, I would recommend anyone who thinks thinking about AI is cool go down that rabbit hole for a bit.

Really puts into perspective that for a real AI, questions like how to decide "BM or Engine" or "What card to topdeck with Harbinger" just aren't the sort of questions the AI designer is going to be thinking about, it's all about how to set up the problem so that ML techniques can be applied to learn the answer to *all* those decisions in the same way.
Logged

ghostofmars

  • Moneylender
  • ****
  • Offline Offline
  • Posts: 160
  • Respect: +70
    • View Profile
Re: DeepMind for Dominion
« Reply #11 on: April 28, 2020, 04:15:56 am »
0

I think at this point, there's just so. many. cards. that even a halfway competent AI is beyond the scope of a hobby project. There's just a lot of weird interactions that aren't hard to program but take time. Just implementing the rules of all the cards - and getting them to work right - is nontrivial. It's even more work than it seems because they have to be implemented in a way that they can be manipulated and reasoned about, even for simple internal questions like "if I play this card, will it trigger a reshuffle".
Sorry, I have not been clear. I was only talking about the AI part. Implementing all of Dominion would be a much longer task. If I were to attempt something like a Dominion AI, I would do a proof of principle with a selected set of cards (you probably want to include some events or landmarks, too). Then you might want to talk to Stef to see if you can realize the same with the actual game.

By the way, this thread led me down a rabbit hole of reading about how AlphaGo and AlphaZero work. I don't really know what resources are the best, but it was fun to google and read about, I would recommend anyone who thinks thinking about AI is cool go down that rabbit hole for a bit.

Really puts into perspective that for a real AI, questions like how to decide "BM or Engine" or "What card to topdeck with Harbinger" just aren't the sort of questions the AI designer is going to be thinking about, it's all about how to set up the problem so that ML techniques can be applied to learn the answer to *all* those decisions in the same way.
Well there is also the aspect that they wanted to show that it can be done without any guidance. I don't know if in general the approach to supply no input at all is the best. Of course, games are also the best case scenario, because the evaluation function is well defined (you either win or you don't).

At first I wanted to add something about how the state of a Dominion game is much more difficult to describe than that of Go, but I'm not so sure anymore. There are around 200 cards in any given game and you need 4~5 bits to store where a card is, plus ~100 bits to store the kingdom information. For Go you have a 19*19 board with 2 bits per point (three states: empty, white, black). So I would get 900~1100 bits for Dominion and ~720 for Go.
But then the decision space of Dominion is much smaller. For any given game state there are probably fewer than ~30 possible choices, whereas in Go you start off with 361 choices.
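The back-of-the-envelope numbers can be checked in a few lines of Python. The card count, bits per card and kingdom overhead are the rough estimates from this post, not measured values; for Go, three states per point fit into 2 bits, which is what gives the ~720 figure.

```python
import math

# Dominion estimate from the post: ~200 cards, 4-5 bits each for location, ~100 bits for the kingdom.
cards_in_game = 200
kingdom_bits = 100
dominion_low = cards_in_game * 4 + kingdom_bits
dominion_high = cards_in_game * 5 + kingdom_bits

# Go: 19 x 19 points, three states each, stored naively in 2 bits per point.
go_points = 19 * 19
go_bits_naive = go_points * 2
# Lower bound if the three states are packed optimally: log2(3) bits per point.
go_bits_packed = math.ceil(go_points * math.log2(3))

print(dominion_low, dominion_high)    # 900 1100
print(go_bits_naive, go_bits_packed)  # 722 573
```

Either way the two state descriptions are within a factor of two of each other, which supports the point that raw state size is not where Dominion and Go differ most.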

It is probably still harder to learn than Go though, because one bit flip in the game state (say replacing Pawn with Chapel) seems to have a much more severe impact than replacing one stone on the Go board (full disclosure, I did not play Go enough to check that this is true).
Logged

Sauter

  • Pawn
  • **
  • Offline Offline
  • Posts: 3
  • Respect: 0
    • View Profile
Re: DeepMind for Dominion
« Reply #12 on: April 28, 2020, 04:55:46 pm »
0

It is probably still harder to learn than Go though, because one bit flip in the game state...seems to have a much more severe impact than replacing one stone on the Go board

Hi, Go player of six years here. I get the point of what you're saying, but there are tons of times in an average game of Go where flipping a bit would change the outcome of the game. (I guess you can compare it to skipping someone's turn in a game of chess, since each non-empty space in Go was someone's turn).

As a human, I will say that Go has the steepest learning curve of any game, hands down. At the same time, I can see why learning to generalize to any given Dominion board would be hell for an AI.
« Last Edit: April 28, 2020, 05:02:00 pm by Sauter »
Logged

marksim

  • Ambassador
  • ***
  • Offline Offline
  • Posts: 34
  • Respect: +19
    • View Profile
Re: DeepMind for Dominion
« Reply #13 on: April 28, 2020, 09:05:23 pm »
0

My favorite notion re: AI is to petition dominion.games / @stef to allow a simple DSL to be created that gives rules for how to play the game given the presence of certain cards.  Then those scripts get loaded up and bots select them based on the cards present and the current win rates.  Every win from a bot based on the script gives a small (small!) ranking bonus to the uploader.  Losses do not give dings, so you are encouraged to try things out and upload strategies to play against, but also improve them over time.  Overall this improves bots AND improves players as they have to come up with nuanced rules and ways of systematically thinking about cards and how to win.
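As a sketch of what such a script could look like, here is a tiny rule list for plain Big Money written as Python data, plus the interpreter a bot might run over it. The rule format (minimum coins, card to buy) and the function `choose_buy` are entirely hypothetical; nothing like this is known to exist on dominion.games.

```python
# Hypothetical rule format: (minimum coins, card to buy), checked top to bottom.
BIG_MONEY = [
    (8, "Province"),
    (6, "Gold"),
    (3, "Silver"),
]

def choose_buy(rules, coins, supply):
    """Return the first card whose coin threshold is met and whose pile is not empty."""
    for min_coins, card in rules:
        if coins >= min_coins and supply.get(card, 0) > 0:
            return card
    return None  # buy nothing this turn

supply = {"Province": 8, "Gold": 30, "Silver": 40}
for coins in (2, 4, 7, 8):
    print(coins, "->", choose_buy(BIG_MONEY, coins, supply))
```

A usable DSL would also need conditions on the game state (cards in deck, piles remaining, score difference), but even a flat priority list like this is enough to express the BM+X style bots discussed earlier in the thread.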

I think that's the only way you're ever going to actually see an AI develop is if it has thousands of games as samples and thus the cooperation of dominion.games.
Logged

ftl

  • Mountebank
  • *****
  • Offline Offline
  • Posts: 2056
  • Shuffle iT Username: ftl
  • Respect: +1336
    • View Profile
Re: DeepMind for Dominion
« Reply #14 on: April 28, 2020, 11:00:58 pm »
0

It is probably still harder to learn than Go though, because one bit flip in the game state...seems to have a much more severe impact than replacing one stone on the Go board

Hi, Go player of six years here. I get the point of what you're saying, but there are tons of times in an average game of Go where flipping a bit would change the outcome of the game. (I guess you can compare it to skipping someone's turn in a game of chess, since each non-empty space in Go was someone's turn).

I don't know Go at all, but in chess it's pretty easy to see that "changing a bit" (the color of one piece) could trivially make the game change completely (change a black queen to a white queen!) ...or moving a piece by one square in one direction can pretty obviously make the difference between a clear win and a clear loss. I'm completely unsurprised that it's like that in Go too.

I think the fact that those sort of subtleties matter is pretty universal among interesting games. If they didn't, those games would quickly stop being interesting and would quickly become solved!
Logged

ghostofmars

  • Moneylender
  • ****
  • Offline Offline
  • Posts: 160
  • Respect: +70
    • View Profile
Re: DeepMind for Dominion
« Reply #15 on: April 29, 2020, 03:51:00 am »
0

I guess I fell for the psychological fallacy that you consider the things you know more about as having more depth than things you know less about (relevant xkcd https://xkcd.com/915/). Of course there are also boring bit flips in Dominion (Copper in-the-trash or not-bought).
Logged

() | (_) ^/

  • Minion
  • *****
  • Online Online
  • Posts: 632
  • Shuffle iT Username: p4ddy0d00rs
  • Nemo dat quod non habet.
  • Respect: +520
    • View Profile
    • BGG profile
Re: DeepMind for Dominion
« Reply #16 on: April 29, 2020, 08:33:34 am »
0

I guess I fell for the psychological fallacy that you consider the things you know more about as having more depth than things you know less about (relevant xkcd https://xkcd.com/915/). Of course there are also boring bit flips in Dominion (Copper in-the-trash or not-bought).

Oh, an interesting interpretation of that xkcd! I always took its meaning to be “everything is deep.”
Logged

popsofctown

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5473
  • Respect: +2840
    • View Profile
Re: DeepMind for Dominion
« Reply #17 on: May 24, 2020, 08:53:41 am »
+1

I think people are vastly underestimating (or rather, perhaps not estimating at all and "bypassing") the number of "calculator friendly" handicaps available to a hypothetical Dominion AI.  Stuff that it's just better to have a machine do.  The machine makes railroad better than John Henry because it runs on coal.

In Jeopardy, the advantage was buzzer pressing, for the demo they let the AI get the full advantage of seeing the permission light and reacting in .0001 seconds instead of the standard 0.7 seconds and always got first crack at a question so it could answer lots of questions wrong and still win.  In Starcraft, there was infinite micro, which would be the second greatest benefit that I'm aware of, the ability to spend essentially no time telling an individual unit to do something different from its peers.  Starcraft was a balance between exploiting that advantage and having a similar "god there's so many things for your grade school teacher to cover" dimension similar to Dominion. 
In an FPS with reduced character size it'd be so trivial to exploit those kinds of advantages that people don't write AIs to do it because it's too boring and easy.  The handicap is too huge.  That's assuming the design of the FPS is such that expert players miss a little bit or have to slow down to take their shots, of course, but with very much of either it becomes so huge.

Chess's AI friendly task is checkmate puzzle solving.  But the way chess works, the really hard part happens before that even comes up, one player generates strong advantage before mating puzzles start to really be presented. So mostly chess was a matter of making the AI smart enough to win, with no calculator based advantages at all.  I don't know go well but I expect it's a pretty similar story.  I think it's possibly kind of true that the amount of time the AI had to not lose before it could start checkmate-puzzling off of information advantages is more generous in chess than in go, that is possibly the meaningful way of looking at it.  It could just be that Go midgames are flat out harder, though.

What Dominion is gonna have is the greening phase, and I think that should be appreciated.  The decision to start buying Provinces, or to buy a final Gold knowing exactly how many times you'll draw it if you really sit down and think about it, the exact probabilities involved in Duchy dancing (with perfect deck tracking).  I'm sure our highest tier players have a vague idea of what's likely soon towards the end of a Dominion game, but I don't think they can consistently recite their opponent's remaining 8 cards before the next reshuffle, plus their own.  And it's almost totally about knowing what the next hands look like, not any of the big brain strategy of needing just one village mixed in with Ironmonger on X board because an expert intuits which drawing terminals are going to be the right purchases at the right time when hitting 5. It's just stuff like "yeah, if you actually counted up every single card, there's only an 87% chance your opponent can end the Duchy dance prematurely, not 92%, therefore you -should- actually buy that Spice Merchant and expect to nuts topdeck it for a Province-Duchy turn", or whatever, stuff like that.
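The kind of number a perfect deck tracker can produce is a plain hypergeometric probability. A minimal Python sketch, assuming the tracker knows how many cards remain before the reshuffle and how many of them are the card in question (the function name and parameters are made up for illustration, and the pile is assumed to hold at least a full hand):

```python
from math import comb

def prob_at_least(draw_pile_size, copies_in_pile, hand_size=5, at_least=1):
    """Hypergeometric chance that the next hand holds at least `at_least` copies of a card.

    Assumes draw_pile_size >= hand_size, i.e. no reshuffle happens mid-draw.
    """
    total = comb(draw_pile_size, hand_size)
    hits = sum(
        comb(copies_in_pile, k) * comb(draw_pile_size - copies_in_pile, hand_size - k)
        for k in range(at_least, min(copies_in_pile, hand_size) + 1)
    )
    return hits / total

# 12 cards left before the reshuffle, 2 of them Gold: chance the next hand sees at least one.
print(f"{prob_at_least(12, 2):.3f}")  # 0.682
```

Turning that into "can my opponent end the game next turn" means summing over whole hands rather than single cards, but it is the same counting, and it is the part a machine does without effort.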

And those kinds of advantages are going to be pretty big in a probabilistic game where the players should be expected to trade wins anyway, even just off brutish data gluttony on the timings of village genus cards and smithy genus cards and woodcutter genus cards at certain times on certain boards from reading tons of logs.

So, my expectation is, using that handicap, beating every single human should be -easier- than go (i don't want to say chess) because a significant subsidy exists, and the difficulty in memory based card counting + essentially discovering new game specific texas-hold-em odds for endgames exists.  It's still a jillions and jillions of dollars thing though.

tl;dr if you are up a knight in chess I don't think you need a ti84 to checkmate without throwing away your win 15% of the time, if you played more correctly enough to have a similar strength deck to your opponent but with a 5$ where he has a 4$ there's a brutish way even a ti 84 would give you an appreciable winrate bump, and I think that could be considered.  It overrides the subtle difference between what ironworks and smugglers truly mean I would think.  I could see how you might think there are just so many cards that can be classified in so many ways that maybe it would take forever and ever to program the AI to where it's never hardthrowing entire games over things like, "Oh, this board has transmute on it and that's a way to gain action cards, so the principle that Graverobber generally counts as a +buy has force in this kingdom".  But I think with the millions of dollars and as many people working on it as worked on chess, you eventually knock every single one of those bowling pins over, then you get into the range where the computer is like 5% worse than Stef and largely mirroring him, but doing perfect deck tracking stuff on the ends of games and winning 17 game series consistently.
« Last Edit: May 24, 2020, 08:58:25 am by popsofctown »
Logged

segura

  • Minion
  • *****
  • Offline Offline
  • Posts: 648
  • Respect: +289
    • View Profile
Re: DeepMind for Dominion
« Reply #18 on: May 24, 2020, 09:15:01 am »
0

Chess's AI friendly task is checkmate puzzle solving.  But the way chess works, the really hard part happens before that even comes up, one player generates strong advantage before mating puzzles start to really be presented. So mostly chess was a matter of making the AI smart enough to win, with no calculator based advantages at all.  I don't know go well but I expect it's a pretty similar story.  I think it's possibly kind of true that the amount of time the AI had to not lose before it could start checkmate-puzzling off of information advantages is more generous in chess than in go, that is possibly the meaningful way of looking at it.  It could just be that Go midgames are flat out harder, though.
I don't think that this is how chess engines work. They have mainly become better due to the increase in calculation speed and only partially due to better design of the evaluation function. That's where humans are still far better.

The impressive thing about AlphaZero is that the machine taught itself how to play chess well without any human guidance, and the resulting play is more human (e.g. less materialistic, sacrificing material for long-term positional advantages that are hard to evaluate and impossible to calculate to the end) than that of an ordinary chess engine.

Dominion seems far more tricky than a deterministic abstract like chess to me. It has stochastic elements, there are far more "pieces" and every game is different. I guess that DeepMind would have to play one Kingdom a zillion times over before it could move to the next one. Then it would have to learn to evaluate how the strength of a card changes during a game, partly depending on what the opponents do, and the "metagame", i.e. how the strength of a card varies among Kingdoms.
I guess it is possible but this seems like something a human mind can learn much faster, albeit less perfectly.
Logged

Wizard_Amul

  • Bishop
  • ****
  • Offline Offline
  • Posts: 113
  • Respect: +137
    • View Profile
Re: DeepMind for Dominion
« Reply #19 on: May 24, 2020, 01:06:03 pm »
0

We're one step closer to possibly using DeepMind for Dominion; the DeepMind team has been working on incorporating a time horizon into the AI, allowing it to put more weight on a longer-term payoff.
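In reinforcement learning, the weight given to a longer-term payoff is usually controlled by a discount factor, conventionally written gamma. A small Python sketch of the standard discounted return shows the effect; the reward sequence is a made-up stand-in for a game where the only reward is the win at the end, and it says nothing about how Agent57 itself is implemented.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards weighted by gamma**t, the standard discounted return."""
    return sum(reward * gamma ** t for t, reward in enumerate(rewards))

# A game-like reward signal: nothing for 29 steps, then a win worth 1.0 on the 30th step.
rewards = [0.0] * 29 + [1.0]

# A higher gamma keeps more of the distant reward visible from the first step.
for gamma in (0.9, 0.99, 0.999):
    print(gamma, round(discounted_return(rewards, gamma), 3))
```

With gamma at 0.9 the final win is almost invisible from the opening turns, while values close to 1 keep it nearly intact, which matters for a game like Dominion where early buys only pay off many turns later.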

DeepMind blog about the Agent57 AI: https://deepmind.com/blog/article/Agent57-Outperforming-the-human-Atari-benchmark

From the blog:
"We introduced the notion of a meta-controller that adapts the exploration-exploitation trade-off, as well as a time horizon that can be adjusted for games requiring longer temporal credit assignment."

"Time horizon: Some tasks will require long time horizons (e.g. Skiing, Solaris), where valuing rewards that will be earned in the far future might be important for eventually learning a good exploitative policy, or even to learn a good policy at all. At the same time, other tasks may be slow and unstable to learn if future rewards are overly weighted. This trade-off is commonly controlled by the discount factor in reinforcement learning, where a higher discount factor enables learning from longer time horizons."
Logged

ghostofmars

  • Moneylender
  • ****
  • Offline Offline
  • Posts: 160
  • Respect: +70
    • View Profile
Re: DeepMind for Dominion
« Reply #20 on: May 25, 2020, 10:11:47 am »
0

I think people are vastly underestimating (or rather, perhaps not estimating at all and "bypassing") the number of "calculator friendly" handicaps available to a hypothetical Dominion AI.  Stuff that it's just better to have a machine do.  The machine makes railroad better than John Henry because it runs on coal.
[...]
What you are referring to is also implemented for Chess and Go. You run the game for a few steps with your current evaluation function and then assess where you stand. I would not dare to say whether this gives you more benefit in Dominion compared to the other two games.

Dominion seems far more tricky than a deterministic abstract like chess to me. It has stochastic elements, there are far more "pieces" and every game is different. I guess that DeepMind would have to play one Kingdom a zillion times over before it could move to the next one.
In the end you have a function that maps a game state onto a given set of actions. I don't believe that there is a huge qualitative difference between the space of Dominion and the space of Go. Chess might indeed be a bit smaller.
Logged