Topic: What's stopping AI from mastering Dominion? (Read 11857 times)

Watno · « **Reply #25 on:** March 07, 2018, 01:02:33 pm »

While impressive, it's not true that that thing was just using self-play: https://blog.openai.com/more-on-dota-2/

yed · « **Reply #26 on:** March 13, 2018, 03:30:04 am »

I wonder if AlphaZero can learn Dominion.
It learned chess and Go purely through self-play; the only input was the rules.
The question is whether randomness and hidden information are a limit for AlphaZero.
https://en.m.wikipedia.org/wiki/AlphaZero
I have not read the paper, it would take me a lot of time to understand that...

https://web.stanford.edu/~surag/posts/alphazero.html
https://www.nature.com/articles/nature24270.epdf?author_access_token=VJXbVjaSHxFoctQQ4p2k4tRgN0jAjWel9jnR3ZoTv0PVW4gB86EEpGqTRDtpIz-2rmo8-KG06gqVobU5NSCFeHILHcVFUeMsbvwS-lxjqQGg98faovwjxeTUgZAUMnRQ

FemurLemur · « **Reply #27 on:** March 13, 2018, 09:50:24 am »

Quote from: yed on March 13, 2018, 03:30:04 am

I wonder if AlphaZero can learn Dominion.
It learned chess and Go purely through self-play; the only input was the rules.
The question is whether randomness and hidden information are a limit for AlphaZero.
https://en.m.wikipedia.org/wiki/AlphaZero
I have not read the paper, it would take me a lot of time to understand that...

The common element uniting the three different games that AlphaZero learned (Chess, Go, and Shogi) is that they are all played on square boards where only one piece can inhabit a space at a time. Chess is played on an 8x8 board, Go on a 19x19, and Shogi on a 9x9. From the relevant part of the paper:

Quote

The input to the neural network is an N × N × (MT + L) image stack that represents state
using a concatenation of T sets of M planes of size N × N

It seems like the optimal way to represent a Dominion board is fundamentally not an N × N plane. There are issues to contend with, like how the order of cards matters in your deck and your In-Play area, but not in your hand, discard, the trash, or your tavern mat. Additionally, the order of Kingdom card piles does not matter, but the order of the cards within those piles can matter (Knights and Split Piles; and the order of Split Pile cards can't even be assumed, thanks to Encampment as well as Ambassador shenanigans).

Also, the Deck and In-Play area have no theoretical size limit. There is no official rule stating that you can't have more than X cards in your deck or in play (and such a rule will probably never exist, because it kinda goes against the spirit of Dominion). AlphaZero needs to be trained on a single board size at a time. The only way to get around that would be to calculate the theoretical biggest deck you could ever construct. Such a setup would probably include Black Market, Rats, Young Witch (because she adds Bane cards), a Looter (for Ruins), Urchin (for Mercenary), Hermit (for Madman), a card that gains Spoils, a Potion cost, etc. The end result will be some number one or two orders of magnitude bigger than the board sizes that AlphaZero has played on, which would make it inconvenient to represent to the AI. Consider for a moment that it will almost never build a deck anywhere near as big as the biggest possible deck, yet we need it to be able to do so just to keep it from breaking should it ever encounter such a thing. The "image stack" of Dominion is not so basic, so movement from one "space" on the Dominion "board" to another can't generally be represented on a grid the way AlphaZero does it.
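
To make the representation problem concrete, here is a rough sketch of how the zones might be encoded differently from a grid. Everything in it is an assumption for illustration: the tiny card list `CARDS`, the cap `MAX_DECK`, and the function names are all made up, and none of this is how AlphaZero actually builds its input. Unordered zones become per-card counts, while ordered zones become a padded sequence, which is exactly where the "biggest possible deck" cap comes in.

```python
from collections import Counter

# Hypothetical card vocabulary for one kingdom (illustrative, not a real full set).
CARDS = ["Copper", "Silver", "Gold", "Estate", "Duchy",
         "Province", "Curse", "Village", "Smithy"]
INDEX = {name: i for i, name in enumerate(CARDS)}
MAX_DECK = 12  # assumed cap for ordered zones; the real game has no such limit
PAD = -1       # filler for unused sequence slots


def encode_unordered(zone):
    """Order does not matter (hand, discard, trash): one count per card."""
    counts = Counter(zone)
    return [counts.get(name, 0) for name in CARDS]


def encode_ordered(zone, max_len=MAX_DECK):
    """Order matters (deck, in-play): a padded sequence of card indices."""
    if len(zone) > max_len:
        raise ValueError("zone exceeds the assumed fixed size")
    ids = [INDEX[name] for name in zone]
    return ids + [PAD] * (max_len - len(ids))


def encode_state(hand, discard, deck):
    """Concatenate the zone encodings into one fixed-length vector."""
    return encode_unordered(hand) + encode_unordered(discard) + encode_ordered(deck)
```

Note that the fixed length only works because of `MAX_DECK`; a deck longer than the cap simply raises an error, which is the "breaking" case mentioned above.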

Now, I don't know how important the grid structure actually is to AlphaZero's success. For all I know, it could be the case that the AlphaZero methodology can handle Dominion if only it's reworked to be able to represent the inputs properly (though personally, I doubt it). But the point is that AlphaZero is not currently equipped to handle such a game, and it would take a deliberate effort on DeepMind's part to retool it significantly.

DG · « **Reply #28 on:** March 13, 2018, 10:34:22 am »

Quote from: FemurLemur on March 13, 2018, 09:50:24 am

The common element uniting the three different games that AlphaZero learned (Chess, Go, and Shogi) is that they are all played on square boards where only one piece can inhabit a space at a time. Chess is played on an 8x8 board, Go on a 19x19, and Shogi on a 9x9. From the relevant part of the paper:

For a while now I've been working on a pet project creating a 2D Dominion style game and the 2D aspect increases the (human) thinking time considerably, even just using a 5x5 area. I do mean a considerable increase in thinking time. Dominion essentially has two 1D sequences to worry about: the deck and the play area (although that can branch). Dominion should be simpler to evaluate than 2D games until these sequences get long, but when the sequences get long they will take a lot of evaluation.

As I mentioned before, when you integrate different decisions, such as changing card play to suit buying, the complexity increases as you are not just evaluating two decisions in series, you are evaluating card play and buy decisions in parallel and that can increase computation time significantly.

jonaskoelker · « **Reply #29 on:** March 18, 2018, 04:01:04 pm »

Quote from: FemurLemur on March 06, 2018, 02:25:37 pm

Pre-generating Kingdoms is inconvenient. I think casual players would think the wait time is unreasonable. [...] You can't even save yourself time on future calculations by storing a result in a database when you're done, because there are just too many possible starting positions in Dominion.

One obvious idea, if you have a reasonably computationally efficient client with some storage (so probably not a web browser), is that for the first 10 minutes after you create your account you can only play against humans, while your client is "training" an AI for a random kingdom.

After that, whenever you play a random kingdom, it pulls a kingdom with an AI out of storage, and generates a new kingdom-with-a-trained-AI while you play.

Then you would only incur a 10-minute wait if you wanted to throw out (or save-for-later) the generated kingdom and instead play a different one. You could mitigate this if you can plan one game ahead, and ask for the next random kingdom to be sampled from some different distribution than the one you pull out of storage.

If you only play some manageable number of kingdoms, e.g. full random, all single-set randoms, 5/5 two-set kingdoms across all expansion pairs (a la "recommended kingdoms"), and perhaps a short list of kingdoms of your own design (let's say 100-200 distributions of random kingdoms in total), it shouldn't be too onerous to keep that many pre-trained AIs around. At 10 minutes to train an AI, that's 33h 20m to train a full set of 200 AIs, or the first four nights after you sign in. Until then, you'll have to make do with only full random.
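
As a quick check of that estimate, taking the upper end of the range (200 distributions) and the 10-minute figure from above; the variable names are just for illustration:

```python
from datetime import timedelta

MINUTES_PER_AI = 10  # training time per kingdom distribution, from the post
NUM_AIS = 200        # upper end of the 100-200 range

total = timedelta(minutes=MINUTES_PER_AI * NUM_AIS)
hours, remainder = divmod(int(total.total_seconds()), 3600)
minutes = remainder // 60
print(f"{hours}h {minutes}m")  # 33h 20m
```

Spread over four nights, that is a bit over eight hours of training per night.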

Such a system is of course far from ideal, but if the reality is that you will lose the user if you compute while they're waiting, you should be creative about computing while they're not waiting.
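
A minimal sketch of the "pull one from storage, train the replacement while you play" idea, under stated assumptions: `train_ai` is a stand-in for the real 10-minute training step, and `KingdomCache` and its method names are hypothetical, not part of any existing client.

```python
import queue
import threading


def train_ai(kingdom):
    """Placeholder for the ~10-minute training step; returns a dummy 'AI'."""
    return f"ai-for-{kingdom}"


class KingdomCache:
    def __init__(self, sample_kingdom):
        self._sample = sample_kingdom  # callable returning a random kingdom
        self._ready = queue.Queue()    # trained (kingdom, ai) pairs

    def refill(self):
        """Train one AI for a freshly sampled kingdom and store it."""
        kingdom = self._sample()
        self._ready.put((kingdom, train_ai(kingdom)))

    def refill_in_background(self):
        """Run refill() on a worker thread so the player never waits on it."""
        worker = threading.Thread(target=self.refill, daemon=True)
        worker.start()
        return worker

    def next_game(self):
        """Pull a pre-trained kingdom and start training its replacement."""
        pair = self._ready.get()  # blocks only if nothing is stored yet
        self.refill_in_background()
        return pair
```

The first `refill()` corresponds to the initial 10-minute wait after account creation; after that, `next_game()` only blocks if you start games faster than replacements can be trained.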

jonaskoelker · « **Reply #30 on:** March 18, 2018, 04:31:55 pm »

Quote from: FemurLemur on March 13, 2018, 09:50:24 am

AlphaZero needs to be trained on a single board size at a time. The only way to get around that would be to calculate the theoretical biggest deck you could ever construct. Such a setup would probably include [...]

I posted a puzzle, "Set up the longest sequence of known top-decked cards", which turned out to amount to the question "what's the largest number of cards there can be in the game?" once you have discovered how to topdeck all your cards.

Somewhere in the thread, I wrote a list of kingdom cards which add the largest number of cards to the game. Big winners are Black Market, Exorcist (+31) and Page/Peasant (+20). Many of the cards which add more than the baseline 10 can be shoved into the Black Market deck and still do their thing.

Drecon · « **Reply #31 on:** March 19, 2018, 07:50:29 am »

Quote from: FemurLemur on March 03, 2018, 12:09:59 pm

Quote from: trivialknot on March 02, 2018, 10:54:42 pm
How does the Lord Rattington AI work?

DXV said recently that the best way to think of ShuffleIt is that it basically doesn't have AI right now. Rattington is a placeholder that's just meant to be better than nothing. The logic is probably a combination of making random choices, with some very minor heuristics thrown in (I'm talking things as simple as "Don't buy a curse")

As far as I can tell from my limited experience with Rattington, he looks at the deck as a whole and calculates a total score for the deck. Then he looks at his options and chooses the one that leads to a deck with the highest relative score.
It seems to be a relatively simple min-max algorithm as far as I can tell (with maybe some randomness thrown in, but not a lot).
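
A guess at what that could look like in code. To be clear, this is only a sketch of the behaviour described above, not Rattington's actual logic: the weights in `CARD_VALUE` are invented, and as written it is a one-step greedy choice rather than a full min-max search.

```python
# Hypothetical per-card weights; the real bot's values (if any) are unknown.
CARD_VALUE = {"Copper": 1, "Silver": 3, "Gold": 6,
              "Estate": 0, "Province": 8, "Curse": -5}


def deck_score(deck):
    """Average card value, so adding a weak card lowers the score."""
    return sum(CARD_VALUE.get(card, 0) for card in deck) / len(deck)


def choose_buy(deck, options):
    """Return the option whose resulting deck scores highest.

    An option of None means "buy nothing" and leaves the deck unchanged.
    """
    def resulting_score(card):
        return deck_score(deck if card is None else deck + [card])

    return max(options, key=resulting_score)
```

With a starting deck of 7 Copper and 3 Estate, `choose_buy(deck, [None, "Silver", "Curse"])` picks Silver, and `choose_buy(deck, [None, "Curse"])` picks None, which matches the "Don't buy a curse" kind of heuristic quoted above.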

(And yes, I made an account on this forum pretty much just to say this. I hope there's enough going on on the forums for me to stick around.)

weesh · « **Reply #32 on:** March 19, 2018, 09:16:42 am »

Quote from: Drecon on March 19, 2018, 07:50:29 am

(and yes, I made an account on this forum pretty much just to say this.

welcome!

Fuu · « **Reply #33 on:** March 19, 2018, 09:53:21 am »

I think we should submit a research grant application to build this.
