Simulation / Idea for an AI - state evaluation
« on: August 15, 2013, 12:10:23 pm »
A big problem for any attempted Dominion AI is that there are a lot of possible options on each turn, and it's hard to tell what's the best one. This is particularly true of Buy options; generally it's fairly easy to play a given hand in the best way. Since playing is relatively easy, the rest of this discussion will assume that the AI has good play rules and will focus on buy phase decisions.
My idea is an AI that imagines adding one card to its deck and estimates its likelihood of winning from the resulting state. It does this for each card it could buy (you can probably prune some candidates; for instance, why consider Copper when you can afford Silver?) and then buys the card that gives it the highest likelihood of winning. Now, how does it evaluate a state, real or hypothetical? To evaluate any game state, it uses some (basic) strategy to play out the rest of the game, repeated enough times to average out shuffle luck, and takes its share of wins as its probability of winning. For instance, the play-out strategy could just be to follow BMU (Big Money Ultimate) buy rules for the rest of the game from whatever deck it now has. It also assumes that the opponent uses the same play-out strategy from its current deck.
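To make the idea concrete, here is a rough sketch of the buy decision as repeated play-outs. Everything in it is an assumption for illustration: the card set is treasures and victory cards only, `big_money_buy` is a crude stand-in for BMU rather than the real bot, hands are drawn by sampling 5 cards from the whole deck (no draw/discard cycling), and names like `playout`, `win_probability` and `choose_buy` are made up.

```python
import random

# Toy card data: treasures and victory cards only (an assumption of this sketch).
COST = {"Copper": 0, "Silver": 3, "Gold": 6, "Estate": 2, "Duchy": 5, "Province": 8}
COINS = {"Copper": 1, "Silver": 2, "Gold": 3}
VP = {"Estate": 1, "Duchy": 3, "Province": 6}

def big_money_buy(coins):
    """Hypothetical stand-in for BMU: Province at 8+, Gold at 6+, Silver at 3+."""
    if coins >= 8:
        return "Province"
    if coins >= 6:
        return "Gold"
    if coins >= 3:
        return "Silver"
    return None

def playout(my_deck, opp_deck, provinces_left, rng, max_turns=40):
    """Finish one game with both players using big_money_buy; the opponent moves next.
    Returns 1.0 for a win, 0.5 for a tie, 0.0 for a loss."""
    decks = [list(opp_deck), list(my_deck)]
    for turn in range(2 * max_turns):
        if provinces_left <= 0:
            break
        deck = decks[turn % 2]
        hand = rng.sample(deck, 5)  # simplification: needs >= 5 cards, no reshuffle tracking
        coins = sum(COINS.get(card, 0) for card in hand)
        card = big_money_buy(coins)
        if card is not None:
            deck.append(card)
            if card == "Province":
                provinces_left -= 1
    opp_vp = sum(VP.get(card, 0) for card in decks[0])
    my_vp = sum(VP.get(card, 0) for card in decks[1])
    if my_vp > opp_vp:
        return 1.0
    return 0.5 if my_vp == opp_vp else 0.0

def win_probability(my_deck, opp_deck, provinces_left, n=200, seed=0):
    """Average result over n play-outs; the same seed is reused for every candidate."""
    rng = random.Random(seed)
    return sum(playout(my_deck, opp_deck, provinces_left, rng) for _ in range(n)) / n

def choose_buy(my_deck, opp_deck, coins, provinces_left, n=200):
    """Try each affordable card (or None for no buy) and keep the best estimate."""
    candidates = [c for c, price in COST.items()
                  if price <= coins and (c != "Province" or provinces_left > 0)]
    candidates.append(None)

    def value(card):
        new_deck = my_deck + ([card] if card else [])
        left = provinces_left - (1 if card == "Province" else 0)
        return win_probability(new_deck, opp_deck, left, n)

    return max(candidates, key=value)
```

For example, with two identical starting decks, $8 in hand and one Province left, `choose_buy` should pick the Province: taking it ends the game with a 6 VP lead, so that candidate scores exactly 1.0. A real version would need proper deck cycling, Action cards and all three game-end conditions.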
This approach should produce some strategy on any board, since it does not depend on which cards are available. BMU is an obvious play-out strategy because it's a baseline available on any board, although other strategies are possible, including more complex ones. The same approach can be used for difficult play decisions: imagine playing out the hand several ways with a random deck and pick the play which is best according to some evaluation (possibly most $, but also possibly the play which yields the best deck state by some evaluation; this is important for TfB (trash-for-benefit), as you may want to Remodel a Gold or something like that). On a turn with multiple buys, it can just run the buy decision several times, with the available $ decreasing on each iteration. Here it is important to allow the bot to buy nothing, as it could otherwise end up with a bunch of Copper just because buying Copper was better than buying Curse.
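The multiple-buy loop could look something like the sketch below. It is only an illustration: `choose_buys` is a made-up name, the evaluator is passed in as a function so any state evaluation can be plugged in, and the `density` evaluator (average coin value per card) is a toy stand-in, not a win probability. The key point is that "buy nothing" is the baseline each candidate has to beat.

```python
def choose_buys(deck, coins, buys, cost, evaluate):
    """Greedy multi-buy: repeat the single-card decision, lowering coins each time.
    `evaluate(deck)` is any scoring function where higher is better."""
    deck = list(deck)
    bought = []
    for _ in range(buys):
        # Baseline is the current deck, i.e. buying nothing with this buy.
        best_card, best_value = None, evaluate(deck)
        for card, price in cost.items():
            if price <= coins:
                value = evaluate(deck + [card])
                if value > best_value:
                    best_card, best_value = card, value
        if best_card is None:
            break  # nothing affordable improves the deck, so stop buying
        deck.append(best_card)
        bought.append(best_card)
        coins -= cost[best_card]
    return bought

# Toy data and evaluator for the example (assumptions, not real card balance).
COST = {"Copper": 0, "Curse": 0, "Silver": 3}
VALUE = {"Copper": 1, "Silver": 2, "Gold": 3}

def density(deck):
    """Average coin value per card."""
    return sum(VALUE.get(card, 0) for card in deck) / len(deck)

print(choose_buys(["Copper"] * 7 + ["Estate"] * 3, 5, 2, COST, density))
print(choose_buys(["Gold"] * 5, 2, 1, COST, density))
```

With the starting deck and $5, the toy evaluator takes Silver and then a Copper; with an all-Gold deck and $2 it buys nothing, because both Copper and Curse would lower the score. Without the baseline it would have been forced to pick Copper over Curse.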
A major downside of this approach is that it will likely miss some good strategies. For instance, since Ironworks isn't good for BMU, it would likely ignore something like Ironworks-Gardens if BMU were its play-out strategy. I would also be very interested in seeing what kinds of decks it builds. It may skip Villages entirely because they don't help BMU, but it may also start by picking up some terminals like Smithy and Militia, then discover that it should add a Village, and so drift into some sort of engine. It might also completely underrate +Buys for similar reasons.
A nice upside of this approach is that it will likely follow PPR (the Penultimate Province Rule) automatically, since playing out the state after buying a Province when there are two left already accounts for what the opponent can do with the last one.