Topic: Another approach: metric on games (Read 8526 times)

Empathy · « **on:** September 13, 2011, 10:50:40 pm »

New to the forum, but I was wondering if anyone had explored the following possibility in addition to simulations.

We have huge amounts of played games on the council room server. I don't know the exact amount, but I would assume a few millions.

Now there are many many more combinations of kingdoms than that. That being said, many kingdoms look alike: say, you group together most of the villages into a class, or disregard weak cards in a pool where a stronger strategy clearly dominates.

So, if we could manage to group games together by how close they are, it would probably be possible to "simulate" the results of a game just by hashing average scores statistically from previous games that looked similar enough.

There are clearly two problems with this approach, however. First, the grouping method must be both convincing and lead to a sufficient amounts of games to allow for useful statistical evaluations.

One possibility is to just define a metric, a distance, between two games. Then, given a set of kingdom cards (possibly never played before), you could then search for the, say, 1000 closest games given that metric and use those for your statistics and sample games.

Now defining a metric between games seems a daunting task of its own, given that a lot of (somewhat arbitrary) factors could go into it. Fine-tuning the metric could take a fair bit of time, though a good human player should be able to spot if the results "make sense". Lastly, the ideal case would be to have an algorithm that could search for "the closest games" in real time.

I have a rather simplistic metric in mind that would fulfill some of those criteria, though I would first like to hear reactions and propositions before going into any more details.

The code I have is a very crude R script, but it runs nearly in real time for a small database of 100k games. The results are somewhat convincing but the metric clearly has to be tweaked.

rrenaud · « **Reply #1 on:** September 13, 2011, 11:20:01 pm »

Nearest neighbor doesn't scale that well.

If you consider partial game states (eg, player A's turn on game y), then there are about 80 million game states (2 M games * 40 player turns/game) on councilroom. I love you guys, but I sure as hell am not going to be scanning 80M rows very often ;P.

The metric is everything and it's hard to come up with a good one by hand. If you are going to learn the metric, why not just learn the function directly?

But yeah, I think that learning the chance of winning as a function of the game state is a super interesting/useful thing to do. I'll happily give you shell access to councilroom to help you experiment.

Alternatively, you could try to beat the algorithms on the dataset here:

http://mlcomp.org/datasets/872

Though that is doing classification (predict winner) rather than regression (predict probability to win).

rspeer · « **Reply #2 on:** September 14, 2011, 01:10:29 am »

Quote from: rrenaud on September 13, 2011, 11:20:01 pm

Alternatively, you could try to beat the algorithms on the dataset here:

http://mlcomp.org/datasets/872

Though that is doing classification (predict winner) rather than regression (predict probability to win).

WHAT! There are people who actually do this for their research working on this? Why am I not just using their code to evaluate a deck?

Predicting the winner and estimating the probability of winning are basically interchangeable. We should be using the heck out of those algorithms.

Jimmmmm · « **Reply #3 on:** September 14, 2011, 01:25:18 am »

I'm somewhat intrigued by the notion of a metric on games. I suppose one approach would be to start with a metric on cards. For example, d(Village, Village) = 0, d(Village, Working Village) = 1, d(Village, Possession) = something large? ~~Of course, manually deciding on the difference between each pair of cards is impossible, as 129 cards means there are 128! ~ 3.9 * 10^215 pairs of cards.~~ (Edit: there are 8256 pairs of cards, so not an impossible task but still a lot of work.) A simpler approach, perhaps, would be to group them, and say that the distance between any card and another in the same group is 1, and then decide on differences between groups. I have a feeling, though, that many cards will be too unique to fit into any one group. I suppose that some 1-card groups would be okay...

DStu · « **Reply #4 on:** September 14, 2011, 01:39:58 am »

Quote from: Jimmmmm on September 14, 2011, 01:25:18 am

Of course, manually deciding on the difference between each pair of cards is impossible, as 129 cards means there are 128! ~ 3.9 * 10^215 pairs of cards.

It's not really so bad. Number of pairs is just 129*128/2 = 8256, not factorial. That's number of permutations.

rrenaud · « **Reply #5 on:** September 14, 2011, 02:01:58 am »

The classification vs regression problem is at least theoretically interesting.

Consider that if you gave me an oracle that correctly answered "this player is the most likely to win this game" with 100% accuracy, I still can't directly build a good AI with that. The problem is that even if one player has a comfortable margin of victory (say, 65%), no matter what they they do (could push their chance to win down to 55%), the perfect oracle will still tell them they are going to win. Likewise, their adversary can't actually use the oracle to pick a move to optimize his chance of winning, because regardless of what he does, the oracle will say he still has a losing position (even if he increased chance to win from 35% to 45%).

In practice, boosting (which has the best accuracy on my dataset according to the subset of algs I ran on mlcomp) actually gains classification accuracy but has poorly calibrated scores. http://www.cs.cornell.edu/~caruana/niculescu.scldbst.aistats04.draft.ps

Jimmmmm · « **Reply #6 on:** September 14, 2011, 02:03:02 am »

Oh. You are right of course. There are 128+127+...+1 pairs, not 128*...*1. Silly me!

rspeer · « **Reply #7 on:** September 14, 2011, 02:08:53 am »

Ah, now I see it's rrenaud and one other person. Not that that diminishes it, but I thought there was some Netflix-esque competition going on where a lot of people were predicting the winner of Dominion games. I do plan to take a look at your code now that I know it's there.

Is this what you do for your research, rrenaud?

Empathy · « **Reply #8 on:** September 14, 2011, 10:11:18 am »

Quote from: rrenaud on September 13, 2011, 11:20:01 pm

Nearest neighbor doesn't scale that well.

If you consider partial game states (eg, player A's turn on game y), then there are about 80 million game states (2 M games * 40 player turns/game) on councilroom. I love you guys, but I sure as hell am not going to be scanning 80M rows very often ;P.

The metric is everything and it's hard to come up with a good one by hand. If you are going to learn the metric, why not just learn the function directly?

But yeah, I think that learning the chance of winning as a function of the game state is a super interesting/useful thing to do. I'll happily give you shell access to councilroom to help you experiment.

Alternatively, you could try to beat the algorithms on the dataset here:

http://mlcomp.org/datasets/872

Though that is doing classification (predict winner) rather than regression (predict probability to win).

Wow, that is impressive.

My objective was much less ambitious: just put a metric on the initial state of the game (that is, the 10 kingdom cards) and essentially have a button on council room "find closest games". Then you could use these sample "similar" games to get an idea of how long the game should last, what good strategies others have come up with, and so on. Plus you could do some kinds of statistics on average turns and average points to guide your decision making.

And as Jimm suggested, the approach I took for that was first to introduce a metric on cards. I automated that rather than filled in all the 128x128 coefficients matrix by hand. Just plug in all the characteristics of the cards (+2 actions, card type, gold given, "special powers", "cursing" and so on) and fit some kind of intuitive function until you get a matrix that looks right to you.

Normalize it into a correlation matrix, then write down each kingdom as a 128 dimensional vector and correlate kingdoms. A correlation of 1 is then equivalent to having an identical pool.

This ignores non-linear effects, like gardens being more powerful in combination with workshop, which is a big problem. But it makes for very fast computing, because the correlation matrix only needs to be computed once, and then you just go through your database of games, correlating them to the kingdom you are studying.

Code: [Select]

distance <- function(card1, card2){
	d = 0;
	if(card1!= card2){
		d = d + 2 * abs(allcards[["Cost"]][card1]- allcards[["Cost"]][card2]);
		d = d + 3 * abs(allcards[["Actions"]][card1]- allcards[["Actions"]][card2]);
		d = d + 3 * abs(allcards[["Cards"]][card1]- allcards[["Cards"]][card2]);
		d = d + 4 * abs(allcards[["Action"]][card1]- allcards[["Action"]][card2]);
		d = d + 4 * abs(allcards[["Victory"]][card1]- allcards[["Victory"]][card2]);
		d = d + 4 * abs(allcards[["Treasure"]][card1]- allcards[["Treasure"]][card2]);
		d = d + 5 * abs(allcards[["Attack"]][card1]- allcards[["Attack"]][card2]);
		d = d + 1 * abs(allcards[["Reaction"]][card1]- allcards[["Reaction"]][card2]);
		d = d + 3 * abs(allcards[["Duration"]][card1]- allcards[["Duration"]][card2]);
		d = d + 2 * abs(allcards[["VP"]][card1]- allcards[["VP"]][card2]);
		d = d + 1 * abs(allcards[["Coins"]][card1]- allcards[["Coins"]][card2]);
		d = d + 1 * abs(allcards[["Buys"]][card1]- allcards[["Buys"]][card2]);
		d = d + 1 * abs(allcards[["Gain"]][card1]- allcards[["Gain"]][card2]);
		d = d + 7 * atan(abs(allcards[["Trash"]][card1]- allcards[["Trash"]][card2]));
		d = d + 6 * atan(abs(allcards[["Pollute"]][card1]- allcards[["Pollute"]][card2]));
		d = d + 3 * abs(allcards[["Combo"]][card1]- allcards[["Combo"]][card2]);
		d = d + 3 * (allcards[["Special"]][card1] + allcards[["Special"]][card2]);
		d = d + 3 * abs(allcards[["Discard"]][card1]- allcards[["Discard"]][card2]);
		d = d + 1 * abs(allcards[["Cycle"]][card1]- allcards[["Cycle"]][card2]);
		d = d + 100 * abs(allcards[["Win"]][card1]- allcards[["Win"]][card2]);
	}
	return(d);
}

This is in R the function I currently use to give a metric between cards. All the types until buys are standard. "Gain" stands for cards that can cheat their way through the 1 buy rule to gain a card (remodel, upgrade...). I gave that the same importance as a buy. Trash is how many cards the selected card can trash. Pollute is when it makes the other player gain cards in some way. Combo is basically a general term for "this card works well in action heavy decks". Anything that intuitively fits with conspirator goes in there. "Special" is a category that says that the effect of this card is unique (think: goons or gardens) and just creates distance between it and any other card. Discard is anything that reduces your opponent's hand size in some way (militia, minion, bureaucrat...). Cycle is any effect that enables you to cycle hands from your card for any kind of effect (vault, cellar, hamlet...) and win is the win rate of a card as measured by the councilroom stats.

I'd post the correlation matrix I get, except it's big and ugly. For fun, the two closest non-identical cards are apparently worker's village and farming village (0.81 correlation).

I'm sure the above formula needs to be refined a lot. That being said, I like this approach because it taps into this huge database of games we have and complements nicely the simulation approach.

rrenaud · « **Reply #9 on:** September 14, 2011, 10:16:02 am »

I don't actually do research

. I proposed that I try to make a dominion AI that learns from the game data to an ML researcher at NYU (where I take masters classes part time), but he didn't go for it. Unsurprisingly, he wanted me to do research in his group. But I have been reading tons of ML papers recently, with a focus towards ensemble methods and Random Forest in particular.

I didn't write any of those algorithms on mlcomp. The top performing one on my dataset was boosted decision stumps written by Robert E. Schapire (one of the inventors of Adaboost) and Yoram Singer (who does large scale ML research at Google).

rrenaud · « **Reply #10 on:** September 14, 2011, 10:26:18 am »

I was thinking about a method kind of like yours for learning card similarities, but it uses play data instead of the card abilities.

Imagine you have a matrix card by card matrix that has play information from the 2M games on isotropic. There are a whole bunch of different kind of information you might care about, say, x and y were played in the same turn or maybe x and y both occurred in a winning deck or maybe the win rate of x given y is in the deck.

Pick your favorite card by card stat and call the observed matrix M.

If you want to see how similar cards A and B are, compute the cosine distance of their rows in the matrix. Two cards will be similar if there relation to other cards in the deck is similar.

rspeer · « **Reply #11 on:** September 14, 2011, 11:15:49 am »

Quote from: rrenaud on September 14, 2011, 10:16:02 am

I don't actually do research . I proposed that I try to make a dominion AI that learns from the game data to an ML researcher at NYU (where I take masters classes part time), but he didn't go for it. Unsurprisingly, he wanted me to do research in his group. But I have been reading tons of ML papers recently, with a focus towards ensemble methods and Random Forest in particular.

I didn't write any of those algorithms on mlcomp. The top performing one on my dataset was boosted decision stumps written by Robert E. Schapire (one of the inventors of Adaboost) and Yoram Singer (who does large scale ML research at Google).

Okay, I just don't know how to read that site. Anyway, this is all awesome and I'm seeing if I can download the boosting one.

Fangz · « **Reply #12 on:** September 14, 2011, 11:24:02 am »

I'm actually a research student in statistics. If you are using out of the box methods, I'm pretty sure I can do better. If I had the time anyway!

rspeer · « **Reply #13 on:** September 14, 2011, 11:38:24 am »

So, what are the features those models are running on? I downloaded some code and data, and find that they're treating Dominion intermediate states as vectors of 600-ish unlabeled values.

rrenaud · « **Reply #14 on:** September 14, 2011, 12:04:18 pm »

mlcomp unfortunately uses libsvm format, which throws away all the names on all of the labels.

The feature selection code is here:

https://github.com/rrenaud/dominionstats/blob/master/game_state_features.py

My feature selection has been partially optimized (biased!) for algorithms that use axis aligned trees. This is why it is actually doing a difference of per player features and adding the difference as a feature. Remember that variable importance plot that showed big gains for the difference features? http://councilroom.com/static/feature_importance.png features starting with o are differences between my deck and opponent deck, while those with s are absolute count of my own deck. o's tend to be more important than s's.

Also note that on MLcomp, that is using about 1/30th of the actual data, and it's taking some algorithms 2 hours to do the optimization. Half of the algorithms hit a memory limit barrier and die. Off the shelf methods in R just don't scale. Also note that the best algorithms on a tiny dataset underperform the worst algorithms on a large data set, http://mlcomp.org/datasets/871 (and some of those are already taking minutes to optimize a set 1/200th of the large set, which is itself 1/30th of the real dataset). But I'd be happy to be proven wrong.

rspeer · « **Reply #15 on:** September 14, 2011, 01:43:16 pm »

That all gives me motivation to keep working on Golem and throwing Stochastic Gradient Descent at it.

SGD doesn't have to keep lots of state in memory, it just blazes through the input and either gives you a classifier at the other end, or explodes in a flaming ball of NaN. (If I can figure out how to make it stop doing that, it'll be great)

rrenaud · « **Reply #16 on:** September 14, 2011, 02:19:34 pm »

I played a bit with sofia-ml (and met and talked to the author), it learns models linear logistic quickly with sgd (on the order of minutes instead of hours), but they somewhat underperform random forests with those features (which somewhat underperforms boosting, but the RFs are optimizing the regression rather than classification metric). I picked sofia-ml because it's crazy simple (like order of magnitude less code than VW) and it already takes libsvm format rather than it's own bastard thing. I tried do add quadratic features to the sgd logistic regression model using sofia-ml, but they actually made it worse. The author suggested doing kmeans as preprocessing, and then using cluster membership as a feature to get some non-linearity, which I might try. But it would probably help kmeans to know that all of the villages (maybe except fishing village) are pretty much the same, and hence the ideas about learning card similarity.

timchen · « **Reply #17 on:** September 30, 2011, 01:00:12 am »

Quote from: Empathy on September 14, 2011, 10:11:18 am

And as Jimm suggested, the approach I took for that was first to introduce a metric on cards. I automated that rather than filled in all the 128x128 coefficients matrix by hand. Just plug in all the characteristics of the cards (+2 actions, card type, gold given, "special powers", "cursing" and so on) and fit some kind of intuitive function until you get a matrix that looks right to you.

Normalize it into a correlation matrix, then write down each kingdom as a 128 dimensional vector and correlate kingdoms. A correlation of 1 is then equivalent to having an identical pool.

Does this really work? Suppose you have a 3-card kingdom: Farming Village, Village, Walled Village. If you define the overlap of this with itself to be 1 (note that there will be 9 terms), how about Village, bridge, coppersmith and itself?

I think you need some exclusion procedure; ie, you can only match one card in one kingdom to to another card in another kingdom. Maybe you just start with the highest match and then look at the remaining 9*9 submatrix and repeat. I am not sure if this procedure actually give you the highest correlation: for example, one might want to match (Fishing Village, Bazaar) to (Bazaar, Market) if there are no other +2 action card in the second kingdom.

rrenaud · « **Reply #18 on:** September 30, 2011, 01:37:19 am »

I guess the combinatorial problem you'd want to solve is to find the maximum weight perfect matching between the two game supply, so you get the exhaustion effect tim was saying needs to happen, so that the nearly equal villages don't all get high correlation against the single other village in the set.

But in practice, usually half the kingdom cards don't really matter at all, so there is a lot of noise contributed by irrelevant cards.

Empathy · « **Reply #19 on:** October 01, 2011, 09:48:57 am »

Quote from: timchen on September 30, 2011, 01:00:12 am

Quote from: Empathy on September 14, 2011, 10:11:18 am
And as Jimm suggested, the approach I took for that was first to introduce a metric on cards. I automated that rather than filled in all the 128x128 coefficients matrix by hand. Just plug in all the characteristics of the cards (+2 actions, card type, gold given, "special powers", "cursing" and so on) and fit some kind of intuitive function until you get a matrix that looks right to you.

Normalize it into a correlation matrix, then write down each kingdom as a 128 dimensional vector and correlate kingdoms. A correlation of 1 is then equivalent to having an identical pool.

Does this really work? Suppose you have a 3-card kingdom: Farming Village, Village, Walled Village. If you define the overlap of this with itself to be 1 (note that there will be 9 terms), how about Village, bridge, coppersmith and itself?

I think you need some exclusion procedure; ie, you can only match one card in one kingdom to to another card in another kingdom. Maybe you just start with the highest match and then look at the remaining 9*9 submatrix and repeat. I am not sure if this procedure actually give you the highest correlation: for example, one might want to match (Fishing Village, Bazaar) to (Bazaar, Market) if there are no other +2 action card in the second kingdom.

I thought about this, and while the exclusion procedure would be more accurate, it is also less scalable, as rrenaud pointed out. I wanted to avoid combinatorics and proposed this very simple, hopefully scalable and probably crude method for this reason.

As for the specific problem: I define my correlation between two kingdom vectors to be x^T M y / (sqrt(x^TMx) sqrt(x^TMy)), where x^T is x transpose. Now if you, for the sake of this argument, consider dominion to only have the 5 cards cited above, then vector x would be (1,1,1,0,0) and y =(0,1,0,1,1). The fact that the first vector has three almost identical cards matters (at least with this approach), because it slightly harder for another pool to look alike. But it doesn't prevent the correlation of someone else to add up to one.

I hope this works and clears things up.

timchen · « **Reply #20 on:** October 01, 2011, 03:45:33 pm »

This approach works for the normalization problem I mentioned, but I think would produce weird skew in general... Suppose in one kingdom you have 3 village-type cards and all other cards are different. This means you are going do divide every correlation by sqrt(3+x), whereas ideally you want to get 1/3 the correlation from villages and 1 for everything else. This means that correlations between similar cards are weighted more heavily than they should.

Along the line I proposed, I think we can improve the calculation by doing monte carlo. Just pick the card pair with correlation x with some probablility p(x) which is monotonic in x (probably also quite steep). Limiting the sample size and take the highest/average correlation you get from a finite sample size N.

Empathy · « **Reply #21 on:** October 02, 2011, 09:47:22 am »

I agree, it just seemed like the fastest (laziest) thing to implement at the time.

The Monte-Carlo method to solve the high dimensionality aspect of the combinatorial problem sounds like a great fix.

Now I just have to motivate myself to code stuff.

But overall, does the idea of having a "find similar games" algorithm sound useful for councilroom?

rrenaud · « **Reply #22 on:** October 03, 2011, 02:34:51 am »

I am not going to use a metric on games for anything in councilroom, because scanning the DB to find related games is just going to be too slow. OTOH, and metric on cards could be useful for things like dimensionality reduction for training win predictors.

Dominion Strategy Forum

News:

Author Topic: Another approach: metric on games (Read 8526 times)