Dominion Strategy Forum

Dominion => Dominion General Discussion => Topic started by: ben_king on January 08, 2015, 12:52:17 pm

Title: Dominion Data Mining: Cards that correlate with skill
Post by: ben_king on January 08, 2015, 12:52:17 pm
Working off suggestions from the previous thread, I've taken the 90,000 game database of games by top-100 players and looked at how the player's skill (represented by their TrueSkill rating) correlates with what cards they gain.  Fortunately, top-100 players also play plenty of low-ranked players, so there are lots of examples of both low- and high-ranked players in the database.

The table below shows two different rankings.  Both measure the correlation coefficient between gaining a certain card and the skill of the player.  Positive numbers mean that the card tends to be bought more often by high-ranked players than low-ranked players.  Negative numbers mean that the card tends to be bought more often by low-ranked players.  The ranking on the left is an unweighted ranking, which means that the correlation is between skill and whether the card gets gained or not (gaining it multiple times doesn't make any difference).  The ranking on the right is a weighted ranking, which means that here the number of times the card is gained makes a difference.  So on the right side, you can say, for example, that good players seem to know to get lots of Wharves, whereas bad players seem to overbuy some good cards like Tournament or Bishop.

Unweighted ranking                   Weighted ranking
Rank  Card             Correlation   Rank  Card             Correlation
1     Butcher          0.127         1     Governor         0.206
2     JackOfAllTrades  0.106         2     Wharf            0.172
3     Wishing Well     0.104         3     JackOfAllTrades  0.148
4     Vineyard         0.087         4     Hunting Party    0.144
5     Governor         0.085         5     Wishing Well     0.141
6     Chancellor       0.081         6     Vineyard         0.138
7     Duke             0.080         7     Stonemason       0.112
8     Courtyard        0.077         8     King's Court     0.111
9     Warehouse        0.071         9     Apothecary       0.111
10    Masterpiece      0.070         10    Butcher          0.107
11    Scavenger        0.065         11    Duke             0.106
12    Fairgrounds      0.062         12    Stables          0.100
13    Oracle           0.056         13    Scrying Pool     0.097
14    Journeyman       0.056         14    Horn of Plenty   0.096
15    Apothecary       0.055         15    Fairgrounds      0.095
16    Masquerade       0.055         16    Menagerie        0.095
17    Counterfeit      0.054         17    Oracle           0.094
18    Duchess          0.051         18    Watchtower       0.093
19    Stonemason       0.050         19    Masquerade       0.088
20    Ambassador       0.041         20    Warehouse        0.087
...
187   Moneylender      -0.094        187   Spy              -0.089
188   Trade Route      -0.096        188   Remodel          -0.093
189   Expand           -0.096        189   Forge            -0.096
190   Treasure Map     -0.099        190   Soothsayer       -0.102
191   Coppersmith      -0.103        191   Tournament       -0.102
192   Marauder         -0.106        192   Island           -0.105
193   Soothsayer       -0.107        193   Expand           -0.105
194   Talisman         -0.113        194   Baker            -0.106
195   Transmute        -0.115        195   Feast            -0.106
196   Spy              -0.118        196   Trade Route      -0.107
197   Golem            -0.118        197   Transmute        -0.109
198   Feast            -0.123        198   Tribute          -0.114
199   Mine             -0.128        199   Bishop           -0.119
200   Alchemist        -0.132        200   Treasure Map     -0.127
201   Tribute          -0.155        201   Mine             -0.130
202   Bishop           -0.158        202   Scout            -0.145
203   Scout            -0.171        203   Marauder         -0.146
204   Taxman           -0.177        204   Saboteur         -0.168
205   Saboteur         -0.195        205   Taxman           -0.172
206   Pirate Ship      -0.262        206   Pirate Ship      -0.204
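
A minimal sketch of how the two rankings could be computed for a single card. The post does not say whether the correlation was taken per player or per player-game, so the record layout here (one row per player-game in which the card was available) and all the numbers are assumptions made up for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical rows: (player's TrueSkill rating, copies of the card gained).
# These values are invented; they are not from the 90,000 game database.
records = [(20.0, 0), (25.0, 1), (30.0, 1), (35.0, 3), (40.0, 4), (45.0, 6)]

skill = [rating for rating, _ in records]
gained_any = [1 if copies > 0 else 0 for _, copies in records]  # unweighted
gained_count = [copies for _, copies in records]                # weighted

unweighted_r = pearson(skill, gained_any)
weighted_r = pearson(skill, gained_count)
print(f"unweighted r = {unweighted_r:.3f}, weighted r = {weighted_r:.3f}")
```

With a 0/1 indicator the coefficient only sees whether the card was gained at all; with the count it also sees how many copies were gained, which is the distinction between the left and right columns.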
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: jsh357 on January 08, 2015, 01:29:27 pm
This is probably the most interesting data to me so far.  The bottom 5-7 on the left chart says everything I'm always thinking when I see players go for bad plans, though I am somewhat surprised Marauder is down there.  I always thought I ignored it too often.

And of course, Vineyard being near the top on the right makes sense to me.  When I play lower ranked players, I very often go for Vineyard all on my own and laugh as I have 8 9-point cards.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: liopoil on January 08, 2015, 01:50:57 pm
A few of these cards (Governor, Wharf, Hunting Party) may be inflated on the right because when both players go for them, more often than not the better player will win the split.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: Awaclus on January 08, 2015, 02:23:40 pm
A few of these cards (Governor, Wharf, Hunting Party) may be inflated on the right because when both players go for them, more often than not the better player will win the split.

Does that really make the results inaccurate, though? If the card is so good that both players usually go for it, it probably deserves to be inflated.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: liopoil on January 08, 2015, 02:26:00 pm
A few of these cards (Governor, Wharf, Hunting Party) may be inflated on the right because when both players go for them, more often than not the better player will win the split.

Does that really make the results inaccurate, though? If the card is so good that both players usually go for it, it probably deserves to be inflated.
No, this is measuring cards that are underrated/overrated by lower ranked players, not how good the card is. Fishing Village is a great card, but everyone knows that so it's not on the list. If both players go for it that game should be a wash, but because one player is just better at winning splits, it isn't.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: Awaclus on January 08, 2015, 02:54:26 pm
Fishing Village is a great card, but everyone knows that so it's not on the list.

It would be on the right list if it was a card that you need 654165541 copies of, because if everyone knew that, everyone would always try to get as many as possible, and the better player would always win the split.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: 2.71828..... on January 08, 2015, 03:10:04 pm
I recognize a full list would take too much space as a post, but could you put it in a file to download or something?
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: liopoil on January 08, 2015, 05:46:30 pm
Fishing Village is a great card, but everyone knows that so it's not on the list.

It would be on the right list if it was a card that you need 654165541 copies of, because if everyone knew that, everyone would always try to get as many as possible, and the better player would always win the split.
Better players don't put more priority on getting Governors, Hunting Parties or Wharves in most games; better players are better at getting Governors, Hunting Parties, or Wharves. Weaker players open Taxman and then never hit $5, which is why Taxman is on the list.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: Merudo on January 08, 2015, 08:32:18 pm
grsbmd, I don't think the correlation coefficient is at all adequate for the "unweighted ranking".

The main problem is that you are trying to estimate an association between two binary variables (high-ranked / low-ranked with buy/don't buy).

EDIT: I thought the player ranks had been dichotomized, but they were not. My apologies.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: ben_king on January 09, 2015, 12:13:42 am
The main problem is that you are trying to estimate an association between two binary variables (high-ranked / low-ranked with buy/don't buy).
...

Thanks for the good thought Merudo.  Fortunately, I think the problem is not quite that dire.  Player skill is actually a continuous variable, which makes the correlation much more reliable.  That being said, the skill distribution in my dataset is biased by my collection method.  Just so that everyone is aware of what kind of biases might be playing into these results, I've attached a histogram of the skill levels represented.

Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: WanderingWinder on January 09, 2015, 09:26:50 am
So, the biggest thing to note here is that all of the numbers are tiny. You're not finding anything significant. Well, maybe statistically significant, but not practically so.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: ben_king on January 09, 2015, 11:02:37 am
So, the biggest thing to note here is that all of the numbers are tiny. You're not finding anything significant. Well, maybe statistically significant, but not practically so.

I'd argue that finding correlations this large is actually fairly substantial.  If we assume that how often you buy a card is independent of other cards (which is a fairly reasonable assumption as far as independence assumptions go, since in full random the chance of getting any two specific cards in a kingdom is ~0.2%), then the r^2 values range from 0.7% to 4%.  This means that statistically, I can explain 4% of the variation in skill among players simply by looking at how often the player buys Governor.  If you sum up the top 20 cards on the weighted list, that explains 29% of the variance in the skill.

That's huge.  This doesn't even include things like how cards are played once they're bought, when to start greening, etc.  So the fact that we can explain so much of the variance in skill simply by how often a few cards are bought is a really big deal.
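
The arithmetic in this post can be checked from the weighted column of the table in the first post. The sketch below squares the top 20 weighted correlations and sums them, and also works out the chance that two specific cards share a kingdom. It assumes 206 cards in the pool (the number of ranks in the table) and 10 kingdom cards drawn uniformly at random.

```python
# Weighted correlations for ranks 1-20, copied from the table in the first post.
weighted_top20 = [
    0.206, 0.172, 0.148, 0.144, 0.141, 0.138, 0.112, 0.111, 0.111, 0.107,
    0.106, 0.100, 0.097, 0.096, 0.095, 0.095, 0.094, 0.093, 0.088, 0.087,
]

r_squared = [r * r for r in weighted_top20]
print(f"smallest r^2: {min(r_squared):.1%}, largest r^2: {max(r_squared):.1%}")
print(f"sum of r^2 over the top 20: {sum(r_squared):.1%}")

# Chance that two specific cards both appear in one random kingdom.
# Assumption: 206 cards in the pool, 10 per kingdom, uniform random draw.
total_cards, kingdom_size = 206, 10
p_pair = (kingdom_size / total_cards) * ((kingdom_size - 1) / (total_cards - 1))
print(f"P(two specific cards share a kingdom) = {p_pair:.2%}")
```

The sum only equals the jointly explained variance if the per-card gain rates are uncorrelated with each other, which is the independence assumption stated in the post.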
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: flies on January 09, 2015, 03:47:31 pm
(does anyone know why people talk about explaining X% of the variance and not the deviation? why is [unit^2] more meaningful than [unit]?)
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: flies on January 09, 2015, 03:51:17 pm
is it possible to represent these data as some kind of function like, difference in likelihood of purchase/number purchased as a function of difference in rank?  These correlations seem hard to interpret.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: ben_king on January 09, 2015, 04:53:50 pm
(does anyone know why people talk about explaining X% of the variance and not the deviation? why is [unit^2] more meaningful than [unit]?)

Probably the main reason that we talk about variance rather than deviation is that variance is always positive (since it's a squared value), whereas deviation can be positive or negative.  So you'd have to use the absolute value of the deviation, which gets messy mathematically.  There's also a nice linear relationship with sum of squared deviations that isn't there when you try to sum absolute deviations (but that's more on the technical side).

is it possible to represent these data as some kind of function like, difference in likelihood of purchase/number purchased as a function of difference in rank?  These correlations seem hard to interpret.

To get such a function, I would need to run linear regression on this data and pull out those coefficients.  The correlation coefficient (what's presented in the first post) simply measures how close to linear the relationship between skill and gaining a certain card is.  It says nothing about the slope of that relationship, i.e. how much the gain rate changes per point of skill.
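
A small illustration of the difference between the correlation coefficient and the regression coefficient mentioned above. The two made-up data sets below are both perfectly linear, so both have r = 1, but the fitted slopes differ by a factor of five. The ratings and gain counts are invented for the example.

```python
from math import sqrt

def fit_line(xs, ys):
    """Least-squares fit of ys = slope * xs + intercept; returns (slope, intercept, r)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / sqrt(sxx * syy)

# Hypothetical data: rating on x, average copies of a card gained on y.
rating = [20, 25, 30, 35, 40]
shallow = [1.0, 1.1, 1.2, 1.3, 1.4]  # +0.02 copies per rating point
steep = [1.0, 1.5, 2.0, 2.5, 3.0]    # +0.10 copies per rating point

slope_shallow, _, r_shallow = fit_line(rating, shallow)
slope_steep, _, r_steep = fit_line(rating, steep)
print(f"shallow: slope = {slope_shallow:.3f}, r = {r_shallow:.3f}")
print(f"steep:   slope = {slope_steep:.3f}, r = {r_steep:.3f}")
```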
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: popsofctown on January 09, 2015, 07:53:24 pm
I have a suggestion for another similar analysis.  (this one is definitely cool)

If only one player in a game buys a card, and the other player does not, what % of the time is the player who bought the card the one with a high skill rating? (You could weight it by the skill gap, but I don't think that would be that helpful)


The way I understand what you did, if both the good and bad player frequently buy a card, it waters down the card's positive score.  Since both players are usually at least good enough to know Ambassador is a good idea it has a weaker score, while Vineyard is much higher up because there's a lot more people out there that haven't advanced enough to know that Vineyard dominates a heck of a lot of boards.  The statistics given are very interesting in a way that is neither better nor worse, but I'm interested in the result of my suggestion if it interests you.

If I totally misunderstood and that's already what you did, oops.
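
One way the suggested analysis could be set up, as a sketch only. The game records, the two-player layout, and the function name `higher_rated_share` are all hypothetical; nothing here reflects the actual log format from the database.

```python
# Hypothetical records: each game is a pair of (rating, set of cards gained).
games = [
    ((45.0, {"Wishing Well", "Silver"}), (30.0, {"Silver"})),
    ((28.0, {"Scout"}), (41.0, {"Laboratory"})),
    ((39.0, {"Wishing Well"}), (36.0, {"Wishing Well"})),  # both gained it
    ((33.0, {"Wishing Well"}), (44.0, {"Militia"})),
]

def higher_rated_share(games, card):
    """Among games where exactly one player gained `card`, return the fraction
    in which that player was the higher-rated one (None if there are no such games)."""
    hits = total = 0
    for (rating_a, cards_a), (rating_b, cards_b) in games:
        a_has, b_has = card in cards_a, card in cards_b
        if a_has == b_has or rating_a == rating_b:
            continue  # skip games where both or neither gained it, or ratings tie
        total += 1
        gainer, other = (rating_a, rating_b) if a_has else (rating_b, rating_a)
        if gainer > other:
            hits += 1
    return hits / total if total else None

print(higher_rated_share(games, "Wishing Well"))
print(higher_rated_share(games, "Scout"))
```

A share well above 50% would mean the card is mostly picked up by the stronger player when only one side goes for it; weighting by the skill gap, as the post mentions, would be a straightforward extension.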
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: c4master on January 10, 2015, 07:41:27 am
Can someone estimate the likelihood of a card getting +/-0.100 correlation just by chance? Or do we need further assumptions? We have more than 200 cards and I'm wondering whether some outliers could be explained by this.

I'm very surprised to see Moneylender on this list. Seriously, when do you ignore it? When there's better trashing, I guess. Or, when you're going for BM or a slog, maybe. I thought it would be pretty easy to see when Moneylender is good and when it's not.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: theblankman on January 10, 2015, 10:45:44 am
Working off suggestions from the previous thread, I've taken the 90,000 game database of games by top-100 players and looked at how the player's skill (represented by their TrueSkill rating) correlates with what cards they gain. 
I must've missed the previous thread... has someone put up that whole DB to be downloaded?  I have an experiment or two I'd like to try myself. 
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: noughtpointzero on January 10, 2015, 01:22:17 pm
Can anyone explain why wishing well is so high?
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: A Drowned Kernel on January 10, 2015, 01:35:31 pm
Can anyone explain why wishing well is so high?

It's a good card that lower-level players frequently underestimate. Good deck tracking can make the draw quite reliable.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: Mic Qsenoch on January 10, 2015, 01:38:42 pm
Can anyone explain why wishing well is so high?

Because it's awesome.

It's a good substitute for Silver in the opening. It gives you a good shot at hitting $5 after the first shuffle, but then doesn't get in the way of drawing your deck like a Silver will. And of course throughout the game you can still expect to get the Lab effect occasionally and sometimes more than occasionally if you're tracking your deck. And there are some decent combos as well: Apothecary, Wandering Minstrel, Cartographer.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: Jack Rudd on January 10, 2015, 02:27:42 pm
Can anyone explain why wishing well is so high?
I explain it thus (http://dominionstrategy.com/2011/05/30/annotated-game-8). Or rather, theory explains it thus.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: popsofctown on January 10, 2015, 04:14:15 pm
The only surprise is that Wishing Well is not even higher.  You have

1: Wishing Well is more engine friendly than Silver, and high skill players are more engine leaning than low skill players in general.
2: High skill players are more likely to recognize the good things about Wishing Well and know that it is good on the board, just like most of the other stuff that did well on this list

And the big bazooley
3: Since good players deck track better than bad players, there's a nontrivial amount of the time that it is correct play for the good player to buy it AND for the bad player not to buy it.  There is an actual difference in how the card is going to perform for each of the players.  That's true of few other cards. 
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: DG on January 10, 2015, 04:30:04 pm
I played Rabid last week and we'd emptied the wishing well pile before his 5th turn, which is quite good going. Essentially in any deck where silver is a poor card a wishing well is likely to be a strong card. Top players will buy and gain them in preference to silver (at the right times).

The Council Room stats for isotropic included win rates for when players gained/bought (or didn't buy/gain) a kingdom card. These might be interesting figures to compare again now we have all the cards on Goko.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: HiveMindEmulator on January 10, 2015, 07:55:09 pm
Can anyone explain why wishing well is so high?

Because it's Stef's avatar.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: ben_king on January 10, 2015, 11:44:27 pm
I have a suggestion for another similar analysis.  (this one is definitely cool)

If only one player in a game buys a card, and the other player does not, what % of the time is the player who bought the card the one with a high skill rating? (You could weight it by the skill gap, but I don't think that would be that helpful)

Definitely.  That's one of the things I'm planning to do next with this data.

Can someone estimate the likelihood of a card getting +/-0.100 correlation just by chance? Or do we need further assumptions? We have more than 200 cards and I'm wondering whether some outliers could be explained by this.

A correlation of +/- 0.100 would be about 10 standard deviations from the mean, so it's pretty safe to say that none of the results in the first post are due to chance.

I must've missed the previous thread... has someone put up that whole DB to be downloaded?  I have an experiment or two I'd like to try myself. 

It's actually quite easy to get this data.  The data is from gokosalvager.com.  All I did was download all the game logs for each of the top 100 players.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: Davio on January 11, 2015, 04:48:10 am
Another idea for analysis would be to look at the impact certain cards have on a kingdom.

Which cards cause other cards to be bought more or less?
Which cards are always useful (low standard deviation in how often they're bought) and which cards require a more specific kingdom?

We sort of already know the answers to these questions, but maybe the math can reveal something.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: WanderingWinder on January 11, 2015, 07:31:10 am
Can someone estimate the likelihood of a card getting +/-0.100 correlation just by chance? Or do we need further assumptions? We have more than 200 cards and I'm wondering whether some outliers could be explained by this.

A correlation of +/- 0.100 would be about 10 standard deviations from the mean, so it's pretty safe to say that none of the results in the first post are due to chance.

How do you come up with that?
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: Ratsia on January 11, 2015, 08:12:37 am
A correlation of +/- 0.100 would be about 10 standard deviations from the mean, so it's pretty safe to say that none of the results in the first post are due to chance.
How do you come up with that?
Can't say how he did that, but generally permutation tests are very good for questions like that. Simply randomly permute the player IDs so that no information about the true good/bad player dichotomy remains and re-compute the correlation. Repeat 1000 times or so. This gives readily both the mean and the variance for the null-distribution and (if one is into such things) enables trivial statistical significance testing by counting how many of the 1000 replicates are above the observed value.

It rarely pays off to do any other kinds of tests, and the only thing one has to think about is what exactly to permute (for example, here it probably makes a difference whether the permutation is for each player-rating pair or for each individual match).
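
The permutation test described above can be sketched as follows. The data are synthetic (500 invented rows with a weak built-in relationship), and the choice to shuffle individual rows rather than whole players or whole matches is an assumption; as the post notes, what exactly gets permuted is the part that needs thought.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def permutation_null(skill, gains, n_perm=1000, seed=0):
    """Correlations obtained after randomly re-pairing skill values with gains."""
    rng = random.Random(seed)
    shuffled = list(skill)
    null_rs = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # destroys any real skill/gain link
        null_rs.append(pearson(shuffled, gains))
    return null_rs

# Synthetic data: gain probability rises slightly with rating.
data_rng = random.Random(1)
skill = [data_rng.gauss(30, 8) for _ in range(500)]
gains = [1 if data_rng.random() < 0.3 + 0.01 * (s - 30) else 0 for s in skill]

observed = pearson(skill, gains)
null_rs = permutation_null(skill, gains)
null_mean = sum(null_rs) / len(null_rs)
null_sd = sqrt(sum((r - null_mean) ** 2 for r in null_rs) / (len(null_rs) - 1))
# Share of permuted correlations at least as extreme as the observed one.
p_value = sum(abs(r) >= abs(observed) for r in null_rs) / len(null_rs)
print(f"observed r = {observed:.3f}, null sd = {null_sd:.3f}, p = {p_value:.3f}")
```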
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: ben_king on January 11, 2015, 02:08:27 pm
How do you come up with that?

I used bootstrap sampling.  http://en.wikipedia.org/wiki/Bootstrapping_(statistics)

The highest standard deviations are for Promo cards (which have the least data), so you should be slightly more wary of Promo cards in the original list, but even Prince only has a standard deviation of 0.03.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: WanderingWinder on January 11, 2015, 03:34:48 pm
How do you come up with that?

I used bootstrap sampling.  http://en.wikipedia.org/wiki/Bootstrapping_(statistics)

The highest standard deviations are for Promo cards (which have the least data), so you should be slightly more wary of Promo cards in the original list, but even Prince only has a standard deviation of 0.03.

This still doesn't tell me what you did. Yes, I understand bootstrapping - statistics is my job - but I don't know, for instance, what .03 is the standard deviation of. So you took a bootstrap - of what? Games? Players? How big was the bootstrap? How big is the entire thing? You realize that these things aren't uncorrelated, right?

The reason I'm so curious is that 10 standard deviations means actually nothing. Seriously. You make the claim that it means these are really big effects you wouldn't see by random chance. The problem is, if you're implying there's a normal distribution, the chance of getting a result like you claim is so small, I can't get a computer to give me that calculation (at least, within a few minutes on my home machine; point is, it's REALLY small, small enough you need to use a non-standard data type). Even if it's a wacky distribution, Chebyshev's theorem tells us we shouldn't really be getting these kinds of results. So, if I were doing this calculation and getting this result, my natural assumption would be that I had done something wrong in my calculation.
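
For reference, the two quantities mentioned in this post can be written down directly: the two-sided tail probability of a normal distribution at 10 standard deviations, and the Chebyshev bound, which holds for any distribution with finite variance. This is only the textbook arithmetic; it does not settle which null distribution is appropriate for the correlations in this thread.

```python
from math import erfc, sqrt

k = 10  # number of standard deviations from the mean

# P(|Z| >= k) for a standard normal Z, via the complementary error function.
normal_two_sided = erfc(k / sqrt(2))

# Chebyshev: P(|X - mean| >= k * sd) <= 1 / k^2 for any finite-variance X.
chebyshev_bound = 1 / k ** 2

print(f"normal tail at {k} sd: {normal_two_sided:.3e}")
print(f"Chebyshev upper bound at {k} sd: {chebyshev_bound:.2%}")
```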
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: ben_king on January 11, 2015, 03:57:32 pm
This still doesn't tell me what you did. Yes, I understand bootstrapping - statistics is my job - but I don't know, for instance, what .03 is the standard deviation of. So you took a bootstrap - of what? Games? Players? How big was the bootstrap? How big is the entire thing? You realize that these things aren't uncorrelated, right?

The reason I'm so curious is that 10 standard deviations means actually nothing. Seriously. You make the claim that it means these are really big effects you wouldn't see by random chance. The problem is, if you're implying there's a normal distribution, the chance of getting a result like you claim is so small, I can't get a computer to give me that calculation (at least, within a few minutes on my home machine; point is, it's REALLY small, small enough you need to use a non-standard data type). Even if it's a wacky distribution, Chebyshev's theorem tells us we shouldn't really be getting these kinds of results. So, if I were doing this calculation and getting this result, my natural assumption would be that I had done something wrong in my calculation.

I feel like I get more scrutiny here than with peer review.  The reason I post these is so that people can learn from them if they find them useful.  I have my own thesis I need to finish, so I don't have time to write another one about Dominion.

Normality is a reasonable assumption in the absence of evidence to the contrary, which is why I make that assumption.  I haven't had time yet to see if the distribution of correlation coefficients seems to fit that well.  Your post might be that evidence to the contrary.  The standard deviation of the coefficient averaged over all cards when I run 100 bootstraps on games is 0.01.  The range is 0.007 to 0.032.  I've also done a second experiment where I take each game and ignore which players buy which cards and instead assign these randomly.  This also produces a standard deviation on the correlation coefficient of ~0.01.  In 20+ runs times 206 cards per run, I was not able to get a chance correlation of greater than 0.04.

So it seems very unlikely to me that a card could achieve a correlation of +/- 0.100 simply by chance.
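
A sketch of the kind of bootstrap described in this post: resample the rows with replacement 100 times, recompute the correlation each time, and take the standard deviation of the results. The data are synthetic, and treating each row as an independent unit is an assumption; the real rows (games involving the same players) are not independent, which is one of the objections raised above.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def bootstrap_sd(skill, gains, n_boot=100, seed=0):
    """Standard deviation of r across n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(skill)
    rs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # one bootstrap sample of rows
        rs.append(pearson([skill[i] for i in idx], [gains[i] for i in idx]))
    mean_r = sum(rs) / n_boot
    return sqrt(sum((r - mean_r) ** 2 for r in rs) / (n_boot - 1))

# Synthetic stand-in for the real data: 2000 invented rows.
data_rng = random.Random(42)
skill = [data_rng.gauss(30, 8) for _ in range(2000)]
gains = [max(0, round(1 + 0.03 * (s - 30) + data_rng.gauss(0, 1))) for s in skill]

sd_r = bootstrap_sd(skill, gains)
print(f"r = {pearson(skill, gains):.3f}, bootstrap sd of r = {sd_r:.3f}")
```

The spread shrinks roughly with the square root of the number of rows, so a far larger data set would be expected to give a smaller value than this 2000-row toy example.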
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: rrenaud on January 13, 2015, 03:33:50 pm
Maybe you should get better peer reviewers? :) 

Great job with this! 

FWIW, I wouldn't take criticisms as offenses, but more like challenges.  Can I make the explanation better/clearer?  Is there another analysis that can demonstrate the same point without the noted problems?

But of course, only do it so much as you enjoy it.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: Polk5440 on January 13, 2015, 05:10:37 pm
I feel like I get more scrutiny here than with peer review. 

You would be correct.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: Davio on January 14, 2015, 03:48:16 am
I love watching experts go head to head on a topic I know very little about. ;D
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: SwitchedFromStarcraft on January 14, 2015, 08:10:12 am
Want some popcorn?
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: WanderingWinder on January 30, 2015, 08:33:05 am
So I wanted to say some more things here, because this has been bugging me a bit.

First of all, it wasn't my intention to try to insult or attack grbsmd or his work here. It seems some people have gotten that impression, and this was not my intention. Most specifically to grbsmd himself, if you feel this way, I really am sorry for that.


I also feel like while my gut told me that something was wrong with the analysis that 'these are significant findings', I didn't present a terribly good explanation of why that isn't true. I'm going to try to do that now.

The details of the bootstrap procedure aren't really important (they would be in a peer-reviewed scholarly paper, but whatever, this isn't one, so we can just take your word here that you did it 'a lot'). You don't need to give standard deviations on this, though (I'm presuming what you quoted are actually standard errors, but again, whatever, no big deal), because when you're bootstrapping, you can just give the exact percentage of bootstraps in which you crossed your threshold. Also, standard deviations of a metric like r don't really make sense, as it is a summary statistic; it would be like asking what the standard deviation of mean player skill is, or the standard deviation of the maximum. But anyway, that's not the point - even if I don't believe what 10 SE implies, these numbers probably are quite assuredly different from zero in a 'statistically significant' sense. But basically what that means is, we're really sure they're different from zero. It doesn't tell us how far from zero they are.



The biggest point I have to make is my original one: these numbers show that players' ability to correctly rate the general strength of cards is not a very big chunk of how strong they are. I'm going to quote the rebuttal of my original statement here:


So, the biggest thing to note here is that all of the numbers are tiny. You're not finding anything significant. Well, maybe statistically significant, but not practically so.

I'd argue that finding correlations this large is actually fairly substantial.  If we assume that how often you buy a card is independent of other cards (which is a fairly reasonable assumption as far as independence assumptions go, since in full random the chance of getting any two specific cards in a kingdom is ~0.2%), then the r^2 values range from 0.7% to 4%.  This means that statistically, I can explain 4% of the variation in skill among players simply by looking at how often the player buys Governor.  If you sum up the top 20 cards on the weighted list, that explains 29% of the variance in the skill.

That's huge.  This doesn't even include things like how cards are played once they're bought, when to start greening, etc.  So the fact that we can explain so much of the variance in skill simply by how often a few cards are bought is a really big deal.

First of all, as has been pointed out, some of this is down to cause vs effect. I usually win when I get more provinces. Is that because I value province more? No - in fact, I'm pretty sure I value province less than most players. But when I get more, I am just more likely to win. It's like the old John Madden quote "the team that scores more points - well, they usually win the game".

Moreover, I don't think you're looking at these numbers correctly. I assume that you are taking gain rate when available in the kingdom, rather than cards bought per game (in which case, you'd only get information about set ownership really drowning out most everything else). Which means you can't really combine all these different values together. Moreover, you can't add the 'variance explained' at all. If you wanted to combine them, you would want to multiply: 96% of the variance remains unexplained after the first card, and 96% of that after the second leaves a bit more than 92% unexplained after two. The difference is pretty small between two cards, but once you're compounding 200 times, it will add up.
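
The compounding described in this paragraph, worked through for the two-card example and for the top 20 weighted correlations from the first post. This only shows the multiply-the-unexplained-fractions arithmetic; it still treats the cards as independent.

```python
from math import prod

# Two cards that each leave 96% of the variance unexplained.
two_card_unexplained = 0.96 * 0.96
two_cards = 1 - two_card_unexplained
print(f"unexplained after two cards: {two_card_unexplained:.2%}")

# Weighted correlations for ranks 1-20, copied from the table in the first post.
weighted_top20 = [
    0.206, 0.172, 0.148, 0.144, 0.141, 0.138, 0.112, 0.111, 0.111, 0.107,
    0.106, 0.100, 0.097, 0.096, 0.095, 0.095, 0.094, 0.093, 0.088, 0.087,
]

added = sum(r * r for r in weighted_top20)                 # adding r^2 values
compounded = 1 - prod(1 - r * r for r in weighted_top20)   # multiplying what is left
print(f"added: {added:.1%}, compounded: {compounded:.1%}")
```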

Most importantly, though, you really can't combine these together at all. You ran a whole bunch of correlations between the single card's gain rate and player skill. This gives you a bunch of different things. However, what you WANT to do is run one multiple correlation. You really should only get one combined r. And the independence assumption breaks down, HARD. If I don't buy an A, that means I probably bought a B. The things are absolutely related to each other, though again, some of what you'll see on the right is that better players buy more stuff overall, but that is because they are better, not why they are better. On the left, you're going to end up seeing that overall, it's going to be something like 5% (or less) of the variance in skill is explained by knowing if a card is good or bad. On the right, it will be somewhat higher, but again, I think this difference is mostly down to the cause/effect imbalance.
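
To illustrate why separate correlations cannot simply be combined, here is the standard two-predictor formula for the multiple R^2 in terms of the pairwise correlations. The input values are invented; the point is only that the combined figure depends on how the two gain rates correlate with each other, and can land below or above the plain sum.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """Multiple R^2 for y regressed on two predictors, given the correlation of
    each predictor with y (r_y1, r_y2) and with each other (r_12)."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Hypothetical card-vs-skill correlations for two cards.
r_a, r_b = 0.20, 0.15
naive_sum = r_a ** 2 + r_b ** 2

r2_independent = r2_two_predictors(r_a, r_b, 0.0)   # gain rates unrelated
r2_positive = r2_two_predictors(r_a, r_b, 0.5)      # gained together
r2_negative = r2_two_predictors(r_a, r_b, -0.5)     # one bought instead of the other

print(f"sum of r^2: {naive_sum:.4f}")
print(f"R^2 with r_12 =  0.0: {r2_independent:.4f}")
print(f"R^2 with r_12 =  0.5: {r2_positive:.4f}")
print(f"R^2 with r_12 = -0.5: {r2_negative:.4f}")
```

With more than two cards the same idea needs a full multiple regression rather than this closed form.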

I mean, quick back-of-the-napkin shows that, because you only ever have 10 kingdom cards (yes, there are edge cases), if you multiply across a kingdom, even if it's close to the highest-scoring one, you're going to get 1-.99^10, ~= 9.6% of the variation in the skill between the players comes from knowing which cards are more rawly powerful than the others. Once you correctly take independence into consideration, and/or take an average set of 10 cards, I expect that will come down a lot from this even.
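
The back-of-the-napkin figure above, spelled out. The per-card value of 1% explained variance (r of about 0.1) and the 10-card kingdom are taken from the paragraph; treating the ten cards as compounding independently is the same simplification the paragraph makes.

```python
per_card_r2 = 0.01   # roughly r = 0.1 per card
kingdom_cards = 10

# Fraction of variance explained after compounding across the kingdom.
explained = 1 - (1 - per_card_r2) ** kingdom_cards
print(f"explained across a kingdom: {explained:.1%}")
```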



The thing is, yes, some cards are better than others, but by far, there is a lot more skill in knowing what is good or bad on a particular board. And then more skill yet in knowing how to sequence things, adjust to the gamestate and opponents' plans, etc. Knowing raw card skill is just a really small thing, and one that's pretty easy to pick up on.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: werothegreat on January 30, 2015, 09:30:18 am
I find it interesting how, even unweighted, Masterpiece is bought more often by experienced players, especially given the other list where Masterpiece was the least-gained $3 card.
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: DStu on January 30, 2015, 02:03:00 pm
For me, 0.1 is not tiny, it's a business case...
Title: Re: Dominion Data Mining: Cards that correlate with skill
Post by: TheExpressicist on January 30, 2015, 02:50:42 pm
Well done. This is interesting and useful data.

I am not sure why we are even entertaining the notion that these numbers might explain what makes up a player's skill. The data is already very useful, it doesn't need to be a gauge of people's skill. But yeah, it doesn't matter how strong, weak, valid or invalid the correlation is: there's nothing in the numbers that suggests Gain%  impacts Skill, rather than the other way around. Common sense would dictate that a player's skill is what impacts a card's gain rate, not the other way around.

As I mentioned, the data is useful enough as it is. It shows us which cards are purchased more often by good players and more often by bad players. If you had asked me beforehand what impacts a card's gain %, I would say it's a combination of player skill, other kingdom cards, and isolated card strength. So it's nice to have at least one of those variables knocked out.