Dominion Strategy Forum


Author Topic: Trying to infer card similarities, some late night councilroom hacking  (Read 12095 times)

rrenaud

  • Administrator
  • *****
  • Offline Offline
  • Posts: 991
  • Uncivilized Barbarian of Statistics
  • Respect: +1197
    • View Profile
    • CouncilRoom
+5

I worked on the presentation some more: rotated the graphs, split them into cheap/expensive, pulled fresh data, and fixed the Adventurer bug.

http://mine.councilroom.com/~rrenaud/cheap_group_win_prob.png
http://mine.councilroom.com/~rrenaud/expensive_group_win_prob.png
Logged

qmech

  • Torturer
  • *****
  • Offline Offline
  • Posts: 1918
  • Shuffle iT Username: qmech
  • What year is it?
  • Respect: +2320
    • View Profile
0

This seems to have produced some accurate pairings:
Quote
Ambassador-Chapel(-Remake-)Steward-Masquerade
Bishop-Monument
Silk Road-Gardens
Smithy-Envoy(-Courtyard-Oracle)
Swindler-Black Market (which often fit the better-than-nothing terminal Silver opening category)
Horse Traders-Feast
Warehouse-Cellar
Worker's Village-Hamlet
[Villages] and Throne Room
Upgrade-Apprentice
Jester-Haggler
Wharf-Margrave
Witch-Mountebank
(Possession-Outpost)
(Philosopher's Stone-Counting House)

The vector I form is the current card's win rate and chance of being purchased when the other cards are in the set, rather than the other cards' chance of being purchased given this card is in the set.
Another low-level summary: the script groups together cards that play similarly if you swap out one for the other.
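
A minimal sketch of the vector construction described above, for anyone who wants to play with the idea. The `stats` table, its numbers, and the choice of Euclidean distance are all assumptions made for illustration; the thread does not say which metric the script actually uses.

```python
from math import sqrt

# Hypothetical summary stats (made-up numbers):
# stats[card][other] = (win_rate, buy_prob) for `card`
# in games where `other` is also in the supply.
stats = {
    "Smithy":  {"Village": (1.05, 0.60), "Chapel": (0.98, 0.40), "Witch": (0.90, 0.30)},
    "Envoy":   {"Village": (1.03, 0.55), "Chapel": (0.97, 0.38), "Witch": (0.91, 0.28)},
    "Gardens": {"Village": (0.95, 0.20), "Chapel": (0.80, 0.05), "Witch": (1.02, 0.35)},
}
context = ["Village", "Chapel", "Witch"]  # the "other cards" dimensions


def card_vector(card):
    """Flatten (win_rate, buy_prob) per context card into one vector."""
    vec = []
    for other in context:
        win_rate, buy_prob = stats[card][other]
        vec.extend([win_rate, buy_prob])
    return vec


def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest(card):
    """Return the other card whose vector is nearest to `card`'s."""
    others = [c for c in stats if c != card]
    return min(others, key=lambda c: euclidean(card_vector(card), card_vector(c)))


print(closest("Smithy"))
```

With these made-up numbers, Smithy's nearest neighbour is Envoy: the two react to the same supply cards in the same way, which is the "swap one for the other" notion of similarity.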


Logged

blueblimp

  • Margrave
  • *****
  • Offline Offline
  • Posts: 2849
  • Respect: +1560
    • View Profile
+2

Do you think it would be possible to do this for players too? That is, use their per-card stats like %+, win rate, and so on, to estimate similarities. Or is this awkward to do?
Logged

rrenaud

  • Administrator
  • *****
  • Offline Offline
  • Posts: 991
  • Uncivilized Barbarian of Statistics
  • Respect: +1197
    • View Profile
    • CouncilRoom
+1

I think it could be cool to compute player vs. player similarities via the popular buys stats.  I'd guess the %+ per card would be a pretty reasonable feature.  Other interesting high-level dimensions might be actions played per game, probability of opening double terminal, fraction of purchases that go to money vs. actions, time to get the first Province/Duchy, and probability of a mega turn.
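
A rough sketch of what that could look like, using a few of the dimensions listed above. The player names and every number are invented, and standardizing each feature before comparing is my own assumption (the features live on very different scales), not something councilroom does today.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical per-player features (made-up values), in this order:
# actions played per game, P(open double terminal),
# fraction of purchases that are money, turns to first Province.
players = {
    "alice": [12.0, 0.05, 0.30, 11.0],
    "bob":   [11.5, 0.06, 0.32, 11.5],
    "carol": [3.0, 0.20, 0.70, 14.0],
}


def standardize(table):
    """Z-score each feature column so no single scale dominates."""
    names = list(table)
    columns = list(zip(*(table[n] for n in names)))
    mus = [mean(col) for col in columns]
    sds = [pstdev(col) or 1.0 for col in columns]  # guard against zero spread
    return {
        n: [(v - m) / s for v, m, s in zip(table[n], mus, sds)]
        for n in names
    }


def euclidean(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


z = standardize(players)


def most_similar(name):
    """Return the other player with the nearest standardized vector."""
    others = [p for p in z if p != name]
    return min(others, key=lambda p: euclidean(z[name], z[p]))


print(most_similar("alice"))
```

Per-card %+ values could be appended to each vector as extra columns in the same way.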

Right now the pop buys/player pages are just dog slow.  Caching the internal computations there is probably an important enabler for something like what you propose.
Logged

theory

  • Administrator
  • *****
  • Offline Offline
  • Posts: 3603
  • Respect: +6125
    • View Profile
    • Dominion Strategy
0

What do the different colors mean?
Logged

rrenaud

  • Administrator
  • *****
  • Offline Offline
  • Posts: 991
  • Uncivilized Barbarian of Statistics
  • Respect: +1197
    • View Profile
    • CouncilRoom
0

The colors come from the dendrogram library implementation; I didn't control them.  I guess they roughly correspond to large, distinct areas in the clustering.
Logged

Kirian

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 7096
  • Shuffle iT Username: Kirian
  • An Unbalanced Equation
  • Respect: +9416
    • View Profile
0

This is interesting, but I feel the data are skewed somehow by certain card similarities and a few other things (that are tough to separate).  For instance:

Pawn-Pearl Diver-Haven-Embargo:  This group is "$2 cards picked up as afterthoughts when +Buy is around."  Haven and Pawn certainly have no other similarity to each other.

Island-Great Hall is obvious, but what are Caravan and Tournament doing there?  The answer is higher up the graph: the IW-Workshop-Talisman group.  All four of these cards are cards that are nice to have multiples of, and/or are great to get with Workshops and Ironworks.

What's the relationship between Chancellor, Fortune Teller, Smugglers, Woodcutter, and Develop?  They're great targets for Swindling Silvers.  So the algorithm is counting forced gains as well as actual useful gains.  Hence, also, the bright red $4s:  mostly they're not bought, they're gained by Swindling.  I think these two groups in particular, and the one above in a more general sense, indicate that the algorithm is too heavily weighting a similarity where one card works great with a card, and nothing else does.  Thus, Coppersmith and Bureaucrat have their vectors "oriented" toward Swindler and nothing else.

Removing the forced gains from the algorithm would be better, but I'm not sure that's possible.

Splitting the graph into two separate ones makes for some crazy groupings; KC and TR should be together, and Remodel and Expand should be together, but they're split across graphs.  (Similarly, Witch/MB vs. Hag/YW.)

How does the algorithm decide which splits are outgroups?  (I'm moving into biology terminology here.)  So, for instance, the $5 chain group (Minion-Cartographer) has Tactician as an outgroup (odd), but then that family has Alchemist as an outgroup, when it clearly belongs in the first group.  In a different way, consider the $5 Treasures.  Cache and Contraband are by far more similar to each other and the other Treasures in the group than they are to... Tribute?  Huh?

And then there are the crazy ones.  Trading Post with IGG?  With Forge and Mint as outgroups??  Jack as outgroup to Tunnel and FG??  Sab with Mine, Swindler with BM, Shanty Town with Lighthouse??  Weird.  I wonder if those would make more sense without the split between expensive and cheap.
Logged
Kirian's Law of f.DS jokes:  Any sufficiently unexplained joke is indistinguishable from serious conversation.

rrenaud

  • Administrator
  • *****
  • Offline Offline
  • Posts: 991
  • Uncivilized Barbarian of Statistics
  • Respect: +1197
    • View Profile
    • CouncilRoom
0

> What's the relationship between Chancellor, Fortune Teller, Smugglers, Woodcutter, and Develop?

$3 terminals that suck.

Swindler games are < 10% of the data, they just can't possibly be that influential.

To be honest, I am not sure how the dendrogram is implemented.  I could try pre-processing the input to the dendrogram function to get rid of things that I think are singletons.
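
A sketch of the kind of pre-processing I mean, under my own assumptions: a "singleton" here is a card whose nearest neighbour is farther away than some cutoff. The vectors, the cutoff value, and the helper names are all hypothetical.

```python
from math import dist  # Python 3.8+

# Hypothetical two-dimensional card vectors (made-up numbers).
vectors = {
    "Witch":      [1.10, 0.80],
    "Mountebank": [1.08, 0.78],
    "Smithy":     [1.00, 0.50],
    "Envoy":      [0.99, 0.48],
    "Possession": [0.60, 0.10],
}


def nearest_distance(card):
    """Distance from `card` to its closest other card."""
    return min(
        dist(vectors[card], vec)
        for other, vec in vectors.items()
        if other != card
    )


def drop_singletons(max_nn_distance):
    """Keep only cards that have at least one reasonably close neighbour."""
    return {
        card: vec
        for card, vec in vectors.items()
        if nearest_distance(card) <= max_nn_distance
    }


kept = drop_singletons(max_nn_distance=0.1)  # cutoff is an arbitrary choice
print(sorted(kept))
```

With these numbers only Possession is dropped, so it would no longer show up as a lone outgroup hanging off some otherwise tight family.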
Logged

Kirian

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 7096
  • Shuffle iT Username: Kirian
  • An Unbalanced Equation
  • Respect: +9416
    • View Profile
+1

Quote
> What's the relationship between Chancellor, Fortune Teller, Smugglers, Woodcutter, and Develop?

$3 terminals that suck.

Swindler games are < 10% of the data, they just can't possibly be that influential.

Ah, but since you're comparing Card vs. Card, that same percentage (actually ~6%) applies to all the data.  Consider the CR.com data for Chancellor:  0.12 buys per available game.  0.04 gains per available game.  Even if we're charitable and suggest that 50% of the Chancellor gains are from BV/WS/IW--and I'm betting that number is closer to 5% than 50%--that's 0.3 gains (0.02/0.06) per Chancellor/Swindler set.  It's probably closer to 0.6 gains per paired set.  In other words, you're 3-5 times as likely to have Chancellor in your deck if Swindler is also on the board than on all other boards (on average).
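
To spell out that arithmetic: the sketch below just redoes the division, assuming (as the estimate does) that every Chancellor gain not coming from BV/WS/IW comes from Swindler, and that Swindler is on about 6% of boards. The function and variable names are mine.

```python
# Figures quoted from the CR.com data in this post.
gains_per_available_game = 0.04   # Chancellor gains per game where it is available
swindler_share_of_boards = 0.06   # roughly 6% of boards also contain Swindler


def gains_per_swindler_board(fraction_from_other_gainers):
    """Chancellor gains per Chancellor/Swindler board.

    Assumes all gains not attributed to BV/WS/IW are Swindler gains.
    """
    swindled = gains_per_available_game * (1 - fraction_from_other_gainers)
    return swindled / swindler_share_of_boards


charitable = gains_per_swindler_board(0.50)  # 0.02 / 0.06, about 0.33
likely = gains_per_swindler_board(0.05)      # 0.038 / 0.06, about 0.63

print(round(charitable, 2), round(likely, 2))
```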

Since you're almost certainly going to have Swindler as well, decks containing both Chancellor and Swindler are going to account for about 15-20% of decks containing Chancellor, and a similar (but smaller) percentage of wins with Chancellor in deck.  Only Stash will compete with that number; other correlations are going to be in the, well, 6% range.  Similar numbers for a lot of other crappy cards, with Navigator, Harvest, and Thief coming pretty close.

Edit:  This is especially true if you're using N-dimensional vector cosine distance as your metric.  Having one dimension so big compared with others is going to make the cosine distance tiny...
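
A toy illustration of that effect, with invented numbers: two cards whose ordinary dimensions point in fairly different directions look nearly identical under cosine distance once both share one very large "Swindler" component. Whether the script really uses cosine distance is my guess, as noted above.

```python
from math import sqrt


def cosine_distance(a, b):
    """1 minus the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)


# Hypothetical "ordinary" dimensions (made-up values, all in the ~6% range).
coppersmith = [0.06, 0.01, 0.05]
bureaucrat = [0.01, 0.06, 0.02]

# The same cards with one extra, much larger Swindler dimension appended.
swindler_component = 0.60
coppersmith_spiked = coppersmith + [swindler_component]
bureaucrat_spiked = bureaucrat + [swindler_component]

without_spike = cosine_distance(coppersmith, bureaucrat)
with_spike = cosine_distance(coppersmith_spiked, bureaucrat_spiked)

print(without_spike, with_spike)
```

The first distance comes out above 0.5 and the second below 0.02, so the one shared spike is enough to make the two cards cluster together.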
« Last Edit: July 17, 2012, 08:21:16 pm by Kirian »
Logged

rrenaud

  • Administrator
  • *****
  • Offline Offline
  • Posts: 991
  • Uncivilized Barbarian of Statistics
  • Respect: +1197
    • View Profile
    • CouncilRoom
0

Reading the code, both the popular buys data and the supply_win data (which is the basis for the similarities) are already only own turn gains/buys.

OTOH, "$3 terminals that suck" and "$3 cards often swindled to" are basically saying the same thing.

This particular part of councilroom is reasonably independent from most of the other parts; you don't need to download all the game data or run your own database to play with it, since it grabs the summary data from supply_win.  You can investigate/explore/test your hypotheses yourself, or even just develop better clusterings if you are so inclined.  I'll happily help you get it set up on your machine if you want.
Logged

Kirian

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 7096
  • Shuffle iT Username: Kirian
  • An Unbalanced Equation
  • Respect: +9416
    • View Profile
+1

Quote
Reading the code, both the popular buys data and the supply_win data (which is the basis for the similarities) are already only own turn gains/buys.

Hm.  This appears to be true for the supply_win data, but certainly not true for the popular buys page; the latter shows an average of 1.21 gains of Curse per game...
Logged

DG

  • Governor
  • *****
  • Offline Offline
  • Posts: 4074
  • Respect: +2624
    • View Profile
0

Quote
What's the relationship between Chancellor, Fortune Teller, Smugglers, Woodcutter, and Develop?

Although these cards have little relationship when played, you might still gain them in the same way (Upgrade), open with similar partner cards on a 3/4 split, buy them in low-spending (cursed-out) games, and see them ruled out of many kingdoms by the same power terminals at cost 4 and 5. That's probably a closer fit to each other than they are to anything else.
Logged

blueblimp

  • Margrave
  • *****
  • Offline Offline
  • Posts: 2849
  • Respect: +1560
    • View Profile
0

Quote
What's the relationship between Chancellor, Fortune Teller, Smugglers, Woodcutter, and Develop?

Chancellor, Fortune Teller, and Woodcutter are all weak $3 terminal silvers, so that's something. Sometimes you want a cheap terminal silver, even if it's weak. Smugglers and Develop are both usually-bad $3 gainers. I can't explain the connection between these two groups, though.
Logged