Dominion Strategy Forum

Author Topic: More data mining: Answering Dominion questions with data  (Read 20646 times)

TheExpressicist

  • Conspirator
  • ****
  • Offline Offline
  • Posts: 200
  • Respect: +203
    • View Profile
More data mining: Answering Dominion questions with data
« on: January 30, 2015, 03:40:32 pm »
+3

Update:
There have been a few requests for additional data and I've done some more research that I wanted to share but I don't want to create yet another thread. I'll update this first post with additional data as I collect it.

Card "Strength"

This uses the data gathered from the "Individual Player Analysis" tool, from the games of the Top-20 players as ranked by Isotropish.

Preface:
Firstly, I use the term "strength" quite loosely. In layman's terms, this chart is a graph of how often a card is gained combined with how often games are won with that card. Each card is assigned a score that will always be a value between 0 and 100. A score of 100 would mean that the card is gained in 100% of the games and you win 100% of the time you gain it. Secondly, the chart uses "Adjusted Win Rate", which is just (Card Win Rate) minus (Average Win Rate). An Adjusted Win Rate of 0% means the card is completely average. (For what it's worth, the average win rate is about 65%.)

I feel the need to highlight this: there are many, many, many explanations for why a card has a LOW win%/gain%. There are far fewer explanations for why a card has a HIGH win%/gain%. So although it is safe to assume that a card with a high score is a good card, it is NOT a reasonable assumption that a card with a low score is a bad card.

To reiterate: I would be very careful about how you interpret cards at the middle-to-lower end of the scale. There's a lot more room for cause-effect here: certain cards may make sense to buy only when you're already in a losing position. This does not necessarily reflect on the strength of the card itself. 

Card Name:   SCORE   Adjusted Win%   Gain%
Butcher:   93.1   5.7%   83%
King'sCourt:   93.1   5.9%   82.2%
CandlestickMaker:   90.3   5%   79.3%
Squire:   90.2   4.6%   81.5%
Peddler:   88.4   5.5%   73.6%
Forager:   88.4   3.4%   86.1%
WanderingMinstrel:   88.3   4%   80.7%
Goons:   87.9   3.1%   89.6%
Chapel:   87.7   3.1%   88%
GrandMarket:   86.8   4.7%   73.5%
Ambassador:   86.1   2.9%   84.8%
BorderVillage:   86   2.8%   86.5%
Haggler:   83.9   5.2%   67.3%
Menagerie:   83.1   3.3%   74.7%
Hermit:   82.2   2.6%   78.4%
Conspirator:   81.9   4.1%   68.5%
Remake:   81.8   3.1%   73.3%
City:   81.3   2.9%   73.6%
Festival:   80.3   4.8%   63.9%
Witch:   79.9   2.2%   76.9%
Baker:   79.7   2.9%   71.5%
Apprentice:   78.1   3.4%   66.4%
Pawn:   77.9   4.3%   62.4%
Market:   77.6   4.5%   61.3%
Inn:   77.5   3.9%   63.3%
Plaza:   76.9   1.1%   82.6%
Hamlet:   76.8   1.1%   82.3%
NativeVillage:   76   3.9%   61.5%
BanditCamp:   75.9   4.1%   60.8%
Counterfeit:   75.9   1.5%   76.2%
Tactician:   75.7   3.6%   62.2%
FarmingVillage:   75.5   4.7%   58.5%
Cellar:   75.4   3.6%   61.9%
Fortress:   75.1   2.5%   66.7%
Wharf:   74.5   0.3%   88.9%
ThroneRoom:   74.3   2%   69.1%
Masquerade:   74.2   0.4%   87.3%
Scheme:   73.2   1.1%   74.7%
FishingVillage:   72.4   0.2%   83.4%
Ill-GottenGains:   72.2   3.9%   57.1%
MerchantGuild:   72.1   2.9%   61.3%
Remodel:   72   4.7%   54.6%
Minion:   71.7   0.3%   80.1%
Altar:   71.6   2.5%   62.3%
HuntingParty:   71.4   0.1%   82.4%
Cultist:   71.2   1%   72.5%
MarketSquare:   71.1   1%   71.5%
ScryingPool:   70   1.5%   66.8%
GreatHall:   69.8   0.7%   72.5%
MiningVillage:   68   1%   67.6%
WishingWell:   67.2   2.5%   57.2%
Vineyard:   66.8   3.4%   53.1%
JunkDealer:   66.7   -0.3%   76.3%
Courtyard:   66.3   -0.1%   73.9%
Tunnel:   65.8   0.6%   67.5%
Bridge:   64.6   0.7%   64.9%
Mint:   64.1   3.3%   50.4%
HornofPlenty:   64   5%   44.9%
JackOfAllTrades:   64   -0.3%   72%
GhostShip:   63.8   1.9%   56.5%
Apothecary:   63.6   4.6%   45.4%
Tournament:   63.3   -2%   93.9%
HuntingGrounds:   63.3   4.6%   45%
Worker'sVillage:   63.2   -1.1%   78.3%
Margrave:   63.2   0%   67.9%
Crossroads:   63.1   -0.9%   75.7%
Lighthouse:   62.6   -0.5%   71.7%
Mountebank:   62.6   -1.7%   85.7%
Jester:   62.3   3.1%   49.3%
Journeyman:   62.3   4.4%   44.4%
Quarry:   61.7   0.8%   61%
Nobles:   61.7   -1.4%   78.2%
Bazaar:   61.2   -0.3%   67.9%
Warehouse:   61.1   -1.7%   81.2%
Stables:   61.1   -0.5%   68.8%
TradeRoute:   60.8   3.3%   46.7%
Caravan:   60.5   -0.9%   71.6%
Treasury:   60.3   4.9%   40.8%
Moneylender:   60.2   2.7%   48.6%
Possession:   60   6.3%   37.3%
Haven:   60   -0.5%   67.8%
Ironworks:   59.6   0.8%   58.1%
Alchemist:   58.2   1.9%   50.4%
Salvager:   58.1   -0.4%   64.3%
Herald:   57.3   -2.1%   76.9%
Advisor:   56.9   1.9%   48.9%
SeaHag:   56.9   0.7%   55.9%
Steward:   56.9   -2.5%   79.7%
Swindler:   56.4   -2.7%   80.2%
Island:   56.3   -1.3%   68.1%
Armory:   56.1   4.8%   35.3%
Cartographer:   56.1   2.4%   45.4%
Upgrade:   56.1   -1.9%   72.6%
Beggar:   55.5   5.3%   32.7%
Scavenger:   55.3   1.9%   47.1%
ShantyTown:   54.5   -1.8%   69.6%
Stonemason:   54.3   -1.6%   67.9%
BandofMisfits:   54.2   2.3%   44%
Watchtower:   53.7   -0.7%   60.7%
Village:   53.5   -1.1%   63.4%
Herbalist:   53.5   6.4%   25.5%
YoungWitch:   53.2   1.2%   48.8%
Laboratory:   53.1   -1.7%   66.9%
Highway:   52.6   -2.2%   69.4%
Expand:   52.5   2.3%   42%
Bishop:   52   0.8%   50%
Embargo:   51.8   2.4%   40.6%
Oasis:   51.7   -1.4%   62.9%
Library:   51.3   3%   36.5%
Militia:   51.2   -0.8%   58.8%
Storeroom:   50.7   1.8%   42.3%
Vagrant:   49.9   -1.7%   62.6%
Doctor:   49.8   -0.2%   53.4%
Lookout:   49.8   0.4%   50.1%
Contraband:   49.7   6.7%   9.6%
Urchin:   49.6   -4.3%   77.6%
Harvest:   49.3   6.5%   8.6%
Baron:   49.2   2%   39.3%
Ironmonger:   48.7   -6.2%   84.7%
Rebuild:   48.6   -2.8%   67.6%
SpiceMerchant:   48.5   -2.9%   68%
Sage:   48.3   -1.1%   57.2%
MerchantShip:   48.2   3.4%   28.8%
Mandarin:   48   5.2%   13.7%
Outpost:   47.5   3.3%   27.8%
Rabble:   47.5   0.6%   46.3%
Mystic:   47.1   1.6%   39.2%
Farmland:   46.7   -0.5%   52.3%
Procession:   45.5   1.6%   37.1%
Familiar:   45   -1.2%   54.1%
Monument:   44.9   -1.6%   56.3%
Develop:   44.8   1.6%   36%
Workshop:   44.4   1.8%   34.2%
Torturer:   43.9   -2%   57.4%
Fool'sGold:   43.1   -2.7%   60%
Catacombs:   42.7   -0.8%   49.6%
Duchess:   41.8   0.7%   38.6%
PearlDiver:   41.8   -1.3%   51.3%
TradingPost:   41.7   0.4%   40.6%
Bank:   41.5   0.8%   37.1%
Gardens:   40.2   -3%   58.3%
Explorer:   40.1   2.2%   20.3%
Smithy:   40   -1.4%   50%
Smugglers:   39.6   -1.2%   48.5%
HorseTraders:   39.5   -3.5%   59.5%
Fairgrounds:   38.7   -2%   51.5%
University:   37.6   -5.2%   62.6%
Rogue:   37.5   0.4%   34.9%
Count:   36.9   -5.7%   62.9%
Feast:   36.6   0.6%   31.2%
Marauder:   35.5   -3.6%   55.2%
CouncilRoom:   35.2   -1.8%   46.7%
SilkRoad:   33.8   -3%   51.2%
Harem:   32.9   -2.2%   46.3%
Taxman:   32.7   0.7%   22.2%
Soothsayer:   32.7   -4.5%   55.1%
FortuneTeller:   32.4   0.3%   26.6%
CountingHouse:   30.7   0.8%   9.1%
Masterpiece:   29.6   -0.1%   25.3%
Embassy:   29.4   -2.8%   45.3%
Trader:   29.2   -1.4%   36.9%
Hoard:   28.9   -4.9%   52.1%
Golem:   26.2   -2.5%   39.6%
Duke:   26.1   -4.5%   47.8%
Moat:   25.8   -3.3%   43.5%
Feodum:   25.3   -2.4%   38.2%
DeathCart:   24.9   -1.2%   28.2%
Philosopher'sStone:   24   -0.4%   9.9%
Oracle:   23.5   -4.1%   43.6%
Venture:   23.1   -1.6%   28.6%
Woodcutter:   22.8   -1.3%   24%
Vault:   22.1   -5.2%   45.2%
PoorHouse:   21.7   -2.6%   34%
NomadCamp:   21.6   -4.2%   41.6%
RoyalSeal:   20.6   -1.3%   17.9%
Navigator:   19.1   -1.7%   18.7%
Loan:   19.1   -3.8%   36.9%
Forge:   16.9   -4.8%   37.5%
Scout:   16.8   -1.8%   8.6%
Chancellor:   15.7   -2.3%   16.3%
Cutpurse:   14.7   -4.7%   34.2%
Spy:   14.5   -3.2%   24.8%
PirateShip:   13.7   -2.5%   12%
Mine:   12.1   -2.9%   11.5%
SecretChamber:   11.3   -4.1%   24.2%
Rats:   9.9   -8.4%   33.2%
Talisman:   9.8   -4.8%   25.3%
NobleBrigand:   9   -4.9%   23.6%
Pillage:   8.1   -5.1%   22%
Graverobber:   7.9   -9.2%   30.2%
Tribute:   5.7   -6%   19.9%
Coppersmith:   4.7   -5.2%   7.5%
Saboteur:   3.7   -8.4%   19.5%
Transmute:   3.6   -5.8%   6.5%
Adventurer:   3.6   -5.8%   6.1%
Bureaucrat:   3.5   -6.3%   11.4%
TreasureMap:   3.2   -8.5%   17.5%
Cache:   1.2   -10.3%   8.9%
Thief:   1.1   -17.2%   8.5%
   

*Score is calculated as follows:
(CDF[((Card Win%)-(Average Win%))/(Win% StDev)] + CDF[((Card Gain%)-(Average Gain%))/(Gain% StDev)]) * 50

In other words: I convert the card's Win% and Gain% to Z-scores, then convert each Z-score to a cumulative probability using the normal CDF. I add the cumulative probabilities for Wins and Gains together. This results in a number on a scale of 0-2. I multiply that by 50 to get a scale of 0-100.

In layman's terms: it's roughly as if the cards were all lined up and ranked according to their gain% and win% (e.g. the most gained card would be 1, the least would be 200something), and you added those two ranks together.
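
For anyone who wants to reproduce the idea, here is a minimal Python sketch of the score formula above. The function name `strength_score` is only for illustration, and two things are assumptions: the averages and standard deviations are taken over whatever rows are passed in, and population SD is used (the post doesn't say which kind was used). Feeding in Adjusted Win% instead of raw Win% gives the same Z-scores, since subtracting a constant doesn't change them. With only three sample rows, the output will not match the scores in the full table.

```python
from statistics import NormalDist, mean, pstdev

def strength_score(win_pct, gain_pct):
    """Return one 0-100 score per card from parallel lists of win% and gain%."""
    std_normal = NormalDist()  # standard normal distribution, used for the CDF step
    win_mu, win_sd = mean(win_pct), pstdev(win_pct)      # assumption: population SD
    gain_mu, gain_sd = mean(gain_pct), pstdev(gain_pct)
    scores = []
    for w, g in zip(win_pct, gain_pct):
        win_cdf = std_normal.cdf((w - win_mu) / win_sd)      # Z-score, then normal CDF
        gain_cdf = std_normal.cdf((g - gain_mu) / gain_sd)
        scores.append((win_cdf + gain_cdf) * 50)             # 0-2 scale -> 0-100 scale
    return scores

# Three rows from the table above: Butcher, Village, Thief
names = ["Butcher", "Village", "Thief"]
adj_win = [5.7, -1.1, -17.2]
gain = [83.0, 63.4, 8.5]
scores = strength_score(adj_win, gain)
for name, s in zip(names, scores):
    print(f"{name}: {s:.1f}")
```

The ordering of the three cards comes out the same as in the table, but the absolute numbers depend heavily on which cards are included in the mean and SD.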

------------------------------------------------------------------------------------------------------------------------------

Win Rates for cards when they are only gained by one player. *Note: The average win rate for top-20 players is 65%. Naturally, the average win rate for all players is 50%.
Card Name:   All Players%   Top-20 Players Only%
Colony:   78.6%   93.5%
Goons:   68.2%   87.5%
Butcher:   67.4%   85.3%
Province:   66.7%   85.3%
Minion:   66.6%   77.1%
Vineyard:   65.3%   80.2%
King'sCourt:   65.3%   77.6%
Witch:   62.5%   77.7%
Baker:   61.5%   74.4%
Journeyman:   61.3%   76.3%
GrandMarket:   60.8%   73.2%
Masterpiece:   60.6%   67.6%
Mountebank:   60.2%   80.7%
Peddler:   59.9%   79.6%
Platinum:   59.3%   81.8%
BorderVillage:   58.1%   80.4%
Wharf:   57.9%   72.7%
Margrave:   57.8%   75.3%
WishingWell:   57.5%   69%
Catacombs:   57.2%   70.6%
Governor:   57.1%   73.9%
MerchantGuild:   57%   74.2%
Beggar:   56.8%   75.6%
GhostShip:   56.2%   77.1%
Explorer:   55.9%   72.5%
HornofPlenty:   55.8%   76.3%
Mint:   55.8%   78.8%
Library:   55.8%   69.9%
HuntingParty:   55.7%   70.3%
HuntingGrounds:   54.7%   78.6%
Ill-GottenGains:   54.5%   66.3%
MerchantShip:   54.4%   69.2%
Estate:   54.1%   69.9%
Laboratory:   54.1%   75.9%
MiningVillage:   53.5%   70.4%
Bazaar:   53.3%   72%
Masquerade:   53.2%   76.3%
Rogue:   52.9%   75.4%
Warehouse:   52.8%   61.6%
Pawn:   52.8%   73.3%
Festival:   52.8%   75.6%
Nobles:   52.8%   71.2%
Farmland:   52.8%   72%
ScryingPool:   52.7%   72.1%
Armory:   52.7%   70.6%
Apothecary:   52.6%   73%
Stonemason:   52.6%   63.2%
Possession:   52.5%   73.2%
Crossroads:   52.3%   68.8%
CountingHouse:   52.2%   69.7%
Menagerie:   52.1%   71.4%
JackOfAllTrades:   52%   63.4%
Copper:   52%   67.8%
CandlestickMaker:   51.9%   66.2%
Embassy:   51.7%   68.5%
WanderingMinstrel:   51.6%   66.7%
Hamlet:   51.4%   64.8%
Apprentice:   51.3%   71.6%
Altar:   51.3%   68%
BanditCamp:   51.2%   66.7%
Chancellor:   51%   69.6%
Familiar:   51%   74.7%
Scheme:   50.9%   69.3%
Mystic:   50.9%   65.5%
Hermit:   50.7%   64.3%
Scavenger:   50.6%   66.7%
Conspirator:   50.5%   72.2%
Quarry:   50.5%   67.4%
Highway:   50.5%   68.3%
Oracle:   50.3%   57.6%
HorseTraders:   50.3%   62.5%
Vault:   50%   60.2%
Fairgrounds:   50%   61.5%
ThroneRoom:   50%   64.8%
Harem:   49.9%   67.1%
Bridge:   49.9%   74.3%
Herald:   49.9%   62.9%
Counterfeit:   49.8%   70%
Workshop:   49.8%   62.3%
PearlDiver:   49.8%   65.4%
Worker'sVillage:   49.8%   66.7%
PoorHouse:   49.7%   62.7%
TradingPost:   49.7%   68.7%
JunkDealer:   49.7%   66.7%
Market:   49.7%   63.6%
City:   49.5%   80.9%
GreatHall:   49.5%   61.9%
FishingVillage:   49.5%   68.1%
Duchess:   49.4%   65.3%
Watchtower:   49.3%   68.1%
Inn:   49.3%   72%
Duke:   49.1%   64.6%
Treasury:   49%   71.2%
Contraband:   49%   75.8%
Rabble:   48.9%   72.9%
Fool'sGold:   48.8%   64.1%
Venture:   48.7%   67.7%
FarmingVillage:   48.7%   74.1%
Herbalist:   48.6%   72.1%
Ironmonger:   48.6%   62.5%
Stables:   48.6%   62.7%
Steward:   48.6%   54.8%
Courtyard:   48.6%   64.7%
Salvager:   48.5%   72.6%
Tournament:   48.5%   80%
Torturer:   48.4%   67.9%
Fortress:   48.3%   66.7%
Ambassador:   48.2%   62.5%
Sage:   48.1%   61.6%
SeaHag:   48%   63.1%
Squire:   47.9%   69.8%
Village:   47.9%   62.3%
Moneylender:   47.9%   75.6%
RoyalSeal:   47.8%   60%
Chapel:   47.8%   63.2%
Haggler:   47.8%   66.2%
BandofMisfits:   47.8%   65.8%
Feast:   47.7%   64.8%
Plaza:   47.7%   62.7%
Count:   47.6%   56.8%
Duchy:   47.6%   62.7%
Haven:   47.5%   58.1%
Swindler:   47.4%   65.2%
Woodcutter:   47.4%   66.1%
Gardens:   47.2%   62.2%
Remodel:   47.2%   65.4%
Mine:   46.9%   61.5%
BlackMarket:   46.9%   69.1%
Hoard:   46.9%   61.7%
Adventurer:   46.8%   73.9%
Gold:   46.8%   66.9%
ShantyTown:   46.7%   62.1%
Embargo:   46.6%   68.9%
Outpost:   46.4%   69.2%
Bank:   46.3%   65.3%
Urchin:   46.2%   57.6%
Smithy:   46.1%   66.4%
Cultist:   46%   72.1%
Lighthouse:   46%   58.8%
Procession:   45.9%   63.5%
NativeVillage:   45.9%   58%
Remake:   45.8%   70.6%
CouncilRoom:   45.8%   59.1%
Doctor:   45.8%   59%
NomadCamp:   45.6%   61.2%
Loan:   45.6%   62.1%
Mandarin:   45.5%   68%
Rebuild:   45.4%   61.5%
Vagrant:   45.4%   59.5%
Tactician:   44.9%   63%
Cartographer:   44.9%   62.8%
Expand:   44.9%   64.1%
TradeRoute:   44.8%   63.9%
Ironworks:   44.7%   62%
FortuneTeller:   44.7%   55.3%
Militia:   44.6%   57.5%
Marauder:   44.5%   68.3%
Baron:   44.4%   63.1%
Navigator:   44.3%   56.3%
Cellar:   44.1%   68.1%
Monument:   44%   71%
Bureaucrat:   43.9%   58.3%
Upgrade:   43.8%   57.1%
Taxman:   43.6%   71.2%
Potion:   43.6%   64.2%
Storeroom:   43.6%   59.7%
Moat:   43.5%   66.7%
Graverobber:   43.3%   51.9%
MarketSquare:   43.2%   60.5%
Caravan:   43.2%   69.2%
Island:   43%   57.9%
Lookout:   42.7%   64.6%
Smugglers:   42.6%   61.6%
Cutpurse:   42.6%   56.5%
Cache:   42.5%   46.4%
Harvest:   42.2%   71.4%
YoungWitch:   42%   64.7%
Advisor:   41.6%   64.4%
Alchemist:   41.6%   71.4%
Silver:   41.5%   54.8%
Golem:   41.5%   58.9%
Soothsayer:   41.5%   57.4%
TreasureMap:   41.4%   54%
Bishop:   41.3%   61%
Curse:   41.3%   54.3%
Feodum:   41.2%   55.4%
Trader:   41%   66.7%
DeathCart:   41%   56.8%
Develop:   40.9%   63%
Tunnel:   40.8%   62.5%
SpiceMerchant:   40.7%   63.2%
Pillage:   40.7%   60.3%
Rats:   40.5%   46.5%
Oasis:   40.4%   56.2%
Spy:   40.3%   53.1%
SilkRoad:   40.2%   43.3%
Philosopher'sStone:   39.3%   53.1%
SecretChamber:   38.8%   50%
Forager:   38.7%   52%
Forge:   38.2%   57.6%
Jester:   38%   59.4%
NobleBrigand:   37.7%   47.3%
Talisman:   37.5%   53.3%
Transmute:   36%   60.9%
Saboteur:   35.9%   50.9%
Thief:   35.7%   38.9%
Scout:   35.6%   62.9%
University:   33.1%   41.1%
Coppersmith:   32.5%   54.2%
Tribute:   31.1%   54.1%
PirateShip:   24.3%   46.4%

------------------------------------------------------------------------------------------------------------------------------

Impact of First-Shuffle Luck on Win %
In other words, how big of an impact does missing one or both of your T1/T2 purchases before the second reshuffle have on the Win % of the Top-20 players? Measured in "Adjusted Win %"; see the Preface above for an explanation.

When opening Action/Action (63.5% of the time):
Hit Both Actions: +4%
Hit One Action: -3%
Hit Neither Action: -11%

When opening Action/Treasure (32% of the time):
Hit Action + Treasure: +1.5%
Hit Treasure Only: -11%
Hit Action Only: -0.5%
Hit Neither: -11%

When opening Treasure/Treasure (4.5% of the time):
Hit Both Treasures: +2%
Hit 1 Treasure: -12%
Hit Neither Treasure: -40%

------------------------------------------------------------------------------------------------------------------------------
Top-20 players' "adjusted win rate" compared to when they played their first $5 card.

T3/T4/T5: +4%
T6/T7: +0%
T8/T9: +0%
T10+: -3%

Of course, this is across the board and doesn't target specific high-value $5 cards like Witch or Mountebank.


------------------------------------------------------------------------------------------------------------------------------

Adjusted Win Rate of 5/2 vs. 4/3:
5/2: +3.5%
4/3: -1%
 
« Last Edit: February 02, 2015, 07:07:08 am by TheExpressicist »

werothegreat

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 8172
  • Shuffle iT Username: werothegreat
  • Let me tell you a secret...
  • Respect: +9625
    • View Profile
Re: More data mining: Card "strength"
« Reply #1 on: January 30, 2015, 03:48:08 pm »
0

Huh.  I don't think I've ever actually bought a Butcher.

jsh357

  • Margrave
  • *****
  • Offline Offline
  • Posts: 2577
  • Shuffle iT Username: jsh357
  • Respect: +4340
    • View Profile
    • JSH Gaming: Original games
Re: More data mining: Card "strength"
« Reply #2 on: January 30, 2015, 03:56:47 pm »
+8

Butcher is pretty powerful, you should work on that.

LastFootnote

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 7495
  • Shuffle iT Username: LastFootnote
  • Respect: +10721
    • View Profile
Re: More data mining: Card "strength"
« Reply #3 on: January 30, 2015, 04:05:28 pm »
0

I'm pretty shocked that Candlestick Maker is so high.

JW

  • Jester
  • *****
  • Offline Offline
  • Posts: 968
  • Shuffle iT Username: JW
  • Respect: +1781
    • View Profile
Re: More data mining: Card "strength"
« Reply #4 on: January 30, 2015, 04:19:50 pm »
+1

I'm pretty shocked that Candlestick Maker is so high.

Pawn stands out too. I'd attribute this to the fact that this is a data set of Top 20 players who win most of their games and are better than most of their opponents. In games where Candlestick Maker and Pawn are gained, it is more likely to be an engine-friendly board on which higher skill players excel (Candlestick Maker is not a strong Big Money card to take over Silver, for example, but is a good source of +buy for an engine). These also have a high gain rate due to being nonterminals costing $2 (for example, the lowly Pearl Diver is bought on 51% of boards, vs. 62% for Pawn and 79% for Candlestick Maker).

Quote
CandlestickMaker:   90.3   5%   79.3%
Pawn:   77.9   4.3%   62.4%

Similarly, a card like Beggar has a very high Adjusted Win % of 5.3% because the sloggy boards where the top players buy it are complicated, but Beggar isn't bought that often because it's not a strong card most of the time. Possession is another card where if it's bought, that's an indication of a complex board and a high adjusted win %, but top players buy it on only 37% of boards.

Quote
Possession:   60   6.3%   37.3%
Beggar:   55.5   5.3%   32.7%

microman

  • Moneylender
  • ****
  • Offline Offline
  • Posts: 164
  • Stop "Making Fun" of me!
  • Respect: +67
    • View Profile
Re: More data mining: Card "strength"
« Reply #5 on: January 30, 2015, 04:20:04 pm »
0

I'm pretty shocked that Candlestick Maker is so high.
I actually am not. Its cheapness, +buy, and +action make it very spammable, and that makes it very deadly as a 3-pile ender where a 3-pile game maybe wasn't an option. Not to mention its most important function, those very useful coin tokens. It's a must-buy for me almost every time!

-Stef-

  • 2012 & 2016 DS Champion
  • *
  • Offline Offline
  • Posts: 1574
  • Respect: +4419
    • View Profile
Re: More data mining: Card "strength"
« Reply #6 on: January 30, 2015, 04:34:56 pm »
+7

I'm pretty shocked that Candlestick Maker is so high.
I actually am not. Its cheapness, +buy, and +action make it very spammable, and that makes it very deadly as a 3-pile ender where a 3-pile game maybe wasn't an option. Not to mention its most important function, those very useful coin tokens. It's a must-buy for me almost every time!

Candlestick maker is very not-spammable.

The first one is great, the second one ok, the 3rd one bad and from there on they're all terrible.
I win a lot of games because my opponent just keeps buying them.
With vineyards sure, but otherwise no thanks.

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4381
    • View Profile
    • WanderingWinder YouTube Page
Re: More data mining: Card "strength"
« Reply #7 on: January 30, 2015, 04:39:04 pm »
0

How are you calculating the Standard Deviation? It's the St Dev across... what?

TheExpressicist

Re: More data mining: Card "strength"
« Reply #8 on: January 30, 2015, 04:54:04 pm »
0

How are you calculating the Standard Deviation? It's the St Dev across... what?

It's the SD of the "Gain %" column. Likewise with "Win %" column.

in other words: StDev{ (Card A Games Bought)/(Card A Games Available), (Card B Games Bought)/(Card B Games Available), (Card C Games Bought)/(Card C Games Available), .... etc.}
or for Win%: StDev{ (Card A Games Won)/(Card A Games Bought), (Card B Games Won)/(Card B Games Bought), (Card C Games Won)/(Card C Games Bought), .... etc.}
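
A minimal Python sketch of that calculation, to show that the result is a single number per column rather than one per card. The card names and counts below are made-up placeholders, not real data, and using population SD (rather than sample SD) is an assumption:

```python
from statistics import pstdev

# Hypothetical counts, for illustration only: card -> (games bought, games available)
counts = {
    "Card A": (830, 1000),
    "Card B": (500, 1000),
    "Card C": (85, 1000),
}

# One gain rate per card, i.e. the "Gain %" column
gain_rates = [bought / available for bought, available in counts.values()]

# A single standard deviation for the whole column (assumption: population SD)
gain_sd = pstdev(gain_rates)
print(round(gain_sd, 3))
```

The Win% version is the same thing with (games won)/(games bought) as the per-card ratio.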

WanderingWinder

Re: More data mining: Card "strength"
« Reply #9 on: January 30, 2015, 05:13:46 pm »
0

How are you calculating the Standard Deviation? It's the St Dev across... what?

It's the SD of the "Gain %" column. Likewise with "Win %" column.

in other words: StDev{ (Card A Games Bought)/(Card A Games Available), (Card B Games Bought)/(Card B Games Available), (Card C Games Bought)/(Card C Games Available), .... etc.}
or for Win%: StDev{ (Card A Games Won)/(Card A Games Bought), (Card B Games Won)/(Card B Games Bought), (Card C Games Won)/(Card C Games Bought), .... etc.}

So the standard deviation between cards? It's the same for every single card then?

TheExpressicist

Re: More data mining: Card "strength"
« Reply #10 on: January 30, 2015, 05:42:31 pm »
+1

So the standard deviation between cards? It's the same for every single card then?

Not sure what you mean by it being the same for every single card. It's the standard deviation of the set of all cards.

TheExpressicist

Re: More data mining: Card "strength"
« Reply #11 on: January 30, 2015, 06:11:19 pm »
0

PS. I tried to include as many disclaimers as possible to prevent people from misinterpreting the data (e.g. BUY MORE CANDLESTICK MAKERS). I'm very aware of the limitations of 1. trying to objectively quantify a subjective concept like "strength". 2. Trying to model a complex system with a few simple variables. 3. Mixing abstract and concrete numbers. 4. Creating custom "metrics". 5. Trying to combine two fundamentally different measurements, etc.

But, all that said, I deliberately made the "Score" rating as abstract as possible so as to try to prevent people from trying to use those numbers to draw spurious, concrete conclusions.

WanderingWinder

Re: More data mining: Card "strength"
« Reply #12 on: January 30, 2015, 06:56:21 pm »
0

Not sure what you mean by it being the same for every single card. It's the standard deviation of the set of all cards.
Ok, cool. My point is that when you are calculating your 'score' metric, you divide by a st dev, then take the Normal CDF. But in that formula, the st dev you are dividing by is the same for all cards.


Edit: Which isn't inherently wrong, just I don't know that this is what you want to do/something which is really meaningful. Mostly still just trying to wrap my head around what it really means.
« Last Edit: January 30, 2015, 06:59:46 pm by WanderingWinder »

JW

Re: More data mining: Card "strength"
« Reply #13 on: January 30, 2015, 07:13:13 pm »
0

Pawn stands out too. I'd attribute this to the fact that this is a data set of Top 20 players who win most of their games and are better than most of their opponents. In games where Candlestick Maker and Pawn are gained, it is more likely to be an engine-friendly board on which higher skill players excel (Candlestick Maker is not a strong Big Money card to take over Silver, for example, but is a good source of +buy for an engine). These also have a high gain rate due to being nonterminals costing $2 (for example, the lowly Pearl Diver is bought on 51% of boards, vs. 62% for Pawn and 79% for Candlestick Maker).

Candlestick maker is very not-spammable.

The first one is great, the second one ok, the 3rd one bad and from there on they're all terrible.
I win a lot of games because my opponent just keeps buying them.

Or as Stef alludes to, another explanation for why Candlestick Maker is so high up is that the top players know not to buy too many, but everyone else buys far too many Candlestick Makers and thus loses to the top players disproportionately on Candlestick Maker boards (in which the top player buys a judicious number of Candlestick Makers). This could help explain Squire's stats too. Just because it's easy to use Squire to get two more Squire doesn't mean that you want so many non-drawing cards.

Quote
Squire:   90.2   4.6%   81.5%

TheExpressicist

Re: More data mining: Card "strength"
« Reply #14 on: January 30, 2015, 07:21:13 pm »
0

Not sure what you mean by it being the same for every single card. It's the standard deviation of the set of all cards.
Ok, cool. My point is that when you are calculating your 'score' metric, you divide by a st dev, then take the Normal CDF. But in that formula, the st dev you are dividing by is the same for all cards.


Edit: Which isn't inherently wrong, just I don't know that this is what you want to do/something which is really meaningful. Mostly still just trying to wrap my head around what it really means.

The effect I'm approximating is similar to if you ranked each card, then added the two ranks together. That actually would have been a much easier way to do it. Such is the statistician's curse: taking the most complex and circuitous route to explaining something that common sense could tell you.

My goal was to create a metric that is abstract enough to avoid people drawing incorrect conclusions like "X% of a player's skill can be explained by Y" as we saw in the other data mining thread. I definitely sacrificed some statistical rigor but I think that's okay since we are already dealing with a lot of fuzziness.
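
For comparison, the simpler rank-based version could be sketched like this in Python. This is the "easier way" described above, not the method actually used for the table; the helper name `rank_desc` is made up for illustration, ties are not handled specially, and the five rows are copied from the first post:

```python
def rank_desc(values):
    """Rank 1 = largest value. Ties are broken by position (no special handling)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

cards = ["Butcher", "Goons", "Possession", "Smithy", "Thief"]
adj_win = [5.7, 3.1, 6.3, -1.4, -17.2]   # Adjusted Win% from the table
gain = [83.0, 89.6, 37.3, 50.0, 8.5]     # Gain% from the table

# Lower rank sum = "stronger" under this scheme
rank_sum = [w + g for w, g in zip(rank_desc(adj_win), rank_desc(gain))]
for card, total in sorted(zip(cards, rank_sum), key=lambda pair: pair[1]):
    print(card, total)
```

Note that here a lower number is better, which is the opposite direction from the 0-100 score.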

Throwaway_bicycling

  • Moneylender
  • ****
  • Offline Offline
  • Posts: 151
  • Respect: +140
    • View Profile
Re: More data mining: Card "strength"
« Reply #15 on: January 30, 2015, 08:04:57 pm »
+1

Not sure what you mean by it being the same for every single card. It's the standard deviation of the set of all cards.

I am clearly not Wandering Winder, but I think I share his confusion. What I took his question to be is that the denominator of the first term contributing to score and the second term contributing to score (the SD of the relevant columns) is the same for every card that has a score. And that seems to be the case. So...you don't really need it? I mean, you're taking the CDF for both terms, which varies between 0 and 1, which is how you get a 0-100 scale after you multiply by 50. Normalizing first doesn't really get you anything. Unless I'm missing something.

I also see a deeper confusion here, which is the distinction between how strong a card is by itself and how much it allows a strong player to amplify his or her skill. So Rebuild has a pretty mediocre win_rate here, but it's hardly a weak card; the problem is that there is only so much you can do to eke out extra wins against competent opposition, since shuffle luck is clearly important. Similarly, Swindler looks like a bad card here by win_rate...but that's because your opponent can swindle, too, and is similarly swingy. Also Highway: a very good card, but even a nimrod like me can use it effectively (it was key on the one board where I beat a Top 20 player). Indeed, for the strength of cards alone, in many cases, negative means the card itself is so strong that, in combination with shuffle luck even a top-rated player is going to have a hard time coming out ahead. Similarly, cards that excel in BM games are not going to look good here because it is easier for most to play BM well.

What is clear here is that cards that are best in well-constructed engines will look really good on this list because the strongest players are waaaaay better at constructing engines than even pretty good players, so they actually amplify the value of those cards.

And actually, the really interesting cards are the ones that are not bought much by strong players, but, when they are, impart impressive benefits. Contraband is +6.7% and Harvest is +6.5%, which I guess is a 71% overall win rate or so.
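
To make that arithmetic explicit, here is a tiny Python sketch. It assumes the roughly 65% Top-20 average quoted in the first post, and the helper name `raw_win_rate` is just for illustration:

```python
AVERAGE_WIN_RATE = 65.0  # approximate Top-20 average win% quoted in the first post

def raw_win_rate(adjusted_win_pct):
    """Undo the adjustment: raw win% = average win% + adjusted win%."""
    return AVERAGE_WIN_RATE + adjusted_win_pct

for card, adj in [("Contraband", 6.7), ("Harvest", 6.5)]:
    print(f"{card}: {raw_win_rate(adj):.1f}%")
```

That works out to about 71.7% for Contraband and 71.5% for Harvest, subject to however rough the 65% figure is.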

That said, I think there are a few cards here that might be legitimately traps for better players, at least right now. I have personally beaten stronger players more often than not when they indulge in Golem, for example. If I am right, the -2.5% performance here mostly comes from games where the strong player buys Golem, the weaker one ignores it, and capitalizes on the slowness; maybe not enough to win a lot, but more than expected. Graverobber is another card that I think sometimes looks better on paper than in your deck. Also maybe Forge, although I guess that could be bought a lot in high junk games with not much other trashing, which would impede engine building, which is the Strong player's comparative advantage.

And all those Scout buys *must* come from really sloggy games where desperation sets in. :-)

WanderingWinder

Re: More data mining: Card "strength"
« Reply #16 on: January 30, 2015, 08:11:14 pm »
0

I am clearly not Wandering Winder, but I think I share his confusion. What I took his question to be is that the denominator of the first term contributing to score and the second term contributing to score (the SD of the relevant columns) is the same for every card that has a score. And that seems to be the case. So...you don't really need it? I mean, you're taking the CDF for both terms, which varies between 0 and 1, which is how you get a 0-100 scale after you multiply by 50. Normalizing first doesn't really get you anything. Unless I'm missing something.
You're missing something. He's dividing before taking the Normal CDF.

Throwaway_bicycling

Re: More data mining: Card "strength"
« Reply #17 on: January 30, 2015, 08:49:15 pm »
0

I am clearly not Wandering Winder, but I think I share his confusion. What I took his question to be is that the denominator of the first term contributing to score and the second term contributing to score (the SD of the relevant columns) is the same for every card that has a score. And that seems to be the case. So...you don't really need it? I mean, you're taking the CDF for both terms, which varies between 0 and 1, which is how you get a 0-100 scale after you multiply by 50. Normalizing first doesn't really get you anything. Unless I'm missing something.
You're missing something. He's dividing before taking the Normal CDF.

Yes, I saw that like three seconds after I posted. A victim of my own R coding practices of yore, where I generally used ecdf (the empirical CDF). And then I saw TheExpressicist's post that ranks would be basically as good, and I agree there.

But I'm still not sure what you gain from making a single index here. Especially when we are talking about *card* strength.

Imagine a card called "Winner" that costs $0 and allows you to win when played. Everybody would gain it like crazy (100% gain, for the purposes of this thread) and it would be (nearly?) a total crap shoot to win with it, so I guess it would be like -15% or something in this scheme and its score might even be somewhere south of Rebuild's.
Logged

TheExpressicist

  • Conspirator
  • ****
  • Offline Offline
  • Posts: 200
  • Respect: +203
    • View Profile
Re: More data mining: Card "strength"
« Reply #18 on: January 30, 2015, 09:38:04 pm »
+1

Incidentally, if you take the "Gain % Rank" and subtract the "Win % Rank" you get a pretty reasonable illustration of the "swingiest" cards:

1. Ironmonger (?? I don't quite get this one.)
2. Urchin (Makes sense; whoever collides their Urchins first has a huge advantage).
3. Tournament (duh.)
4. Swindler (also duh.)
5. Mountebank (also also duh.)

The justification being: these are cards powerful enough to justify top-tier players buying them more often than not, but also brainless enough that anyone can play them and have a decent shot at winning.

IN OTHER WORDS: WHEN YOU PLAY GOOD PEOPLE BUY ALL THE URCHINS AND SWINDLERSSSS
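
A minimal Python sketch of the rank subtraction, using made-up cards and rates. The thread does not say which direction the ranks run, so this assumes ascending ranks (the highest rate gets the largest rank number); under that assumption a large positive difference means "gained a lot, but wins comparatively little".

```python
# Hypothetical rates for illustration only (not the thread's data).
cards = {
    "Card A": {"gain": 0.88, "win": 0.59},
    "Card B": {"gain": 0.40, "win": 0.72},
    "Card C": {"gain": 0.65, "win": 0.66},
    "Card D": {"gain": 0.75, "win": 0.61},
}

def ascending_ranks(values):
    """Rank 1 = lowest value, rank N = highest value (ties are not handled)."""
    ordered = sorted(values, key=values.get)
    return {name: position for position, name in enumerate(ordered, start=1)}

gain_rank = ascending_ranks({name: c["gain"] for name, c in cards.items()})
win_rank = ascending_ranks({name: c["win"] for name, c in cards.items()})

# Large positive difference: gained often relative to how often it wins.
swing = {name: gain_rank[name] - win_rank[name] for name in cards}
swingiest = sorted(swing, key=swing.get, reverse=True)
print(swingiest)
```

If the ranks run the other way (rank 1 = highest rate), the sign flips and the swingiest cards are the most negative differences instead.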


« Last Edit: January 30, 2015, 09:59:27 pm by TheExpressicist »
Logged

c4master

  • Moneylender
  • ****
  • Offline Offline
  • Posts: 167
  • Respect: +56
    • View Profile
Re: More data mining: Card "strength"
« Reply #19 on: January 31, 2015, 08:38:53 am »
0

I'm pretty shocked that Candlestick Maker is so high.
I actually am not. Its cheapness, +buy, and +action make it very spammable, and that makes it very deadly as a 3-pile ender where a 3-pile game maybe wasn't an option. Not to mention its most important function, those very useful coin tokens. It's a must-buy for me almost every time!

Candlestick Maker is very not-spammable.

The first one is great, the second one ok, the 3rd one bad and from there on they're all terrible.
I win a lot of games because my opponent just keeps buying them.
With vineyards sure, but otherwise no thanks.

CM enables, or rather supports, a lot of double-Tactician boards.

It's also great with draw-to-X engines, although I would be careful about spamming them here. You can still get your share of 4-6 CMs, but you shouldn't buy all of them, so that you don't stall with a hand full of CMs.

At least, that's my experience.

-----------

Incidentally, if you take the "Gain % Rank" and subtract the "Win % Rank" you get a pretty reasonable illustration of the "swingiest" cards:

1. Ironmonger (?? I don't quite get this one.)
2. Urchin (Makes sense; whoever collides their Urchins first has a huge advantage).
3. Tournament (duh.)
4. Swindler (also duh.)
5. Mountebank (also also duh.)

The justification being: these are cards powerful enough to justify top-tier players buying them more often than not, but also brainless enough that anyone can play them and have a decent shot at winning.

IN OTHER WORDS: WHEN YOU PLAY GOOD PEOPLE BUY ALL THE URCHINS AND SWINDLERSSSS

Ironmonger is just almost always a good card. It's a Village in a thinned engine deck. It's more than a Peddler in almost any other kind of deck because of its discard ability. It's a good opener.

I don't think it's a card that signals any specific strategy. So I guess its win rate should be about average, right?

edit: No, I'm wrong. It has a -6.2% win rate, but more than an 88% gain rate. That's really strange. Maybe it's just overrated even by the top players?
« Last Edit: January 31, 2015, 08:43:45 am by c4master »
Logged

-Stef-

  • 2012 & 2016 DS Champion
  • *
  • Offline Offline
  • Posts: 1574
  • Respect: +4419
    • View Profile
Re: More data mining: Card "strength"
« Reply #20 on: January 31, 2015, 12:02:22 pm »
0

Ironmonger is just almost always a good card. It's a Village in a thinned engine deck. It's more than a Peddler in almost any other kind of deck because of its discard ability. It's a good opener.

I don't think it's a card that signals any specific strategy. So I guess its win rate should be about average, right?

edit: No, I'm wrong. It has a -6.2% win rate, but more than an 88% gain rate. That's really strange. Maybe it's just overrated even by the top players?

Suppose we're playing a board with Mountebank and Ironmonger. I open with Ironmonger and you don't. Most likely scenario is that I'm way behind now.

Not saying that that is the reason, but it is really tricky to draw any conclusions at all based on these kinds of numbers.
Logged
Join the Dominion League!

TheExpressicist

  • Conspirator
  • ****
  • Offline Offline
  • Posts: 200
  • Respect: +203
    • View Profile
Re: More data mining: Card "strength"
« Reply #21 on: January 31, 2015, 12:14:59 pm »
+1

Blaah. That post about "swingiest cards" was supposed to be a new post. I guess I edited over my old post. I'll try to do the condensed version:

How are you calculating the Standard Deviation? It's the St Dev across... what?

It's the SD of the "Gain %" column. Likewise with "Win %" column.

in other words: StDev{ (Card A Games Bought)/(Card A Games Available), (Card B Games Bought)/(Card B Games Available), (Card C Games Bought)/(Card C Games Available), .... etc.}
or for Win%: StDev{ (Card A Games Won)/(Card A Games Bought), (Card B Games Won)/(Card B Games Bought), (Card C Games Won)/(Card C Games Bought), .... etc.}

So the standard deviation between cards? It's the same for every single card then?

Not sure what you mean by it being the same for every single card. It's the standard deviation of the set of all cards.

I am clearly not Wandering Winder, but I think I share his confusion. What I took his question to be is that the denominator of the first term contributing to score and the second term contributing to score (the SD of the relevant columns) is the same for every card that has a score. And that seems to be the case. So...you don't really need it? I mean, you're taking the CDF for both terms, which varies between 0 and 1, which is how you get a 0-100 scale after you multiply by 50. Normalizing first doesn't really get you anything. Unless I'm missing something.
You're missing something. He's dividing before taking the Normal CDF.

Yes, I saw that like three seconds after I posted. A victim of my own R coding practices of yore, where I generally used ecdf (the empirical CDF). And then I saw TheExpressicist's post that ranks would be basically as good, and I agree there.

But I'm still not sure what you gain from making a single index here. Especially when we are talking about *card* strength.

Imagine a card called "Winner" that costs $0 and allows you to win when played. Everybody would gain it like crazy (100% gain, for the purposes of this thread) and it would be (nearly?) a total crap shoot to win with it, so I guess it would be like -15% or something in this scheme and its score might even be somewhere south of Rebuild's.

It's a perfectly valid point which is why I tried to drench this thing with as many disclaimers as possible.

It raises the need to consider what this metric is actually measuring: cards that confer an advantage on good players. Take the "Winner" card. Yes, it has what would be the most powerful effect in the game. But it would actually make good players worse off, because it drags their win rate from 65% down to closer to 50%.

In general, cards that are powerful but so simple to use that they barely require any skill are dangerous to good players, because they remove those players' primary advantage: skill. I would venture a guess that cards that are gained at a very high rate by good players but do not have a correspondingly high win rate fall under this category. (Note: this was the context and lead-in to the 'swingiest cards' post.)
Logged

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4381
    • View Profile
    • WanderingWinder YouTube Page
Re: More data mining: Card "strength"
« Reply #22 on: January 31, 2015, 12:28:14 pm »
0

Yeah. I think this list has a very (very) rough relationship with how good the cards are, but that isn't really good for much at all.

More important, these are overall rankings. On any given board, you can throw them out the window - it doesn't matter that 195 other cards exist which might make card X good or bad. You only have the other 9 which exist right then and there, and that's basically always going to be a lot different than the general case.

It's a similar phenomenon to MtG, where the format is super important in determining how good a card is. Mental Misstep is unplayable in limited, because nobody runs 1-drops, certainly not impactful ones, but it's ubiquitous in Vintage, where you are countering things like Ancestral Recall for free.





Stef: You think Ironmonger is bad on Mountebank boards? Really? And you think it's significantly worse than, say, silver?





Re: the metric itself. The source of my head-scratching is that you applied a normal CDF to it. People do this all the time, and I'm not sure why. My guess is that it's because they did it all the time in their one stats course. But the thing is, most data isn't normally distributed. So I can kind of understand wanting to do some kind of normalization to get things on the same scale, and I can kind of understand that how often the card is gained, and how much you win when it is gained versus when it isn't, are related to how good the card is. But even beyond the limitations of the approach from that perspective, I can't really get behind applying the Gaussian transformation.
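
To make the objection concrete, here is a small Python sketch that puts the same column through a normal CDF and through an empirical CDF (the rank-style alternative mentioned earlier in the thread). The gain rates are invented and deliberately skewed; nothing here claims the real columns look like this.

```python
from statistics import NormalDist, mean, pstdev

# Hypothetical, deliberately skewed gain rates (not real data).
gain_pct = [0.05, 0.06, 0.08, 0.10, 0.12, 0.15, 0.20, 0.30, 0.60, 0.95]

def normal_cdf_scores(values):
    """Phi((x - mean) / sd): implicitly treats the column as roughly Gaussian."""
    mu, sd = mean(values), pstdev(values)
    phi = NormalDist().cdf
    return [phi((x - mu) / sd) for x in values]

def empirical_cdf_scores(values):
    """Fraction of the column that is <= x: no distributional assumption."""
    n = len(values)
    return [sum(v <= x for v in values) / n for x in values]

normal = normal_cdf_scores(gain_pct)
empirical = empirical_cdf_scores(gain_pct)
for x, a, b in zip(gain_pct, normal, empirical):
    print(f"{x:.2f}  normal={a:.2f}  empirical={b:.2f}")
```

On a skewed column like this one, the Gaussian version scores the middle card well below 0.5, while the empirical CDF puts it at exactly 0.5 by construction.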

TheOthin

  • Witch
  • *****
  • Offline Offline
  • Posts: 459
  • Shuffle iT Username: TheOthin
  • Respect: +447
    • View Profile
Re: More data mining: Card "strength"
« Reply #23 on: January 31, 2015, 12:46:36 pm »
0

Despite Ironmonger's frequent Village function later, it's not so bad as an opener, is it? An Ironmonger play after opening Ironmonger/Village means it has good odds of producing at least $2, like a Silver; it'll only fall short of that if it draws one or more Estates. Plus it cycles. I can see it being weaker than opening Silver/Silver if you don't want the chance of falling short of a potential $5, and it doesn't like hitting Curses, but...

Plus, the fact that Ironmonger is worried about drawing Estates means it gets better if you have an Estate in your hand in the first place, which is when you need a $2 card to hit $5.
Logged

Awaclus

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 11809
  • Shuffle iT Username: Awaclus
  • (´。• ω •。`)
  • Respect: +12847
    • View Profile
    • Birds of Necama
Re: More data mining: Card "strength"
« Reply #24 on: January 31, 2015, 12:48:16 pm »
+6

Stef: You think Ironmonger is bad on Mountebank boards? Really? And you think it's significantly worse than, say, silver?

The player who doesn't open Ironmonger probably opens Mountebank.
Logged
Bomb, Cannon, and many of the Gunpowder cards can strongly effect gameplay, particularly in a destructive way

The YouTube channel where I make music
Download my band's Creative Commons albums for free