Dominion Strategy Forum

Dominion => Dominion General Discussion => Topic started by: markus on September 17, 2018, 06:10:36 pm

Title: Dominion Log Statistics
Post by: markus on September 17, 2018, 06:10:36 pm
We've had some fun with this already on the Dominion Discord (https://discord.gg/2rDpJ4N), and I thought it was time to write up a summary.
Using ceviri's tool woodcutter (http://ceviri.me/woodcutter/), I've logged games of top players played since the start of this year. A game qualified if at least one player had a skill (= mu) of at least 1.9 at the time. For all conclusions that you draw from this, keep in mind that this is really the right tail of the skill distribution (top 0.9%).
In addition, about 12% of the logged games are included because specific players played them. I'm dropping all games that ended before turn 3 and those with more than 2 events/landmarks. The result is about 24,000 games that I use, and whose logs can be found on Woodcutter.

The results of the log analysis can be found starting from this google sheet: https://docs.google.com/spreadsheets/d/1M2L7hcY3sbA33OwuZhgPYJWVlMFgJYBdK8cnkbJHmbo/edit#gid=0
Summary of the results can be found for each card in the form of images in this album: https://1drv.ms/a/s!AgOcGYxKWHVDnXKCXradFogAJnMu

Some caveats first: for about 5% of the games, grabbing the log was not successful. Some of these failures might be because I lost connection; more worrisome are the ones that are not randomly dropped: Smugglers had a bug that made some of its games unloadable; when the last decision is an autoplay by the bot (primarily Changeling), the game can't be loaded; and there are some internal errors. There's also a bug with Band of Misfits and Overlord: if one of them is in play at game end, it counts as a copy of the card it is imitating, so I try to exclude those games when that matters.
A limitation of the logs is that the last decision is not recorded. That could be an innocuous "end buy phase", but it could also be buying the last Province.

In this post, I primarily want to describe what I did and what you can find there. There's a lot of data, so expect to find some outliers if you start searching for them. For example, it's intuitive that it's good to have a 5-2 opening on a board that has Witch. That it's good to have 5-2 on a board with Fountain is more likely to be noise.

Let's start with the information included in the graphs, using Rebuild (https://onedrive.live.com/?authkey=%21AIJetp0WiAAmcy4&v=photos&cid=4375584A8C199C03&id=4375584A8C199C03%213670&parId=4375584A8C199C03%213826&o=OneUp) as an example:

(https://am3pap003files.storage.live.com/y4mzJkLJpg-iFlW3LqNhlFhJrX83gpNxQLKeVoexd2plnOxk7T2kjLTIWftuMIJbMWr_Kn_kv6xnjjTOBnegKQt-dmhd48ss2-N7UMVbOs_WqOIr42QNrRoNuPPd97H3lyCj7XqYALgxOYQnTQDVlcaL_aF52-STvf20godL16zapHVmR5fHrs7_mfOOXMM581nEUSuoqz0PK-cFb0gQoLNOw/298%20Rebuild.png?psid=1&width=1341&height=1072)

There are 741 boards with Rebuild.
The first player has won 62% (more precisely, if both players had the same strength, the first player would win 62%). This is slightly higher than the 59% estimated across all boards. But the standard error of this estimate is 2.1%, so it's not a (statistically) significant deviation from the usual first player advantage. This is an observation that I've made more generally: changes in first player advantage (FPA) tend to be small. There is little signal relative to the noise, so don't read too much into it, even if it makes sense that FPA should be higher on Rebuild boards. On the flip side, "little signal" means that we can be relatively sure that there are no cards which make the first player win 70% or more of the games.
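
As a rough check of the significance claim above, the gap between the Rebuild estimate and the overall estimate can be expressed in standard errors. This is only a minimal sketch using the numbers quoted in the post: it assumes a normal approximation and treats the 59% baseline as exact, and the function name is made up for illustration.

```python
def z_score(estimate, baseline, std_error):
    """Number of standard errors between an estimate and a baseline."""
    return (estimate - baseline) / std_error

# Numbers quoted in the post for Rebuild boards.
rebuild_fpa = 0.62   # first-player win rate on Rebuild boards
overall_fpa = 0.59   # first-player win rate across all boards
std_error = 0.021    # reported standard error of the Rebuild estimate

z = z_score(rebuild_fpa, overall_fpa, std_error)
# Assumed threshold: two-sided test at the 5% level.
significant = abs(z) > 1.96
print(f"z = {z:.2f}, significant: {significant}")  # z = 1.43, significant: False
```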

What I call the "skill multiplier" is 0.93 for Rebuild, which indicates that it favours the weaker player as it's less than 1. The motivation for this estimate is that in theory the win probability of a player with skill difference ∆mu is given by winprob=1/(1+exp(-∆mu)). The skill multiplier is the factor that multiplies the skill difference in this formula such that the observed results on Rebuild boards are explained best: winprob=1/(1+exp(-∆mu*skill_multiplier)). A value less than 1 means that the difference in mu between the players is effectively shrunk: the better player wins less often than they should according to their skill advantage. For example, a player with a positive ∆mu=1 (that is 7.5 levels) should win 73.1% of the games in general, but only wins 71.7% on Rebuild boards. Again, I show the standard error for this estimate, which shows that it's not significantly smaller than 1. Also note that the estimated skill multiplier across all boards is 0.94, so better players always tend to underperform a bit. (My short explanation would be that for top players the skill estimate mu is too swingy: my mu has fluctuated between 1.9 and 2.3 this year and I don't believe that a lot of this was actual skill change. As a result, when my mu is low after a bad streak I outperform expectations, and vice versa.)
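
The win probability formula and the idea of fitting a multiplier can be sketched in a few lines of Python. The `win_prob` function is just the formula from the post. The `fit_skill_multiplier` function is an assumption about how such a fit could be done (a one-parameter maximum-likelihood fit by ternary search), not necessarily the procedure used for the sheets, and the data in the example are synthetic.

```python
import math

def win_prob(delta_mu, skill_multiplier=1.0):
    """Win probability for a skill difference delta_mu (formula from the post)."""
    return 1.0 / (1.0 + math.exp(-delta_mu * skill_multiplier))

def fit_skill_multiplier(games, lo=0.0, hi=3.0, iterations=100):
    """Hypothetical fit: find the multiplier that maximises the log-likelihood.

    games is a list of (delta_mu, won) pairs. The log-likelihood is concave
    in the multiplier, so a ternary search on [lo, hi] finds the maximum.
    """
    def log_likelihood(m):
        total = 0.0
        for delta_mu, won in games:
            p = win_prob(delta_mu, m)
            total += math.log(p) if won else math.log(1.0 - p)
        return total

    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_likelihood(m1) < log_likelihood(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

print(round(win_prob(1.0), 3))        # 0.731
print(round(win_prob(1.0, 0.93), 3))  # 0.717

# Synthetic data: 1000 games with delta_mu = 1, of which the favourite wins 717.
synthetic = [(1.0, True)] * 717 + [(1.0, False)] * 283
fitted = fit_skill_multiplier(synthetic)
print(round(fitted, 2))               # 0.93
```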

Next, on the top left, are the usual game endings with that card on the board. As the last decision is not logged, the classification might not always be exact, but I'm following these rules: if there's at most 1 Province or Colony in the supply, it counts as a Province ending; if the supply is at most 1 card (not Province or Colony) away from a three-pile ending, it counts as such. All other games count as resignations. Note that some games will be classified as both Province and 3-pile endings (more than there should be in reality), and some 3-piles might wrongly count as resignations (e.g. two Ports left in the supply that are bought with the last buy for the 3-pile). Over all games there are 39% Province endings, 28% 3-piles, and 35% resignations. Governor (https://onedrive.live.com/?authkey=%21AIJetp0WiAAmcy4&v=photos&cid=4375584A8C199C03&id=4375584A8C199C03%213528&parId=4375584A8C199C03%213826&o=OneUp) leads to many Province and few 3-pile endings, and Goons (https://onedrive.live.com/?authkey=%21AIJetp0WiAAmcy4&v=photos&cid=4375584A8C199C03&id=4375584A8C199C03%213527&parId=4375584A8C199C03%213826&o=OneUp) is the other way around. Tournament (https://onedrive.live.com/?authkey=%21AIJetp0WiAAmcy4&v=photos&cid=4375584A8C199C03&id=4375584A8C199C03%213754&parId=4375584A8C199C03%213826&o=OneUp) games have a high rate of resignations.
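
The classification rules above can be written down as a small function. This is a sketch under assumptions: the supply is represented as a hypothetical dict from pile name to cards left at the end of the log, covering every supply pile, and the label strings are made up. It is not the code used for the sheets.

```python
def classify_ending(supply):
    """Classify a game ending from the final supply counts.

    supply: dict mapping pile name -> cards left when the log ends
    (assumed format). Returns a list, because a game can count as both
    a Province ending and a three-pile ending.
    """
    victory_piles = ("Province", "Colony")

    # At most 1 Province or Colony left counts as a Province ending.
    province_end = any(
        card in supply and supply[card] <= 1 for card in victory_piles
    )

    # Three-pile ending: three empty piles already, or two empty piles
    # plus a pile (not Province or Colony) with a single card left.
    other_counts = [n for card, n in supply.items() if card not in victory_piles]
    empty = sum(1 for n in other_counts if n == 0)
    one_left = any(n == 1 for n in other_counts)
    three_pile_end = empty >= 3 or (empty == 2 and one_left)

    labels = []
    if province_end:
        labels.append("province")
    if three_pile_end:
        labels.append("three-pile")
    return labels or ["resignation"]

print(classify_ending({"Province": 1, "Gold": 30, "Village": 0}))
# ['province']
print(classify_ending({"Province": 5, "Village": 0, "Smithy": 0, "Curse": 1}))
# ['three-pile']
print(classify_ending({"Province": 5, "Village": 0, "Smithy": 0, "Port": 2}))
# ['resignation']
```

The last call reproduces the Port example from the post: with two Ports left, a single buy would empty the pile, but the rule only looks one card ahead, so the game is counted as a resignation.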

In the bottom panels there are histograms of the share of games in which each player gains a certain number of copies of the card (left) and of the difference in the number of gains between the players (right).
You can roughly see whether that led to more wins or losses from the colours in those bars, and the top panels have the details. First, the blue lines show the estimate for the win rate with a certain number of gains as well as the 95% confidence interval. For Rebuild (https://onedrive.live.com/?authkey=%21AIJetp0WiAAmcy4&v=photos&cid=4375584A8C199C03&id=4375584A8C199C03%213670&parId=4375584A8C199C03%213826&o=OneUp) this suggests that a player who doesn't gain any Rebuild wins more than 50% of the games and a player with 1 Rebuild wins fewer than 50% of the games. But this might reflect that better players are more likely to skip Rebuild, and they would also win more often if they went for Rebuild. To take out this effect, I estimate a version corrected for skill, shown in green. This version uses the skill difference as an explanatory variable, such that the result is an estimate of how well the players do against an equal opponent. This reduces the effect of playing without Rebuild to basically 0 (49% win rate). The right panel shows the same using the difference in gains instead of the absolute number; in the case of Rebuild there's nothing statistically significant there.
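
To illustrate what "corrected for skill" can mean, here is a minimal sketch. It assumes a logistic model in which the skill difference enters with a fixed coefficient of 1 and only an intercept is fitted for the group of players with a given number of gains. The actual estimation behind the graphs may differ (for example, by also estimating the coefficient), and the data below are synthetic.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def skill_corrected_win_rate(games, iterations=100):
    """Estimated win rate against an equal opponent (assumed model).

    games: list of (delta_mu, won) pairs for players with a given number
    of gains. Finds the intercept b for which the expected number of wins,
    sum(sigmoid(b + delta_mu)), matches the observed number of wins, and
    returns sigmoid(b), i.e. the win rate at delta_mu = 0.
    """
    wins = sum(1 for _, won in games if won)
    lo, hi = -10.0, 10.0
    # Bisection works because expected wins increase with the intercept.
    for _ in range(iterations):
        mid = (lo + hi) / 2
        expected_wins = sum(sigmoid(mid + delta_mu) for delta_mu, _ in games)
        if expected_wins < wins:
            lo = mid
        else:
            hi = mid
    return sigmoid((lo + hi) / 2)

# Synthetic example: players who skipped a card were on average stronger
# (delta_mu = 0.4) and won 6 of 10 games.
games = [(0.4, True)] * 6 + [(0.4, False)] * 4
raw = sum(1 for _, won in games if won) / len(games)
corrected = skill_corrected_win_rate(games)
print(f"raw {raw:.2f}, corrected {corrected:.2f}")  # raw 0.60, corrected 0.50
```

In this made-up example the whole 60% raw win rate is explained by the skill advantage, which mirrors how the effect of skipping Rebuild shrinks to about 0 after the correction.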

Some gain statistics are also summarized on the top:

Some thoughts on interpreting those numbers:

Finally, let me remind you that this only uses the games of the top, and you would likely find different results for lower ranked players.

So much for the summary stats for each card. Most of the underlying information and much more can be found on the google sheets starting from here (https://docs.google.com/spreadsheets/d/1M2L7hcY3sbA33OwuZhgPYJWVlMFgJYBdK8cnkbJHmbo/edit#gid=0). I hope that it is more or less self-explanatory for someone who wants to dig deeper. I'll just point out what else can be found there. First there are tabs on that sheet with stats for the whole database. Then, there are separate sheets for the different players that have a bunch of games (or were interested in them). Those are linked from the overview tab. Most useful for a general audience are:

On the general sheets, there is a tab with


The individual sheets have a tab that compares the buys / gains / trashes of the named baseline player with their opponents and one tab that shows the distribution of the number of buys / gains / trashes of each card.
Then they have the "gain 1st" and "gain 1st Qvist" tabs for that player only.
The boards tab has some aggregate statistics for boards with that card: average number of turns, average number of buys / gains / trashes. Then it has the first player advantage (not a lot of effect there) and the change in the win probability for that player. For the named players and the 5-2 opener it shows how much they outperformed expectations when that card is on the board. (e.g. being the only player to open 5-2 on a Witch board gives you a 15% outperformance, that is a 65% win chance against an equal opponent with random start.) For the better player sheet this column shows the skill multiplier (whether skill difference is more or less important on those boards).

I also tried to classify cards on the better-boards sheet (https://docs.google.com/spreadsheets/d/1h6mrF-8h0lPftNtyO7paVtS6OPbytDhzFKHpEi-YLsE/edit#gid=1033611182) in terms of being village, draw, trasher, gainer (and +buy), alt-VP, attack and types of attack. The idea was to see how the presence, absence or combination of these affects the win probability. Now, you could fill threads discussing how to classify the cards; my first try was to give them a value of 0, 0.5, or 1 and then round down (if there's only a 0.5 Village on the board like Necropolis, the board counts as not having Villages).
Finally the logged game numbers used for the sheet with the kingdoms are on the last tab.

Have fun with the numbers and let me know what else you'd like to see!
Title: Re: Dominion Log Statistics
Post by: trivialknot on September 17, 2018, 09:51:58 pm
That's really neat!

I haven't looked through the spreadsheet yet, and I was just browsing the images.  A few random observations...
-People who gained Dame Anna apparently didn't have a significantly improved chance of winning.
-Going by win rates for only one player receiving, the best boons are Earth, Field, Sea, and Wind (57%), and the worst ones are Moon, Mountain, and Sky (54%).  But the confidence intervals are all 1.4% so they're all pretty close.
-The worst hexes to receive are Greed (43%), War (44%), and Misery (44%), and the least bad ones are Envy (48%), Delusion (47%), Bad Omens (47%), and Haunting (47%).  Confidence intervals are 1.5%.  (Edit: as pointed out by markus, these are error bars not confidence intervals)
-Trusty Steed is the most popular prize (gained 77% of the time), closely followed by Followers (71%). Princess is 57%, Diadem is 32%, and Bag of Gold is 31%.
-Save is bought 5-6 times per game on average.  That's more than Alms (4-5 times), so I think it might be the card with the highest buy/gain/receive rate.

Hey, is it possible to sort the cards by skill multiplier?
Title: Re: Dominion Log Statistics
Post by: faust on September 18, 2018, 01:47:28 am
What is very interesting to me is that a lot of trashers have negative gain advantage. Amulet is at -11%, Raze even at -18%, and Chapel and Steward both at -4% (numbers where only 1 player gains it). That seems to indicate that trashing is overvalued in the current metagame.
Title: Re: Dominion Log Statistics
Post by: markus on September 18, 2018, 02:21:58 am
Quote from: trivialknot on September 17, 2018, 09:51:58 pm
"Hey, is it possible to sort the cards by skill multiplier?"
This is in the sheets found here (https://docs.google.com/spreadsheets/d/1h6mrF-8h0lPftNtyO7paVtS6OPbytDhzFKHpEi-YLsE/edit#gid=1033611182).
(It's called skill factor there and also includes games with Band of Misfits and Overlord as the only problem with them is identifying how many were gained.)
You can have "temporary filters" in google sheet to sort it directly there, but if you want to crunch the numbers a bit more I'd recommend downloading the sheet.

Top cards for the skill factor are Mountain Pass, Secret Cave, Donate, Bishop, Peasant. (going up to 1.3 such that you'd win for example 78.6% instead of 73.1%)
Bottom cards are Swindler, Chariot Race, Familiar, Fool, Hunting Grounds (going down to 0.7 such that you'd win for example 66.8% instead of 73.1%)


I just want to point out the +/- numbers are standard errors such that you have to add/subtract the number twice to get about a 95% confidence interval. And then keep in mind that with 400 estimates, you'd expect to see 20 that are outside of this interval - and that the large/small numbers in a top/bottom list are more likely to be affected by noise.
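
The arithmetic behind this reminder is short enough to spell out. The sketch below uses the first player advantage on Rebuild boards (62% with a standard error of 2.1%) from the first post as the example; the function name is made up for illustration.

```python
def confidence_interval_95(estimate, std_error):
    """Approximate 95% interval: estimate plus or minus two standard errors."""
    return (estimate - 2 * std_error, estimate + 2 * std_error)

low, high = confidence_interval_95(0.62, 0.021)
print(f"{low:.3f} to {high:.3f}")  # 0.578 to 0.662

# With 400 independent estimates, about 5% fall outside their 95% interval
# by chance alone.
n_estimates = 400
expected_outside = round(n_estimates * 0.05)
print(expected_outside)  # 20
```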
Title: Re: Dominion Log Statistics
Post by: Cave-o-sapien on September 18, 2018, 03:11:51 am
Quote from: faust on September 18, 2018, 01:47:28 am
"What is very interesting to me is that a lot of trashers have negative gain advantage. Amulet is at -11%, Raze even at -18%, and Chapel and Steward both at -4% (numbers where only 1 player gains it). That seems to indicate that trashing is overvalued in the current metagame."

Or people are choosing the wrong trasher when presented with several options.
Title: Re: Dominion Log Statistics
Post by: faust on September 18, 2018, 11:00:30 am
Another surprising bit of data: Tax has a completely average first player advantage.
Title: Re: Dominion Log Statistics
Post by: trivialknot on September 18, 2018, 11:20:59 am
Castles!

The likelihoods that each Castle will be gained are, in order:
73%, 60%, 51%, 47%, 43%, 38%, 33%, 26%.

Lower-ranked players are more likely to gain each Castle, so I think the most common situation is a lower-ranked player going for the Castles pile, and the higher-ranked player tactically swiping a few.  Which ones are the best swiping targets?

Gain % for higher-ranked player / gain % for lower-ranked player:
35/38, 25/35, 22/29, 23/23, 19/23, 16/22, 16/17, 15/11

So the favorite swiping targets appear to be Humble, Haunted, Grand, and King's.
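
The ranking can be reproduced from the pairs above. This sketch assumes that "in order" means the order of the Castles pile from Humble Castle to King's Castle, which is consistent with the four names in the conclusion.

```python
# Assumed mapping: the percentages are listed in pile order.
castles = ["Humble", "Crumbling", "Small", "Haunted",
           "Opulent", "Sprawling", "Grand", "King's"]
higher = [35, 25, 22, 23, 19, 16, 16, 15]  # gain % for the higher-ranked player
lower = [38, 35, 29, 23, 23, 22, 17, 11]   # gain % for the lower-ranked player

# A ratio close to or above 1 means the higher-ranked player gains the
# Castle about as often as (or more often than) the lower-ranked player.
ratios = {name: h / l for name, h, l in zip(castles, higher, lower)}
ranked = sorted(ratios, key=ratios.get, reverse=True)

for name in ranked:
    print(f"{name}: {ratios[name]:.2f}")

top_four = ranked[:4]
print(top_four)  # ["King's", 'Haunted', 'Grand', 'Humble']
```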
Title: Re: Dominion Log Statistics
Post by: trivialknot on September 18, 2018, 11:36:39 am
Quote from: faust on September 18, 2018, 01:47:28 am
"What is very interesting to me is that a lot of trashers have negative gain advantage. Amulet is at -11%, Raze even at -18%, and Chapel and Steward both at -4% (numbers where only 1 player gains it). That seems to indicate that trashing is overvalued in the current metagame."
I'm not sure that's true of trashers in general.  Sentry is at +13%, Plan is +3%, Cemetery is +1%.  There isn't an easy way to look at them all together though, so I'm not sure.

Raze in particular is interesting, because it's -18% if you correct for skill, and -1% if you don't correct for skill.  That suggests to me that Raze really is overrated by higher-ranked players.
Title: Re: Dominion Log Statistics
Post by: Awaclus on September 18, 2018, 12:16:26 pm
Quote from: faust on September 18, 2018, 11:00:30 am
"Another surprising bit of data: Tax has a completely average first player advantage."

That's not that surprising IMO. A lot of the time, you just buy the exact same cards in the opening as your opponent does regardless of Tax.
Title: Re: Dominion Log Statistics
Post by: trivialknot on September 20, 2018, 07:41:59 pm
Another question: can you extract statistics on Mountain Pass bids?
Title: Re: Dominion Log Statistics
Post by: markus on September 21, 2018, 02:45:02 am
You should join Discord where I already posted that in the past.

First bidder is usually the player that didn't gain first Province:

Games with Mountain Pass bidding taking place: 73%
Average turns before bidding taking place: 22.8
First bidder wins bid: 40%
Bid winner wins game: 55%
First bidder wins game: 39%
Average winning bid: 14.2
Median winning bid: 14
Title: Re: Dominion Log Statistics
Post by: aku_chi on February 01, 2019, 08:46:21 am
markus is still collecting stats and presenting them better than ever!  I recently made a video where I talk about how to find the stats, interpret them, and why I find them valuable.  Hopefully this interests some people.

https://www.youtube.com/watch?v=QeZJuwO4bq8