Topic: Openings that are better for P1 or P2

ben_king · « **on:** December 23, 2014, 01:23:15 pm »

Recently, I collected logs for about 90,000 games involving the top 50 players on the Isotropish ranking, and one thing I've been analyzing is openings. I noticed something unexpected: there are certain openings that are much better for either P1 or P2. If I take the openings and exclude everything that's been opened fewer than 50 times, here's what falls out.

Top openings that are much better for P1 than P2

Opening	Type	Number of occurrences	P1 win %	P2 win %
Moneylender-Fishing Village	4/3	124	76%	29%
Soothsayer-Chapel	5/2	64	80%	32%
Soothsayer-Silver	5/3	72	81%	34%
Remake-Swindler	4/3	82	77%	30%
Salvager-Market Square	4/3	87	72%	28%

Of course, Dominion has a strong first-player advantage built in, so those results aren't especially interesting. What I find much more interesting is that there are openings that are substantially stronger for P2 than for P1.

Top openings that are much better for P2 than P1

Opening	Type	Number of occurrences	P1 win %	P2 win %
Rebuild-Silver	5/3	90	36%	67%
Spice Merchant-Lookout	4/3	51	33%	60%
Caravan-Workshop	4/3	59	39%	64%
Quarry-Trade Route	4/3	64	35%	57%
Quarry-Lookout	4/3	77	35%	57%

My first impression here is that the openings in the second table look like higher risk/higher reward openings, which would lend some evidence to the recent discussion about changing strategy/tactics when you're ahead or behind in the game.

Any other thoughts?

Mic Qsenoch · « **Reply #1 on:** December 23, 2014, 02:26:05 pm »

Neat!

I have some questions about the logs you looked at. Did you restrict it to Pro games only? Does it include bot games? These are just games where one of the players is in the top 50 on Isotropish, right?

Is someone a statistician who can tell us how significant these results are given the number of games (~100) in each sample?

Maybe you should restrict it to games where the top 50 person does the particular opening? It's possible with these smaller numbers that a particular opening just happened to have the weaker player as P1 (or P2) in an above average number of games. Maybe it makes no difference.

It might be interesting to look only at games where both players are in the top 50 or top 100. Looking at samples where the openings are mirrors could be cool too. Do you have the overall P1/P2 win percentage?

JW · « **Reply #2 on:** December 23, 2014, 02:28:59 pm »

Quote from: grsbmd on December 23, 2014, 01:23:15 pm

Of course, Dominion has a strong first-player advantage built in, so those results aren't especially interesting. What I find much more interesting is that there are openings that are substantially stronger for P2 than for P1.

Small sample size. It's not surprising that if you have hundreds of openings with 50-100 occurrences, player 2 will win substantially more games for some of those openings. There is no way that a Rebuild-Silver (Baker-board) opening by both players is actually more favorable for player 2 than player 1. Ordinary tests of statistical significance don't account for how many openings are being tested: with that many, some are bound to come back with P2 doing better than P1 just by chance.

That said, it might be less of an advantage to be P1 with Rebuild-Silver than with, say, a Junker-Silver opening because junkers have more first-player advantage, and Rebuild can be played very frequently with good shuffles. However, that doesn't imply that you would (or wouldn't) want to take Rebuild-Silver over, say, Cultist-Silver, as either player 1 or player 2.

Edit: I would be interested in statistics on the general first-player advantage. For example, in your entire data set (or, ideally, just pro, non-bot, 2-player games), what percent of the time does the first player win the game (and tie the game)?

ben_king · « **Reply #3 on:** December 23, 2014, 03:15:57 pm »

The original post included only Pro, 2-player games with no bots. Over all the games, the average win percentage for P1 was 57% and for P2 was 43%.

I took your recommendation and restricted it to only games between top-100 players. This leaves us with only ~17,000 openings to analyze. Here are the same results for the more restricted dataset:

Top openings that are better for P1 than P2

Opening	Type	Number of occurrences	P1 win %	P2 win %
Potion-Sage	4/3	53	62%	22%
Fishing Village-Fishing Village	3/3	73	65%	27%
Oracle-Silver	3/3	112	73%	34%
Swindler-Silver	3/3	133	68%	32%
Tournament-Ambassador	4/3	52	71%	38%

Top openings that are better for P2 than P1

Opening	Type	Number of occurrences	P1 win %	P2 win %
Hermit-Silver	3/3	157	43%	52%
Upgrade-<nothing>	5/2	67	51%	58%
Chancellor-Silver	3/3	60	43%	50%
Storeroom-Silver	4/3	63	50%	52%
Bishop-Silver	4/3	77	52%	54%

So in this restricted set, any advantage for P2 is much less pronounced. It seems very possible, as JW was saying, that the top P2 openings from the original post were simply statistical anomalies. Maybe I'll try to look at mirrors next.

liopoil · « **Reply #4 on:** December 23, 2014, 03:35:45 pm »

Awesome! We've been waiting for this sort of thing for a while. We'd appreciate any other stats you've gathered for sure. One thing that surprises me: Why are two-card openings like Tournament-Ambassador almost half as frequent as regular one-card openings like Swindler/Silver or Oracle/Silver? Also, unless I missed something, all the openings that had weird win rates were replaced when you narrowed the dataset, which suggests an anomaly to me.

Some data on the best openings would be cool too, we haven't had that since before dark ages. My money is on chapel/goons.

EDIT: Oracle, Tournament, Ambassador, and Swindler are cards that favor P1. Meanwhile Bishop favors P2, and Upgrade/nothing suggests games where the opening gives a player a large advantage/disadvantage, so that would trend towards 50/50. So there is probably some truth to this. What if you narrow it to openings with at least, say, 100 occurrences? What's the win rate of Silver/Silver?

JW · « **Reply #5 on:** December 23, 2014, 03:42:10 pm »

Quote from: liopoil on December 23, 2014, 03:35:45 pm

One thing that surprises me: Why are two card openings like tournament-ambassador almost half as frequent as normal two-card openings like swindler/silver or oracle/silver?

It's because to have a Tournament/Ambassador opening, both Ambassador and Tournament need to be in the Kingdom. To have a Swindler/Silver opening, only Swindler needs to be in the Kingdom. So even though it's more common to open Ambassador/Tournament in a Kingdom that has both of those cards than it is to open Swindler/Silver in a Kingdom with Swindler, Kingdoms with Swindler come up much more often than Kingdoms with the two-card combination of Ambassador and Tournament.

liopoil · « **Reply #6 on:** December 23, 2014, 03:43:59 pm »

Quote from: JW on December 23, 2014, 03:42:10 pm

Quote from: liopoil on December 23, 2014, 03:35:45 pm
One thing that surprises me: Why are two card openings like tournament-ambassador almost half as frequent as normal two-card openings like swindler/silver or oracle/silver?

It's because to have a Tournament/Ambassador opening, both Ambassador and Tournament need to be in the Kingdom. To have a Swindler/Silver opening, only Swindler needs to be in the Kingdom. So even though it's more common to open Ambassador/Tournament in a Kingdom that has both of those cards than it is to open Swindler/Silver in a Kingdom with Swindler, Kingdoms with Swindler come up much more often than Kingdoms with the two-card combination of Ambassador and Tournament.

This is exactly what I am saying. I would expect it to be on the order of maybe a tenth as frequent, so I am surprised that it is almost half as common. Sorry, I didn't explain that well at first.

JW · « **Reply #7 on:** December 23, 2014, 03:50:39 pm »

Quote from: liopoil on December 23, 2014, 03:43:59 pm

This is exactly what I am saying. I would expect it to be on the order of maybe 10 times less frequent, so I am surprised that it is almost half as common.

Tournament/Ambassador is a stronger opening than Oracle/Silver or Swindler/Silver, so usually people go for it on Tournament/Ambassador boards while there are plenty of alternatives to Swindler/Silver and Oracle/Silver.

This presumably would be an interesting calculation from this top 100 players vs. each other data set: on Tournament/Ambassador (and probably exclude Baker) boards, what % of the time do players with a 4/3 split go for Tournament/Ambassador. And then the same calculation for Oracle/Silver on boards with Oracle (but not Baker), and so on. That would tell you how frequently top players go for each opening when it is available.

markusin · « **Reply #8 on:** December 23, 2014, 04:05:55 pm »

Yeah the new results for P1 advantage are interesting. You have Potion-Sage and double Fishing Village favouring P1. My guess is that those openings are for setting up a deck with cards that can give a real P1 advantage later on, like Familiar and Torturer.

ben_king · « **Reply #9 on:** December 23, 2014, 04:35:51 pm »

Quote from: JW on December 23, 2014, 03:50:39 pm

Quote from: liopoil on December 23, 2014, 03:43:59 pm
This is exactly what I am saying. I would expect it to be on the order of maybe 10 times less frequent, so I am surprised that it is almost half as common.

Tournament/Ambassador is a stronger opening than Oracle/Silver or Swindler/Silver, so usually people go for it on Tournament/Ambassador boards while there are plenty of alternatives to Swindler/Silver and Oracle/Silver.

This presumably would be an interesting calculation from this top 100 players vs. each other data set: on Tournament/Ambassador (and probably exclude Baker) boards, what % of the time do players with a 4/3 split go for Tournament/Ambassador. And then the same calculation for Oracle/Silver on boards with Oracle (but not Baker), and so on. That would tell you how frequently top players go for each opening when it is available.

I actually already have this data calculated as well, but I was going to make a separate topic for it, because there's just a lot of data. But just as a teaser, if we order openings according to the ratio of # of times bought to number of times available, here's where openings that you mentioned fall in the ranking:

Rank	Opening	Type	# times bought	# times available	Ratio
...
13	Tournament-Ambassador	4/3	126	194	0.649
...
1563	Swindler-Silver	3/3	369	4266	0.086
...
1816	Oracle-Silver	3/3	287	4074	0.070

My script for calculating these correctly handles coin-token openings (e.g. a Gold-Chapel opening is only counted as available if Baker is on the board and the player gets a 5-2 split). It currently doesn't handle Nomad Camp openings correctly, which I'm going to try to fix before I post the full list.

GeoLib · « **Reply #10 on:** December 23, 2014, 05:13:45 pm »

Echoing some others here, I'm pretty sure that the player advantage ones are likely just due to the small sample size. If we look at a lot of openings and assume they're all really 50/50 (or 57/43), we're bound to get some that stray one way or the other. Someone better at stats than I could calculate how likely it would be to find, say an opening with >60% win rate for P2 from the given pool if all openings were 57/43.

JW · « **Reply #11 on:** December 23, 2014, 05:35:10 pm »

Quote from: GeoLib on December 23, 2014, 05:13:45 pm

Echoing some others here, I'm pretty sure that the player advantage ones are likely just due to the small sample size. If we look at a lot of openings and assume they're all really 50/50 (or 57/43), we're bound to get some that stray one way or the other. Someone better at stats than I could calculate how likely it would be to find, say an opening with >60% win rate for P2 from the given pool if all openings were 57/43.

Only the original poster could answer this question exactly because no one else knows the details of the pool of openings he has analyzed.

But here's a sample calculation that is much simpler: if there are 400 openings that have 50 games each, and in actuality P1 wins 57% for every opening (with no ties), the chance that at least one of those 400 openings would have >=60% win rate for P2 is about 95%.

Details: The standard deviation of P1's win rate for each opening is about 7%, so (using the central limit theorem for simplicity) the chance that any single opening shows a <=40% win rate for P1 is about 0.76%. The chance that at least one of the 400 openings shows a <=40% win rate for P1 is then 1-(the chance that none of them do). Assume that the results of games with one opening don't depend on results from other openings (not quite true, because each game is counted twice, since each player has an opening) and you get to 95%.
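
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation described above. It uses the same simplifying assumptions as the post (400 independent openings, 50 games each, a true P1 win rate of 57%, no ties). The exact-binomial comparison at the end is an addition for illustration and is not part of the original calculation.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, built from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n_games = 50       # games per opening (assumed in the post)
n_openings = 400   # number of openings (assumed in the post)
p1_win = 0.57      # true P1 win rate for every opening

# Standard deviation of the observed P1 win rate over 50 games (about 7%).
sd = math.sqrt(p1_win * (1 - p1_win) / n_games)

# Normal approximation: chance one opening shows P1 winning <= 40%.
p_single_normal = normal_cdf((0.40 - p1_win) / sd)

# Chance that at least one of the 400 openings does so.
p_any_normal = 1 - (1 - p_single_normal) ** n_openings

# Same question with the exact binomial: P1 wins at most 20 of 50 games.
p_single_exact = binom_cdf(20, n_games, p1_win)
p_any_exact = 1 - (1 - p_single_exact) ** n_openings

print(f"sd = {sd:.4f}")
print(f"normal approx: single = {p_single_normal:.4%}, any of 400 = {p_any_normal:.1%}")
print(f"exact binomial: single = {p_single_exact:.4%}, any of 400 = {p_any_exact:.1%}")
```

The normal-approximation lines should land close to the 0.76% and 95% figures quoted in the post. The exact binomial is included only as a cross-check on the approximation.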

JW · « **Reply #12 on:** December 23, 2014, 06:16:22 pm »

Quote from: grsbmd on December 23, 2014, 04:35:51 pm

My script for calculating these correctly handles coin-token openings (e.g. a Gold-Chapel opening is only counted as available if Baker is on the board and the player gets a 5-2 split). It currently doesn't handle Nomad Camp openings correctly, which I'm going to try to fix before I post the full list.

Presumably Doctor could also cause difficulties in the analysis. If I open Doctor-Tournament, that might be Doctor on $3 and Tournament on $4. Or I might have had $5, overpaid for Doctor trashing 2 estates, and then drawn 2 coppers and 1 estate, shuffled, and drawn 2 more coppers.

Those openings shouldn't be compared to each other. Also, in the latter case I could buy Tournament on turn 2 after buying Doctor for $5 on turn 1 only because of good shuffle luck in hitting two estates with Doctor. So buying Doctor on turn 1 with $5 won't be as good as the Doctor ($5)-Tournament opening's stats make it seem.

Edit: One way to deal with this would be to include the amount paid for Doctor on turn 1 but not the card bought on turn 2 in Doctor openings. However, this loses some information.

blueblimp · « **Reply #13 on:** December 23, 2014, 09:22:56 pm »

I'm not a statistician, but one way to mitigate the statistical noise would be to randomly separate the data into two sets (maybe 50/50?). Collect the top 10 openings on the first set, then calculate their win rates on the second set of data and report a binomial confidence interval. This can help show whether they scored high on the first data set by fluke or because there's really bias inherent to the opening.
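
A rough sketch of that split-and-confirm idea in Python follows. The record format (a list of `(opening, won)` pairs), the function names, and the choice of a Wilson score interval as the binomial confidence interval are all assumptions made for illustration. Nothing here reflects how the original logs are actually stored.

```python
import math
import random

def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if games == 0:
        return (0.0, 1.0)
    p_hat = wins / games
    denom = 1 + z * z / games
    centre = (p_hat + z * z / (2 * games)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / games + z * z / (4 * games * games)) / denom
    return (centre - half, centre + half)

def split_games(games, seed=0):
    """Randomly split the records 50/50 into an exploration and a confirmation set."""
    rng = random.Random(seed)
    shuffled = list(games)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def win_counts(games):
    """Map each opening to [wins, games played]."""
    counts = {}
    for opening, won in games:
        entry = counts.setdefault(opening, [0, 0])
        entry[0] += int(won)
        entry[1] += 1
    return counts

def holdout_report(games, top_n=10, min_games=25, seed=0):
    """Pick the top openings on one half, then report their results on the other half."""
    explore, confirm = split_games(games, seed)
    explore_counts = win_counts(explore)
    confirm_counts = win_counts(confirm)
    eligible = [o for o, (w, n) in explore_counts.items() if n >= min_games]
    ranked = sorted(eligible, key=lambda o: explore_counts[o][0] / explore_counts[o][1], reverse=True)
    report = []
    for opening in ranked[:top_n]:
        wins, n = confirm_counts.get(opening, [0, 0])
        report.append((opening, wins, n, wilson_interval(wins, n)))
    return report
```

If an opening's interval on the confirmation half still sits clearly above or below the baseline win rate, that is better evidence than its rank on the exploration half alone, because the confirmation games played no part in selecting it.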

scott_pilgrim · « **Reply #14 on:** December 23, 2014, 10:47:52 pm »

Quote from: Mic Qsenoch on December 23, 2014, 02:26:05 pm

Maybe you should restrict it to games where the top 50 person does the particular opening? It's possible with these smaller numbers that a particular opening just happened to have the weaker player as P1 (or P2) in an above average number of games. Maybe it makes no difference.

Assuming it's random whether the better player is first or second player, then these sample sizes are big enough to be meaningful, I think. Like, if you flipped 100 coins, it would be very rare to get more than 66 heads. In particular, the first row really stands out to me:

Quote from: grsbmd on December 23, 2014, 01:23:15 pm

Top openings that are much better for P2 than P1
Opening Type Number of occurrences P1 win % P2 win %
Rebuild-Silver 5/3 90 36% 67%
Spice Merchant-Lookout 4/3 51 33% 60%
Caravan-Workshop 4/3 59 39% 64%
Quarry-Trade Route 4/3 64 35% 57%
Quarry-Lookout 4/3 77 35% 57%

Putting it in a binomial distribution calculator (http://stattrek.com/online-calculator/binomial.aspx), you can see that if you assume it's random who wins (50% chance for each player), the probability of getting 60 or more wins out of 90 is 0.103%, about 1/1000. So obviously you want to say that these results are meaningful (99.9% chance it's not coincidence), but man, I have no idea why Rebuild/Silver would favor P2.

This is how stats works, right? Someone who knows this stuff better than me should check, but this feels right to me.
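
The calculator figure can be checked with a few lines of Python. This is only a sketch of the same binomial tail computation under the assumptions stated in the post (90 independent games, each a 50/50 coin flip). It says nothing about whether those assumptions fit the data.

```python
import math

def binom_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Chance of 60 or more wins in 90 fair games; the post quotes about 0.1%.
p_60_of_90 = binom_tail(60, 90)
print(f"P(X >= 60 | n=90, p=0.5) = {p_60_of_90:.4%}")
```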

TheOthin · « **Reply #15 on:** December 23, 2014, 11:00:41 pm »

I'm confused about the data. Some of the results show the P1 win rate and P2 win rate adding up to less than 100%, while some of them show them adding up to more than 100%. Either one of these would be understandable in a vacuum, by counting ties as a victory either for neither or for both, but I'd expect that to mean total win rates either never more than 100% or never less than 100%. Is it possible for the results to end up on both sides of 100% with a consistent method of processing the data?

Unless it gets thrown off by double-counting games where both players take the same opening. Hmm.

heron · « **Reply #16 on:** December 23, 2014, 11:04:00 pm »

Quote from: TheOthin on December 23, 2014, 11:00:41 pm

I'm confused about the data. Some of the results show the P1 win rate and P2 win rate adding up to less than 100%, while some of them show them adding up to more than 100%. Either one of these would be understandable in a vacuum, by counting ties as a victory either for neither or for both, but I'd expect that to mean total win rates either never more than 100% or never less than 100%. Is it possible for the results to end up on both sides of 100% with a consistent method of processing the data?

Unless it gets thrown off by double-counting games where both players take the same opening. Hmm.

They don't add up to 100% because of the games where players take different openings. For example, you would expect the win rates for a curse/curse opening to sum to about 0%.
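
A toy Python example may make this concrete. It assumes, as suggested above, that each seat's win rate for an opening is computed over the games in which that seat used the opening. The five game records are made up purely for illustration and do not come from the real logs.

```python
from collections import defaultdict

# Made-up records: (P1 opening, P2 opening, winner), where winner is 1 or 2.
games = [
    ("Rebuild-Silver", "Swindler-Silver", 1),
    ("Rebuild-Silver", "Rebuild-Silver", 1),
    ("Swindler-Silver", "Rebuild-Silver", 2),
    ("Swindler-Silver", "Rebuild-Silver", 2),
    ("Rebuild-Silver", "Swindler-Silver", 2),
]

def seat_win_rates(records):
    """Return {opening: {seat: [wins, games]}}, counted separately per seat."""
    tally = defaultdict(lambda: {1: [0, 0], 2: [0, 0]})
    for p1_open, p2_open, winner in records:
        for seat, opening in ((1, p1_open), (2, p2_open)):
            tally[opening][seat][1] += 1
            tally[opening][seat][0] += int(winner == seat)
    return tally

rates = seat_win_rates(games)
p1_wins, p1_games = rates["Rebuild-Silver"][1]
p2_wins, p2_games = rates["Rebuild-Silver"][2]
p1_rate = p1_wins / p1_games
p2_rate = p2_wins / p2_games
print(f"P1 with Rebuild-Silver: {p1_wins}/{p1_games}")
print(f"P2 with Rebuild-Silver: {p2_wins}/{p2_games}")
print(f"sum of the two rates: {p1_rate + p2_rate:.2f}")
```

Here the two rates are measured over different sets of games, so nothing forces them to sum to 100%. Only in pure mirror games (and ignoring ties) would they have to.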

scott_pilgrim · « **Reply #17 on:** December 23, 2014, 11:20:59 pm »

Wait, these aren't the same players opening the same thing against each other? Okay, I get it now. Yeah, disregard my entire previous post, this is not a binomial distribution. I'm not sure the results mean anything then; I think you would only care about games where both players go for the same opening.

Edit: Wait, so the number of occurrences is just the total number of games where that opening happened at all? But do we know how many times P1 opened that way, and how many times P2 did? That seems like it would be helpful to know.

ben_king · « **Reply #18 on:** December 24, 2014, 12:07:51 am »

Quote from: JW on December 23, 2014, 05:35:10 pm

Only the original poster could answer this question exactly because no one else knows the details of the pool of openings he has analyzed.

But here's a sample calculation that is much simpler: if there are 400 openings that have 50 games each, and in actuality P1 wins 57% for every opening (with no ties), the chance that at least one of those 400 openings would have >=60% win rate for P2 is about 95%.

Details: The standard deviation of P1's win rate for each opening is about 7%, so (by the central limit theorem for simplicity) the chance that each of the 400 openings has a <=40% win chance for P1 is about 0.76%. So the chance that at least one of the openings has a <=40% win chance for P1 is 1-(the chance that none of the openings have this). Assume that the results of games with one opening doesn't depend on results from other openings (not quite true, because each game will be counted twice since each player has an opening) and you get to 95%.

It's kind of surprising that with this many games, it's still possible to lack statistical significance, but there are so many possible openings that most of the openings don't have enough data to really be sure. Even so, I took the data and looked at wins/losses using the binomial test for p-value with 0.57 and 0.43 the expected values for wins and losses for P1 (and vice-versa for P2). Here are some things that we have enough data to say with 95% confidence (p < 0.05):

Either player

Opening Mountebank-<nothing> is significantly better than average (winning percentage of 58.0% averaged over P1 and P2)
Opening Junk Dealer-<nothing> is significantly better than average (winning percentage of 63.2%)
Opening Potion-Silver is significantly worse than average (winning percentage of 45.9%)
Opening Potion-Shanty Town is significantly worse than average (winning percentage of 34.9%)
Opening Smugglers-Silver is significantly worse than average (winning percentage of 40.4%)
Opening Plaza-Silver is significantly worse than average (winning percentage of 40.9%)

P1 openings

Opening Salvager-Silver is significantly better than average (P1 winning percentage of 68.3%)
Opening Oracle-Silver is significantly better than average (P1 winning percentage of 72.5%)
Opening Potion-Storeroom is significantly better than average (P1 winning percentage of 70.3%)
Opening Sea Hag-Silver is significantly better than average (P1 winning percentage of 64.1%)
Opening Swindler-Silver is significantly better than average (P1 winning percentage of 68.4%)
Opening Hermit-Silver is significantly worse than average (P1 winning percentage of 43.0%)
Opening Wharf-<nothing> is significantly worse than average (P1 winning percentage of 44.2%)

P2 openings

Opening Upgrade-<nothing> is significantly better than average (P2 winning percentage of 58.8%)
Opening Potion-Sage is significantly worse than average (P2 winning percentage of 22.2%)
Opening Fishing Village-Fishing Village is significantly worse than average (P2 winning percentage of 26.9%)
Opening Lookout-Silver is significantly worse than average (P2 winning percentage of 27.0%)

There's not a whole lot here that's surprising, except maybe that Wharf-<nothing> is actually a bad opening for P1 (and not that great for P2 either). One other interesting thing that the lists above don't show is that while Swindler-Silver is a good opening for P1, it just missed the significance cutoff for being listed as worse than average for P2.

Edit: a note about Rebuild-Silver, which featured prominently in the first post. The tables in the first post used all 90,000 games, while this analysis used only games between two top-100 players (about 17,000 games). In the smaller dataset, Rebuild-Silver was opened fewer than 50 times, and didn't make it into this analysis.
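
For reference, a binomial test of the kind described at the top of this post can be sketched in plain Python as follows. This is one common definition of the exact two-sided test, not necessarily the exact procedure used for the lists above, and the win/game counts in the usage line are hypothetical.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_test_two_sided(wins, games, p0):
    """Exact two-sided binomial test.

    Sums the probabilities of every outcome that is no more likely than the
    observed one under the null win rate p0.
    """
    observed = binom_pmf(wins, games, p0)
    cutoff = observed * (1 + 1e-9)  # small tolerance for floating-point ties
    return sum(
        pk
        for pk in (binom_pmf(k, games, p0) for k in range(games + 1))
        if pk <= cutoff
    )

# Hypothetical example: an opening whose P1 won 40 of 55 games, tested
# against the overall P1 baseline of 57% mentioned earlier in the thread.
p_value = binom_test_two_sided(40, 55, 0.57)
print(f"p-value = {p_value:.4f}")
```

For P2 the same function would be called with a baseline of 0.43 instead of 0.57.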

GeoLib · « **Reply #19 on:** December 24, 2014, 12:38:42 am »

Quote from: scott_pilgrim on December 23, 2014, 10:47:52 pm

Quote from: Mic Qsenoch on December 23, 2014, 02:26:05 pm
Maybe you should restrict it to games where the top 50 person does the particular opening? It's possible with these smaller numbers that a particular opening just happened to have the weaker player as P1 (or P2) in an above average number of games. Maybe it makes no difference.

Assuming it's random whether the better player is first or second player, then these sample sizes are big enough to be meaningful, I think. Like, if you flipped 100 coins, it would be very rare to get more than 66 heads. In particular, the first row really stands out to me:

Quote from: grsbmd on December 23, 2014, 01:23:15 pm
Top openings that are much better for P2 than P1
Opening Type Number of occurrences P1 win % P2 win %
Rebuild-Silver 5/3 90 36% 67%
Spice Merchant-Lookout 4/3 51 33% 60%
Caravan-Workshop 4/3 59 39% 64%
Quarry-Trade Route 4/3 64 35% 57%
Quarry-Lookout 4/3 77 35% 57%

Putting it in a binomial distribution calculator (http://stattrek.com/online-calculator/binomial.aspx), you can see that if you assume it's random who wins (50% chance for each player), the probability of getting 60 or more wins out of 90 is 0.103%, about 1/1000. So obviously you want to say that these results are meaningful (99.9% chance it's not coincidence), but man, I have no idea why Rebuild/Silver would favor P2.

This is how stats works, right? Someone who knows this stuff better than me should check, but this feels right to me.

I believe you're missing something. Certainly, if we were to select an opening and then run 100 trials with it and we got 66 heads, that would be some indication that we have an unfair coin. This is different, however. We have flipped 100 different coins each 100 times and then selected the ones that were far from 50/50. Here we would expect some of the sets of 100 coins to have 66 heads even if we thought all the coins were fair. For more info on this check out the Texas sharpshooter fallacy.

scott_pilgrim · « **Reply #20 on:** December 24, 2014, 01:28:42 am »

Quote from: GeoLib on December 24, 2014, 12:38:42 am

Quote from: scott_pilgrim on December 23, 2014, 10:47:52 pm
Quote from: Mic Qsenoch on December 23, 2014, 02:26:05 pm
Maybe you should restrict it to games where the top 50 person does the particular opening? It's possible with these smaller numbers that a particular opening just happened to have the weaker player as P1 (or P2) in an above average number of games. Maybe it makes no difference.

Assuming it's random whether the better player is first or second player, then these sample sizes are big enough to be meaningful, I think. Like, if you flipped 100 coins, it would be very rare to get more than 66 heads. In particular, the first row really stands out to me:

Quote from: grsbmd on December 23, 2014, 01:23:15 pm
Top openings that are much better for P2 than P1
Opening Type Number of occurrences P1 win % P2 win %
Rebuild-Silver 5/3 90 36% 67%
Spice Merchant-Lookout 4/3 51 33% 60%
Caravan-Workshop 4/3 59 39% 64%
Quarry-Trade Route 4/3 64 35% 57%
Quarry-Lookout 4/3 77 35% 57%

Putting it in a binomial distribution calculator (http://stattrek.com/online-calculator/binomial.aspx), you can see that if you assume it's random who wins (50% chance for each player), the probability of getting 60 or more wins out of 90 is 0.103%, about 1/1000. So obviously you want to say that these results are meaningful (99.9% chance it's not coincidence), but man, I have no idea why Rebuild/Silver would favor P2.

This is how stats works, right? Someone who knows this stuff better than me should check, but this feels right to me.

I believe you're missing something. Certainly, if we were to select an opening and then run 100 trials with it and we got 66 heads, that would be some indication that we have an unfair coin. This is different, however. We have flipped 100 different coins each 100 times and then selected the ones that were far from 50/50. Here we would expect some of the sets of 100 coins to have 66 heads even if we thought all the coins were fair. For more info on this check out the Texas sharpshooter fallacy.

Yeah, I misunderstood what the percentages meant. I thought all of the games represented were ones in which both players went for the same opening. If we had 90 games in which both players went for a particular opening, and in 60 of those games, P2 had won, that would be strong evidence that the opening is advantageous for P2, right?

Anyway, I now realize that that's not what these statistics are representing, and I think JW's post correctly addresses the actual situation.

Wait, you're talking about a different problem in my reasoning. Yeah, that's true, but still, if the probability is low enough it can be statistically significant, right? If you flipped 1000 different coins 1000 times each, and one of those sets got 600+ heads, you would still conclude that that coin is unfair, because that's just way more likely than that you got it by coincidence, even though it was just 1 set out of 1000.

Anyway all of this is a moot point since it was based on my incorrect understanding of the stats in the OP.

(BTW, the probability of getting 66 or more heads out of 100 flips is 0.0895%, so even if we flipped 100 coins 100 times, there would only be an 8.56% chance to see at least one set with 66 or more heads.)
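
Those two figures can be reproduced with a short Python sketch, assuming fair coins and independence between the 100 sets of flips:

```python
import math

def binom_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# One fair coin, 100 flips: chance of 66 or more heads (quoted as 0.0895%).
p_one_set = binom_tail(66, 100)

# 100 such coins: chance that at least one set shows 66+ heads (quoted as 8.56%).
n_sets = 100
p_any_set = 1 - (1 - p_one_set) ** n_sets

print(f"one set:  {p_one_set:.4%}")
print(f"any of {n_sets}: {p_any_set:.2%}")
```

Changing `n_sets` shows how quickly a "rare" result becomes likely as more sets are examined.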

GeoLib · « **Reply #21 on:** December 24, 2014, 02:08:56 am »

Quote from: scott_pilgrim on December 24, 2014, 01:28:42 am

Quote from: GeoLib on December 24, 2014, 12:38:42 am
Quote from: scott_pilgrim on December 23, 2014, 10:47:52 pm
Quote from: Mic Qsenoch on December 23, 2014, 02:26:05 pm
Maybe you should restrict it to games where the top 50 person does the particular opening? It's possible with these smaller numbers that a particular opening just happened to have the weaker player as P1 (or P2) in an above average number of games. Maybe it makes no difference.

Assuming it's random whether the better player is first or second player, then these sample sizes are big enough to be meaningful, I think. Like, if you flipped 100 coins, it would be very rare to get more than 66 heads. In particular, the first row really stands out to me:

Quote from: grsbmd on December 23, 2014, 01:23:15 pm
Top openings that are much better for P2 than P1
Opening Type Number of occurrences P1 win % P2 win %
Rebuild-Silver 5/3 90 36% 67%
Spice Merchant-Lookout 4/3 51 33% 60%
Caravan-Workshop 4/3 59 39% 64%
Quarry-Trade Route 4/3 64 35% 57%
Quarry-Lookout 4/3 77 35% 57%

Putting it in a binomial distribution calculator (http://stattrek.com/online-calculator/binomial.aspx), you can see that if you assume it's random who wins (50% chance for each player), the probability of getting 60 or more wins out of 90 is 0.103%, about 1/1000. So obviously you want to say that these results are meaningful (99.9% chance it's not coincidence), but man, I have no idea why Rebuild/Silver would favor P2.

This is how stats works, right? Someone who knows this stuff better than me should check, but this feels right to me.

I believe you're missing something. Certainly, if we were to select an opening and then run 100 trials with it and we got 66 heads, that would be some indication that we have an unfair coin. This is different, however. We have flipped 100 different coins each 100 times and then selected the ones that were far from 50/50. Here we would expect some of the sets of 100 coins to have 66 heads even if we thought all the coins were fair. For more info on this check out the Texas sharpshooter fallacy.

Yeah, I misunderstood what the percentages meant. I thought all of the games represented were ones in which both players went for the same opening. If we had 90 games in which both players went for a particular opening, and in 60 of those games, P2 had won, that would be strong evidence that the opening is advantageous for P2, right?

Anyway, I now realize that that's not what these statistics are representing, and I think JW's post correctly addresses the actual situation.

Wait, you're talking about a different problem in my reasoning. Yeah, that's true, but still, if the probability is low enough it can be statistically significant, right? If you flipped 1000 different coins 1000 times each, and one of those sets got 600+ heads, you would still conclude that that coin is unfair, because that's just way more likely than that you got it by coincidence, even though it was just 1 set out of 1000.

Anyway all of this is a moot point since it was based on my incorrect understanding of the stats in the OP.

(BTW, the probability of getting 66 or more heads out of 100 flips is 0.0895%, so even if we flipped 100 coins 100 times, there would only be an 8.56% chance to see at least one set with 66 or more heads.)

Oh all of the numbers in my post are completely made up. My point was just that it's not necessarily statistically significant. It still might be, but someone needs to do the math to figure it out.

JW · « **Reply #22 on:** December 26, 2014, 04:33:27 pm »

Quote from: grsbmd on December 24, 2014, 12:07:51 am

It's kind of surprising that with this many games, it's still possible to lack statistical significance, but there are so many possible openings that most of the openings don't have enough data to really be sure. Even so, I took the data and looked at wins/losses using the binomial test for p-value with 0.57 and 0.43 the expected values for wins and losses for P1 (and vice-versa for P2). Here are some things that we have enough data to say with 95% confidence (p < 0.05):

Here p<0.05 only means that if the true expected rate were 0.57/0.43, you would get a result "this extreme" less than 0.05 of the time. Since you presumably tested many hundreds of different hypotheses (the win rates for each opening, both overall and for player 1/player 2 separately), we would expect about 1/20th of them to fall outside those bounds even if all openings had a 0.57/0.43 expected win rate.

This isn't to say that I think Mountebank isn't a good opening. But when testing many hundreds of hypotheses, p < 0.05 isn't strict enough to avoid many false positives.

One correction (the Bonferroni correction) that limits the chance that multiple comparisons lead to false positives: if you would apply a 0.05 threshold for the p-value for a single comparison, apply a 0.05/N threshold for N comparisons.
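As a rough sketch of what that means in practice: the number of tests (500) and the three p-values below are invented for illustration and are not taken from the actual data, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate near alpha."""
    return alpha / n_tests

def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Expected number of 'significant' results if every null hypothesis is true."""
    return alpha * n_tests

ALPHA = 0.05
N_TESTS = 500  # hypothetical: "many hundreds" of opening/seat combinations

# Made-up p-values, only to show how the cutoff is applied
p_values = {"opening A": 0.03, "opening B": 0.00004, "opening C": 0.0004}

threshold = bonferroni_threshold(ALPHA, N_TESTS)
significant = [name for name, p in p_values.items() if p < threshold]

print(f"Expected false positives at p < {ALPHA}: {expected_false_positives(ALPHA, N_TESTS):.0f}")
print(f"Bonferroni cutoff: {threshold:g}")
print(f"Still significant: {significant}")
```

Bonferroni is conservative, so an opening that fails the corrected cutoff is not shown to be ordinary; it just hasn't cleared the stricter bar.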

ben_king · « **Reply #23 on:** January 02, 2015, 12:36:16 am »

Quote from: JW on December 26, 2014, 04:33:27 pm

Quote from: grsbmd on December 24, 2014, 12:07:51 am
It's kind of surprising that with this many games, it's still possible to lack statistical significance, but there are so many possible openings that most of the openings don't have enough data to really be sure. Even so, I took the data and looked at wins/losses using the binomial test for p-value with 0.57 and 0.43 the expected values for wins and losses for P1 (and vice-versa for P2). Here are some things that we have enough data to say with 95% confidence (p < 0.05):

Here p < 0.05 only means that if the true expected rate were 0.57/0.43, you would get a result "this extreme" less than 5% of the time. Since you presumably tested many hundreds of different hypotheses (the win rates for each opening, both overall and for Player 1/Player 2 separately), we would expect about 1/20th of them to fall outside those bounds even if all openings had a 0.57/0.43 expected win rate.

This isn't to say that I think Mountebank isn't a good opening. But when testing many hundreds of hypotheses, p < 0.05 isn't strict enough to avoid many false positives.

One correction (the Bonferroni correction) that limits the chance that multiple comparisons lead to false positives: if you would apply a 0.05 threshold for the p-value for a single comparison, apply a 0.05/N threshold for N comparisons.

Totally agree. Unfortunately, I don't have close to enough data to establish significance even with the Bonferroni correction. Even so, here are some results from analyzing mirror openings. I suspect that these are real effects, the significance of which could be established with enough data.

Mirror openings that are better for P1 than P2

Opening	Number of occurrences	P1 win%	Why this might be a real effect
Oracle-Silver	112	73%	Since Oracle messes with the top of the opponent's deck, playing it right after the 1st shuffle can keep an opponent from hitting 5
Swindler-Silver	133	68%	P1's Swindler might hit P2's Swindler and keep him from ever getting to play it
Tournament-Ambassador	52	71%	Tournament is a race, and winning simultaneously gets you a good prize and denies that prize to your opponent

There's only one opening favoring P2 that seems to be a real effect, but I don't have a good explanation for why it would be better for P2.

Mirror openings that are better for P2 than P1

Opening	Number of occurrences	P1 win%	Why this might be a real effect
Hermit-Silver	157	43%	(Remember that 57% is the baseline win percentage for P1, so this is way below that)

It does seem to me (though it's not backed up by the data) that there could be some cards like Trade Route, Forager, or Graverobber that could legitimately be better cards for P2 to open in a mirror since he could get P1 to help activate it for him.
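To make the significance question concrete, here is a minimal sketch of a one-sided exact binomial test for the Hermit-Silver mirror against the 57% baseline. Several things here are assumptions rather than facts from the data: the win count is reconstructed by rounding 43% of 157 games, every game is treated as a plain P1 win or non-win (ties are ignored), and the number of comparisons used for the Bonferroni cutoff is a placeholder.

```python
from math import comb

BASELINE_P1 = 0.57  # baseline P1 win rate used in this thread

def binom_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

games = 157
# Assumption: the table only gives a rounded 43%, so the count is approximate
p1_wins = round(0.43 * games)

# One-sided p-value: chance of P1 winning this few games (or fewer)
# if the true P1 win rate were the 57% baseline
p_low = binom_cdf(p1_wins, games, BASELINE_P1)

n_comparisons = 500  # placeholder; the real number of openings tested isn't stated
cutoff = 0.05 / n_comparisons

print(f"P1 wins (reconstructed): {p1_wins} of {games}")
print(f"one-sided p-value:       {p_low:.6f}")
print(f"Bonferroni cutoff:       {cutoff:g}")
print(f"passes uncorrected 0.05: {p_low < 0.05}")
print(f"passes Bonferroni:       {p_low < cutoff}")
```

Whether the result survives the correction depends heavily on the real number of comparisons, so the last line should be read as an illustration and not as a verdict.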

JW · « **Reply #24 on:** January 02, 2015, 01:09:14 am »

It's easy to see why Tournament-Ambassador has extra Player 1 advantage: junking attacks favor the first player, who can reshuffle before the junk cards end up in their deck. Tournament also favors the first player because once you buy a Province you can use it to block opposing Tournaments, so your opponent may not even get to a Province. On top of that there is the race to Prizes that you mention.
