1

**Dominion Online at Shuffle iT / Re: How many Players?**

« **on:** May 06, 2018, 04:35:39 pm »

Those are only the accounts that have played rated 2 player games.


2

Yes, you'd usually expect that. For Feodum/Masterpiece, both cards are not gained much without the other, but are gained a lot when both are present. For Capital/Mandarin, I'd expect that Capital increases the gain% of Mandarin more than the other way around.

Or like, on a board where you go for a Stash deck and you buy Duchy, Duchess is a "might as well" addition. Or am I misinterpreting the statistic?

It's the other way around: on a board with Duchess, you're more likely to go for a Stash deck. (We don't know the number for Duchess on a Stash board.)

But there's overlap, isn't there? Like, Masterpiece/Feodum has a similarly high synergy factor to Feodum/Masterpiece.

3

Or like, on a board where you go for a Stash deck and you buy Duchy, Duchess is a "might as well" addition. Or am I misinterpreting the statistic?

It's the other way around: on a board with Duchess, you're more likely to go for a Stash deck. (We don't know the number for Duchess on a Stash board.)

4

For 3) the difficulty is: whose card categories do I use? I hope that the PCA analysis, when I do it, will provide some insight into card categories.

I know that people can argue about what defines a Village. I would suggest that you pick whatever definition you like. Given that most villages are uncontroversial, it shouldn't matter too much for the results.

I'm skeptical that PCA will deliver anything useful, but I'd be happy to be wrong about that.

5

But that runs against the idea of "how much does the presence of the card change the gain probabilities of cards?". By that definition, the presence of Counting House doesn't affect the probability of gaining Counting House conditional on it being present.

I don't like the own-effect of the impact factor, because the presence of the card in the kingdom shouldn't affect the average probability to gain it.

It made sense to me because the idea was to see how much the card changes your gains, and surely the presence of a card changes your ability to gain itself.

Yeah, to put it another way: if, in a game with Counting House, you're not much more likely to gain Counting House than you are in a game without Counting House, that's evidence that Counting House is a fairly low-impact card whose presence on the board doesn't usually matter.

I wouldn't call it "synergy factor" if it sums up the absolute values of synergy and anti-synergy.

6

I don't like the own-effect of the impact factor, because the presence of the card in the kingdom shouldn't affect the average probability to gain it.

But shouldn't it be smaller for cards that show up more frequently (Platinum), as it boils down to:

P(gain X|X is in supply) * (1 - P(X is in supply)),

and P(X is in supply) is higher for Platinum? (It can be offset by a higher probability of gaining Platinum.)

The other thing that I noticed with the current formula is that a card Y that is gained independently of all other cards still contributes to the impact factors of the other cards:

P(gain Y|Y is in supply) * (P(Y is in supply|X is in supply) - P(Y is in supply) )

In that sense, I like the alternate formula more than the current one.
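To make the two terms concrete, here is a toy calculation with made-up probabilities (a sketch; none of these numbers come from the actual data, and the function names are mine):

```python
def own_effect(p_gain_x, p_x_in_supply):
    # Own-effect term: P(gain X | X is in supply) * (1 - P(X is in supply)).
    # It shrinks as P(X is in supply) grows, e.g. for a card like Platinum
    # that appears in many games.
    return p_gain_x * (1 - p_x_in_supply)

def cross_term(p_gain_y, p_y_given_x, p_y):
    # Contribution of card Y to X's impact factor:
    # P(gain Y | Y in supply) * (P(Y in supply | X in supply) - P(Y in supply)).
    # Non-zero whenever the supplies are correlated, even if Y is gained
    # independently of all other cards.
    return p_gain_y * (p_y_given_x - p_y)

# Made-up numbers: a card in the supply half the time has a smaller
# own-effect than an equally gained card in only 10% of games.
frequent = own_effect(0.9, 0.5)
rare = own_effect(0.9, 0.1)

# An independently gained Y still contributes if supply correlation makes
# P(Y in supply | X in supply) differ from P(Y in supply).
leak = cross_term(0.5, 0.12, 0.10)
```

With these numbers the frequent card's own-effect (0.45) is about half the rare card's (0.81), and the independently gained Y still leaks a small positive contribution (0.01), which is exactly the issue raised above.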

Wishlist:

1) Also show a version that excludes Copper, Curse, Ruins, Estate (am I missing something?)

2) For each card, show the top 3 cards that contribute the most in a positive or in a negative way to the impact factor.

3) I (still) would like to see the impact factor (and number of cards gained from that pile) for categories of cards.

4) Analysis with games from Shuffle iT.


7

I missed that the absolute amount of change is added up.

8

I don't think that this measure is useful. If I understand it correctly, it is high when you're more likely to gain all other cards in its presence.

Wouldn't that at first glance mean that the card is weak, because I'm more likely to want other cards?

Junkers make people gain Curses, Ruins, etc., such that those cards end up high on the list, but not because people want to gain them.

Or it really favours engines, where I want to gain a lot of different cards.

In the end, I can maybe explain why a card has a high or low impact factor, but I wouldn't know the reason without knowing the card.

I think a more useful version would define categories for cards (Village, Smithy,...) and calculate the impact on those, e.g. am I more likely to gain Villages in the presence of Rebuild or not.


9

I'm calculating mid-season forecasts of the league that simulate the outstanding games based on your rating. The results can be found here. They show the expected number of points at the end of the season and the probability of finishing 1st to 6th for each player. (For the A-Division, the win probability accounts for the champion match, but the expected points do not.)

To show the evolution over time, there are graphs for each division. They show the expected points for each player after each result that has been reported.

There are also graphs for each player. They show the probability of finishing 1st to 6th, calculated after each day of the season.

The spreadsheet and graphs should be updated at least once daily.

Methodology:

All outstanding games are simulated 100,000 times. In each simulation, a player's skill is drawn from a normal distribution with mean mu and standard deviation phi as given by the current official leaderboard.

Tie probability is set to 2%, and the win probability is 98%/(1+exp(-(skill1+FPA-skill2))). FPA is the first-player advantage, set to 0.5; each player goes first in 3 of the 6 games. That corresponds to about a 60% win chance for the first player against an equal opponent.
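A minimal sketch of that simulation loop (the function name and structure are my own; the actual forecasting code may differ):

```python
import math
import random

def simulate_match(mu1, phi1, mu2, phi2, n_sims=100_000, fpa=0.5, tie_prob=0.02):
    """Sketch of the forecast simulation described above: each simulation
    draws both players' skills from N(mu, phi), then plays 6 games with
    player 1 going first in 3 of them. Ties count as half a win."""
    total_points = 0.0
    for _ in range(n_sims):
        s1 = random.gauss(mu1, phi1)  # skill drawn once per simulation
        s2 = random.gauss(mu2, phi2)
        for game in range(6):
            adv = fpa if game < 3 else -fpa  # first-player advantage
            # win probability: 98% / (1 + exp(-(skill1 + FPA - skill2)))
            p_win = (1 - tie_prob) / (1 + math.exp(-(s1 + adv - s2)))
            r = random.random()
            if r < p_win:
                total_points += 1.0
            elif r < p_win + tie_prob:
                total_points += 0.5  # tie
    return total_points / n_sims  # expected points for player 1 (out of 6)
```

For equal opponents this returns about 3 points, and with fpa=0.5 the first player's single-game win chance is 0.98/(1+exp(-0.5)) ≈ 61%, matching the "about 60%" above.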

10

I'm glad that the horrible idea of going for a simple Rebuild strategy after two hours of heavier thinking can at least be used as a reference.

11

I think that it's rather weak, because it's slow, costs 5, and can only gain cards costing up to 4.

Therefore, you want to get it as soon as possible. Whether you do want it still depends on the alternatives that cost $5... often there's something better, and the game is too short to get much out of Cobbler. On weak boards it can be good, however. If there's a weak engine, it might be nice to have 2 Cobblers alternating to ensure that you can kick off.


12

For War, it should be enough that your first card is not trashed but discarded, to cover your Cursed Village.

14

The following board made me calculate the probabilities in this post, but the lesson learned is hopefully useful more broadly on Remake boards:

Having $3/$4, I opened Haven/Remake. My opponent opened Silver/Remake (as was also suggested in spectator's chat).

So why did I prefer Haven over Silver? The first thought is that you really don't want your Remake to miss T3/4, because you'd fall behind on trashing and Ghost Ship attacks will hurt a lot. The other advantage of Haven is that it makes it more likely that your Remake finds the Estates if you draw it on turn 3, and it makes a Copper or Estate miss the shuffle if you draw it on turn 4.

On the other hand, opening Silver makes getting Ghost Ship on T3/4 more likely. But it might be a bit awkward to play if you draw it without a Village on T5/6. (You could draw your Remake dead, or you already have Remake in your hand such that you prefer trashing.)

A third possible opening would be Village/Remake. On T3/T4 this is going to be worse than Haven/Remake, but you really want many Villages, so maybe that's worth it. (In my opinion it beats the Silver/Remake opening, but I'd still prefer Haven/Remake.)

I calculated the outcomes on turn 3 and 4 under the assumption that you maximize the number of trashed Estates (e.g. on T3 you set aside Remake if there aren't 2 Estates in your hand, or you set aside an Estate if Remake is not in your hand), and that you otherwise prefer to get to $5. On $3 or $4 you take Villages, but the 3rd time you take a Silver if you didn't open with it. (Maybe only the 4th one should be Silver, but this is irrelevant for this post, as I will only consider T3/4; you can think of the additional Silver as a Village if you prefer.)

Here are the trashing probabilities and the probability to hit 5 (when Remake is not in your hand):

Haven and Village openings have a 91% chance to trash, whereas with a Silver opening the chance is only 83%. Given that you'd be in a very bad position in the extra 8% of games where you miss, that seems a significant advantage of opening a cantrip with Remake. The advantage of Haven over Village is that more Estates are trashed on average (+0.25 vs. Village and +0.42 vs. Silver). Chances to hit $5 are low, however.
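As a sanity check, a rough Monte Carlo along these lines (my own sketch; it ignores Haven's set-aside and only asks whether Remake is available on T3 or T4) lands close to the 83% vs. 91% figures:

```python
import random

def remake_seen_prob(opening_cantrip, n_sims=200_000):
    """Estimate the chance that Remake is drawn on turn 3 or 4, as a proxy
    for 'you get to trash'. Deck after the opening: 7 Coppers, 3 Estates,
    Remake, plus Haven (a cantrip, modelled only as +1 draw) or Silver."""
    hits = 0
    for _ in range(n_sims):
        deck = ['C'] * 7 + ['E'] * 3 + ['R'] + (['H'] if opening_cantrip else ['S'])
        random.shuffle(deck)
        to_draw = 10  # two hands of 5 cards across turns 3 and 4
        i = 0
        while to_draw > 0 and i < len(deck):
            if deck[i] == 'H':
                to_draw += 1  # cantrip replaces itself with a draw
            i += 1
            to_draw -= 1
        if 'R' in deck[:i]:
            hits += 1
    return hits / n_sims
```

With a Silver opening this is exactly the chance that Remake is among the first 10 of 12 cards (10/12 ≈ 83%), and Haven's extra draw pushes it to roughly 91%, consistent with the numbers above.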

Here is what the average deck looks like after turn 4 ("net stop cards" is defined as the number of Estates, Coppers, Silvers, Remake minus the Ghost Ships):

The advantage of 0.4 fewer stop cards with Haven might not seem that big, but there's a bit more to consider:

1) After T4, you draw 2 stop cards from the bottom of your deck with a Silver opening compared to 1 with Haven/Village.

2) With Haven, there's a 42% chance that you played Haven on turn 4, such that it set aside a Copper or Estate that misses the shuffle together with Haven.

Therefore, on average you draw about 3 of your 7.9 remaining stop cards with Silver and 4 of your remaining 8.1 stop cards with Haven. (Ghost Ship could make you draw more, but you're unlikely to have Ghost Ship and Village in your hand on T5 when opening Silver.) Hence, you are significantly more likely to draw Remake for the second time on T5 with Haven and would trigger another shuffle after T5 or during T6. (Due to this consideration it might be better to not take a Silver on T3/4, reducing your average net stop cards to 9.1.)

Finally, those are all the outcomes on turns 3 and 4 that above calculations are based on:

15

I think you should mention that it benefits a lot from starting your turn with a larger hand (duration draw, Expedition) or from sifting (Dungeon, but other sifters can be fine too).

Without those, Shepherd is often not reliable enough in my opinion. For example, how well does it do with just Market Square and Trade Route (as light trashing)?


16

Bug report: Stef's game against should16 shows in latest games as 423 days ago.

This means that the game doesn't count for the ranking, as the user has been banned.

http://dominion.lauxnet.com/scavenger/?user=Stef&num_results=10

17

I'll be playing Jan (Netherlands) on Wednesday at 19:30 UTC.

18

Asking for automatch to work with created tables would be a lot more feasible. I would totally agree with that.

What does automatch with created tables mean? If there are two players at separate tables who want to play their next game with Black Market, should they get matched automatically? In practice, you would wait forever.

19

Why is the update of mu proportional to phi^2?

It's intuitive that the update should increase with phi, as higher uncertainty about the skill makes you update your beliefs more when new information comes in. But I guess your question is why it's the square. For that you'd have to consult the Glicko paper.

20

I think the correlation between games played and µ is hiding some important information: the historical record of all games played on other systems. Of course you don't have that data, and maybe it's missing completely at random, but I doubt it.

That is true to some extent. I only took accounts with at least 100 games. By then mu should be about where the starting skill is due to previous experience. That explains the dispersion at the left end. But the other problem is that I'm just looking at the cross-section of players right now.

So let me attempt something different that aims at seeing how a player's skill changes over time. Here I'm only taking the 1745 players with at least 1000 games and looking at how their mu has changed since game 100:

21

For this analysis, I’m using the same data as Scavenger. (If you don’t know it, check it out!)

I’m using all rated 2-player games played until January 29th.

**Rating System**

I’ll mostly talk about mu, so let’s start with a quick summary of the rating system. You can also find some more info here and in the links contained therein.

**1) mu (µ)**: this is the best measure of your skill and everyone starts with mu=0. It’s a relative measure and the expected win percentage between two players mostly depends on the difference between the two players’ mu. For example, a difference of 1 corresponds to about 73% chance of winning (ties always count as half a win). Here’s a graph that shows this probability in general:

**2) phi (ϕ)**: the second parameter measures the uncertainty around the skill mu. In 95% of the cases a player’s true skill should lie in the interval [mu-2*phi,mu+2*phi]. Players start with phi=0.75.

**3) Level**: the level is simply calculated as 50+7.5*(mu-2*phi). It is therefore a conservative measure of your skill as it takes the lower bound of the interval given above. That also means that players with fewer games (recently) are on average underrated in terms of their level. But you can’t sit on your high level after some (lucky) wins.

**4) sigma (σ)**: this is a measure of the stability of your skill. Players start with sigma=0.033, and it doesn’t move much, because the stability of mu is hard to estimate given the few games per rating period (= 1 day). Given this assumed parameter, a typical player’s skill either gains or loses about 0.033 per day. This makes the estimate of the skill less certain when a player doesn’t play (much), and phi increases.
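The pieces above can be written down directly (a sketch; the plain logistic form for the win probability is an assumption consistent with the ~73% figure for a mu difference of 1):

```python
import math

def win_probability(mu_diff):
    """Expected score for a mu advantage of mu_diff, with ties counting
    as half a win. A plain logistic in the mu difference is assumed here;
    the actual Glicko-2 expectation also depends on the players' phi."""
    return 1.0 / (1.0 + math.exp(-mu_diff))

def level(mu, phi):
    # Conservative skill measure: the lower end of the 95% interval
    # [mu - 2*phi, mu + 2*phi], rescaled as 50 + 7.5*(mu - 2*phi).
    return 50 + 7.5 * (mu - 2 * phi)
```

Note that a brand-new player (mu=0, phi=0.75) gets level 50 + 7.5*(0 - 1.5) = 38.75, and a mu difference of 1 gives win_probability(1) ≈ 0.73.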

**How does the rating change?**

In theory, it’s simple: mu increases, if you win more games than you were expected to. Scavenger also calculates that for you. How much mu changes also depends on your uncertainty phi. The more certain your rating is, the less it will change.

In particular, the formula is:

mu_change = phi^2 * (actual_wins - expected_wins)

So, if your phi=0.2, winning or losing a game makes a difference of mu=0.04 (or level=0.3). If you were expected to win with 75%, then winning adds mu=0.01 and losing subtracts mu=0.03.
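The arithmetic in that example can be checked against the formula (a minimal sketch):

```python
def mu_change(phi, actual_wins, expected_wins):
    # mu_change = phi^2 * (actual_wins - expected_wins)
    return phi**2 * (actual_wins - expected_wins)

# With phi = 0.2 and a 75% expected win: winning adds 0.04 * 0.25 = 0.01,
# losing subtracts 0.04 * 0.75 = 0.03, a swing of mu = 0.04 (level = 0.3).
win = mu_change(0.2, 1.0, 0.75)
loss = mu_change(0.2, 0.0, 0.75)
```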

Uncertainty phi decreases with each game played and increases due to sigma. If your opponent is closer to your skill, phi will decrease more, as the result is more informative (what matters is win_probability*(1-win_probability)). If you play a constant number of games per day, your phi will converge to a certain value. (If you play less afterwards, it will increase again, and vice versa.)

For example, if you play 1/5/10 games per day, phi will end up around 0.26/0.17/0.15.

**Games Played**

Here’s the number of those rated 2-player games recorded per day and the number of what I defined as “active players”, i.e. having played at least 10 games in the last 30 days.

*Edit: the number of games in the left graph should be halved because each game is counted for each player, hence twice.*

There are around ~~20,000~~ 10,000 games played per day, and active players number around 5,000. You can notice the reduction in games played in late October, when the Nocturne preview was available.

**Distribution of Skill**

Here’s the histogram of the current skill of all players, of only active players, and one weighted by the number of games played (in that one, mu is the value on the day the game was played):

The following heat maps show which players get matched most frequently. The right one zooms in on games with at least one player having mu=1.5:

You can see above that the distribution is not centred on mu=0 anymore, but the average is negative. Here is how the average has evolved since the start of the leaderboard:

First, let me be clear that this decline is not a big problem, because what matters is not the absolute value of mu but the difference between two players.

But what’s the reason? As described above, the change of mu depends on the difference between actual wins and expected wins and phi. The former is symmetric: if player 1 outperforms expectations, player 2 underperforms by the same amount. But phi can differ between the two. In particular, if the underperformer has a higher phi than the overperformer, mu of the underperformer will fall more than mu of the overperformer increases and average mu falls. This could happen, because new players (high phi) are doing worse than expected (mu=0) or players that have been away for some time (higher phi) are playing worse than before.

Something to note are the two breaks in the red curve of active players above: at the end of May, the decline stopped when the matching system was changed to make the default match more even (smaller level difference allowed). The second break was at the end of July, when the parameters of the rating system were changed. That increased the level of new players to 38.75 and made matches of new players with experienced positive-mu players more likely.

(Note: I calculated each player’s mu from the start using the current parameters, such that there’s no break in the method. Lowering starting phi from 2 to 0.75 helped to keep average mu more stable, because new players don’t lose that much rating on their first losses anymore. If I calculated today’s ratings with the original parameters, the average would be at -0.85 for all players and -0.6 for active players.)

To round this up, here are the upper percentiles and how they have evolved:

**Beat the Expectation?**

If you want to increase your mu, you need to play better than expected. A question that regularly comes up is whether it’s more beneficial to play a better or weaker opponent. For that I look at the difference between expected outcome and actual outcome for different bins of level difference (I use level here, because that’s what you can set in your matching options). I restrict the sample to the better player being at least level 45. The result is the left panel of this graph:

It shows that a better player slightly underperforms when facing a weaker player. But the difference is hardly significant: playing someone 8 levels higher would give you a 1% better outcome than playing someone 8 levels lower. Therefore, when averaged over all players, the theoretical win probability shown in the first graph matches the outcome well. Some players might still do better when facing someone stronger or weaker.

The right graph shows the overperformance in the n-th game of a player on a given day (only using players with already 100 games). You might think that it’s harder to focus on many games in a row, but that graph doesn’t show a strong effect, either. The caveat is that I can only use the rating day, such that I can’t see whether there’s been some hours of break between games. If someone plays around 0:00 UTC, then games also count for two days.

What you can see from the right graph is that there is an outperformance on average for those players with 100+ games. That means that those players tend to increase their rating when they play. So let’s have a look at the correlation between games played and skill in the following heat map:

There is a mildly positive relationship between the total number of games played and a player’s mu. But you can also see that there’s a lot of variance and playing many games is not sufficient for becoming a good player. Hence, you might want to spend some time on the other sections of this forum or the discord channel.


22

Group A

Markus 6 - 0 crymeariver


23

Group A

Markus (Germany) 6 - 0 Jean-Michel (Finland)


24

Sunday, 28th January at 17:00 UTC:

**Group A:**

**Germany - Finland**

markus - Jean-Michel


25

A: markus 4-2 drsteelhammer

I'll be back for next season.
