1

**Dominion General Discussion / Re: Bully award**

« **on:**February 15, 2012, 10:44:13 pm »

2

Open Silver/Silver on a board with strong 5s and nothing else, then fail to hit $5 until turn 7

3

Also, at the point in the game where Adventurer will net you $4, the cycling it gives you is bad, not good.

4

Some of the cards here make me think I am bad at Dominion. Nevertheless:

Most bought/gained:

Chapel (98.

Crossroads (93.1)

Fishing Village (99.0)

Tournament (93.9)

Remake (91.5)

Wharf (96.2)

Mountebank (88.9)

Festival (91.2)

Border Village (95.5)

King's Court (90.4)

A lot of good options here. I would probably go remake/chapel for 4/3 and wharf/chapel for 5/2, and then try to pick up wharves/FV and money, with KC later. I feel like this kind of board often leads to one guy going green early and hanging on for the win while the opponent builds his engine to go off a turn too late.

And least bought/gained:

Chancellor (5.2)

Oracle (5.3)

Philosopher's Stone (5.7)

Thief (8.7)

Bureaucrat (12.4)

Navigator (10.4)

Cache (5.7)

Harvest (10.5)

Contraband (11.5)

Adventurer (2.3)

Oracle/BM might be the right thing here, or Bureaucrat/BM.

5

Re: predictive value of "level".

Level (defined as meanskill - sigma) isn't supposed to be a predictive measure anyway, so why would it be predictive? It's just notational shorthand. I mean, a guy whose estimated rank is 30+/-20 is a lot different than a guy whose estimated rank is 15+/-5, but "level" treats them the same. Of course it's not going to be as good a predictor as, say, mean skill.
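To make the shorthand point concrete, here is a tiny sketch using the definition above (level = meanskill - sigma). The function name `level` is just illustrative.

```python
def level(mean_skill: float, sigma: float) -> float:
    """'Level' as defined in the post: mean skill minus sigma."""
    return mean_skill - sigma


wide = level(30, 20)    # estimated skill 30 +/- 20
narrow = level(15, 5)   # estimated skill 15 +/- 5

# Two very different players collapse to the same level of 10.
print(wide, narrow)
```

The two players differ in mean skill by 15 points, but the shorthand cannot tell them apart.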

Re: arbitrary cutoffs

There is this notion of "consistency" in estimators: you have some parameter t you want to estimate, and you have some set of n observations to estimate it from. You generally want the following: as n increases, your estimate gets closer to t. Dropping observations from consideration, even ones from the relatively recent past, guarantees that your estimators will not be consistent. This is a pretty bad thing, considering that you don't really get any offsetting benefit from it.
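A rough illustration of the consistency point, under simplifying assumptions that are mine and not the leaderboard's: skill is fixed, each game is an independent noisy observation, and the estimate is a plain sample mean. With a hard cutoff that keeps only the last `window` games, the standard error stops shrinking once n passes the window.

```python
import math
from typing import Optional


def standard_error(n_games: int, noise_sd: float,
                   window: Optional[int] = None) -> float:
    """Standard error of a sample-mean estimate from i.i.d. game results.

    window=None uses every game; otherwise only the most recent
    `window` games count (a hypothetical hard cutoff).
    """
    used = n_games if window is None else min(n_games, window)
    return noise_sd / math.sqrt(used)


for n in (100, 1_000, 10_000):
    all_games = standard_error(n, noise_sd=1.0)
    last_100 = standard_error(n, noise_sd=1.0, window=100)
    print(n, all_games, last_100)
```

Using every game, the error keeps falling as n grows; with the cutoff it is stuck at the 100-game value no matter how much more is played.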

Re: decaying the past

There are two things going on here, which it seems some posters are missing. Suppose first that skill levels are fixed, like people never improve or get worse, and our goal is to correctly assess everyone's skill level in an asymptotically consistent sense. Then we should not drop anything from the past, and all observations should be equally weighted. Then the problem is basically simple except for what prior beliefs we have about the population of isotropic players.

Okay, but skill levels aren't fixed, so we have to do something else. The Glicko solution is basically to increase the variance of the prior on each player over time. This naturally decays the impact of older games. This is the "gamma" that rspeer mentioned, as far as I can tell. Now applying this solution, but only on days you play, has a totally counter-intuitive effect on rankings. Consider two guys A and B, who on day 0 have identical mu/sigma rankings. Then A goes to study for the bar exam, while B plays a game a day for the next two months, during which time his results are right in line with his previous ranking. This system will claim, obviously implausibly, that we are MORE uncertain about B's ranking than A's, which doesn't make any sense at all.
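The A/B example can be sketched with a toy scalar Gaussian filter. This is not Glicko or TrueSkill, just a stand-in I am assuming for illustration: each period the prior variance is widened by `gamma_sq`, and each game is treated as one observation with noise variance `obs_var`. All parameter values are hypothetical, and the starting variance is chosen below the filter's steady state, which is the case where B ends up strictly more uncertain than A.

```python
def posterior_variance(prior_var: float, obs_var: float) -> float:
    """Variance after one Gaussian observation (conjugate normal update)."""
    return prior_var * obs_var / (prior_var + obs_var)


def variance_after(days: int, plays: bool, inflate_when_idle: bool,
                   start_var: float = 1.0, gamma_sq: float = 0.5,
                   obs_var: float = 4.0) -> float:
    """Track a player's skill variance over `days` periods."""
    var = start_var
    for _ in range(days):
        if plays or inflate_when_idle:
            var += gamma_sq                          # widen the prior (drift)
        if plays:
            var = posterior_variance(var, obs_var)   # one game of information
    return var


# Inflation applied only on days you play: idle A is frozen at the start value.
a_play_only = variance_after(60, plays=False, inflate_when_idle=False)
b_play_only = variance_after(60, plays=True, inflate_when_idle=False)

# Inflation applied every calendar day: idle A becomes much less certain.
a_calendar = variance_after(60, plays=False, inflate_when_idle=True)
b_calendar = variance_after(60, plays=True, inflate_when_idle=True)

print(a_play_only, b_play_only, a_calendar, b_calendar)
```

Under the play-day-only rule the idle player's variance never moves, so he can never look less certain than when he left. Under the calendar rule the ordering flips to the intuitive one.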

Now, rspeer mentioned a problem about players playing badly at first and not being able to dig themselves out of the hole fast enough. To my mind, this isn't a problem: our best estimate of their level is what it is under the parameters of the model, so meh. But I think his comment reflects a prior belief about the distribution of skill levels that is a) not accounted for in the model, b) probably true. That belief is that the rate of change of "true" skill levels is much higher for players with low rankings than it is for players with high rankings. To me this is obviously true; when you suck, it's easy to become marginally competent, just read dominionstrategy.com. When you are mediocre, it is harder but not impossible to become strong. When you are strong, it is difficult to become elite OR to become mediocre. When you are elite, it's hard to move anywhere. The higher your meanskill ranking is, the lower the variance of the drift of your meanskill, regardless of the variance of your meanskill.

So if this is really the problem you are trying to solve with all this tweaking of the system, then just use non-uniform gamma based on meanskill. Problem solved, and in a nice Bayesian way.
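One way the non-uniform gamma could look, as a sketch only: the exponential form and the `base` and `scale` values below are made up for illustration, and any decreasing function of meanskill would express the same idea.

```python
import math


def drift_sd(mean_skill: float, base: float = 1.0, scale: float = 25.0) -> float:
    """Hypothetical per-period drift (gamma) that shrinks as mean skill rises."""
    return base * math.exp(-mean_skill / scale)


# Weak players get a wide drift, strong players a narrow one.
for mu in (5, 25, 45):
    print(mu, drift_sd(mu))
```

The rating update would then add `drift_sd(mu) ** 2` to a player's variance each period instead of one shared constant.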

6

If the purpose of the leaderboard is to provide a crude tool for figuring out how to play players who are roughly at your skill level, it would be better to just have five levels or something. If the idea is to provide a more serious estimate of playing skill, it seems that all the parameters are more or less misconfigured.

1) Having a step function that is either time-based or duration-based whereby games drop off the ranking is bad; using decay or something Bayesian instead is better.

2) The skill variance calculations from the old board weren't good; they were too close for players of widely different volumes of play.

3) The "level" system which is some linear combination of estimated skill and variance is kind of arbitrary and misleading.

4) The assumed skill for new players is probably wrong; most unranked players are kind of bad. This is primarily a variance issue though: winning or losing against players with hardly any record shouldn't do much to rankings. That's not happening, though.

5) People being able to "game" the system by playing sock puppets is really quite avoidable; just restrict the information that games against any particular opponent can add to the system.

I don't know a lot about the guts of TrueSkill, but it can hardly be rocket science; I'd be happy to do some research and help in tweaking parameters to make a system that reflects the goals of the leaderboard, whatever those are. (That last question is really quite important, btw.)

8

Glad you liked our book. -ja

Poker can also be played purely mathematically (read the excellent "Mathematics of Poker" by Bill Chen) but you're still going to get destroyed by an experienced player at the table because he'll be able to read your body language and make good decisions based on that while he may not even know the correct odds.

9

hgfalling

Eastern US time.

10

Another way to crudely estimate this effect in percentage terms is the following.

Assume the following:

- Every player has a true "win probability" x1 or x2 that is their %win against an average player in a long match where both players go first equally often, and these win probabilities are roughly normal over the population with some standard deviation s.
- There is a true 1st player advantage t, and win probabilities are linear so that if x1 plays x2 and goes first, w(x1)=t+x1-x2+1/2.

The win rate of the player who goes first overall is t+1/2 - (x2-x1)^2

This is equal to the overall reported "first player win percentage".

x2-x1 is normally distributed around zero with variance 2s^2 (twice the variance of each part), so the expected value of its square is E[(x2-x1)^2] = 2s^2.

Sloppy quick looking at win pcts around rank 100 suggests s~0.08 or so, which means that the suppression of t from the player selection bias would be around 1.2% (win pct). So if overall CR data suggests that the first player wins 55-45 (what is the actual number?), an unbiased number would be perhaps 56-44 or so.

Now this model is pretty crude, but it would be easily adapted to Monte Carlo if someone who had different priors on the composition of the playing population really wanted to know the answer to this.
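A rough Monte Carlo version of the model above, for anyone who wants to swap in different population assumptions. Two things here are my assumptions, not from the post: the seating rule (the first player is x1 with probability 1/2 + (x2-x1)/2, so the weaker player goes first more often, which reproduces the t + 1/2 - (x2-x1)^2 average for a given pair), and the value t = 0.06, which is only a placeholder.

```python
import random


def clamp01(p: float) -> float:
    """Keep a probability inside [0, 1]."""
    return min(1.0, max(0.0, p))


def first_player_win_rate(t: float, s: float,
                          n_pairs: int = 200_000, seed: int = 0) -> float:
    """Average win probability of whoever goes first, over random pairings."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        x1 = rng.gauss(0.5, s)
        x2 = rng.gauss(0.5, s)
        d = x2 - x1
        # ASSUMPTION: the weaker player is more likely to be seated first.
        if rng.random() < clamp01(0.5 + d / 2):
            win = t + 0.5 - d   # x1 goes first: w = t + x1 - x2 + 1/2
        else:
            win = t + 0.5 + d   # x2 goes first: w = t + x2 - x1 + 1/2
        total += clamp01(win)
    return total / n_pairs


t, s = 0.06, 0.08               # t is a placeholder; s ~ 0.08 is from the post
observed = first_player_win_rate(t, s)
predicted = t + 0.5 - 2 * s * s  # closed form: t + 1/2 - 2s^2
print(observed, predicted)
```

For s = 0.08 the closed-form suppression is 2s^2 = 0.0128, and the simulated rate should land close to it. Changing the population distribution or the seating rule only requires editing the two lines that draw x1, x2 and pick who goes first.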
