Dominion Strategy Forum


Author Topic: Beyond cheaters, this is why we have the isotropish leaderboard  (Read 11683 times)


() | (_) ^/

  • Minion
  • *****
  • Offline Offline
  • Posts: 632
  • Shuffle iT Username: p4ddy0d00rs
  • Nemo dat quod non habet.
  • Respect: +526
    • View Profile
    • BGG profile
+5

1) Hi all been a while yes there is no punctuation in this line unless of course you consider the closed parenthesis at the conclusion of the point indicator to be punctuation i dont but you may

2) I honestly have never ever cheated the Dominion leaderboard, either on Dominion Online or Isotropic Dominion.

3) I've played a handful of games today after playing one yesterday, and that after not playing at all for probably six months or so.

My "Pro" ranking per goko: #2 with 6598 points
My Isotropish ranking: #60 at level 40

The difference is staggering!  I'm sure many of you already know this, but this is the first time I've actually been on Dominion since the Isotropish leaderboard was introduced, and it is quite nice.  Really brings my head down out of the clouds.  8)
Logged

() | (_) ^/

  • Minion
  • *****
  • Offline Offline
  • Posts: 632
  • Shuffle iT Username: p4ddy0d00rs
  • Nemo dat quod non habet.
  • Respect: +526
    • View Profile
    • BGG profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #1 on: March 28, 2014, 11:07:17 am »
+2

.... and then I lost one game and I'm now down to 5167 goko rating.  <3
Logged

flies

  • Minion
  • *****
  • Offline Offline
  • Posts: 629
  • Shuffle iT Username: flies
  • Statistical mechanics of hard rods on a 1D lattice
  • Respect: +348
    • View Profile
    • ask the atheists
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #2 on: March 28, 2014, 01:56:48 pm »
0

was it against a low level player?
Logged
Gotta be efficient when most of your hand coordination is spent trying to apply mascara to your beard.
flies Dominionates on youtube

StrongRhino

  • Witch
  • *****
  • Offline Offline
  • Posts: 468
  • Shuffle iT Username: StrongRhino
  • Respect: +247
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #3 on: March 28, 2014, 09:53:51 pm »
0

.... and then I lost one game and I'm now down to 5167 goko rating.  <3
A low player? I've lost that many points before, it's somewhat frustrating moving so much in one game.
Logged

Polk5440

  • Torturer
  • *****
  • Offline Offline
  • Posts: 1708
  • Respect: +1788
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #4 on: March 28, 2014, 10:04:04 pm »
0

The key is that he hasn't played in 6 months. At some point it becomes like you are a new player again. Variance is very high and history matters little. The decay is faster for goko's system (recent play matters more), so it makes sense that TrueSkill looks more stable at this point.
Logged

Tables

  • Margrave
  • *****
  • Offline Offline
  • Posts: 2816
  • Build more Bridges in the King's Court!
  • Respect: +3349
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #5 on: March 28, 2014, 10:05:28 pm »
0

Wow, I wonder how it feels to get on a winning streak and just so happen to shoot up to the top of the Goko leaderboard? I mean nothing like that's ever happened to me.

But yeah, the Pro leaderboard is uh okay at gauging roughly people's skill but Isotropish is just so much more accurate, and giving uncertainty is nice as well.
Logged
...spin-offs are still better for all of the previously cited reasons.
But not strictly better, because the spinoff can have a different cost than the expansion.

flies

  • Minion
  • *****
  • Offline Offline
  • Posts: 629
  • Shuffle iT Username: flies
  • Statistical mechanics of hard rods on a 1D lattice
  • Respect: +348
    • View Profile
    • ask the atheists
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #6 on: March 28, 2014, 10:19:15 pm »
+1

i don't understand why the sigmas on the isotropish board are so much more tightly clumped than the old isotropic ratings. (for reference: http://dominion.isotropic.org/leaderboard/ - more games? really?)
Logged
Gotta be efficient when most of your hand coordination is spent trying to apply mascara to your beard.
flies Dominionates on youtube

GeoLib

  • Jester
  • *****
  • Offline Offline
  • Posts: 965
  • Respect: +1265
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #7 on: March 28, 2014, 11:42:05 pm »
0

i don't understand why the sigmas on the isotropish board are so much more tightly clumped than the old isotropic ratings. (for reference: http://dominion.isotropic.org/leaderboard/ - more games? really?)

Isotropic had a small increase in uncertainty over time which is not in isotropish. I think that's it.
Logged
"All advice is awful"
 —Count Grishnakh

florrat

  • Minion
  • *****
  • Offline Offline
  • Posts: 542
  • Shuffle iT Username: florrat
  • Respect: +748
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #8 on: March 28, 2014, 11:53:15 pm »
+1

i don't understand why the sigmas on the isotropish board are so much more tightly clumped than the old isotropic ratings. (for reference: http://dominion.isotropic.org/leaderboard/ - more games? really?)
Yeah, I've wondered that as well. The absolute minimum uncertainty on the whole leaderboard is 9.92. Is there some theoretical lower bound for the uncertainty, or can it get arbitrarily close to 0?
Logged

SirPeebles

  • Cartographer
  • *****
  • Offline Offline
  • Posts: 3249
  • Respect: +5460
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #9 on: March 29, 2014, 07:21:32 am »
+5

1) Hi all been a while yes there is no punctuation in this line unless of course you consider the closed parenthesis at the conclusion of the point indicator to be punctuation i dont but you may

That's 'cause you hid 'em all in your username.
Logged
Well you *do* need a signature...

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4384
    • View Profile
    • WanderingWinder YouTube Page
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #10 on: March 29, 2014, 12:36:25 pm »
+2

I really don't understand anyone's basis for the claim that isotropish is "more accurate" than Goko/MF pro.

Joseph2302

  • Jester
  • *****
  • Offline Offline
  • Posts: 858
  • Shuffle iT Username: Joseph2302
  • "Better to be lucky than good"
  • Respect: +576
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #11 on: March 29, 2014, 12:50:25 pm »
0

There's a massive discrepancy at times between Goko and Isotropish ratings. For instance, I'm currently at Goko rating 4853, isotropish level 21. I'm sat in great hall, and there are 6 people within +/- 300 rating of me. On isotropish ratings, they range between levels 12 and 30. Clearly someone's rating is a lot worse? (I'm guessing Goko's)
Logged
Mafia Stats: (correct as of 2017)
Town: 22 games, 8 wins
Scum: 5 games, 3 wins

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4384
    • View Profile
    • WanderingWinder YouTube Page
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #12 on: March 29, 2014, 01:09:33 pm »
+4

There's a massive discrepancy at times between Goko and Isotropish ratings. For instance, I'm currently at Goko rating 4853, isotropish level 21. I'm sat in great hall, and there are 6 people within +/- 300 rating of me. On isotropish ratings, they range between levels 12 and 30. Clearly someone's rating is a lot worse?
Actually, all that follows is that they're much different. System A might be horrible at rating 3 of them and System B might be horrible at rating the other 3. Or one or both might just be horribly misrating you. Or...

Quote
(I'm guessing Goko's)
But my real question is, WHY is this the flat assumption of everyone? Is it just because there's a perception that anything Goko does must be bad...?

SCSN

  • Mountebank
  • *****
  • Offline Offline
  • Posts: 2227
  • Respect: +7140
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #13 on: March 29, 2014, 01:56:36 pm »
+3

Goko's rating system is swingy as hell. Since human skill changes only slowly over time, any estimation of it that's all over the place is horrible.
Logged

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4384
    • View Profile
    • WanderingWinder YouTube Page
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #14 on: March 29, 2014, 02:15:05 pm »
+2

Goko's rating system is swingy as hell. Since human skill changes only slowly over time, any estimation of it that's all over the place is horrible.

There are a few problems with this; 1) we don't actually know if it's swingy, really, because we don't know how much 5 points or 50 points or 500 points means (admittedly, this isn't exactly a point in the system's favor); 2)while (presumably, though I don't have better than anecdotal evidence) skill doesn't change rapidly over time for most players, this doesn't mean that the best estimation of said skill isn't going to move around a bit; 3)I don't actually agree with your assessment that their system is 'swingy as hell' - it actually doesn't seem to move that much at all to me. On the other hand, the isotropish ratings seem INCREDIBLY sluggish.

Polk5440

  • Torturer
  • *****
  • Offline Offline
  • Posts: 1708
  • Respect: +1788
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #15 on: March 29, 2014, 02:17:53 pm »
0

On the other hand, the isotropish ratings seem INCREDIBLY sluggish.

Yes, definitely.
Logged

SCSN

  • Mountebank
  • *****
  • Offline Offline
  • Posts: 2227
  • Respect: +7140
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #16 on: March 29, 2014, 03:02:09 pm »
+4

Goko's rating system is swingy as hell. Since human skill changes only slowly over time, any estimation of it that's all over the place is horrible.

There are a few problems with this; 1) we don't actually know if it's swingy, really, because we don't know how much 5 points or 50 points or 500 points means (admittedly, this isn't exactly a point in the system's favor)

We do know this from looking at the relative rankings. To give an example: earlier today I was briefly above Stef on the Goko rankings after winning a few games in a row, i.e. according to their rating system I would be a favorite in our next match-up, even though we have both played thousands of games on their site of which over a hundred against each other. Goko's conclusion is clearly retarded, because there's no doubt in my (or isotropish's) mind that Stef is the better player. For some more examples, see this post by Andrew.

Quote
2)while (presumably, though I don't have better than anecdotal evidence) skill doesn't change rapidly over time for most players, this doesn't mean that the best estimation of said skill isn't going to move around a bit

It should move around a bit when it has little evidence, but when it has hundreds or even thousands of games on you including a ton of recent data points, it should be changing very conservatively. And Goko's system isn't just moving around a bit, it's bouncing wildly to the tune of white noise.

Quote
3)I don't actually agree with your assessment that their system is 'swingy as hell' - it actually doesn't seem to move that much at all to me. On the other hand, the isotropish ratings seem INCREDIBLY sluggish.

We clearly have very different expectations here, because I think the isotropish ratings are much too volatile still. After having played thousands of games and a sufficient volume recently, it shouldn't be possible to change by a few levels within a single day: the prior of your skill having changed significantly over a very short time-span should be close to zero (by the nature of skill acquisition and decay), so that any significant deviation from expectation over a small sample should be judged as a fluke and thus only very slightly affect ratings.

To make of this a testable prediction: I predict that a running 30-day average of the isotropish ratings will be a significantly better predictor of the outcome of match-ups between players than the ratings as they currently are.
Logged

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4384
    • View Profile
    • WanderingWinder YouTube Page
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #17 on: March 29, 2014, 03:37:46 pm »
+1

Goko's rating system is swingy as hell. Since human skill changes only slowly over time, any estimation of it that's all over the place is horrible.

There are a few problems with this; 1) we don't actually know if it's swingy, really, because we don't know how much 5 points or 50 points or 500 points means (admittedly, this isn't exactly a point in the system's favor)

We do know this from looking at the relative rankings. To give an example: earlier today I was briefly above Stef on the Goko rankings after winning a few games in a row, i.e. according to their rating system I would be a favorite in our next match-up, even though we have both played thousands of games on their site of which over a hundred against each other. Goko's conclusion is clearly retarded, because there's no doubt in my (or isotropish's) mind that Stef is the better player. For some more examples, see this post by Andrew.
Well, I can go rebut that post if you want. Also isotropish had Stef below you for a while several days ago, so there's some doubt in its mind (even if we ignore that, it has you guys within like 1 st dev right now, so it thinks there's a reasonable chance). Further, I guess I think players' skills move faster than you do, because I certainly wouldn't quote thousands of games like they're all relevant. E.g., since the beginning of the month, you've played 99 games, 13 against Stef. He's played 168. You're 7-6 against him this month. Last month, you were 12-8 against him. No games in January. 1-3 in December. November, 3-2. Based on these results, anyway, you're pretty clearly better than him. I don't know what hundreds of games from ancient times you're banking on to confirm your "he's better than me", at least heads up.

Quote
Quote
2)while (presumably, though I don't have better than anecdotal evidence) skill doesn't change rapidly over time for most players, this doesn't mean that the best estimation of said skill isn't going to move around a bit

It should move around a bit when it has little evidence, but when it has hundreds or even thousands of games on you including a ton of recent data points, it should be changing very conservatively. And Goko's system isn't just moving around a bit, it's bouncing wildly to the tune of white noise.
When there are a ton of recent data points, it ought to move towards what those say, ignoring to some extent the older data. And again, you don't know how much it's moving around, because you only see points, and you don't know how much points are worth. It could be that all this random fluctuation is just bumping between 50.001% and 49.999%. Yes, you have rank data, but you don't have how MUCH it favors anyone against anyone else.

Quote
Quote
3)I don't actually agree with your assessment that their system is 'swingy as hell' - it actually doesn't seem to move that much at all to me. On the other hand, the isotropish ratings seem INCREDIBLY sluggish.

We clearly have very different expectations here, because I think the isotropish ratings are much too volatile still. After having played thousands of games and a sufficient volume recently, it shouldn't be possible to change by a few levels within a single day: the prior of your skill having changed significantly over a very short time-span should be close to zero (by the nature of skill acquisition and decay), so that any significant deviation from expectation over a small sample should be judged as a fluke and thus only very slightly affect ratings.
Well, when I lose something like 15 out of 20 against someone (who was ranked reasonably high to start with) and it still has me at like 80% against them, I have to assume isotropish is too stodgy.
To be clearer, the math doesn't really back you up here. If my model has it as a 2% chance Bob beats Tim in any game, and Bob beats Tim 10 games in a row, there's one chance in something on the order of a hundred million billion (about 10^17) of that happening. Your model is wrong, and needs to move. I don't care if you have 10k games, your ratings need to move significantly. Obviously, this is a pretty dramatic example, but even in more realistic scenarios, you can pretty quickly get to things that are 1 in a thousand or worse, very very easily. Now it's possible that you just had that random luck pop up, but I think it's more likely that the players' skills weren't accurately recorded, possibly because of at least somewhat of a skill change.
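To put rough numbers on both cases, here is a minimal sketch. It assumes the games are independent and that the model's per-game probability stays fixed, neither of which is guaranteed for real Dominion games. The helper name `binom_tail_at_most` is made up for illustration.

```python
import math

# Assumption: games are independent and the per-game probability is fixed.
p_upset = 0.02
p_streak = p_upset ** 10  # chance of 10 straight upsets at 2% each

def binom_tail_at_most(k_max, n, p):
    """P(X <= k_max) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_max + 1))

# The "15 losses out of 20" case: at most 5 wins when the model says 80% per game.
p_few_wins = binom_tail_at_most(5, 20, 0.80)

print(f"10 straight 2% upsets:        {p_streak:.3e}")
print(f"at most 5 wins in 20 at 80%:  {p_few_wins:.3e}")
```

Under these assumptions the streak comes out at roughly 1 in 10^17, and the 15-of-20 case at well under one in a million, so in both cases "the model is off" is a far more plausible reading than "pure luck".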

Quote
To make of this a testable prediction: I predict that a running 30-day average of the isotropish ratings will be a significantly better predictor of the outcome of match-ups between players than the ratings as they currently are.

You think a rating taken as an average over the last 30 days of isotropish ratings will be a significantly better predictor of WHICH matchups between players than... the current isotropish ratings? There are lots of holes in this that would need to be filled before it can be considered a testable prediction. First, how are you averaging? Arithmetic mean of mu and arithmetic mean of sigma? Do you have the historical data to calculate this? How do you time-average the data, since it updates real-time? Over what time period are you taking this measurement? Perhaps most important, how do you want to define "better predictor"? You need some kind of error function.

sudgy

  • Cartographer
  • *****
  • Offline Offline
  • Posts: 3431
  • Shuffle iT Username: sudgy
  • It's pronounced "SOO-jee"
  • Respect: +2707
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #18 on: March 29, 2014, 03:41:29 pm »
0

I think the more you play, the better the Pro rating is at knowing your actual score.  WW, you play 29384 games a day so it shows you pretty well.  But, people like me or () | (_) ^/ aren't shown as well on the Pro rating because we rarely play.
Logged
If you're wondering what my avatar is, watch this.

Check out my logic puzzle blog!

   Quote from: sudgy on June 31, 2011, 11:47:46 pm

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4384
    • View Profile
    • WanderingWinder YouTube Page
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #19 on: March 29, 2014, 03:58:32 pm »
0

I think the more you play, the better the Pro rating is at knowing your actual score.  WW, you play 29384 games a day so it shows you pretty well.  But, people like me or () | (_) ^/ aren't shown as well on the Pro rating because we rarely play.
I don't play THAT much, though you have a point. On the other hand, the "level" system on the isotropish has a very similar problem - it all comes down to displaying your rating with a penalty for uncertainty. In any case, my point isn't that Pro rating is a great system (I don't think it is), it's that isotropish is pretty similarly lousy.

dominion123

  • Ambassador
  • ***
  • Offline Offline
  • Posts: 33
  • Respect: +11
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #20 on: March 29, 2014, 05:31:50 pm »
0

There's nothing wrong with goko's rating system, it (presumably) uses some sort of Elo algorithm much like chess ratings. It's mathematically sound; given the ratings of two players you are able to predict the likelihood of one winning over another (+-200 probably means you are very close in skill), and the ratings change by the exact amount it needs to account for the new information (a loss or a win). The criticism here is that it doesn't measure skill very well. Well, it does.

Personally I prefer these kinds of ratings the most because they have an applicable interpretation, namely the likelihood of one player winning over another. I don't know how the isotropic level system works (I'm not familiar with it), but if you conclude that one player is much superior to another despite even developed goko-rankings, then I must say I much prefer goko's. If you gain "points" or whatever based on how many games you play, and not solely who you play against, then I have a problem with it.
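For reference, this is roughly what a textbook Elo system looks like. It is only a sketch: nobody in this thread knows what Goko actually runs, and the scale of 400 and the K-factor of 32 are the conventional chess values, used here purely as assumptions.

```python
def elo_expected(r_a, r_b, scale=400.0):
    """Expected score of player A against player B under the logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))

def elo_update(r_a, r_b, score_a, k=32.0, scale=400.0):
    """Return the new (r_a, r_b) after one game; score_a is 1 for a win, 0 for a loss."""
    delta = k * (score_a - elo_expected(r_a, r_b, scale))
    return r_a + delta, r_b - delta

# Hypothetical ratings: a 200-point favorite loses one game.
print(round(elo_expected(1700, 1500), 3))
print(elo_update(1700, 1500, score_a=0))
```

With these assumed constants a 200-point gap is about a 76% favorite. What a 200-point gap means on Goko depends on its real scale, which is not published in this thread.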
« Last Edit: March 29, 2014, 05:33:07 pm by dominion123 »
Logged

flies

  • Minion
  • *****
  • Offline Offline
  • Posts: 629
  • Shuffle iT Username: flies
  • Statistical mechanics of hard rods on a 1D lattice
  • Respect: +348
    • View Profile
    • ask the atheists
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #21 on: March 29, 2014, 06:56:11 pm »
0

I don't know how the isotropic level system works (I'm not familiar with it), but if you conclude that one player is much superior to another despite even developed goko-rankings, then I must say I much prefer goko's.

Isotropish uses TrueSkill which does seem to represent the point difference as an odds difference through some kind of exponential scaling iinm. 

My goko rating will go down about 60 points if I lose to a player ~800 pts lower than me.  If that happens five times in a row, I've lost 300 points.  That volatility is not what I'd want for maximum accuracy.
Logged
Gotta be efficient when most of your hand coordination is spent trying to apply mascara to your beard.
flies Dominionates on youtube

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4384
    • View Profile
    • WanderingWinder YouTube Page
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #22 on: March 29, 2014, 07:21:52 pm »
+1

I don't know how the isotropic level system works (I'm not familiar with it), but if you conclude that one player is much superior to another despite even developed goko-rankings, then I must say I much prefer goko's.

Isotropish uses TrueSkill which does seem to represent the point difference as an odds difference through some kind of exponential scaling iinm. 
TrueSkill models things as a difference of normal distributions, with the combined normal having an extra parameter for intrinsic variability of the game.

Quote
My goko rating will go down about 60 points if I lose to a player ~800 pts lower than me.  If that happens five times in a row, I've lost 300 points.  That volatility is not what I'd want for maximum accuracy.
But 60 points and 800 points and 300 points don't necessarily mean anything. If it was losing .6 points against someone rated 8 below you, would it be a problem? Well, this is the same thing, they are just showing you extra digits. The other thing is, since you don't know what this means, it could be that someone 800 points lower than you has a 40% chance at winning, and then dropping 300 points only gets you to 45% against someone you originally were rated the same as, which isn't awful for losing 5 games in a row against someone rated a ways below you.

Really, you measure whether the volatility is appropriate or not based on whether it is accurate to predict future games, and the only thing we KNOW is bad about the current system, in terms of worrying about accuracy, is that you can't tell what these predictions are to measure whether it's good or not.

popsofctown

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5477
  • Respect: +2860
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #23 on: March 29, 2014, 10:05:14 pm »
0

How do I increase the certainty that I suck at dominion
Logged

Polk5440

  • Torturer
  • *****
  • Offline Offline
  • Posts: 1708
  • Respect: +1788
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #24 on: March 29, 2014, 10:29:49 pm »
+1

Goko's rating system is swingy as hell. Since human skill changes only slowly over time, any estimation of it that's all over the place is horrible.

There are a few problems with this; 1) we don't actually know if it's swingy, really, because we don't know how much 5 points or 50 points or 500 points means (admittedly, this isn't exactly a point in the system's favor)

We do know this from looking at the relative rankings. To give an example: earlier today I was briefly above Stef on the Goko rankings after winning a few games in a row, i.e. according to their rating system I would be a favorite in our next match-up, even though we have both played thousands of games on their site of which over a hundred against each other. Goko's conclusion is clearly retarded, because there's no doubt in my (or isotropish's) mind that Stef is the better player. For some more examples, see this post by Andrew.
Well, I can go rebut that post if you want. Also isotropish had Stef below you for a while several days ago, so there's some doubt in its mind (even if we ignore that, it has you guys within like 1 st dev right now, so it thinks there's a reasonable chance). Further, I guess I think players' skills move faster than you do, because I certainly wouldn't quote thousands of games like they're all relevant. E.g., since the beginning of the month, you've played 99 games, 13 against Stef. He's played 168. You're 7-6 against him this month. Last month, you were 12-8 against him. No games in January. 1-3 in December. November, 3-2. Based on these results, anyway, you're pretty clearly better than him. I don't know what hundreds of games from ancient times you're banking on to confirm your "he's better than me", at least heads up.

I like this game!

I am 5-2 against Stef this year to date and 9-7 against SheCantSayNo this year to date.

I'd like my Silver medal, please!  ;D (Can't claim gold... Mic Qsenoch wipes the floor with me.)
Logged

jl8e

  • Steward
  • ***
  • Offline Offline
  • Posts: 26
  • Respect: +43
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #25 on: March 29, 2014, 10:33:07 pm »
0

My goko rating will go down about 60 points if I lose to a player ~800 pts lower than me.  If that happens five times in a row, I've lost 300 points.  That volatility is not what I'd want for maximum accuracy.

If you’re losing five times running to players rated significantly below you, your rating is too high, and should be dropping significantly.

Dominion ratings, whatever system is in use, are going to show volatility, because Dominion is a high-variance game. It’s not like chess, where at some point, the higher-ranked player simply is not going to lose. No matter how good someone is, against an average player they’re still going to lose occasionally because of luck. If the skill difference means they win 95% of the time, then in a perfectly-balanced rating system their rating is going to drop significantly when they do lose. Specifically, it’s going to drop by 19 * x, where x is however much they would gain by winning.
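The 19x figure falls out of requiring that a correctly rated player's expected rating change is zero. A quick sketch of that arithmetic follows; the function name is made up for illustration, and it assumes a two-outcome game with no draws.

```python
def fair_loss_multiple(p_win):
    """Loss size, in units of the win gain x, that makes the expected change zero."""
    # p_win * x - (1 - p_win) * loss = 0  =>  loss = x * p_win / (1 - p_win)
    return p_win / (1.0 - p_win)

for p in (0.50, 0.75, 0.95):
    print(f"win {p:.0%}: a loss costs {fair_loss_multiple(p):.1f}x a win's gain")
```

So a 95% favorite who gains x per win has to drop about 19x on a loss just to stay put on average, which is why one bad game against a much lower-rated player can look so dramatic.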
Logged

popsofctown

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5477
  • Respect: +2860
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #26 on: March 30, 2014, 12:23:59 am »
+3

I dunno man, you give Celestial Chameleon the right kingdom against a weaker player..
Logged

flies

  • Minion
  • *****
  • Offline Offline
  • Posts: 629
  • Shuffle iT Username: flies
  • Statistical mechanics of hard rods on a 1D lattice
  • Respect: +348
    • View Profile
    • ask the atheists
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #27 on: March 30, 2014, 12:03:35 pm »
0

Quote
My goko rating will go down about 60 points if I lose to a player ~800 pts lower than me.  If that happens five times in a row, I've lost 300 points.  That volatility is not what I'd want for maximum accuracy.
But 60 points and 800 points and 300 points don't necessarily mean anything. If it was losing .6 points against someone rated 8 below you, would it be a problem? Well, this is the same thing, they are just showing you extra digits.

Right now 300 points would drop me from the 11th ranked player to #33.  This feels wrong.  Rankings are better for scale insofar as the scale is not arbitrary per se, but the meaning of a rank difference depends on how many players are ranked.  (there are about 300 players above lvl 29 on isotropish, ~500 above 5000 on goko.)

We don't really know how to rank skill of players.  We have no truth to compare to.  We could, in principle, devise bots with a known "skill" (whereby they'd decide at the outset who'd win/lose based on some function of a scalar "skill" difference) and see what isotropish would do vs goko, and that would help.  But we're not going to do that.

I appreciate your skepticism, WW, but I can't shake the feeling that goko's ranking is too volatile.  Whatever my skill is, five games out of 1079 shouldn't change my estimation of it that much (how many of those are over the last month I'm not sure, 50?).

Quote
TrueSkill models things as a difference of normal distributions, with the combined normal having an extra parameter for intrinsic variability of the game.
If you think the analysis here, where the odds are given as an exponential function of TS difference, is mistaken I'd be interested to hear why.
Logged
Gotta be efficient when most of your hand coordination is spent trying to apply mascara to your beard.
flies Dominionates on youtube

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4384
    • View Profile
    • WanderingWinder YouTube Page
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #28 on: March 30, 2014, 12:31:53 pm »
+1

Quote
My goko rating will go down about 60 points if I lose to a player ~800 pts lower than me.  If that happens five times in a row, I've lost 300 points.  That volatility is not what I'd want for maximum accuracy.
But 60 points and 800 points and 300 points don't necessarily mean anything. If it was losing .6 points against someone rated 8 below you, would it be a problem? Well, this is the same thing, they are just showing you extra digits.

Right now 300 points would drop me from the 11th ranked player to #33.  This feels wrong.  Rankings are better for scale insofar as the scale is not arbitrary per se, but the meaning of a rank difference depends on how many players are ranked.  (there are about 300 players above lvl 29 on isotropish, ~500 above 5000 on goko.)

We don't really know how to rank skill of players.  We have no truth to compare to.  We could, in principle, devise bots with a known "skill" (whereby they'd decide at the outset who'd win/lose based on some function of a scalar "skill" difference) and see what isotropish would do vs goko, and that would help.  But we're not going to do that.
But this isn't true. We DO know a way to rank the skill of players - look at game results. Rating systems can (and generally do) give  specific numeric predictions of game outcomes, e.g. Jimmy has a 72% chance of winning against Bob. Then you compare these results against what happens and see what has the best accuracy/least error. There's actually more than one way to measure accuracy, and you can argue their merits, but you can certainly do it by one of them.
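As a sketch of what "compare predictions against what happens" could look like, here are two common error measures, the Brier score and log loss. The predictions and outcomes below are invented purely for illustration; a real comparison would need each system's stated win probabilities, which Goko does not expose.

```python
import math

def brier_score(preds, outcomes):
    """Mean squared error between predicted win probability and result (1 or 0)."""
    return sum((p - o) ** 2 for p, o in zip(preds, outcomes)) / len(preds)

def log_loss(preds, outcomes, eps=1e-12):
    """Mean negative log-likelihood; punishes confident wrong predictions hard."""
    total = 0.0
    for p, o in zip(preds, outcomes):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= o * math.log(p) + (1 - o) * math.log(1.0 - p)
    return total / len(preds)

# Hypothetical data: two rating systems predicting the same five games.
outcomes = [1, 1, 0, 1, 0]
system_a = [0.72, 0.60, 0.45, 0.80, 0.30]
system_b = [0.55, 0.50, 0.50, 0.55, 0.50]

for name, preds in (("A", system_a), ("B", system_b)):
    print(name, round(brier_score(preds, outcomes), 4), round(log_loss(preds, outcomes), 4))
```

Lower is better for both measures. They can disagree about which system wins, which is the "more than one way to measure accuracy" point.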

As for dropping you 22 spots, well, maybe that's right - seems to me that it thinks that you and these other guys are pretty closely bunched as is, and it wouldn't take much to flip it from "You're very slightly better than them" to "You're very slightly worse". It's certainly plausible.

Quote
I appreciate your skepticism, WW, but I can't shake the feeling that goko's ranking is too volatile.  Whatever my skill is, five games out of 1079 shouldn't change my estimation of it that much (how many of those are over the last month I'm not sure, 50?).
I mean, sure, you can feel that way. My main point is that right now, all anyone has to go on either way is just feeling.

Quote
Quote
TrueSkill models things as a difference of normal distributions, with the combined normal having an extra parameter for intrinsic variability of the game.
If you think the analysis here, where the odds are given as an exponential function of TS difference, is mistaken I'd be interested to hear why.
I'm apparently missing something in that link, as what I see there is that skills are measured by Normal (AKA Gaussian) distributions. I don't see the exponential function showing up in what is actually used; I see it given as a comparison to other models (Elo as currently implemented most places, though not actually what Elo himself originally proposed), which have logistic bases. But for sure it says they use Normal distributions there. (See the 9th, 13th, and 14th posts there, as well as the linked paper).

flies

  • Minion
  • *****
  • Offline Offline
  • Posts: 629
  • Shuffle iT Username: flies
  • Statistical mechanics of hard rods on a 1D lattice
  • Respect: +348
    • View Profile
    • ask the atheists
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #29 on: March 30, 2014, 04:05:39 pm »
0

Quote
We DO know a way to rank the skill of players...
now i'm confused.  if there is a way to rank the skill of players more accurately than TS or whatever goko does, why don't we use it?

Quote
As for dropping you 22 spots, well, maybe that's right - seems to me that it thinks that you and these other guys are pretty closely bunched as is, and it wouldn't take much to flip it from "You're very slightly better than them" to "You're very slightly worse". It's certainly plausible.
this is reasonable. 

Quote
I'm apparently missing something in that link, as what I see there is that skills are measured by Normal (AKA Gaussian) distributions. I don't see the exponential function showing up in what is actually used; I see it given as a comparison to other models (Elo as currently implemented most places, though not actually what Elo himself originally proposed), which have logistic bases. But for sure it says they use Normal distributions there. (See the 9th, 13th, and 14th posts there, as well as the linked paper).
Ok, it's taken me a long time (months) to grok all this, and I haven't got my head entirely wrapped around it, but the exponential odds referred to in the link is certainly not what TS does.  I'd be very interested to work out exactly what the win prediction under TS is, but I haven't got the time to work that out at the moment.
Logged
Gotta be efficient when most of your hand coordination is spent trying to apply mascara to your beard.
flies Dominionates on youtube

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4384
    • View Profile
    • WanderingWinder YouTube Page
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #30 on: March 30, 2014, 04:30:54 pm »
0

Quote
We DO know a way to rank the skill of players...
now i'm confused.  if there is a way to rank the skill of players more accurately than TS or whatever goko does, why don't we use it?
Well, I'm not actually saying we know a way that is better than TS/what goko does/etc. I'm saying that those are methods, and we can measure how good they are. You're never going to find a perfect method, but you can measure different methods to see how good they are, and look for whichever thing is the best. At least, you can do that in principle, though nobody is actually measuring their accuracy right now.

Quote
Quote
As for dropping you 22 spots, well, maybe that's right - seems to me that it thinks that you and these other guys are pretty closely bunched as is, and it wouldn't take much to flip it from "You're very slightly better than them" to "You're very slightly worse". It's certainly plausible.
this is reasonable. 

Quote
I'm apparently missing something in that link, as what I see there is that skills are measured by Normal (AKA Gaussian) distributions. I don't see the exponential function showing up in what is actually used; I see it given as a comparison to other models (Elo as currently implemented most places, though not actually what Elo himself originally proposed), which have logistic bases. But for sure it says they use Normal distributions there. (See the 9th, 13th, and 14th posts there, as well as the linked paper).
Ok, it's taken me a long time (months) to grok all this, and I haven't got my head entirely wrapped around it, but the exponential odds referred to in the link is certainly not what TS does.  I'd be very interested to work out exactly what the win prediction under TS is, but I haven't got the time to work that out at the moment.

The 'exponential odds thing' is what is used by most every Elo system nowadays, as well as several variants. It's probably the most common nowadays, and it's relatively easy to compute.
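For reference, the logistic form used by most present-day Elo implementations can be sketched as below. The 400-point scale and base 10 are the conventional chess values and are an assumption here; nothing in this thread says what scale goko uses.

```python
def elo_expected_score(rating_a, rating_b, scale=400.0):
    """Expected score of A against B under the logistic Elo model.

    The odds p / (1 - p) equal 10 ** ((rating_a - rating_b) / scale),
    i.e. they are an exponential function of the rating difference.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))

p = elo_expected_score(1900, 1500)
print(p, p / (1.0 - p))  # expected score, and the corresponding odds
```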

TS uses this (simplified to a 2-player game here):
P_A_Wins = Normal_CDF(Mu_A - Mu_B, Sqrt(Sigma_A^2+Sigma_B^2+Sigma_Game^2))

where P_A_Wins is the probability that player A wins and Normal_CDF is the cumulative distribution function of the Normal (or Gaussian) distribution, which you can't compute in closed form. You have to do it numerically, which used to effectively mean looking it up in a table, though nowadays people do it on computers. Googling, the first thing that pops up is this: http://www.danielsoper.com/statcalc3/calc.aspx?id=53 though you can do it in e.g. Excel if you have that. The parametrization I give above is Normal(X, Standard Deviation); sometimes you see it as (Mean, Variance) instead. If the calculator asks for a mean, put in 0. And if you're trying to do it off of isotropish ratings, you'll want to know that the displayed uncertainty is 3*sigma, not just straight sigma. I don't remember exactly what it uses for Sigma_Game; 25/6 seems to be what I remember, but that certainly could be wrong.
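If you'd rather not use an online calculator, the same computation can be sketched in a few lines of Python using the error function from the standard library. The function names are my own, and the ratings in the example are hypothetical. The sketch includes the factor of 2 on the game-variance term that later replies in this thread point out, and the 25/6 passed for sigma_game is only the value tentatively recalled above, so treat it as an assumption.

```python
from math import erf, sqrt

def normal_cdf(x, sd):
    """CDF at x of a zero-mean Normal with standard deviation sd."""
    return 0.5 * (1.0 + erf(x / (sd * sqrt(2.0))))

def p_a_wins(mu_a, mu_b, sigma_a, sigma_b, sigma_game):
    """Two-player win probability (really expected score) for player A."""
    # Both players' skill uncertainty plus per-game noise for each of the two players.
    spread = sqrt(sigma_a ** 2 + sigma_b ** 2 + 2.0 * sigma_game ** 2)
    return normal_cdf(mu_a - mu_b, spread)

def sigma_from_displayed(displayed_uncertainty):
    """Isotropish shows 3*sigma, so divide by 3 to recover sigma."""
    return displayed_uncertainty / 3.0

# Hypothetical players: mu 40 +/- 6 displayed versus mu 35 +/- 9 displayed.
print(p_a_wins(40.0, 35.0,
               sigma_from_displayed(6.0), sigma_from_displayed(9.0),
               sigma_game=25.0 / 6.0))
```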

ragingduckd

  • Board Moderator
  • *
  • Offline Offline
  • Posts: 1059
  • Respect: +3527
    • View Profile
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #31 on: April 05, 2014, 06:17:25 pm »
0

TS uses this (simplified to a 2-player game here):
P_A_Wins = Normal_CDF(Mu_A - Mu_B, Sqrt(Sigma_A^2+Sigma_B^2+Sigma_Game^2))

I think there's a small typo here.  The σ_Game^2 term needs a factor of 2, since removing the player uncertainties (σ_A = σ_B = 0) should yield the standard Gaussian Elo result: Φ(μ_A - μ_B, sqrt(2)·β)

Also, if you want a non-zero draw probability, throw in a draw margin ε.  For Isotropish, ε≈2.2, which corresponds to a (somewhat inaccurate) draw rate of 5%.  More empirically accurate would be ε≈0.78 (1.75%).
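The link between a draw margin ε and a draw rate can be sketched as follows, using the usual TrueSkill relation for two equally rated players with the player uncertainties ignored. Note that β = 25 is an assumption on my part and is not stated anywhere in this thread; it is used here because it happens to reproduce both of the figures quoted above.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def draw_margin(draw_rate, beta):
    """Draw margin epsilon that yields draw_rate between two equal players."""
    return STD_NORMAL.inv_cdf((draw_rate + 1.0) / 2.0) * sqrt(2.0) * beta

def draw_probability(epsilon, beta, mu_diff=0.0):
    """P(|performance difference| < epsilon), ignoring player uncertainty."""
    spread = sqrt(2.0) * beta
    upper = STD_NORMAL.cdf((epsilon - mu_diff) / spread)
    lower = STD_NORMAL.cdf((-epsilon - mu_diff) / spread)
    return upper - lower

BETA = 25.0  # assumed value, see the note above
print(draw_margin(0.05, BETA))    # margin for a 5% draw rate
print(draw_margin(0.0175, BETA))  # margin for a 1.75% draw rate
```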

Logged
Salvager Extension | Isotropish Leaderboard | Game Data | Log Search & other toys | Salvager Bug Reports

Salvager not working for me at all today. ... Please help! I can't go back to playing without it like an animal!

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4384
    • View Profile
    • WanderingWinder YouTube Page
Re: Beyond cheaters, this is why we have the isotropish leaderboard
« Reply #32 on: April 05, 2014, 06:49:41 pm »
0

TS uses this (simplified to a 2-player game here):
P_A_Wins = Normal_CDF(Mu_A - Mu_B, Sqrt(Sigma_A^2+Sigma_B^2+Sigma_Game^2))

I think there's a small typo here.  The σ_Game^2 term needs a factor of 2, since removing the player uncertainties (σ_A = σ_B = 0) should yield the standard Gaussian Elo result: Φ(μ_A - μ_B, sqrt(2)·β)

Also, if you want a non-zero draw probability, throw in a draw margin ε.  For Isotropish, ε≈2.2, which corresponds to a (somewhat inaccurate) draw rate of 5%.  More empirically accurate would be ε≈0.78 (1.75%).



1st, this was all off the top of my head, so it wasn't entirely precise. Forgot the 2, though it's not the most accurate thing to say that this is the "standard Gaussian Elo result" - all the actually-running Elo systems use a Logistic function, not a Gaussian; to be fair, Elo originally proposed Gaussian distributions, but... he doesn't actually say that this is the right figure; he explains in section 8.23 of his book (which is sitting on the arm of my chair as I type this) how you might get that figure, but eventually doesn't model players' variances separately, arguing that even if they're far different, it ends up not making so much difference. And indeed, in the Elo system, as he proposed it, there is just one variance that gets used, and it's an entirely irrelevant scale factor (well, ok, it makes a difference, but it's a scale factor - basically all it does is make the numbers different, without changing the predictions of the system, similar to a "double it, double the gaps between the ratings" (though not actually quite this simple)).

And actually, your explanation of where the 2 comes from doesn't make sense - the entire reason a 2 is there is because there are two players, and when you take a difference between two Gaussians, you get a Gaussian with mean equal to the difference of the means and variance equal to the sum of the variances, which if you have equal variances (sigmaA = sigmaB = sigma) is simply sigmaA^2 + sigmaB^2 = 2*sigma^2. That's actually why the 2 is there.
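This difference-of-Gaussians rule is easy to check with Python's standard library, whose NormalDist class supports subtracting independent normal variables. The means and sigma below are arbitrary illustrative numbers, not anyone's actual rating:

```python
from statistics import NormalDist

# Per-game performances of two hypothetical players with equal spread.
perf_a = NormalDist(mu=30.0, sigma=4.0)
perf_b = NormalDist(mu=25.0, sigma=4.0)

# Difference of independent normals: the means subtract, the variances add.
diff = perf_a - perf_b
print(diff.mean)      # difference of the means
print(diff.variance)  # sum of the variances, i.e. 2 * 4.0**2 up to rounding

# A wins whenever the performance difference is positive.
p_a_wins = 1.0 - diff.cdf(0.0)
print(p_a_wins)
```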

As for the draw probability: yes, if you want to model the chance of a draw explicitly, you need a parameter for it. But I was using the common convention of treating a draw as a simultaneous half-win and half-loss. To put it more simply, I'm giving not the probability of a win but the expected score, which is equivalent to (Wins + Draws/2)/(Games).
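As a tiny illustration of that convention (the function name and the record in the example are hypothetical):

```python
def expected_score(wins, draws, games):
    """Empirical expected score: a draw counts as half a win."""
    if games <= 0:
        raise ValueError("games must be positive")
    return (wins + draws / 2) / games

# Hypothetical record: 6 wins, 2 draws and 2 losses in 10 games.
print(expected_score(wins=6, draws=2, games=10))
```

This is the quantity a prediction of, say, 0.70 should be compared against, rather than the raw win fraction.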

But yeah, there should be a 2 there, and at least some forms of TS do explicitly model draws differently, so there is that.