Dominion Strategy Forum

Dominion => Dominion Online at Shuffle iT => Dominion General Discussion => Goko Dominion Online => Topic started by: ragingduckd on April 11, 2014, 07:34:34 pm

Title: Goko's Rating System, Part 1: ... in a formula!
Post by: ragingduckd on April 11, 2014, 07:34:34 pm

They said it couldn't be written down (https://getsatisfaction.com/goko/topics/summary_of_rating_calculations#reply_11914215)...
They swore they'd never reveal their secret (http://forum.makingfun.com/showthread.php?4259-Rating-System&p=21995&viewfull=1#post21995)...
Yet today, ladies and gentlemen, we unveil Goko's Rating system in a formula!

... well, in a few lines of pseudocode anyway:

Display rating as floor(μ-2σ)
Update μ and σ using TrueSkill with β=1375, γ=27.5, draw_probability=5%
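
For anyone who would rather read the two lines above as code, here is a minimal pure-Python sketch of the published two-player TrueSkill update for a decisive game, plugged with the parameters from the post. It is not Goko's actual code: the function names are made up for illustration, the player values in the usage lines are hypothetical, and the draw case (which uses different correction functions) is left out.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, stdev 1)

# Parameters quoted in the post.
BETA = 1375.0
GAMMA = 27.5
DRAW_PROBABILITY = 0.05

# Two-player draw margin, following the definition in the TrueSkill paper.
DRAW_MARGIN = STD_NORMAL.inv_cdf((DRAW_PROBABILITY + 1) / 2) * math.sqrt(2) * BETA


def displayed_rating(mu, sigma):
    """Leaderboard number from the post: floor(mu - 2*sigma)."""
    return math.floor(mu - 2 * sigma)


def _v(x):
    """Mean-shift factor for a decisive result: pdf(x) / cdf(x)."""
    tail = STD_NORMAL.cdf(x)
    if tail < 1e-300:  # extreme upset: fall back to the asymptotic value
        return -x
    return STD_NORMAL.pdf(x) / tail


def update_decisive(winner, loser):
    """Return updated (mu, sigma) pairs for the winner and the loser."""
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser
    # Gamma inflates each player's variance before the result is applied.
    var_w = sigma_w ** 2 + GAMMA ** 2
    var_l = sigma_l ** 2 + GAMMA ** 2
    c = math.sqrt(2 * BETA ** 2 + var_w + var_l)
    x = (mu_w - mu_l - DRAW_MARGIN) / c
    v = _v(x)
    w = v * (v + x)  # variance-shrink factor, between 0 and 1
    new_winner = (mu_w + var_w / c * v, math.sqrt(var_w * (1 - var_w / c ** 2 * w)))
    new_loser = (mu_l - var_l / c * v, math.sqrt(var_l * (1 - var_l / c ** 2 * w)))
    return new_winner, new_loser


# Hypothetical players, chosen only to exercise the functions.
alice, bob = update_decisive((6000.0, 300.0), (5500.0, 300.0))
print(displayed_rating(*alice), displayed_rating(*bob))
```

The `_v` guard only matters for absurdly lopsided upsets, where the normal CDF underflows to zero.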

Yup, after all the fuss and all the secrecy, they've just been running TrueSkill. Unmodified, official, read-the-article, download-the-source-code Microsoft TrueSkill™.

Why Goko misled us about this is totally beyond me. I contacted Jeff last week, just in case there was some good reason for the subterfuge and MakingFun would somehow be furious if people knew the actual system. I never got a response... so I guess I hope they're not. ;)

Tweaks

So what about all those weird tweaks? Goko told us that they were "perverting the rating system" (https://getsatisfaction.com/goko/topics/summary_of_rating_calculations#reply_11914215) to prevent ratings from dropping after a win, and that ratings were bounded above and below (https://getsatisfaction.com/goko/topics/summary_of_rating_calculations#topic_5233375). And what about that massive (http://forum.dominionstrategy.com/index.php?topic=8457.msg256227#msg256227) overnight drop in rating (https://getsatisfaction.com/goko/topics/skill_ratings_are_dropping_overnight#reply_12578147)?

None of these turn out to be anywhere near as bad as many of us thought. Goko Pro isn't a good system, but it's bad because of its choice of TrueSkill parameters, not because of its tweaks.

Rating can't go below 0: This tweak is purely cosmetic. If μ-2σ<0, your rating will be displayed as 0, but Goko continues tracking your μ and σ correctly.
Rating never drops after a win: It's legitimately possible for your displayed TrueSkill rating to go down after a win. If you beat a far lower-rated player, TS gives you a modest increase in σ but only a tiny increase in μ. So μ-2σ can go down even though μ itself (its best guess for your skill) has gone up.

Goko's tweak for this is also purely cosmetic. If your displayed rating would change by -3 after a win, the client just lies to you and says there was no change. But it keeps tracking your μ and σ correctly. Then if you win the next game and deserve +20, it compensates by showing you +17 instead.
μ can't go below 0 or above 10k: This one's real, at least for the Casual system. But μ=0 in Pro mode is so horrifically bad that not one player has bumped up against the limit. Same story for the alleged upper bound at 10k.
Daily increase in uncertainty: This used to be really brutal... maybe a 5% increase in σ every day at 12:00 AM EST. And the increases compounded, since they were re-multiplying your σ by 1.05 every day. So if you were a frequent player and you took a break, you might drop by 20 points on the first day, then by 21 on the next day, and by the end of a month you'd be dropping nearly 100 points a day. As of now, σ increases by only 1% per day, which is more like a 4-point drop on the first day for a frequent player. Also, it now happens at 12:30 AM... presumably for strategic sheep purposes.
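
The bookkeeping behind these tweaks can be sketched in a few lines. This is an illustration under assumptions, not the client's real logic: `WinMask` and `daily_drops` are hypothetical names, the handling of a hidden deficit after a loss is a guess (the post only describes two wins in a row), and the starting σ of 200 is inferred from the post's "20 points on the first day" at 5% rather than stated anywhere.

```python
import math


def displayed_rating(mu, sigma):
    """Shown rating: floor(mu - 2*sigma), floored at 0 for display only."""
    return max(0, math.floor(mu - 2 * sigma))


class WinMask:
    """Hides a negative rating change after a win and repays it later."""

    def __init__(self):
        self.carry = 0  # hidden negative change not yet shown to the player

    def shown_change(self, true_change, won):
        total = true_change + self.carry
        if won and total < 0:
            self.carry = total  # keep hiding it; the stored mu/sigma are untouched
            return 0
        self.carry = 0          # assumption: any other result reveals the deficit
        return total


def daily_drops(sigma, daily_factor, days):
    """Points lost from mu - 2*sigma on each idle day as sigma compounds."""
    drops = []
    for _ in range(days):
        inflated = sigma * daily_factor
        drops.append(2 * (inflated - sigma))
        sigma = inflated
    return drops


mask = WinMask()
print(mask.shown_change(-3, won=True))   # 0, the -3 is hidden
print(mask.shown_change(20, won=True))   # 17, the hidden -3 is repaid

old_rule = daily_drops(200.0, 1.05, 30)  # first two days: about 20, then 21
new_rule = daily_drops(200.0, 1.01, 30)  # first day: about 4
print(round(old_rule[0]), round(old_rule[1]), round(new_rule[0]))
```

Because the drop on each day is proportional to the current σ, the 5% rule grows geometrically, which is why a month away hurt so much more than thirty times the first day's loss.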

Try it yourself!

I coded up their TrueSkill parameters and tweaks so that you can verify it. Just punch in your username and your opponent's, and it'll tell you what to expect after a win, loss, or draw. It tells you both your "real" new rating and the change that MF/Goko will tell you. They can diverge if you beat a much lower rated opponent, as in the example below.

Please post a screenshot if you discover a case in which its prediction is wrong, but first double-check that it actually had your rating right before the game. The process that collects that data can fall behind.

~~Pro Mode Rating Predictor (http://gokosalvager.com/static/gokoproratingpredictor.html)~~ (Offline)

(http://i.imgur.com/kcYEqAV.png)

Continued in Part 2: Reverse Engineering the System (http://forum.dominionstrategy.com/index.php?topic=10884.msg367431#msg367431)

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: Kirian on April 11, 2014, 07:47:47 pm

Just a note on one thing, since it has come up: under most normal TS parameters, the chance of losing rating due to a win is in fact astronomically small... and it requires sigmas to be already ridiculous...

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: SCSN on April 11, 2014, 08:01:38 pm

I've suspected for a while that ragingduckd is not quite human, and no, not an AI either...

(http://i.imgur.com/bRn6z7q.jpg)

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: silverspawn on April 11, 2014, 08:12:28 pm

l
o
l

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: Schneau on April 11, 2014, 08:21:47 pm

This is awesome and I'm looking forward to part 2.

But, the rating predictor isn't working for me. I get a page where I can enter two usernames, but no button to tell it to "go" or anything. Tested in Firefox and Chrome.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: Awaclus on April 11, 2014, 08:50:28 pm

Quote from: Schneau on April 11, 2014, 08:21:47 pm

But, the rating predictor isn't working for me. I get a page where I can enter two usernames, but no button to tell it to "go" or anything. Tested in Firefox and Chrome.

Also not working with Opera.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: Beyond Awesome on April 11, 2014, 09:24:47 pm

AI, you continue to amaze me. Why hasn't Goko offered you a job yet? Or, did you decline their offer?

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: silverspawn on April 11, 2014, 09:56:38 pm

Quote from: Awaclus on April 11, 2014, 08:50:28 pm

Quote from: Schneau on April 11, 2014, 08:21:47 pm
But, the rating predictor isn't working for me. I get a page where I can enter two usernames, but no button to tell it to "go" or anything. Tested in Firefox and Chrome.
Also not working with Opera.

i tested it in firefox and it didn't work, then i tried chrome and it worked. i installed salvager in chrome, in case it matters

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: ragingduckd on April 11, 2014, 10:26:56 pm

Quote from: Schneau on April 11, 2014, 08:21:47 pm

This is awesome and I'm looking forward to part 2.

But, the rating predictor isn't working for me. I get a page where I can enter two usernames, but no button to tell it to "go" or anything. Tested in Firefox and Chrome.

Apologies, folks. I've been coding with JS Promises, which I thought I could use on Firefox, but apparently not. I'm not sure why it doesn't work on "Opera," which I assume is a model of car?

Will try to resolve these issues.

It should work on Chrome... there's no "Submit" button. It just triggers once you've selected the second name. Maybe try clicking the name you want when it appears in the autocomplete box, rather than writing it out entirely?

Edit: Installing Salvager isn't necessary and won't resolve any bugs with the rating predictor.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: ragingduckd on April 11, 2014, 10:34:04 pm

Quote from: ragingduckd on April 11, 2014, 10:26:56 pm

Quote from: Schneau on April 11, 2014, 08:21:47 pm
This is awesome and I'm looking forward to part 2.

But, the rating predictor isn't working for me. I get a page where I can enter two usernames, but no button to tell it to "go" or anything. Tested in Firefox and Chrome.

Apologies, folks. I've been coding with JS Promises, which I thought I could use on Firefox, but apparently not. I'm not sure why it doesn't work on "Opera," which I assume is a model of car?

Will try to resolve these issues.

It should work on Chrome... there's no "Submit" button. It just triggers once you've selected the second name. Maybe try clicking the name you want when it appears in the autocomplete box, rather than writing it out entirely?

Okay, I set it up with a JS polyfill to provide the Promise objects, and it works for me on Firefox now. I'm betting that it'll work on Safari and Opera too, but I haven't tested it.

Edit: You may need to do a "hard refresh" to get the fixed version. Ctrl-Shift-R on Firefox.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: Awaclus on April 11, 2014, 10:38:10 pm

Quote from: ragingduckd on April 11, 2014, 10:34:04 pm

Quote from: ragingduckd on April 11, 2014, 10:26:56 pm
Quote from: Schneau on April 11, 2014, 08:21:47 pm
This is awesome and I'm looking forward to part 2.

But, the rating predictor isn't working for me. I get a page where I can enter two usernames, but no button to tell it to "go" or anything. Tested in Firefox and Chrome.

Apologies, folks. I've been coding with JS Promises, which I thought I could use on Firefox, but apparently not. I'm not sure why it doesn't work on "Opera," which I assume is a model of car?

Will try to resolve these issues.

It should work on Chrome... there's no "Submit" button. It just triggers once you've selected the second name. Maybe try clicking the name you want when it appears in the autocomplete box, rather than writing it out entirely?

Okay, I set it up with a JS polyfill to provide the Promise objects, and it works for me on Firefox now. I'm betting that it'll work on Safari and Opera too, but I haven't tested it.

Edit: You may need to do a "hard refresh" to get the fixed version. Ctrl-Shift-R on Firefox.

Works on Opera.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: blueblimp on April 11, 2014, 10:39:47 pm

My guess as to why they wouldn't say that they're using TrueSkill: TrueSkill is patented, so by using it, they open themselves up to a patent lawsuit from Microsoft.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: ragingduckd on April 11, 2014, 10:43:18 pm

Quote from: blueblimp on April 11, 2014, 10:39:47 pm

My guess as to why they wouldn't say that they're using TrueSkill: TrueSkill is patented, so by using it, they open themselves up to a patent lawsuit from Microsoft.

Oh dear... that may well be it. Here's a note from sublee's LICENSE FILE (https://github.com/sublee/trueskill/blob/master/LICENSE). Yikes.

Quote

Caution
=======

This TrueSkill project is opened under the BSD license but the
TrueSkill(TM) brand is not. Microsoft permits only Xbox Live games or
non-commercial projects to use TrueSkill(TM). If your project is
commercial, you should find another rating system.

Or is that saying that you can use the algorithm but not the name? Because that would explain their behavior too.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: blueblimp on April 11, 2014, 10:52:09 pm

That paragraph somewhat conflates the TrueSkill trademark and the patent (or patents?). It's not the brand they need to be worried about, as long as they don't use the name "TrueSkill". I guess they're just hoping to keep a low profile so that the patent doesn't become a problem.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: Polk5440 on April 11, 2014, 11:17:51 pm

Quote from: ragingduckd on April 11, 2014, 07:34:34 pm

Why Goko misled us about this is totally beyond me. I contacted Jeff last week, just in case there was some good reason for the subterfuge and MakingFun would somehow be furious if people knew the actual system. I never got a response... so I guess I hope they're not. ;)

Liability reasons.

Quote

Tweaks
None of these turn out to be anywhere near as bad as we were led to believe.

I feel vindicated in my beliefs. We were not led to believe anything. Many people just don't understand how ranking systems work. They are not achievement systems.

Quote

Goko Pro isn't a good system, but it's bad because of its choice of TrueSkill parameters, not because of its tweaks.

I will wait for the empirical test I hope you ran to back this statement up. I am not convinced the Iso params are chosen optimally. My hypothesis is that they overpredict win rates for the higher ranked player.

Quote

Rating can't go below 0: This tweak is purely cosmetic. If μ-2σ<0, your rating will be displayed as 0, but Goko continues tracking your μ and σ correctly.

Boy, am I glad to see you confirm this. You had me really worried after you mentioned in the Salvager thread that you thought they were actually storing 0. That would have been a disaster.

Quote

Rating never drops after a win: It's legitimately possible for your displayed TrueSkill rating to go down after a win. If you beat a far lower-rated player, TS gives you a modest increase in σ but only a tiny increase in μ. So μ-2σ can go down even though μ itself (its best guess for your skill) has gone up.

Goko's tweak for this is also purely cosmetic. If your displayed rating would change by -3 after a win, the client just lies to you and says there was no change. But it keeps tracking your μ and σ correctly. Then if you win the next game and deserve +20, it compensates by showing you +17 instead.

This makes me soooo happy! An improvement over the Isotropic implementation. This was one of the more heated points of contention, too. I am glad the change they made was cosmetic and doesn't actually affect the long term rating. Keeps the right system, quells the complaints with a little misdirection.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: ragingduckd on April 11, 2014, 11:44:31 pm

Quote from: Polk5440 on April 11, 2014, 11:17:51 pm

We were not led to believe anything. Many people just don't understand how ranking systems work. They are not achievement systems.

You're right. I've rephrased it in my OP. Certainly nobody ever told us "man, is our system awful." ;)

Quote

Quote
Rating can't go below 0: This tweak is purely cosmetic. If μ-2σ<0, your rating will be displayed as 0, but Goko continues tracking your μ and σ correctly.

Boy, am I glad to see you confirm this. You had me really worried after you mentioned in the Salvager thread that you thought they were actually storing 0. That would have been a disaster.

Sorry... I wasn't clear in that post. It's the distinction between the mu=0 floor, which is real but not a problem, and the displayed rating=0 floor, which is cosmetic:

Quote from: ragingduckd on March 25, 2014, 02:40:43 pm

Zero Limits - Goko actually uses zero for the data as well as the displayed value. Like right now my casual rating is displayed as zero (I resign to bots constantly when testing), but if I query the raw data, it gives me {mean: 0, SD: 298}. So even though my history says I should be -4000 or something, Goko is going to award points to my next opponent as though I was rated zero. This randomizing bias propagates throughout the leaderboard, having little effect on the top but really screwing up the bottom.

At the time, I thought that many of the low-rated Pro players actually had mu=0, like I had in Casual mode. Now I'm pretty sure there actually aren't any players with mu=0 in Pro. So there's no bias to screw up the bottom of the leaderboard after all.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: DStu on April 12, 2014, 04:38:05 am

Is TS really patented? I know the US patent system is crazy, and protection for the brand and copyright for the actual code is certainly ok, but a patent on this?

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: GwinnR on April 12, 2014, 08:49:59 am

Not sure if this is the right place, but coming back to my post:

Quote from: GwinnR on April 03, 2014, 06:10:44 am

Just one note on the leaderboard-discussion: Wouldn't it be possible to search games where Goko says that player A is better and isotropic says Player B is better? Then you could compare how the games end and see which system is right in most times.

I did this for my games several times, and the Isotropish leaderboard predicted the right winner in 70% of the games. OK, I looked at only 10 games, which is not very significant, but it is a beginning.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: Kirian on April 12, 2014, 09:37:36 am

Quote from: DStu on April 12, 2014, 04:38:05 am

Is TS really patented? I know US patent system is crazy, and protection for the brand and copyright for the actual code is certainly ok, but a patent on this?

Welcome to the US Patent system, where clicking a button, moving a card, or displaying a number are all patentable!

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: Polk5440 on April 12, 2014, 10:06:12 am

Quote from: GwinnR on April 12, 2014, 08:49:59 am

Not sure, if this is the right place, but coming back to my post
Quote from: GwinnR on April 03, 2014, 06:10:44 am
Just one note on the leaderboard-discussion: Wouldn't it be possible to search games where Goko says that player A is better and isotropic says Player B is better? Then you could compare how the games end and see which system is right in most times.
I did this for my games several times, and the Isotropish leaderboard predicted the right winner in 70% of the games. OK, I looked at only 10 games, which is not very significant, but it is a beginning.

Given mu and sigma, there are formulas that tell you the predicted win and draw probabilities. My first stab at checking the parameters (if I had the time and data) would go something like this:

1. Create a dataset of game outcomes and players' (mu, sigma) at the time the game was played.
2. Calculate the predicted probabilities of each player winning or drawing given their (mu, sigma).
3. Create an empirical distribution of what actually happened. (e.g. create percentage buckets. For each bucket, find games where a player was predicted to win with a percentage that falls into that bucket. What was the actual win rate for players in that bucket?)
4. Does the predicted chance of winning (drawing) match the empirically calculated win (draw) rate?

In a chart with the x-axis being predicted chance of winning and y-axis actual win rate, you should see your observations on approximately a 45 degree line. For example, if the parameters (beta, etc) were chosen too tightly (e.g. not enough "luck") then the predicted probability of winning would be higher than actual probability of winning for players expected to win.
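
The four steps above could be sketched roughly as follows. Assumptions are flagged loudly: the win probability uses the standard Gaussian form Φ((μ_A − μ_B) / sqrt(2β² + σ_A² + σ_B²)) and ignores the draw margin, only decisive games are counted, the function names are invented for illustration, and the three games at the bottom are made-up data just to show the call.

```python
import math
from collections import defaultdict
from statistics import NormalDist

BETA = 1375.0  # Goko Pro's beta, from the first post


def predicted_win_probability(mu_a, sigma_a, mu_b, sigma_b, beta=BETA):
    """P(A outperforms B) in one game, ignoring the draw margin."""
    spread = math.sqrt(2 * beta ** 2 + sigma_a ** 2 + sigma_b ** 2)
    return NormalDist().cdf((mu_a - mu_b) / spread)


def calibration_table(games, n_buckets=10):
    """Bucket games by predicted win chance and compare with what happened.

    games: iterable of ((mu_a, sigma_a), (mu_b, sigma_b), a_won) where the
    ratings are the ones held when the game was played.
    Returns rows of (bucket_low, bucket_high, mean_predicted, actual_rate, count).
    """
    buckets = defaultdict(list)
    for a, b, a_won in games:
        p = predicted_win_probability(*a, *b)
        index = min(int(p * n_buckets), n_buckets - 1)
        buckets[index].append((p, 1.0 if a_won else 0.0))
    rows = []
    for index in sorted(buckets):
        entries = buckets[index]
        mean_predicted = sum(p for p, _ in entries) / len(entries)
        actual_rate = sum(won for _, won in entries) / len(entries)
        rows.append((index / n_buckets, (index + 1) / n_buckets,
                     mean_predicted, actual_rate, len(entries)))
    return rows


# Made-up games, only to show the shape of the input and output.
sample_games = [
    ((6000.0, 200.0), (5000.0, 200.0), True),
    ((6000.0, 200.0), (5000.0, 200.0), False),
    ((5000.0, 200.0), (5000.0, 200.0), True),
]
for row in calibration_table(sample_games):
    print(row)
```

Plotting `mean_predicted` against `actual_rate` for each row gives the chart described above; points falling below the diagonal in the high buckets would mean the system overpredicts wins for the favorite.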

Also, there are (business/community) reasons to distort the rating system/leaderboard by putting in "too much" rating decay for not playing (which would make predicted probs worse). The main one is to encourage regular play and discourage leaderboard camping.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: Polk5440 on April 12, 2014, 10:16:39 am

Quote from: Kirian on April 12, 2014, 09:37:36 am

Quote from: DStu on April 12, 2014, 04:38:05 am
Is TS really patented? I know US patent system is crazy, and protection for the brand and copyright for the actual code is certainly ok, but a patent on this?

Welcome to the US Patent system, where clicking a button, moving a card, or displaying a number are all patentable!

This is probably going into RSP territory now, but here is an article about software patents in the US as it relates to a case currently in front of the Supreme Court. (http://www.washingtonpost.com/business/in-new-case-supreme-court-revisits-the-question-of-software-patents/2014/03/28/a3da1c52-ad3a-11e3-9627-c65021d6d572_story.html) The Court could rule on it by June, and I am hoping the ruling (which will probably be narrow) will spur either more software patent challenges or a discussion and change in law about software patents.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: blueblimp on April 12, 2014, 12:15:19 pm

Quote from: DStu on April 12, 2014, 04:38:05 am

Is TS really patented? I know US patent system is crazy, and protection for the brand and copyright for the actual code is certainly ok, but a patent on this?

You say "a patent on this" like the TrueSkill algorithm is somehow obvious, but it really doesn't seem obvious to me!

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: DStu on April 12, 2014, 12:22:00 pm

Quote from: blueblimp on April 12, 2014, 12:15:19 pm

Quote from: DStu on April 12, 2014, 04:38:05 am
Is TS really patented? I know US patent system is crazy, and protection for the brand and copyright for the actual code is certainly ok, but a patent on this?
You say "a patent on this" like the TrueSkill algorithm is somehow obvious, but it really doesn't seem obvious to me!

AFAIU, it's basically Bayesian statistics with a Gaussian prior and a Gaussian model, which is about the first thing you would try when you want to model skill with uncertainty.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: michaeljb on April 13, 2014, 06:36:08 pm

Quote from: ragingduckd on April 11, 2014, 07:34:34 pm

Rating never drops after a win: It's legitimately possible for your displayed TrueSkill rating to go down after a win. If you beat a far lower-rated player, TS gives you a modest increase in σ but only a tiny increase in μ. So μ-2σ can go down even though μ itself (its best guess for your skill) has gone up.

I don't understand this. I understand that "tiny increase in μ" combined with "modest increase in σ" results in a decrease of μ-2σ, but I don't understand why TS gives you a modest increase in σ.

Isn't σ supposed to be the uncertainty in the rating you have? If so, it seems like it should not be increasing when your actual game result matches the predicted game result. When you play a lower-rated player, the predicted result will be a win for you, and intuitively to me that means that when you do win, uncertainty should remain the same or decrease.

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: Kirian on April 13, 2014, 06:49:34 pm

Quote from: michaeljb on April 13, 2014, 06:36:08 pm

Quote from: ragingduckd on April 11, 2014, 07:34:34 pm
Rating never drops after a win: It's legitimately possible for your displayed TrueSkill rating to go down after a win. If you beat a far lower-rated player, TS gives you a modest increase in σ but only a tiny increase in μ. So μ-2σ can go down even though μ itself (its best guess for your skill) has gone up.

I don't understand this. I understand that "tiny increase in μ" combined with "modest increase in σ" results in a decrease of μ-2σ, but I don't understand why TS gives you a modest increase in σ.

Isn't σ supposed to be the uncertainty in the rating you have? If so, it seems like it should not be increasing when your actual game result matches the predicted game result. When you play a lower-rated player, the predicted result will be a win for you, and intuitively to me that means that when you do win, uncertainty should remain the same or decrease.

Essentially, sigma is supposed to increase at the start of each game, then decrease at the end of the game, based on how much more certain we are about the player's actual rank. That initial increase in sigma is what keeps ratings from stagnating. However, if the decrease in sigma is incredibly small--e.g. a 7500 beats a 1000--the applied decrease in sigma isn't as large as the automatic increase. This only happens when all players already have sigma near the minimum.
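
The two phases described here can be separated out in a short sketch. It uses the standard two-player TrueSkill win update with the parameters from the first post; the function name, the shared σ of 260, and the μ gaps are all hypothetical values picked for illustration, and both players are assumed to have the same σ.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
BETA, GAMMA, DRAW_PROBABILITY = 1375.0, 27.5, 0.05
DRAW_MARGIN = STD_NORMAL.inv_cdf((DRAW_PROBABILITY + 1) / 2) * math.sqrt(2) * BETA


def winner_sigma_steps(gap, sigma):
    """Winner leads the loser by `gap` in mu; both players share `sigma`.

    Returns (sigma after the automatic increase,
             sigma after the win is applied,
             the winner's gain in mu).
    """
    inflated_var = sigma ** 2 + GAMMA ** 2              # phase 1: automatic increase
    c = math.sqrt(2 * BETA ** 2 + 2 * inflated_var)
    x = (gap - DRAW_MARGIN) / c
    v = STD_NORMAL.pdf(x) / STD_NORMAL.cdf(x)
    w = v * (v + x)                                     # small when the win was expected
    updated_var = inflated_var * (1 - inflated_var / c ** 2 * w)  # phase 2: shrink
    mu_gain = inflated_var / c * v
    return math.sqrt(inflated_var), math.sqrt(updated_var), mu_gain


start_sigma = 260.0  # hypothetical, roughly where sigma settles for an active player
for gap in (0.0, 3000.0, 5000.0):
    inflated, updated, mu_gain = winner_sigma_steps(gap, start_sigma)
    rating_change = mu_gain - 2 * (updated - start_sigma)
    print(gap, round(inflated, 2), round(updated, 2), round(rating_change, 2))
```

With these inputs the largest gap leaves σ higher than it started while μ barely moves, so μ-2σ comes out slightly negative after a win, which is the effect being discussed.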

Title: Re: Goko's Rating System... in a formula! (Part 1 of 3)
Post by: ragingduckd on April 13, 2014, 06:53:13 pm

Quote from: michaeljb on April 13, 2014, 06:36:08 pm

Quote from: ragingduckd on April 11, 2014, 07:34:34 pm
Rating never drops after a win: It's legitimately possible for your displayed TrueSkill rating to go down after a win. If you beat a far lower-rated player, TS gives you a modest increase in σ but only a tiny increase in μ. So μ-2σ can go down even though μ itself (its best guess for your skill) has gone up.

I don't understand this. I understand that "tiny increase in μ" combined with "modest increase in σ" results in a decrease of μ-2σ, but I don't understand why TS gives you a modest increase in σ.

Isn't σ supposed to be the uncertainty in the rating you have? If so, it seems like it should not be increasing when your actual game result matches the predicted game result. When you play a lower-rated player, the predicted result will be a win for you, and intuitively to me that means that when you do win, uncertainty should remain the same or decrease.

As I understand it, the gamma (aka tau) parameter gives you an increase in uncertainty before every game that's meant to model the possibility that your skill has changed. And I think you're right that it's really doing the wrong thing when you beat a much lower rated player... it's a compromise. Without gamma, your sigma plummets and you can end up with a rating that lags your evolving skill level.

There's an argument to be made for skipping gamma and just applying a daily increase in uncertainty... Holger suggested (http://forum.dominionstrategy.com/index.php?topic=8900.msg364617#msg364617) that Isotropic might have been doing this. I suppose it's a question of whether you think skill is more likely to change with time away from the game or with experience playing.

Also, what Kirian said. :)

Title: Re: Goko's Rating System, Part 1: ... in a formula!
Post by: michaeljb on April 14, 2014, 01:03:13 am

Cool, thanks for the explanations. I probably could have just read about TS on my own to find out...but what fun is that?

Title: Re: Goko's Rating System, Part 1: ... in a formula!
Post by: qmech on April 14, 2014, 04:48:18 am

Another redundant rephrasing: TS wants to discount the games you won against good players in the past. It uses "number of games played" as a proxy for the passage of time. If you want to discount old games though, then the "bump uncertainty every day" approach seems more reasonable.

Title: Re: Goko's Rating System, Part 1: ... in a formula!
Post by: Holger on April 14, 2014, 12:46:47 pm

Great to have the elusive formula at last.

Quote from: ragingduckd on April 11, 2014, 07:34:34 pm

μ can't go below 0 or above 10k: This one's real, at least for the Casual system. But μ=0 in Pro mode is so horrifically bad that not one player has bumped up against the limit. Same story for the alleged upper bound at 10k.

Not even Serf Bot, which has Isotropish mu=-18.7? Since Serf Bot has over 4,000 Pro games, this might make a difference for many weak players...

Quote from: ragingduckd on April 13, 2014, 06:53:13 pm

Quote from: michaeljb on April 13, 2014, 06:36:08 pm
Quote from: ragingduckd on April 11, 2014, 07:34:34 pm
Rating never drops after a win: It's legitimately possible for your displayed TrueSkill rating to go down after a win. If you beat a far lower-rated player, TS gives you a modest increase in σ but only a tiny increase in μ. So μ-2σ can go down even though μ itself (its best guess for your skill) has gone up.

I don't understand this. I understand that "tiny increase in μ" combined with "modest increase in σ" results in a decrease of μ-2σ, but I don't understand why TS gives you a modest increase in σ.

Isn't σ supposed to be the uncertainty in the rating you have? If so, it seems like it should not be increasing when your actual game result matches the predicted game result. When you play a lower-rated player, the predicted result will be a win for you, and intuitively to me that means that when you do win, uncertainty should remain the same or decrease.

As I understand it, the gamma (aka tau) parameter gives you an increase in uncertainty before every game that's meant to model the possibility that your skill has changed. And I think you're right that it's really doing the wrong thing when you beat a much lower rated player... it's a compromise. Without gamma, your sigma plummets and you can end up with a rating that lags your evolving skill level.

There's an argument to be made for skipping gamma and just applying a daily increase in uncertainty... Holger suggested (http://forum.dominionstrategy.com/index.php?topic=8900.msg364617#msg364617) that Isotropic might have been doing this. I suppose it's a question of whether you think skill is more likely to change with time away from the game or with experience playing.

Also, what Kirian said. :)

I'm unsure whether I prefer daily or gamely uncertainty increases ~~decreases~~ myself (or both, like Goko does). I do agree with michaeljb that the rating shouldn't drop after a win, but that's a "bug" of TrueSkill, which Goko (and Isotropish) just copied. Ideally, you'd limit the automatic uncertainty increase by the mu increase (to give at worst a rating change of zero) in the case of an expected result. "Lying" about the rating decrease doesn't help people who get stuck with an ever-decreasing rating for continually beating Serf Bot (there was such a case last year in Casual mode: http://forum.dominionstrategy.com/index.php?topic=6819.msg189088#msg189088).

Edit: added link.
2nd edit: fixed "sign error"
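
The cap proposed in this post (limit the uncertainty increase so that a win never lowers μ-2σ) could look something like the sketch below. This is only one reading of the suggestion, not something Goko or Isotropish is known to do; `cap_sigma_after_win` is a hypothetical name and the numbers are invented. Unlike Goko's cosmetic tweak, this changes the stored σ.

```python
def cap_sigma_after_win(old, raw_new):
    """Clamp sigma after a win so that mu - 2*sigma cannot go down.

    old:     (mu, sigma) before the game
    raw_new: (mu, sigma) produced by the unmodified TrueSkill update
    """
    old_mu, old_sigma = old
    new_mu, new_sigma = raw_new
    # mu - 2*sigma stays level exactly when sigma rises by half of mu's gain.
    sigma_ceiling = old_sigma + (new_mu - old_mu) / 2
    return new_mu, min(new_sigma, sigma_ceiling)


# Invented example: a win over a far weaker opponent that would cost 2 points.
before = (7000.0, 260.0)
raw_after = (7000.7, 261.35)
capped = cap_sigma_after_win(before, raw_after)
print(capped, capped[0] - 2 * capped[1])
```

When the raw update already shrinks σ or raises it by less than the ceiling, the function returns the raw values unchanged.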

Title: Re: Goko's Rating System, Part 1: ... in a formula!
Post by: Polk5440 on April 14, 2014, 01:13:52 pm

Quote from: Holger on April 14, 2014, 12:46:47 pm

I'm unsure whether I prefer daily or gamely uncertainty decreases myself (or both, like Goko does).

Correct me if I'm wrong, but if I remember the TrueSkill documentation correctly, the per game uncertainty models the idea that the skill you play with for that game is itself drawn from a distribution. One never plays with a fixed underlying skill. For example, I may be watching tv, be under the weather, distracted by something outside, etc. These are factors separate from luck within the game itself. [The modeling assumption is that the parameter (beta, I think?) that describes the distribution from which you "draw" your skill every game is the same for every player.]

The daily uncertainty increase is an artificial way of 1) encouraging play and decreasing leaderboard camping and/or 2) crudely modeling skill depreciation over time.

Quote

I do agree with michaeljb that the rating shouldn't drop after win, but that's a "bug" of TrueSkill, which Goko (and Isotropish) just copied. Ideally, you'd limit the automatic uncertainty increase by the mu increase (to give at worst a rating change of zero) in the case of an expected result. "Lying" about the rating decrease doesn't help people who get stuck with an ever-decreasing rating for continually beating Serf Bot (there was such a case last year in Casual mode: http://forum.dominionstrategy.com/index.php?topic=6819.msg189088#msg189088 (http://forum.dominionstrategy.com/index.php?topic=6819.msg189088#msg189088)).

Edit: added link.

Again, ranking systems are not achievement systems.

mu - 2*sigma is just one way to represent a two-parameter system as one number to create a leaderboard. If you prefer not having a rating decline purely because of uncertainty increase, then consider preferring a leaderboard based only on mu rather than changing the system itself.

Ideally, the leaderboard/rating decline from a win against a weak player wouldn't necessarily impact the quality of matchmaking, either. I think Microsoft tries to match players on the highest expected probability of a draw (not the same thing as closest rank on the leaderboard). With good matchmaking, rating declines should almost never happen, anyway.
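
For reference, the draw-probability-based match quality from the TrueSkill paper is easy to compute for two players, and it shows why it differs from "closest on the leaderboard". The sketch below assumes Goko's β from the first post; `match_quality`, `best_opponent`, and the player pool are hypothetical and only illustrate the idea, not any matchmaking Goko actually runs.

```python
import math

BETA = 1375.0  # Goko Pro's beta, from the first post


def match_quality(mu_a, sigma_a, mu_b, sigma_b, beta=BETA):
    """Two-player TrueSkill match quality: 1.0 is a perfectly even, certain match."""
    total_var = 2 * beta ** 2 + sigma_a ** 2 + sigma_b ** 2
    return (math.sqrt(2 * beta ** 2 / total_var)
            * math.exp(-((mu_a - mu_b) ** 2) / (2 * total_var)))


def best_opponent(player, pool):
    """Pick the name in `pool` (name -> (mu, sigma)) with the highest quality."""
    return max(pool, key=lambda name: match_quality(*player, *pool[name]))


# Hypothetical pool of waiting players.
pool = {
    "near_peer": (6900.0, 250.0),
    "newcomer": (7000.0, 1500.0),
    "weak_bot": (1000.0, 250.0),
}
me = (7000.0, 250.0)
for name, rating in pool.items():
    print(name, round(match_quality(*me, *rating), 3))
print(best_opponent(me, pool))
```

Quality depends on the gap in μ and on both σ values, so a player with the same μ but a very uncertain rating can score lower than a well-established player a little further away.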

Title: Re: Goko's Rating System, Part 1: ... in a formula!
Post by: Holger on April 15, 2014, 08:04:48 am

Quote from: Polk5440 on April 14, 2014, 01:13:52 pm

Quote from: Holger on April 14, 2014, 12:46:47 pm
I'm unsure whether I prefer daily or gamely uncertainty decreases myself (or both, like Goko does).

Correct me if I'm wrong, but if I remember the TrueSkill documentation correctly, the per game uncertainty models the idea that the skill you play with for that game is itself drawn from a distribution. One never plays with a fixed underlying skill. For example, I may be watching tv, be under the weather, distracted by something outside, etc. These are factors separate from luck within the game itself. [The modeling assumption is that the parameter (beta, I think?) that describes the distribution from which you "draw" your skill every game is the same for every player.]

I think beta does account for the luck of the game, not a "skill distribution". Either way, there is a separate parameter gamma, which does systematically increase the uncertainty once for each game. It's this parameter which allows for rating decreases after a win.
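
A minimal sketch of that effect, assuming the standard closed-form two-player TrueSkill update and the parameters from the first post (β=1375, γ=27.5, 5% draw probability). The two ratings are hypothetical, and none of Goko's display tweaks are modeled.

```python
import math
from statistics import NormalDist

N = NormalDist()  # standard normal
BETA, GAMMA, DRAW_P = 1375.0, 27.5, 0.05  # parameters quoted in the first post


def display(mu, sigma):
    """Displayed rating: floor(mu - 2*sigma)."""
    return math.floor(mu - 2 * sigma)


def win_update(winner, loser):
    """Return the winner's new (mu, sigma) after a 1v1 win."""
    (mu_w, s_w), (mu_l, s_l) = winner, loser
    # gamma inflates each player's variance once per game, before the update.
    var_w = s_w**2 + GAMMA**2
    var_l = s_l**2 + GAMMA**2
    c = math.sqrt(2 * BETA**2 + var_w + var_l)
    # Draw margin implied by the draw probability, scaled by c.
    eps = N.inv_cdf((DRAW_P + 1) / 2) * math.sqrt(2) * BETA / c
    x = (mu_w - mu_l) / c - eps
    v = N.pdf(x) / N.cdf(x)  # how surprising the win was
    w = v * (v + x)          # how much the win shrinks the variance
    new_mu = mu_w + var_w / c * v
    new_sigma = math.sqrt(var_w * (1 - var_w / c**2 * w))
    return new_mu, new_sigma


# Hypothetical: a well-established strong player beats a much weaker one.
strong, weak = (7000.0, 200.0), (1000.0, 200.0)
new_mu, new_sigma = win_update(strong, weak)

before, after = display(*strong), display(new_mu, new_sigma)
print("mu:", strong[0], "->", new_mu)
print("sigma:", strong[1], "->", new_sigma)
print("displayed:", before, "->", after)
```

The win was almost certain, so mu barely moves and the variance barely shrinks, while the γ term still inflates sigma. The subtracted 2*sigma therefore grows by more than mu does, and the displayed rating falls.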

Quote

The daily uncertainty increase is an artificial way of 1) encouraging play and decreasing leaderboard camping and/or 2) crudely modeling skill depreciation over time.

Agreed; but skill depreciation can also be "crudely" modeled by an uncertainty increase once per game instead, as the original TrueSkill algorithm does (and as Goko does too, in addition to its daily increase).

Quote

Quote
I do agree with michaeljb that the rating shouldn't drop after a win, but that's a "bug" of TrueSkill, which Goko (and Isotropish) just copied. Ideally, you'd limit the automatic uncertainty increase by the mu increase (to give at worst a rating change of zero) in the case of an expected result. "Lying" about the rating decrease doesn't help people who get stuck with an ever-decreasing rating for continually beating Serf Bot (there was such a case last year in Casual mode: http://forum.dominionstrategy.com/index.php?topic=6819.msg189088#msg189088).

Edit: added link.

Again, ranking systems are not achievement systems.

mu - 2*sigma is just one way to represent a two-parameter system as one number to create a leaderboard. If you prefer not having a rating decline purely because of uncertainty increase, then consider preferring a leaderboard based only on mu rather than changing the system itself.

I wouldn't mind a leaderboard based only on mu (if it doesn't lead to new players getting to #1 with 10 lucky wins); but on Goko, the leaderboard effectively IS the system because they don't publish mu and sigma separately (let alone the estimated win probabilities). And to me it makes no sense at all to decrease the rating for a win, no matter whether you consider the leaderboard an "achievement system" or not. FWIW, the Goko leaderboard is used as a ranking system with the Salvager extension (requiring e.g. "4000+" opponents), although I'd consider it an achievement system due to the subtracted 2*sigma.

Quote

Ideally, the leaderboard/rating decline from a win against a weak player wouldn't necessarily impact the quality of matchmaking, either. I think Microsoft tries to match players on the highest expected probability of a draw (not the same thing as closest rank on the leaderboard). With good matchmaking, rating declines should almost never happen, anyway.

Certainly the probability of a rating decline also depends on the TrueSkill parameters, not only on good matchmaking. With the high number of complaints about it, it seems to have occurred quite frequently on Goko. (Goko does seem to have good "bot matchmaking", always choosing the bot closest to your rating as an opponent when starting a "Play bots" game. This didn't prevent the quoted Serf Bot matchmaking from being a rating trap.)

Title: Re: Goko's Rating System, Part 1: ... in a formula!
Post by: WanderingWinder on April 15, 2014, 04:40:56 pm

So it's not clear (or I missed it): did you check this for multiplayer? Because it's possible they're using a system which collapses to being (virtually) identical to a TS implementation for 2-player, but which differs in multiplayer.
(I say virtually here because truncating technically makes it different, though not in a way I'd expect anyone would argue is 'better'.)

Title: Re: Goko's Rating System, Part 1: ... in a formula!
Post by: ragingduckd on April 15, 2014, 05:41:31 pm

Quote from: WanderingWinder on April 15, 2014, 04:40:56 pm

So it's not clear (or I missed it): did you check this for multiplayer? Because it's possible they're using a system which collapses to being (virtually) identical to a TS implementation for 2-player, but which differs in multiplayer.
(I say virtually here because truncating technically makes it different, though not in a way I'd expect anyone would argue is 'better'.)

Good question. Yes, multiplayer appears to be the same.

In the client:

Code:

Before game: {SD: 418.4078623330855, mean: 761.6906835281002} 
Game Result: first place vs two new players
After game: {SD: 413.57218123834264, mean: 963.1254546591063}

In simulation:

Code:

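> import trueskill
> # Rating before the game (mean, SD), as reported by the client
> r1 = trueskill.Rating(761.69068, 418.40786)
> # The two new players
> r2 = trueskill.Rating(5500, 2250)
> r3 = trueskill.Rating(5500, 2250)
>
> from gdt.ratings.rating_system import goko
> # Ranks (1, 2, 3): r1 takes first place; [0] is r1's updated rating
> goko.rate((r1, r2, r3), (1, 2, 3))[0]
>> trueskill.Rating(mu=963.125, sigma=413.572)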

Title: Re: Goko's Rating System, Part 1: ... in a formula!
Post by: ragingduckd on April 17, 2014, 03:32:03 pm

Quote

I coded up their TrueSkill parameters and tweaks so that you can verify it. Just punch in your username and your opponent's, and it'll tell you what to expect after a win, loss, or draw. It tells you both your "real" new rating and the change that MF/Goko will tell you. They can diverge if you beat a much lower rated opponent, as in the example below.

Please post a screen shot if you discover a case in which its prediction is wrong, but first double-check that it actually had your rating right before the game. The process that collects that data can fall behind.

~~Pro Mode Rating Predictor (http://gokosalvager.com/static/gokoproratingpredictor.html)~~ (Offline)

I'm taking this offline because the only way I found to keep a current list of Goko Pro ratings is an intolerable nuisance. I'll add the same functionality to Salvager soon. It's a whole lot easier to do in the client.

Title: Re: Goko's Rating System, Part 1: ... in a formula!
Post by: Holger on September 30, 2014, 08:23:47 am

Quote from: ragingduckd on April 17, 2014, 03:32:03 pm

Quote
I coded up their TrueSkill parameters and tweaks so that you can verify it. Just punch in your username and your opponent's, and it'll tell you what to expect after a win, loss, or draw. It tells you both your "real" new rating and the change that MF/Goko will tell you. They can diverge if you beat a much lower rated opponent, as in the example below.

Please post a screen shot if you discover a case in which its prediction is wrong, but first double-check that it actually had your rating right before the game. The process that collects that data can fall behind.

~~Pro Mode Rating Predictor (http://gokosalvager.com/static/gokoproratingpredictor.html)~~ (Offline)

I'm taking this offline because the only way I found to keep a current list of Goko Pro ratings is an intolerable nuisance. I'll add the same functionality to Salvager soon. It's a whole lot easier to do in the client.

Will you still add this? (Or have you already, and I just didn't find it?)

Title: Re: Goko's Rating System, Part 1: ... in a formula!
Post by: ragingduckd on September 30, 2014, 12:50:35 pm

Quote from: Holger on September 30, 2014, 08:23:47 am

Quote from: ragingduckd on April 17, 2014, 03:32:03 pm
Quote
I coded up their TrueSkill parameters and tweaks so that you can verify it. Just punch in your username and your opponent's, and it'll tell you what to expect after a win, loss, or draw. It tells you both your "real" new rating and the change that MF/Goko will tell you. They can diverge if you beat a much lower rated opponent, as in the example below.

Please post a screen shot if you discover a case in which its prediction is wrong, but first double-check that it actually had your rating right before the game. The process that collects that data can fall behind.

~~Pro Mode Rating Predictor (http://gokosalvager.com/static/gokoproratingpredictor.html)~~ (Offline)

I'm taking this offline because the only way I found to keep a current list of Goko Pro ratings is an intolerable nuisance. I'll add the same functionality to Salvager soon. It's a whole lot easier to do in the client.

Will you still add this? (Or have you already, and I just didn't find it?)

I have alpha-quality code for it somewhere, but I'm not sure what I've done with it.

It requires adding a TrueSkill JavaScript package to Salvager, querying the ratings from Goko, calculating the predicted changes using Goko's TrueSkill parameters, and displaying that in the UI.
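
For anyone picking this up, the calculation step alone might look roughly like the sketch below. It is written in Python rather than JavaScript to match the simulation earlier in the thread, and it is not the alpha-quality code mentioned above. It assumes the standard closed-form two-player TrueSkill update with the parameters from the first post; `predict` and its helpers are hypothetical names, and Goko's display tweaks (the change Goko "tells you") are not modeled.

```python
import math
from statistics import NormalDist

N = NormalDist()  # standard normal
BETA, GAMMA, DRAW_P = 1375.0, 27.5, 0.05  # parameters quoted in the first post


def _win_terms(t, eps):
    """Mean and variance correction terms for the winner of a decisive game."""
    x = t - eps
    v = N.pdf(x) / N.cdf(x)
    return v, v * (v + x)


def _draw_terms(t, eps):
    """Mean and variance correction terms for a draw."""
    a, b = eps - t, -eps - t
    denom = N.cdf(a) - N.cdf(b)
    v = (N.pdf(b) - N.pdf(a)) / denom
    w = v * v + (a * N.pdf(a) - b * N.pdf(b)) / denom
    return v, w


def predict(me, opp):
    """Map each outcome to (new_mu, new_sigma, change in floor(mu - 2*sigma))."""
    (mu, s), (mu_o, s_o) = me, opp
    # Per-game uncertainty increase, applied before the update.
    var, var_o = s**2 + GAMMA**2, s_o**2 + GAMMA**2
    c = math.sqrt(2 * BETA**2 + var + var_o)
    eps = N.inv_cdf((DRAW_P + 1) / 2) * math.sqrt(2) * BETA / c
    t = (mu - mu_o) / c

    v_win, w_win = _win_terms(t, eps)
    v_loss, w_loss = _win_terms(-t, eps)  # the opponent's win, seen from their side
    v_draw, w_draw = _draw_terms(t, eps)
    cases = {"win": (v_win, w_win), "loss": (-v_loss, w_loss), "draw": (v_draw, w_draw)}

    before = math.floor(mu - 2 * s)
    result = {}
    for outcome, (v, w) in cases.items():
        new_mu = mu + var / c * v
        new_sigma = math.sqrt(var * (1 - var / c**2 * w))
        result[outcome] = (new_mu, new_sigma, math.floor(new_mu - 2 * new_sigma) - before)
    return result


# Hypothetical, evenly matched players given as (mu, sigma).
prediction = predict((5500.0, 400.0), (5500.0, 400.0))
for outcome, (new_mu, new_sigma, change) in prediction.items():
    print(f"{outcome}: mu={new_mu:.1f} sigma={new_sigma:.1f} displayed change={change:+d}")
```

Querying the current ratings and showing the result in the UI are left out; those are the parts that depend on Goko and Salvager.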

Um... anyone else want to do that instead of me writing my code all over again?

Title: Re: Goko's Rating System, Part 1: ... in a formula!
Post by: Holger on October 09, 2014, 01:11:10 pm

Quote from: ragingduckd on September 30, 2014, 12:50:35 pm

Quote from: Holger on September 30, 2014, 08:23:47 am
Quote from: ragingduckd on April 17, 2014, 03:32:03 pm
Quote
I coded up their TrueSkill parameters and tweaks so that you can verify it. Just punch in your username and your opponent's, and it'll tell you what to expect after a win, loss, or draw. It tells you both your "real" new rating and the change that MF/Goko will tell you. They can diverge if you beat a much lower rated opponent, as in the example below.

Please post a screen shot if you discover a case in which its prediction is wrong, but first double-check that it actually had your rating right before the game. The process that collects that data can fall behind.

~~Pro Mode Rating Predictor (http://gokosalvager.com/static/gokoproratingpredictor.html)~~ (Offline)

I'm taking this offline because the only way I found to keep a current list of Goko Pro ratings is an intolerable nuisance. I'll add the same functionality to Salvager soon. It's a whole lot easier to do in the client.

Will you still add this? (Or have you already, and I just didn't find it?)

I have alpha-quality code for it somewhere, but I'm not sure what I've done with it.

It requires adding a TrueSkill JavaScript package to Salvager, querying the ratings from Goko, calculating the predicted changes using Goko's TrueSkill parameters, and displaying that in the UI.

Um... anyone else want to do that instead of me writing my code all over again?

If you don't have it ready-made anymore, there's probably no need to put more work into it. (Now I think about it, I would probably be more interested in an Isotropish rating predictor than a Goko rating predictor.)

But I'm still most interested in reading your "Goko vs. Isotropish" article, so put that before anything else... ;)