Dominion Strategy Forum


Author Topic: Goko's Rating System, Part 1: ... in a formula!


ragingduckd

Re: Goko's Rating System... in a formula! (Part 1 of 3)
« Reply #25 on: April 13, 2014, 06:53:13 pm »
+3

Rating never drops after a win: It's legitimately possible for your displayed TrueSkill rating to go down after a win.  If you beat a far lower-rated player, TS gives you a modest increase in σ but only a tiny increase in μ.  So μ-2σ can go down even though μ itself (its best guess for your skill) has gone up.

I don't understand this. I understand that "tiny increase in μ" combined with "modest increase in σ" results in a decrease of μ-2σ, but I don't understand why TS gives you a modest increase in σ.

Isn't σ supposed to be the uncertainty in the rating you have? If so, it seems like it should not be increasing when your actual game result matches the predicted game result. When you play a lower-rated player, the predicted result will be a win for you, and intuitively to me that means that when you do win, uncertainty should remain the same or decrease.

As I understand it, the gamma (aka tau) parameter gives you an increase in uncertainty before every game that's meant to model the possibility that your skill has changed. And I think you're right that it's really doing the wrong thing when you beat a much lower-rated player... it's a compromise. Without gamma, your sigma plummets and you can end up with a rating that lags your evolving skill level.

There's an argument to be made for skipping gamma and just applying a daily increase in uncertainty...  Holger suggested that Isotropic might have been doing this.  I suppose it's a question of whether you think skill is more likely to change with time away from the game or with experience playing.

Also, what Kirian said. :)
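
A minimal sketch of this effect with the public Python trueskill package (the same library used for the simulation later in this thread). The environment parameters and ratings below are made up for illustration (they are not Goko's), and tau is set high so the per-game uncertainty bump is easy to see:

Code:
import trueskill

# Illustrative parameters on the library's default 0-50 scale, NOT Goko's.
env = trueskill.TrueSkill(mu=25.0, sigma=25.0 / 3, beta=25.0 / 6,
                          tau=1.0, draw_probability=0.05)

me = env.create_rating(mu=35.0, sigma=2.0)    # established player, low sigma
weak = env.create_rating(mu=5.0, sigma=2.0)   # far lower-rated opponent

def displayed(r):
    # the conservative "leaderboard" number discussed here: mu - 2*sigma
    return r.mu - 2 * r.sigma

new_me, new_weak = env.rate_1vs1(me, weak)    # I win, as expected

print(displayed(me), displayed(new_me))
# mu barely moves (the win carried almost no information), but tau has
# inflated sigma, so mu - 2*sigma comes out lower than before the win.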

michaeljb

Re: Goko's Rating System, Part 1: ... in a formula!
« Reply #26 on: April 14, 2014, 01:03:13 am »
+1

Cool, thanks for the explanations. I probably could have just read about TS on my own to find out...but what fun is that?

qmech

Re: Goko's Rating System, Part 1: ... in a formula!
« Reply #27 on: April 14, 2014, 04:48:18 am »
0

Another redundant rephrasing: TS wants to discount the games you won against good players in the past.  It uses "number of games played" as a proxy for the passage of time.  If you want to discount old games though, then the "bump uncertainty every day" approach seems more reasonable.

Holger

Re: Goko's Rating System, Part 1: ... in a formula!
« Reply #28 on: April 14, 2014, 12:46:47 pm »
0

Great to have the elusive formula at last. 

  • μ can't go below 0 or above 10k: This one's real, at least for the Casual system.  But μ=0 in Pro mode is so horrifically bad that not one player has bumped up against the limit.  Same story for the alleged upper bound at 10k.

Not even Serf Bot, which has Isotropish mu=-18.7? Since Serf Bot has over 4,000 Pro games, this might make a difference for many weak players...

Rating never drops after a win: It's legitimately possible for your displayed TrueSkill rating to go down after a win.  If you beat a far lower-rated player, TS gives you a modest increase in σ but only a tiny increase in μ.  So μ-2σ can go down even though μ itself (its best guess for your skill) has gone up.

I don't understand this. I understand that "tiny increase in μ" combined with "modest increase in σ" results in a decrease of μ-2σ, but I don't understand why TS gives you a modest increase in σ.

Isn't σ supposed to be the uncertainty in the rating you have? If so, it seems like it should not be increasing when your actual game result matches the predicted game result. When you play a lower-rated player, the predicted result will be a win for you, and intuitively to me that means that when you do win, uncertainty should remain the same or decrease.

As I understand it, the gamma (aka tau) parameter gives you an increase in uncertainty before every game that's meant to model the possibility that your skill has changed. And I think you're right that it's really doing the wrong thing when you beat a much lower-rated player... it's a compromise. Without gamma, your sigma plummets and you can end up with a rating that lags your evolving skill level.

There's an argument to be made for skipping gamma and just applying a daily increase in uncertainty...  Holger suggested that Isotropic might have been doing this.  I suppose it's a question of whether you think skill is more likely to change with time away from the game or with experience playing.

Also, what Kirian said. :)

I'm unsure whether I prefer daily or per-game uncertainty increases myself (or both, like Goko does). I do agree with michaeljb that the rating shouldn't drop after a win, but that's a "bug" of TrueSkill, which Goko (and Isotropish) just copied. Ideally, you'd limit the automatic uncertainty increase by the mu increase (to give at worst a rating change of zero) in the case of an expected result. "Lying" about the rating decrease doesn't help people who get stuck with an ever-decreasing rating for continually beating Serf Bot (there was such a case last year in Casual mode: http://forum.dominionstrategy.com/index.php?topic=6819.msg189088#msg189088).

Edit: added link.
2nd edit: fixed "sign error"
« Last Edit: April 15, 2014, 06:13:37 am by Holger »
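
One way to get the "at worst a rating change of zero" behaviour from that suggestion is to clamp the displayed value directly. This is a hypothetical sketch, not something Goko or Isotropish does, and it adjusts the display rather than sigma itself:

Code:
def clamped_display(old, new):
    # Displayed ("leaderboard") rating is mu - 2*sigma, as elsewhere in the thread.
    # If mu went up (the result was at least as expected) but the automatic
    # sigma inflation would drag the displayed value down, hold it at the
    # pre-game value instead.
    before = old.mu - 2 * old.sigma
    after = new.mu - 2 * new.sigma
    if new.mu >= old.mu and after < before:
        return before
    return after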

Polk5440

Re: Goko's Rating System, Part 1: ... in a formula!
« Reply #29 on: April 14, 2014, 01:13:52 pm »
0

I'm unsure whether I prefer daily or per-game uncertainty increases myself (or both, like Goko does).

Correct me if I'm wrong, but if I remember the TrueSkill documentation correctly, the per game uncertainty models the idea that the skill you play with for that game is itself drawn from a distribution. One never plays with a fixed underlying skill. For example, I may be watching tv, be under the weather, distracted by something outside, etc. These are factors separate from luck within the game itself. [The modeling assumption is that the parameter (beta, I think?) that describes the distribution from which you "draw" your skill every game is the same for every player.]
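
Whichever way one phrases it, beta is the per-game performance spread in the model, and the predicted chance of one player outperforming the other follows from it together with the two uncertainties. A sketch of the standard expression, with purely illustrative numbers:

Code:
import math

def win_probability(mu1, sigma1, mu2, sigma2, beta):
    # Each single-game performance is skill plus Gaussian noise with variance
    # beta**2, so the performance difference has variance
    # 2*beta**2 + sigma1**2 + sigma2**2.
    denom = math.sqrt(2 * beta ** 2 + sigma1 ** 2 + sigma2 ** 2)
    return 0.5 * (1 + math.erf((mu1 - mu2) / (denom * math.sqrt(2))))

print(win_probability(30, 3, 25, 3, beta=25 / 6))   # about 0.75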

The daily uncertainty increase is an artificial way of 1) encouraging play and decreasing leaderboard camping and/or 2) crudely modeling skill depreciation over time.

Quote
I do agree with michaeljb that the rating shouldn't drop after a win, but that's a "bug" of TrueSkill, which Goko (and Isotropish) just copied. Ideally, you'd limit the automatic uncertainty increase by the mu increase (to give at worst a rating change of zero) in the case of an expected result. "Lying" about the rating decrease doesn't help people who get stuck with an ever-decreasing rating for continually beating Serf Bot (there was such a case last year in Casual mode: http://forum.dominionstrategy.com/index.php?topic=6819.msg189088#msg189088).

Edit: added link.

Again, ranking systems are not achievement systems.

mu - 2*sigma is just one way to represent a two-parameter system as one number to create a leaderboard. If you prefer not having a rating decline purely because of uncertainty increase, then consider preferring a leaderboard based only on mu rather than changing the system itself.

Ideally, the leaderboard/rating decline from a win against a weak player wouldn't necessarily impact the quality of matchmaking, either. I think Microsoft tries to match players on the highest expected probability of a draw (not the same thing as closest rank on the leaderboard). With good matchmaking, rating declines should almost never happen, anyway.
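
The Python trueskill package exposes that criterion as the "match quality" (the model's draw probability), so a matchmaking sketch just maximizes it over the candidates rather than picking the nearest leaderboard neighbour. The ratings and default parameters below are illustrative, not Goko's:

Code:
import trueskill

env = trueskill.TrueSkill()               # library defaults, for illustration
me = env.create_rating(mu=28, sigma=3)

candidates = {
    "close_rank": env.create_rating(mu=32, sigma=5),  # same mu - 2*sigma as me
    "same_skill": env.create_rating(mu=28, sigma=3),
    "serf_bot":   env.create_rating(mu=5, sigma=1),
}

# Highest predicted draw probability, which need not be the closest
# leaderboard rank.
best = max(candidates, key=lambda name: env.quality_1vs1(me, candidates[name]))
print(best)   # "same_skill" with these numbers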

Holger

Re: Goko's Rating System, Part 1: ... in a formula!
« Reply #30 on: April 15, 2014, 08:04:48 am »
0

I'm unsure whether I prefer daily or per-game uncertainty increases myself (or both, like Goko does).

Correct me if I'm wrong, but if I remember the TrueSkill documentation correctly, the per game uncertainty models the idea that the skill you play with for that game is itself drawn from a distribution. One never plays with a fixed underlying skill. For example, I may be watching tv, be under the weather, distracted by something outside, etc. These are factors separate from luck within the game itself. [The modeling assumption is that the parameter (beta, I think?) that describes the distribution from which you "draw" your skill every game is the same for every player.]
I think beta does account for the luck of the game, not a "skill distribution". Either way, there is a separate parameter gamma, which does systematically increase the uncertainty once for each game. It's this parameter which allows for rating decreases after a win.

Quote
The daily uncertainty increase is an artificial way of 1) encouraging play and decreasing leaderboard camping and/or 2) crudely modeling skill depreciation over time.

Agreed; but skill depreciation can also be "crudely" modeled by an uncertainty increase once per game instead, as the original TrueSkill algorithm does (and as Goko does too, on top of its daily increase).
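
Both variants are the same kind of variance bookkeeping, just triggered by different events. A sketch with hypothetical parameter names:

Code:
import math

def inflate_per_game(sigma, tau):
    # gamma/tau-style bump, applied once before every rated game
    return math.sqrt(sigma ** 2 + tau ** 2)

def inflate_per_day(sigma, days_idle, tau_day):
    # daily bump (the Isotropic-style alternative): uncertainty grows with
    # time away from the game rather than with games played
    return math.sqrt(sigma ** 2 + days_idle * tau_day ** 2)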


Quote
Quote
I do agree with michaeljb that the rating shouldn't drop after a win, but that's a "bug" of TrueSkill, which Goko (and Isotropish) just copied. Ideally, you'd limit the automatic uncertainty increase by the mu increase (to give at worst a rating change of zero) in the case of an expected result. "Lying" about the rating decrease doesn't help people who get stuck with an ever-decreasing rating for continually beating Serf Bot (there was such a case last year in Casual mode: http://forum.dominionstrategy.com/index.php?topic=6819.msg189088#msg189088).

Edit: added link.

Again, ranking systems are not achievement systems.

mu - 2*sigma is just one way to represent a two-parameter system as one number to create a leaderboard. If you prefer not having a rating decline purely because of uncertainty increase, then consider preferring a leaderboard based only on mu rather than changing the system itself.
I wouldn't mind a leaderboard based only on mu (if it doesn't lead to new players getting to #1 with 10 lucky wins); but on Goko, the leaderboard effectively IS the system, because they don't publish mu and sigma separately (let alone the estimated win probabilities). And to me it makes no sense at all to decrease the rating for a win, no matter whether you consider the leaderboard as an "achievement system" or not. FWIW, the Goko leaderboard is used as a ranking system with the Salvager extension (e.g. requiring "4000+" opponents), although I'd consider it an achievement system due to the subtracted 2*sigma.


Quote
Ideally, the leaderboard/rating decline from a win against a weak player wouldn't necessarily impact the quality of matchmaking, either. I think Microsoft tries to match players on the highest expected probability of a draw (not the same thing as closest rank on the leaderboard). With good matchmaking, rating declines should almost never happen, anyway.

Certainly the probability of a rating decline also depends on the TrueSkill parameters, not only on good matchmaking. Given the high number of complaints about it, it seems to have occurred quite frequently on Goko. (Goko does seem to have good "bot matchmaking", always choosing the bot closest to your rating as an opponent when starting a "Play bots" game. That didn't keep the Serf Bot matchmaking quoted above from becoming a rating trap.)

WanderingWinder

Re: Goko's Rating System, Part 1: ... in a formula!
« Reply #31 on: April 15, 2014, 04:40:56 pm »
+2

So it's not clear (or I missed it): did you check this for multiplayer? Because it's possible they're using a system which collapses to being (virtually) identical to a TS implementation for 2-player, but which differs in multiplayer.
(I say virtually here because truncating technically makes it different, though not in a way I'd expect anyone would argue is 'better').

ragingduckd

Re: Goko's Rating System, Part 1: ... in a formula!
« Reply #32 on: April 15, 2014, 05:41:31 pm »
+4

So it's not clear (or I missed it): did you check this for multiplayer? Because it's possible they're using a system which collapses to being (virtually) identical to a TS implementation for 2-player, but which differs in multiplayer.
(I say virtually here because truncating technically makes it different, though not in a way I'd expect anyone would argue is 'better').

Good question.  Yes, multiplayer appears to be the same.

In the client:

Code:
Before game: {SD: 418.4078623330855, mean: 761.6906835281002}
Game Result: first place vs two new players
After game: {SD: 413.57218123834264, mean: 963.1254546591063}

In simulation:

Code:
>>> import trueskill
>>> r1 = trueskill.Rating(761.69068, 418.40786)
>>> r2 = trueskill.Rating(5500, 2250)
>>> r3 = trueskill.Rating(5500, 2250)
>>>
>>> from gdt.ratings.rating_system import goko
>>> goko.rate((r1, r2, r3), (1, 2, 3))[0]
trueskill.Rating(mu=963.125, sigma=413.572)
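
For anyone without the gdt package: the plain trueskill library handles the multiplayer case with one-player "teams" and a ranks list. The sketch below uses the library's default environment, so its numbers won't reproduce the gdt result above (which presumably configures Goko's own parameters):

Code:
import trueskill

# Library defaults; matching Goko's mu/sigma/beta/tau would need trueskill.setup(...).
r1 = trueskill.Rating(761.69068, 418.40786)
r2 = trueskill.Rating(5500, 2250)
r3 = trueskill.Rating(5500, 2250)

# Each player is a one-person "team"; ranks are 0-based, lower is better.
(new_r1,), (new_r2,), (new_r3,) = trueskill.rate([(r1,), (r2,), (r3,)], ranks=[0, 1, 2])
print(new_r1)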

ragingduckd

Re: Goko's Rating System, Part 1: ... in a formula!
« Reply #33 on: April 17, 2014, 03:32:03 pm »
+1

Quote
I coded up their TrueSkill parameters and tweaks so that you can verify it.  Just punch in your username and your opponent's, and it'll tell you what to expect after a win, loss, or draw.  It tells you both your "real" new rating and the change that MF/Goko will tell you.  They can diverge if you beat a much lower rated opponent, as in the example below.

Please post a screen shot if you discover a case in which its prediction is wrong, but first double-check that it actually had your rating right before the game.  The process that collects that data can fall behind.

Pro Mode Rating Predictor (Offline)

I'm taking this offline because the only way I found to keep a current list of Goko Pro ratings is an intolerable nuisance.  I'll add the same functionality to Salvager soon.  It's a whole lot easier to do in the client.

Holger

Re: Goko's Rating System, Part 1: ... in a formula!
« Reply #34 on: September 30, 2014, 08:23:47 am »
0

Quote
I coded up their TrueSkill parameters and tweaks so that you can verify it.  Just punch in your username and your opponent's, and it'll tell you what to expect after a win, loss, or draw.  It tells you both your "real" new rating and the change that MF/Goko will tell you.  They can diverge if you beat a much lower rated opponent, as in the example below.

Please post a screen shot if you discover a case in which its prediction is wrong, but first double-check that it actually had your rating right before the game.  The process that collects that data can fall behind.

Pro Mode Rating Predictor (Offline)

I'm taking this offline because the only way I found to keep a current list of Goko Pro ratings is an intolerable nuisance.  I'll add the same functionality to Salvager soon.  It's a whole lot easier to do in the client.

Will you still add this? (Or have you and I just didn't find it?)

ragingduckd

Re: Goko's Rating System, Part 1: ... in a formula!
« Reply #35 on: September 30, 2014, 12:50:35 pm »
0

Quote
I coded up their TrueSkill parameters and tweaks so that you can verify it.  Just punch in your username and your opponent's, and it'll tell you what to expect after a win, loss, or draw.  It tells you both your "real" new rating and the change that MF/Goko will tell you.  They can diverge if you beat a much lower rated opponent, as in the example below.

Please post a screen shot if you discover a case in which its prediction is wrong, but first double-check that it actually had your rating right before the game.  The process that collects that data can fall behind.

Pro Mode Rating Predictor (Offline)

I'm taking this offline because the only way I found to keep a current list of Goko Pro ratings is an intolerable nuisance.  I'll add the same functionality to Salvager soon.  It's a whole lot easier to do in the client.

Will you still add this? (Or have you and I just didn't find it?)

I have alpha-quality code for it somewhere, but I'm not sure what I've done with it.

It requires adding a TrueSkill JavaScript package to Salvager, querying the ratings from Goko, calculating the predicted changes using Goko's TrueSkill parameters, and displaying that in the UI.

Um... anyone else want to do that instead of me writing my code all over again?

Holger

Re: Goko's Rating System, Part 1: ... in a formula!
« Reply #36 on: October 09, 2014, 01:11:10 pm »
0

Quote
I coded up their TrueSkill parameters and tweaks so that you can verify it.  Just punch in your username and your opponent's, and it'll tell you what to expect after a win, loss, or draw.  It tells you both your "real" new rating and the change that MF/Goko will tell you.  They can diverge if you beat a much lower rated opponent, as in the example below.

Please post a screen shot if you discover a case in which its prediction is wrong, but first double-check that it actually had your rating right before the game.  The process that collects that data can fall behind.

Pro Mode Rating Predictor (Offline)

I'm taking this offline because the only way I found to keep a current list of Goko Pro ratings is an intolerable nuisance.  I'll add the same functionality to Salvager soon.  It's a whole lot easier to do in the client.

Will you still add this? (Or have you and I just didn't find it?)

I have alpha-quality code for it somewhere, but I'm not sure what I've done with it.

It requires adding a TrueSkill JavaScript package to Salvager, querying the ratings from Goko, calculating the predicted changes using Goko's TrueSkill parameters, and displaying that in the UI.

Um... anyone else want to do that instead of me writing my code all over again?

If you don't have it ready-made anymore, there's probably no need to put more work into it. (Now that I think about it, I would probably be more interested in an Isotropish rating predictor than a Goko rating predictor.)

But I'm still most interested in reading your "Goko vs. Isotropish" article, so put that before anything else... ;)