Dominion Strategy Forum

Pages: 1 [2] 3 4 ... 13  All

Author Topic: Isotropish Leaderboard (alternative to Goko Pro)  (Read 144124 times)

blueblimp

  • Margrave
  • *****
  • Offline Offline
  • Posts: 2849
  • Respect: +1559
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #25 on: July 20, 2013, 05:24:58 am »
+2

Isotropic uses mu - 3*sigma. The "uncertainty" number the leaderboard shows is 3*sigma.
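
A minimal sketch of that formula, in case it helps. The names `Rating`, `iso_level` and `iso_display` are made up for illustration and are not taken from any Isotropic code; the new-player defaults (mu = 25, sigma = 25/3) are the usual TrueSkill starting values mentioned later in this thread.

```python
from dataclasses import dataclass


@dataclass
class Rating:
    mu: float     # estimated skill
    sigma: float  # uncertainty (standard deviation) of that estimate


def iso_level(r: Rating, k: float = 3.0) -> float:
    """Conservative level: mu minus k standard deviations (k = 3 as described above)."""
    return r.mu - k * r.sigma


def iso_display(r: Rating) -> str:
    """Show the level plus 'mu +/- uncertainty', where the uncertainty shown is 3*sigma."""
    uncertainty = 3.0 * r.sigma
    return f"level {iso_level(r):.1f} ({r.mu:.1f} ± {uncertainty:.1f})"


# Hypothetical players: a brand-new account and an established one.
new_player = Rating(mu=25.0, sigma=25.0 / 3.0)
veteran = Rating(mu=40.0, sigma=2.0)

print(iso_display(veteran))
```

Under these assumptions a brand-new player sits at roughly level 0, because the uncertainty term cancels the starting mu.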
Logged

yed

  • Minion
  • *****
  • Offline Offline
  • Posts: 620
  • Shuffle iT Username: yed
  • Respect: +571
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #26 on: July 20, 2013, 05:53:11 am »
0

Quote
Why are there a bunch of players with ".0000" after their username?

Lightning edit: After some more perusal, it looks like it's for duplicated usernames. It seems strange that Goko would allow those!
I suspect a .0000 username is created when someone logs in through Facebook/Google+ with a name that is already taken.
Logged

Awaclus

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 11809
  • Shuffle iT Username: Awaclus
  • (´。• ω •。`)
  • Respect: +12848
    • View Profile
    • Birds of Necama
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #27 on: July 20, 2013, 05:53:39 am »
+4

I vote for

3. Implement the isotropic leaderboard's algorithm and be done with it
Logged
Bomb, Cannon, and many of the Gunpowder cards can strongly effect gameplay, particularly in a destructive way

The YouTube channel where I make music
Download my band's Creative Commons albums for free

Fabian

  • 2012 Swedish Champion
  • *
  • Offline Offline
  • Posts: 666
  • Respect: +542
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #28 on: July 20, 2013, 05:59:29 am »
+1

While I certainly haven't put as much thought into this as some of you other guys have, you know, Awaclus's suggestion doesn't seem half bad to me.
Logged

yed

  • Minion
  • *****
  • Offline Offline
  • Posts: 620
  • Shuffle iT Username: yed
  • Respect: +571
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #29 on: July 20, 2013, 06:15:43 am »
0

Quote
On Iso, uncertainties never seem to have gotten below 6.5, and they didn't converge nearly as uniformly.

None of this makes sense to me. Intuitively, I would have expected uncertainties to converge asymptotically to zero. I also wouldn't have expected my uncertainties to converge any more uniformly than Iso's did. Are these anomalies evidence of a failure in TrueSkill, in my parameters, in my code, or in my intuition?

Maybe it has something to do with the GAMMA parameter in the Iso rating code:
https://github.com/dougz/trueskill/blob/master/trueskill.py#L285
EDIT: Added quote from iso trueskill source linked above.
Quote
gamma is a small amount by which a player's uncertainty (sigma) is
  increased prior to the start of each game.  This allows us to
  account for skills that vary over time; the effect of old games
  on the estimate will slowly disappear unless reinforced by evidence
  from new games.
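
A rough sketch of why such a parameter would stop sigma from shrinking to zero. This is not the linked code: it uses the textbook two-player TrueSkill variance update without draws, and it assumes the inflation is applied to the variance (sigma² + gamma²) before each game, that every game is against an evenly matched opponent with the same sigma, and that beta = 25/6. The function name and the gamma value 25/300 are illustrative assumptions.

```python
import math


def pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf(x: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sigma_after_games(n_games: int, gamma: float,
                      sigma0: float = 25.0 / 3.0,
                      beta: float = 25.0 / 6.0) -> float:
    """Track one player's sigma over n_games (hypothetical, simplified setup)."""
    sigma = sigma0
    for _ in range(n_games):
        # Assumed dynamics step: inflate the uncertainty before the game.
        sigma = math.sqrt(sigma ** 2 + gamma ** 2)
        # Two-player update, opponent assumed to have the same mu and sigma.
        c2 = 2.0 * beta ** 2 + 2.0 * sigma ** 2
        t = 0.0                       # (mu_winner - mu_loser) / c, equal mus here
        v = pdf(t) / cdf(t)
        w = v * (v + t)
        sigma = sigma * math.sqrt(1.0 - (sigma ** 2 / c2) * w)
    return sigma


with_drift = sigma_after_games(2000, gamma=25.0 / 300.0)
no_drift = sigma_after_games(2000, gamma=0.0)
print(f"gamma > 0: {with_drift:.3f}   gamma = 0: {no_drift:.3f}")
```

In this toy setup, sigma with gamma = 0 keeps creeping toward zero, while any gamma > 0 gives a floor where the per-game inflation and the per-game shrinkage balance out.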
« Last Edit: July 20, 2013, 06:31:05 am by yed »
Logged

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4381
    • View Profile
    • WanderingWinder YouTube Page
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #30 on: July 20, 2013, 08:25:50 am »
0

Quote
Kirian, your plaintive cries of "why" have straightforward answers.
I believe I am the one crying 'why'. But whilst these answers may be straightforward, that doesn't mean they are good.

Quote
You subtract some multiple of sigma because you need a confidence interval.
Uh, why? You really don't. Beyond this, you could use 50% confidence, which would get you to 0*sigma. But subtracting some multiple of sigma does *not* give you a confidence level anyway. It would be the bottom value of a CI - why don't you use the top? Except that this doesn't even work anyway, because there really isn't any evidence that the thing you are measuring is normally distributed - even if you are assuming this to be the case with no justification (as TrueSkill implicitly does.... this doesn't really make sense though).
Quote
You don't want false positives at the top of the leaderboard.
You've now switched from using the language of CIs to the language of hypothesis tests. But it doesn't really make sense to have a hypothesis test in this case. What is the hypothesis you are testing? The big issue with using a hypothesis test here is that you have thousands of players which each need measurement, which leads to thousands of different hypotheses, which are all related, and all the hypotheses you could purport to be testing off the top of my head are going to have *every* player fail the test... But what does 'false positive' even mean here? A player who shows up as good but isn't? In setting up the system this way, though, aren't you creating lots of false negatives? I mean, by cutting down alpha so low, you are way increasing beta? Or to look at it another way, you are saying people who haven't played are really bad players. As I look at this, there are 6905 or 6906 (can't tell on one because of rounding) players who are rated as being better than a new player, and only 7301 players total. This simply doesn't make sense.
Quote
If you have lots of false positives, it's not a leaderboard, it's a luckyboard, and everyone can recognize that and suggests blunt fixes by filtering out people who don't meet other criteria.
If this is a problem, then it is because your underlying system is a bad one. After all, it is your system which thinks that these players actually have whatever strength you are assigning them. Indeed, the system you have now isn't a leaderboard either so much as a 'be-sorta-good-but-most-of-all-play-a-lot board'.

Quote
That multiple is 3*sigma because you want a 99% confidence interval, which handwavily means that 1 player in the top 100 will be there by a lucky fluke.
This isn't correct at all. First, again, it assumes that each player's rating is normally distributed, which is wrong. More than that, even in that case it's actually a 99.73% confidence interval, and okay, that doesn't look like a big difference, but it swings things from 1 in 100 to 1 in 370. But the whole normal-distribution-on-a-rating thing is just so wrong: if it held, then with all players starting at mu = 25 and sigma = 25/3, about one in 370 players should have mu either over 50 or below zero (roughly one in 740 in each direction). In actuality, you don't have anyone over 37 or below 11, with several thousand players.
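
The 3-sigma figures can be checked with a few lines of Python. This is only arithmetic on the standard normal distribution; the helper names are made up for illustration.

```python
import math


def phi(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def two_sided_coverage(k: float) -> float:
    """Probability of landing within k standard deviations of the mean."""
    return phi(k) - phi(-k)


coverage = two_sided_coverage(3.0)     # about 0.9973
outside_either_tail = 1.0 - coverage   # both tails combined
outside_one_tail = 1.0 - phi(3.0)      # a single tail, e.g. only "above +3 sigma"

print(f"coverage: {coverage:.4%}")
print(f"either tail: 1 in {1.0 / outside_either_tail:.0f}")
print(f"one tail:    1 in {1.0 / outside_one_tail:.0f}")
```

So the "1 in 370" figure is for both tails together; a single tail (only above, or only below) is about half as likely.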

Quote
You could choose a different number, sure. 2*sigma would probably be acceptable. 1*sigma gets into the silly range. The 50*sigma that you trollishly suggest is deep, deep into the silly range and you know it -- such a leaderboard would not function at all.
But you're being excessively handwaving here. 2*sigma is fine why? 1 is silly how? In fact, with just 1 sigma, you are already getting to where these people you haven't heard of aren't at the very tip top, it doesn't seem to make all that much difference from 3 to me... And how is 50 deeply deeply silly? I don't understand why it's silly to have it at a low number, silly to have a really high number, but good to have it be 2-3 only. What is so special about those values? I truly don't understand. My best guess is that this is just the way it looks to you because you are used to it - you do 95% confidence intervals or 98% or 99% all the time, so you don't like using less than 2 sigma, and 50 sigmas looks weird to you because it's magically too high a standard for you. Well, I agree it's too high a standard, but you have to look at what you are measuring - it's not something where we want to give anything but our best guess.


The real issue is that subtracting or adding any level of sigma from the mean distorts what it is you are measuring. You are no longer measuring the strength of the results alone and reporting it, but rather a combination of the strength and the number of games played*. This means that to maximize your level, it is no longer so much about playing well when you play, but about playing a lot and playing pretty well. The higher number of sigma you subtract essentially shifts this function toward the playing more side (while adding would shift it to the playing less side). I am strongly of the opinion that what number you are reporting as your measurement should match inasmuch as possible the actual objective of what it is you are trying to measure, and as players play to win, we should be trying to measure their ability to win and not their ability to play a lot.

*technically sigma isn't a straight-up measurement of how many games get played, but there is a very high correlation there, such that by far the easiest way to affect your sigma is to play a lot.
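
To make the effect of the multiplier concrete, here is a small sketch with two invented players. The names and numbers are hypothetical; only the mu - k*sigma formula comes from the discussion.

```python
# (name, mu, sigma): a frequent player with a tight estimate,
# and a newer player with a higher mu but a wider estimate.
players = [
    ("veteran", 27.0, 1.0),
    ("newcomer", 30.0, 3.0),
]


def rank(players, k):
    """Return names ordered by mu - k*sigma, best first."""
    ordered = sorted(players, key=lambda p: p[1] - k * p[2], reverse=True)
    return [name for name, _mu, _sigma in ordered]


for k in (0, 1, 3):
    print(k, rank(players, k))
```

With k = 0 or k = 1 the newcomer is ranked first; with k = 3 the veteran is. For these two players the order flips at k = 1.5, which is the sense in which a larger k shifts the ranking toward whoever has played enough to shrink sigma.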
« Last Edit: July 20, 2013, 10:33:41 am by WanderingWinder »
Logged

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4381
    • View Profile
    • WanderingWinder YouTube Page
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #31 on: July 20, 2013, 08:41:40 am »
0

It does play a little into my mu-3*sigma point, insofar as the difference between 1.0 and 0.75 is basically nothing to the system, whereas the .75 difference in mu that it is equating to is pretty significant.
Your first listing of straight mu sort is obviously not the most desirable thing ever, but I actually don't see how the second is only 'a little more palatable': to me, this IS the leaderboard. I don't know about cutting off at 30 games - I would probably cut off at sigma of 1.5 or 1 or something. But basically, yeah, that's what I would go with.

Quote
The problem is that the leaderboard means different things to different people. Using the ATP as an example, Andy Murray has been playing very well over the past year, but he didn't play in the French Open this year, so he can't be #1. This doesn't mean you shouldn't expect him to be able to beat Djokovic in the US Open, it just means he hasn't done enough to deserve to be #1, based on the meaning of the ATP rankings. The rankings are not predictive, they are accomplishment-based.

Some people like it this way, and some people don't. Psychologically, sorting by mu makes it kind of crazy. People are already upset at how much one game affects your Goko rating, so I shudder to think how they'll feel about a leaderboard sorted by mu. However, the people who play the most often don't really want to be rewarded for simply playing more games, but only for playing better, so these people may tend to prefer to sort by mu.

Maybe a good compromise would be to provide both? Have the leaderboard be the standard Trueskill thing, but make it sortable by mu, so people can see that if they want?
The ATP is not a good comparison. Here's why:
For one thing, their ranking system is really terrible - it treats all events within the last year absolutely equally (well, okay, no, it gives more weight to more important tournaments; my point is that *when* the events happened doesn't matter - yesterday is given the same weight as 50 weeks ago), and then throws out everything before that. So, if Andy Murray loses every match he plays from now until next year's Wimbledon, and then he gets to the finals of Wimbledon next year but loses to Djokovic or Nadal (or whatever the best player is at that point), his ranking will DECREASE on the basis of that result, because it was worse than he did this year. It also takes only round into account, and not opponents. So the guy who beat Nadal in the early rounds this year got no more credit for that than someone beating random world number 58. (There are other issues with the system that aren't really relevant here, such as that it makes no differentiation based on surface).

But the big point of your post here is this: "The rankings are not predictive, they are accomplishment-based." Well, the problem with this is, what is an accomplishment? For Tennis, it sort of makes sense to do things as you suggest, because while they are 'open' tournaments, they don't really just let anyone play. But the biggest thing is, you measure the goal. Well, in tennis, it's not just about winning the highest percentage of matches against the best players you can. Oh sure, that's somewhat in there, but really you are trying to win Grand Slams, which is much more important, and then you are trying to win whatever other tournament, that's less important, etc. The point is that not all events are equally important, and so you want to take that into account when doing the rankings. That's not really true of Dominion, and moreover, not in a way which matters: In tennis, playing 20 events a year may well make you more tired, which can inhibit your ability to do as well in them. In Dominion, the fatigue factor... well, maybe it's not non-existent, but it's much less.

But most importantly along these lines, what is accomplishment? I would say that it's having good winning rates much more so than winning a lot. The thing about accomplishment systems is that everything is positive, you don't lose rating. But I think losing 50 games and winning 2 should be rated worse than just winning any one of those same games, no?

Oh, it's also a misunderstanding if you think that sorting by mu will make the ratings swingier - it really won't.
« Last Edit: July 20, 2013, 10:34:08 am by WanderingWinder »
Logged

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4381
    • View Profile
    • WanderingWinder YouTube Page
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #32 on: July 20, 2013, 09:00:50 am »
0

Quote
There seem to be three reasonable options:
  • Sort by mu with a cutoff based on variance or number of games
  • Sort by mu-k*sigma for some k between 1 and 3
  • Implement the isotropic leaderboard's algorithm and be done with it
I'm pretty sure that nobody wants the fourth option, sorting by mu with no cutoff.
I actually do want option number four. The problem with it is only that the system is probably quite wrong. Playing one game, no matter how well you do, shouldn't get you to high enough of a rating that you would be rated like number one on the leaderboard. I mean, that's actually an empirical question - does this make for a better rating system than the alternative or not? You all react against the mu sort because you think it should probably be worse. Well, I tend to agree with that line of thought, but it's an empirical question, and if it is one which we are right on, it actually just means the entire rating system is bad. Well, I suppose I would probably prefer an 'active' leaderboard, such that you fall off after a certain period of inactivity, but still I definitely want a mu sort.

Quote
If I understand correctly, the purpose of sorting by mu-k*sigma is the same as that of implementing a cutoff. In both cases, the goal is to keep the top of the leaderboard from being filled up by mediocre players who have been lucky in a small number of games. Either option deviates from a rating system's one truly objective goal: estimating the probability that any given player beats another.
Well, I don't actually think this is the goal of either, particularly of the mu-k*sigma sort, where I think the goal is to spur more playing. But you're quite right on the goal of a rating system, and this is really why I would want a mu sort - you are expected to be better than every player below you and worse than every player above you. This isn't the case with the current system.

Quote
Microsoft Research appears to advocate the mu-k*sigma approach,
If they have any research saying this, it's market research. Seriously, they put this in their general information about the system, but you don't see it in the scholarly papers, and there's really no statistical backing for it.
Quote
but they don't take a strong stance on what k should be used. Using any k>0 means sorting players by a deliberate underestimate of their actual skill, but the degree of that underestimate varies with k. With k=3, a player's rank derives from the skill level that we're 99% confident is below their actual skill. To me, that seems a little excessive and possibly unfair to new players. This is what the leaderboard on drunkensailor.org is doing now, and it seems to be what Goko does as well.
We actually still don't really know what Goko does. For sure they have some uncertainty thing such that playing more helps your rating, but for all I know it actually gets folded into a single rating number and not separated out as a mu and sigma kind of thing.

Quote
Isotropic used k=1.
Actually, iso used k = 3. You might be confused because the numbers they showed were mu+/-3*sigma, so it looks like they just subtracted the two numbers. But the second number displayed was 3*sigma, not just sigma.
Quote
In other words, a player's rank derived from the skill level that was 84% (?) certain to be below their actual skill level.
Except that this assumes that players' skills are normally distributed, which isn't true. But I've covered this.
Quote
This is still conservative, but not nearly as brutal to new/lucky players as mu-3*sigma. Iso also used an unusually high starting uncertainty: sigma=mu instead of sigma=mu/3. I'm not sure what the motivation for this was, but it explains why Iso had "levels" as high as 53 and as low as -35, while mine runs from 29 to -3.
As I've explained above, you have this wrong, because they displayed 3sigma and not sigma. But actually yours running from -3 to 29 is not something in your favor - if the ratings were actually normally distributed, you would have, based on your number of players, a much bigger range (of mu!) than you do.
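
For reference, the confidence levels that go with each k can be computed directly, if (and only if) you accept the normal assumption disputed above. The function name is made up for illustration.

```python
import math


def lower_bound_confidence(k: float) -> float:
    """P(true skill > mu - k*sigma) if skill ~ Normal(mu, sigma): a one-sided bound."""
    return 0.5 * (1.0 + math.erf(k / math.sqrt(2.0)))


for k in (1, 2, 3):
    print(f"k = {k}: {lower_bound_confidence(k):.2%}")
```

Under that assumption k = 1 gives about 84%, k = 2 about 97.7%, and k = 3 about 99.87%, so the "99%" quoted for k = 3 is a round-down of the one-sided figure.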

Quote
Finally, note that sigma appears to converge to 0.80 with my standard TrueSkill implementation. A great many of the experienced players have ratings between 0.79 and 0.82, and the lowest sigma in all of Goko is 0.78. On Iso, uncertainties never seem to have gotten below 6.5, and they didn't converge nearly as uniformly.
They don't actually converge to .80. It's just that it's very hard to get lower than that by playing the way people actually do. For that, I would have to look, but you would either need higher draw rates, or you'd need to do something like play very weak players a lot and win a lot. But it has to do with how their updating equations work, and basically there's enough uncertainty in the game that you can't get lower than this. I don't think they should ever go down to 0 though, because you really can't ever get totally sure of what someone's skill is with no uncertainty. Anyway, iso's were higher because they incremented upward a little bit with every day that passed (and with every game? I can't recall exactly), which meant that to get them very low, you not only needed to do what you need to do for your system, but you needed to play a heckuva lot, all the time.

Anyway, the real thing to me is, the proof is in the pudding. You go with the system that best measures things, and the only way we have of telling this is based on the predictions, so you go with the thing which best predicts things. Since you are actually only making predictions centred on the value of mu, that is what you should be sorting by.
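
A sketch of the usual TrueSkill-style head-to-head prediction, to illustrate the point about mu. It assumes the standard two-player formula with no draws and beta = 25/6; the function name and the example ratings are made up.

```python
import math


def win_probability(mu_a: float, sigma_a: float,
                    mu_b: float, sigma_b: float,
                    beta: float = 25.0 / 6.0) -> float:
    """P(A beats B) = Phi((mu_a - mu_b) / sqrt(2*beta^2 + sigma_a^2 + sigma_b^2))."""
    c = math.sqrt(2.0 * beta ** 2 + sigma_a ** 2 + sigma_b ** 2)
    t = (mu_a - mu_b) / c
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


# Hypothetical players: A has the higher mu but also the higher sigma.
p = win_probability(30.0, 3.0, 27.0, 1.0)
print(f"P(A beats B) = {p:.3f}")
```

In this formula the favourite is always whoever has the higher mu. The sigmas only pull the probability toward 50%; they never flip who is predicted to win, which is why a mu sort lines up with the predictions and a mu - k*sigma sort need not.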



Edit: Incidentally, it has been suggested at points in the past that I have made such comments in a way which is self-serving. This is pretty clearly not the case here. Relative to other players (if I have counted correctly), this change would help me relative to 8 players, no change relative to 139 besides myself and hurt me relative to 7153.
« Last Edit: July 20, 2013, 09:08:20 am by WanderingWinder »
Logged

Awaclus

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 11809
  • Shuffle iT Username: Awaclus
  • (´。• ω •。`)
  • Respect: +12848
    • View Profile
    • Birds of Necama
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #33 on: July 20, 2013, 11:03:55 am »
0

Quote
While I certainly haven't put as much thought into this as some of you other guys have, you know, Awaclus's suggestion doesn't seem half bad to me.
It was AI's suggestion, not mine. I just voted for it.
Logged
Bomb, Cannon, and many of the Gunpowder cards can strongly effect gameplay, particularly in a destructive way

The YouTube channel where I make music
Download my band's Creative Commons albums for free

markusin

  • Cartographer
  • *****
  • Offline Offline
  • Posts: 3846
  • Shuffle iT Username: markusin
  • I also switched from Starcraft
  • Respect: +2437
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #34 on: July 20, 2013, 11:45:00 am »
0

If what WW says is true, that skill is not normally distributed, then there is little that can be done to accurately model win rates without doing a complete overhaul of the rating system. I don't know much about statistical analysis, but this question has me thinking.

The thing about Dominion is that there is so much inherent randomness, and that randomness is assumed to fit into a normal distribution, but then the skill is also assumed to be normally distributed. Is that correct? Think of games that are almost all pure skill or strength, like Starcraft or arm-wrestling. In those games, a player flat out wins against a much weaker player, barring exceptional circumstances (sickness maybe?). The players of games like that have to be very close in skill for there to even be a contest. So for those games, mu is much more informative about who will win than sigma. Things get complicated because skill/strength clearly decreases without continuous practice/training, so some sort of rating drift seems appropriate.

The normal distribution just seems more convenient than anything else. Both randomness and rating drift can be lumped into sigma, and that seems less controversial than the alternatives like decreasing mu or whatever. Applying another model would just be guesswork without empirical data, and so far the normal distribution doesn't seem to be THAT off. I think the normal distribution mainly has problems with predicting the win rate when the rating gap gets too large.
Logged

rrenaud

  • Administrator
  • *****
  • Offline Offline
  • Posts: 991
  • Uncivilized Barbarian of Statistics
  • Respect: +1197
    • View Profile
    • CouncilRoom
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #35 on: July 20, 2013, 01:39:34 pm »
+5

All models are wrong, some models are useful.
Logged

markusin

  • Cartographer
  • *****
  • Offline Offline
  • Posts: 3846
  • Shuffle iT Username: markusin
  • I also switched from Starcraft
  • Respect: +2437
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #36 on: July 20, 2013, 01:51:19 pm »
0

So long as a model helps predict outcomes given inputs and makes sense, it's doing what it's supposed to.
Logged

dondon151

  • 2012 US Champion
  • *
  • Offline Offline
  • Posts: 2522
  • Respect: +1856
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #37 on: July 20, 2013, 03:16:54 pm »
+1

ITT: whining
Logged

Polk5440

  • Torturer
  • *****
  • Offline Offline
  • Posts: 1708
  • Respect: +1788
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #38 on: July 20, 2013, 03:39:21 pm »
+2

Quote
The normal distribution just seems more convenient than anything else.

Well, and you can try to hide behind central limit theorems, even if they don't necessarily apply....
Logged

markusin

  • Cartographer
  • *****
  • Offline Offline
  • Posts: 3846
  • Shuffle iT Username: markusin
  • I also switched from Starcraft
  • Respect: +2437
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #39 on: July 20, 2013, 09:57:53 pm »
0

I forgot to say this: It's great that you went out of your way to make this. Awesome work, Andrew Iannaccone.
Logged

Titandrake

  • Mountebank
  • *****
  • Offline Offline
  • Posts: 2210
  • Respect: +2854
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #40 on: July 22, 2013, 01:18:31 am »
+1

So, you're saying I'm only 4 levels behind Stef?

Guys I finally broke the equivalent of Iso level 40!
Logged
I have a blog! It's called Sorta Insightful. Check it out?

SCSN

  • Mountebank
  • *****
  • Offline Offline
  • Posts: 2227
  • Respect: +7140
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #41 on: July 22, 2013, 02:36:52 am »
+1

Quote
So, you're saying I'm only 4 levels behind Stef?

Guys I finally broke the equivalent of Iso level 40!

I'm still 4 steps behind my iso glory days as level 33 :(
Logged

hsiale

  • Duke
  • *****
  • Offline Offline
  • Posts: 383
  • Respect: +244
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #42 on: July 22, 2013, 05:34:14 am »
+3

Quote
I vote for

3. Implement the isotropic leaderboard's algorithm and be done with it
+1. That algorithm was good enough and people are used to it. Reimplementing it means I can compare how I play now to how I played on Iso. Currently I have no idea how my level on the official Goko leaderboard or the unofficial one compares to the low-20ish level that was my Iso peak.
Logged

ragingduckd

  • Board Moderator
  • *
  • Offline Offline
  • Posts: 1059
  • Respect: +3527
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #43 on: July 22, 2013, 07:47:38 am »
+8

Ok, I've reached a conclusion. I'm wimping out.

I'm still interested in what sort of rating system is best for Dominion, but I don't feel qualified to answer that question. For now, what I can definitely do is give the peoples something they want and something that's better than Goko.

So please vote on which system you prefer. If there are two or three options you like equally, you can vote for all of them. I'm not going to implement multiple views, as I think that's just inviting chaos.

Incidentally, I found a way to see the full Goko rating data, or at least your mu and sigma together. It turns out that the rating they show is actually mu - 2 * sigma.
Logged
Salvager Extension | Isotropish Leaderboard | Game Data | Log Search & other toys | Salvager Bug Reports

Salvager not working for me at all today. ... Please help! I can't go back to playing without it like an animal!

SCSN

  • Mountebank
  • *****
  • Offline Offline
  • Posts: 2227
  • Respect: +7140
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #44 on: July 22, 2013, 08:03:55 am »
+7

Quote
Incidentally, I found a way to see the full Goko rating data, or at least your mu and sigma together. It turns out that the rating they show is actually mu - 2 * sigma.

I'm very curious as to which part of that is too complicated to plop into a formula. Is it the subtraction, or perhaps the multiplication by two?
Logged

Polk5440

  • Torturer
  • *****
  • Offline Offline
  • Posts: 1708
  • Respect: +1788
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #45 on: July 22, 2013, 09:33:24 am »
+1

Quote
Incidentally, I found a way to see the full Goko rating data, or at least your mu and sigma together. It turns out that the rating they show is actually mu - 2 * sigma.

I'm very curious as to which part of that is too complicated to plop into a formula. Is it the subtraction, or perhaps the multiplication by two?

Goko already told us this directly.

It's the updating of mu and sigma that they are not explicitly revealing.

Edit: Since this post they added the drift downward for inactivity.
« Last Edit: July 22, 2013, 09:34:47 am by Polk5440 »
Logged

sudgy

  • Cartographer
  • *****
  • Offline Offline
  • Posts: 3431
  • Shuffle iT Username: sudgy
  • It's pronounced "SOO-jee"
  • Respect: +2706
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #46 on: July 22, 2013, 02:12:46 pm »
0

I would vote for Iso's rating system.
Logged
If you're wondering what my avatar is, watch this.

Check out my logic puzzle blog!

   Quote from: sudgy on June 31, 2011, 11:47:46 pm

WanderingWinder

  • Adventurer
  • ******
  • Offline Offline
  • Posts: 5275
  • ...doesn't really matter to me
  • Respect: +4381
    • View Profile
    • WanderingWinder YouTube Page
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #47 on: July 22, 2013, 04:01:39 pm »
0

Quote
If what WW says is true, that skill is not normally distributed, then there is little that can be done to accurately model win rates without doing a complete overhaul of the rating system. I don't know much about statistical analysis, but this question has me thinking.

The thing about Dominion is that there is so much inherent randomness, and that randomness is assumed to fit into a normal distribution, but then the skill is also assumed to be normally distributed. Is that correct? Think of games that are almost all pure skill or strength, like Starcraft or arm-wrestling. In those games, a player flat out wins against a much weaker player, barring exceptional circumstances (sickness maybe?). The players of games like that have to be very close in skill for there to even be a contest. So for those games, mu is much more informative about who will win than sigma. Things get complicated because skill/strength clearly decreases without continuous practice/training, so some sort of rating drift seems appropriate.

The normal distribution just seems more convenient than anything else. Both randomness and rating drift can be lumped into sigma, and that seems less controversial than the alternatives like decreasing mu or whatever. Applying another model would just be guesswork without empirical data, and so far the normal distribution doesn't seem to be THAT off. I think the normal distribution mainly has problems with predicting the win rate when the rating gap gets too large.

Not exactly. I mean, whatever distribution you're going to use is going to calibrate, so that essentially (this is a bit of a simplification) it will be accurate for your average matchup. Then, the curves aren't going to be *that* far off in the region of your matchup. I could spout to you a whole bunch of plausible curves that for their mid-sections are within half a percent of each other. And you're not going to really notice that - you probably won't even notice the difference between 55% and 58% very much, or at least certainly not very quickly. By far the biggest differences you are going to get though are in the further out matchups, where one player is heavily favored.

Anyway, long story short, the curve can be reasonably bad overall and still perform pretty well for the majority of matchups you're in, but the CIs break down very quickly if you aren't normal.
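
A small numerical illustration of that point, comparing two plausible win-probability curves: a normal CDF and a logistic curve scaled to have the same slope at an even matchup. The choice of these two curves, the scaling, and the helper names are assumptions made for the example, not anything Goko or Iso is known to use.

```python
import math

# Scale chosen so both curves have the same slope at x = 0 (an even matchup).
LOGISTIC_SCALE = 4.0 / math.sqrt(2.0 * math.pi)


def normal_curve(x: float) -> float:
    """Win probability for a skill gap of x (in standardized units), normal model."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def logistic_curve(x: float) -> float:
    """Win probability for the same gap under a slope-matched logistic model."""
    return 1.0 / (1.0 + math.exp(-LOGISTIC_SCALE * x))


for gap in (0.25, 0.5, 3.0):
    n, l = normal_curve(gap), logistic_curve(gap)
    print(f"gap {gap}: normal {n:.4f}  logistic {l:.4f}  "
          f"underdog ratio {(1 - l) / (1 - n):.1f}x")
```

For gaps up to about half a unit the two curves agree to within a fraction of a percentage point. At a gap of 3 they disagree about the underdog's chances by a factor of several, even though both favourite probabilities still look like "about 99%".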


Also, @AI/ragingduckd or everyone really: I strongly think that the way to sort is just by mu, but for all that... it's not really that important, and I am anyway pretty happy with it so long as you continue to show it broken up, so that everyone can get out of it what they want, either way you end up sorting.

Quote
Incidentally, I found a way to see the full Goko rating data, or at least your mu and sigma together. It turns out that the rating they show is actually mu - 2 * sigma.

Interesting. How does it compare to what you have up?

mail-mi

  • Saboteur
  • *****
  • Offline Offline
  • Posts: 1298
  • Shuffle iT Username: mail-mi
  • Come play some Forum Mafia with us!
  • Respect: +1364
    • View Profile
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #48 on: July 22, 2013, 06:37:31 pm »
0

How do I find myself? Is there a search engine for names? That would be awesome.
Logged
I currently imagine mail-mi wearing a dark trenchcoat and a bowler hat, hunched over a bit, toothpick in his mouth, holding a gun in his pocket.  One bead of sweat trickling down his nose.

'And what is it that ye shall hope for? Behold I say unto you that ye shall have hope through the atonement of Christ and the power of his resurrection, to be raised unto life eternal, and this because of your faith in him according to the promise." - Moroni 7:41, the Book of Mormon

rrenaud

  • Administrator
  • *****
  • Offline Offline
  • Posts: 991
  • Uncivilized Barbarian of Statistics
  • Respect: +1197
    • View Profile
    • CouncilRoom
Re: Isotropish Leaderboard (alternative to Goko Pro)
« Reply #49 on: July 22, 2013, 07:07:03 pm »
+6

The other thing to consider is that if you are dumping data into an HTML table, it's really easy to provide options to sort by various columns. I used this sorttable JavaScript library a lot in councilroom.

Maybe the best answer to "which way to sort" is "who cares? Just click whichever column you like".
Logged