Dominion Strategy Forum

Archive => 2016 DominionStrategy Championships => Archive => 2012 => Topic started by: jonts26 on December 03, 2012, 01:09:40 pm

Title: Win Probability Calculator
Post by: jonts26 on December 03, 2012, 01:09:40 pm
Hey, I was bored and I'm a huge nerd so I made a series win probability calculator in excel. It's still mostly functional in google docs and open office if you don't have excel.

Just plug in each player's mean skill and who goes first. Also you can adjust the first turn advantage parameter.

Title: Re: Win Probability Calculator
Post by: D Bo on December 03, 2012, 01:10:33 pm
Wow, that is super-nerdy. I like it!
Title: Re: Win Probability Calculator
Post by: WanderingWinder on December 03, 2012, 01:31:31 pm
You're telling me that you know the underlying TrueSkill WE algorithm, and it's just a logistic curve?
Title: Re: Win Probability Calculator
Post by: DStu on December 03, 2012, 01:35:35 pm
You're telling me that you know the underlying TrueSkill WE algorithm, and it's just a logistic curve?

WE?
Title: Re: Win Probability Calculator
Post by: WanderingWinder on December 03, 2012, 01:37:48 pm
Win Expectancy. Apologies, should have written that out.


I am developing a rating system... now(?)... mostly stemming from my experience with Dominion, Chess, and a smattering of other games. Hopefully an improvement.
Title: Re: Win Probability Calculator
Post by: jonts26 on December 03, 2012, 01:39:28 pm
Win Expectancy. Apologies, should have written that out.


I am developing a rating system... now(?)... mostly stemming from my experience with Dominion, Chess, and a smattering of other games. Hopefully an improvement.

Logistic curve as I discussed here: http://forum.dominionstrategy.com/index.php?topic=5193.msg127303#msg127303
Title: Re: Win Probability Calculator
Post by: toaster on December 03, 2012, 03:31:15 pm
Nice work.  Next step: start running tournament simulations!   ;)
Title: Re: Win Probability Calculator
Post by: zahlman on December 03, 2012, 05:07:52 pm
Wait, is "logistic" a real name for a real class of functions? o_O

(I wish Noscript's XSS filtering didn't play havoc with Wolfram Alpha links...)
Title: Re: Win Probability Calculator
Post by: DStu on December 03, 2012, 05:11:12 pm
http://en.wikipedia.org/wiki/Logistic_function

:e Actually, I thought TS was Gaussian?
Title: Re: Win Probability Calculator
Post by: greatexpectations on December 03, 2012, 05:23:43 pm
Actually, I thought TS was Gaussian?

yup. (http://research.microsoft.com/en-us/projects/trueskill/details.aspx)
Title: Re: Win Probability Calculator
Post by: rrenaud on December 03, 2012, 05:25:35 pm
If it doesn't have a variance parameter in it, it's definitely not trueskill.

The logistic and cumulative normal functions are pretty similar looking.

(http://upload.wikimedia.org/wikipedia/commons/thumb/8/88/Logistic-curve.svg/600px-Logistic-curve.svg.png)

(http://upload.wikimedia.org/wikipedia/commons/thumb/c/ca/Normal_Distribution_CDF.svg/720px-Normal_Distribution_CDF.svg.png)
Title: Re: Win Probability Calculator
Post by: DStu on December 03, 2012, 05:28:04 pm
The logistic and cumulative normal functions are pretty similar looking.
Yeah. both increasing, and reaching from 0 to 1. :P
And symmetric, to add something non-tautological...
Title: Re: Win Probability Calculator
Post by: zahlman on December 03, 2012, 05:39:18 pm
Basically, they're both sigmoids (which is what I'd have called the logistic function if I looked at it and didn't know the exact formula).
Title: Re: Win Probability Calculator
Post by: DStu on December 03, 2012, 05:45:09 pm
But the Gaussian tail decays much faster, like ~exp(-x^2), while the logistic tail decays like ~exp(-x).
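
A quick numerical sketch of the tail behaviour described above, using the standard (unscaled) logistic and normal distributions. The function names are made up for illustration, and the two curves are not matched in scale here; the point is only how fast the upper tail P(X > x) shrinks.

```python
import math


def upper_tail_logistic(x):
    """P(X > x) for the standard logistic distribution: 1 / (1 + e^x)."""
    return 1.0 / (1.0 + math.exp(x))


def upper_tail_normal(x):
    """P(X > x) for the standard normal distribution, via erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def tail_ratio(x):
    """How many times heavier the logistic tail is than the normal tail at x."""
    return upper_tail_logistic(x) / upper_tail_normal(x)


for x in (1.0, 2.0, 4.0, 6.0, 8.0):
    print(x, upper_tail_logistic(x), upper_tail_normal(x), tail_ratio(x))
```

The ratio keeps growing as x increases, which is the exp(-x) versus exp(-x^2) difference in practice.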
Title: Re: Win Probability Calculator
Post by: SirPeebles on December 03, 2012, 05:48:29 pm
It's odd really.  I know logistic functions well from teaching so many semesters of differential equations, but I never used them elsewhere when I was a student, nor do I ever use them in my research.  I hope that some of you use this stuff, and that it's more than just a contrived plaything for my students.
Title: Re: Win Probability Calculator
Post by: rrenaud on December 03, 2012, 06:38:14 pm
Maybe this picture is better?  It's inverse CDF of normal vs inverse of logistic, scaled at p = .5.

(http://upload.wikimedia.org/wikipedia/commons/thumb/3/39/Logit-probit.svg/300px-Logit-probit.svg.png)

http://en.wikipedia.org/wiki/Logit#Comparison_with_probit
Quote
Closely related to the logit function (and logit model) are the probit function and probit model. The logit and probit are both sigmoid functions with a domain between 0 and 1, which makes them both quantile functions, i.e. inverses of the cumulative distribution function (CDF) of a probability distribution. In fact, the logit is the quantile function of the logistic distribution, while the probit is the quantile function of the normal distribution. The probit function is denoted Φ^(-1)(x), where Φ(x) is the CDF of the normal distribution, as just mentioned:

Φ(x) = (1/sqrt(2π)) ∫ from -∞ to x of exp(-y^2/2) dy

As shown in the graph, the logit and probit functions are extremely similar, particularly when the probit function is scaled so that its slope at y=0 matches the slope of the logit. As a result, probit models are sometimes used in place of logit models because for certain applications (e.g. in Bayesian statistics) implementation of them is easier.
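
To put a rough number on "extremely similar", here is a small sketch that compares the logistic CDF with a normal CDF rescaled so that both have the same slope at the centre. It works with the CDFs rather than their inverses, and the names `PROBIT_SCALE` and `max_gap` are hypothetical, chosen only for this example.

```python
import math

# The logistic CDF has slope 1/4 at x = 0; Phi(x / s) has slope 1 / (s * sqrt(2*pi)).
# Matching the two gives s = 4 / sqrt(2*pi), roughly 1.596.
PROBIT_SCALE = 4.0 / math.sqrt(2.0 * math.pi)


def logistic_cdf(x):
    return 1.0 / (1.0 + math.exp(-x))


def normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def scaled_normal_cdf(x):
    """Normal CDF stretched so its slope at 0 equals the logistic's."""
    return normal_cdf(x / PROBIT_SCALE)


def max_gap(lo=-8.0, hi=8.0, steps=1601):
    """Largest absolute difference between the two curves on an even grid."""
    width = (hi - lo) / (steps - 1)
    points = (lo + i * width for i in range(steps))
    return max(abs(logistic_cdf(x) - scaled_normal_cdf(x)) for x in points)


print(max_gap())
```

The absolute gap stays small everywhere, even though the relative difference in the far tails does not.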
Title: Re: Win Probability Calculator
Post by: jonts26 on December 03, 2012, 07:08:17 pm
So I'm not intimately familiar with the trueskill workings.

Does anyone know how the beta parameter for the game (25 in this case) interacts with individual player variances? Certainly I could just use the CDF of the difference of the individual players' normal curves, but I'm pretty sure the individual variances don't account for the inherent randomness of the game, only for the uncertainty in the players' actual skill values.

Anyway, that's why I left them out.

Also I used the logistic distribution instead of the normal because that one paper I linked in the other post said it did a better job of prediction, but that's as much research as I did.
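
On the beta question, a hedged sketch: in the usual description of the TrueSkill model, each player's performance in a game is their skill plus Gaussian noise with standard deviation beta, so beta and the two skill uncertainties all add in variance. Ignoring draws, that gives the formula below. The function name is made up, beta = 25 is the value quoted above, and the skill numbers in the usage lines are invented.

```python
import math


def normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def trueskill_win_probability(mu_a, sigma_a, mu_b, sigma_b, beta):
    """P(A beats B) when performance = skill + N(0, beta^2) noise, no draws.

    The performance difference is Gaussian with mean mu_a - mu_b and
    variance 2*beta^2 + sigma_a^2 + sigma_b^2.
    """
    spread = math.sqrt(2.0 * beta ** 2 + sigma_a ** 2 + sigma_b ** 2)
    return normal_cdf((mu_a - mu_b) / spread)


# Hypothetical players: same means, with and without skill uncertainty.
print(trueskill_win_probability(40.0, 0.0, 30.0, 0.0, beta=25.0))
print(trueskill_win_probability(40.0, 8.0, 30.0, 8.0, beta=25.0))
```

Larger individual variances pull the prediction toward 50%, while beta sets how much a given skill gap matters in a single game.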
Title: Re: Win Probability Calculator
Post by: DStu on December 03, 2012, 07:11:55 pm
Really don't like that statement of wikipedia. Of course in the middle, they can be fitted to be quite similar, but  they behave differently quite soon at the edges. Especially if you want to do some Bayesian statistics, where the more "extreme" regions of the distribution can come into play quite easily, you'd better not swap the distribution just because it's easier to compute.

Assuming, of course, that you chose the original distribution in your model for some better reason than its being easy to compute. But that happens with the Gaussian quite often...
Title: Re: Win Probability Calculator
Post by: rrenaud on December 03, 2012, 08:01:37 pm
Kaggle ran a chess prediction competition, and the winner's entry was based on TrueSkill.

http://people.few.eur.nl/salimans/chess.html

But he won basically because he milked side information (the data contained #matches per participant.  High #matches came from advancing further in tournaments, doh!, see WW's post next).

It's not clear to me that either the logistic or the normal distribution is more natural than the other for player skills, though. Nor that either is a particularly good fit.

Many uses of the logistic I've seen are motivated because it has a super easy to compute derivative, which makes it easy to optimize with gradient descent.
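
For reference, the "easy derivative" is the identity s'(x) = s(x) * (1 - s(x)), so the gradient can be computed from the function value alone. A minimal sketch that checks the identity against a finite difference (helper names are made up for illustration):

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Closed form: s'(x) = s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def numerical_derivative(f, x, h=1e-6):
    """Central finite difference, used here only as a sanity check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


for x in (-3.0, 0.0, 0.7, 4.0):
    print(x, sigmoid_derivative(x), numerical_derivative(sigmoid, x))
```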
Title: Re: Win Probability Calculator
Post by: WanderingWinder on December 03, 2012, 08:19:07 pm
Kaggle ran a chess prediction competition, and the winner's entry was based on TrueSkill.

http://people.few.eur.nl/salimans/chess.html

But he won basically because he milked side information (the data contained #matches per participant.  High #matches came from advancing further in tournaments, doh!).

It's not clear to me that either the logistic or the normal distribution is more natural than the other for player skills, though. Nor that either is a particularly good fit.

Many uses of the logistic I've seen are motivated because it has a super easy to compute derivative, which makes it easy to optimize with gradient descent.

I don't think you understood that correctly: it wasn't so much that people with more games went further in tournaments, but that people who played stronger opponents were likely to have done better - particularly if way outrated, it probably means that the event is a swiss that your player has done well in, hence overpredict him a bit. (Incidentally, if you know much about chess, there are very few knockout tournaments...). On the other hand, his closest competitors all did some version of the same thing, too.
The method he used was not deemed feasible as a practicable rating system for FIDE (the world chess federation) to use.
(I am already acquainted with this stuff from my knowledge of the chess world; for a more personable write-up, see say this article: http://www.chessbase.com/newsdetail.asp?newsid=7277 )

Logistic seems much more natural to me than normal - you can get logistic out of the Odds Ratio really simply, and this is not too hard to give some justification for. Normal... I don't see any particular reasoning for. Again, not that either are super hot in all contexts.
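
One way to read the odds-ratio argument, as a sketch under an assumption: suppose each player has a positive "strength" and the odds of A beating B are strength_a : strength_b. If ratings are defined as the log of strength, the win probability is exactly a logistic function of the rating difference. The strength model and function names here are illustrative, not something specified in this thread.

```python
import math


def win_prob_from_strengths(strength_a, strength_b):
    """Odds of strength_a : strength_b, turned into a probability for A."""
    return strength_a / (strength_a + strength_b)


def win_prob_from_ratings(rating_a, rating_b):
    """Logistic curve in the rating difference, with rating = log(strength)."""
    return 1.0 / (1.0 + math.exp(-(rating_a - rating_b)))


# A is three times as strong as B: odds 3:1.
print(win_prob_from_strengths(3.0, 1.0))
print(win_prob_from_ratings(math.log(3.0), math.log(1.0)))
```

Both calls describe the same matchup, so they should agree up to floating-point rounding.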
Title: Re: Win Probability Calculator
Post by: SirPeebles on December 03, 2012, 08:33:46 pm
Really don't like that statement of wikipedia. Of course in the middle, they can be fitted to be quite similar, but  they behave differently quite soon at the edges.

Even the math discussions devolve into bickering about edge cases  ::)
Title: Re: Win Probability Calculator
Post by: Kirian on December 03, 2012, 09:16:54 pm
Off-topic but possibly important for theory:  I can't see the attachment in Chrome.  I was able to download it in Firefox, however.
Title: Re: Win Probability Calculator
Post by: shark_bait on December 03, 2012, 09:20:45 pm
Highlight the text and move your cursor to the right.
Title: Re: Win Probability Calculator
Post by: Kirian on December 03, 2012, 09:33:52 pm
Highlight the text and move your cursor to the right.

Highlight which text?  (Not that it matters much, the bug is in SMF I'm sure, and not particular to this forum).

This calculator really highlights some cool things!

1. The necessity of a multiple-round matchup.  My advantage in my first round only moves from 76.6% to 74.5% when going from 50% (really, "none") first-person advantage to 70% first-person advantage.

2. The effects of loser-goes-first.  For fun, try changing the first person advantage to something really low, like 0.2, and then look at your graph.  It becomes horribly skewed, as the winner has the advantaged position each time!
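
For anyone who wants to play with these effects outside the spreadsheet, here is a rough sketch of a series calculation with loser-goes-first. It is not a reconstruction of the attached calculator. Assumptions: `p_skill` is A's win chance with no seat advantage, `first_turn_adv` is the chance the first player wins between equal players, the two are combined by adding log-odds, and ties are ignored. All names are hypothetical.

```python
import math
from functools import lru_cache


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def game_win_prob(p_skill, first_turn_adv, a_goes_first):
    """A's chance in one game: skill log-odds shifted by the seat log-odds."""
    shift = logit(first_turn_adv)
    return sigmoid(logit(p_skill) + (shift if a_goes_first else -shift))


def series_win_prob(p_skill, first_turn_adv, wins_needed, a_starts):
    """Chance that A reaches wins_needed first; the loser starts the next game."""

    @lru_cache(maxsize=None)
    def win(a_wins, b_wins, a_goes_first):
        if a_wins == wins_needed:
            return 1.0
        if b_wins == wins_needed:
            return 0.0
        p = game_win_prob(p_skill, first_turn_adv, a_goes_first)
        # If A wins, B lost and goes first next; if A loses, A goes first next.
        return p * win(a_wins + 1, b_wins, False) + (1.0 - p) * win(a_wins, b_wins + 1, True)

    return win(0, 0, a_starts)


# Hypothetical numbers: A is a 60% favourite, first to 4 wins.
for adv in (0.5, 0.7, 0.2):
    print(adv, series_win_prob(0.6, adv, 4, True))
```

Setting `first_turn_adv` below 0.5 makes going first a handicap, so under loser-goes-first the winner of each game keeps the better seat.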
Title: Re: Win Probability Calculator
Post by: shark_bait on December 03, 2012, 09:35:38 pm
Highlight the OP text,

Quote
Hey, I was bored and I'm a huge nerd so I made a series win probability calculator in excel. It's still mostly functional in google docs and open office if you don't have excel.

Just plug in each player's mean skill and who goes first. Also you can adjust the first turn advantage parameter.

And then bring your cursor to the right.  You should get the attachment from chrome.
Title: Re: Win Probability Calculator
Post by: rrenaud on December 04, 2012, 12:57:37 am
More research:

A logistic distribution based ranking system that outperforms trueskill on the trueskill data sets.  (I skipped the math and only skimmed the paper).

http://jmlr.csail.mit.edu/papers/volume12/weng11a/weng11a.pdf

Quote
In fact, most currently used Elo variants
for chess data use a logistic distribution rather than Gaussian because it is argued that weaker players have significantly greater winning chances than the Gaussian model predicts.

which is consistent with the exp(-x^2) dropping off faster than exp(-x).
Title: Re: Win Probability Calculator
Post by: DStu on December 04, 2012, 03:51:09 am
Really don't like that statement of wikipedia. Of course in the middle, they can be fitted to be quite similar, but  they behave differently quite soon at the edges.

Even the math discussions devolve into bickering about edge cases  ::)

1) If you don't care about the edge cases, it's not math.
2) Bayesian statistics is not an edge case, but the example explicitly given in the wiki article.

Note of course I'm not talking about rating systems here. I have no idea, and have not thought about, which distribution would fit better in this context; if WW says it's logistic, I would just believe it. I don't see any reason why it should be Gaussian, as I don't think you are in a regime for the Central Limit Theorem here.


Edit: What comes next is not really well thought out:
If what remains of my understanding of TS is right, I think the model lacks a parameter anyway.  You have the mean skill, OK, and the system's uncertainty about your skill.  It's maybe reasonable to assume a Gaussian for the uncertainty.  But the same distribution is also used to get the win probabilities given the mean and uncertainty (or is it?), and I don't see any reason why one should do that.
Title: Re: Win Probability Calculator
Post by: SirPeebles on December 04, 2012, 07:30:42 am
Really don't like that statement of wikipedia. Of course in the middle, they can be fitted to be quite similar, but  they behave differently quite soon at the edges.

Even the math discussions devolve into bickering about edge cases  ::)

1) If you don't care about the edge cases, it's not math.
2) Bayesian statistics is not an edge case, but the example explicitly given in the wiki article.

Note of course I'm not talking about rating systems here. I have no idea, and have not thought about, which distribution would fit better in this context; if WW says it's logistic, I would just believe it. I don't see any reason why it should be Gaussian, as I don't think you are in a regime for the Central Limit Theorem here.


Edit: What comes next is not really well thought out:
If what remains of my understanding of TS is right, I think the model lacks a parameter anyway.  You have the mean skill, OK, and the system's uncertainty about your skill.  It's maybe reasonable to assume a Gaussian for the uncertainty.  But the same distribution is also used to get the win probabilities given the mean and uncertainty (or is it?), and I don't see any reason why one should do that.

It was a joke.  You were literally discussing the edges in the snippet I quoted.
Title: Re: Win Probability Calculator
Post by: WanderingWinder on December 04, 2012, 08:21:34 am
Note of course I'm not talking about rating systems here. I have no idea, and have not thought about, which distribution would fit better in this context; if WW says it's logistic, I would just believe it. I don't see any reason why it should be Gaussian, as I don't think you are in a regime for the Central Limit Theorem here.

To clarify here, I'm not saying that logistic necessarily fits better (though every study I've seen says that it does....). I'm saying that if you have no data and were making a wild guess by pulling a distribution out of thin air, you might well pick logistic, and there's more reason to pick logistic than Gaussian. Having said that, what you *actually* want to do, of course, is look at the available data and see what both has some logical reasoning behind it and, very importantly, actually fits the data.

To jump off of this, does anyone have some technical skills and/or knowledge of the isotropic data to be able to grab the results of a bunch of games from a given time period? (What I am looking for is ID player 1, ID player 2..., ID player N, and number of points scored by each. Rating would be nice if possible. Dates of game being played would be even nicer. Chronology of games played would be nicest, but probably overkill. And actually I only care about 2 player games here, but I ought to be able to filter that myself). I would like to do some statistical testing if possible....
Title: Re: Win Probability Calculator
Post by: DStu on December 04, 2012, 08:54:41 am
It was a joke.  You were literally discussing the edges in the snippet I quoted.

It was too early to understand jokes.
Quote
To jump off of this, does anyone have some technical skills and/or knowledge of the isotropic data to be able to grab the results of a bunch of games from a given time period? (What I am looking for is ID player 1, ID player 2..., ID player N, and number of points scored by each. Rating would be nice if possible. Dates of game being played would be even nicer. Chronology of games played would be nicest, but probably overkill. And actually I only care about 2 player games here, but I ought to be able to filter that myself). I would like to do some statistical testing if possible....
Do you want to give some names and get the stats for them, or do you want to get all the IDs for all players in a given time?
Title: Re: Win Probability Calculator
Post by: ipofanes on December 04, 2012, 09:15:32 am
It's odd really.  I know logistic functions well from teaching so many semesters of differential equations, but I never used them elsewhere when I was a student, nor do I ever use them in my research.  I hope that some of you use this stuff, and that it's more than just a contrived plaything for my students.

Next time you can tell your students that the logistic function is used day in, day out in statistical analysis. I don't know anything about TrueSkill, but the rehashed Elo number uses the logistic distribution: http://en.wikipedia.org/wiki/Elo_rating_system#Implementing_Elo.27s_scheme

Also, genetic linkage: http://en.wikipedia.org/wiki/Genetic_linkage#LOD_score_method_for_estimating_recombination_frequency
Title: Re: Win Probability Calculator
Post by: Kirian on December 04, 2012, 11:33:48 am
Highlight the OP text,

Quote
Hey, I was bored and I'm a huge nerd so I made a series win probability calculator in excel. It's still mostly functional in google docs and open office if you don't have excel.

Just plug in each player's mean skill and who goes first. Also you can adjust the first turn advantage parameter.

And then bring your cursor to the right.  You should get the attachment from chrome.

Got nuttin'.  Weird.

Edited:  Ohhhhhh.  You meant click-and-drag to the right.  That's quite a bit different.

Also, genetic linkage: http://en.wikipedia.org/wiki/Genetic_linkage#LOD_score_method_for_estimating_recombination_frequency

Since I'm not seeing the word logistic in there, are you willing to explain how OD/LOD is related to the logistic function?  I did linkage analysis for about a year and a half and always simply treated it as a Bayesian probability analysis.  (Granted, the word Bayesian isn't in that article either, which makes me less than impressed with that WP article).
Title: Re: Win Probability Calculator
Post by: DStu on December 04, 2012, 12:08:50 pm
Since I'm not seeing the word logistic in there, are you willing to explain how OD/LOD is related to the logistic function?  I did linkage analysis for about a year and a half and always simply treated it as a Bayesian probability analysis.  (Granted, the word Bayesian isn't in that article either, which makes me less than impressed with that WP article).

Without clicking the link:
It's still Bayesian, you just have different underlying distributions in your model.
Title: Re: Win Probability Calculator
Post by: Kirian on December 04, 2012, 01:04:20 pm
Since I'm not seeing the word logistic in there, are you willing to explain how OD/LOD is related to the logistic function?  I did linkage analysis for about a year and a half and always simply treated it as a Bayesian probability analysis.  (Granted, the word Bayesian isn't in that article either, which makes me less than impressed with that WP article).

Without clicking the link:
It's still Bayesian, you just have different underlying distributions in your model.

OK... I think I see what it is.  Doing genetics work I didn't really think of there being a "distribution" per se.  I guess the distribution is a distribution of [recombination frequency](x, y) over all pairs (x, y)?  But I can only imagine that would be exponentially distributed.
Title: Re: Win Probability Calculator
Post by: Watno on December 04, 2012, 01:07:27 pm
So happy I chose not to take the Introduction to Probability Theory lecture.
Title: Re: Win Probability Calculator
Post by: WanderingWinder on December 04, 2012, 01:27:21 pm
It was a joke.  You were literally discussing the edges in the snippet I quoted.

It was too early to understand jokes.
Quote
To jump off of this, does anyone have some technical skills and/or knowledge of the isotropic data to be able to grab the results of a bunch of games from a given time period? (What I am looking for is ID player 1, ID player 2..., ID player N, and number of points scored by each. Rating would be nice if possible. Dates of game being played would be even nicer. Chronology of games played would be nicest, but probably overkill. And actually I only care about 2 player games here, but I ought to be able to filter that myself). I would like to do some statistical testing if possible....
Do you want to give some names and get the stats for them, or do you want to get all the IDs for all players in a given time?
I need all the IDs - can be names, can be some ID number, I don't know what iso gives. But if I want to make an overall good system, I need all of them, not just some players. Or at least all with over some minimum threshold of games.
Title: Re: Win Probability Calculator
Post by: DStu on December 04, 2012, 01:57:49 pm
It was a joke.  You were literally discussing the edges in the snippet I quoted.

It was too early to understand jokes.
Quote
To jump off of this, does anyone have some technical skills and/or knowledge of the isotropic data to be able to grab the results of a bunch of games from a given time period? (What I am looking for is ID player 1, ID player 2..., ID player N, and number of points scored by each. Rating would be nice if possible. Dates of game being played would be even nicer. Chronology of games played would be nicest, but probably overkill. And actually I only care about 2 player games here, but I ought to be able to filter that myself). I would like to do some statistical testing if possible....
Do you want to give some names and get the stats for them, or do you want to get all the IDs for all players in a given time?
I need all the IDs - can be names, can be some ID number, I don't know what iso gives. But if I want to make an overall good system, I need all of them, not just some players. Or at least all with over some minimum threshold of games.
And probably you want the results of the match of ID1 vs ID2, or just the (relative) number of wins of ID1?
Title: Re: Win Probability Calculator
Post by: WanderingWinder on December 04, 2012, 02:00:53 pm
It was a joke.  You were literally discussing the edges in the snippet I quoted.

It was too early to understand jokes.
Quote
To jump off of this, does anyone have some technical skills and/or knowledge of the isotropic data to be able to grab the results of a bunch of games from a given time period? (What I am looking for is ID player 1, ID player 2..., ID player N, and number of points scored by each. Rating would be nice if possible. Dates of game being played would be even nicer. Chronology of games played would be nicest, but probably overkill. And actually I only care about 2 player games here, but I ought to be able to filter that myself). I would like to do some statistical testing if possible....
Do you want to give some names and get the stats for them, or do you want to get all the IDs for all players in a given time?
I need all the IDs - can be names, can be some ID number, I don't know what iso gives. But if I want to make an overall good system, I need all of them, not just some players. Or at least all with over some minimum threshold of games.
And probably you want the results of the match of ID1 vs ID2, or just the (relative) number of wins of ID1?
I don't understand your question. I want the results of every game, most preferably in this format:
ID1       ID2       Number of wins for ID1
dummy1    dummy2    1
dummyA    dummyB    0
dummy2    dummyA    0.5

etc. etc.           
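
As a sketch of how data in that layout could be consumed, assuming whitespace-separated columns and a third column giving ID1's result (1 for a win, 0 for a loss, 0.5 for a tie). `SAMPLE`, `parse_results`, and `total_scores` are hypothetical names, and the rows are the dummy rows from the post.

```python
from collections import defaultdict

SAMPLE = """\
ID1       ID2       Number of wins for ID1
dummy1    dummy2    1
dummyA    dummyB    0
dummy2    dummyA    0.5
"""


def parse_results(text):
    """Return (id1, id2, score_for_id1) tuples, skipping rows that don't fit."""
    games = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # header or malformed row
        id1, id2, score = parts
        games.append((id1, id2, float(score)))
    return games


def total_scores(games):
    """Sum each player's points, giving ID2 the complement of ID1's score."""
    totals = defaultdict(float)
    for id1, id2, score in games:
        totals[id1] += score
        totals[id2] += 1.0 - score
    return dict(totals)


print(total_scores(parse_results(SAMPLE)))
```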
Title: Re: Win Probability Calculator
Post by: DStu on December 04, 2012, 02:08:10 pm
It was a joke.  You were literally discussing the edges in the snippet I quoted.

It was too early to understand jokes.
Quote
To jump off of this, does anyone have some technical skills and/or knowledge of the isotropic data to be able to grab the results of a bunch of games from a given time period? (What I am looking for is ID player 1, ID player 2..., ID player N, and number of points scored by each. Rating would be nice if possible. Dates of game being played would be even nicer. Chronology of games played would be nicest, but probably overkill. And actually I only care about 2 player games here, but I ought to be able to filter that myself). I would like to do some statistical testing if possible....
Do you want to give some names and get the stats for them, or do you want to get all the IDs for all players in a given time?
I need all the IDs - can be names, can be some ID number, I don't know what iso gives. But if I want to make an overall good system, I need all of them, not just some players. Or at least all with over some minimum threshold of games.
And probably you want the results of the match of ID1 vs ID2, or just the (relative) number of wins of ID1?
I don't understand your question.
Obviously you don't need to, because you nevertheless answered it...
Title: Re: Win Probability Calculator
Post by: rrenaud on December 04, 2012, 02:14:03 pm
This used to have a data set of size 1.2 million with player names, winning margin, and the kingdom supply.

http://forum.dominionstrategy.com/index.php?topic=20.msg438#msg438

The dataset link isn't going to work now, but I can find some other place to put the data if you'd actually use it.
Title: Re: Win Probability Calculator
Post by: Kirian on December 04, 2012, 04:39:59 pm
This used to have a data set of size 1.2 million with player names, winning margin, and the kingdom supply.

http://forum.dominionstrategy.com/index.php?topic=20.msg438#msg438

The dataset link isn't going to work now, but I can find some other place to put the data if you'd actually use it.

How big is the dataset?  If it's <~30MB, I can host it.
Title: Re: Win Probability Calculator
Post by: ipofanes on December 06, 2012, 06:51:49 am
Also, genetic linkage: http://en.wikipedia.org/wiki/Genetic_linkage#LOD_score_method_for_estimating_recombination_frequency

Since I'm not seeing the word logistic in there, are you willing to explain how OD/LOD is related to the logistic function?  I did linkage analysis for about a year and a half and always simply treated it as a Bayesian probability analysis.

You can transform the term such that the only factor depending on the random variate, R, is (theta/(1-theta))^R, and there you have the logistic term.
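
A small sketch of that transformation, assuming the usual binomial form of the likelihood with R recombinants out of N offspring: theta^R * (1-theta)^(N-R) = (1-theta)^N * (theta/(1-theta))^R. The log of the last factor is R * logit(theta), and the logit is the inverse of the logistic function. Function names are made up, and theta must lie strictly between 0 and 1.

```python
import math


def likelihood(theta, recombinants, total):
    """Binomial likelihood kernel: theta^R * (1 - theta)^(N - R)."""
    return theta ** recombinants * (1.0 - theta) ** (total - recombinants)


def likelihood_factored(theta, recombinants, total):
    """Same quantity, written so R appears only in the odds term."""
    odds = theta / (1.0 - theta)
    return (1.0 - theta) ** total * odds ** recombinants


def lod_score(theta, recombinants, total):
    """log10 of the likelihood at theta relative to no linkage (theta = 0.5)."""
    return math.log10(likelihood(theta, recombinants, total) / likelihood(0.5, recombinants, total))


# Hypothetical data: 2 recombinants among 20 offspring.
print(likelihood(0.1, 2, 20), likelihood_factored(0.1, 2, 20))
print(lod_score(0.1, 2, 20))
```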

Quote
  (Granted, the word Bayesian isn't in that article either, which makes me less than impressed with that WP article).

There is a straightforward frequentist interpretation of the LOD score. The customary -3 limit for the lod score directly corresponds with a test level, I am currently too dumb to calculate which one.