Topic: Probability applied to one aspect of "luck vs skill" (Read 5008 times)

blueblimp · « **on:** November 14, 2014, 10:20:50 pm »

(Edit: Added a paragraph to clarify what I mean by "high impact". Added another footnote.)

I'm assuming you've seen Richard Garfield's excellent talk about why luck and skill are not necessarily in conflict: https://www.youtube.com/watch?v=dSg408i-eKw. Ben Brode (one of Hearthstone's designers) also said some words about this recently at Blizzcon: https://www.youtube.com/watch?v=TDUiLhK2JIs#t=7m25s.

I'd like to focus specifically on one benefit of luck in games: the possibility for a bad player to beat a good player. This is good to prevent new players from becoming discouraged. It turns out that if you look at this from a mathematical perspective, it helps clarify what exactly the goal is here and which forms of luck help achieve it. (Not all kinds of randomness help achieve this goal!) I know there are a handful of math nerds on this forum, so you may enjoy this take.

My view is that there are fundamentally two goals here, which initially appear in conflict but actually aren't:

Accessibility: A bad player should have at least some base chance to beat a good player in a single match. Let's say 20% for concreteness.
Discrimination: When two good players play each other, the outcome of a match should be mostly determined by how well they play in that match.

These can be formulated mathematically as follows. Consider the function w that maps skill difference to win probability. "Accessibility" wants w to be bounded below by 20%. "Discrimination" wants w to have a large derivative at zero. These can both be satisfied by a sigmoid function (http://en.wikipedia.org/wiki/Sigmoid_function) that is appropriately scaled and translated [1].
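To make the shape of w concrete, here is a minimal sketch of one such scaled and translated sigmoid. The logistic form, the function names, and the `steepness` parameter are illustrative assumptions, not something the argument depends on; any sigmoid squeezed into the range [floor, 1 - floor] would do.

```python
import math

def win_probability(skill_diff, floor=0.2, steepness=4.0):
    """Hypothetical w: a logistic curve rescaled to stay within [floor, 1 - floor].

    skill_diff > 0 means the player is stronger than the opponent.
    """
    # logistic(x) written via tanh so that extreme inputs cannot overflow
    logistic = 0.5 * (1.0 + math.tanh(0.5 * steepness * skill_diff))
    return floor + (1.0 - 2.0 * floor) * logistic

def slope_at_zero(floor=0.2, steepness=4.0):
    """Derivative of win_probability at skill_diff = 0 (the 'discrimination')."""
    # the plain logistic has slope steepness / 4 at zero
    return (1.0 - 2.0 * floor) * steepness / 4.0

if __name__ == "__main__":
    for d in (-5.0, -1.0, -0.1, 0.0, 0.1, 1.0, 5.0):
        print(f"skill_diff={d:+.1f}  w={win_probability(d):.3f}")
```

Raising `steepness` increases discrimination without ever pushing w below the floor, which is the sense in which the two goals are not in conflict.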

Now that we know what we want w to look like, there is some trickiness in how to achieve it. First, think about what happens if each source of randomness in the game is low-impact. If there are just a few, they won't significantly affect the outcome in any case (failing to bound w below), and if there are many of these, then by the central limit theorem, w looks like a normal CDF, which although sigmoidal drops to near-zero very quickly and so can't be made to be bounded below by 20% like we wanted [2]. Instead of bounding w below for all skill differences, an alternative approach could be to make the normal distribution's variance so high that for all practical skill differences w is bounded below, but then the randomness will dominate matches between skilled players, failing "discrimination".
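The trade-off in the many-small-sources case can be put in numbers. The sketch below assumes the simplest possible model, where the summed luck is normal noise with standard deviation `noise_sd` added to the skill difference, so that w is a normal CDF. The function names and the skill scale are illustrative assumptions.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def normal_win_probability(skill_diff, noise_sd):
    """w under the assumed model: P(skill_diff + Normal(0, noise_sd) > 0)."""
    return STD_NORMAL.cdf(skill_diff / noise_sd)

def noise_needed_for_floor(max_skill_gap, floor=0.2):
    """Smallest noise_sd that gives the weaker player `floor` at the largest gap."""
    # inv_cdf(floor) is negative for floor < 0.5, so the result is positive
    return -max_skill_gap / STD_NORMAL.inv_cdf(floor)

def normal_slope_at_zero(noise_sd):
    """Derivative of w at skill_diff = 0, i.e. the discrimination."""
    return 1.0 / (noise_sd * math.sqrt(2.0 * math.pi))

if __name__ == "__main__":
    gap = 3.0  # hypothetical largest skill gap we care about
    print("w at the gap with noise_sd=1:", normal_win_probability(-gap, 1.0))
    sd = noise_needed_for_floor(gap)
    print("noise_sd needed for a 20% floor:", sd)
    print("slope at zero, noise_sd=1 vs needed:",
          normal_slope_at_zero(1.0), normal_slope_at_zero(sd))
```

Under this model, the only way to lift the tail is to increase `noise_sd`, and that flattens the curve at zero by the same factor.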

That means a small number of sources of high-impact randomness are necessary to be both accessible and discriminative. I believe that in all games that are sufficiently accessible and discriminative in this sense, you can find examples of this kind of randomness. For example, in Dominion, a crucial opening buy missing the first shuffle can sometimes put you so far behind that it's very difficult to win even against a player who makes relatively many misplays. Note though that if it's usually possible to recover from the high-impact randomness with skilled play, then it wasn't high-impact enough! To truly promote accessibility, the impact has to be so strong that even optimal play from you and bad play from your opponent still has them winning often.

To be clear, by "high impact" I mean something more specific than simply "high variance". I mean a source of randomness that with small probability produces a large swing in win probability, and the rest of the time the swing is negligible. High-impact randomness in this sense is inherently high variance, but as an example of high-variance randomness that isn't high impact in this sense, consider flipping a coin to decide the game outcome. In the coin-flip case, there is no possibility to not have a huge swing in win probability. (I called this kind of randomness lotteries/anti-lotteries in an earlier post: http://forum.dominionstrategy.com/index.php?topic=11910.msg432264#msg432264.)

This is all fine for matches between players of mismatched skill. The better player will still usually win, but will get unlucky often enough that the worse player can pick up some wins too. It also doesn't prevent matches between players of similar skill from being decided by skill the 80% of the time that the high-impact randomness doesn't decide the match. In a tournament, using a best-of-X (BoX) format can amplify this to discriminate between similarly skilled players with high probability.
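As a rough illustration of the BoX amplification, the sketch below computes the chance of winning a best-of-N series from a per-game win probability. It assumes that games are independent and that the per-game probability stays fixed across the series, which real matches only approximate; the function name is made up for this example.

```python
from math import comb

def series_win_probability(game_win_prob, num_games):
    """Chance of winning a best-of-num_games series (num_games must be odd).

    Assumes independent games with a constant per-game win probability.
    Playing out all games gives the same winner as stopping early, so a
    binomial tail sum is enough.
    """
    if num_games < 1 or num_games % 2 == 0:
        raise ValueError("num_games must be a positive odd number")
    wins_needed = num_games // 2 + 1
    return sum(
        comb(num_games, k)
        * game_win_prob ** k
        * (1.0 - game_win_prob) ** (num_games - k)
        for k in range(wins_needed, num_games + 1)
    )

if __name__ == "__main__":
    for n in (1, 3, 5, 7):
        print(f"Bo{n}: {series_win_probability(0.6, n):.3f}")
```

The longer the series, the more a small per-game edge shows up in the series result, while a single game keeps the accessibility that the weaker player needs.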

The trouble is that when two good players play each other, they don't want 20% of their games to be decided by a purely random element, because they want to feel like how they play matters in every game. This concern isn't about the win probability, which may be fine. The concern is that deciding a match by variance of goodness of play feels better than deciding a match by variance of goodness of luck, even if the win probability is equal. An example from Dominion is how Tournament can decide matches simply by who gets Followers, even if play up to that point was similarly skilled. I think this is where a lot of games go astray, falling into the pitfall of having high-impact randomness decide games between skilled players, such as in Hearthstone where Ramp Druid's strength varies wildly depending on whether Wild Growth is drawn early on.

One solution here is to only inject the high-impact randomness into games involving players of mismatched skill. How? In a strategy game, coax bad players into strategies that have a single source of high-impact randomness but poor win probability, and allow good players to choose strategies where the randomness has relatively low impact. When two good players face each other, as long as their skill is similar enough, it's optimal for them to choose the strategies that will cause the match to always be decided mostly by variance of goodness of play.

With good players, because they are skilled, no special effort is needed to encourage them to pick appropriate strategies. But how can bad players be persuaded to pick a strategy that's best for them, given that their lack of skill makes it hard to evaluate the strength of strategies? One way to accomplish this is with "trap cards" (cards which appear better than they are) and "fun cards" (cards that are simply fun to play regardless of how good they are), which bad players initially like and good players don't mind skipping to win more. By injecting a single high-impact source of randomness into such a card, bad players automatically get a strategy that will win them some proportion of games against good players. For example, in Dominion, unassisted Treasure Map can fulfill this purpose in some kingdoms, by looking better than it is, being fun through the theme, and sometimes winning games through fluke collision. The key is that good players won't choose to play unassisted Treasure Map, so for their games the high-impact randomness of that strategy won't be relevant. Compare with low-assistance Tournament, which is a strong strategy and has a single high-impact source of randomness (who connects Province first).

I intentionally didn't touch on the many other uses of randomness in games. Randomness definitely has a place in matches between skilled players too, just not single-source high-impact randomness. I also didn't say much about actually achieving the ideal of high variance of player skill within matches, which is arguably required to make matches between good players interesting. I think that's some combination of the intuitive concepts of "variety" and "depth", and is one reason that Dominion's random kingdom mechanic keeps the game interesting for thousands of plays.

Hopefully the analysis above helps with understanding why some randomness feels good and other randomness feels bad. It has for me, at least.

Footnotes:

[1] Think about what w is for Richard Garfield's toy example of Rando Chess. It's an extreme case of exactly this sort of sigmoid function, essentially a step function from 1/6 to 5/6. That's one reason that I think this model captures part of what he refers to as high-luck high-skill games. Not all high-luck high-skill games fall into this category though: a single hand of poker is relatively low-discrimination.

[2] TrueSkill uses a normal distribution in its model, so maybe TrueSkill can't simultaneously be a good model of the discrimination of a game and the accessibility of a game. If the matches are exclusively between players of similar skill, then TS can still do okay because the accessibility is mostly irrelevant.

qmech · « **Reply #1 on:** November 15, 2014, 07:24:53 am »

This is an excellent post.

pacovf · « **Reply #2 on:** November 15, 2014, 08:07:20 am »

Quote from: qmech on November 15, 2014, 07:24:53 am

This is an excellent post.

A truly excellent post, indeed.

Somehow, this would have been better if it had been posted by a general with massive sideburns, though.

pingpongsam · « **Reply #3 on:** November 15, 2014, 10:28:43 am »

Best thing I've read on here in a long while.

A Drowned Kernel · « **Reply #4 on:** November 15, 2014, 10:51:12 am »

Here it is, Adam. Scientific proof that Black Market is a fun card.

assemble_me · « **Reply #5 on:** November 15, 2014, 11:21:20 am »

Randomly +1ed the OP. Watching Garfield's thing about skill vs. luck first.

ipofanes · « **Reply #6 on:** November 19, 2014, 06:03:25 am »

Quote from: A Drowned Kernel on November 15, 2014, 10:51:12 am

Here it is, Adam. Scientific proof that Black Market is a fun card.

Only if it is a bad strategy in the long run, while it can lopside the occasional game. In a similar vein, I'm more inclined to buy Treasure Map against stronger opponents in Dominion, or obscure coffee house gambits against stronger opponents in Chess. For details, see the How to trap heffalumps chapter in the late Simon Webb's Chess for Tigers.

timchen · « **Reply #7 on:** November 22, 2014, 03:19:22 am »

Very interesting, lots of insights.

I think there is one more kind of "randomness" which helps discriminate in games between high-skill players, yet gives accessibility.

It is actually pretty similar to your idea of "trap choices", being probably only quantitatively different. It is just that this "trap choice" need not be that inferior to the optimal choice, and the variance is inherent but cancelled by the same choices among experts.

I am thinking in terms of duplicate bridge. For simplicity let me talk about a very easy example: if you have a 9-card suit without queen, experienced players know the percentage play is to play the Ace and king hoping to drop the queen. A slightly inferior play will be to take a finesse. However, the difference in success chance is small (less than 5%), so you can say the chance for each play to win out is pretty much 50-50. This means large variance and accessibility: an expert has no way to guarantee a win against a newbie via this strategic choice. On the other hand, if both players are experienced it is very likely that they will make the same optimal play, and completely cancel this large variance.

The difference between this and the "trap choice" is just that the suboptimal choice is not drastically worse, which also implies there is higher intrinsic variance, which is avoided by optimal choices. To get to the 80-20 or any desirable accessibility bottom line, just have a few choices like this in a single match.
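For anyone who wants to check the bridge arithmetic, here is a hedged sketch of the standard a priori calculation. It assumes declarer cashes one top honor first and then chooses between the drop and a finesse through West, that the bidding and play give no information about the defenders' hands, and that layouts with a singleton queen or a 4-0 break are handled the same way by both lines, so only "West has Qxx" versus "East has Qx" separates them. The card labels and function names are made up for this example.

```python
from itertools import combinations
from math import comb

SMALL_CARDS = ("x1", "x2", "x3")
MISSING_CARDS = ("Q",) + SMALL_CARDS  # the four cards the defenders hold

def holding_probability(west_count):
    """Chance that West holds one specific set of west_count missing cards.

    The defenders hold 26 unknown cards, 13 each, 22 of them in other suits.
    """
    return comb(22, 13 - west_count) / comb(26, 13)

def compare_lines():
    """Return (P(only the drop works), P(only the finesse works))."""
    drop_only = 0.0
    finesse_only = 0.0
    for west_count in range(len(MISSING_CARDS) + 1):
        for west in combinations(MISSING_CARDS, west_count):
            east = [card for card in MISSING_CARDS if card not in west]
            p = holding_probability(west_count)
            if "Q" in east and len(east) == 2:
                drop_only += p      # East Qx, West xx: the queen falls
            elif "Q" in west and len(west) == 3:
                finesse_only += p   # West Qxx: only the finesse catches it
    return drop_only, finesse_only

if __name__ == "__main__":
    drop_only, finesse_only = compare_lines()
    print(f"drop wins where finesse loses:  {drop_only:.4f}")
    print(f"finesse wins where drop loses:  {finesse_only:.4f}")
    print(f"edge for the drop:              {drop_only - finesse_only:.4f}")
```

Under these assumptions the drop comes out ahead by only a small margin, which fits the point that two players who pick different lines face a nearly even gamble on that board.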

ipofanes · « **Reply #8 on:** November 24, 2014, 03:27:40 am »

Quote from: timchen on November 22, 2014, 03:19:22 am

I am thinking in terms of duplicate bridge.

This is where I started to think of colliding terminals.

Quote

For simplicity let me talk about a very easy example: if you have a 9-card suit without queen, experienced players know the percentage play is to play the Ace and king hoping to drop the queen. A slightly inferior play will be to take a finesse.

OT: How about dropping the Ace first, checking for a 0-4 distribution, and then still considering both options? I can't see where it could be worse than both other options.

timchen · « **Reply #9 on:** November 25, 2014, 06:03:31 am »

That part is implied. When you finesse, you also play your ace/king first. Otherwise the percentage difference is not that small (~12.5% more difference).

blueblimp · « **Reply #10 on:** November 26, 2014, 01:53:01 pm »

The duplicate bridge example is very interesting, since I don't know that game. If I understand right, the abstract idea is that when two players make the same choice, there is zero variance, but when they make different choices, there is large variance that slightly favours one of the two players. The different choice is therefore sub-optimal play for one of them, unless the advantage of increasing the variance outweighs the disadvantage of reducing the mean, with respect to win probability. Nice design.

On another note, something I didn't think of in the OP: in Garfield's talk, he argues (paraphrasing) that having discrimination that's too high can be bad, because if you have a limited pool of players to play against, you might not be able to find somebody close enough in skill to play a close game. (Even if the weaker player has some probability to win by resorting to high variance strategies, maybe they would prefer to play a more skill-oriented style.) So really, the amount of discrimination in a game should be tailored to the matchmaking situation.

timchen · « **Reply #11 on:** November 30, 2014, 04:20:46 am »

Yeah, that is pretty much what I mean. Zero variance is an idealization though; there are still choices in the game whose mean differences are hard to measure yet can lead to significantly different results. It's also possible that a strategic choice can directly lower your expectation on a single board yet increase your overall expectation for the whole match via psychological effects. In this case it's probably like rock-paper-scissors (but with different expectations for each) such that the optimal strategy needs intrinsic randomness and thus variance.

Dominion Strategy Forum
