(Edit: Added a paragraph to clarify what I mean by "high impact". Added another footnote.)
I'm assuming you've seen Richard Garfield's excellent talk about why luck and skill are not necessarily in conflict:
https://www.youtube.com/watch?v=dSg408i-eKw. Ben Brode (one of Hearthstone's designers) also said some words about this recently at Blizzcon:
https://www.youtube.com/watch?v=TDUiLhK2JIs#t=7m25s.
I'd like to focus specifically on one benefit of luck in games: the possibility for a bad player to beat a good player. This helps keep new players from becoming discouraged. It turns out that looking at this from a mathematical perspective helps clarify what exactly the goal is and which forms of luck help achieve it. (Not all kinds of randomness help achieve this goal!) I know there are a handful of math nerds on this forum, so you may enjoy this take.
My view is that there are fundamentally two goals here, which initially appear in conflict but actually aren't:
- Accessibility: A bad player should have at least some base chance to beat a good player in a single match. Let's say 20% for concreteness.
- Discrimination: When two good players play each other, the outcome of a match should be mostly determined by how well they play in that match.
These can be formulated mathematically as follows. Consider the function w that maps skill difference to win probability. "Accessibility" wants w to be bounded below by 20%. "Discrimination" wants w to have a large derivative at zero. These can both be satisfied by a sigmoid function (http://en.wikipedia.org/wiki/Sigmoid_function) that is appropriately scaled and translated [1].
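To make the shape of w concrete, here is a minimal sketch of one such scaled and translated sigmoid. The logistic curve, the names win_probability and slope_at_zero, and the steepness value are my own illustrative assumptions; only the 20% floor comes from the goals above.

```python
import math

def logistic(x):
    """Numerically stable logistic function, safe for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def win_probability(skill_diff, floor=0.20, steepness=4.0):
    """Hypothetical w: a logistic curve squeezed into [floor, 1 - floor].

    skill_diff is (my skill - opponent's skill) in arbitrary units.
    floor is the "accessibility" bound; steepness is an assumed knob
    that controls "discrimination".
    """
    return floor + (1.0 - 2.0 * floor) * logistic(steepness * skill_diff)

def slope_at_zero(floor=0.20, steepness=4.0):
    """Derivative of win_probability at skill_diff = 0."""
    return (1.0 - 2.0 * floor) * steepness / 4.0

if __name__ == "__main__":
    for d in (-3.0, -1.0, -0.1, 0.0, 0.1, 1.0, 3.0):
        print(f"skill_diff={d:+.1f}  w={win_probability(d):.3f}")
```

Raising steepness makes evenly matched games more sensitive to skill without moving the floor, which is the sense in which the two goals don't conflict.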
Now that we know what we want w to look like, there is some trickiness in how to achieve it. First, think about what happens if each source of randomness in the game is low-impact. If there are just a few, they won't significantly affect the outcome in any case (failing to bound w below). If there are many, then by the central limit theorem w looks like a normal CDF, which is sigmoidal but drops to near-zero very quickly, so it can't be bounded below by 20% like we wanted [2]. Instead of bounding w below for all skill differences, an alternative approach could be to make the normal distribution's variance so high that w stays above 20% for all practical skill differences, but then the randomness will dominate matches between skilled players, failing "discrimination".
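The trade-off in the normal-CDF case can be checked numerically. This is a rough sketch under my own assumptions: w is exactly a normal CDF of skill difference divided by a noise standard deviation, and the function names and the example skill gap of 3 units are hypothetical.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def clt_win_probability(skill_diff, noise_sd):
    """w when many low-impact random sources add up to normal noise."""
    return STANDARD_NORMAL.cdf(skill_diff / noise_sd)

def noise_sd_needed(max_skill_gap, floor=0.20):
    """Noise level at which a player max_skill_gap behind still wins `floor`."""
    return max_skill_gap / -STANDARD_NORMAL.inv_cdf(floor)

def slope_at_zero(noise_sd):
    """Derivative of clt_win_probability at skill_diff = 0."""
    return 1.0 / (noise_sd * math.sqrt(2.0 * math.pi))

if __name__ == "__main__":
    gap = 3.0  # assumed "largest practical" skill gap, arbitrary units
    print("low noise, w at -gap:", clt_win_probability(-gap, 1.0))
    sd = noise_sd_needed(gap)
    print("noise needed for a 20% floor at -gap:", sd)
    print("slope at zero, low noise:", slope_at_zero(1.0))
    print("slope at zero, high noise:", slope_at_zero(sd))
```

With low noise the weaker player's chances collapse. With enough noise to hold the floor, the slope at zero shrinks by the same factor the noise grew, which is the loss of discrimination described above.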
That means a small number of sources of high-impact randomness are necessary to be both accessible and discriminative. I believe that in all games that are sufficiently accessible and discriminative in this sense, you can find examples of this kind of randomness. For example, in Dominion, a crucial opening buy missing the first shuffle can sometimes put you so far behind that it's very difficult to win even against a player who makes relatively many misplays. Note though that if it's usually possible to recover from the high-impact randomness with skilled play, then it wasn't high-impact enough! To truly promote accessibility, the impact has to be so strong that even optimal play from you and bad play from your opponent still leaves them winning often.
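Here is a small simulation sketch of that claim. Everything in it is an assumption made for illustration: the many low-impact sources are modeled as small uniform nudges to a score margin, and the single high-impact source is modeled as an "upset" that flips the result with fixed probability, similar in spirit to the Rando Chess example in footnote [1]. The names and parameter values are hypothetical.

```python
import random

def simulate_win_rate(skill_diff, upset_chance, n_small_sources=30,
                      games=5000, seed=0):
    """Estimate w(skill_diff) for player A by simulation.

    skill_diff: A's skill minus the opponent's, in arbitrary units.
    upset_chance: probability that one high-impact event flips the result.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    wins = 0
    for _ in range(games):
        # Many low-impact sources: each nudges the final margin slightly.
        noise = sum(rng.uniform(-0.1, 0.1) for _ in range(n_small_sources))
        a_wins = skill_diff + noise > 0
        # One high-impact source: rarely fires, but decides the game when it does.
        if rng.random() < upset_chance:
            a_wins = not a_wins
        wins += a_wins
    return wins / games

if __name__ == "__main__":
    for d in (-2.0, -0.3, 0.0, 0.3, 2.0):
        plain = simulate_win_rate(d, upset_chance=0.0)
        mixed = simulate_win_rate(d, upset_chance=0.2)
        print(f"skill_diff={d:+.1f}  low-impact only={plain:.3f}  with upset={mixed:.3f}")
```

In this toy model the far-behind player goes from essentially never winning to winning about upset_chance of the time, while the curve near zero is only flattened by a factor of (1 - 2 * upset_chance).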
To be clear, by "high impact" I mean something more specific than simply "high variance". I mean a source of randomness that with small probability produces a large swing in win probability, and the rest of the time produces a negligible swing. High-impact randomness in this sense is inherently high variance, but as an example of high-variance randomness that isn't high impact in this sense, consider flipping a coin to decide the game outcome. In the coin-flip case, there is no way to avoid a huge swing in win probability. (I called this kind of randomness lotteries/anti-lotteries in an earlier post:
http://forum.dominionstrategy.com/index.php?topic=11910.msg432264#msg432264.)
This is all fine for matches between players of mismatched skill. The better player will still usually win, but will get unlucky often enough that the worse player can pick up some wins too. It also doesn't prevent matches between players of similar skill from being decided by skill the 80% of the time that the high-impact randomness doesn't decide the match. In a tournament, using a best-of-X (BoX) format can amplify this to discriminate between similarly skilled players with high probability.
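The BoX amplification is just a binomial tail. A minimal sketch, assuming games within a series are independent and the per-game win probability stays constant (the function name is mine):

```python
from math import comb

def series_win_probability(game_win_prob, best_of):
    """Probability of winning a best-of-`best_of` series.

    Equivalent to winning a majority if all games were played out.
    Assumes independent games with a constant per-game win probability.
    """
    if best_of < 1 or best_of % 2 == 0:
        raise ValueError("best_of must be a positive odd number")
    needed = best_of // 2 + 1
    p = game_win_prob
    return sum(
        comb(best_of, k) * p**k * (1.0 - p) ** (best_of - k)
        for k in range(needed, best_of + 1)
    )

if __name__ == "__main__":
    for n in (1, 3, 5, 7):
        print(f"Bo{n}: {series_win_probability(0.6, n):.3f}")
```

A slightly better player who wins 60% of single games wins a Bo3 about 64.8% of the time, and the edge keeps growing with longer series.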
The trouble is that when two good players play each other, they don't want 20% of their games to be decided by a purely random element, because they want to feel like how they play matters in every game. This concern isn't about the win probability, which may be fine. The concern is that deciding a match by variance of goodness of play feels better than deciding a match by variance of goodness of luck, even if the win probability is equal. An example from Dominion is how Tournament can decide matches simply by who gets Followers, even if play up to that point was similarly skilled. I think this is where a lot of games go astray, falling into the pitfall of having high-impact randomness decide games between skilled players, such as in Hearthstone where Ramp Druid's strength varies wildly depending on whether Wild Growth is drawn early on.
One solution here is to only inject the high-impact randomness into games involving players of mismatched skill. How? In a strategy game, coax bad players into strategies that have a single source of high-impact randomness but poor win probability, and allow good players to choose strategies where the randomness has relatively low impact. When two good players face each other, as long as their skill is similar enough, it's optimal for them to choose the strategies that will cause the match to always be decided mostly by variance of goodness of play.
With good players, because they are skilled, no special effort is needed to encourage them to pick appropriate strategies. But how can bad players be persuaded to pick a strategy that's best for them, given that their lack of skill makes it hard to evaluate the strength of strategies? One way to accomplish this is with "trap cards" (cards which appear better than they are) and "fun cards" (cards that are simply fun to play regardless of how good they are), which new players initially like and good players don't mind skipping to win more. By injecting a single high-impact source of randomness into such a card, bad players automatically get a strategy that will win them some proportion of games against good players. For example, in Dominion, unassisted Treasure Map can fulfill this purpose in some kingdoms, by looking better than it is, being fun through the theme, and sometimes winning games through fluke collision. The key is that good players won't choose to play unassisted Treasure Map, so for their games the high-impact randomness of that strategy won't be relevant. Compare with low-assistance Tournament, which is a strong strategy and has a single high-impact source of randomness (who connects Province first).
I intentionally didn't touch on the many other uses of randomness in games. Randomness definitely has a place in matches between skilled players too, just not single-source high-impact randomness. I also didn't say much about actually achieving the ideal of high variance of player skill within matches, which is arguably required to make matches between good players interesting. I think that's some combination of the intuitive concepts of "variety" and "depth", and is one reason that Dominion's random kingdom mechanic keeps the game interesting for thousands of plays.
Hopefully the analysis above helps with understanding why some randomness feels good and other randomness feels bad. It has for me, at least.
Footnotes:
[1] Think about what w is for Richard Garfield's toy example of Rando Chess. It's an extreme case of exactly this sort of sigmoid function, essentially a step function from 1/6 to 5/6. That's one reason that I think this model captures part of what he refers to as high-luck high-skill games. Not all high-luck high-skill games fall into this category though: a single hand of poker is relatively low-discrimination.
[2] TrueSkill uses a normal distribution in its model, so maybe TrueSkill can't simultaneously be a good model of the discrimination of a game and the accessibility of a game. If the matches are exclusively between players of similar skill, then TS can still do okay because the accessibility is mostly irrelevant.