Nash Equilibrium, Explained

Game theory · An equilibrium always exists (Nash, 1950)

Stable does not mean good: the trap rational players cannot escape alone.


The game is symmetric: both cooperating pays 3 each, both defecting pays 1 each, and being the lone cooperator pays 0. The slider sets T, the payoff for defecting while the other cooperates. In the matrix view, green arrows are Player 1's best replies and amber are Player 2's; a cell with every arrow pointing in is a Nash equilibrium. Slide T above 3 and defecting takes over: the only equilibrium becomes (Defect, Defect), worse for both than cooperating. Below 3 it is a stag hunt with two pure-strategy equilibria. The map view shows the same answers as the crossings of the two best-response curves.

Some choices only make sense in the light of what other people choose. Game theory is the maths of exactly those situations, and its most famous idea is the Nash equilibrium: a set of choices where nobody can do better by changing their own move alone, while everyone else keeps theirs.

The classic example is the prisoner's dilemma. Two suspects are questioned in separate rooms. Each can stay quiet or betray the other. If both stay quiet, they each get a light sentence. If one betrays while the other stays quiet, the betrayer walks free and the silent one takes the fall. If both betray, both get a medium sentence. Sit in either room and think it through: whatever the other person does, betraying gets you a better result. So both betray, and both end up worse off than if they had simply trusted each other and stayed quiet.

That "both betray" outcome is the Nash equilibrium. It is stable, because neither prisoner can improve things by switching alone. The uncomfortable lesson is that stable is not the same as good. Perfectly rational people, each doing the sensible thing, can get stuck in an outcome that is bad for all of them.

The mathematician John Nash proved that every finite game has at least one such equilibrium, though sometimes only if players are allowed to mix their choices at random. He shared the 1994 Nobel Memorial Prize in Economic Sciences for this work, and equilibria now turn up everywhere: traffic jams, price wars, arms races, even animals sharing a watering hole.

A game has players, each with a set of strategies, and a payoff for every player at every combination of choices. A choice of one strategy per player is a Nash equilibrium when every player is already playing a best response to the others: no single player can raise their own payoff by switching while everyone else stays put.

Reading the matrix. The widget shows a two-by-two game. Each cell holds two payoffs, one per player. The arrows are best responses: down each column the row player points toward their higher payoff, and along each row the column player points toward theirs. Wherever every arrow points into a cell and none leaves it, that cell is a Nash equilibrium. It is the spot from which neither player wants to move.
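
The arrow-reading rule can be sketched in a few lines of Python. This is only an illustration: the payoff table assumes the temptation payoff is set to 5, and the names `payoffs`, `row_best_replies`, `col_best_replies` and `pure_nash_equilibria` are made up for this example rather than taken from the widget.

```python
from itertools import product

# Assumed payoffs: the symmetric game from the text with temptation payoff 5.
# payoffs[(row_move, col_move)] = (row player's payoff, column player's payoff)
MOVES = ("Cooperate", "Defect")
payoffs = {
    ("Cooperate", "Cooperate"): (3, 3),
    ("Cooperate", "Defect"): (0, 5),
    ("Defect", "Cooperate"): (5, 0),
    ("Defect", "Defect"): (1, 1),
}


def row_best_replies(col_move):
    """Rows that maximise the row player's payoff within one column."""
    best = max(payoffs[(r, col_move)][0] for r in MOVES)
    return {r for r in MOVES if payoffs[(r, col_move)][0] == best}


def col_best_replies(row_move):
    """Columns that maximise the column player's payoff within one row."""
    best = max(payoffs[(row_move, c)][1] for c in MOVES)
    return {c for c in MOVES if payoffs[(row_move, c)][1] == best}


def pure_nash_equilibria():
    """Cells where each player's move is a best reply to the other's."""
    return [
        (r, c)
        for r, c in product(MOVES, MOVES)
        if r in row_best_replies(c) and c in col_best_replies(r)
    ]


print(pure_nash_equilibria())  # [('Defect', 'Defect')]
```

A cell passes the test exactly when both "arrows" end there, which is the same check as reading the matrix by eye.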

The dilemma in numbers. Take cooperate or defect, with both cooperating worth 3 each, both defecting worth 1 each, the lone cooperator getting 0, and the lone defector getting 5. Look at the row player: if the other cooperates, defecting scores 5 against 3; if the other defects, defecting scores 1 against 0. Defecting wins either way, so it is a dominant strategy, and the same holds for the column player. The result is a single Nash equilibrium at (Defect, Defect), paying 1 each, even though (Cooperate, Cooperate) would have paid 3 each. A dominant strategy drives both players to an outcome that is worse for everyone.
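
The same comparison can be checked mechanically. The snippet below is a minimal sketch using the numbers just given; `row_payoff` and `strictly_dominates` are hypothetical names chosen for the example.

```python
# Row player's payoffs from the worked example above.
# row_payoff[my_move][their_move]
row_payoff = {
    "Cooperate": {"Cooperate": 3, "Defect": 0},
    "Defect": {"Cooperate": 5, "Defect": 1},
}


def strictly_dominates(a, b, table):
    """True if move a pays strictly more than move b against every reply."""
    return all(table[a][other] > table[b][other] for other in table[b])


print(strictly_dominates("Defect", "Cooperate", row_payoff))  # True
print(strictly_dominates("Cooperate", "Defect", row_payoff))  # False
```

Because the game is symmetric, the column player's table is identical, so the same check covers both players.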

Not every game is a trap. Drag the temptation payoff down below 3 and the game changes character. Now, if you trust the other to cooperate, your best move is to cooperate too. Two pure-strategy equilibria appear: (Cooperate, Cooperate), which pays more, and (Defect, Defect), which is the safe fallback if you do not trust your partner. This is the stag hunt, a problem of coordination and trust rather than greed, and many real situations look more like it than like the pure dilemma.
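
To see the change of character without the slider, here is a small sketch that recomputes the pure-strategy equilibria for two values of the temptation payoff. The function name `pure_equilibria` and the parameter names `R`, `P` and `S` (for the 3, 1 and 0 payoffs) are labels assumed for this example.

```python
def pure_equilibria(T, R=3, P=1, S=0):
    """Pure-strategy equilibria of the symmetric game for temptation payoff T."""
    # my_payoff[(mine, theirs)] is one player's payoff; symmetry covers the other.
    my_payoff = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}
    moves = ("C", "D")

    def is_best_reply(mine, theirs):
        return my_payoff[(mine, theirs)] == max(my_payoff[(m, theirs)] for m in moves)

    return [
        (a, b)
        for a in moves
        for b in moves
        if is_best_reply(a, b) and is_best_reply(b, a)
    ]


for T in (2, 5):
    print(T, pure_equilibria(T))
# 2 [('C', 'C'), ('D', 'D')]
# 5 [('D', 'D')]
```

With T at 2 both coordinated outcomes survive; with T at 5 only mutual defection does.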

When there is no pure answer. Some games have no equilibrium in straight, fixed choices. In matching pennies, one player wins by matching and the other by mismatching, so any predictable choice can be exploited. Nash's theorem says that if you allow mixed strategies, choosing each option with some probability, then every finite game has at least one equilibrium. At a mixed equilibrium each player randomises in just the right proportions to leave the other with nothing to gain by switching. The map view shows equilibria as the points where both players' best-response curves cross.
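
A quick check of matching pennies illustrates both claims. This sketch assumes the usual stakes of winning or losing 1 per round, which the text does not specify, and uses made-up helper names.

```python
from fractions import Fraction

# Assumed stakes: the row player wins 1 on a match and loses 1 on a mismatch;
# the column player's payoff is the negative of that (zero-sum).
MOVES = ("Heads", "Tails")
row_payoff = {(r, c): (1 if r == c else -1) for r in MOVES for c in MOVES}
col_payoff = {cell: -value for cell, value in row_payoff.items()}


def has_pure_equilibrium():
    """Look for a cell from which neither player gains by switching alone."""
    for r in MOVES:
        for c in MOVES:
            row_ok = all(row_payoff[(r, c)] >= row_payoff[(alt, c)] for alt in MOVES)
            col_ok = all(col_payoff[(r, c)] >= col_payoff[(r, alt)] for alt in MOVES)
            if row_ok and col_ok:
                return True
    return False


def row_expected(move, q_heads):
    """Row player's expected payoff when the column plays Heads with probability q_heads."""
    return q_heads * row_payoff[(move, "Heads")] + (1 - q_heads) * row_payoff[(move, "Tails")]


half = Fraction(1, 2)
print(has_pure_equilibrium())  # False
print(row_expected("Heads", half), row_expected("Tails", half))  # 0 0
```

Against a 50/50 opponent both of the row player's options pay the same, so there is nothing to exploit; against any other mix one option would pay strictly more.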

The definition. A finite game has finitely many players, indexed by \(i\), a finite strategy set \(S_i\) for each, and payoff functions \(u_i\). Write a profile as \(s = (s_i, s_{-i})\), where \(s_{-i}\) collects the strategies of everyone but \(i\). The profile \(s^*\) is a Nash equilibrium if, for every player and every alternative strategy,

\[ u_i\!\left(s^*_i, s^*_{-i}\right) \;\ge\; u_i\!\left(s_i, s^*_{-i}\right) \quad \text{for all } i \text{ and all } s_i \in S_i. \]

It is a pure-strategy equilibrium if each \(s^*_i\) is a single strategy, and a mixed-strategy equilibrium if each is a probability distribution \(\sigma_i\) over \(S_i\), with payoffs taken in expectation.
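
For concreteness, "payoffs taken in expectation" can be written out as follows. The notation is an assumption added here: \(n\) is the number of players, \(\sigma = (\sigma_1, \dots, \sigma_n)\) is a mixed profile, and players are taken to randomise independently, so the probability of a pure profile \(s\) is the product of the individual probabilities.

\[ u_i(\sigma) \;=\; \sum_{s \,\in\, S_1 \times \cdots \times S_n} \Bigl( \prod_{j=1}^{n} \sigma_j(s_j) \Bigr)\, u_i(s), \qquad u_i\!\left(\sigma^*_i, \sigma^*_{-i}\right) \;\ge\; u_i\!\left(s_i, \sigma^*_{-i}\right) \quad \text{for all } i \text{ and all } s_i \in S_i. \]

Because \(u_i\) is linear in player \(i\)'s own mixture, checking deviations to pure strategies \(s_i\) is enough: no mixed deviation can do better than the best pure one.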

Existence. Nash proved in 1950 that every finite game has at least one equilibrium in mixed strategies. The argument rests on a fixed-point theorem. The best-response correspondence maps each profile of mixed strategies to the set of profiles in which every player best-responds to it; this map sends the product of strategy simplices to itself, and it is upper-hemicontinuous with non-empty convex values. Kakutani's fixed-point theorem then guarantees a fixed point, a profile that best-responds to itself, which is exactly an equilibrium. (With a smoothing argument the simpler Brouwer theorem suffices.)

The indifference principle. In a mixed equilibrium a player only ever randomises over strategies that are all best responses, so they must yield equal expected payoff; the player is indifferent across their support, and it is the opponent's mixture that makes them so. This gives the standard recipe for solving mixed equilibria: set each player's expected payoffs equal across the strategies they actually use, and solve for the opponent's probabilities.
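
For a two-by-two game the recipe reduces to one linear equation per player. The sketch below solves it with exact fractions; the function name `indifference_probability` and the argument layout are assumptions made for this example, and payoffs are expected to be integers or `Fraction` values.

```python
from fractions import Fraction


def indifference_probability(a, b, c, d):
    """Probability q the opponent must put on their first option to make a
    player indifferent, given that player's payoffs:
        own first option:  a against opponent's first, b against their second
        own second option: c against opponent's first, d against their second
    Solves q*a + (1 - q)*b == q*c + (1 - q)*d for q.
    """
    denominator = (a - c) + (d - b)
    if denominator == 0:
        raise ValueError("payoff differences cancel; no unique indifference point")
    q = Fraction(d - b, denominator)
    if not 0 <= q <= 1:
        raise ValueError("indifference point lies outside [0, 1]; no such mixed equilibrium")
    return q


# Stag hunt with temptation payoff 2: Cooperate pays 3 or 0, Defect pays 2 or 1.
print(indifference_probability(3, 0, 2, 1))    # 1/2
# Matching pennies with stakes of 1: Heads pays 1 or -1, Tails pays -1 or 1.
print(indifference_probability(1, -1, -1, 1))  # 1/2
```

Feeding in the dilemma's payoffs (3, 0, 5, 1) raises the error instead, which matches the text: with a dominant strategy there is no mixture that leaves a player indifferent.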

Refinements, hardness and limits. Nash equilibrium says nothing about how players reach it, and many games have several, so refinements add discipline: subgame-perfect equilibrium uses backward induction to rule out non-credible threats in sequential games, trembling-hand perfection keeps only equilibria that survive small mistakes, and an evolutionarily stable strategy is an equilibrium that resists invasion by rare mutants in a population. Equilibria are stable, not efficient, and the price of anarchy measures that gap as the ratio between the cost of the worst equilibrium and the cost of the best achievable outcome. The gap is visible in Braess's paradox, where adding a road can slow everyone down. Finding an equilibrium is also believed to be computationally hard: the problem is PPAD-complete even for two players, which is evidence (though not proof) that no efficient general algorithm exists, and which links game theory to complexity theory and, more loosely, to the P versus NP question. Finally, repetition can rescue cooperation that the one-shot dilemma forbids: when the game is repeated with no known final round and players care enough about the future, strategies like tit-for-tat support cooperative equilibria, which is part of why cooperation survives in the real world at all.
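
The effect of repetition can be glimpsed in a toy simulation. This is a rough sketch under simplifying assumptions, not a proof of equilibrium: it uses the dilemma's payoffs with a temptation of 5, a fixed 10 rounds, no discounting, and only two hand-written strategies with illustrative names.

```python
# Per-round payoffs from the one-shot dilemma: (first player's, second player's).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """Defect in every round regardless of history."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Total payoffs for two strategies over a fixed number of rounds."""
    moves_a, moves_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_b)  # each strategy sees only the other's past moves
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
        gain_a, gain_b = PAYOFF[(a, b)]
        total_a += gain_a
        total_b += gain_b
    return total_a, total_b


print(play(tit_for_tat, tit_for_tat, 10))    # (30, 30)
print(play(always_defect, tit_for_tat, 10))  # (14, 9)
```

Against tit-for-tat, the defector grabs 5 once and then earns 1 per round, ending well below the 30 that steady cooperation would have paid.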
