All explorers

Clay Mathematics Institute · est. 2000

The Million-Dollar
Problems

Seven problems chosen as the deepest open questions in mathematics. A prize of $1,000,000 for each. Twenty-six years on, six remain unclaimed; one has been solved.

Plain language, no notation. High-school level: light notation, why it's hard. Rigorous: the statement, significance and current status.

Problem 01 · Unsolved

P versus NP

Is finding an answer as easy as checking one?

Some puzzles are hard to solve but easy to check once someone hands you the answer. A finished jigsaw is obviously correct at a glance, yet doing it from scratch takes hours. A completed Sudoku is instant to verify; filling the grid is not.

P vs NP asks whether that gap is real or an illusion. Is every problem whose answer we can quickly check also one we could quickly solve, if only we were clever enough? Almost everyone believes solving is genuinely harder than checking, but nobody has proved it. If the two turned out to be equally easy, the consequences would be staggering: the codes protecting banking, messaging and privacy all rely on solving being hard.

Computer scientists sort problems by how the effort grows with the size of the input \(n\). P ("polynomial time") is the class of problems a computer can solve in time bounded by a fixed power of \(n\) — like \(n^2\) or \(n^3\) — which we count as efficient. NP is the class whose answers can be checked in polynomial time, given a short hint called a certificate.

A Sudoku makes the difference concrete. Handed a completed grid you can verify it at a glance — scan each row, column and box for a repeat — so Sudoku belongs to NP. Filling a blank grid, though, seems to demand searching astronomically many possibilities. The same split is everywhere: a satisfying assignment for a logical formula (SAT), a route through every city, a proper colouring of a map — all trivial to check, all apparently hard to find.

Plainly \(\text{P} \subseteq \text{NP}\): if you can solve fast, you can certainly check fast. The question is whether the reverse holds. A pivotal discovery is that thousands of these problems are NP-complete — each is a disguised copy of every other, so a fast method for any single one would instantly crack them all. They stand or fall together.

It resists proof because showing \(\text{P} \neq \text{NP}\) means ruling out every conceivable algorithm, including clever ones nobody has imagined, for some one problem — and the few general techniques we have provably cannot do it. The stakes are enormous: if instead \(\text{P} = \text{NP}\), the codes protecting banking and privacy would collapse, and mathematics itself would change, since finding a proof would become as easy as checking one.

Formally, P is the class of languages decided by a deterministic Turing machine in time \(O(n^k)\) for some constant \(k\). NP is the class of languages \(L\) admitting a polynomial-time verifier \(V\) and polynomial-size certificates: \(x \in L \iff \exists\, w,\ |w| \le \operatorname{poly}(|x|),\ V(x,w)=1\) — equivalently, what a nondeterministic Turing machine decides in polynomial time.

The Cook–Levin theorem (1971) shows SAT is NP-complete: every problem in NP reduces to it by a polynomial-time many-one reduction. Karp (1972) gave 21 further natural NP-complete problems — clique, vertex cover, Hamiltonian cycle, 3-colouring, subset-sum — and thousands are now known. Hence \(\text{P}=\text{NP}\) if and only if any single NP-complete problem lies in P.

Three barriers explain the difficulty. Relativization (Baker–Gill–Solovay, 1975): methods true relative to every oracle cannot decide it, ruling out naive diagonalization. Natural proofs (Razborov–Rudich, 1994): broad combinatorial lower-bound techniques would break cryptographic pseudorandom functions. Algebrization (Aaronson–Wigderson, 2008) extends both. Structurally, \(\text{P}=\text{NP}\) would collapse the polynomial hierarchy, while if \(\text{P}\neq\text{NP}\), Ladner's theorem forces NP-intermediate problems strictly between P and NP-complete.

Status: open — the central problem of theoretical computer science, governing cryptography, optimisation and the limits of automated reasoning. The overwhelming consensus (Gasarch's surveys of experts) is \(\text{P}\neq\text{NP}\), but no proof, even a conditional one, is in sight.

A 3-colouring of the graph: each vertex a colour, no edge joining two of the same. Verifying it just scans the edges. The graph grows on a loop. Checking is linear, O(m); searching every colouring grows as 3ⁿ. n grows on a loop — the search explodes while the check stays instant. SAT, 3-colouring, Hamiltonian cycle … all reduce to one another — the NP-complete problems. n grows on a loop; verification stays O(m).

Problem 02 · Unsolved

The Hodge Conjecture

Which shapes secretly come from equations?

Mathematicians study smooth, curved spaces that can twist through many dimensions, far too many to picture. To understand such a space, they examine the holes and loops inside it, the way you might understand a doughnut by noticing the hole in the middle.

Some of those loops are special: they can be traced out exactly by simple equations, like a circle drawn by one tidy formula. Others seem to be just floppy, rubbery features with no clean equation behind them. The Hodge Conjecture says that for an important family of spaces, a particular well-behaved class of these features is always secretly built from equations: there is always an algebraic skeleton underneath. It is a bridge between geometry you can draw and algebra you can write down.

The spaces in question are smooth projective complex varieties — shapes carved out by polynomial equations in (complex) space. Because each complex dimension is two real ones, even a curve is a real surface, and these objects quickly twist through dimensions too high to picture. Two very different lenses reveal their structure.

The first is topology: cohomology measures the holes and loops, packaging them into vector spaces of "classes" that survive any continuous deformation — soft, flexible, rubbery information. The second is algebra: an algebraic cycle is a subvariety actually cut out by more polynomial equations — a circle inside a torus, say — and each such cycle leaves its own shadow in cohomology. Hodge theory refines the picture further, splitting cohomology into pieces tagged by a bidegree \((p,q)\) recording how the complex structure acts.

A Hodge class is a (rational) cohomology class that lands squarely in the balanced \((p,p)\) piece. These are exactly the classes that could, on grounds of type, come from an algebraic cycle. The conjecture asserts they always do: every Hodge class is a rational combination of the shadows of genuine algebraic cycles.

It is hard because the two sides have opposite temperaments. Cohomology is plentiful and floppy; algebraic cycles are rigid and scarce, and producing an actual subvariety to realise a prescribed class is enormously constrained. We have almost no general machinery for building cycles to order, so the conjecture asks us to conjure rigid algebra to match every piece of soft topology of the right type.

Let \(X\) be a smooth projective variety over \(\mathbb{C}\) of dimension \(n\). The Hodge decomposition gives \(H^{k}(X,\mathbb{C}) = \bigoplus_{p+q=k} H^{p,q}(X)\) with \(\overline{H^{p,q}} = H^{q,p}\), and the rational Hodge classes are \(\mathrm{Hdg}^{p}(X) = H^{2p}(X,\mathbb{Q}) \cap H^{p,p}(X)\).

The cycle class map sends a codimension-\(p\) algebraic cycle to a class in \(\mathrm{Hdg}^{p}(X)\). The conjecture: after \(\otimes\,\mathbb{Q}\) this map is surjective — every rational \((p,p)\) class is a \(\mathbb{Q}\)-linear combination of algebraic cycle classes.

It is a theorem for \(p=1\): the Lefschetz \((1,1)\) theorem identifies \(\mathrm{Hdg}^{1}\) with divisor classes (via the exponential sequence and the Néron–Severi group). Beyond that only fragments are known — abelian varieties of low dimension, and cases forced by the Lefschetz hyperplane and hard Lefschetz theorems. The integral Hodge conjecture is outright false (Atiyah–Hirzebruch, 1962; Kollár), so the rational coefficients are essential, and the variational form (Cattani–Deligne–Kaplan) shows Hodge loci are algebraic.

Status: open. It is the keystone of the conjectural dictionary — algebraic geometry ↔ topology ↔ the theory of motives and the standard conjectures — that organises much of modern arithmetic geometry.

A torus (a doughnut surface) with a loop drawn on it. Drag the figure to orbit. Its two independent 1-cycles — longitude (teal) and meridian (violet) — the model Hodge classes. Drag to orbit. Hodge classes are the (p, p) part of cohomology; the conjecture says each comes from an algebraic cycle. The (p, p) diagonal is scanned on a loop.

Problem 03 · Unsolved

The Riemann Hypothesis

Where the primes hide their rhythm.

Prime numbers (2, 3, 5, 7, 11) are the atoms of arithmetic: every whole number is built by multiplying them together. As you count higher they seem to show up at random, with no pattern anyone can see.

But in 1859 Riemann found a hidden "music" underneath the primes: a kind of secret rhythm that controls how thickly they are scattered. His hypothesis says every note in that music falls exactly on one straight line. If even a single note were off the line, the primes would clump or thin out in ways our deepest theorems quietly assume they never do. In more than 160 years no one has found a stray note, and no one has proved there isn't one.

Start with the Riemann zeta function, \(\zeta(s)=1+\tfrac{1}{2^{s}}+\tfrac{1}{3^{s}}+\tfrac{1}{4^{s}}+\cdots\) — add up one over every whole number, each raised to the power \(s\). When \(s\) is bigger than \(1\) this settles on a finite value (for example \(\zeta(2)=1+\tfrac14+\tfrac19+\cdots=\tfrac{\pi^2}{6}\)). At \(s=1\) it is just \(1+\tfrac12+\tfrac13+\cdots\), which grows without bound, so the plain sum only works for \(s\gt 1\).

Riemann's move was to feed in complex numbers, written \(s=a+b\,i\), where \(i=\sqrt{-1}\) and \(a\) (the "real part") and \(b\) (the "imaginary part") are ordinary numbers. A complex number is just a point on a plane: \(a\) across, \(b\) up. There is a standard technique called analytic continuation — think of it as the one and only smooth way to extend a formula past where it first made sense — and it stretches \(\zeta\) to (almost) the whole plane.

Once extended, \(\zeta\) hits zero at certain points. Some are boring: the trivial zeros at \(s=-2,-4,-6,\dots\) Everything interesting happens in the vertical critical strip where the real part \(a\) lies between \(0\) and \(1\). There sit infinitely many nontrivial zeros, and every one ever found lands exactly on the critical line \(a=\tfrac12\). Riemann's hypothesis is the bet that all of them do.

Why care where some function vanishes? Because of a remarkable bridge: writing the same sum as a product over the primes ties \(\zeta\) directly to them, and the position of the zeros controls how evenly the primes are spread — the closer the zeros cling to the \(\tfrac12\) line, the smaller the error when you estimate how many primes lie below a given number. It resists proof because the zeros only exist out in the continued function, not the original sum, there are infinitely many to corral at once, and the left–right symmetry that ought to pin them to the line has never been shown to force them there.

For \(\operatorname{Re}(s)\gt 1\) the Euler product \(\zeta(s)=\sum_{n\ge1}n^{-s}=\prod_{p}\left(1-p^{-s}\right)^{-1}\) already encodes the primes. Analytic continuation extends \(\zeta\) to a meromorphic function on \(\mathbb{C}\) whose only singularity is a simple pole at \(s=1\) with residue \(1\). The completed function \(\xi(s)=\pi^{-s/2}\,\Gamma\!\left(\tfrac{s}{2}\right)\zeta(s)\) is entire and obeys the functional equation \(\xi(s)=\xi(1-s)\), giving the reflection symmetry about the line \(\operatorname{Re}(s)=\tfrac12\). The \(\Gamma\)-factor's poles force the trivial zeros at \(s=-2,-4,\dots\)

Statement. Every nontrivial zero \(\rho\) (those in \(0\lt\operatorname{Re}(s)\lt1\)) satisfies \(\operatorname{Re}(\rho)=\tfrac12\). The arithmetic content runs through the explicit formula \(\psi(x)=x-\sum_{\rho}\tfrac{x^{\rho}}{\rho}-\log 2\pi-\tfrac12\log(1-x^{-2})\), where \(\psi(x)=\sum_{p^{k}\le x}\log p\): each zero contributes an oscillation of size \(x^{\operatorname{Re}(\rho)}\) to the prime count, so RH is equivalent to the optimal bound \(\pi(x)=\operatorname{li}(x)+O\!\left(\sqrt{x}\log x\right)\). The Prime Number Theorem is exactly the statement that there are no zeros on \(\operatorname{Re}(s)=1\) (Hadamard, de la Vallée Poussin, 1896).

Partial results. Hardy (1914) proved infinitely many zeros lie on the critical line; Selberg gave a positive proportion; Levinson (1974) pushed it past \(\tfrac13\) and Conrey (1989) past \(\tfrac25\); all nontrivial zeros are known to lie in the open strip (PNT), and numerically the first \(>10^{13}\) zeros are verified on the line. Montgomery's pair-correlation, matching the eigenvalue statistics of random Hermitian matrices (GUE), underlies the Hilbert–Pólya spectral hope. The hypothesis generalises to Dirichlet \(L\)-functions (GRH) and beyond.

Status: open — one of Hilbert's problems (the eighth) and the only one also among the seven Millennium Problems. A vast body of theorems is proved conditionally on it.

The number line: primes (gold) thin out as you count, the π(x) staircase tallies them, and a faint wave hints at the hidden rhythm. The count climbs along the line on a loop. The complex plane, shaded by |ζ|: the critical strip, the bright critical line \(\operatorname{Re}(s)=\tfrac12\), and the nontrivial zeros sitting in the dark valleys. It climbs the line on a loop; zeros light up as it passes. The image of the critical line under \(\zeta\): every loop through the origin is a zero, and the Euler product ties it to the primes. \(t\) climbs on a loop; each origin-crossing is a zero.

Problem 04 · Unsolved

Yang–Mills & the Mass Gap

Why the strong force has a heaviest-lightest ripple.

The strong nuclear force glues together the cores of atoms. The mathematics physicists use to describe forces like light and magnetism works beautifully, but the same kind of theory for the strong force has a strange feature nobody can fully justify with rigour.

Light comes in massless particles that can be vanishingly faint. The strong force behaves as if its faintest possible ripple still carries a minimum weight: there is a floor, an energy gap, below which nothing can exist. Experiments and computer simulations show this gap clearly. The challenge is to prove, starting from the equations alone and with full mathematical rigour, that such a theory even exists and that this gap is really there.

Electromagnetism is the simplest gauge theory: a field whose internal symmetry can be re-set independently at every point of space, with the photon as the bookkeeper that keeps those choices consistent. Yang–Mills theory (1954) replaces that symmetry with a larger, non-abelian group — \(SU(3)\) for the strong force, \(SU(2)\times U(1)\) for the electroweak — whose force-carriers, unlike the photon, also push on one another. That self-interaction is what makes the theory rich, and hard.

Classically these equations are well understood. The trouble is the quantum version: physicists compute with it daily and it underlies the entire Standard Model, yet no one has built it as a mathematically rigorous quantum field theory in four-dimensional spacetime — the integrals and limits involved have never been given a sound definition.

The mass gap is the precise statement of a striking fact. Unlike light, whose ripples can be vanishingly faint, the excitations of the strong field — the bound "glueballs" — have a strictly positive lowest mass \(\Delta \gt 0\): the energy spectrum sits at zero (the vacuum) and then jumps, with nothing in between. This gap is why the strong force is short-ranged and why quarks are confined.

It is hard because the gap is a deeply non-perturbative effect, invisible to the term-by-term expansions that work for weak forces, and because making the quantum theory rigorous in 4D means controlling infinitely many interacting degrees of freedom at every length scale at once. Lattice simulations see the gap clearly; a proof from the equations is another matter entirely.

The problem (Jaffe–Witten) asks: for every compact simple gauge group \(G\), construct a nontrivial quantum Yang–Mills theory on \(\mathbb{R}^{4}\) satisfying the Wightman axioms (equivalently, a Euclidean theory obeying the Osterwalder–Schrader reflection-positivity axioms), and prove it has a mass gap \(\Delta \gt 0\) — the Hamiltonian's spectrum is \(\{0\} \cup [\Delta,\infty)\) with a unique vacuum and \(0\) isolated.

Concretely one needs a well-defined measure on gauge-equivalence classes of connections with the expected Euclidean symmetries, controlling the ultraviolet (short-distance) limit while keeping the theory interacting. Asymptotic freedom (Gross–Wilczek–Politzer, 1973) makes the UV tractable, but the gap and confinement live in the strongly-coupled infrared.

Constructive quantum field theory has succeeded for related models in spacetime dimensions 2 and 3 (Glimm–Jaffe and others); Balaban, Magnen–Rivasseau–Sénéor and others controlled 4D Yang–Mills in finite volume, but the continuum, infinite-volume theory with a proven gap remains out of reach. A rigorous solution would place the Standard Model's strong sector on a sound footing and explain confinement.

Status: open. Existence and the mass gap together are unproven; the 4D physical case is the obstacle, even though both are overwhelmingly supported by experiment and lattice computation.

The strong force as a field: warm crests, cool troughs. Its faintest ripple still carries a minimum weight. Click the field to excite a ripple; the mass cycles on its own. The dispersion \(\omega=\sqrt{k^2+\Delta^2}\) never reaches zero — a mass gap \(\Delta\). The mass cycles; the gap floor rises and falls but never hits zero. The spectrum is \(\{0\}\cup[\Delta,\infty)\); the problem is to prove \(\Delta>0\) rigorously. Click to excite; the mass cycles on its own.

Problem 06 · Unsolved

Birch & Swinnerton-Dyer

Counting the rational points on a curve.

Picture a particular kind of gracefully looping curve. A natural question is how many points on it have coordinates that are simple fractions, whole-number ratios. For some such curves there are only a handful; for others, infinitely many, scattered endlessly. Telling which is which, just by looking, is fiendishly hard.

In the 1960s two mathematicians ran early computer experiments and noticed a pattern: a completely different quantity, built by counting solutions on the curve, seemed to secretly know the answer. Their conjecture says this hidden quantity acts like a detector: when it reads zero in a certain sense, the curve has infinitely many fraction-points, and the precise way it reads zero even tells you "how infinitely many."

An elliptic curve is a curve \(y^2 = x^3 + ax + b\) (with no sharp corners). Its remarkable feature is a group law: take two rational points — points whose coordinates are fractions — draw the line through them, find where it meets the curve again, and reflect; the result is another rational point. So the rational points can be "added," and they form a self-contained algebraic system.

Mordell's theorem says that system is finitely generated: \(E(\mathbb{Q}) \cong \mathbb{Z}^{r} \oplus T\), a finite torsion part \(T\) together with \(r\) independent points of infinite order. That rank \(r\) is the whole game — it is \(0\) when there are only finitely many rational points and positive when there are infinitely many. Yet there is no known general recipe to compute it.

Birch and Swinnerton-Dyer, running some of the first number-theory computer experiments in the 1960s, attached to the curve a completely different object: an analytic function \(L(E,s)\) assembled from the counts of solutions modulo each prime \(p\). They noticed its behaviour at the special point \(s=1\) seemed to know the rank.

Their conjecture: \(L(E,s)\) vanishes at \(s=1\) to order exactly \(r\). It is hard — and tantalising — because it ties two worlds with no visible bridge: the algebra of rational solutions on one side, the analysis of a function built from prime counts on the other.

Let \(E/\mathbb{Q}\) be an elliptic curve with Hasse–Weil \(L\)-function \(L(E,s)=\prod_p \left(1 - a_p p^{-s} + p^{1-2s}\right)^{-1}\) (Euler factors modified at bad primes), where \(a_p = p+1-\#E(\mathbb{F}_p)\) and \(|a_p|\le 2\sqrt p\) (Hasse). By the modularity theorem (Wiles, Taylor–Wiles, Breuil–Conrad–Diamond–Taylor) \(E\) is modular, so \(L(E,s)\) continues to an entire function with a functional equation \(s\leftrightarrow 2-s\).

The conjecture: \(\operatorname{ord}_{s=1} L(E,s) = r = \operatorname{rank} E(\mathbb{Q})\). The strong form is an exact formula for the leading Taylor coefficient, \(\tfrac{1}{r!}L^{(r)}(E,1) = \dfrac{\#\ ext{Ш}\cdot \Omega_E \cdot R_E \cdot \prod_p c_p}{(\#E(\mathbb{Q})_{\mathrm{tors}})^2}\), in terms of the regulator \(R_E\), the real period \(\Omega_E\), the Tamagawa numbers \(c_p\), and the order of the Tate–Shafarevich group \(\ ext{Ш}\).

It is a theorem in analytic rank \(0\) and \(1\): Gross–Zagier (heights of Heegner points) together with Kolyvagin (Euler systems) prove \(\operatorname{ord}_{s=1}L = r\) and the finiteness of \(\ ext{Ш}\) in those cases; Kolyvagin, and later work of Skinner–Urban and Bhargava–Shankar (average ranks), extend the picture. Rank \(\ge 2\), and the finiteness of \(\ ext{Ш}\) in general, are open.

Status: open — the prototype of the deep conjectures (Beilinson, Bloch–Kato) tying special values of \(L\)-functions to arithmetic invariants.

An elliptic curve \(y^2=x^3-x+1\); its rational points have fractional coordinates. P drifts along the curve; drag it to place it. Two points P (teal) and Q (violet) add by chord-and-tangent to give P+Q (gold). Drag P or Q along the curve. The multiples nP generate a subgroup — infinitely many points when the rank > 0, which BSD reads off \(L(E,s)\) at \(s=1\). The prime scan steps along on its own.

Problem 07 · Solved · Perelman, 2003

The Poincaré Conjecture

How to recognise a sphere from the inside.

Imagine stretching a rubber band across the surface of an orange. You can always slide and shrink that loop, without lifting it off the surface, until it pulls down to a single point. Now try the same on a doughnut: a loop threaded through the hole gets stuck, and cannot shrink away. That difference is how a mathematician "feels" the shape of a surface without ever seeing it from outside.

Poincaré asked the same question one dimension up, for the three-dimensional "surface" of a four-dimensional ball: if every loop can shrink to a point, must the space be, in essence, a sphere? For a century it resisted everyone. In 2003 Grigori Perelman proved that the answer is yes, and then famously declined both the million dollars and the field's highest medal.

A manifold is a space that, viewed up close, looks like ordinary flat space — a smooth surface is a 2-manifold, and the universe around us a 3-manifold. The surface of a ball is the 2-sphere; one dimension up, the 3-sphere \(S^3\) is the (3-dimensional) boundary of a four-dimensional ball. We call a space closed if it is finite and has no edge, and simply connected if every loop in it can be shrunk continuously to a point — no hole anywhere to snag it.

Poincaré's question (1904): is the 3-sphere the only closed simply connected 3-manifold? The conjecture says yes — any such space must be homeomorphic to \(S^3\), i.e. topologically identical to it. Strangely, the higher-dimensional analogues fell first: Smale settled dimensions \(\ge 5\) (1961) and Freedman dimension 4 (1982). Dimension three — our own — was the stubborn holdout.

Perelman's route was geometric, not topological. Ricci flow evolves a manifold's shape so that highly curved regions smooth out, much as heat spreads to even out temperature. Run on a simply connected 3-manifold, the flow wants to round everything into a sphere — but it can pinch into singular necks along the way, and the art is to cut those out by surgery and continue, proving the process terminates in round pieces.

Statement: every closed, simply connected, smooth 3-manifold is homeomorphic to \(S^3\) — equivalently, \(S^3\) is the unique closed 3-manifold with trivial fundamental group \(\pi_1(M)=0\). (In dimension three the smooth, PL and topological categories agree, so "homeomorphic" and "diffeomorphic" coincide.) It is the central case of Thurston's Geometrization Conjecture, which decomposes any closed 3-manifold into pieces modelled on eight homogeneous geometries.

Proof: Hamilton's Ricci-flow programme evolves the metric by \(\partial_t g_{ij} = -2R_{ij}\). Grigori Perelman (2002–2003), in three preprints, completed it for geometrization: a no-local-collapsing theorem (via the monotone \(\mathcal{W}\)-entropy and reduced volume) rules out cigar singularities, canonical-neighbourhood and surgery control tame the singular times, and "Ricci flow with surgery" runs for all time, forcing \(\pi_1(M)=0\) to round up into \(S^3\).

Status: solved (2003). Verified by independent expositions (Kleiner–Lott; Morgan–Tian; Cao–Zhu). Perelman was awarded the Fields Medal (2006) and the Millennium Prize (2010), declining both. It is the only one of the seven resolved to date.

A sphere, with a loop on its surface that you can shrink to a point. Drag to orbit; the loop contracts on its own. On a sphere every loop contracts — it is simply connected, \(\pi_1=0\). Drag to orbit; the loop contracts on a loop. That property characterises the 3-sphere \(S^3\) — proved by Perelman via Ricci flow. Drag to orbit; the Ricci flow runs on its own.