Goodnight Wiki / Goodhart's Law

Goodhart's Law

"When a measure becomes a target, it ceases to be a good measure." Marilyn Strathern's elegant restatement of Charles Goodhart's 1975 observation about monetary policy captures something that anyone who has worked inside a bureaucracy already suspects: metrics corrupt the thing they measure.

Goodhart's original insight was narrow and technical. He noticed that when the Bank of England tried to control monetary supply by targeting specific statistical indicators, the indicators themselves became unreliable. The regulated entities changed their behaviour to hit the numbers, not to do the thing the numbers were supposed to track. It was a critique of Thatcher-era monetarism, but the idea turned out to be far more general than monetary policy.1

The Taxonomy of Corruption

There are at least four distinct ways that Goodhart's law operates, and conflating them leads to fuzzy thinking about when metric-targeting actually fails.

Regressional Goodhart is the simplest: your metric is a proxy that correlates with the thing you actually care about, but when you optimise the proxy, you push into regions where the correlation breaks down. Test scores correlate with learning, but teaching-to-the-test optimises the proxy while degrading the underlying skill. Citation counts correlate with research impact, but citation rings game the proxy without producing better science.

Extremal Goodhart is what happens when you push optimization hard enough that you leave the normal distribution entirely. A moderately optimized system behaves roughly as your model predicts. A heavily optimized system finds the bizarre edge cases where proxy and target diverge catastrophically. This is why the most aggressively optimised AI reward functions produce the most alien-looking specification gaming — the agent finds solutions that score perfectly on the metric while looking nothing like intended behaviour.

Causal Goodhart is subtler. The metric and the outcome share a common cause, but the metric doesn't cause the outcome. Healthy employees are both more productive and take fewer sick days. If you penalise sick days to increase productivity, you'll get people working while ill — the metric improves but productivity doesn't, because you were intervening on the wrong node in the causal graph.

Adversarial Goodhart is the most dramatic: agents actively game the metric because their incentives are misaligned with yours. Campbell's law — "the more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures" — is really about this variant. The Volkswagen emissions scandal is a textbook case: when regulators measured emissions under test conditions, VW built software that detected the test and behaved differently from normal driving.1

Why It's Worse Than You Think

The depressing thing about Goodhart's law is that it gets stronger as the stakes get higher. Low-stakes metrics are usually fine — no one games their kitchen thermometer. But the moment a metric determines funding, promotion, regulatory compliance, or survival, the pressure to optimise the metric rather than the underlying reality becomes intense. And the people best positioned to game the metric are usually the most sophisticated actors in the system, which means the gaming is hardest to detect.

Jerry Muller's The Tyranny of Metrics documents how this plays out across education (teach-to-the-test destroying curiosity), healthcare (hospitals avoiding risky patients to improve survival statistics), policing (CompStat encouraging downgrading crimes to meet targets), and academia (publication counts incentivising paper-slicing over substantive research). In each case, the metrics were introduced by well-meaning reformers who wanted accountability and transparency. The metrics delivered the appearance of accountability while eroding the thing itself.

This connects to a deeper problem in legibility and state power. James Scott's insight was that states need to simplify complex realities into legible, measurable categories in order to govern them. Goodhart's law says that the act of governing through those simplified measures will reliably degrade them. The two ideas together suggest that there's a fundamental tension between institutional control and institutional knowledge — the tools you need to manage something at scale are precisely the tools that distort your picture of what's actually happening.

What Survives the Critique

The instinctive response to Goodhart's law is to throw out metrics entirely and rely on "qualitative judgment." This is wrong. Qualitative judgment is just as gameable — it's easier to bullshit a committee than a statistical test. The problem isn't measurement per se; it's making a single measure the sole target of an optimisation process.

The practical defences are familiar but hard to execute: use multiple uncorrelated metrics, rotate them unpredictably, combine quantitative measures with contextual judgment, and pay the costs of actually checking whether the metric tracks reality rather than assuming it does. Gall's law — that complex systems that work evolved from simple systems that work — applies here too. You can't design a perfect metric system from scratch. You have to build one, watch it get gamed, fix the obvious exploits, and iterate. The system is never finished.

What really holds inadequate equilibria in place is often a metric that has become load-bearing. Everyone knows the number is meaningless — the teachers know test scores don't measure learning, the doctors know survival rates don't measure care quality — but the institutional machinery runs on those numbers, and replacing them would require coordinated action that no individual has the incentive or authority to take. The metric has become a Schelling point not because it's good but because it's shared.

Footnotes

  1. Goodhart's lawsource 2

Open in stacked reader →