The post-it note has now fallen off my computer, because I got involved in a short social media argument which reminded me of something I’ve been meaning to write about for ages: Goodhart’s Law, named after the former Bank of England chief adviser who famously said “When a measure is used as a target, it ceases to be a good measure”.
Except no he didn’t. He said “Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.”
One might indeed say that “when a technical point begins to be used as an aphorism, it ceases to be a good technical point”.
Goodhart wasn’t talking in general about it being a bad idea to have targets. He really couldn’t have been, given the job he was in. In my view, it’s really not possible or serious to be completely opposed to the concept of setting targets in general; it’s too close to the idea that there should be no feedback from output to input.
I used to have three questions which I asked of people who claimed to be in favour of affirmative action for women and minorities, but against “quotas”:
1. If you never measure what kinds of people are getting admitted to university, how can you claim that you care about it?
2. If you measure, but then don’t do anything about the measurement, how can you claim that you care about it?
3. If you measure the outcomes, and then take action if the values are unsatisfactory, how is that different from a quota?
This isn’t necessarily a wholly generalisable argument (as Paul Wilmott said, “this bit is not wholly rigorous, mainly because it’s wrong”; it’s one of my “being annoying on purpose” bits). People can disagree on the scope of the word “target”, but I think there’s a solid idea at the middle of it – unless a system is small enough for the manager to hold all of its complexity in their own head, there will need to be measurements made and reports generated.
Those measurements and reports are going to be translated into action, and one important way in which that happens is that some of them will take the form of numerical measures, which trigger further investigation and action when they depart sufficiently far from norms. It’s just about possible to manage things without those numerical measures but it’s difficult, it’s probably not the best way, and it gets more and more untenable as a system grows – the cognitive task of getting the overall “vibe” of whether things are going well or badly becomes increasingly impossible for people who are determined to deny themselves the use of those incredibly handy things, numbers.
But targets are always misleading, that’s what Goodhart told us! As pointed out above, no he didn’t. Even allowing for the very interesting slip from “statistical regularity” to “measure”, he was talking in the context of monetary base targeting in the UK. The specific problem that Goodhart’s Law was meant to dramatize was that before the policy regime changed, the M0 money supply seemed to bear a reasonably stable relationship to the actual quantities of interest – the level of activity and prices in the economy. When it was used as a target with the hope of manipulating those quantities, it broke down.
What we can see here is that the original formulation of “statistical regularity” lost a lot when it was simplified to “measure” for aphoristic purposes. Goodhart’s Law is really a statement about the process of trying to make policy based on proxy measures of “internal states of complex systems” which are not themselves directly observable.
After all, a thermostat both measures and targets the ambient temperature, but that doesn’t mean that the temperature ceases to be a good measure of what the thermostat is trying to control. If you want to eliminate smallpox, then however complicated the overall ecological and social context, the number of smallpox infections is a very good measure of whether you’re winning or not. Central banks these days have more or less given up on intermediate targets and simply target the actual inflation rate – this has a number of its own problems, but they’re not problems of the sort that Goodhart experienced.
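The thermostat point can be made concrete with a toy sketch (the model and all its numbers are mine, purely for illustration). The controller measures and targets the very same quantity, so there is no proxy that can come apart from the goal:

```python
# A bang-bang thermostat: it measures the ambient temperature and
# targets that same temperature directly. Because the target *is* the
# quantity of interest, "hitting the target" and "achieving the goal"
# cannot come apart -- there is no intermediate proxy to game.

TARGET = 20.0       # degrees C we actually care about
OUTSIDE = 5.0       # heat leaks towards this ambient temperature
HEATER_POWER = 1.5  # degrees added per step when the heater is on
LOSS_RATE = 0.1     # fraction of the gap to outside lost per step

def step(temp: float, heater_on: bool) -> float:
    """One time step of a crude room-temperature model."""
    temp -= LOSS_RATE * (temp - OUTSIDE)  # heat leaks out
    if heater_on:
        temp += HEATER_POWER              # heater adds heat
    return temp

def thermostat(temp: float) -> bool:
    """Control rule: heat whenever the measurement is below target."""
    return temp < TARGET

temp = 12.0
for _ in range(100):
    temp = step(temp, thermostat(temp))

# The measured temperature settles near the target, and the measurement
# remains a faithful report of what the system is trying to control.
print(round(temp, 1))  # → 20.0
```

The design point is that the feedback loop closes on the actual variable of interest; Goodhart-style breakdown needs a gap between the measured proxy and the thing you care about, and here there is none.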
So the message of Goodhart’s Law is that if you’re setting targets, they ought to target the thing that you care about, not something which you believe to be related to it, no matter how much easier that intermediate thing is to measure. That doesn’t guarantee success; the phenomenon of “gaming the system” or the tendency of control systems to be undermined by adversarial activity is much more general and complicated than this single problem.
Targets are usually an information-reducing filter on the system (which is really just saying that they’re a tool of management – attenuating information to, quite literally, “make it manageable” is the whole nature of the job). That’s the fundamental reason why they sometimes go wrong. Goodhart’s Law is one specific example of the general phenomenon.
On the other hand, I think there is one major and empirically important case which is pure Goodhart’s Law and where people really could help themselves out a bit by respecting it. As far as I can see, “teaching to the test” is a one hundred and eighty degrees inverted description of a phenomenon that ought to be called “not testing for the outcomes you want”.
To a psychologist interested in cognition, what jumps out about Goodhart's Law is the behavioral aspect. Measuring something or making it a target – then labeling it, making it public, or otherwise calling attention to it – changes people's behavior. People will always try to game a system for maximum benefit, in any context; that is rational behavior. So in many cases, simply announcing a measurement or target will alter people's behavior and, with it, the meaning of the measurement.
Simple example: if you are targeting a particular number in your medical lab tests, and you hit that goal while keeping up unhealthy lifestyle choices, then the target measurement has become a poor representative of health. Its original meaning has changed – changed precisely because the measurement was made a goal, a proxy for the real goal (health), and this invited "gaming the system."
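That slippage can be shown numerically with an entirely made-up simulation (the numbers and function names are mine, not anything from the post): a lab value that tracks underlying health until people start optimising the number itself, at which point the statistical regularity collapses.

```python
import random

random.seed(0)

def lab_value(health: float, gamed: bool) -> float:
    """A lab measurement that normally co-moves with underlying health.
    Once the number itself becomes the goal, it can be pushed into
    range without changing the health it was a proxy for."""
    if gamed:
        return 100.0 + random.gauss(0, 1)  # everyone hits ~100 regardless of health
    return 100.0 + 0.8 * health + random.gauss(0, 1)

def corr(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

healths = [random.gauss(0, 10) for _ in range(1000)]

# Before targeting: the measurement is informative about health.
before = [lab_value(h, gamed=False) for h in healths]

# After targeting: everyone optimises the number directly.
after = [lab_value(h, gamed=True) for h in healths]

print(round(corr(healths, before), 2))  # strong: the regularity holds
print(round(corr(healths, after), 2))   # near zero: it has collapsed
```

This is Goodhart's original formulation almost literally: an observed statistical regularity (lab value tracks health) collapses once pressure is placed on it for control purposes.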
Perplexity.ai had a few worthy "comments":
'Goodhart's Law is an adage that states, "When a measure becomes a target, it ceases to be a good measure"' - That had four citations, so that is the proper pithy version I guess. The bot gave several good examples, including this:
'An illustrative example of Goodhart's Law is the bounty on cobras in colonial India, where citizens started breeding cobras to receive the reward, ultimately increasing the cobra population'
And a fair summary: 'The law suggests that when a particular measure is used as a target or goal, people tend to optimize their actions to meet that target, often at the expense of other important aspects or unintended consequences.'