Regression to the mean

August 8, 2009 by James

Don’t panic – I know it sounds like maths, and I know I have a tendency to get mathsologically professorial at you from time to time. But I’ll try and leave the algebra out of this as much as possible. (You don’t know how much fun you’re missing, though.)

An overlong explanation

So, to get an idea of regression to the mean, let’s pick an example. Take the children in a school. (If I have to clarify for you that I’m speaking conceptually, and not demanding their physical abduction, then stop reading this now and turn yourself in to the appropriate authorities before you commit any more inadvertent felonies.) Maybe just remember your own classmates. You should find that, in most regards, most of them are around average. That’s kinda what “average” means, after all. But some will be a lot brighter than the others. Or way taller. Or they’ll have a thumb that bends back way more than it should so they can actually touch it to their wrist. (Francis, I’m looking at you, that shit is not right.)

With varying factors like brains, height, or thumb-bendiness, you’ll find people scattered along a scale, with the majority clustered around the middle, but a few way out at the fringes. It’s entirely expected that, completely by chance, a large population will produce a few (for want of a better word) freaks. Out of all the people you know, there aren’t likely to be many geniuses, giants, or extreme contortionists, but they ought to crop up randomly from time to time. (It’s causing me physical pain here trying to explain this without using phrases like “normal distribution” or “standard deviation from the mean”, but I’ll struggle bravely on.)

In these cases, you basically have a single shot at getting lucky. You rolls the dice and takes your chances on the genetic lottery, and then you’re stuck with what you get. It’s a one-off thing. But the principle also applies to repeated experiments, where you measure the same thing at different times, rather than just in different people.

Say you give a multiple choice test on quantum dynamics to a third-grade class. They’re all just going to have to guess from the four options for each answer (assume there are no abnormally intelligent genetic super-beings among them for now), so everyone can expect to get 25%, on average (one question correct in four). But not everyone’s going to get exactly one in four. A few will have lousy luck that day and keep stumbling into all the wrong answers, and a few will probably do surprisingly well. Maybe a small handful will end up with scores around 50%, without even having to be psychic.

If you then took all the kids who’d excelled on the first test, and gave them a similar quiz on applied neurosurgery, they’d all generally do no better than chance. You might find one or two getting high scores in this one as well, depending on the size of your class, but even though they all scored highly to begin with, the averages in the second test will be, well, average. It’d be exactly the same if you took the lowest scorers from the first test.
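For anyone who'd rather see it than take my word for it, here is a minimal sketch of the two quizzes in Python. The numbers are assumptions of mine, not anything from a real classroom: 10,000 kids (a very large class), 20 questions per quiz, four options each, and every answer a blind guess. The names `guess_score` and `simulate_two_quizzes` are made up for the illustration.

```python
import random

def guess_score(rng, n_questions=20, n_options=4):
    """Number of correct answers when every answer is a blind guess."""
    # Treat option 0 as the right answer; each guess hits it 1 time in 4.
    return sum(rng.randrange(n_options) == 0 for _ in range(n_questions))

def simulate_two_quizzes(n_kids=10_000, n_questions=20, cutoff=0.5, seed=1):
    """Pick the kids who scored at least `cutoff` on quiz 1 and re-test them."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    first = [guess_score(rng, n_questions) for _ in range(n_kids)]
    second = [guess_score(rng, n_questions) for _ in range(n_kids)]
    top = [i for i, s in enumerate(first) if s >= cutoff * n_questions]
    mean_first = sum(first[i] for i in top) / len(top) / n_questions
    mean_second = sum(second[i] for i in top) / len(top) / n_questions
    return len(top), mean_first, mean_second

n_top, first_avg, second_avg = simulate_two_quizzes()
print(f"{n_top} kids scored 50% or more on the first quiz")
print(f"their average on quiz 1: {first_avg:.0%}, on quiz 2: {second_avg:.0%}")
```

The high scorers were selected for their luck on the first quiz, so their first average sits above 50% by construction. Nothing selects for luck on the second quiz, so their average there should come out close to 25%.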

And yet, just because most of them did worse on the second test, now is not the time for the careers counsellor to be steering these kids away from brain surgery and down the path of theoretical physics.

The kids who initially did well were just the luckiest of a random bunch. So, most likely, they’ll be less lucky next time, and their scores will be less impressive.

They’ll regress to the mean, is what I’m saying.

The mean average is where the scores pile up: with random guessing like this, the results nearest the mean are the most likely ones. The further you stray from the mean, the less likely you are to find something landing that far out. So, any kid who scored highly is much more likely to come closer to the 25% mean average next time.
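If you want to put a number on "less likely", the chance of each score can be worked out exactly, since pure guessing follows a binomial distribution. This is only a sketch under an assumption of mine: a quiz of 20 questions with four options each (the quiz length isn't specified above). The helper name `prob_at_least` is hypothetical.

```python
from math import comb

def prob_at_least(k, n=20, p=0.25):
    """Chance of getting k or more answers right out of n by guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Chance that a guessing kid reaches 50% (10 or more out of 20).
p_high = prob_at_least(10)
print(f"chance of 50% or better by luck alone: {p_high:.4f}")

# The same kid next time: chance of landing back below 50%.
print(f"chance of scoring under 50% next time: {1 - p_high:.4f}")
```

Under these assumptions the first figure comes out between 1% and 2%. So a kid who managed 50% once will almost certainly score lower on the next attempt, purely because high scores are rare.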

So, this matters why?

It matters because it’s an insidious way for errors of thinking to creep into your analysis. Or, if you prefer, it makes you get stuff wrong.

In most tests and games, there will be people who are just more talented or able than others, and they’ll tend to do better – but everyone will also have good or bad days, when they do better or worse than would normally be expected of them. It’s easy to get stuff wrong when you fail to distinguish between the effects of luck and skill. When a result regresses to the mean, as you should expect it to do by chance, you might attribute this change to a variation in skill, or in something else that’s been introduced between the two different results.

For instance, if you wanted to try out some new stupid pills you’ve invented on a selection of non-consenting minors (and who wouldn’t?), you might want to slip them to your high-scoring kids from the test, before they move on to the neurosurgery module. Most of them would do worse, and you might conclude that they have indeed got stupider. Actually, they’re just as average as they ever were, but their luck has evened out.

And speaking of stupid pills, a more realistic example can be found in the exciting world of alternative medicine. If you get a bad cold, chances are you’ll weather it for the first few days of minor sniffles. It’s only once you’re really suffering that you’ll head down to the local homeopath for some ~~water~~ totally-for-realz medicine. So, because you’re probably pretty close to being at your worst at the time when you’re treated, your only option is to start getting better (or die, but that’s a drastic step to take just to undermine my point). Whether or not the treatment has any effect, your health will improve after taking it, so it’s easy to leap to conclusions. You might mistake the regression effect for something more significant, and assume that correlation implies causation.

To be fair, this fallacy could make for an equally insidious trap in the case of actual medicine. The way to avoid tripping up, of course, is to run some kind of controlled test, so that the people getting the treatment are compared to some other people whose health is just regressing at a natural rate anyway. Then you have a chance of being able to tell whether the purported medicine actually did anything to change the natural course of events.
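Here is a rough sketch of why the control group matters, with all the numbers invented for illustration. Symptom severity is assumed to be a random draw on a made-up scale (mean 5, standard deviation 2), people only seek treatment when they score 8 or worse, and the remedy is assumed to do nothing at all. The function name `run_trial` is hypothetical, and real illnesses are obviously not this tidy.

```python
import random

def run_trial(n_people=100_000, threshold=8.0, seed=2):
    """Average improvement for treated vs control when the remedy is inert."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    improvements = {"treated": [], "control": []}
    for _ in range(n_people):
        before = rng.gauss(5.0, 2.0)  # severity today (made-up scale)
        if before < threshold:
            continue                  # not ill enough to seek treatment
        after = rng.gauss(5.0, 2.0)   # severity a week later; remedy has no effect
        group = rng.choice(["treated", "control"])  # random assignment
        improvements[group].append(before - after)
    return {name: sum(d) / len(d) for name, d in improvements.items()}

result = run_trial()
print(f"treated improved by {result['treated']:.2f} points on average")
print(f"control improved by {result['control']:.2f} points on average")
```

Looking at the treated group alone, the inert remedy appears to work, because everyone was enrolled at their worst. Only the comparison shows that the untreated group improved by about the same amount.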

Regression to the mean can also be useful in explaining certain apparent ‘jinx’ effects. Since I am a British person, I am not well acquainted with many popular transatlantic publications, but one of some notoriety goes by the name “Sports Illustrated”. One supposes it to contain depictions of various different athletic activities, and it will tend to emblazon on its cover the most remarkable, outstanding, successful competitor of the moment in one such arena.

It’s said to be jinxed, because after being featured on the cover these successful athletes tend to experience a “fall from grace”. But the reason they were there in the first place was because of how extraordinarily well they were doing at the time. From there, the only way is down. They were probably in the middle of a run of particularly good luck, and they’re unlikely to continue such good fortune for long, or achieve another attention-grabbing victory on the same scale anytime soon. They’ve not become any less good at what they do, and their overall luck probably hasn’t changed much. They were just caught in a snapshot at their peak, in a position that was never going to be sustainable.

There’s also an interesting connection to the problem of whether praise for good behaviour, or punishment for bad, is a better motivator. In some cases, it may seem that people do worse after you praise their achievements, and better after you punish them. But it’s likely that you’re praising someone for doing particularly well – immediately after which their luck is bound to even out, so their show of skill will seem less impressive next time. Similarly, if you shout at them for sucking, they might suck less next time – but they probably wouldn’t be having such a bad day next time anyway, so the decrease in suck is just a natural thing, and may even have been greater if you hadn’t been bitching at them. If you look at the more general trends of the motivating effects of praise or punishment, the results might be entirely different from how things intuitively seem.
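The praise-and-punishment illusion can be sketched with a toy model, under assumptions that are entirely mine: each person has a fixed skill, each performance is that skill plus an independent dose of luck, and neither praise nor shouting changes anything whatsoever. The function name `praise_and_scold` is made up for the example.

```python
import random

def praise_and_scold(n=20_000, seed=3):
    """Average change in performance after praising the best and scolding the worst."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    skill = [rng.gauss(0, 1) for _ in range(n)]      # fixed ability per person
    first = [s + rng.gauss(0, 1) for s in skill]     # performance = skill + luck
    second = [s + rng.gauss(0, 1) for s in skill]    # same skill, fresh luck
    order = sorted(range(n), key=lambda i: first[i])
    tenth = n // 10
    scolded, praised = order[:tenth], order[-tenth:]  # worst and best 10% on attempt 1

    def avg_change(group):
        return sum(second[i] - first[i] for i in group) / len(group)

    return avg_change(praised), avg_change(scolded)

praised_change, scolded_change = praise_and_scold()
print(f"praised group changed by {praised_change:+.2f} on average")
print(f"scolded group changed by {scolded_change:+.2f} on average")
```

The praised group gets worse and the scolded group gets better, even though the feedback has no effect in this model. The skill part of each score carries over to the second attempt, but the luck part does not.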

So. Avoid false reasoning by taking note of regression. It’s to be expected that things will start getting better (or worse) when they’re at an extreme, particularly when luck plays a significant role compared with skill or natural tendencies. I probably could have skipped ahead to this summary a bit sooner, couldn’t I? Ah well.

Posted in science, skepticism, skeptictionary | Tagged alternative medicine, logic, reason, statistics | 1 Comment

One Response

Michael K Gray, on August 9, 2009 at 4:21 am

When explaining ‘regression to the mean’ I find that turning the Gaussian distribution curve upside-down really helps understanding in non-mathematical folk.
It is obvious that gravity makes stuff move to the middle, no matter where it starts.
(I get them to imagine a blob of honey as the object of interest)


Cubik's Rube

Regular blogging on atheism, skepticism, genderism, journalism, anarchism, politics, and other stuff that infuriates or inspires me.