Posts Tagged ‘statistics’

So over on Tumblr (yes, that’s still a thing that’s happening), ozymandias271 explained what “condoms are 98% effective” actually means in a recent post and it’s kinda made my brain explode.

I’ve been hearing that statistic (or other similar ones) for ages, and never concerned myself with it too closely. Given how little casual (or any kind of) sex I’ve generally been having, it wasn’t of much personal importance, and while I advocate strongly for comprehensive sex and relationships education, it should definitely be someone better-informed than me doing it. But I knew enough to know that condoms are good, and knowing how they work is also good, and if I needed more detailed data than that I’d surely be able to do the research.

But 98% always seemed oddly low. I wasn’t sure how much it was affected by issues like compliance or user error – is that remaining 2% at least partly explained by people just applying them wrong? – but taken on its face that’s actually quite a high-sounding failure rate. Do you really only have to have fifty sexual encounters involving a condom before you’ve statistically had one for which it might as well not have been there and you’re facing all the risks of unprotected sex? Given how much sex straight people on TV seem to be having, this makes it sound like unplanned pregnancies due to contraceptive ineffectiveness would be cropping up pretty regularly, and just something to be accepted as par for the course.

Anyway it turns out that’s totally not what “98% effective” means. Taking the outcome of unplanned pregnancy specifically, here’s how one website describes the effectiveness of condoms:

In one year, only two of every 100 couples who use condoms consistently and correctly will experience an unintended pregnancy—two pregnancies arising from an estimated 8,300 acts of sexual intercourse, for a 0.02 percent per-condom pregnancy rate.

98% effective doesn’t mean a condom is only doing its job in 98% of sexual encounters. It means that 98% of people using condoms for a year will avoid unplanned pregnancies in that year.

Or, assuming you’re using them correctly and having sex about as often as these statisticians imagine, the length of time the average person would have to keep having regular safe sex before encountering a condom failure isn’t fifty sexual encounters, but fifty years.
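For anyone who wants to check the arithmetic, here's a rough sketch in Python. It uses only the figures quoted above, and it assumes (as a simplification of my own) that every year carries the same independent 2% risk.

```python
# Figures from the quote above: 2 pregnancies per 100 couples per year,
# arising from an estimated 8,300 acts of intercourse.
PREGNANCIES = 2
ACTS = 8300
ANNUAL_FAILURE = 2 / 100  # "98% effective", read as a per-year figure

# Chance of pregnancy per individual act, per the quoted estimate.
per_act_rate = PREGNANCIES / ACTS


def chance_of_at_least_one_failure(years, annual=ANNUAL_FAILURE):
    """Probability of one or more failures over `years`.

    Assumption: each year is independent and carries the same risk.
    """
    return 1 - (1 - annual) ** years


# Average wait until the first failure, under the same assumption.
expected_years_to_first_failure = 1 / ANNUAL_FAILURE

print(f"per-act rate: {per_act_rate:.4%}")
print(f"average wait: {expected_years_to_first_failure:.0f} years")
for years in (1, 10, 50):
    print(years, f"{chance_of_at_least_one_failure(years):.1%}")
```

Fifty years is the average wait, not a guarantee: under these assumptions roughly 64% of couples would still see at least one failure somewhere in a fifty-year stretch.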

I have been massively misunderstanding this for YEARS because of what seems like REALLY UNCLEAR COMMUNICATION AND UNHELPFULLY OBSCURE PHRASING, GUYS. Seriously, I can’t be the only one who finds that a totally counter-intuitive interpretation of the “98% effective” line. Did everybody but me already have this figured out? I mean, it’s less important that I understand this than almost anyone else, but still.

Read Full Post »

A proposed law in North Carolina would restrict scientists and limit how much science they’re actually allowed to use when doing science.

In case that’s a bit vague for you, here’s a quote from the bill being considered, describing the ways in which they’d be permitted to examine and describe the rates at which the sea level is rising:

These rates shall only be determined using historical data, and these data shall be limited to the time period following the year 1900. Rates of seas-level rise may be extrapolated linearly…

Lawmakers seem to share the concerns of Tom Thompson, a spokesman from a local economic development group, who worries about science being done by “nothing but computers and speculation”.

Science which depends on such arcane and incomprehensible techno-wizardry as “computers” is, of course, well known to be less reliable than simply declaring the world to be how you want it and assuming everything will work out for the best.

And the restrictions due to be placed on scientists make perfect sense. Just like how, sometimes, it’s better for everyone if you insist that the defendant in a criminal trial enter a plea without resorting to use of the word “not”. It’s still perfectly fair on them; it just assures that reality lines up neatly with your own desired outcome.

Perhaps canny state legislators noticed how Springfield was never threatened with destruction by a comet again after its residents burned down the observatory.

Also, if NASA had had to assume that gravity decreases linearly as you move away from the Earth, instead of making things all complicated, maybe we would have reached the Moon a lot sooner. I guess we’ll never know.

Anyway, in a spirit of true North Carolinian enquiry, I’ve done a bit of my own research into other trends that can be foreseen, using the same conditions as these oceanographers will be working under, and I’ve discovered some fascinating facts about the world of the future.

Here are just a few examples.

In 1920, the men’s 100m sprint world record was 10.6 seconds. As of 2009, it now stands at just under 9.6 seconds. Having improved by a whole second in just under 90 years, it can be linearly extrapolated that by the year 2873, men will be able to run the 100m instantaneously.

At the turn of the next millennium, they’ll be crossing the finish line a second and a half before the starting pistol is fired.
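If you'd like to reproduce this cutting-edge methodology yourself, here's a minimal sketch. It assumes the rounded figures used above (9.6 seconds in 2009, improving by one second every 90 years); the function names are my own invention.

```python
def predict(year, year0, value0, rate_per_year):
    """Straight-line prediction of a value in `year`."""
    return value0 + rate_per_year * (year - year0)


def year_reaching(target, year0, value0, rate_per_year):
    """Year in which the straight line hits `target`."""
    return year0 + (target - value0) / rate_per_year


# Rounded figures from the text: one second faster every 90 years.
RATE = -1.0 / 90  # seconds per year

zero_year = year_reaching(0.0, 2009, 9.6, RATE)
time_in_3000 = predict(3000, 2009, 9.6, RATE)

print(f"100m run in zero seconds: {zero_year:.0f}")
print(f"100m time in the year 3000: {time_in_3000:.2f} seconds")
```

Nothing in the code objects to a negative finishing time, which is rather the point: a straight line will happily carry on past anything physically possible.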

Oddly enough, running a marathon in no time flat will be achieved by the first man in 2244, even while a much shorter dash still takes several whole seconds. Meanwhile, women will be starting the marathon more than an hour after they’ve already finished it.

In 1900, the tallest building in the world was the Eiffel Tower, at 300m. This has since been surpassed by the Chrysler Building, the Empire State Building, and numerous others. The current record-holder is a ridiculous half-mile skyscraper in Dubai. Very approximately, then, we seem to be extending an extra 500m skyward each century.

By 2100 the tallest structure will be 1300m high. By around 2160, the toppermost of top floors will be a mile off the ground. That’ll double in a further 320 years. I’m not sure what’s going to motivate us to keep building up and up and up like this, how we’ll keep these things structurally sound against high winds and earthquakes, and whether low temperatures will become problematic as we start nearing the edge of the troposphere – but hey, I’m just extrapolating linearly from the available data.

Alarmingly, if we follow the same trend back in time, then we discover that the only things constructed before the year 1840 were basements and cellars. How this can be squared with the discovery of, say, the Pyramids, I’m not clear – but we’re only using historical data from the past century, so we’re a bit stuck.

But we’re barely scratching the surface of what this new form of science can tell us. For instance: the improvements in infant mortality over the past few decades can only be seen as wonderfully encouraging, but they also produce perhaps the most startling future predictions. Since 1950, the UK’s infant mortality rate has gone from 29 deaths (per 1,000 live births) to 5. That means we’re saving about one more child, out of every thousand, every two-and-a-bit years.

This leads us inexorably to the conclusion that, by the year 2030, for every 1,000 children born in this country, 1,002 of them will survive.

I’m sure I don’t need to explain to you the catastrophic effect this will have on population scientists’ spreadsheets.

Forget whether North Carolina’s going to have any coastline left in a hundred years. Clearly the world has bigger problems on its hands.

Read Full Post »

Ben Goldacre’s got a fab example of misleading statistics, and the ways in which you can learn to think about things to avoid jumping to a wrong conclusion.

Look at his first nerdy table of data on that article. All they’ve done is take a bunch of people who drink alcohol, and a bunch who don’t, and counted how many from each group ended up with lung cancer. It turns out that the drinkers are more likely to get lung cancer than the non-drinkers.

The obvious conclusion – and (spoiler alert) the wrong one – is that drinking alcohol somehow puts you at greater risk of developing lung cancer. You might conclude, from that table, that if you currently drink alcohol, you can reduce your risk of developing cancer by no longer drinking alcohol, thus moving yourself to the safer “non-drinkers” group.

This is actually a fine example of the Bad Science mantra, and Ben makes an important point which many non-nerds might not naturally appreciate about statistics: the need to control for other variables.

If drinking doesn’t give you cancer, then why do drinkers get more cancer? The other two tables offer a beautiful explanation. Of all the drinkers and non-drinkers originally counted, try asking them another question: whether or not they smoke cigarettes. What you get when you do that is the next two tables.

If you just look at the smokers, then the chances of a drinker and a non-drinker getting lung cancer are almost exactly the same. If you look only at the non-smokers, ditto. In other words, once you know whether someone smokes cigarettes, whether or not they drink makes no difference to their odds of getting lung cancer.

Which is a long way away from the obvious conclusion we were tempted to draw from the first set of data.

What we did here was to control for another variable – namely smoking – before drawing sweeping conclusions from the data. When we give smokers and non-smokers their own separate tables, it means that smoking cigarettes isn’t unfairly weighting the data we’ve already got any more. It becomes clear that drinkers aren’t simply more likely to get cancer; they’re more likely to be smokers.
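Here's a small sketch of the same idea in code. The counts are numbers I've invented purely for illustration (they are not the figures from Ben's tables), chosen so that smoking alone drives the cancer rate.

```python
# counts[(drinks, smokes)] = (people, lung_cancer_cases)
# Invented, illustrative numbers; not the data from the article.
counts = {
    (True, True): (500, 150),
    (True, False): (500, 15),
    (False, True): (100, 30),
    (False, False): (900, 27),
}


def cancer_rate(drinks=None, smokes=None):
    """Cancer rate in the subgroup matching the given filters.

    Passing None for a filter means "don't care", i.e. pool both groups.
    """
    people = cases = 0
    for (d, s), (n, c) in counts.items():
        if drinks is not None and d != drinks:
            continue
        if smokes is not None and s != smokes:
            continue
        people += n
        cases += c
    return cases / people


# Pooled: drinkers look much worse off than non-drinkers.
print("drinkers:", cancer_rate(drinks=True))
print("non-drinkers:", cancer_rate(drinks=False))

# Controlled for smoking: drinking makes no difference in either table.
for smokes in (True, False):
    print("smokes =", smokes,
          cancer_rate(drinks=True, smokes=smokes),
          cancer_rate(drinks=False, smokes=smokes))
```

In this made-up data the pooled gap appears only because half the drinkers smoke while just a tenth of the non-drinkers do.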

And although Ben’s right to point out the importance of controlling for other variables like this, what interests me is the reminder of the importance of Bayesian probability.

In particular, the thing to remember is that the probability of an event is a measure of your uncertainty, and not something inherent in the event itself.

For instance, if that first table is all the data you have, then all you know is that drinkers are more at risk of cancer than non-drinkers. If you were to estimate somebody’s odds of getting lung cancer, and the only thing you knew about them is that they’re a drinker, the best you could do is to place it at 16% – the proportion of drinkers who developed lung cancer in the study.

If you later acquire the extra data in the second tables, and find out that the individual you’re interested in is not a smoker, then suddenly you can re-adjust your estimate, and give them about a 3% chance of getting lung cancer. They haven’t done anything differently; nothing about their situation has changed for them to suddenly appear much more healthy. You’ve just learned more about them.

And it’s still not true that their odds of developing cancer are exactly 3% in any objective sense. Maybe tomorrow you’ll learn something about their age, or gender, or family history, and adjust your estimate again based on the new data. Maybe you don’t know that a doctor actually diagnosed them with lung cancer yesterday. This, obviously, makes a huge difference to their odds of having lung cancer – but it doesn’t change the fact that they’re in a low-risk group, and a 3% estimate is the best you can do based on your current knowledge.

In conclusion: stats are hard, listen to maths geeks (or become one yourself) before panicking about the latest tabloid health scare.

Read Full Post »

You know, I’m a bit uncomfortable with some of this discussion about children being sold as sex slaves.

I know, I know. Me and my crazy hang-ups.

This post on Bound, not Gagged, a blog for sex workers, is worth reading. It provides some context to some of the hyperbole around the issue of child sex trafficking in America.

Because yes, even around something as serious and terrible as child sex trafficking, hyperbole is still possible.

A number of celebrities have recently appeared in short filmed segments as part of a big campaign against this scourge, which has cited a figure of 100,000 to 300,000 for the number of children currently involved in sex trafficking in America. That’s an utterly horrifying idea, and may have motivated some people into some sort of action… but it’s also completely inaccurate.

If you look at those numbers and where they came from, it turns out that this is really nothing more than a guess, not backed up by any particularly rigorous science, as to how many children might potentially be at risk of some sort of abuse, sexual or otherwise.

It takes a monumental and seemingly deliberate misinterpretation of the data to start touting this as the number of children currently involved in the sex trafficking industry.

The author of the post refers to “fetishists” of child sex trafficking – meaning not those vile criminals directly involved in the activity, but those with a tendency to become zealous in their righteous campaigning against it. And it may not be an inappropriate word. It’s a subject which stirs some understandably strong emotions, and there can be a tendency to start assuming the worst, believing every half-credible factoid that comes your way which confirms the worst, and riding a wave of well-meaning indignation for as long as there’s enough (mis)information to fuel it.

A consultant involved in the campaign is quoted as saying:

I don’t frankly care if the number is 200,000, 500,000, or a million, or 100,000 — it needs to be addressed.

Which doesn’t do much to diminish the validity of the “fetishist” label. I can’t help thinking you really should care about numbers. There are more things we can do with numbers than point to how small they are and dismiss the problem, as some campaigners seem to fear is all that will happen. Numbers should also have an impact on how we craft our response. If we thought there were a thousand children in sex trafficking in the US, we’d deal with it differently than if we thought there were a million.

And if you think people will only respond with enough concern to a thousand kids in sex slavery if they’re made to think there’s actually a million… well, you’re not giving your fellow humans much credit.

Perhaps part of the objection is that this kind of fact-checking downplays and dismisses the enormity of the crime in question. But if you think that the actual numbers of children suffering sexual abuse, which might be in the thousands rather than hundreds of thousands, are something that people will choose to dismiss or ignore, then you’re doing the other members of your species a rather condescending disservice. We get that it’s still horrible and deserves a response when it’s not exaggerated by a factor of a hundred.

“There are over a hundred thousand child sex slaves in this country!”
“Actually, it’s probably on the order of a thousand.”
“Why are you trying to make it sound like this isn’t an important issue?”
“Not important? Dude, there’s a thousand kids out there in sex trafficking, that doesn’t sound important to you?”

Of course, I’m veering a little close to straw-mannery here, or at least to being uncharitable. Most of the celebs involved were no doubt simply asked if they’d mind giving a little of their time to capitalise on their fame for what is unquestionably a good cause. It’d be a bit harsh to start blaming Justin Timberlake for not looking closely enough at the statistics.

And even the people claiming not to care about numbers are surely well motivated, even if they sometimes let reality get a little blurred in the face of their need to be seen to be acting nobly.

But the issue of truth is not one to be easily discarded. And if addressing something accurately doesn’t also allow us to address it better… Well, then, I just don’t know where we are.

Read Full Post »

Don’t run away.

This post is going to be about maths and probability.

There was a scientific paper recently published, in a respected academic journal, which purported to demonstrate evidence of human precognition.

Yep, science says people can tell the future.

Except, not really. Not yet, anyway. As the study’s author, psychology researcher Daryl Bem, said himself in the published paper, it was important for other scientists to repeat the experiment, and see if they got the same results. Richard Wiseman has been among those involved in such attempted replications, which so far have failed to support Bem’s original conclusion.

There’s a big moan I’m not quite in the mood to make, about how science generally gets publicised in the media, and the tabloids’ tendency to make a massive fuss over preliminary results, without concerning themselves with facts which later emerge and completely undermine their sensationalist headlines.

But I want to talk about the maths.

Replication is always important in science, particularly where the results look unlikely, or demonstrate something completely new. This is partly because, for all we know, Bem’s original research could have been dishonest or deeply flawed. Most people seem to consider both of these unlikely, though, and I’m certainly not suggesting that he’s faked his results.

But people often seem to assume that these are the only two options: that positive results must mean either an important and revolutionary breakthrough, or very bad science. The idea that something could just happen “by chance” now and then never seems to get much credibility.

Almost every time someone in a TV show or a movie proclaims something to be “just a coincidence”, or that there’s a “perfectly rational explanation”, we’re meant to take it as an ultra-rationalist denial of the obvious – usually supernatural – facts. Remarkable coincidences just don’t happen in the way that ghosts and werewolves obviously do. In fictional drama, there are good reasons for this. In the real world, this is a severe misunderstanding of probability.

When deciding whether or not to get excited about a result, scientists often look for significance “at the 5% level”. Bem’s results, supporting his precognition hypothesis, were significant at this level. But this does not mean, as you might think, that there’s only a 5% chance of the hypothesis being wrong.

What it means is: if people aren’t really psychic, there would be at most a 5% chance of getting results at least this good, just by chance.

So, getting results like this – statistically significant at the 5% level – is actually slightly less impressive than rolling a double-six. (If you have two regular six-sided dice, the odds of both landing on 6 on a single roll are 1 in 36, which is slightly less than 3%.)

I’ve rolled plenty of double-sixes. If you’ve rolled a lot of dice, so have you. And if you do a lot of science, you’d expect just as many random chance results to look significant.

So, if you’re thinking that we should probably ask for something a bit more conclusive than a double-six roll before accepting hitherto unconfirmed magic powers, you’re probably right.

This is the essence of Bayesian probability. Imagine having one of the following two conversations with a friend who has two dice:

“These are loaded dice, weighted to always land on a double-six. Watch.”
“Huh, so they are. Neat.”

“I’m going to use my psychic powers to make these dice land on double-six. Watch.”
“…Okay, that’s a little spooky, but you could’ve just got lucky. Do it again.”

You see why you might not believe it right away when your friend claims something really outlandish? But when it was something pretty normal, you’d be more likely to buy it?

In either case, the odds of rolling sixes by chance were exactly the same, 1 in 36, independent of what was allegedly influencing the outcome. But that doesn’t mean you should be equally convinced in either case when the same result comes up.

Both claims become more likely when the double-six is thrown. After all, if the dice really are loaded (or psychically influenced), then what you’ve just seen is exactly what you’d expect to see. But they’re not both getting more likely from the same starting point. One started out as a much more plausible claim than the other, and it’s still more plausible now.

Loaded dice? Sure, they have those. Telekinesis? Well, you have my attention, but let’s see you do it again. And again. And a dozen more times with a fresh set of dice.
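To put some (entirely made-up) numbers on that, here's a sketch of the update using Bayes' rule. The two starting probabilities are my own guesses, picked only to show the shape of the argument, and the model only pits each claim against fair dice and pure luck, so it ignores the option that your friend is cheating.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule: belief in a claim after seeing one piece of evidence."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)


P_DOUBLE_SIX_BY_CHANCE = 1 / 36

# Assumed starting beliefs, invented for illustration only.
priors = {"loaded dice": 0.01, "psychic powers": 1e-12}


def belief_after(prior, double_sixes):
    """Belief in the claim after a run of consecutive double-sixes.

    Assumes the claim, if true, guarantees a double-six every time.
    """
    belief = prior
    for _ in range(double_sixes):
        belief = update(belief, 1.0, P_DOUBLE_SIX_BY_CHANCE)
    return belief


for claim, prior in priors.items():
    for rolls in (1, 2, 14):
        print(claim, rolls, belief_after(prior, rolls))
```

One double-six makes each claim about 36 times more plausible than it was before, but 36 times a one-in-a-trillion hunch is still next to nothing. It takes a long unbroken run before the outlandish claim catches up, which is why "do it again" is the right response.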

This is part of my recurring, occasional project to convince the world that Bayesian probability is both important and intuitive, when it’s expressed right.

Ben Goldacre wrote about Bem’s research, the New Scientist also discussed it, there are some details of the replication attempts at The Psychologist, and I was prodded into thinking about all this in some more depth by a recent episode of the Righteous Indignation podcast.

Read Full Post »

Yay, another maths lecture!

Click through to see the whole cartoon at XKCD. Really do it. It’s important. Especially if you want the rest of my burblings to make sense.

So. It’s partly funny because it satirises the sensationalism of tabloid news, and the urge to cram as much excitement into a headline as possible only to leave a sober assessment of actual facts to the blogosphere. But it actually addresses a much more common problem with our understanding of probability.

Most people who pay much attention to any kind of sciencey talk are probably familiar with the p-values referenced in the comic. When scientists are testing a hypothesis, they’ll often check whether the p-value (p for probability) of the results from their experiments is less than 5%. The smaller the p-value is, the harder it is to put their results purely down to chance.

However, the p-value kinda means the exact reverse of what a lot of people assume it means.

When scientists talk about results being “significant at the 5% level”, say, it sounds like this means there’s a 95% chance of a real connection. In this cartoon’s case, it sounds like the scientists are 95% certain of a link between green jelly beans and acne.

Applicants for James Randi’s million dollar challenge are required to meet rather more stringent criteria, but it’s often expressed the same way. For instance, a dowser might have to psychically deduce which of several sealed containers is the one with water in, and repeat it a number of times, so that the p-value becomes very small. They want to be certain there’s really something going on, and it’s not just chance, before the money will be handed over.

But the intuitive idea of what the p-value means in these cases isn’t quite right.

Here’s what you actually need to do. Assume that there is no connection between the things being tested – jelly beans don’t affect acne, and all psychics are just guessing. Then, what are the odds of getting results at least as persuasive as the ones you saw, purely by chance?

That’s your p-value.
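As a rough sketch of that recipe, here's how you might work out a p-value for a dowsing test. The set-up (ten containers, one with water, ten attempts, three hits) is a hypothetical of mine, not the actual protocol of any real challenge.

```python
from math import comb


def p_value_at_least(hits, trials, p_chance):
    """Chance of `hits` or more successes in `trials` by guessing alone.

    This is the upper tail of a binomial distribution, where each guess
    independently succeeds with probability `p_chance`.
    """
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(hits, trials + 1)
    )


# Hypothetical test: 1 container in 10 holds water, so a pure guesser
# is right 10% of the time. The dowser gets 3 right out of 10 attempts.
p = p_value_at_least(hits=3, trials=10, p_chance=0.1)
print(f"p-value for 3 or more hits out of 10: {p:.3f}")
```

Three hits sounds good when you'd expect only one, but a guesser would manage it about 7% of the time, so it wouldn't even pass the usual 5% test.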

So, a p-value of 5% tells us something useful. It means that the results you’ve got are kinda iffy, given what you’d usually expect, if there’s no deeper underlying pattern there. You’d only expect to see results this skewed about 1 time in 20, if you’re relying on randomness. So maybe something’s up.

But if you do a whole bunch of tests, like the jelly bean scientists did, once in a while you will get some iffy results like that just by chance.
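How often is "once in a while"? Here's a quick sketch, assuming the tests are independent of each other and that nothing real is going on in any of them. The figure of 20 tests is just an illustrative choice.

```python
def chance_of_false_alarm(num_tests, alpha=0.05):
    """Chance that at least one of `num_tests` looks significant by luck.

    Assumes independent tests with no real effect in any of them.
    """
    return 1 - (1 - alpha) ** num_tests


def expected_false_alarms(num_tests, alpha=0.05):
    """Average number of tests passing the threshold by luck alone."""
    return num_tests * alpha


for n in (1, 5, 20):
    print(n, f"{chance_of_false_alarm(n):.1%}", expected_false_alarms(n))
```

With 20 tests at the 5% level, you'd expect about one fluke on average, and the chance of getting at least one is around 64%.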

Now, clearly one thing this tells us is to be wary of data which has been cherry-picked, as the jelly bean journalists’ was. There were lots of negative results being ignored, and a single positive outcome highlighted. But the implications for how we assess probabilities more generally are, I think, more interesting.

In particular, it tells us that how likely something is doesn’t just depend on this one set of results. If a 5% p-value means “we’re 95% sure of this”, then this one study has entirely determined your estimate of the likelihood. It fails to take on board any information about how likely or unlikely something seemed before you started – and often this information is really important.

For instance, say you were studying differences between smokers and non-smokers, and the rate at which they get cancer. Any good analysis of data along these lines should easily pass a 5% significance test. It’s a highly plausible link, given what we already know, and 95% sounds like a significant under-estimate of the likelihood of a correlation between smoking and cancer.

But now imagine you’ve done a different test. This time, you just put a bunch of people into two groups, with no information about whether they smoke, or anything else about them, and flipped a coin to decide which group each person would go into. And imagine you get the same, seemingly convincing results as the smoking study.

Are you now 95% convinced that your coin-tossing is either diagnosing or causing cancer in people you’ve never met?

I hope you’re not. I hope you’d check your methodology, look for sources of bias or other things that might have crept in and somehow screwed up your data, and ultimately put it down to a bizarre fluke.

And it makes sense to do that, in this case, even despite the data. The idea that you could accurately sort people by cancer risk simply by flipping a coin is utterly ridiculous. We’d give it virtually zero probability to begin with. The results of your study would nudge that estimate up a little, but not much. Random fluke is still far more likely. If multiple sources kept repeating the experiment and getting the same persuasive results, over and over… then maybe, eventually, the odds would shift so far that your magic coin actually became believable. But they probably won’t.

And this idea of shifting the probability of something, rather than fixing it firmly based on a single outcome, is at the heart of Bayesian probability.

This is something the great Eliezer Yudkowsky is passionate about, and I’m totally with him. That link’s worth a read, though someday I’d like to try and write a similar, even more gently accessible explanation of these ideas for the mathematically un-inclined. He does a great job, but the arithmetic starts to get a bit overwhelming at times.

And if the thrill of counter-intuitive mathematics isn’t enough to convince you that this is fascinating and important stuff, read this. And then this.

Short version: a number of women have been convicted and jailed for murdering their children, then later released when somebody actually did some better statistics.

The expert witness for the prosecution in these trials estimated that the odds of two children in the same family both dying of cot death were 1 in 73,000,000. General population data puts the overall rate of cot deaths at around 1 in 8,500, so multiplying the 8,500s together, as if the two deaths were entirely independent events, gives the 1 in 73,000,000 figure for the chance of it happening twice. This was presented as the probability that the children could have died by accident, and thus it was assumed to be overwhelmingly likely that they were in fact deliberately killed.

But, as we learned with the cancer stuff earlier, we should consider these substantial odds against our prior assessment of how likely it is that these women would murder their children. This should start off minuscule, because very few women do murder their children. The fact that both their children died should make us adjust our likelihood estimate up a way – someone with two dead children is a more likely candidate for a child murderer than someone whose offspring are alive and well, after all – but it’s still far from conclusive.

Another way of expressing the central point of Bayesian probability is to consider the probability of A given B, for two events A and B. In this case, the odds of two children randomly picked from the population both dying of cot death may well be around 1 in 73,000,000 – but given that the children you’re considering both died in infancy, and were both siblings and so might have genetic or environmental factors in common, the cot death scenario becomes far more likely.
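Here's a sketch of why this matters so much. Only the 1 in 8,500 figure comes from the text above; the chance of a second cot death given a first, and the chance of a double murder, are numbers I've invented to illustrate the reasoning, not real estimates. It also treats cot death and murder as the only two possible explanations.

```python
from fractions import Fraction

P_ONE_COT_DEATH = Fraction(1, 8500)  # rounded figure from the text

# Multiplying as if the deaths were independent gives 1 in 72,250,000,
# close to the figure quoted in court.
p_two_if_independent = P_ONE_COT_DEATH * P_ONE_COT_DEATH

# ASSUMPTION (invented): shared genes and environment make a second
# cot death much more likely once a first has happened.
P_SECOND_GIVEN_FIRST = Fraction(1, 100)
p_two_with_dependence = P_ONE_COT_DEATH * P_SECOND_GIVEN_FIRST

# ASSUMPTION (invented): prior chance of a parent murdering two children.
P_DOUBLE_MURDER = Fraction(1, 100_000_000)


def chance_it_was_cot_death(p_cot, p_murder=P_DOUBLE_MURDER):
    """Probability of cot death, given that both children died.

    Treats cot death and murder as the only two explanations.
    """
    return p_cot / (p_cot + p_murder)


print(float(chance_it_was_cot_death(p_two_if_independent)))
print(float(chance_it_was_cot_death(p_two_with_dependence)))
```

Even with the dubious independence assumption, the rare innocent explanation has to be weighed against an equally rare guilty one, and with these made-up numbers it still comes out ahead. Allow for siblings sharing risk factors and it becomes overwhelmingly the likelier of the two.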

I wanted to expand on that last point some more, and touch on some other interesting things, but I’m hungry and you’re bored.

Ha. I said “briefly”. Classic.

Read Full Post »

Damned lies

I have no idea what to think about polls like this.

41% of Americans questioned believe that the second coming of Jesus “probably/definitely will happen” in the next forty years. Is that more or less than I should have expected? The proportion goes up among evangelicals, predictably enough, and is relatively low among “mainline” white Protestants – but even then, it’s over 25%. More than one person in every four who think Jesus is going to come back within my lifetime.

In fact, given that a recent Gallup poll had only 78% of Americans identifying as Christian, this implies that more than half of American Christians are expecting the second coming of Christ really, really soon.

Perhaps most baffling is the 1 in 5 “religiously unaffiliated” who share this belief. I must be failing to account for some significant number of non-Protestant, non-Catholic Christians, because how do you believe in the imminent second coming of Christ while not even being a Christian? Are that mysterious 20% all just big fans of zombie movies, who think that a rabbi from two thousand years ago will be among the dead walking the earth and hungry for brains?

But in the same poll, 65% thought that “religion in the United States will be about as important as it is now in 40 years”. 30% say it will be less important.

So, I suppose some people might think they’re seeing an increased secularisation in America today, and predict that this will continue (though I imagine most of them would consider this a bad thing). But I’m surprised there aren’t more people thinking it would be more important. (Was that even an option in the answers?)

In particular, taking into account the 41% figure from earlier, a lot of people apparently think that religion isn’t going to be any more important a factor in American life than it is now, even though Jesus will have come back.

Maybe he’s not planning to make much of a fuss. I haven’t read the book of Revelation, but I should think that the second coming of the son of God is expected to be a fairly low-key affair that won’t shake people’s lives up all that much.

Aren’t statistics fun?

(h/t Atheist Revolution)

Read Full Post »
