Posts Tagged ‘mathematics’

Ben Goldacre’s got a fab example of misleading statistics, and the ways in which you can learn to think about things to avoid jumping to a wrong conclusion.

Look at his first nerdy table of data on that article. All they’ve done is take a bunch of people who drink alcohol, and a bunch who don’t, and counted how many from each group ended up with lung cancer. It turns out that the drinkers are more likely to get lung cancer than the non-drinkers.

The obvious conclusion – and (spoiler alert) the wrong one – is that drinking alcohol somehow puts you at greater risk of developing lung cancer. You might conclude, from that table, that if you currently drink alcohol, you can reduce your risk of developing cancer by no longer drinking alcohol, thus moving yourself to the safer “non-drinkers” group.

This is actually a fine example of the Bad Science mantra, and Ben makes an important point which many non-nerds might not naturally appreciate about statistics: the need to control for other variables.

If drinking doesn’t give you cancer, then why do drinkers get more cancer? The other two tables offer a beautiful explanation. Of all the drinkers and non-drinkers originally counted, try asking them another question: whether or not they smoke cigarettes. What you get when you do that is the next two tables.

If you just look at the smokers, then the chances of a drinker and a non-drinker getting lung cancer are almost exactly the same. If you look only at the non-smokers, ditto. In other words, once you know whether someone smokes cigarettes, whether or not they drink makes no difference to their odds of getting lung cancer.

Which is a long way away from the obvious conclusion we were tempted to draw from the first set of data.

What we did here was to control for another variable – namely smoking – before drawing sweeping conclusions from the data. When we give smokers and non-smokers their own separate tables, it means that smoking cigarettes is no longer unfairly skewing the data we’ve already got. It becomes clear that drinkers aren’t simply more likely to get cancer; they’re more likely to be smokers.
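If it helps to see the arithmetic, here's a minimal sketch of the same trick. The counts are ones I've made up to show the effect (they are not the figures from Ben's tables), and the function names are just for illustration.

```python
# Made-up counts, chosen only to illustrate the effect. They are NOT the
# figures from Ben's tables.
# (drinking, smoking) -> (people who got lung cancer, people in the group)
counts = {
    ("drinker", "smoker"): (299, 1300),
    ("drinker", "non-smoker"): (21, 700),
    ("non-drinker", "smoker"): (69, 300),
    ("non-drinker", "non-smoker"): (51, 1700),
}


def overall_rate(drinking):
    """Cancer rate for one drinking group, ignoring smoking altogether."""
    groups = [group for (d, _), group in counts.items() if d == drinking]
    cases = sum(c for c, _ in groups)
    people = sum(n for _, n in groups)
    return cases / people


def stratified_rate(drinking, smoking):
    """Cancer rate for one drinking group within one smoking group."""
    cases, people = counts[(drinking, smoking)]
    return cases / people


for drinking in ("drinker", "non-drinker"):
    print(f"{drinking}, ignoring smoking: {overall_rate(drinking):.0%}")

for smoking in ("smoker", "non-smoker"):
    for drinking in ("drinker", "non-drinker"):
        rate = stratified_rate(drinking, smoking)
        print(f"{smoking}, {drinking}: {rate:.0%}")
```

With these invented numbers, drinkers look well over twice as risky as non-drinkers until you split by smoking, at which point the gap vanishes. The only difference between the two groups is how many of them smoke.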

And although Ben’s right to point out the importance of controlling for other variables like this, what interests me is the reminder of the importance of Bayesian probability.

In particular, the thing to remember is that the probability of an event is a measure of your uncertainty, and not something inherent in the event itself.

For instance, if that first table is all the data you have, then all you know is that drinkers are more at risk of cancer than non-drinkers. If you were to estimate somebody’s odds of getting lung cancer, and the only thing you knew about them is that they’re a drinker, the best you could do is to place it at 16% – the proportion of drinkers who developed lung cancer in the study.

If you later acquire the extra data in the second set of tables, and find out that the individual you’re interested in is not a smoker, then suddenly you can re-adjust your estimate, and give them about a 3% chance of getting lung cancer. They haven’t done anything differently; nothing about their situation has changed for them to suddenly appear much more healthy. You’ve just learned more about them.

And it’s still not true that their odds of developing cancer are exactly 3% in any objective sense. Maybe tomorrow you’ll learn something about their age, or gender, or family history, and adjust your estimate again based on the new data. Maybe you don’t know that a doctor actually diagnosed them with lung cancer yesterday. This, obviously, makes a huge difference to their odds of having lung cancer – but it doesn’t change the fact that they’re in a low-risk group, and a 3% estimate is the best you can do based on your current knowledge.

In conclusion: stats are hard, listen to maths geeks (or become one yourself) before panicking about the latest tabloid healthscare.

Read Full Post »

Yay, another maths lecture!

Click through to see the whole cartoon at XKCD. Really do it. It’s important. Especially if you want the rest of my burblings to make sense.

So. It’s partly funny because it satirises the sensationalism of tabloid news, and the urge to cram as much excitement into a headline as possible only to leave a sober assessment of actual facts to the blogosphere. But it actually addresses a much more common problem with our understanding of probability.

Most people who pay much attention to any kind of sciencey talk are probably familiar with the p-values referenced in the comic. When scientists are testing a hypothesis, they’ll often check whether the p-value (p for probability) of the results from their experiments is less than 5%. The smaller the p-value is, the less likely it is that their results are purely down to chance.

However, the p-value kinda means the exact reverse of what a lot of people assume it means.

When scientists talk about results being “significant at the 5% level”, say, it sounds like this means there’s a 95% chance of a real connection. In this cartoon’s case, it sounds like the scientists are 95% certain of a link between green jelly beans and acne.

Applicants for James Randi’s million dollar challenge are required to meet rather more stringent criteria, but it’s often expressed the same way. For instance, a dowser might have to psychically deduce which of several sealed containers is the one with water in, and repeat it a number of times, so that the p-value becomes very small. They want to be certain there’s really something going on, and it’s not just chance, before the money will be handed over.

But the intuitive idea of what the p-value means in these cases isn’t quite right.

Here’s what you actually need to do. Assume that there is no connection between the things being tested – jelly beans don’t affect acne, and all psychics are just guessing. Then, what are the odds of getting results at least as persuasive as the ones you saw, purely by chance?

That’s your p-value.
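As a rough sketch of that recipe, take the dowsing test. The set-up here is my own assumption, not Randi's actual protocol: ten sealed containers, one with water in, ten attempts, so a pure guesser is right one time in ten. The p-value is then just a binomial tail.

```python
from math import comb


def p_value(successes, trials, chance):
    """Probability of at least `successes` hits in `trials` attempts,
    assuming every attempt is a pure guess that is right with
    probability `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical test: 10 containers, 10 attempts, guessing is right 1 in 10.
for hits in range(0, 6):
    print(f"{hits} or more hits by luck alone: {p_value(hits, 10, 0.1):.4f}")
```

In this made-up set-up, three or more hits out of ten happens by luck about 7% of the time, so it wouldn't even pass the 5% bar. Four or more comes in at a little over 1%.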

So, a p-value of 5% tells us something useful. It means that the results you’ve got are kinda iffy, given what you’d usually expect, if there’s no deeper underlying pattern there. You’d only expect to see results this skewed about 1 time in 20, if you’re relying on randomness. So maybe something’s up.

But if you do a whole bunch of tests, like the jelly bean scientists did, once in a while you will get some iffy results like that just by chance.
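Here's a quick sketch of how fast that adds up. It assumes the tests are independent of each other and that nothing real is going on in any of them. The figure of 20 tests is the number of jelly bean colours in the comic.

```python
def chance_of_at_least_one_fluke(num_tests, alpha=0.05):
    """Probability that at least one of `num_tests` independent tests
    comes out 'significant' at level `alpha` purely by chance."""
    return 1 - (1 - alpha) ** num_tests


for n in (1, 5, 20, 100):
    print(f"{n} tests: {chance_of_at_least_one_fluke(n):.0%}")
```

With 20 tests at the 5% level, you're more likely than not (about 64%) to get at least one "significant" result out of nothing at all.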

Now, clearly one thing this tells us is to be wary of data which has been cherry-picked, as the jelly bean journalists’ data was. There were lots of negative results being ignored, and a single positive outcome highlighted. But the implications for how we assess probabilities more generally are, I think, more interesting.

In particular, it tells us that how likely something is doesn’t just depend on this one set of results. If a 5% p-value means “we’re 95% sure of this”, then this one study has entirely determined your estimate of the likelihood. It fails to take on board any information about how likely or unlikely something seemed before you started – and often this information is really important.

For instance, say you were studying differences between smokers and non-smokers, and the rate at which they get cancer. Any good analysis of data along these lines should easily pass a 5% significance test. It’s a highly plausible link, given what we already know, and 95% sounds like a significant under-estimate of the likelihood of a correlation between smoking and cancer.

But now imagine you’ve done a different test. This time, you just put a bunch of people into two groups, with no information about whether they smoke, or anything else about them, and flipped a coin to decide which group each person would go into. And imagine you get the same, seemingly convincing results as the smoking study.

Are you now 95% convinced that your coin-tossing is either diagnosing or causing cancer in people you’ve never met?

I hope you’re not. I hope you’d check your methodology, look for sources of bias or other things that might have crept in and somehow screwed up your data, and ultimately put it down to a bizarre fluke.

And it makes sense to do that, in this case, even despite the data. The idea that you could accurately sort people by cancer risk simply by flipping a coin is utterly ridiculous. We’d give it virtually zero probability to begin with. The results of your study would nudge that estimate up a little, but not much. Random fluke is still far more likely. If multiple sources kept repeating the experiment and getting the same persuasive results, over and over… then maybe, eventually, the odds would shift so far that your magic coin actually became believable. But they probably won’t.
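To put some toy numbers on that nudge, here's a minimal sketch of the update in odds form. Both the priors and the "likelihood ratio" (how many times more likely the results are if the effect is real than if it's a fluke) are values I've invented for illustration.

```python
def update(prior, likelihood_ratio):
    """Posterior probability after one piece of evidence.

    prior: probability you gave the hypothesis beforehand (strictly
        between 0 and 1).
    likelihood_ratio: P(evidence | hypothesis true) divided by
        P(evidence | hypothesis false).
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Invented numbers: the same strength of evidence, two very different priors.
evidence = 20
print(f"smoking and cancer: {update(0.9, evidence):.4f}")
print(f"magic coin: {update(1e-9, evidence):.2e}")

# The same result replicated over and over keeps shifting the estimate.
belief = 1e-9
for study in range(1, 9):
    belief = update(belief, evidence)
    print(f"after {study} studies: {belief:.3g}")
```

The same evidence that makes the smoking link all but certain barely moves the magic coin off zero. It takes a long run of independent replications before the coin starts to look believable.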

And this idea of shifting the probability of something, rather than fixing it firmly based on a single outcome, is at the heart of Bayesian probability.

This is something the great Eliezer Yudkowsky is passionate about, and I’m totally with him. That link’s worth a read, though someday I’d like to try and write a similar, even more gently accessible explanation of these ideas for the mathematically un-inclined. He does a great job, but the arithmetic starts to get a bit overwhelming at times.

And if the thrill of counter-intuitive mathematics isn’t enough to convince you that this is fascinating and important stuff, read this. And then this.

Short version: a number of women have been convicted and jailed for murdering their children, then later released when somebody actually did some better statistics.

The expert witness for the prosecution in these trials estimated that the odds of two children in the same family both dying of cot death were 1 in 73,000,000. General population data puts the overall rate of cot deaths at around 1 in 8,500, so multiplying the 8,500s together gives roughly the 1 in 73,000,000 figure for the chance of it happening twice. This was presented as the probability that the children could have died by accident, and thus it was assumed to be overwhelmingly likely that they were in fact deliberately killed.

But, as we learned with the cancer stuff earlier, we should consider these substantial odds against our prior assessment of how likely it is that these women would murder their children. This should start off minuscule, because very few women do murder their children. The fact that both their children died should make us adjust our likelihood estimate up a way – someone with two dead children is a more likely candidate for a child murderer than someone whose offspring are alive and well, after all – but it’s still far from conclusive.

Another way of expressing the central point of Bayesian probability is to consider the probability of A given B, for two events A and B. In this case, the odds of two children randomly picked from the population both dying of cot death may well be around 1 in 73,000,000 – but given that the children you’re considering both died in infancy, and were siblings and so might have genetic or environmental factors in common, the cot death scenario becomes far more likely.
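A small sketch of why the independence assumption matters so much. The 1 in 8,500 is the rounded rate quoted above. The "relative risk" for a second death in the same family is a purely hypothetical number, included only to show how sensitive the headline figure is.

```python
def both_die_one_in(one_in, relative_risk_after_first=1):
    """Return N such that the chance of two cot deaths in a family is 1 in N.

    one_in: a single cot death happens in 1 in `one_in` births.
    relative_risk_after_first: how many times more likely a second cot
        death is in a family that has already had one. A value of 1
        means the two deaths are treated as completely independent.
    """
    return one_in * one_in / relative_risk_after_first


# Independent deaths: just square the rate.
print(f"1 in {both_die_one_in(8500):,.0f}")

# Hypothetical: siblings share risk factors, so a second death is,
# say, 10 times more likely than the population rate.
print(f"1 in {both_die_one_in(8500, 10):,.0f}")
```

Squaring the rounded 8,500 gives 1 in 72,250,000, in the same ballpark as the quoted figure. Any shared risk between siblings shrinks that number by the same factor, and none of this yet compares it against how rare double murders are.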

I wanted to expand on that last point some more, and touch on some other interesting things, but I’m hungry and you’re bored.

Ha. I said “briefly”. Classic.

Read Full Post »

Today is Albert Einstein’s birthday, and also Pi Day.

But that link’s currently broken, so maybe you should just listen to pi in musical form instead.

Except, The Tau Manifesto makes a pretty good case.

And anyway, I’m English. We don’t write today’s date as 3/14. Technically my Pi Day should wait until the 31st April. (Or, as @thornae pointed out earlier, the 3rd of Decembruary.)

I’ve got a headache, so that’s all you’re getting from me for now.

Read Full Post »

Scott Adams, the cartoonist behind Dilbert, posted a thing yesterday.

He was considering a step-by-step argument, which seems to result in the likely conclusion that life on Earth was the result of a deliberate seeding operation by aliens. Read it through on his blog before deciding it’s nonsense. I’ve summarised it very coarsely, and it’s more lucidly reasoned out than you might think.

His point, though, was to ask his readers to spot the flaw in the logic, which he finds himself unable to do, despite assuming apparently a priori that there definitely is a flaw. He doesn’t lay out explicitly why he’s unconvinced by what seems to him like watertight reasoning, and you may in fact be in agreement with the conclusion yourself.

But, a few problems with it did occur to me as I was reading, so I thought I’d try fleshing them out here, in a purely speculative and thoroughly uninformed manner.

– Firstly, I think the principle of indifference may be being inappropriately applied.

This is a mathsy thing. The idea is, you can basically guess equally between a number of possibilities when you don’t know anything about what’s going on, and simply have a number of options presented to you. If I ask you to guess what playing card I just randomly picked out of a deck, for instance, you might just as well say the nine of diamonds as the seven of spades. Nothing stands out about any one option, so you can apply the principle of indifference, and treat them all as being equally likely.

But sometimes it’s inappropriately applied. One way I’ve seen this done before is to argue that our Universe is likely to be only a simulation. We think we live in a reality that really exists, but as we approach a time when it’s feasible to create a Matrix-like simulation in which conscious beings could live unawares, we have to consider that maybe we already exist in such a simulation.

But maybe the reality that’s simulating us is itself only a simulation, within a reality which is also only a simulation, and so on, Inception-style, with as many layers as you like. Then, the possibility that ours is the real reality, and we just haven’t created any universe simulations ourselves yet, is just one among indefinitely many. So (the fallacious argument goes) the odds on that being the case are vanishingly small.

The reason it’s not convincing is that all the various options – that our reality is real, or that we’re the first simulation, or the second, or the seventy-fourth – should not be treated as equally likely. The idea that our reality is real makes fewer assumptions about the plausibility or the existence of colossal universe-simulating machines, and can legitimately be given a greater weight than the other options.

Scott’s argument may suffer from the same false application of this principle. It says: we could soon be the first species ever to send spaceships to other planets and “seed” them with the building blocks of Earth-like life – or we could be one of many stops in an indefinitely long chain of other species which have already done that. That is, Earth may have been seeded by an alien civilisation, which itself was seeded by another, and another, and so on.

If you consider that we could be at any point in the chain, and treat them all with the principle of indifference, then it may seem unlikely that we just happen to be the first, “unseeded” life-forms in the cosmos. But there are different assumptions involved in “It’s already happened” than “It hasn’t happened yet”, and so, barring any other evidence which directly supports it, I don’t think we’re obliged to give the possibility that our own world was “seeded” so much weight.

– Because, don’t forget, there is no other evidence directly supporting the idea that this seeding is what’s happened here. However solid the arguments might be that it could happen, or that it’s virtually inevitable to happen with any life that reaches a certain threshold of intelligence, it’s all just speculation. Nothing wrong with that, but it’s not the same thing as empirical data. However unlikely you want to argue that the “unseeded Earth” possibility is, it’s entirely consistent with the current data, and it makes fewer assumptions about the Universe than the alternatives.

– I’m also not fully convinced that any intelligent life-forms would necessarily reach the point where this seeding of other worlds becomes both practical and desirable. There are various assumptions on which this rests, like our (or other life-forms’) ability to get that far technologically without destroying ourselves; the superior plausibility of the seeding option over any other methods for sustaining life; the eventual success of even a well planned seeding mission in giving rise to intelligent life again; and the timescale necessary for this to happen. (We have pretty good evidence that life on Earth has been evolving slowly for about a quarter of the age of the Universe. It can’t have happened that many times, going by this iteration rate.)

– We also have no idea how likely the possibility of alien life actually is. There’s so much uncertainty over so many variables of the Drake equation, that whether or not any other life has yet been able to arise anywhere else in the galaxy is still deeply contentious. A lot of things needed to be exactly right on Earth for life to get going and start becoming complex and interesting, and we don’t really know how rare those conditions are. The scenario of other aliens having got there before us is far from being a given.

Leave a comment if there are any more obvious points leaping out at you which demonstrate that one of us is going wrong.

Read Full Post »

Simpson’s Paradox

I suck at weekends. I’ve done nothing useful today. But something earlier reminded me about this, and for lack of anything else worth saying I’m going to talk about maths some more. I say bug humbah to your Hallowe’en malarkey. If you want spooky monsters and candy, go bother someone else. At my house, you get a lecture on algebra.

Simpson’s paradox is one of those really weird quirks of mathematics, which more people could do with understanding. It’s not even enormously complicated – the deep maths behind it can get pretty weird, but it’s really easy to appreciate how bizarrely counter-intuitive this stuff can be.

So, the paradox, and an example lifted straight from Wikipedia.

Some medical research happened a while back, into treatment for kidney stones. They took 700 people, split them into two groups, and tested a different treatment on each group. Treatment A worked on 273 out of 350 people in the first group, a success rate of 78%. Treatment B worked on 289 out of 350, or 83%.

So Treatment B works better, right?

Well, it turns out there are two different types of kidney stones. Broadly speaking, you can divide them into the “small” kind, and the “large” kind. So, even though Treatment B works better overall, maybe Treatment A is better for either small or large ones specifically. Right?

Well, half-right.

In fact, they found that Treatment A worked 93% of the time on small stones, while Treatment B worked 87% of the time. Meanwhile, with large stones, Treatment A hits 73% to Treatment B’s 69%.

So, for small kidney stones, Treatment A works demonstrably better than Treatment B. And for large kidney stones, A is still more successful than B. Treatment A actually works better in both individual cases.

But for kidney stones in general, Treatment B has a better overall success rate.
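Here's a minimal sketch that reproduces those numbers. The per-size patient counts aren't quoted above; they're the ones given in the Wikipedia kidney stone example, and they add up to the 273 and 289 out of 350.

```python
# (successes, patients) for each treatment and stone size, as given in the
# Wikipedia kidney stone example. The per-size counts are not in this post.
results = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, patients):
    return successes / patients


def overall(treatment):
    """Total (successes, patients) for a treatment across both stone sizes."""
    groups = results[treatment].values()
    return sum(s for s, _ in groups), sum(n for _, n in groups)


for size in ("small", "large"):
    for treatment in ("A", "B"):
        s, n = results[treatment][size]
        print(f"{size} stones, treatment {treatment}: {s}/{n} = {rate(s, n):.0%}")

for treatment in ("A", "B"):
    s, n = overall(treatment)
    print(f"all stones, treatment {treatment}: {s}/{n} = {rate(s, n):.0%}")
```

The thing to notice is who got which treatment. A was mostly given to the large, hard-to-treat stones (263 of its 350 patients), while B mostly got the small, easy ones (270 of 350). B's overall score is flattered by its easier caseload, even though A wins within each size.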

I’m a pretty intelligent person who studied mathematics more than anything else in life until I was 22, and I still don’t know how the fuck that works.

I mean, I understand all the maths behind it, it just still hurts my head. So now I’m going to go lie down. (This may also be related to the fact that it’s midnight now.)

Read Full Post »

Numbers and lies

Oh god I love this thread at FAIL Blog so much. You’re probably not a maths geek, so it might not mean much to you, but think about how much fun it is when people who don’t quite know what they’re talking about are convinced they do. Then apply that to a field of study in which absolute truth exists, and any answer or way of doing things is either definitely right or definitely wrong.

I know the actual calculus problem isn’t the point of the fail, but since when does that stop me? I solved it after a minute’s scribbling on a post-it, and got it right, because it’s not actually that complicated a question. What was more interesting was figuring out exactly how the over-confident engineering majors near the start of the discussion came up with their wrong answer. And I pretty quickly figured out what they’d done, and it’s quite funny. But only because I’m a real geek.

It’s interesting because they’re doing some moderately high-end maths, beyond the level most people would have studied to, but at the same time the mistakes they’re making indicate a fundamental lack of understanding about how differential calculus works. And that’s a perfectly okay thing to lack – I know a lot of fine, upstanding citizens with no concept of how differential calculus works at all, and I wouldn’t think to count it against them. But they have the sense not to go on internet message boards and try to teach people maths.

Also, I’ve started a new blog, which I’m planning to post to every weekday, as well as this one. It’s mostly a writing exercise for me, but I may start trying to get it noticed a bit too, now that it’s been going a week and I’m fairly sure my interest isn’t going to just fizzle out. It’s called The Daily Half-Truth, and the idea is to write weird and surreal news stories based on actual topical events, but with some strange and entirely fictional quirks. I’m having fun with it.

Okay, I think I’m done. Have fun noticing Hallowe’en. I’ll probably not be doing that.

Read Full Post »

Let’s get one thing straight first of all. Animals are stupid.

Oh, don’t look at me like that. It’s not like it isn’t obviously true, and they’re too dumb to know they’re being insulted anyway. Even the ones I like are complete idiots. I’ve seen two-year-old kids who can talk better than any cat; I’ve watched dogs repeatedly fall for the same trick where I pretend to throw a ball, and every time they bounce away with moronic excitement chasing after nothing; we all know how terrible monkeys are at trying to move a piano; and don’t get me started on the legendary inability of voles to solve even the most rudimentary cryptic crosswords, no matter how simply and slowly you explain it to them.

I’ll admit that they’re not universally inept. Many of them can capture and tear apart a fast-moving hunk of raw meat more efficiently than I’m ever likely to; they’re often enviably cute; and those spiders which can leap out and grab something faster than you can blink are pretty cool. But in general, the point stands.

Our mighty human brains are the reason we’ve so easily and inevitably wrenched control of the world from Mother Nature’s puny green fingers, and the only time we ever deign to be impressed with the intelligence of one of her lesser creatures is when we’re patronisingly judging them by their usual standards of dumb-assery. We’re amazed whenever they show any slight proficiency for a skill at which every human is assumed to be naturally capable. This is why things like dolphins cleaning their tank, cats learning not to crap in your shoe, or a horse being able to count to five by clopping his hoof cause such a stir.

Thing is, even then we’re giving them too much credit.

Clever Hans was a horse that wowed audiences in late 19th century Germany, by tapping out the answers to some really easy maths problems. Someone would ask the horse, say, “What’s three plus two?” and he would tap his hoof five times. I mean, I’ve seen four-year-old humans solving quadratic equations, but whatever.

Okay, so I am being overly disparaging. The maths is hardly impressive, but if a horse can really understand human words, and the syntax which holds them together in a sentence, that would be worth knowing. You’d start being more careful what you said around them, if you knew they might actually understand it, and be able to use their hooves to gossip about you later in Morse code or something. So, it caught people’s attention, because nobody had previously known of any animals that could do this, even if naming a simpleton quadruped “Clever” for being able to add single-digit numbers does credit it way too highly.

But it caught a few scientists’ attention too, and those scientists started doing what scientists will tend to do when a new discovery is supposedly made – sticking their noses in further than anyone invited them and trying to see how true it is.

They wondered, not unreasonably, whether Hans mightn’t be getting his hoof-tapping cues from somewhere other than his unprecedented equine cognitive powers. No horse had ever shown any signs of this level of mental acuity before, or even anything close. I mean, look at how some of these questions were phrased: “If the eighth day of the month comes on a Tuesday, what is the date of the following Friday?” Now granted, as far as the mathematics goes, we’re still about on a par with modern GCSE papers. But that’s some fairly sophisticated sentence structure there, with the conditional clause and everything, not to mention the background knowledge about our modern calendar that you’d need for it to make any sense. Humans are good at all this, but it’s something we still haven’t had much luck teaching computers to learn, and it’s more than has ever been observed in even the smartest monkeys. And some of those monkeys can put particularly stupid humans to shame. This was seriously big news, if the horse really was that clever.

So although it was possible that nobody had looked closely enough to notice such language skills in horses before, or that Hans was some kind of prodigy, it might be something simpler. Maybe his handler was subtly signalling for the horse to tap the requisite number of times, and all the horse was doing was following simple instructions. It wouldn’t necessarily have been noticed if this was the case – people probably weren’t paying much attention to the guy just hanging around with the wonder-steed. Maybe it was all just a cruel and cynical hoax, to win the hearts and loose change of gullible audiences.

Well… not exactly. It doesn’t look like anyone ever knowingly cheated to simulate Clever Hans’ talents. Even when someone other than his handler was asking the questions, his success rate was still impressive. But it turns out they didn’t need to be cheating. Hans was picking up cues, but not intentional ones, and giving his answers solely based on the expectations of his audience.

Remember that Hans wasn’t declaring his answer aloud, or writing down any unambiguous symbols. He would tap his foot, and again, and again, with a short pause between each time. One way to give an infallibly correct answer to any numerical question, without needing even a primitive understanding of mathematics, would be to start tapping, and somehow work out when you’re supposed to stop. If you have a captive audience eagerly watching your every move, and who do know exactly when you should stop to give the right answer to the problem, this might be possible. If you’ve asked Hans to calculate 3 + 2, your thoughts as you watch him might run along the lines of:

“Okay, let’s see if he can do this… One, two, good, you’re on the right track so far, three, still looking good, four, well done, almost there, this is a truly astonishing feat, don’t stop now… five! He’s done it! Is that it? He’s stopping there? Hurrah! This horse is a genius! Put him in charge of our country’s major financial institutions immediately!”

It seems likely that your body language and facial expression would have changed noticeably over the course of this internal dialogue, even if you didn’t do anything silly like leap to your feet applauding wildly the moment the fifth tap landed. And it seems that horses like Clever Hans can pick up on that kind of thing, and react accordingly.

What gave it away was when psychologist Oskar Pfungst, who was part of a genuine thing called the Hans Commission, checked what happened when Hans couldn’t see the person asking the questions. The success rate plummeted. When he couldn’t read the increasing tension on people’s faces as he neared the right point to stop, and the relief and relaxation that swept over them when he got there, he was just a horse tapping his foot and hoping it would be good enough to earn him another salt lick.

This is a good example of why, when establishing the validity of any claim, we need to do everything we can to be rigorously scientific about it. We’re going to end up wandering blindly down a completely fallacious route, if we don’t rule out every alternative explanation, from any source, in exactly the way that kooks and pseudoscientists and the deluded always object to. It’s not a matter of “taking their word for it” that something’s really going on the way they describe, because even if they’re being completely honest (which a great many woo-merchants are), reality can always surprise you by being weird in a completely different way from how you expected. In this case, it seems that horses can infer a surprising amount of information from faces that people don’t even know they’re making, which itself is actually pretty cool. (This curious phenomenon of subconscious non-verbal cues creeping in to provide misleading data has become known as the “Clever Hans effect”.) But there’s just no reason left to believe that the original story is true.

It’s not that Pfungst refused to be “open-minded”. He was open to the possibility of the claims about Hans being correct, but he didn’t completely and unthinkingly believe everything he was told straight away. He knew that a lot of the hype sounded unlikely, so he was also open to the idea that there might be a more mundane explanation. The bizarre and unprecedented claim was rejected, not because of “closed-mindedness”, but because of a complete lack of evidence. The evidence for the idea that horses can do sums has been stripped back to literally nothing. If we hadn’t been able to use science to do that, we’d still be stuck believing something ridiculous.

Of course, the science that blew his entire claim totally out of the water didn’t stop Wilhelm von Osten, the owner of the horse, from touring the country with him and continuing to make utterly baseless claims. This, in turn, is a good example of how retarded some people can get when they shut their basic critical faculties down in favour of not having to admit that they’ve ever been wrong.

Read Full Post »
