Archive for the ‘maths’ Category

I’ve stopped being round, and I’m back in my prime.

At least, in a numerical sense. Physically, after the lunch and cake assortment laid on by my mother-in-law this afternoon, I’m still feeling pretty much spherical.

I’m also a Mersenne prime again as of today. The last time that was the case, the Cold War wasn’t over, Charles and Diana were still making a go of it, Mike Tyson wasn’t a convicted rapist, and Sonic The Hedgehog 2 and Disney’s Aladdin were both yet to rock the world with their cultural impact.

So much has changed, in merely the time it takes to go from 2^n-1 to 2^(n+1)-1. By the time 2^(n+2)-1 rolls around… I can only wonder what brave new world awaits.

On with the year, then.

I am currently sandwiched between a pair of prime twins. The last time that was true, I was barely legal. The next time it happens, I’ll be the Ultimate Answer.

I’m also the product of the first few consecutive primes. That hasn’t happened in 4! years. And if I want to see it happen again, Aubrey de Grey is going to have to step up his game.
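For anyone who wants to check the birthday arithmetic, here's a rough sketch. The helper names (`is_prime`, `between_twin_primes`, `is_primorial`) are made up for illustration, and the search limits of 50 and 250 are arbitrary.

```python
def is_prime(n: int) -> bool:
    """Trial division; plenty fast for numbers the size of a human age."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def between_twin_primes(n: int) -> bool:
    """True when n - 1 and n + 1 are both prime."""
    return is_prime(n - 1) and is_prime(n + 1)


def is_primorial(n: int) -> bool:
    """True when n is the product of the first k primes for some k >= 1."""
    product, candidate = 1, 2
    while product < n:
        if is_prime(candidate):
            product *= candidate
        candidate += 1
    return product == n and n > 1


sandwiched = [n for n in range(2, 50) if between_twin_primes(n)]
primorials = [n for n in range(2, 250) if is_primorial(n)]
print(sandwiched)  # [4, 6, 12, 18, 30, 42]
print(primorials)  # [2, 6, 30, 210]
```

So 18, 30 and 42 are the relevant sandwiched ages, and the gap between 6 and 30 is 24 years, which is 4!. The next product after 30 is 210.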

The other thing I tend to do on my birthday is look at how much other historically successful people had achieved by whatever age I’ve reached now, but I think I’m done with that. I mean, of course there’s plenty of people who’d achieved all kinds of hugely impressive stuff by my age. I’m 30. I’m a proper, legitimate grown-up. There’s no point continuing to compare my own accomplishments to those of the most prominently well known individuals in various creative fields throughout all of history, as if there were some kind of expectation on me to live up to some equivalent level.

So I’m just going to go get drunk instead. Seeya.

Regular readers will be familiar with this annual tradition.

As of today, I am no longer perfect. And, short of some dramatic advances in life-extending technology, I never will be again.

But worry not, because I’m back in my prime. It’s been a while. The last prime year I had, I spent walking out of my really well paid but horribly unsuitable insurance job, mooching around getting nothing done for a few months, then working in a psychiatric hospital. I’d only just started living in my own flat, enjoying some space I could call completely my own for the very first time. For the months when I didn’t have a job, I was alone a lot. It was pretty awesome.

This year I’m getting married. Things have changed. This is better.

The last time I was part of a twin prime, I was moving out of halls and into my student flat in Exeter, but there’s no need to regress that far.

I’m also Tetranacci this year, which I hadn’t been since boarding school, and which I hadn’t heard of before today. Nice.

Next year I’ll be a pyramid, but I’ll also be quite round. Paradox THAT, bitches.

Oh, and apparently I’ll be semi-perfect, so that’s something to look forward to.
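If you'd like to verify the perfect, Tetranacci, pyramid and semi-perfect claims, this is one way to sketch it. All the function names are invented for this example, "pyramid" is taken to mean a square pyramidal number, and the Tetranacci sequence is assumed to start 0, 0, 0, 1.

```python
from itertools import combinations


def proper_divisors(n: int) -> list[int]:
    return [d for d in range(1, n) if n % d == 0]


def is_perfect(n: int) -> bool:
    """A perfect number equals the sum of its proper divisors."""
    return sum(proper_divisors(n)) == n


def is_semiperfect(n: int) -> bool:
    """True when some subset of the proper divisors sums to n."""
    divs = proper_divisors(n)
    return any(
        sum(subset) == n
        for size in range(1, len(divs) + 1)
        for subset in combinations(divs, size)
    )


def tetranacci_up_to(limit: int) -> list[int]:
    """Each term is the sum of the previous four, starting 0, 0, 0, 1."""
    terms = [0, 0, 0, 1]
    while terms[-1] <= limit:
        terms.append(sum(terms[-4:]))
    return [t for t in terms if t <= limit]


def square_pyramidal_up_to(limit: int) -> list[int]:
    """Running totals of 1^2 + 2^2 + 3^2 + ..."""
    totals, total, k = [], 0, 1
    while total + k * k <= limit:
        total += k * k
        totals.append(total)
        k += 1
    return totals


print(is_perfect(28), is_perfect(29))
print(tetranacci_up_to(40))
print(square_pyramidal_up_to(40))
print(is_semiperfect(30))
```

The brute-force subset search in `is_semiperfect` is only sensible for small numbers like ages; it would crawl on anything with lots of divisors.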

Meanwhile, Kirsty’s shortly going to stop being highly powerful and very binarily round, and instead become extremely three. We won’t both simultaneously be in our prime until we’ve been married for nearly eight years.


Here’s a thorough examination of federal revenue data which shows just how unfair it is for all those horrible poor people to go without paying any taxes at all, and how rough America’s billionaires have it as a result.

Oh, wait.

It actually further undermines the efforts to paint the less wealthy half of America as leeches, which I mentioned a little while ago.

As you actually look at the data, the sound bite of “Half of Americans don’t pay any taxes” becomes “Half of adult Americans don’t pay federal income taxes”, and then “18% of adult Americans don’t pay any federal taxes”, before finally arriving at something which might accurately describe the “freeloading” situation:

Once you count those who pay income taxes, those who pay payroll taxes, the elderly (who are more likely to be retired) and those earning less than $20,000 a year, fewer than 1% of American adults remain who aren’t contributing.

(That poverty line of $20,000, by the way, is less than 0.2% of the total annual income of the average CEO on Standard & Poor’s 500 Index.)

And that’s just federal taxes. The number of people genuinely not paying any taxes goes down even further if you assume that they ever buy stuff.

Absolutely nothing about the idea that the lowest earners in the country are the ones who should be paying the greatest price for the debt crisis adds up. Pretty much across the board, the richer you are, the smaller the proportion of your income that the government takes from you.

And yet, in some parts of the media, the cries of class warfare only seem to go one way.

Ben Goldacre’s got a fab example of misleading statistics, and the ways in which you can learn to think about things to avoid jumping to a wrong conclusion.

Look at his first nerdy table of data on that article. All they’ve done is take a bunch of people who drink alcohol, and a bunch who don’t, and counted how many from each group ended up with lung cancer. It turns out that the drinkers are more likely to get lung cancer than the non-drinkers.

The obvious conclusion – and (spoiler alert) the wrong one – is that drinking alcohol somehow puts you at greater risk of developing lung cancer. You might conclude, from that table, that if you currently drink alcohol, you can reduce your risk of developing cancer by no longer drinking alcohol, thus moving yourself to the safer “non-drinkers” group.

This is actually a fine example of the Bad Science mantra, and Ben makes an important point which many non-nerds might not naturally appreciate about statistics: the need to control for other variables.

If drinking doesn’t give you cancer, then why do drinkers get more cancer? The other two tables offer a beautiful explanation. Of all the drinkers and non-drinkers originally counted, try asking them another question: whether or not they smoke cigarettes. What you get when you do that is the next two tables.

If you just look at the smokers, then the chances of a drinker and a non-drinker getting lung cancer are almost exactly the same. If you look only at the non-smokers, ditto. In other words, once you know whether someone smokes cigarettes, whether or not they drink makes no difference to their odds of getting lung cancer.

Which is a long way away from the obvious conclusion we were tempted to draw from the first set of data.

What we did here was to control for another variable – namely smoking – before drawing sweeping conclusions from the data. When we give smokers and non-smokers their own separate tables, smoking is no longer skewing the comparison between drinkers and non-drinkers. It becomes clear that drinkers aren’t simply more likely to get cancer; they’re more likely to be smokers.

And although Ben’s right to point out the importance of controlling for other variables like this, what interests me is the reminder of the importance of Bayesian probability.

In particular, the thing to remember is that the probability of an event is a measure of your uncertainty, and not something inherent in the event itself.

For instance, if that first table is all the data you have, then all you know is that drinkers are more at risk of cancer than non-drinkers. If you were to estimate somebody’s odds of getting lung cancer, and the only thing you knew about them is that they’re a drinker, the best you could do is to place it at 16% – the proportion of drinkers who developed lung cancer in the study.

If you later acquire the extra data in the second set of tables, and find out that the individual you’re interested in is not a smoker, then suddenly you can re-adjust your estimate, and give them about a 3% chance of getting lung cancer. They haven’t done anything differently; nothing about their situation has changed for them to suddenly appear much more healthy. You’ve just learned more about them.
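To make the before-and-after estimates concrete, here's a sketch with invented counts. These are NOT the numbers from Ben's tables; I've simply picked figures that reproduce the 16% and 3% quoted above, and the `cancer_rate` helper is hypothetical.

```python
# Hypothetical counts, invented so the headline rates match the 16% and 3%
# figures quoted in the text. They are NOT the study's real data.
# Each entry: (drinks, smokes) -> (people, lung cancer cases)
counts = {
    (True, True): (1300, 299),
    (True, False): (700, 21),
    (False, True): (300, 69),
    (False, False): (1700, 51),
}


def cancer_rate(drinks=None, smokes=None) -> float:
    """Cancer rate among people matching the given filters (None = don't care)."""
    people = cases = 0
    for (d, s), (n, c) in counts.items():
        if drinks is not None and d != drinks:
            continue
        if smokes is not None and s != smokes:
            continue
        people += n
        cases += c
    return cases / people


# What you'd estimate knowing only whether someone drinks.
print(f"drinkers, smoking unknown:     {cancer_rate(drinks=True):.0%}")
print(f"non-drinkers, smoking unknown: {cancer_rate(drinks=False):.0%}")

# What you'd estimate once you also know whether they smoke.
for smokes in (True, False):
    for drinks in (True, False):
        rate = cancer_rate(drinks=drinks, smokes=smokes)
        print(f"smokes={smokes}, drinks={drinks}: {rate:.0%}")
```

In this toy data the drinkers look worse overall only because far more of them smoke. Within each smoking group, drinking makes no difference at all.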

And it’s still not true that their odds of developing cancer are exactly 3% in any objective sense. Maybe tomorrow you’ll learn something about their age, or gender, or family history, and adjust your estimate again based on the new data. Maybe you don’t know that a doctor actually diagnosed them with lung cancer yesterday. This, obviously, makes a huge difference to their odds of having lung cancer – but it doesn’t change the fact that they’re in a low-risk group, and a 3% estimate is the best you can do based on your current knowledge.

In conclusion: stats are hard, listen to maths geeks (or become one yourself) before panicking about the latest tabloid healthscare.

Don’t run away.

This post is going to be about maths and probability.


There was a scientific paper recently published, in a respected academic journal, which purported to demonstrate evidence of human precognition.

Yep, science says people can tell the future.

Except, not really. Not yet, anyway. As the study’s author, psychology researcher Daryl Bem, said himself in the published paper, it was important for other scientists to repeat the experiment, and see if they got the same results. Richard Wiseman has been among those involved in such attempted replications, which so far have failed to support Bem’s original conclusion.

There’s a big moan I’m not quite in the mood to make, about how science generally gets publicised in the media, and the tabloids’ tendency to make a massive fuss over preliminary results, without concerning themselves with facts which later emerge and completely undermine their sensationalist headlines.

But I want to talk about the maths.

Replication is always important in science, particularly where the results look unlikely, or demonstrate something completely new. This is partly because, for all we know, Bem’s original research could have been dishonest or deeply flawed. Most people seem to consider both of these unlikely, though, and I’m certainly not suggesting that he’s faked his results.

But people often seem to assume that these are the only two options: that positive results must mean either an important and revolutionary breakthrough, or very bad science. The idea that something could just happen “by chance” now and then never seems to get much credibility.

Almost every time someone in a TV show or a movie proclaims something to be “just a coincidence”, or that there’s a “perfectly rational explanation”, we’re meant to take it as an ultra-rationalist denial of the obvious – usually supernatural – facts. Remarkable coincidences just don’t happen in the way that ghosts and werewolves obviously do. In fictional drama, there are good reasons for this. In the real world, this is a severe misunderstanding of probability.

When deciding whether or not to get excited about a result, scientists often look for significance “at the 5% level”. Bem’s results, supporting his precognition hypothesis, were significant at this level. But this does not mean, as you might think, that there’s only a 5% chance of the hypothesis being wrong.

What it means is: there would be a 5% chance of getting results this good, just by chance, if people aren’t really psychic.

So, getting results like this – statistically significant at the 5% level – is actually slightly less impressive than rolling a double-six. (If you have two regular six-sided dice, the chance of both landing on 6 on a single roll is 1 in 36, which is slightly less than 3%.)

I’ve rolled plenty of double-sixes. If you’ve rolled a lot of dice, so have you. And if you do a lot of science, you’d expect at least as many results to look significant purely by chance.
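Here's the dice comparison as a quick sketch, using exact fractions. The figure of 360 attempts is an arbitrary number I've picked so the sums come out round, and `expected_flukes` is just an illustrative helper.

```python
from fractions import Fraction

# Chance of a double-six with two fair six-sided dice.
double_six = Fraction(1, 6) * Fraction(1, 6)

# The conventional 5% significance threshold mentioned in the text.
threshold = Fraction(5, 100)


def expected_flukes(attempts: int, chance: Fraction) -> Fraction:
    """Average number of lucky hits in `attempts` independent tries."""
    return attempts * chance


print(double_six, float(double_six))
print(threshold > double_six)
print(expected_flukes(360, double_six))  # double-sixes in 360 rolls
print(expected_flukes(360, threshold))   # 'significant' flukes in 360 null experiments
```

On average that's 10 double-sixes against 18 fluke "discoveries" for the same number of attempts, assuming every experiment is testing something that isn't really there.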

So, if you’re thinking that we should probably ask for something a bit more conclusive than a double-six roll before accepting hitherto unconfirmed magic powers, you’re probably right.

This is the essence of Bayesian probability. Imagine having one of the following two conversations with a friend who has two dice:
“These are loaded dice, weighted to always land on a double-six. Watch.”

“Huh, so they are. Neat.”

“I’m going to use my psychic powers to make these dice land on double-six. Watch.”

“…Okay, that’s a little spooky, but you could’ve just got lucky. Do it again.”

You see why you might not believe it right away when your friend claims something really outlandish? But when it was something pretty normal, you’d be more likely to buy it?

In either case, the odds of rolling sixes by chance were exactly the same, 1 in 36, independent of what was allegedly influencing the outcome. But that doesn’t mean you should be equally convinced in either case when the same result comes up.

Both claims become more likely when the double-six is thrown. After all, if the dice really are loaded (or psychically influenced), then what you’ve just seen is exactly what you’d expect to see. But they’re not both getting more likely from the same starting point. One started out as a much more plausible claim than the other, and it’s still more plausible now.

Loaded dice? Sure, they have those. Telekinesis? Well, you have my attention, but let’s see you do it again. And again. And a dozen more times with a fresh set of dice.
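For the curious, the "do it again" logic can be sketched with Bayes' rule. The two starting probabilities are numbers I've made up purely for illustration, and I'm assuming that if either claim is true then a double-six is guaranteed.

```python
def update(prior: float, p_if_true: float, p_if_false: float) -> float:
    """Bayes' rule: probability of the claim after seeing the evidence."""
    numerator = prior * p_if_true
    return numerator / (numerator + (1 - prior) * p_if_false)


P_DOUBLE_SIX_BY_CHANCE = 1 / 36

# Made-up starting points, purely for illustration.
priors = {"loaded dice": 0.05, "psychic powers": 1e-9}

for claim, belief in priors.items():
    for roll in range(1, 6):
        # Assumption: if the claim is true, a double-six is certain;
        # if it's false, the double-six was plain luck.
        belief = update(belief, 1.0, P_DOUBLE_SIX_BY_CHANCE)
        print(f"{claim}: after {roll} double-six(es), belief = {belief:.6f}")
```

With these particular priors, one roll is enough to make loaded dice the better bet, while the psychic claim is still under 10% after five double-sixes in a row. Same evidence, same size of nudge each time, very different starting points.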

This is part of my recurring, occasional project to convince the world that Bayesian probability is both important and intuitive, when it’s expressed right.

Ben Goldacre wrote about Bem’s research, the New Scientist also discussed it, there are some details of the replication attempts at The Psychologist, and I was prodded into thinking about all this in some more depth by a recent episode of the Righteous Indignation podcast.

Yay, another maths lecture!

Click through to see the whole cartoon at XKCD. Really do it. It’s important. Especially if you want the rest of my burblings to make sense.

So. It’s partly funny because it satirises the sensationalism of tabloid news, and the urge to cram as much excitement into a headline as possible only to leave a sober assessment of actual facts to the blogosphere. But it actually addresses a much more common problem with our understanding of probability.

Most people who pay much attention to any kind of sciencey talk are probably familiar with the p-values referenced in the comic. When scientists are testing a hypothesis, they’ll often check whether the p-value (p for probability) of the results from their experiments is less than 5%. The smaller the p-value is, the more surprising their results would be if nothing but chance were at work.

However, the p-value kinda means the exact reverse of what a lot of people assume it means.

When scientists talk about results being “significant at the 5% level”, say, it sounds like this means there’s a 95% chance of a real connection. In this cartoon’s case, it sounds like the scientists are 95% certain of a link between green jelly beans and acne.

Applicants for James Randi’s million dollar challenge are required to meet rather more stringent criteria, but it’s often expressed the same way. For instance, a dowser might have to psychically deduce which of several sealed containers is the one with water in, and repeat it a number of times, so that the p-value becomes very small. They want to be certain there’s really something going on, and it’s not just chance, before the money will be handed over.

But the intuitive idea of what the p-value means in these cases isn’t quite right.

Here’s what you actually need to do. Assume that there is no connection between the things being tested – jelly beans don’t affect acne, and all psychics are just guessing. Then, what are the odds of getting results at least as persuasive as the ones you saw, purely by chance?

That’s your p-value.
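As a sketch of that recipe, take a dowsing test where every guess is pure luck. The set-up here (10 containers per round, 10 rounds) is hypothetical and isn't taken from any real challenge protocol; `p_value` is an illustrative helper.

```python
from math import comb


def p_value(trials: int, hits: int, chance: float) -> float:
    """P(at least `hits` successes in `trials` guesses) if only luck is at work."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(hits, trials + 1)
    )


# Hypothetical dowsing test: 10 containers per round, 10 rounds,
# so each guess has a 1 in 10 chance of being right by accident.
print(p_value(trials=10, hits=1, chance=0.1))
print(p_value(trials=10, hits=4, chance=0.1))
```

One hit out of ten is about what a guesser would manage anyway (roughly a 65% chance of at least one), whereas four or more hits would turn up by luck a bit over 1% of the time.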

So, a p-value of 5% tells us something useful. It means that the results you’ve got are kinda iffy, given what you’d usually expect, if there’s no deeper underlying pattern there. You’d only expect to see results this skewed about 1 time in 20, if you’re relying on randomness. So maybe something’s up.

But if you do a whole bunch of tests, like the jelly bean scientists did, once in a while you will get some iffy results like that just by chance.
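You can put a number on "once in a while". This sketch assumes every test is independent and that nothing real is going on in any of them; the 20-test row is there because, as far as I recall, that's how many colours the comic's scientists tried.

```python
def chance_of_any_fluke(tests: int, alpha: float = 0.05) -> float:
    """P(at least one 'significant' result) when every test is pure noise."""
    return 1 - (1 - alpha) ** tests


for tests in (1, 5, 20, 100):
    print(f"{tests:>3} tests: {chance_of_any_fluke(tests):.1%}")
```

With 20 tests the chance of at least one bogus "discovery" comes out at around 64%, so a single hit among twenty is closer to the expected outcome than to a surprise.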

Now, clearly one thing this tells us is to be wary of data which has been cherry-picked, as the jelly bean journalists did. There were lots of negative results being ignored, and a single positive outcome highlighted. But the implications for how we assess probabilities more generally are, I think, more interesting.

In particular, it tells us that how likely something is doesn’t just depend on this one set of results. If a 5% p-value means “we’re 95% sure of this”, then this one study has entirely determined your estimate of the likelihood. It fails to take on board any information about how likely or unlikely something seemed before you started – and often this information is really important.

For instance, say you were studying differences between smokers and non-smokers, and the rate at which they get cancer. Any good analysis of data along these lines should easily pass a 5% significance test. It’s a highly plausible link, given what we already know, and 95% sounds like a significant under-estimate of the likelihood of a correlation between smoking and cancer.

But now imagine you’ve done a different test. This time, you just put a bunch of people into two groups, with no information about whether they smoke, or anything else about them, and flipped a coin to decide which group each person would go into. And imagine you get the same, seemingly convincing results as the smoking study.

Are you now 95% convinced that your coin-tossing is either diagnosing or causing cancer in people you’ve never met?

I hope you’re not. I hope you’d check your methodology, look for sources of bias or other things that might have crept in and somehow screwed up your data, and ultimately put it down to a bizarre fluke.

And it makes sense to do that, in this case, even despite the data. The idea that you could accurately sort people by cancer risk simply by flipping a coin is utterly ridiculous. We’d give it virtually zero probability to begin with. The results of your study would nudge that estimate up a little, but not much. Random fluke is still far more likely. If multiple sources kept repeating the experiment and getting the same persuasive results, over and over… then maybe, eventually, the odds would shift so far that your magic coin actually became believable. But they probably won’t.

And this idea of shifting the probability of something, rather than fixing it firmly based on a single outcome, is at the heart of Bayesian probability.

This is something the great Eliezer Yudkowsky is passionate about, and I’m totally with him. That link’s worth a read, though someday I’d like to try and write a similar, even more gently accessible explanation of these ideas for the mathematically un-inclined. He does a great job, but the arithmetic starts to get a bit overwhelming at times.

And if the thrill of counter-intuitive mathematics isn’t enough to convince you that this is fascinating and important stuff, read this. And then this.

Short version: a number of women have been convicted and jailed for murdering their children, then later released when somebody actually did some better statistics.

The expert witness for the prosecution in these trials estimated that the odds of two children in the same family both dying of cot death were 1 in 73,000,000. General population data puts the overall rate of cot deaths at around 1 in 8,500, so multiplying two of those 8,500s together gives roughly the 1 in 73,000,000 figure for the chance of it happening twice. This was presented as the probability that the children could have died by accident, and thus it was assumed to be overwhelmingly likely that they were in fact deliberately killed.

But, as we learned with the cancer stuff earlier, we should consider these substantial odds against our prior assessment of how likely it is that these women would murder their children. This should start off minuscule, because very few women do murder their children. The fact that both their children died should make us adjust our likelihood estimate up a way – someone with two dead children is a more likely candidate for a child murderer than someone whose offspring are alive and well, after all – but it’s still far from conclusive.

Another way of expressing the central point of Bayesian probability is to consider the probability of A given B, for two events A and B. In this case, the odds of two children randomly picked from the population both dying of cot death may well be around 1 in 73,000,000 – but given that the children you’re considering both died in infancy, and were both siblings and so might have genetic or environmental factors in common, the cot death scenario becomes far more likely.
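Here's a toy version of that comparison. The 1 in 8,500 rate is the one quoted above; every double-murder rate in the loop is an invented placeholder, not a real statistic, and the model pretends cot death and murder are the only two explanations. The helper name `p_cot_deaths` is my own.

```python
def p_cot_deaths(double_cot_death: float, double_murder: float) -> float:
    """P(both were cot deaths | two infants died), in a two-explanation toy model."""
    return double_cot_death / (double_cot_death + double_murder)


single = 1 / 8_500           # rate quoted in the text
naive_double = single ** 2   # assumes the two deaths are independent
print(f"naive double cot death: 1 in {1 / naive_double:,.0f}")

# ASSUMPTION: every double-murder rate below is an invented placeholder.
for one_in in (10_000_000, 73_000_000, 1_000_000_000):
    print(f"double murder 1 in {one_in:,}: "
          f"P(cot deaths) = {p_cot_deaths(naive_double, 1 / one_in):.2f}")
```

The squared figure comes out at 1 in 72,250,000, close to the number used in court. The point of the loop is that this figure means nothing on its own: it only tells you something once it's set against how rare the alternative explanation is. And this sketch still uses the naive independence assumption, so allowing for shared genetic or environmental factors would push the cot death side up further.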

I wanted to expand on that last point some more, and touch on some other interesting things, but I’m hungry and you’re bored.

Ha. I said “briefly”. Classic.

Today is Albert Einstein’s birthday, and also Pi Day.

But that link’s currently broken, so maybe you should just listen to pi in musical form instead.

Except, The Tau Manifesto makes a pretty good case.

And anyway, I’m English. We don’t write today’s date as 3/14. Technically my Pi Day should wait until the 31st April. (Or, as @thornae pointed out earlier, the 3rd of Decembruary.)

I’ve got a headache, so that’s all you’re getting from me for now.

Scott Adams, the cartoonist behind Dilbert, posted a thing yesterday.

He was considering a step-by-step argument, which seems to result in the likely conclusion that life on Earth was the result of a deliberate seeding operation by aliens. Read it through on his blog before deciding it’s nonsense. I’ve summarised it very coarsely, and it’s more lucidly reasoned out than you might think.

His point, though, was to ask his readers to spot the flaw in the logic, which he finds himself unable to do, despite assuming apparently a priori that there definitely is a flaw. He doesn’t lay out explicitly why he’s unconvinced by what seems to him like watertight reasoning, and you may in fact be in agreement with the conclusion yourself.

But, a few problems with it did occur to me as I was reading, so I thought I’d try fleshing them out here, in a purely speculative and thoroughly uninformed manner.

– Firstly, I think the principle of indifference may be being inappropriately applied.

This is a mathsy thing. The idea is, you can basically guess equally between a number of possibilities when you don’t know anything about what’s going on, and simply have a number of options presented to you. If I ask you to guess what playing card I just randomly picked out of a deck, for instance, you might just as well say the nine of diamonds as the seven of spades. Nothing stands out about any one option, so you can apply the principle of indifference, and treat them all as being equally likely.

But sometimes it’s inappropriately applied. One way I’ve seen this done before is to argue that our Universe is likely to be only a simulation. We think we live in a reality that really exists, but as we approach a time when it’s feasible to create a Matrix-like simulation in which conscious beings could live unawares, we have to consider that maybe we already exist in such a simulation.

But maybe the reality that’s simulating us is itself only a simulation, within a reality which is also only a simulation, and so on, Inception-style, with as many layers as you like. Then, the possibility that ours is the real reality, and we just haven’t created any universe simulations ourselves yet, is just one among indefinitely many. So (the fallacious argument goes) the odds on that being the case are vanishingly small.

The reason it’s not convincing is that all the various options – that our reality is real, or that we’re the first simulation, or the second, or the seventy-fourth – should not be treated as equally likely. The idea that our reality is real makes fewer assumptions about the plausibility or the existence of colossal universe-simulating machines, and can legitimately be given a greater weight than the other options.

Scott’s argument may suffer from the same false application of this principle. It says: we could soon be the first species ever to send spaceships to other planets and “seed” them with the building blocks of Earth-like life – or we could be one of many stops in an indefinitely long chain of other species which have already done that. That is, Earth may have been seeded by an alien civilisation, which itself was seeded by another, and another, and so on.

If you consider that we could be at any point in the chain, and treat them all with the principle of indifference, then it may seem unlikely that we just happen to be the first, “unseeded” life-forms in the cosmos. But there are different assumptions involved in “It’s already happened” than “It hasn’t happened yet”, and so, barring any other evidence which directly supports it, I don’t think we’re obliged to give the possibility that our own world was “seeded” so much weight.

– Because, don’t forget, there is no other evidence directly supporting the idea that this seeding is what’s happened here. However solid the arguments might be that it could happen, or that it’s virtually inevitable for any life that reaches a certain threshold of intelligence, it’s all just speculation. Nothing wrong with that, but it’s not the same thing as empirical data. However unlikely you want to argue that the “unseeded Earth” possibility is, it’s entirely consistent with the current data, and it makes fewer assumptions about the Universe than the alternatives.

– I’m also not fully convinced that any intelligent life-forms would necessarily reach the point where this seeding of other worlds becomes both practical and desirable. There are various assumptions on which this rests, like our (or other life-forms’) ability to get that far technologically without destroying ourselves; the superior plausibility of the seeding option over any other methods for sustaining life; the eventual success of even a well planned seeding mission in giving rise to intelligent life again; and the timescale necessary for this to happen. (We have pretty good evidence that life on Earth has been evolving slowly for about a quarter of the age of the Universe. It can’t have happened that many times, going by this iteration rate.)

– We also have no idea how likely the possibility of alien life actually is. There’s so much uncertainty over so many variables of the Drake equation, that whether or not any other life has yet been able to arise anywhere else in the galaxy is still deeply contentious. A lot of things needed to be exactly right on Earth for life to get going and start becoming complex and interesting, and we don’t really know how rare those conditions are. The scenario of other aliens having got there before us is far from being a given.

Leave a comment if there are any more obvious points leaping out at you which demonstrate that one of us is going wrong.

Here is an advert pointing out the disparity between the Shell oil company’s recent profits, and the humanitarian results of their recent massive oil spill. Here is a statement from Amnesty expressing disappointment that the Financial Times newspaper decided not to run this advert. Here are Naomi McAuliffe’s thoughts.

– I’ve been hearing about Project Prevention a lot lately. They’re an organisation set up to help children born to drug-addicted mothers. The primary way they do this in the US is by offering addicts $300 to receive “long-term contraception”, which in some cases involves a form of sterilisation.

They’re coming to my attention because the woman behind it all, Barbara Harris, has come to the UK recently. And while I’m sure she’s filled with the best of intentions, I do not support this organisation.

I work in a substance misuse treatment centre. One of the nurses in my building is a Pregnancy Liaison, and works closely with a clinic at a local hospital to deal specifically with clients coming to us for treatment who are also pregnant. There are detailed protocols in place for handling this kind of thing, and I’ve typed up many assessments for substance-addicted women detailing their medical and psychiatric condition in the weeks before and after delivering a baby.

My point is that, in the UK, the NHS is kinda on this one already. It’s not totally escaped everyone’s notice that sometimes drug addicts have babies, and those babies might have problems that need medical support. If there’s good reason to support certain kinds of medical intervention to assist with this – such as long-term contraception – then why should this be done entirely independently by someone like Barbara Harris? Why should it not be integrated into the existing infrastructure?

It’s not at all clear that Project Prevention’s approach is based on good science or in their patients’ best interests. The fact that people have to be paid to submit to these treatments surely counts as a red flag that they’re not always the most healthy and sensible thing to do, otherwise why would they need such coaxing? And consider the first thing stated on their website’s page titled “Objectives”:

The main objective of Project Prevention is to reduce the number of substance exposed births to zero.

Maybe I’m being picky about bad writing more than anything else here, but I’d have thought that the main objective of a charitable medical organisation ought to be more along the lines of providing a high quality of support and care to as many patients as possible, rather than simply attempting to completely eradicate a certain type of behaviour.

It’d be like a family planning centre saying that their main objective was to reduce the number of abortions to zero. Sure, a world with no unwanted pregnancies might be a wonderful idea, but the focus of your activities should surely be to provide care where it’s needed.

So yeah. Not comfortable with this at all. The Northern Doctor is far more scathing.

– Nick Clegg gave a speech today about political reform. I’m cynical enough not to be falling over myself until I see some of this actually happening, and it’s disappointing not to see a repeal of the Digital Economy Act mentioned specifically. But hey, maybe something’ll come of it.

– And lastly, go watch my new favourite TED talk ever. This is so awesome. This is so awesome it almost makes me want to be a maths teacher. Seriously, I just love this guy and cannot fathom why he and people like him aren’t basically in charge of everything. Or at least everything to do with maths textbooks. I need to write about fun maths stuff here more often. So much of its unpopularity among kids is down to the dismal way it’s taught, and it’s tragically unfair.
