I’ve mentioned a couple of times that Rebecca Watson has been carrying out something she’s calling The Great Apple Experiment over the past week or so. Inspired by some pretty terrible tabloid-friendly pseudoscience in the Daily Fail, Rebecca has been dutifully re-creating the initial “experiment” that Nikki Owen apparently thinks demonstrates that an apple will respond to being spoken to in a loving or hateful way, or to having loving or hateful words written on its container.
She’s kept a log of her methodology on her YouTube account, and recently unveiled the final state of all three of her apple chunks – one treated with love, one with hate, and one control treated indifferently. Her loyal followers then voted on which sample they thought had degraded the most, and which the least, and some mathemagic was done to the poll numbers to decide whether any apples’ feelings had genuinely been hurt in this experiment.
The reason for the title of this blogpost is that I’ve been doing a similar thing myself. I haven’t been recording my progress as extensively all week, and I’ve been a little more lax with my protocols, but, like Rebecca, I’m planning to tighten this up and repeat the experiment in the near future. I’m ready to unveil the results of my own Meh Apple Experiment here today.
Here’s what my three pieces of apple looked like before being sealed in little transparent jars from the 99p Store for a week:
Now, before you scroll past this next picture, take a look at how they turned out after a week, and make up your mind which had degraded the most, and which the least. Then I’ll tell you which I was treating positively, which negatively, and which was the control. I’ve laid them out in the same order as before, so that you can see the progression, not just the final state – if one looks worse in the end, it may have been a little grubbier to begin with, so this seems like useful knowledge to add. The protocol’s still far from ideal, and my camera appears to have forgotten how to focus on stuff, but never mind, here’s what they looked like after:
So, which do you think was put in the “hate” jar, and spoken hatefully to? Now’s the time to choose.
For me, it’s got to be the one on the right. The other two had gone a bit soft but didn’t really look that bad, whereas the right-hand one has that big icky splotch of mould right there. It definitely seemed to fare worse than the other two. So, which was actually the “hate” apple?
None of them. It’s a trick question. In a cunning bit of scientific mischief, I got lazy and totally couldn’t be bothered drawing up and attaching labels to the jars, or talking to bits of an apple like some sort of idiot. They’ve all just been sitting on a shelf for a week, in as close to identical conditions as you could hope for. As it turns out, one of them just seemed to decay a bit more than the others in that time. Does there need to be a reason? Shit just happens.
So. Having smoothly passed off my sloth and disregard for scientific integrity as a clever piece of deliberate subterfuge, I do in fact plan to follow this up with a proper experiment in the near future. But this might as well serve as a useful reminder that chaos is always going to play a part in this kind of thing, and you need to control for randomness. In particular, you need a sample size larger than one.
This is probably what annoys me most about the ridiculous piece in the Mail. Even if you accept that the apple chunk she was nice to really did decay more slowly, she’s holding up a single result, one which had a 50% chance of happening at random anyway, as evidence for supernatural forces at work in the universe. Look, if I flip a coin twice, I would expect, on average, to get tails once and heads once – 50% each way. But even if I happen to achieve a massive 100% score one way or the other, my psychic mastery of the physical world may still have a way to go.
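To put rough numbers on the coin-flip point, here’s a minimal Python sketch. It assumes a fair coin and independent flips, and the function name `prob_all_same` is just something I’ve made up for illustration. It counts how often a run of flips comes out all heads or all tails purely by chance.

```python
from fractions import Fraction
from itertools import product


def prob_all_same(n_flips):
    """Exact probability that n fair, independent coin flips all land the same way."""
    # Enumerate every possible sequence of heads (H) and tails (T).
    outcomes = list(product("HT", repeat=n_flips))
    # Keep only the sequences that contain a single distinct face.
    same = [o for o in outcomes if len(set(o)) == 1]
    return Fraction(len(same), len(outcomes))


two_flips = prob_all_same(2)   # the "massive 100% score" from two flips
ten_flips = prob_all_same(10)  # the same feat over ten flips

print(two_flips)  # 1/2
print(ten_flips)  # 1/512
```

A perfect score from two flips turns up half the time with no psychic powers involved, while a perfect score from ten is genuinely rare. That gap is the whole argument for wanting more than one trial.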
There are now many more data points, and I don’t think it’s fair to conclude that the overall result is favourable to Nikki Owen’s claims of magic. And I wonder, of the people who dismissed Rebecca’s experiment due to insufficient scientific rigor, just how closely they examined Nikki’s own methodology to make sure her results were also valid. Or maybe it’s not important to do that, because her data supports what they want to believe.
In her own summary of results, though, I do slightly take issue with Rebecca’s mathematical reasoning. I’m hesitant to be too critical, since she was being advised by a proper maths guy who should know his stuff, but hear me out.
72% of respondents thought that the “love” apple looked the best, and only 10% thought it looked the worst. Between the other two, slightly more people thought the control “indifferent” apple looked worse than the “hate” apple. So Skepchick readers’ collaborative effort to determine which apple was which resulted in a 1/3 rate of success (since they got the “love” one right but mixed the other two up).
Now, Rebecca’s conclusion is that this result “failed to prove [Nikki Owen’s] hypothesis”, because only one of the three apples was correctly placed. I think she’s right, but for the wrong reason. Her result fails to “prove” anything, about any hypothesis, because it also has a sample size of one.
One lone, isolated trial of something like this can’t single-handedly “prove” anything, in the same way that nothing is “proven” about alternative medicine by that one time you took some homeopathy and your flu totally went away after like a week.
Rebecca’s result is compatible with pretty much any hypothesis that doesn’t contain any overwhelming generalisations. It’s entirely in line with the idea that the decay of apples can be slowed by speaking to them lovingly and caringly; it’s also perfectly consistent with the (correct) theory that this is all total bunk. It’s just one apple.
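As a rough illustration of why one trial can’t separate these ideas, here’s a small Python sketch of what pure guesswork would produce. It rests on a big simplifying assumption of my own: that the crowd’s verdict is treated as a single ordering of the three labels picked uniformly at random, which is not how the actual poll worked. The names `LABELS` and `correct_count_distribution` are invented for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

# Hypothetical labels for the three apple pieces, in their "true" order.
LABELS = ("love", "control", "hate")


def correct_count_distribution(labels=LABELS):
    """Exact distribution of how many labels a uniformly random ordering gets right."""
    orderings = list(permutations(labels))
    counts = Counter()
    for guess in orderings:
        # A "hit" is a label that the guess puts in its true position.
        hits = sum(g == t for g, t in zip(guess, labels))
        counts[hits] += 1
    total = len(orderings)
    return {hits: Fraction(n, total) for hits, n in sorted(counts.items())}


dist = correct_count_distribution()
for hits, p in dist.items():
    print(hits, p)  # prints: 0 1/3, then 1 1/2, then 3 1/6

# Average number of correct labels under blind guessing.
expected_hits = sum(h * p for h, p in dist.items())
print(expected_hits)  # 1
```

Under that assumption, getting exactly one apple out of three right is the single most likely outcome of blind guessing, and it’s also the average. So a one-in-three score on a single trial sits comfortably with “it’s all chance”, and says nothing about the 72% vote share itself, which would need repeated trials to interpret.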
I am personally compatible with the hypothesis that people with a green left eye tend to have a green right eye, because both my eyes are green. There is, in fact, a strong correlation between those two variables, but you’d have to look at more faces than mine before you could conclude that. Also, the existence of other people whose eyes aren’t both the same colour doesn’t completely invalidate the model which says that these two things tend to be associated. I’m also consistent with the hypothesis that people with brown hair tend to wear glasses, but if I were the only human you’d ever studied, you wouldn’t know what to make of that idea either.
No individual data point is going to lead to any useful conclusions on its own here. If we’re going to treat this like an idea that deserves to be checked out, we need to get much more data in before we have any idea what to do with the null hypothesis.
Personally, I wonder how much it’s even worth treating this kind of thing seriously, and the extent to which skeptics are obliged to do proper science on this kind of insubstantial nonsense before we’re just allowed to tell the silly people to go away. But that’s a musing for another time. I’ve rambled way too much on this already.