Why Science Keeps Changing Its Mind

Spencer Greenberg and Travis M.
Sep 3

Updated: Sep 4

Why is it that health recommendations often seem to contradict each other? Today, we’re going to offer you answers to that question, and you can choose your own adventure: either read the newsletter version below or watch the video version by clicking here.

Key Takeaways

Science may seem to flip-flop, but there are reasons for that. A major reason is that the wrong kind of study is used to draw causal conclusions.
Media and incentives fuel confusion. Another reason for flip-flopping comes from overstated findings and attention-grabbing headlines based on weak studies.
Well-powered, randomized controlled trials are the gold standard. Without them, placebo effects, noise, and group differences can easily mislead us.
Science looks inconsistent mostly for three reasons: weak studies treated as causal, rigorous trials being costly, and real differences across populations.

You may have heard that studies say we should reduce the amount of cholesterol in our diets - for instance, by avoiding eggs. But then later research suggested that actually, no, dietary cholesterol isn't bad for us after all. But then wait, even newer research suggests that the truth is actually more complicated than either of those stories.

You may also have heard that any amount of alcohol is bad for us. But later research suggested that, actually, one drink a day may be good for us. And then even newer research says maybe that’s not right; maybe any alcohol at all is bad for us after all.

This sort of issue isn't limited to food or nutrition. The same kind of problem happens with research into supplements. You may have heard, for instance, that vitamin C helps cure colds. Linus Pauling, a two-time Nobel Prize winner, promoted this perspective, and products like Emergen-C that come loaded with vitamin C became popular. But then, oh wait, newer research says vitamin C doesn't actually work if we take it after a cold has begun.

What’s the problem with ‘flip-flopping’ science?

This kind of flip-flopping about the truth presents two big problems:

First, it makes it challenging to decide how to live our lives. Should we try to avoid cholesterol in our diets, or not? If we want to be as healthy as possible, should we avoid all alcohol, or is a little alcohol fine, and maybe even good for us? It's easy to end up confused.

The second problem is that when the public perceives scientific research as contradicting itself, they can easily end up not trusting science. Even worse, they may end up trusting YouTubers instead.

I'm going to explain to you why this flip-flopping happens in research, and how it can be avoided. I'll also talk about the underlying drivers of it - is it a problem with science itself, or a problem caused by scientists, or science communication, or are none of those to blame?

But to answer those questions, we first have to understand what it takes to figure out what actually works in health and nutrition.

The Vitamin C Puzzle: A Case Study

Suppose you want to tell if taking vitamin C supplements helps prevent colds. How would you do that?

Well, suppose that a survey had already been conducted where people were asked whether they took vitamin C supplements over the past 12 months, and they were also asked how many colds they had gotten in that period.

Figuring out if vitamin C supplements help prevent colds is then simple: all you have to do is look at this data to see whether those who were taking vitamin C supplements got fewer colds, on average, than those who weren't - right?

This seems like a valid way to see if vitamin C helps colds. And a lot of research in the real world works this way. But this approach is actually significantly flawed. I recommend that you pause reading this article now, for a moment, to see whether you can explain what's wrong with this approach. Why can't you reliably tell if vitamin C helps prevent colds by checking whether those who were taking vitamin C supplements over the past 12 months got fewer colds than those who weren't?

Correlation vs. Causation (And Confounding Variables)

The general issue here is that we can't rule out the possibility of what are called “confounders.”

We want to show that "A" - which in this case is taking vitamin C - has a causal effect on "B" which in this case, is getting sick with colds.

But all we've shown using this method is that A and B are associated or correlated - not that A causes B.

When we find that A and B are associated, there are three common possibilities:

It could be that A causes B - in this case that would mean that taking vitamin C prevents colds.

It could be that B causes A. In this case that would mean that people who get more colds go take vitamin C more often - presumably because they have heard it helps with colds and are eager to stop their frequent colds.

It could be that some confounder - call it X - causes both A and B. In this case, that confounder could be general health-consciousness: people with healthy habits may be both more likely to take vitamin C AND less likely to get colds due to their other health behaviors.
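To make the third possibility concrete, here is a minimal simulation sketch. It assumes a made-up world in which vitamin C has no effect at all on colds, but a hidden trait (being health-conscious) raises the chance of taking vitamin C and lowers the chance of catching a cold. The function name `simulate_survey` and every probability in it are hypothetical, chosen only for illustration.

```python
import random
import statistics

def simulate_survey(n_people=20_000, seed=0):
    """Simulate a survey in which vitamin C has NO effect on colds.

    A hidden confounder (health-consciousness) drives both vitamin C use
    and cold frequency. All probabilities are invented for illustration.
    """
    rng = random.Random(seed)
    takers, non_takers = [], []
    for _ in range(n_people):
        health_conscious = rng.random() < 0.5
        # Confounder -> A: health-conscious people take vitamin C more often.
        takes_vitamin_c = rng.random() < (0.7 if health_conscious else 0.2)
        # Confounder -> B: health-conscious people catch fewer colds.
        # takes_vitamin_c is never used here, so there is no causal effect.
        monthly_cold_chance = 0.10 if health_conscious else 0.25
        colds = sum(rng.random() < monthly_cold_chance for _ in range(12))
        (takers if takes_vitamin_c else non_takers).append(colds)
    return statistics.mean(takers), statistics.mean(non_takers)

mean_takers, mean_non_takers = simulate_survey()
print(f"Average colds per year, vitamin C takers: {mean_takers:.2f}")
print(f"Average colds per year, non-takers:       {mean_non_takers:.2f}")
```

In this toy world, the survey shows vitamin C takers getting noticeably fewer colds, even though the supplement does nothing. The association is real, but it is produced entirely by the confounder.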

So we've seen that showing that A and B are associated or correlated doesn't prove that A causes B. Does that mean that associations are useless? Actually, they can be quite useful for two reasons:

If no association is found, that is moderately strong evidence that A doesn't cause B. So, if we found no association between taking vitamin C and fewer colds, that suggests that vitamin C is not a promising way to prevent colds.

And:

Associations are a good way to get hypotheses. For instance, if vitamin C and getting fewer colds are found to be associated with each other, that suggests a hypothesis to consider: perhaps taking vitamin C causes fewer colds. As Randall Munroe (creator of XKCD) put it:

*"Correlation doesn’t imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing ‘look over there’."*

But if we find that two things are associated, we still need further evidence to tell which way the causality goes. Is it that A causes B, that B causes A, or that some confounder X causes both A and B?

Associations can be very interesting to look at: they are a starting point for research, they can help generate hypotheses, and they can even help shoot hypotheses down. With this in mind, we gathered 1,000,000 correlations about humans and made them publicly available. If you're interested, you can search them here, for free, at PersonalityMap.io:

Launch PersonalityMap!

For instance, you can use it to look at the correlation between anxiety and depression, or to look up common traits of narcissists.

Getting back to the subject at hand - if we can't use associations to prove that taking vitamin C prevents colds, what should we do instead?

What’s Wrong With This Experiment?

Well, here's an idea for an experiment to fix the problem. We can simply pick 10 people, find out how often they typically get colds, and then give them vitamin C for 12 months and see how many colds they get. If they have fewer colds than normal, we've proven that vitamin C prevents colds - right?

Unfortunately, that's not right. What if those 12 months were simply a period when the cold season was less bad than normal? That would make it seem like vitamin C worked, when in fact most people got fewer colds during that period than normal whether they took vitamin C or not. Or, what if there's a placebo effect - perhaps believing you are getting a preventative treatment for colds reduces your stress, which actually makes you less likely to get colds? In that case, it would seem like the vitamin C was working, but actually a sugar pill would work just as well.
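The mild-cold-season problem can be sketched with a small simulation. It assumes, purely for illustration, that vitamin C does nothing, that people have a 25% chance of a cold in any month of a normal year, and that the study year happens to be milder (15% per month). The function names and numbers are hypothetical.

```python
import random

def colds_in_year(rng, monthly_cold_chance):
    """Number of colds one person gets over 12 months."""
    return sum(rng.random() < monthly_cold_chance for _ in range(12))

def fraction_of_studies_fooled(n_studies=2_000, n_people=10, seed=0):
    """Share of before/after studies that 'show' vitamin C working."""
    rng = random.Random(seed)
    fooled = 0
    for _ in range(n_studies):
        # Baseline year: a normal cold season, no vitamin C.
        before = sum(colds_in_year(rng, 0.25) for _ in range(n_people))
        # Study year: everyone takes vitamin C (which does nothing here),
        # but the cold season happens to be milder for everybody.
        after = sum(colds_in_year(rng, 0.15) for _ in range(n_people))
        if after < before:
            fooled += 1
    return fooled / n_studies

fooled_share = fraction_of_studies_fooled()
print(f"Studies where colds dropped 'on vitamin C': {fooled_share:.0%}")
```

Under these assumptions, the large majority of 10-person before/after studies report fewer colds on vitamin C, even though the drop comes entirely from the milder season.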

So, how do we get around these issues? Well, we need to make another modification to our experiment.

This time, instead of just picking 10 people to give vitamin C to (say, 10 college students) we can also pick 10 people to monitor without giving them vitamin C (say, 10 people from our nearby community) who we will give a placebo sugar pill instead. None of the participants will know whether they are getting vitamin C or the sugar pill. Then, for the next 12 months, we'll monitor any colds in the two groups. Finally, we'll compare them to see which group got fewer colds.

This solves the problem we had before with changes that impact everybody (such as a mild cold season), and it also handles the placebo effect. But, unfortunately, it introduces another important problem - can you tell what it is?

Why Your Control Group Might Be Flawed

The issue is that the 10 college students (who we selected to give vitamin C to) may differ in important ways from the 10 people from our nearby community (who we gave the sugar pills to). If the college students have stronger immune systems on account of being younger, they may get fewer colds than the other group not because vitamin C worked for them - but simply because they are younger.

Thankfully, there's a simple fix for this. Can you guess what it is?

It's one of the most powerful tools that studies use: randomization! All we have to do to solve this problem is randomize which people get the placebo and which get vitamin C. This prevents any systematic differences (on average) between the placebo group and the vitamin C group.
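Here is a rough sketch of why randomization helps, using age as the trait we worry about. It assumes a hypothetical pool of 200 college students (ages 18 to 22) and 200 community members (ages 25 to 75), and compares hand-picked groups against randomly assigned ones. The function `mean_age_gap` and the age ranges are illustrative assumptions.

```python
import random
import statistics

def mean_age_gap(assign_randomly, n_per_source=200, seed=1):
    """Average age of the placebo group minus the vitamin C group."""
    rng = random.Random(seed)
    students = [rng.randint(18, 22) for _ in range(n_per_source)]
    community = [rng.randint(25, 75) for _ in range(n_per_source)]
    if assign_randomly:
        # Pool everyone, shuffle, and split in half: chance alone decides
        # who gets vitamin C and who gets the placebo.
        everyone = students + community
        rng.shuffle(everyone)
        half = len(everyone) // 2
        vitamin_c_group, placebo_group = everyone[:half], everyone[half:]
    else:
        # Hand-picked groups: students get vitamin C, community gets placebo.
        vitamin_c_group, placebo_group = students, community
    return statistics.mean(placebo_group) - statistics.mean(vitamin_c_group)

hand_picked_gap = mean_age_gap(assign_randomly=False)
randomized_gap = mean_age_gap(assign_randomly=True)
print(f"Age gap with hand-picked groups: {hand_picked_gap:.1f} years")
print(f"Age gap with randomized groups:  {randomized_gap:.1f} years")
```

With hand-picked groups, the placebo group is decades older on average. With random assignment, the gap shrinks to a small amount of chance variation, and the same balancing applies to traits we never thought to measure.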

But wait, we’re still not quite done. There's still a big problem with this experiment.

The Final Problem: Statistical Noise & Sample Size

The last issue is that with just 10 people in each group, even if vitamin C truly does help reduce colds, there’s a pretty good chance that just due to random noise we’ll actually end up with more colds in the vitamin C group. That's because the number of colds we get each year fluctuates randomly. So, just by chance, the 10 people getting vitamin C might happen to get more colds, even if vitamin C is helpful.

Instead of recruiting 20 people and randomizing 10 to get the placebo and 10 to get vitamin C, we can recruit 400 people and randomize 200 to get the placebo and 200 to get vitamin C.
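To get a feel for how much sample size matters, here is a simulation sketch. It assumes, hypothetically, that vitamin C really does help a little: a 25% monthly chance of a cold on placebo versus 22% on vitamin C. It then counts how often a randomized trial ends with more colds in the vitamin C group anyway. The function names and effect size are made up for illustration.

```python
import random

def total_colds(rng, n_people, monthly_cold_chance):
    """Total colds across a group over 12 months."""
    return sum(rng.random() < monthly_cold_chance for _ in range(n_people * 12))

def share_of_misleading_trials(n_per_group, n_trials=1_000, seed=0):
    """Share of trials where the vitamin C group gets MORE colds."""
    rng = random.Random(seed)
    misleading = 0
    for _ in range(n_trials):
        placebo = total_colds(rng, n_per_group, 0.25)
        # Vitamin C truly helps in this toy world (assumed effect size).
        vitamin_c = total_colds(rng, n_per_group, 0.22)
        if vitamin_c > placebo:
            misleading += 1
    return misleading / n_trials

small_trial_share = share_of_misleading_trials(n_per_group=10)
large_trial_share = share_of_misleading_trials(n_per_group=200)
print(f"Misleading trials with 10 per group:  {small_trial_share:.1%}")
print(f"Misleading trials with 200 per group: {large_trial_share:.1%}")
```

Under these assumptions, a sizable share of the 10-per-group trials point in the wrong direction, while the 200-per-group trials almost never do.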

With enough people in our study, if vitamin C really does meaningfully reduce the number of colds, we'll likely be able to tell by comparing the average number of colds in the vitamin C group to the average number in the placebo group. We can even do some statistics to calculate the probability of getting a difference in the averages at least as big as the one we observed if, in fact, vitamin C doesn’t work. This is known as the ‘p-value’.
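One simple way to estimate a p-value is a permutation test, sketched below. This is just one of several valid approaches, and it is not necessarily what any particular study uses. The idea: if vitamin C doesn't work, the group labels are meaningless, so we shuffle them many times and see how often the shuffled difference is at least as big as the real one. The function name and the tiny datasets are hypothetical, and the test is one-sided (it only asks whether the vitamin C group got fewer colds).

```python
import random

def permutation_p_value(treated, control, n_shuffles=5_000, seed=0):
    """One-sided permutation p-value for 'treated has fewer colds'."""
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = mean(control) - mean(treated)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_big = 0
    for _ in range(n_shuffles):
        # Shuffling reassigns the labels as if they carried no information.
        rng.shuffle(pooled)
        shuffled_diff = mean(pooled[n_treated:]) - mean(pooled[:n_treated])
        if shuffled_diff >= observed:
            at_least_as_big += 1
    return at_least_as_big / n_shuffles

# Made-up yearly cold counts for illustration only.
vitamin_c_colds = [1, 2, 1, 0, 2, 1, 3, 1, 2, 1]
placebo_colds = [3, 2, 4, 3, 5, 2, 3, 4, 3, 4]

p_value = permutation_p_value(vitamin_c_colds, placebo_colds)
p_value_no_difference = permutation_p_value(placebo_colds, placebo_colds)
print(f"p-value, made-up trial data:  {p_value:.4f}")
print(f"p-value, identical groups:    {p_value_no_difference:.4f}")
```

A small p-value means a difference this large would rarely show up by chance alone if vitamin C did nothing. When the two groups are identical, the p-value is large, because shuffled differences match or beat the observed one about half the time or more.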

The Gold Standard: Randomized Controlled Trials (RCTs)

Notice how, in trying to answer the simple question “Does vitamin C prevent colds?” we had to correct error after error after error, to converge on a study design that actually tells us if one thing causes another.

The study design that we converged on today is what's known as a well-powered randomized controlled trial - sometimes abbreviated as an "RCT".

It's "controlled," meaning that we don't just give people vitamin C; we also have a control group. In “placebo controlled” studies, the control group gets a placebo, but other studies use other controls (e.g., some studies give the control group the best currently existing treatment).
It's randomized, meaning that we don't just manually pick some people to get the vitamin C and others to get the placebo; we randomize which one each person gets to avoid any systematic differences between the two groups.
And it's well-powered, meaning that there are enough participants in the study to reliably detect the effect, if vitamin C actually does work.

Well-powered randomized controlled trials are powerful. Unlike surveys, studies with no control group, and studies with only a few participants, well-powered randomized controlled trials can reliably tell us if one thing causes another.

This brings us back to the original question that we set out to answer: Why does it seem like nutrition research and supplement research keep flip-flopping in what they tell us?

Why Research Seems to Flip-Flop

The biggest reason is that most studies don't allow us to reliably tell if one thing causes another. Unfortunately, when people see these weaker studies come out, they often prematurely jump to the conclusion that X causes Y.

Are scientists to blame for this? Often no, but sometimes, yes. Many scientists are cautious in their claims, and they don't make it seem like they've found strong evidence for X causing Y unless they really have. But there are some scientists who write their papers in a way to get more attention, which makes it seem like they've shown X causes Y even though they really haven't. The news media contributes a great deal to this problem - by latching on to the latest study without understanding what it really did and didn't show. And influencers who are not scientifically trained make it worse, by latching on to weak studies without understanding their limitations. These forces create bizarre situations, as depicted in this chart, where it seems like everything both prevents and causes cancer:

(Source: https://www.vox.com/2015/3/23/8264355/research-study-hype)

A major reason that a chart like this shows so many contradictory studies is that many of those studies are not the appropriate design to show whether something actually causes or prevents cancer.

The 3 Real Reasons for Contradictory Science

Fundamentally, the problem is not that science lacks the tools to figure out what works in health, supplements, and nutrition. Instead, there are three problems happening simultaneously:

First, lots of studies that aren't capable of showing whether X causes Y are used as though they can answer questions like that. So studies appear to contradict each other when, in fact, the studies are just being misused.

Second, the actual studies we need to figure out answers are often very time-consuming and very expensive to conduct, so lower-quality studies get done instead. These don’t actually answer the questions we care about.

Third, the world is actually very complex, and human health is very complicated - which means that even the highest quality studies occasionally disagree for very good reasons. For instance, one well-powered randomized controlled trial might correctly conclude that a supplement is helpful in one population, whereas another well-conducted trial might conclude that the supplement doesn't work in another population. But both can be correct - the reason for the disagreement could simply be that the first population was deficient in something that the supplement provided, and the second population wasn't deficient, so the supplement didn’t actually help!

Science does have the power to answer difficult questions, but the right tool has to be used for the job. The wrong type of study won't answer your questions, but the right type of study can.

If you enjoyed this article, you might enjoy watching the same ideas come to life in our video version. And if you want more engaging explanations like this, you can subscribe to Spencer’s YouTube channel here:

See Spencer Greenberg on YouTube!