Evidence Thresholds

Defenders of science-based medicine are often confronted with a question (a challenge, really): what would it take to convince you that “my sacred cow treatment” works? The challenge contains a thinly veiled accusation: no amount of evidence would convince you, because you are a nasty skeptic.

There is a threshold of evidence that would convince me of just about anything, however. In fact, I have been convinced that many scientific claims are likely to be true — sufficiently convinced to act upon the conclusion that they are true. In medicine this means that I am convinced enough to use them as a basis for medical practice.

There are many functional differences between practitioners of SBM and those who accept claims and practices that we would consider to be pseudoscience or fraud, but I was recently struck by one particular such difference — where we set the threshold of evidence before accepting a claim.

Last week I took part in a debate about the legitimacy of homeopathy (you can read my full account here and here). On the other side was Andre Saine, a Canadian naturopath and Dean of the Canadian Academy of Homeopathy. If I had to summarize the key difference between Saine’s position and my own during the debate it was that he accepted incredibly poor evidence as sufficient to establish the reality of homeopathy. His threshold of evidence was astoundingly low.

At the same time he expressed disbelief that skeptics could maintain their skepticism in light of the evidence he presented. I could come to no other conclusion than that Saine simply has no concept of the usual threshold of evidence for acceptance in medicine and in mainstream science generally.

What Would It Take?

What convincing evidence looks like is something we have thought about and written about extensively. I also write about scientific topics outside of medicine, and this has helped give me a broader perspective on this question as well. Proponents of ESP, for example, also accept a lower standard of evidence.

The question is: at what point should the scientific community generally accept the reality of a phenomenon? The answer is the point at which the alternative explanation, that the positive data are flawed or misleading in some way, can be confidently ruled out.

Here are the four criteria that need to be met simultaneously for evidence to be considered scientifically compelling:

1- Methodologically rigorous, properly blinded, and sufficiently powered studies that adequately define and control for all relevant variables (confirmed by surviving peer-review and post-publication analysis).

2- Positive results that are statistically significant.

3- A reasonable signal to noise ratio (clinically significant for medical studies, or generally well within our ability to confidently detect).

4- Independently reproducible. No matter who repeats the experiment, the effect is reliably detected.

What we often see with dubious medicine (like homeopathy) is that only criterion 2 is treated as necessary: any study that shows statistical significance is taken as iron-clad.

We also often see a shell game similar to the one played when you buy a new car. Car salespeople often use the four square method: they divide a sheet of paper into four squares, with the price of the car in one, the loan rate in another, the down payment in the third, and the money they will give you on the trade-in in the fourth. Together these determine your monthly payment.

But here’s the trick – the car dealer will use this method to make sure they make their profit. If they give you a good deal on the trade-in, then they don’t give you a good deal on the price of the car. You can never get a good deal on all four squares at the same time.

Proponents of dubious science work the same way: they offer studies that meet one or maybe two of the above criteria, but never all four at the same time. They may offer a poorly designed study with positive results, or a well-designed study with positive results that are clinically insignificant and cannot be replicated.

There is a reason why you cannot get all four criteria at the same time – because the phenomenon in question is not real. Only a real effect would show up consistently in highly rigorous studies.

It also needs to be pointed out that meeting these criteria is the baseline for scientific acceptance, without even considering prior plausibility. Within each criterion there is a range of quality – how rigorous a study, replicated how many times, with what effect size? The more implausible a claim, the greater the threshold should be in order to overcome that implausibility.
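The relationship between plausibility and the evidence threshold can be sketched with Bayes' theorem. The following is a minimal illustration in Python, not a formal model: the prior probabilities, the 80% study power, and the 5% false-positive rate are all assumed values chosen for illustration, and the function name is hypothetical.

```python
def posterior_probability(prior, power=0.8, alpha=0.05):
    """Probability that a claim is true after ONE statistically significant study.

    prior: assumed probability the claim is true before the study (illustrative)
    power: assumed chance the study detects a real effect (illustrative)
    alpha: assumed chance of a significant result when there is no effect
    """
    true_positive = prior * power
    false_positive = (1.0 - prior) * alpha
    return true_positive / (true_positive + false_positive)


# Compare a plausible claim, a long shot, and a highly implausible claim.
for prior in (0.5, 0.1, 0.001):
    posterior = posterior_probability(prior)
    print(f"prior {prior:>6}: posterior after one positive study = {posterior:.3f}")
```

Under these assumptions, a single positive study makes a plausible claim quite likely to be true, but it leaves a highly implausible claim still very unlikely. That is the sense in which more implausible claims require more, and better, evidence.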

Homeopaths and proponents of implausible claims don’t like this reasoning. They deride it as “plausibility bias.” Everyone else calls it “science.”

But it is important to point out that even without considering prior probability or plausibility, homeopathy still fails to meet even the minimal scientific criteria for acceptance. It is not even close – even if we give it every benefit of the doubt.

Defending the Threshold

If you are convinced of the reality of something like homeopathy, acupuncture, energy medicine, colonic therapy, or something equally unlikely, the threshold of acceptance seems unfair and broken. It seems like a trick nasty skeptics use to deny the reality of your fabulous medicine.

It is, rather, the standard threshold of acceptance within mainstream science (obviously there is a range within that standard, but it is at least a minimal threshold).

Part of the philosophy of science-based medicine is that such a rigorous standard is justified and necessary, and in fact it needs to be even higher than perhaps is currently practiced in medicine. We have discussed many reasons why this is the case. Each one could be the topic of a lengthy article, but I will just summarize them here.

– Medical research is challenging as people are generally a variable and noisy system in which to conduct studies and control variables.

– Placebo effects are multifarious and difficult to account for completely.

– Researcher degrees of freedom make it possible to manufacture positive results even out of a completely non-existent phenomenon. This requires special rigor in study design and execution to avoid, and also independent replication.

– Most published studies are wrong, because most new ideas in medicine do not pan out, and most studies are preliminary and are therefore subject to significant researcher bias in the positive direction.

– There is occasional fraud in scientific research.

– Publication bias distorts the overall scientific literature.

– There is considerable financial bias in medical research, as this is an applied science and often billions of dollars ride on the results of research.

– Humans generally are subject to a host of cognitive biases, heuristics, logical fallacies, errors in memory and perceptions, and other mechanisms of self-deception. We can naively be led to believe almost anything with utmost confidence.
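The point above about researcher degrees of freedom can be illustrated with simple arithmetic. This is a minimal sketch in Python that assumes each comparison is an independent test at the conventional 0.05 significance level; real analytic choices are usually correlated, so the numbers are illustrative only, and the function name is hypothetical.

```python
def chance_of_false_positive(num_comparisons, alpha=0.05):
    """Chance that at least one of several independent tests of a
    completely non-existent effect comes out 'statistically significant'."""
    # Probability that every test is (correctly) negative, subtracted from 1.
    return 1.0 - (1.0 - alpha) ** num_comparisons


# One pre-specified outcome versus many outcomes, subgroups, or analyses.
for k in (1, 5, 20):
    print(f"{k:>2} comparisons: {chance_of_false_positive(k):.0%} chance of a 'positive' result")
```

Under these assumptions, a researcher who is free to try twenty different comparisons is more likely than not to find at least one "significant" result even when nothing real is there, which is why pre-specified methods and independent replication matter.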

Conclusion

Rigorous science anchors us to reality. Without it our beliefs will drift off into a fantasy world that caters to our emotions and desires but has little connection to reality. We might end up believing that pure water can retain the “essence” of a substance that was once diluted in it, and that this essence can heal people based upon completely unrelated features, such as their personality type.

People, left to their own devices, will believe in magic. Such a tendency is our evolutionary inheritance. But so is the capacity for logic and critical thinking.

Over the past two centuries, as scientific medicine has matured, we have learned to apply greater and greater rigor to the study of medicine and disease. We have learned more about our capacity for self-deception, and the subtle ways to manipulate data and research.

We now know what it takes to prove that something is really real, and doesn’t just seem real. We should vigorously resist those who wish to discard this hard-won wisdom because it threatens their cherished belief.

Author

Steven Novella

Founder and currently Executive Editor of Science-Based Medicine, Steven Novella, MD, is an academic clinical neurologist at the Yale University School of Medicine. He is also the host and producer of the popular weekly science podcast, The Skeptics’ Guide to the Universe, and the author of the NeuroLogicaBlog, a daily blog that covers news and issues in neuroscience, but also general science, scientific skepticism, philosophy of science, critical thinking, and the intersection of science with the media and society. Dr. Novella has also produced two courses with The Great Courses and published a book on critical thinking, also called The Skeptics’ Guide to the Universe.

