Plausibility bias? You say that as though that were a bad thing!

On Friday, you might have noticed that Mark Crislip hinted at a foreshadowing of a blog post to come. This is that blog post. He knew it was coming because when I saw the article that inspired it, I sent an e-mail to my fellow bloggers marking out my territory like a dog peeing on every tree or protecting my newfound topic like a mother bear protecting her cubs. In other words, I was telling them all to back off. This article is mine.

Mine! Mine! Mine! I tell you!

My extreme territorial tendencies (even towards my friends and colleagues) on this issue notwithstanding, if you read Mark’s post (and if you didn’t, go back and read it now; seriously, go now), you might also remember that he was discussing a “reality bias” in science-based medicine (SBM), a bias that we like to call prior plausibility. In brief, positive randomized clinical trials (RCTs) testing highly implausible treatments are far more likely to be false positives than RCTs testing more plausible treatments. That is the lesson that John Ioannidis has taught us and that I’ve written about multiple times before, as have other SBM bloggers, most prominently Kimball Atwood, although nearly all of us have chimed in at one time or another about this issue.

Apparently a homeopath disagrees and expressed his disagreement in an article published last week online in Medicine, Health Care, and Philosophy entitled Plausibility and evidence: the case of homeopathy. You’ll get an idea of what it is that affected us at SBM like the proverbial matador waving his cape in front of a bull by reading this brief passage from the abstract:

Prior disbelief in homeopathy is rooted in the perceived implausibility of any conceivable mechanism of action. Using the ‘crossword analogy’, we demonstrate that plausibility bias impedes assessment of the clinical evidence. Sweeping statements about the scientific impossibility of homeopathy are themselves unscientific: scientific statements must be precise and testable.

Scientific. You keep using that word. I do not think it means what you think it means. Of course, his being a homeopath is about as close to a guarantee as I can think of that a person doesn’t have the first clue what is and is not scientific. If he did, he wouldn’t be a homeopath. Still, this particular line of attack is often effective, whether wielded by a homeopath or other CAM apologist. After all, why not test these therapies in human beings and see if they work? What’s wrong with that? Isn’t it “close-minded” to claim that scientific considerations of prior plausibility consign homeopathy to the eternal dustbin of pseudoscience?

Not at all. There’s a difference between being open-minded and being so “open-minded” that your brains threaten to fall out. Guess which category homeopaths like Rutten fall into. But to hear them tell it, homeopathy is rejected because we scientists have a “negative plausibility bias” towards it. At least, that’s what Rutten and some other homeopaths have been trying to convince us. This article seems to be an attempt to put some meat on the bones of their initial trial balloon of this argument published last summer, which Steve Novella duly deconstructed.

Before I dig in, however, I think it’s necessary for me to “confess” my bias and why I think it should be your bias too.

In which I confess my bias

Regular readers might have noticed that we write about homeopathy a lot on this blog. You might wonder why. Indeed, sometimes I myself wonder why. After all, if you want to come up with a list of the top three most ridiculous alternative medicine modalities with a large following, surely homeopathy will almost always be on the list, along with energy healing modalities (such as reiki) and a third nutty modality whose identity will be left to the reader, given that there is likely to be some disagreement about it.

In any case, among highly implausible alternative medicine “healing systems,” homeopathy is at or near the top of the heap, reigning supreme. Consider its twin pillars of “like cures like” and the law of infinitesimals. The former says that to relieve a set of symptoms you choose a remedy that causes those symptoms in healthy people; the latter says that those “like” remedies get stronger if they are highly diluted in serial steps, but only if they are vigorously shaken or “succussed” between each step. The first principle has no basis in physiology, pharmacology, biochemistry, or medicine (the claims of homeopaths to co-opt a real phenomenon known as hormesis notwithstanding), while the second principle so thoroughly violates the laws of chemistry and physics that, for it to be true, huge swaths of these disciplines that have been well-established through hundreds of years of experimentation and observation would have to be not just wrong, but spectacularly wrong. One must concede that it’s possible that this latter principle might be true, but the odds that it is are about as infinitesimal as the amount of starting remedy in a 30 C homeopathic remedy. (That’s a 1 in 10^60 chance, for those not familiar with homeopathy.) For all practical intents and purposes, the chance that homeopathy can work is zero. It is just water with its believer’s magical intent imagined into it.
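For anyone who wants to check the dilution arithmetic, here is a minimal sketch in Python. It assumes, purely for illustration, that each “C” step is a 1:100 dilution and that you start with a full mole of the remedy and keep the entire diluted volume; the function names are my own invention, not anything from the homeopathic literature.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def dilution_exponent(c_potency: int) -> int:
    """Power of ten of the dilution after `c_potency` serial 1:100 steps."""
    return -2 * c_potency


def expected_molecules(c_potency: int, starting_moles: float = 1.0) -> float:
    """Expected number of original molecules left.

    Illustrative assumption: we begin with `starting_moles` of substance
    and the whole diluted volume is retained.
    """
    return starting_moles * AVOGADRO * 10.0 ** dilution_exponent(c_potency)


for c in (7, 12, 30):
    print(f"{c}C: dilution 10^{dilution_exponent(c)}, "
          f"expected molecules left ~ {expected_molecules(c):.3g}")
```

Under these assumptions the expected count drops below a single molecule at around 12C, which is why dilutions beyond that point are the ones usually described as containing nothing but water.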

So ridiculous is homeopathy that I sometimes feel that I and my fellow supporters of SBM are firing Howitzers at an ant when we take so much time and effort to explain why homeopathy is nonsense. On the other hand, it is homeopathy’s monumental lack of scientific plausibility that makes it a perfect teaching tool explaining the difference between science-based medicine (SBM) and evidence-based medicine (EBM). Specifically, clinical trials have unavoidable shortcomings and biases, even at a p=0.05, which would imply only approximately a 5% chance that a given trial’s apparently positive results could be due to random chance alone. As John Ioannidis has taught us, in clinical trials as practiced in the real world, the chance is much higher that any given positive trial is a false positive. As explained in so much detail by Kimball Atwood, that also means that the lower the prior plausibility of a remedy working, the higher the chance of false positive trials. This is exactly what we see in homeopathy, hence the panoply of homeopathy trials showing “positive” results in which the treatment group is barely different from the control and/or the results barely reach statistical significance. With something like homeopathy, which violates the laws of so many sciences, it is relatively easy to make the case that it takes a lot more than a few equivocal clinical trials to show that so much well-established science is wrong. Apparently positive clinical trials of homeopathy are measuring, in essence, the noise inherent in doing clinical trials.
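To see how prior plausibility drives the false positive rate, here is a rough sketch in Python of the standard Bayes’ theorem calculation for the probability that a “positive” trial reflects a real effect. It is deliberately idealized: it ignores bias and poor study design (which only make matters worse), and the default power of 0.8 and significance threshold of 0.05 are illustrative assumptions on my part, as is the function name.

```python
def positive_predictive_value(prior: float, power: float = 0.8,
                              alpha: float = 0.05) -> float:
    """Probability that a statistically significant trial is a true positive.

    prior: pre-trial probability that the treatment really works
    power: chance a trial detects a real effect (assumed value)
    alpha: chance a trial of an inert treatment comes out "positive"
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    total = true_positives + false_positives
    return true_positives / total if total > 0 else 0.0


# Compare a plausible drug, a long shot, and something far less plausible.
for prior in (0.5, 0.1, 0.001, 1e-6):
    ppv = positive_predictive_value(prior)
    print(f"prior = {prior:g}: P(real effect | positive trial) = {ppv:.6f}")
```

The exact numbers depend entirely on the assumed inputs, but the pattern does not: as the prior shrinks towards zero, nearly every “positive” trial is a false positive.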

Although most physicians and clinical investigators don’t think about it consciously, they tend to have a bias for plausible hypotheses and treatments in evidence-based medicine and against implausible hypotheses. This bias is certainly not inherent in EBM, as we have described many times before. EBM, after all, relegates basic science considerations to the very bottom rung of its ladder of evidence, below even expert opinion. Clinical trial evidence and epidemiology are all, and, although EBM aficionados deny it, EBM as it is practiced does appear to worship the randomized clinical trial (RCT) above all else. In fact, that “plausibility bias” that most physicians have often manifests itself as difficulty believing that there is a problem with EBM, that EBM can go so off the rails when it comes to CAM because it really has no mechanism to take plausibility into account. Indeed, it’s been speculated right here on this very blog that the reason prior plausibility is not built right into EBM, so to speak, is because the founders of EBM suffered from it. They assumed that treatments would not reach the stage of large RCTs if they had not proven themselves plausible first through preclinical evidence in laboratory studies, animal experiments, and studies of pathology and lab tests. Under this view, it simply never occurred to the gods of EBM that something as ridiculous as homeopathy could reach the stage of RCTs because they suffered from plausibility bias that blinded them to the very possibility of that happening!

Whether that’s true or not, I don’t know, but it would explain a lot. Either way, as we have pointed out, SBM tries to restore to EBM what it is missing: A consideration of prior plausibility based on scientific considerations. In practice, this is more useful for eliminating incredibly implausible treatments, such as homeopathy and reiki, than it is for putting hard numbers on prior plausibility for treatments because it is not always necessary to estimate a pre-trial probability of success, except when it is so low that it would take an incredible amount of evidence to overturn existing knowledge, as it would for homeopathy or reiki. Here’s my plausibility bias: For something like homeopathy or reiki, either of which would require the rewriting of huge swaths of science to become plausible, I consider it reasonable to require supporting evidence at least in the same order of magnitude of quantity and quality as the evidence showing that homeopathy or reiki cannot work to make it reasonable to start to think that either could work. Or, to put it much more simply, extraordinary claims require extraordinary evidence.

That’s my plausibility bias. I’m biased in favor of science and reason and against magical thinking like homeopathy and reiki. You should be biased too.

The homeopaths attack

After I had stopped laughing in response to seeing homeopaths lecture scientists on what is and is not scientific, I delved into the paper. Rutten et al try (and fail; after all, they are homeopaths) to establish their scientific bona fides right in the second paragraph:

The authors of the present paper are doctors and scientists with an interest in homeopathy, committed to the scientific method in researching and practising it. We are qualified in medicine and science and started practising these in conventional contexts, gradually becoming convinced that homeopathy is an effective option, supplementary to rather than conflicting with conventional medicine. We concur with Hansen and Kappel that the disagreement concerning the interpretation of reviews of randomised controlled trials (RCTs) is rooted in prior beliefs and their influence on the perception of evidence. We do not concur, however, with their assumption that the homeopathy community’s positive view of the evidence is due to a rejection of the naturalistic scientific outlook. We ourselves, for example, do not reject any part of the naturalistic outlook.

My first temptation was to point out that the very fact that they are homeopaths means that they are either deluding themselves or lying when they claim that they do not reject any part of the naturalistic outlook. Homeopathy, after all, is rooted in the principles of sympathetic magic, not science. For instance, homeopathy’s law of similars (“like cures like”) is uncannily similar to Sir James George Frazer’s Law of Similarity as described in The Golden Bough (1922) as one of the implicit principles of magic. In addition, the concept that water can somehow retain the imprint of substances with which it’s been in contact, which really underlies the belief among homeopaths that remedies diluted to nonexistence (basically anything diluted more than around 12 C—14C or 15C, to be safe) can have biological effects, is very much like the Law of Contagion. Read the following passage from The Golden Bough and tell me that it doesn’t sound almost exactly like homeopathy:

If we analyse the principles of thought on which magic is based, they will probably be found to resolve themselves into two: first, that like produces like, or that an effect resembles its cause; and, second, that things which have once been in contact with each other continue to act on each other at a distance after the physical contact has been severed. The former principle may be called the Law of Similarity, the latter the Law of Contact or Contagion. From the first of these principles, namely the Law of Similarity, the magician infers that he can produce any effect he desires merely by imitating it: from the second he infers that whatever he does to a material object will affect equally the person with whom the object was once in contact, whether it formed part of his body or not. Charms based on the Law of Similarity may be called Homoeopathic or Imitative Magic. Charms based on the Law of Contact or Contagion may be called Contagious Magic.

A later passage by Frazer is an excellent criticism of the two pillars of homeopathy:

Homoeopathic magic is founded on the association of ideas by similarity: contagious magic is founded on the association of ideas by contiguity. Homoeopathic magic commits the mistake of assuming that things which resemble each other are the same: contagious magic commits the mistake of assuming that things which have once been in contact with each other are always in contact. But in practice the two branches are often combined; or, to be more exact, while homoeopathic or imitative magic may be practised by itself, contagious magic will generally be found to involve an application of the homoeopathic or imitative principle.

See what I mean when I say that the ideas behind homeopathy resemble sympathetic magic far more than they resemble science? From my perspective, all homeopaths—and I do mean all homeopaths—hold views that reject science, no matter how much they fool themselves into thinking they are scientific and buy into the naturalistic world view. I could go on to demonstrate how much of homeopathy is rooted in prescientific vitalism, using Samuel Hahnemann’s own words, but I think you get the idea. Homeopathy is magic water made magic using thought processes akin to those used in voodoo when voodoo practitioners make voodoo dolls.

It is also rather interesting how Rutten et al are so willing to accept science when it comes to RCT evidence but reject the much larger and far more robust body of science that underlies the pre-trial assessment of prior probability that says that homeopathy can’t work. They willfully reject the concept that extraordinary claims require extraordinary evidence, and homeopathy is nothing if not a highly extraordinary set of claims. Instead, Rutten et al make an analogy to crossword puzzles. This analogy is actually rather apt, but not in the way our unhappy homeopaths think it is. Basically, here is the analogy as described by Rutten et al:

Sometimes new evidence overturns theory, but sometimes not; the context is crucial. This has been expressed in terms of a crossword analogy (Haack 1998): the correctness of an entry in a crossword depends upon how well it is supported by the clue, whether it fits with intersecting entries, how reasonable those other entries are, and how much of the crossword has been completed. In this analogy, for homeopathy, the primary entry is: “Does it work (other than by placebo effects)?” The secondary intersecting entries are concerned with “How does, or could, it work?”

Although Rutten et al will never admit it, this analogy is an excellent one for why the occasional “positive” clinical trial of homeopathy does not overthrow the existing scientific paradigm that concludes that homeopathy can’t work, that it is nothing but water, and that any apparently positive effects seen are due either to placebo, random chance, or bias and/or shortcomings in the RCTs. Such trials do not fit with “multiple intersecting entries” in physics, chemistry, and biology that are all consistent with the impossibility of homeopathy; i.e., they do not fit into the crossword puzzle. The only way they could be made to fit into the crossword puzzle would be if homeopathy were shown in a reproducible fashion to cure incurable diseases, such as metastatic pancreatic cancer, in which case homeopathy might go into the crossword puzzle and force the puzzle solver to start rethinking other answers to fit with homeopathy.

In other words, clinical evidence could make us question the rest of the “crossword puzzle” but only if it’s clinical evidence that is so extraordinary in result, quality, and quantity that it starts to rival the existing evidence from multiple disciplines that do not support homeopathy. No such evidence exists for homeopathy, and, in fact, the overall weight of the clinical evidence is consistent with homeopathy not working any more effectively than placebo. Indeed, Rutten et al wrongly relegate the question of how homeopathy could work to a secondary question, and here’s why: When, for a therapy to work, the very laws of physics would have to be, as I say so often, not just wrong but spectacularly wrong, the question of how it could work is not secondary. This is in marked contrast to drugs (which inevitably work by either binding to a biological molecule or otherwise reacting somehow), in which case not knowing the exact mechanism is not as concerning. Even a case like the discovery that H. pylori causes duodenal ulcers is not a refutation of this principle with respect to homeopathy. After all, as implausible as the hypothesis that a particular bacterial species was responsible for peptic ulcers in many cases once seemed, it did not require the violation of the laws of physics to imagine that a bacterial infection could somehow cause ulcers.

Rutten et al spend considerable verbiage listing the usual suspects for homeopathy, including old meta-analyses, various clinical trials, and, of course, the infamous basophil degranulation experiments by Jacques Benveniste. These have been fodder many times before on this blog, so I don’t really want to dwell on them other than to note that in particular Rutten et al reserve most of their vitriol for a meta-analysis and systematic review of the literature by Shang et al published several years ago in The Lancet that found that homeopathy’s effects are placebo effects. Basically, Rutten et al rehash Rutten’s criticisms of Shang’s analysis. These are criticisms I dealt with in detail, and four years of aging don’t make them any better. In fact, the apologia based on “clinical evidence” is nothing that we haven’t heard before and nothing worth rehashing here (other than a link to my previous deconstruction) because the point of Rutten et al is to attack what they call “plausibility bias.” All the trotting out of clinical evidence that allegedly supports homeopathy is in reality a massively flawed lead-in, a thin mint wafer to cleanse the palate, so to speak, to the main argument, which is based on how Shang’s meta-analysis and other clinical trials allegedly support homeopathy but are often cited as evidence against homeopathy.

First, Rutten et al distinguish between homeopathic dilutions in which there might still be some of the original remedy left (generally less than 12C or so) and “ultramolecular” dilutions in which none is likely to remain. In reality, any homeopathic dilution that gets higher than 7C (10^-14) is probably in the femtomolar range or lower, and there aren’t very many substances that have significant biological effects at such a low concentration. None of this stops Rutten et al from proclaiming:

There are obvious sources of pre-trial belief. These include well documented paradoxical low-dilution effects. The basic idea of homeopathy is the exploitation of the paradoxical secondary effects of low doses of drugs. Secondly, reverse or paradoxical effects of drugs and toxins in living organisms as a function of dose or time are very widely observed in pharmacology and toxicology. They are variously referred to as hormesis (the stimulatory or beneficial effects of small doses of toxins) hormligosis, Arndt- Schulz effects, rebound effects, dose-dependent reverse effects and paradoxical pharmacology (Calabrese and Blain 2005; Calabrese et al. 2006; Bond 2001; Teixeira 2007, 2011).

Repeat after me: Hormesis does not justify homeopathy. It’s an analogy that homeopaths love because it’s a hypothesis that states that some substances that are toxic at high doses might be benign or even beneficial at lower doses. (Look back to the fun I had with Ann Coulter’s invocation of hormesis to try to convince you that radiation from the Fukushima nuclear reactor is in fact good for you for an explanation.) This is, of course, wishful thinking on the part of homeopaths, representing extreme over-extrapolation. Hormesis might apply to low doses, but much of homeopathy involves no dose; i.e., dilution far, far beyond the point where it is highly unlikely that even a single molecule of the original substance remains. Rutten et al try to dodge this question by claiming that most homeopathic remedies are not “ultramolecular dilutions” (i.e., dilutions far beyond Avogadro’s number that leave nothing behind). Even if that’s true, many homeopathic dilutions are “ultramolecular” dilutions, and homeopathy does postulate that dilution and succussion do increase the potency of homeopathic remedies. Have Rutten et al forgotten the Law of Infinitesimals?

They haven’t, though. After trying to argue that most homeopathic remedies are not “ultramolecular,” Rutten et al then cite a bunch of dubious in vitro studies claiming that ultramolecular dilutions can have biological effects. I’ve looked at many such studies (for instance, this study of homeopathic remedies on human breast cancer cell lines), and quite often what you find is shoddy methodology, effects of solvents and contaminants, and other potential explanations for the observed results that do not involve having to throw out huge swaths of physics and chemistry. Amusingly, Rutten et al even admit that such results have a serious problem:

A more recent meta-analysis evaluated 67 in vitro biological experiments in 75 research publications and found high-potency effects were reported in nearly 75 % of all replicated studies; however, no positive result was stable enough to be reproduced by all investigators (Witt et al. 2007).

Can you say “publication bias”? Sure, I knew you could.

Can you also say: Anecdotal evidence? Sure, I knew you could:

The other major source of our prior beliefs is practice experience. This may be regarded the lowest level of evidence, but it is under-rated by many (Vandenbroucke 2001). After adding homeopathy to conventional treatment, many unsuccessful cases improved (Marian et al. 2008). The repetitive character of such experiences gradually updated our belief, consistent with Bayesian theory (Rutten 2008).

In other words, Rutten et al are admitting that their “positive plausibility bias” towards homeopathy is based on anecdotes. That is, after all, what “practice experience” is: Anecdotes, confirmation bias, and the like. It’s the same reason that Dr. Jay Gordon, for instance, believes that vaccines cause autism when the evidence from large epidemiological studies does not support that belief. He sees what he thinks are cases of “vaccine injury” manifesting themselves as autism and, because he believes that vaccines cause autism, attributes his patients’ autism to vaccines. Rutten et al also cite non-blinded, non-randomized “real world” (pragmatic) trials as contributing to their pre-test plausibility bias towards homeopathy.

Pre-trial belief: Science versus anecdote

We have argued that EBM has a shortcoming, and that shortcoming is that EBM does not adequately consider prior probability in assessing evidence. In EBM, clinical evidence is all, and evidence from RCTs (or even better, meta-analyses or systematic reviews of RCTs) rules the heap. This is not unreasonable when RCTs are only performed for hypotheses that have been developed through a scientific process that takes preclinical observations and builds upon them, such that existing evidence deems them reasonably plausible. CAM in general and homeopathy in particular are not such a case. RCTs of homeopathy in essence measure noise, but only positive noise. Some studies will appear to be positive, and publication bias will make sure that the studies where patients receiving homeopathy do worse are unlikely to be published, so that we see in the literature only negative studies or studies that are apparently positive due to random chance, either alone or combined with poor study design and/or bias. We and others have proposed taking prior probability into consideration, both for deciding what hypotheses to test in clinical trials and how to interpret the results of existing clinical trials.
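The “positive noise” point can be illustrated with a toy simulation in Python. Everything here is a simplifying assumption of mine rather than a model of any real trial literature: each trial of an inert remedy is reduced to a single standardized treatment-versus-placebo difference drawn from a standard normal distribution, and a trial counts as “significant” when that difference exceeds 1.645 in either direction (roughly a one-sided 5% threshold).

```python
import random


def simulate_null_trials(n_trials: int = 10_000, z_threshold: float = 1.645,
                         seed: int = 0) -> tuple[int, int]:
    """Simulate trials of a remedy with no real effect.

    Returns (favorable, unfavorable): counts of trials in which the remedy
    looked significantly better or significantly worse than placebo.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    favorable = unfavorable = 0
    for _ in range(n_trials):
        z = rng.gauss(0.0, 1.0)  # pure noise: the true effect is zero
        if z > z_threshold:
            favorable += 1
        elif z < -z_threshold:
            unfavorable += 1
    return favorable, unfavorable


favorable, unfavorable = simulate_null_trials()
print(f"looked better than placebo: {favorable}")
print(f"looked worse than placebo:  {unfavorable}")
```

In this sketch roughly one trial in twenty comes out “significantly” in favor of water by chance, and about as many come out against it. If the unfavorable ones tend to stay in the file drawer, the published record ends up tilted towards apparent benefit.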

The fact is that we have always taken plausibility into account in deciding which clinical trials to perform. We have to because we don’t have unlimited resources, human subjects, or researchers to test in an RCT every hypothesis that comes along. We just don’t. In fact, our resources are currently more constrained than they have been in at least 20 years, with NIH pay lines hovering around the 7th percentile in some institutes. Moreover, the very foundations of medical ethics as laid down in the Declaration of Helsinki require that human subjects experimentation have a strong background of basic science backing it up. The question is: How do we want to prioritize which trials get done? On what do we base our estimates of prior plausibility that color our decisions regarding which clinical trials to carry out and how to interpret data from existing clinical trials? Homeopaths like Rutten and colleagues would propose that we base our estimate of prior plausibility on anecdote, magical thinking, and dubious in vitro and clinical trial evidence, ignoring the massive, well-established prior implausibility of homeopathy that a rational scientific assessment will arrive at. Scientists base their assessment of prior plausibility on as objective as possible an interpretation of existing scientific data.

I know which one I would choose.

I also have a message for Rutten and his merry band of homeopaths. You accuse us of “plausibility bias” as though that were a bad thing. It’s not. As Mark Crislip pointed out, what plausibility bias should really be called is reality bias. We are biased towards reality. Homeopaths are biased towards what they think is reality but is in actuality magical thinking.

Again, I know which one I choose.

Finally, we don’t have unlimited resources to test every hypothesis that anyone can think up. There isn’t the money. There aren’t enough scientists. Even leaving aside the serious ethical problems that come with testing highly improbable remedies on human subjects, there aren’t enough human subjects to test the promising drugs that have a reasonable probability of working (i.e., of being efficacious and safe) based on preclinical testing. Resource constraints have always existed, and scientists have never just tested whatever the heck they felt like testing. Plausibility has always been a major part of deciding which experiments to do, which promising compounds to take to clinical trials, which treatments to try. Think of it this way: We could estimate plausibility as carefully as we can based on scientific testing, evidence published in the existing scientific literature, and data from small pilot clinical trials. Or, taking the approach of Rutten et al, we can estimate plausibility from anecdotal experience, questionable experiments and clinical trials, and considerations that completely ignore the laws of physics and chemistry.

Again, I know which method I choose.

Posted in: Basic Science, Clinical Trials, Homeopathy, Science and Medicine


30 thoughts on “Plausibility bias? You say that as though that were a bad thing!”

  1. I always appreciate an advertisement for a remedy when it brags that “it’s homeopathic!” That way I can ignore the product without further consideration.


  2. mousethatroared says:

    an epiphany.

    Now that I’ve seen it, I realize there really should be more Daffy Duck references on SBM – more Bugs Bunny, Wile E. Coyote, Marvin the Martian too…

  3. Jan Willem Nienhuys says:

    which would imply only approximately a 5% chance that a given trial’s apparently positive results could be due to random chance alone.

    Hohoho! That’s not at all what p=0.05 means. It means that – assuming only random chance was operating – the obtained result would have had that computed chance of occurring. One can never (well, almost never) establish that ‘random chance alone is operating or not’, let alone find out that this thing (random chance operating) happens 5% of the time.

    In the practice of research producing some kind of ‘final proof’ you would be crazy to start a billion dollar experiment if you had no idea what it would produce. You can fit the p-value veneration into a kind of Bayesian tale: say your test has a 50% chance of producing a positive result if the stuff really works, and say you set the required significance at 5% (i.e. the chance that a dud would produce a seemingly positive result) then the likelihood ratio is 50% divided by 5%, and this is the multiplier for the prior odds giving you the posterior odds, at least when your experiment succeeds.

    So if you are pretty sure that your expensive test is going to work (prior odds 4 i.e. 80% chance that “it” works and 20% that it doesn’t) then the posterior odds will be 40 (i.e. only about 2.5% chance that it’s a dud). But if the prior odds are just 0 (like homeopathy) then the posterior odds will be 0 too. Of course homeopaths don’t think that way. They have already so much experience in treating patients (Rutten ‘curing’ exhibitionists with homeopathy for example) that they put the prior odds quite high.

    But the above is dubious analysis, because gut feelings are not probabilities.

  4. rbnigh says:

    Thank you for this very clear exposition of your argument. I am fascinated by the homeopathy debate because it reveals so much about how we, as a society, construct reality and the role of science and scientists in that process.

    I am not interested in defending homeopathy or even SBM for that matter. I focus on studying how people decide their therapeutic trajectories. It is difficult to judge whether a given procedure ends up reducing human suffering or not.

    But the very clarity of your exposition emphasizes the limits of scientific rationality. Plausibility bias refers to that characteristic of humans to be, as Nassim Taleb calls us, machines for looking backwards. We are lousy predictors of the future, he said, because we project linearly from our past experience and expect things to continue as they have. This seems to be a reasonable assumption most of the time. Yet it effectively blinds us to the ‘black swans’, the impact of the highly improbable. If you think about history, your own experience, and scientific discovery, that impact of the unexpected and improbable is decisive.

    The problems of medical research that have stumped us are probably still problems precisely because they don’t fit in with our plausibility bias. The answers are elsewhere, where we are not looking, where our biases blind us.

    It is appropriate that you marshal the arguments of Frazer. This 19th century thinker is the clearest example possible of a Eurocentric gaze on other cultures and, indeed, most of human history. This leads him to disqualify practically everybody as ‘primitive’ and ‘superstitious’, leaving European culture (and of course “Science”) as the culmination of human achievement. Don’t you think that maybe some of the thousands of years of human experience in other cultures could have some bit of flotsam or jetsam that just might have some value, that might help us see some of our recalcitrant problems in a new light?

    I am reminded of the arguments of Feyerabend, who said that as soon as a science develops a rigid methodology it ceases to learn anything new. This is the limitation of the ‘plausibility bias’. Perhaps there is no ‘reasonable’ way to proceed except by prior probabilities; if so, this is a limitation of rationality itself. An open mind would at least acknowledge that not everything we seek is to be found in our current model of reality, as ‘scientific’ as it may appear. We need to explore the implausible and improbable.

  5. Jan Willem Nienhuys says:

    I would like to add another point I probably made before. Classical homeopaths base their treatment on so-called drug pictures. These are symptom lists associated with the drug. Their favorite example is sleeplessness and coffee. But then they are cheating the public. Most of these ‘symptoms’ that presumably have occurred after healthy provers took the stuff were obtained by giving them 30C diluted stuff in the first place. There are about 1000 (maybe 3000) homeopathic substances, among them salt, chalk, sand and charcoal (not to mention north pole magnetism), and each of them has a list of about 1000 ‘symptoms’, almost all obtained by letting people take a 30C solution, and letting them note down anything they felt. That’s one million ‘proofs’ that show that highly diluted stuff does something.

    Rather than doing an RCT getting p=0.02 by giving pollen 30C to pollinosis sufferers, the homeopaths should start to validate their million combinations of a symptom and a 30C whatever. Such research is very cheap, and skeptics would be eager to help out and even award prizes for success (and share the Nobel prize of course). But guess what? If one proposes such a test, the homeopaths quickly back off. There is only one conclusion possible: they actually don’t believe all this nonsense themselves.

    All physics students repeat experiments like determining the acceleration of falling objects. Freshman mathematics students prove over and over again basic properties of numbers and functions. Why can’t those homeopaths just once in a while do a decent RCT with their basics? But all they usually do, in my experience, is propose an unblinded N=1 test without control, i.e. they say: why don’t you try it out on yourself?

  6. qetzal says:


    Well, I certainly think these SCAM artists are disthpthpicable!

  7. David Gorski says:

    Hohoho! That’s not at all what p=0,05 means. It means that – assuming only random chance was operating – the obtained result would have had that computed chance of occurring. One can never (well, almost never) establish that ‘random chance alone is operating or not’, let alone find out that this thing (random chance operating) happens 5% of the time.

    Yes, those are the assumptions. Search for “Ioannidis” on this blog, and you’ll see that I and other SBM bloggers have extensively discussed the issues that lead the false positive rate of clinical trials to be considerably higher than 5%. Remember, I’m writing for a general audience, not statisticians or clinical trialists, and I want a general audience to understand that even under ideal circumstances, by their very design, at least 5% of clinical trials will be false positives – actually at least 5% of all experiments. The real world number, as you point out, will be higher.

    We’ve also written extensively about Bayesian statistics. I didn’t see the need to rehash all that in a post that is already 4,000+ words; so I boiled the issue down to what I consider its essence: The lower the pre-test probability, the higher the chance a given “positive” result is actually a false positive result, and when the pre-test probability approaches zero (as for homeopathy) the probability that a given “positive” result is a false positive result becomes very high.
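To put rough numbers on that last point, here is a minimal sketch using Bayes' theorem. The function name `false_positive_share`, the 80% power figure, and the list of pre-test probabilities are all illustrative assumptions, not values taken from the post or from any trial.

```python
def false_positive_share(prior, alpha=0.05, power=0.8):
    """Fraction of 'positive' trials that are false positives, by Bayes' theorem.

    prior: pre-test probability that the treatment really works (assumed input)
    alpha: chance of a positive result when the treatment does not work
    power: chance of a positive result when it does work (0.8 is an assumption)
    """
    true_pos = power * prior            # treatment works and the trial is positive
    false_pos = alpha * (1.0 - prior)   # treatment does nothing but the trial is positive
    return false_pos / (true_pos + false_pos)


# Illustrative pre-test probabilities, from a coin flip down to the very implausible.
for prior in (0.5, 0.1, 0.01, 0.0001):
    share = false_positive_share(prior)
    print(f"pre-test probability {prior}: {share:.1%} of positives are false")
```

Under these assumed inputs the false-positive share climbs steadily as the pre-test probability falls, which is the qualitative point being made. The sketch ignores bias and other real-world problems, so it is if anything optimistic.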

  8. nobeardpete says:


    There have been several “black swans” in the history of science, which reshaped our understanding of basic organizing principles of reality. The theory of relativity is a reasonable example. Because this theory was such a dramatic break from previous ideas, its proponents understood that they’d need to present incontrovertible proof – dramatic evidence that could be reproducibly verified by skeptics, and which would be completely at odds with the results predicted under previously accepted theories. Einstein proposed the precession of Mercury’s orbit, gravitational lensing of light as might be observed during a solar eclipse, and gravitational redshift of light as experiments that would meet these criteria, and would either prove or disprove his theory. Many physicists were reluctant to accept the theory even with this evidence, but it did eventually win out.

    The situation with homeopathy is nothing like this. The evidence offered has not been dramatic; it has actually been quite underwhelming. It certainly has not been reproducible by skeptics. It hasn’t even been reproducible by proponents. And those non-reproducible results of homeopathy trials have generally been completely consistent with the results that would be predicted by existing theory – namely that with enough small, poorly done trials, statistical noise will result in some weak positive results, which will be found primarily in smaller and/or poorly done studies, and not in the larger or methodologically better ones. No homeopaths have, to my knowledge, proposed straightforward experiments that get to the heart of demonstrating that homeopathic principles have any predictive power, and certainly no homeopath of note has laid out a clear list of experiments that will either show homeopathy to be true or false. It seems that no amount of negative evidence will convince homeopathic proponents that homeopathy is false.

    Yes, there have been “black swans” in the history of science. No, this is not one of them.

  9. craig davis says:

    I’m not sure ‘bias’ is the right word. It is not bias to start from existing strongly supported scientific knowledge. Yes, we always need to consider that existing knowledge may be wrong or incomplete. But to falsify knowledge built on a very large amount of scientific theory and measurement we need a lot more than one or a few p=0.05 studies.

  10. cervantes says:

    Dr. G, I would urge you to continue to say the name of Bayes when you make these exegeses on scientific inference. I don’t think it will scare off the innumerate masses — the idea only seems counterintuitive and difficult if it is presented in a confusing way. Probabilistic reasoning underlies most of science based medicine, and although investigators most often apply Gaussian statistics, it is always Bayesian logic that should guide us — as in this case, where you use it qualitatively to warn against misuse of the p value. Ioannidis, for some reason, in his famous critique, avoids naming Bayes as well, but he is really applying Bayes theorem. I think it is helpful to have a succinct label for the essential idea that the advance of knowledge depends critically on interpreting new evidence in light of what we already know.

  11. mousethatroared says:

    @qetzal, hehe!

    I’m trying to remember if it was Michael Shermer or Marvin the Martian who said “There is a growing tendency to think of man as a rational thinking being, which is absurd.” :)

  12. Jan Willem Nienhuys says:

    even under ideal circumstances, by their very design, at least 5% of clinical trials will be false positives

    Nonono! the above statement only holds when you know in advance and for certain that you are comparing two identical things. But if you know already that you are comparing things that are exactly the same, there is no need for testing. Or is there?

    And if you are talking clinical trials, I hope most of them are done on treatments that offer some hope of being useful. You yourself mentioned:

    plausible first through preclinical evidence in laboratory studies, animal experiments, and studies of pathology and lab tests.

    The only kind of trials where you’d expect 5% false positives are placebo controlled homeopathy trials.

    Unfortunately the circumstances are not ideal, because homeopaths make all kinds of horrible mistakes, such as faulty blinding and determining what exactly the endpoint is supposed to be after the data are collected and studied extensively. Or stopping an experiment when ‘significance is reached’. They not only get much more than 5% false positives, but truly incredible results. Look at the result by Friese and Zabalotnyi, who got a result with a p-value of 2.47 x 10 to the power -29 (one-tailed), my computation. Why don’t they claim the Nobel prize?

  13. Jan Willem Nienhuys says:

    That Friese & Zabalotnyi link didn’t work out. Here it is again (I hope):

    and the full ref is:
    Friese, K.-H., Zabalotnyi, D.I. Homöopathie bei akuter Rhinosinusitis, Eine doppelblinde, placebokontrollierte Studie belegt die Wirksamkeit und Verträglichkeit eines homöopathischen Kombinationsarzneimittels, HNO 2007; 55(4):271-277

  14. PharmDee says:

    This is a very timely post for me to read, as I am tangled up in acupuncture arguments with some fellow skeptics (well, skeptics in many ways at least :D)

    It gets slightly trickier with some folks to bring this up with, say, acupuncture. Because somehow, in their minds, because you are doing something physical with acupuncture, it is much more plausible to have a therapeutic effect…”penetrating tissue, nerves, and muscle”. My one particular opponent now has gone so far as to be “very very skeptical” of the sham studies (showing sham = acupuncture) because superficially poking the skin also is doing “something”.

    When I first learned of acupuncture, I thought “sounds like a load of crap non-sense, makes no sense”….I said this because I had a hard time believing that randomly sticking people with needles would fix lower back pain. I did not totally discount it, but I was so very skeptical.

    Now, my opponents seem to be much more generous in their initial thoughts about acupuncture than I was….and I would say that this more generous initial assessment has biased their conclusions. After this they see the positive studies, it confirms their initial suspicions, and they need not ask why they did not study it vs. a good sham arm…etc…they see this “pile of evidence” (never mind its quality or the fact that there are better designed sham studies that show it does not work) as enough to at least not dismiss acupuncture as bullshit.

    And that is where I have reached a wall….I think I can only get so far with these folks….many of them will state that perhaps there is good reason for my skepticism, but I am charged with being too quick to dismiss….they feel that in their minds these many positive studies and anecdotes are apparently eternally enough to charge anyone that makes the clear assessment that acupuncture is CRAP as “too quick to dismiss”.

    It would seem very likely that the way they come to this conclusion, and the way that I have come to mine, is painted in large part by our initial read of prior probability….

    It is pretty frustrating….you have given me a fresh new angle to explore with my peers…

  15. Jan Willem Nienhuys says:


    Some of my best friends are Bayesians, but there is a serious flaw in Bayesian thinking. Bayes is OK when all the terms involved are probabilities. And in this case a probability is something that can be measured, at least in principle, by repeating a basic experiment and counting. (That’s why the phrase ‘probability that something is due to chance’ is nonsense: it is impossible to establish that some event that has happened is due to chance.) In many practical cases probabilities follow from symmetry considerations. For example all the faces of a symmetric homogeneous cube have equal chance of being on top after you throw them. Similarly for a symmetric coin.

    Bayesians treat ‘plausibility’ as if it were probability, but the theorems and assumptions of probability theory don’t apply to plausibility.

    I know some Bayesians will think that I am a perfidious frequentist, but it is the Bayesians that have to prove that their calculations work also with plausibilities.

  16. daedalus2u says:

    I think that bias is not quite the right term, I think that unfortunate framing is a better term.

    Homeopathy is nonsense, but it follows the usual human default framing caused by human hyperactive agency detection. If I do X and Y happened, then X caused Y. Humans also reason the other way, if Y happens, then there must have been an X that caused Y to happen. Looking for and then imputing agency to explain why things happen is what human hyperactive agency detection pulls for.

    I think that questions are always framed in ways that an answer is expected to be capable of being understood. This is what Kuhn meant by paradigms. When the answer is not understandable in the current paradigm, the old paradigm needs to be abandoned and the new answer expressed in terms of the new paradigm because it can’t be expressed in terms of the old paradigm.

    People had to abandon the paradigm of absolute time and space in order to understand Relativity.

    The problem the homeopaths have is that they are clinging to magical thinking and are unable to think in terms of modern physics, conservation of mass/energy, charge, spin, etc. and no action at a distance.

  17. DugganSC says:

    Seems to me that there are two major factors going on in regards to plausibility. There’s cases of people with science that doesn’t match what we know works and there’s approaches which show a lack of success. There are approaches which don’t match our science but work, where we just need to discover why it works, the classic case being the various herbal remedies that medical companies liberate from primitive tribes. There are cases where the theory is sound but the execution keeps failing, c.f. any number of inventions which got released too early and weren’t successful until refined.

    Something which falls in only one of the two categories is worthy of exploring to try to elicit the nugget of a result. Homeopathy, however, falls under both. The science is bad and the results are bad.

  18. cervantes says:

    JWN: The logic of plausibility is analogous to the Bayesian logic of probability, even if we do not have a way of plugging numbers into Bayes’ theorem per se; i.e., we are thinking about the plausibility of A given B. This is a complicated subject of course. Many people like the exegesis by E.T. Jaynes, which I am sure you are familiar with.

  19. David Weinberg says:

    These different interpretations about the meaning of a P value are largely contextual, depending on whether we are talking about the specificity of the study design, or positive predictive value of a particular study result.

    With a critical P value of .05; if the null hypothesis is true, we can expect 5% of all fair studies to be positive (reject the null hypothesis), and by definition these will all be false positives. The other 95% will be true negatives.

    Now looking in terms of positive predictive value: If it is certain that the null hypothesis is true, then 100% of positive studies will be false positives (have zero positive predictive value). We can never say with certainty that the null hypothesis is true, but given a very implausible treatment, we can say that the null hypothesis is more likely true than not true, and that the positive predictive value is greater than zero, but still very low. This was the approach to be learned from Ioannidis’ paper.
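The "5% of all fair studies" figure can be checked with a small simulation. This is only a sketch under stated assumptions: both arms are drawn from the same normal distribution (so the null hypothesis is true by construction), the variance is treated as known, and the arm size, number of trials, and seed are arbitrary choices made for illustration.

```python
import random
from statistics import NormalDist, mean


def null_trial_p_value(rng, n_per_arm=50):
    """Simulate one two-arm trial in which the treatment does nothing."""
    # Both arms come from the same distribution, so any difference is pure noise.
    treated = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    # z-test with known unit variance: the difference in means has SE sqrt(2/n).
    z = (mean(treated) - mean(control)) / (2.0 / n_per_arm) ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))  # two-sided p-value


def false_positive_rate(n_trials=2000, alpha=0.05, seed=1):
    """Share of simulated null trials that come out 'positive' at the given alpha."""
    rng = random.Random(seed)
    hits = sum(null_trial_p_value(rng) < alpha for _ in range(n_trials))
    return hits / n_trials


print(false_positive_rate())
```

The printed share should land close to 0.05. Every one of those positives is a false positive, because nothing in the simulation has any effect. Note that this says nothing about the positive predictive value of any single positive study, which also depends on the prior.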

  20. phayes says:

    Yes – thanks cervantes – please everyone do read Jaynes. Free yourselves from the pathological and anachronistic Frequentist/Bayesian bickering about the meaning of probability, ‘subjectivity’ versus ‘objectivity’, and whether plausibility and frequency are probabilities or not etc. Unless you’re doing quantum foundations there’s really nothing of any importance to argue about anymore. Either you’re doing Jaynesian inference or you’re doing it wrong. :)

  21. jt512 says:

     Specifically, because clinical trials have unavoidable shortcomings and biases, a p=0.05, which would imply only approximately a 5% chance that a given trial’s apparently positive results could be due to random chance alone. As John Ioannidis has taught us, in clinical trials as practiced in the real world, the chance is much higher that any given positive trial is a false positive.

    You’re misinterpreting the p-value in your second sentence above. The p-value, roughly speaking, is the probability of the data given that the null hypothesis is true (as you’ve correctly implied in your first sentence above). But then you go on to state that the chance is “much higher [than .05] that any given positive trial is a false positive.” This is, roughly speaking, the probability that the null hypothesis is true given the data. But the p-value does not imply this in the first place (you’ve transposed the conditional); it wouldn’t be true even in the absence of shortcomings and biases in clinical trials.

  22. cervantes says:

    Well JT, yeah but that’s a bit of a quibble. It is approximately true that in a set of true random trials — without the various biases that pertain in the real world, and no information about prior probability, in other words just a casting of lots — in which there was no association between two conditions, you’d get a p < .05 about 5% of the time. The straightforward correction for multiple hypothesis testing is just to multiply the p value by number of trials.
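The correction described in that last sentence is usually called the Bonferroni correction. A minimal sketch follows; the p-values are made up for illustration, and the cap at 1 is the conventional way to keep the result a valid probability.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


raw = [0.01, 0.04, 0.20, 0.60]  # made-up p-values from four hypothetical tests
for p, adj in zip(raw, bonferroni(raw)):
    verdict = "significant" if adj < 0.05 else "not significant"
    print(f"raw p = {p:.2f}, adjusted p = {adj:.2f}: {verdict}")
```

With four tests, a raw p of 0.04 no longer clears the 0.05 threshold after adjustment, while 0.01 still does.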

  23. jt512 says:

    Well JT, yeah but that’s a bit of a quibble.

    No, it is not a “quibble.” Dr. Gorski transposed the conditional, a major mistake in interpreting p-values. In the second sentence of his that I quoted he implied that the probability that the null hypothesis is true given that the p-value is .05 should be .05, in the absence of systematic error. That is false. The probability that the null hypothesis is true, given the p-value, depends on the prior probability of the null hypothesis and the power of the study.

    It is approximately true that in a set of true random trials — without the various biases that pertain in the real world, and no information about prior probability, in other words just a casting of lots — in which there was no association between two conditions, you’d get a p < .05 about 5% of the time.

    If the null hypothesis is true (“no association between conditions”) and there is no systematic error, then the probability of a study having a p-value less than .05 is .05. That statement is true whether you have a prior probability in mind or not; the p-value is not a function of the prior probability.

    The straightforward correction for multiple hypothesis testing is just to multiply the p value by number of trials.

    That statement has nothing to do with anything I wrote.

  24. David Weinberg says:


    The probability that the null hypothesis is true, given the p-value, depends on the prior probability of the null hypothesis and the power of the study

    That statement is simply false. The null hypothesis is either true, or untrue. The best estimate is the prior probability.

    If you had stated that the probability of rejecting the null hypothesis, given the P value, depends on the prior probability of the alternative hypothesis and the power of the study

  25. trrll says:

    “Bayesians treat ‘plausibility’ as if it were probability, but the theorems and assumptions of probability theory don’t apply to plausibility.”

    Plausibility arises from probability. Something is seen as having low plausibility if it can be true only if numerous other statements that have been established to a very high degree of certainty (i.e. extraordinarily low probability of being false) are, in fact, false. For example, homeopathy can work only if we discard the fundamental principles of thermodynamics and molecular dynamics, which have been confirmed by literally thousands of experimental results, many of which have a very high degree of statistical significance.

  26. craig davis says:

    “There’s cases of people with science that doesn’t match what we know works and there’s approaches which show a lack of success. There are approaches which don’t match our science but work, where we just need to discover why it works, the classic case being the various herbal remedies that medical companies liberate from primitive tribes. ”

    I’d take that one step further. We should also consider how much evidence exists for the science which is being contradicted.

    In the example of a herbal remedy, there might be little or no evidence that a particular molecule it contains is not biologically active. In this case a couple of studies might suggest that the remedy has an effect and that we should explore the pathways by which it works to learn new science. Great.

    For homeopathy, on the other hand, for it to work would overturn deep science such as the notion that water molecules’ interactions with biological systems do not change based on the previous history of the container in which those water molecules are mixed. (And no, non-local quantum effects won’t do it – quantum physics is well understood and totally incompatible with homeopathy). Therefore we should treat any study which purports to show homeopathic effects with deep, deep suspicion.

  27. jt512 says:

    David Weinberg wrote:

    jt512 wrote:

    The probability that the null hypothesis is true, given the p-value, depends on the prior probability of the null hypothesis and the power of the study.

    That statement is simply false. The null hypothesis is either true, or untrue. The best estimate is the prior probability.

    It is true that the null hypothesis is either true or it is false. However, we usually don’t know with certainty whether it is true or false. Therefore, the best we can do is assign a probability that it is true. Prior to conducting the study, that probability is the prior probability. After conducting the study, that probability is the posterior probability; that is, the probability that the null hypothesis is true, given the data, or equivalently, given the p-value.

  28. pmoran says:

    rbnigh: The problems of medical research that have stumped us are probably still problems precisely because they don’t fit in with our plausibility bias. The answers are elsewhere, where we are not looking, where our biases blind us.

    Possible, I suppose, in the fragile sense that all scientific knowledge is provisional to some degree — but not very likely.

    Our tough problems may not be solved until we get help from the same sources that have led to past medical advances. They include serendipity (e.g. penicillin, Viagra), advances in technology (the stethoscope, fibreoptic endoscopy enabling the discovery of Helicobacter in ulcers), new findings within the basic medical sciences (innumerable) and trial and error (herbs mainly).

  29. evilrobotxoxo says:

    I have a different take on statistics than you guys do, I think, in the context of general medical research. The problem is that a binary null-hypothesis vs. alternative hypothesis model is not appropriate when any actual treatment (i.e. excluding things like homeopathy) will have a measurable effect if the sample size is large enough. If you do a trial of tylenol for suicide prevention, you will eventually find that suicide rates are X% in the placebo group and X-0.000001% (or maybe X+0.000001%) in the tylenol group. You will eventually get a P value below 0.05 if your sample size is large enough (possibly larger than the population of the earth). A more important question is what is the effect size in a representative clinical population, and the current paradigm of RCTs doesn’t do a very good job of telling us this. I recognize this is a separate question from what is being discussed, but I think it’s ultimately a more important issue.
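The sample-size point can be illustrated with a two-proportion z-test. This is a sketch only: the event rates (1.00% versus 0.99%) are hypothetical, and the function treats the observed rates as exactly equal to the true rates rather than simulating sampling noise.

```python
from statistics import NormalDist


def two_proportion_p_value(p_control, p_treated, n_per_arm):
    """Two-sided z-test p-value, assuming observed rates equal the true rates."""
    pooled = (p_control + p_treated) / 2.0
    se = (2.0 * pooled * (1.0 - pooled) / n_per_arm) ** 0.5
    z = abs(p_control - p_treated) / se
    return 2.0 * (1.0 - NormalDist().cdf(z))


# Hypothetical rates differing by 0.01 percentage points, at growing sample sizes.
for n in (10_000, 1_000_000, 100_000_000):
    p = two_proportion_p_value(0.0100, 0.0099, n)
    print(f"n per arm = {n:>11,}: p = {p:.3g}")
```

The same clinically trivial difference goes from nowhere near significance to far below 0.05 purely because the sample grows, which is why the effect size matters separately from the p-value.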

Comments are closed.