After the previous posting on the Bayesian approach to clinical trial data, several new comments made it clear to me that more needed to be said. This posting addresses those comments and adds a few more observations regarding the unfortunate consequences of EBM’s neglect of prior probability as it applies to “complementary and alternative medicine” (“CAM”).†
The “Galileo Gambit” and the Statistics Gambit
Reader durvit wrote:
A very interesting example, for a number of people, might be estimating the prior probability for Marshall and Warren’s early work on Helicobacter pylori and its impact on gastroduodenal management. I frequently have Marshall quoted to me as a variation on the Galileo gambit, so establishing whether he and Warren would have been helped or hindered by Bayesian techniques would be useful.
This suggestion raises a couple of issues. First, the “Galileo gambit” regarding Marshall and Warren’s discovery is a straw man (as durvit seems to have surmised).
The second issue is one that I might have overlooked if durvit hadn’t called attention to it: not all experiments require statistical analyses. Occasionally there are results that are so clear-cut, and that fit with prior knowledge so well, that a few independent replications make a convincing case for the hypothesis. Although it may come as a surprise to some, that was the case for H. pylori and peptic ulcer disease.
In extreme summary, for M and W there were three, or maybe four, steps: 1. finding “unidentified curved bacilli” in enough specimens of diseased gastric epithelium (at first gastritis, later ulcer) to become convinced that the bacteria were real and possibly pathogenic, and thus to pursue them (reported in 1983); 2. culturing and thereby isolating and characterizing the new bacterium (a difficult job because of its fastidiousness, but with a little luck it finally worked; reported in 1984); 3. replications of their bacteriologic findings by huge numbers of researchers all over the world within a couple of years of their report of the first successful culture, and the common finding that the organism was present in diseased tissues, thus lending credence to M and W’s early suspicion of pathogenicity; 4. therapeutic trials of eradicating H. pylori to observe its effect on peptic ulcer disease.
For the first 3 steps, either the finding (a new species of bacteria) was there or it wasn’t. Growing it in culture was a challenge, but that consisted of recognizing a few new wrinkles in standard bacteriology: what media it liked, what other environmental requirements it had (5% oxygen), how long it took to grow (a long time, which is why M and W almost gave up). No need for P values or prior probabilities there. Step 4, therapeutic trials, might have needed a little statistical help except that the results were so dramatic and the context so ready: eradicating H. pylori with antibiotics reduced the recurrence rates of healed peptic ulcers from 50-90% to about 10%, a result that was consistently found in several independent trials. A slam-dunk: no statistical analyses necessary.
By “context,” I mean that by the time of the therapeutic trials, most people in the field were ready to accept the pathogenic role of H. pylori because of its predictable association with diseased tissue. The H. pylori hypothesis, as reader daedalus suspected, became plausible as soon as the organism was recovered and cultured. Ulcers are inflammatory processes, and inflammatory processes are commonly caused by bacteria. The “acid environment” of the stomach is child’s play for the adaptive capacities of prokaryotes, as every self-respecting microbiologist (if not every MD) by that time knew. The only legitimate skepticism, before M and W cultured the organism, would have been based on previous, failed attempts at culturing it. But those attempts had been made in the early 1950s, prior to a more sophisticated appreciation of the wide diversity of prokaryotic morphologies, staining requirements for microscopy, and “lifestyles.” Mycoplasma, for example, a common cause of atypical pneumonia, was thought to be a virus until the early 1960s.
The Marshall and Warren story is thus one in which the choice of statistical schools was beside the point.
Bayesian Analysis and Sequential Studies
Reader Mark raised an important point with this statement:
Suppose, hypothetically, that some new altmed treatment came out that sounded utterly ludicrous, but then was found to be effective in study after study, and replicated under controlled conditions by skeptical researchers. You would simply look at that research and say “well, I estimate the prior P of this treatment working as epsilon, since it sounds so stupid, therefore these studies don’t prove anything.”
I hadn’t discussed this previously because I was emphasizing the superiority of the Bayesian approach to individual studies, but Bayesian reasoning would not support such a conclusion. In any series of studies consistently finding that the novel hypothesis explains the data better than the null hypothesis does, the probability of the former must increase sequentially because of those studies. According to O’Hagan and Luce:
Today’s posterior is tomorrow’s prior.
The paradigm is about learning, and we can always learn more. When we acquire more data, Bayes’ theorem tells us how to update our knowledge to synthesize the new data. The old posterior contains all that we know before seeing the new data, and so becomes the new prior distribution. Bayes’ theorem synthesizes this with the new data to give the new posterior. And on it goes…Bayesian methods are ideal for sequential trials!
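To make the updating rule concrete, here is a minimal sketch in Python of "today's posterior is tomorrow's prior," written in the odds form of Bayes' theorem. The numbers are hypothetical and chosen only for illustration: a one-in-a-million prior (Mark's "epsilon") and ten studies whose data each favor the hypothesis over the null by a factor of 10. The function names are my own, and the sketch assumes that the studies are independent and that their Bayes factors can simply be multiplied in turn.

```python
def update(prior, bayes_factor):
    """Return the posterior probability of a hypothesis after one study.

    prior: probability of the hypothesis before the study (0 < prior < 1)
    bayes_factor: P(data | hypothesis) / P(data | null) for that study
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)


def sequential_posteriors(prior, bayes_factors):
    """Feed each posterior back in as the next prior; return every posterior."""
    history = []
    p = prior
    for bf in bayes_factors:
        p = update(p, bf)
        history.append(p)
    return history


if __name__ == "__main__":
    # Hypothetical numbers: a one-in-a-million prior and ten studies,
    # each favoring the hypothesis over the null by a factor of 10.
    for i, p in enumerate(sequential_posteriors(1e-6, [10.0] * 10), start=1):
        print(f"after study {i:2d}: P(hypothesis) = {p:.6f}")
```

Under these assumed numbers the probability passes one half after the sixth study and keeps climbing. A tiny prior slows the ascent, but it cannot stop it as long as the data keep favoring the hypothesis.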
Notice that this also deals with hypotheses whose prior probabilities are initially deemed higher than the eventual reality, and thus shows how a series of trials will inevitably result in a convergence of disparate, initial probability estimates to some single distribution that everyone, from initial skeptic to initial enthusiast, must agree upon (please don’t challenge me on this point; I’m aware that it assumes that all observers will remain true to the mathematics, will agree that the data are credible, and maybe some other things that I’m not thinking of. I’ll say no more of that).
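The convergence just described can be sketched with a standard conjugate model, again in Python. This is an illustration under stated assumptions, not a model of any real trial: the quantity of interest is a response rate, a skeptic and an enthusiast start from Beta priors with means of 0.05 and 0.95, and both see the same invented data (five trials of 100 patients with 30 responders each). The function names are hypothetical.

```python
def beta_update(alpha, beta, successes, failures):
    """Conjugate update of a Beta(alpha, beta) prior with binomial data."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def run_trials(prior, trials):
    """Apply each trial in turn, reusing each posterior as the next prior.

    Returns the posterior mean after every trial.
    """
    alpha, beta = prior
    means = []
    for successes, failures in trials:
        alpha, beta = beta_update(alpha, beta, successes, failures)
        means.append(beta_mean(alpha, beta))
    return means


if __name__ == "__main__":
    # Hypothetical data: five trials of 100 patients, 30 responders each.
    trials = [(30, 70)] * 5
    skeptic = run_trials((1.0, 19.0), trials)     # prior mean 0.05
    enthusiast = run_trials((19.0, 1.0), trials)  # prior mean 0.95
    for i, (s, e) in enumerate(zip(skeptic, enthusiast), start=1):
        print(f"after trial {i}: skeptic {s:.3f}, enthusiast {e:.3f}, gap {e - s:.3f}")
```

With these made-up inputs, the gap between the two posterior means shrinks after every trial, and both estimates approach the observed response rate of 0.30, whichever side the observer started on.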
We Frequently (and inescapably) rely on Opinions but we Pretend Otherwise
A reiteration: clinical research is inductive and Bayes’ Theorem (simply proved, by the way) is the way to figure out how new data alter the probability of a hypothesis being true. There is no escaping it. Opinion already figures heavily in current clinical research and practice, but P-values and confidence intervals encourage researchers to pretend that it does not. One of the benefits of Bayesian statistics is to force opinion out into the open where it belongs. Prior probabilities, if stated, must be formally estimated (usually in the form of a distribution) and therefore justified. Readers may or may not agree, and the inevitable debates contribute to the progress of the field and let onlookers know where the debaters stand.
That happens even now, but it’s done largely post hoc, is opaque rather than transparent, and lacks both quantitation and a rational link to the data. Dr. Goodman gives examples in his “P-value” article. Covert opinion also figures in medical practice, obviously, but without a Bayesian perspective it is likely to be tentative and incomplete. Regarding the option to use highly implausible treatments themselves, for example, most physicians and patients seem to ignore “positive evidence” that must seem a bit wacky to them.
That’s a reasonable, if underdeveloped and non-quantitative estimate of prior probability that has stood modern physicians and patients in good stead. Many of those same people, however, when asked to weigh in on “Complementary and Alternative Medicine” (“CAM”) policy issues—e.g., research projects and funding, medical school curricula, or state endorsements of “CAM” practices—will bend to the weight of the “evidence,” EBM-style, not realizing that their personal misgivings are well-founded. That can lead to unfortunate consequences.
Failing to consider Prior Probability leads to Unethical Human Studies
An example, I suspect, was the regrettable decision of two academic pediatricians, one of whom is a nationally-recognized expert in pediatric infectious disease, to become co-investigators in an uncontrolled trial of homeopathic “remedies” for acute otitis media (AOM) in 24 children, ages 8 mos-77 mos. The treating investigators were homeopaths. The report provided evidence that 16 of the children had persistent or recurrent symptoms, lasting from several days to 4 weeks after diagnosis. Nevertheless, no child was prescribed analgesics or anti-pyretics, and only one child was eventually prescribed an antibiotic by an investigator (another in an emergency room). There is no evidence that the investigators evaluated any of the subjects for complications of AOM, nor did the two academic pediatricians “interact with any research subjects.” Similar examples are not hard to find. [1,2]
Funny thing about EBM’s tenacious hold on medical academics: a few years ago, when I first noticed the report just described, I ran it by a friend who is the chief of pediatrics at a Boston area hospital and a well-known academic pediatrician in his own right. After I explained the tenets of homeopathy, he agreed that it is highly implausible. At that point I expected him to agree that the trial had been unethical. Instead he queried, “but isn’t its being unproven just the reason that they should be allowed to study it?” There was no convincing him.
Prior Probability, Misleading Language, and Mr. Magoo
Even many researchers who should know better do not. I am happy that statistician R. Barker Bausell, who worked on “CAM” projects for several years, seems finally to have realized that they are bogus. But what took him so long? Bausell doesn’t mention Bayes or prior probability in his book, although he seems to understand the P-value fallacy (pp. 172-4). In over 300 pages he limits his discussion of the “theory” of several “CAM” claims to a couple of paragraphs each, and in most cases he misses the point. Or perhaps his explanations merely conform to curious, recent anti-scientific admonitions to be “respectful” or “balanced” when discussing “CAM.” This quotation is from Bausell’s two paragraphs on how homeopathy is “hypothesized to work” (p. 261):
There is no known physiological principle that can explain why “similars” or “like cures like” should work…[and] nothing else in nature that we know of becomes more potent as it is diluted. So the mechanism of action for this therapy is simply beyond the province of science…This is not to say, however, that the evaluation of these drugs’ clinical effectiveness is outside the ken of science, since also by definition they are especially suitable for placebo-controlled trials.
In their defense, homeopaths place much more emphasis upon developing a positive, caring relationship with their patients than do most conventional practitioners, and they probably spend more time with them as well, trying to understand both the patient and the genesis of his or her illness. [etc.]
Coming from an active (or formerly active) “CAM” researcher, that is about as good as it gets. Nevertheless, I hope that readers who followed the entirety of the “Homeopathy and EBM” blog will have recognized the misleading language and erroneous content of those statements, however unintended on Bausell’s part. There is a big difference between the narrow truth of “no physiological principle that can explain why ‘similars’…should work” and the whole truth of the matter, to wit: the basis for Hahnemann’s claim of “similars” is categorically wrong. Similarly, “nothing else in nature that we know of becomes more potent as it is diluted” is understated in the extreme: to be true the claim would likely violate the 1st (energy is conserved), and certainly violate the 2nd (entropy increases) laws of the universe. To say that “the mechanism of action for this therapy is simply beyond the province of science” is to offer an epistemological safe haven for fools, akin to “evolution is only a theory.” It is also untrue, as Bausell surely knows, because there are quite plausible explanations for the “mechanism of action for this therapy.”
To suggest that science should evaluate homeopathy by doing placebo-controlled trials, thereby side-stepping established knowledge, is how EBM (and Bausell) got sucked into the “CAM” mess in the first place. Finally, crediting homeopaths for “trying to understand both the patient and the genesis of his or her illness” is plainly fatuous, referring as it must to their baseless and incoherent elicitations of “symptoms,” and the similarly daft processes of “provings” and prescribing—suggesting that Bausell, who is not a physician, has either been duped by homeopaths’ advertising pitches or, for some reason, feels that he must be conciliatory toward them.
Bausell praises the Cochrane Collaboration for offering the most “unbiased assessments of what works and what doesn’t work” and for “not being susceptible to accusations of bias toward ‘CAM.’ ” (p. 202) Yet almost all Cochrane “CAM” reviews avoid stating that the method under review “doesn’t work,” whereas in most cases there is overwhelming external evidence that this is the case. Rather, the reviews typically advise that the “evidence”—in the EBM sense—doesn’t yet support recommending the use of the method, and “more research” is necessary. The coy suggestion of future promise and the refusal to consider all the evidence betray a bias in favor of “CAM” claims—claims that inevitably wither under the stark light of what we already know about nature.
All of this is reminiscent of a point made by Dr. Goodman, regarding the unfortunate legacy of “frequentist” statistics in medical research:
This method thus facilitated a subtle change in the balance of medical authority from those with knowledge of the biological basis of medicine toward those with knowledge of quantitative methods, or toward the quantitative results alone, as though the numbers somehow spoke for themselves.
On the jacket of Bausell’s book is an approving blurb by Edzard Ernst, “the first Professor of Complementary Medicine in the United Kingdom.” Dr. Ernst writes that he and Bausell “both have researched CAM rigorously for many years; we both were unable to show through our work that much of CAM works convincingly; and we are both publishing a book about this experience.” If it’s true that Dr. Ernst has also arrived at the truth about “CAM,” again I might wonder why it took so long—except that I already know the answer. In responding to a well-argued plea for rational thinking about “CAM,” written in 2003 by our friend and colleague Cees Renckens, Ernst dismissed plausibility while making the spurious claim, refuted above, that if we had considered it we might never have allowed the bacterial cause of peptic ulcer disease to be demonstrated. His argument was a perfect illustration of the EBM-“frequentist” fallacy as it bears on “CAM” research:
In the context of EBM, a priori plausibility has become less and less important. The aim of EBM is to establish whether a treatment works, not how it works or how plausible it is that it may work. The main tool for finding out is the RCT. It is obvious that the principles of EBM and those of a priori plausibility can, at times, clash, and they often clash spectacularly in the realm of CAM.
Mr. Magoo-like, Dr. Ernst somehow managed to avoid stumbling upon the truth at the last possible moment. We are reminded of the emotional ejaculation of David Reilly that began this series, heralding his own unwitting brush with reality.
Perhaps I’m being too hard on Ernst and Bausell. It is somewhat understandable that they have been blinded by the limited EBM version of evidence, and it is to their credit that they have finally recognized “CAM” for what it is in spite of that handicap (if, indeed, they have). Dr. Ernst, in particular, stands alone among academic “CAM” enthusiasts in having managed to have one foot planted in the skeptical camp all along. He has been the world’s most prolific university-based reviewer of “CAM” claims over the past 15 years, and although he is spectacularly wrong to favor the “principles of EBM [over] a priori plausibility” when the two “clash spectacularly,” his reviews have been mostly rigorous, in the EBM sense. He is also the author of one of my favorite rebukes to those who think that “licensing” and “credentialing” of “CAM” practitioners is in the public interest:
Those who believe that regulation is a substitute for evidence will find that even the most meticulous regulation of nonsense must still result in nonsense.
The obvious truth of that statement, even if its “evidence” is of the anemic, EBM variety, stands in square opposition to the views of most of Ernst’s colleagues, including several with whom he has collaborated (do they talk?). The U.S. government could have saved its taxpayers much money and misinformation, and the NIH much embarrassment, simply by paying Dr. Ernst to do exactly what he’s been doing since 1993, rather than having created the Office of Alternative Medicine and later the NCCAM.
Dr. Ernst, unlike his colleagues at the Cochrane Collaboration, at the NCCAM, and elsewhere, has also never been afraid to say that a “CAM” claim doesn’t work. Examples can be found here, here, and here. This is especially impressive in light of the original funding for his Professorship, from the Maurice Laing Foundation, for “research into efficacy of complementary health treatments and their integration into general medicine.” Placing the cart of “integration” before the horse of “research into efficacy”—language that is common to the NCCAM and other proponent organizations—would seem to favor advocacy over skeptical inquiry, and is contrary to Dr. Ernst’s words quoted above. Interesting: Laing no longer includes Ernst or his department on its list of fundees. Instead it endows the University of Southampton, where George Lewith can apparently be counted on to remain a company man. Has Dr. Ernst, by being too rigorous in his “CAM” evaluations, worn out his welcome?
If so, he has my congrats. Then why can’t I just take “yes” for an answer from Ernst and Bausell? The reasons are several and have already been touched upon or discussed at some length. In the latter category are evidence not considered and its close relative, statistics erroneously applied. In the former are unethical human trials that will keep happening until EBM opens its eyes to all the evidence; “CAM” advocates boldly demanding “pluralism” in health care; public discourse filled with distortions of language, of science, and of the scientific basis of medicine; and much more. None of this needed to happen, even if there were politics and money behind it. It couldn’t have emerged from the eternal, marginal “woo” fringe if there hadn’t been well-meaning but misguided academics such as Ernst, Bausell, and many more welcoming it with open arms and open wallets. Most of them still do.
[1] Gurney S. Socially Harmful but Unapparent Effects of the NCCAM – Columbia University “Gonzalez” Protocol. Sci Rev Alt Med 7;2:74-77 (Fall-Winter 2004).
[2] Atwood KC, IV, Woeckner E, Baratz RS, Sampson WI. Why the NIH Trial to Assess Chelation Therapy (TACT) Should be Abandoned. Medscape Journal of Medicine. Accepted for publication.
[3] Sampson WI. On Being a Critic. Sci Rev Alt Med 2;1:4-5 (Spring/Summer 1998).
† The Prior Probability, Bayesian vs. Frequentist Inference, and EBM Series:
16. What is Science?