OK, I admit that I pulled a fast one. I never finished the last post as promised, so here it is.
This systematic review has clearly identified the need for randomised or controlled clinical trials assessing the effectiveness of Laetrile or amygdalin for cancer treatment.
I’d previously asserted that this conclusion “stand[s] the rationale for RCTs on its head,” because a rigorous, disconfirming case series had long ago put the matter to rest. Later I reported that Edzard Ernst, one of the Cochrane authors, had changed his mind, writing, “Would I argue for more Laetrile studies? NO.” That in itself is a reason for optimism, but Dr. Ernst is such an exception among “CAM” researchers that it almost seemed not to count.
Until recently, however, I’d only seen the abstract of the Cochrane Laetrile review. Now I’ve read the entire review, and there’s a very pleasant surprise in it (Professor Simon, take notice). In a section labeled “Feedback” is this letter from another Cochrane reviewer, which was apparently added in August of 2006, well before I voiced my own objections:
The authors’ state that they: “[have] clearly identified the need for randomised or controlled clinical trials assessing the effectiveness of Laetrile or amygdalin for cancer treatment.” This is to fail completely to understand the nature of oncology research in which agents are tested in randomized trials (“Phase III”) only after they have been successful in Phase I and II study. There was a large Phase II study of laetrile (N Engl J Med. 1982 Jan 28;306(4):201-6) which the authors of the review do not cite, they merely exclude as being non-randomized. But the results of the paper are quite clear: there was no evidence that laetrile had any effect on cancer (all patients had progression of disease within a few months); moreover, toxicity was reported. To expose patients to a toxic agent that did not show promising results in a single arm study is clinical, scientific and ethical nonsense.
I would like to make a serious recommendation to the Cochrane Cancer group that no reviews on cancer are published unless at least one of the authors either has a clinical practice that focuses on cancer or actively conducts primary research on cancer. My recollection when the Cochrane collaboration was established was that the combination of “methodologic” and “content” expertise was essential.
Wow! That letter makes several of the same arguments that we’ve made here: that for both scientific and ethical reasons, scientific promise (including success in earlier trials) ought to be a prerequisite for a large RCT; that the 1982 Moertel case series was sufficient to disqualify Laetrile; and that EBM, at least in this Cochrane review, suffers from “methodolatry.” It also brings to mind Steven Goodman’s words:
An important problem exists in the interpretation of modern medical research data: Biological understanding and previous research play little formal role in the interpretation of quantitative results. This phenomenon is manifest in the discussion sections of research articles and ultimately can affect the reliability of conclusions. The standard statistical approach has created this situation by promoting the illusion that conclusions can be produced with certain “error rates,” without consideration of information from outside the experiment.
This method thus facilitated a subtle change in the balance of medical authority from those with knowledge of the biological basis of medicine toward those with knowledge of quantitative methods, or toward the quantitative results alone, as though the numbers somehow spoke for themselves.
Perhaps most surprising about the ‘Feedback’ letter is the identity of its author: Andrew Vickers, a biostatistician who wrote the Center for Evidence-Based Medicine’s “Introduction to evidence-based complementary medicine.” I’ve complained about that treatise before in this long series, observing that
There is not a mention of established knowledge in it, although there are references to several claims, including homeopathy, that are refuted by things that we already know.
Well, Dr. Vickers may not have considered plausibility when he wrote his Intro to EBCM, but he certainly seems to have done so when he wrote his objection to the Cochrane Laetrile review. Which is an appropriate segue to a topic that Dr. Vickers hints at (“content expertise”), perhaps unintentionally, in the letter quoted above: Bayesian inference.
A few years ago I posted three essays about Bayesian inference: they are linked below (nos. 2-4). The salient points are these:
1. Bayes’s Theorem is the solution to the problem of inductive inference, which is how medical research (and most science) proceeds: we want to know the probability of our hypothesis being true given the data generated by the experiment in question.
2. Frequentist inference, which is typically used for medical research, applies to deductive reasoning: it tells us the probability of a set of data given the truth of a hypothesis. To use it to judge the probability of the truth of that hypothesis given a set of data is illogical: the fallacy of the transposed conditional.
3. Frequentist inference, furthermore, is based on assumptions that defy reality: that there have been an infinite number of identically designed, randomized experiments (or other sort of random sampling), without error or bias.
4. Bayes’s Theorem formally incorporates, in its “prior probability” term, information other than the results of the experiment. This is the sticking point for many in the EBM crowd: they consider prior probability estimates, which are at least partially subjective, to be arbitrary, capricious, untrustworthy, and (paradoxically, because it is science that is ignored in the breach) unscientific.
5. Nevertheless, prior probability matters whether we like it or not, and whether we can estimate it with any certainty or not. If the prior probability is high, even modest experimental evidence supporting a new hypothesis deserves to be taken seriously; if it is low, the experimental evidence must be correspondingly robust to warrant taking the hypothesis seriously. If the prior probability is infinitesimal, the experimental evidence must approach infinity to warrant taking the hypothesis seriously.
6. Frequentist methods lack a formal measure of prior probability, which contributes to the seductive but erroneous belief that “conclusions can be produced…without consideration of information from outside the experiment.”
7. The Bayes Factor is a term in the theorem that is based entirely on data, and is thus an objective measure of experimental evidence. Bayes factors, in the words of Dr. Goodman, “show that P values greatly overstate the evidence against the null hypothesis.”
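To make that last point concrete, here is a minimal Python sketch. It assumes one particular bound, the minimum Bayes factor exp(-z²/2) under a Gaussian approximation, as a way of quantifying the overstatement; the function names and the 50% prior on the null are illustrative choices of mine, and other Bayes factors would give different (generally less extreme) numbers.

```python
import math
from statistics import NormalDist


def min_bayes_factor(p_two_sided: float) -> float:
    """Smallest Bayes factor (null vs. alternative) compatible with a
    two-sided P value, using the Gaussian bound exp(-z^2 / 2).
    ASSUMPTION: this bound is used for illustration only."""
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)  # z-score for the P value
    return math.exp(-z * z / 2)


def posterior_null(prior_null: float, bayes_factor: float) -> float:
    """Bayes's theorem in odds form:
    posterior odds of the null = prior odds of the null * Bayes factor."""
    prior_odds = prior_null / (1 - prior_null)
    post_odds = prior_odds * bayes_factor
    return post_odds / (1 + post_odds)


if __name__ == "__main__":
    bf = min_bayes_factor(0.05)
    print(f"minimum Bayes factor for P = 0.05: {bf:.3f}")
    print(f"posterior probability of the null, starting at 50%: "
          f"{posterior_null(0.5, bf):.3f}")
```

Under these assumptions, P = 0.05 corresponds to a Bayes factor of about 0.15 at best, so a reader who starts at even odds should still give the null hypothesis a probability of roughly 13%, not 5%.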
I bring up Bayes again to respond to Prof. Simon’s statements, recently echoed by several readers, that people may differ strongly in what they consider plausible, and that it is not clear how prior probability estimates might be incorporated into formal reviews. I’ve discussed these issues previously (here and here, and in recent comments here and here), but it is worth adding a point or two.
First, it doesn’t really matter that people may differ strongly in what they consider plausible. What matters is that they commit to some range of plausibility—in public and with justifications, in the cases of authors and reviewers, so that readers will know where they stand—and that everyone understands that this matters when it comes to judging the experimental evidence for or against a hypothesis.
An example will explain these points. Wayne Jonas was the Director of the US Office of Alternative Medicine from 1995 until its metamorphosis into the NCCAM in 1999. He is the co-author, along with Jennifer Jacobs, of Healing with Homeopathy: the Doctors’ Guide (©1996), which unambiguously asserts that ultra-dilute homeopathic preparations have specific effects. Yet Jonas is also the co-author (with Klaus Linde) of a 2005 letter to the Lancet that includes this statement, prefacing his argument that homeopathy, already subjected to hundreds of clinical trials, has not been disproved and deserves further trials:
We agree that homoeopathy is highly implausible and that the evidence from placebo-controlled trials is not robust.
Bayes’s theorem shows that Jonas can’t have it both ways. Either he doesn’t really agree that homeopathy is highly implausible (which seems likely, unless he changed his mind between 1996 and 2005—oops, he didn’t); or, if he does, he needs to recognize that his statement quoted above is equivalent to arguing that the homeopathy ‘hypothesis’ has been disproved, at least to an extent sufficient to discourage further trials.
Next, does it matter that we can’t translate qualitative statements of plausibility to precise quantitative measures? Does this mean that prior probability, in the Bayesian sense, is not applicable? I don’t think so, and neither do many scientists and statisticians. Even “neutral” or “non-informative” priors, when combined with Bayes factors, are more useful than P values (see #7 above). “Informative” priors—estimated priors or ranges of priors based on existing knowledge—are both useful and revealing: useful because they show how differing priors affect the extent to which we ought to revise our view of a hypothesis in the face of new experimental evidence (see #5 above); and revealing of where authors and others really stand, and of the information that those authors have used to make their estimates.
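A rough Python sketch shows how a range of priors plays out against the same evidence. Everything numeric here is hypothetical: the Bayes factor of 10 in favor of the hypothesis and the four priors are made-up values chosen only to span "plausible" to "highly implausible," and the function name is my own.

```python
def posterior_probability(prior: float, bayes_factor: float) -> float:
    """Posterior probability of a hypothesis from its prior probability and
    a Bayes factor expressed in FAVOR of the hypothesis
    (posterior odds = prior odds * Bayes factor)."""
    prior_odds = prior / (1 - prior)
    post_odds = prior_odds * bayes_factor
    return post_odds / (1 + post_odds)


if __name__ == "__main__":
    HYPOTHETICAL_BF = 10.0  # assumed strength of evidence, not from any trial
    for prior in (0.5, 0.1, 0.001, 1e-6):  # assumed range of priors
        post = posterior_probability(prior, HYPOTHETICAL_BF)
        print(f"prior {prior:g} -> posterior {post:.6f}")
```

With these made-up inputs, a reader who starts at even odds ends up at about 91%, while one who starts at one in a million ends up at about one in 100,000. The same data move both readers by the same factor in odds, but only the first has reason to take the hypothesis seriously, which is why it helps to have the priors stated in the open.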
I believe that frequentist statistics has allowed Dr. Jonas and other “CAM” enthusiasts to project a posture of scientific skepticism, as illustrated by Jonas’s words quoted above, without having to accept the consequences thereof. If convention had compelled him to offer a prior high enough to warrant further trials of homeopathy, Dr. Jonas would have revealed himself as credulous and foolish.
Finally, there is no reason that qualitative priors can’t be translated, if not precisely then at least usefully, to estimated quantitative priors. Sander Greenland, an epidemiologist and a Bayesian, explains this in regard to household wiring as a possible risk factor for childhood leukemia. First, he argues that there are often empirical bases for estimating priors:
…assuming (an) absence of prior information is empirically absurd. Prior information of zero implies that a relative risk of (say) 10^100 is as plausible as a value of 1 or 2. Suppose the relative risk was truly 10^100; then every child exposed >3 mG would have contracted leukaemia, making exposure a sufficient cause. The resulting epidemic would have come to everyone’s attention long before the above study was done because the leukaemia rate would have reached the prevalence of high exposure, or ~5/100 annually in the US, as opposed to the actual value of 4 per 100,000 annually; the same could be said of any relative risk >100. Thus there are ample background data to rule out such extreme relative risks.
The same could be said for many “CAM” methods that, while not strictly subjects of epidemiology per se, have generated ample experimental data (see homeopathy) or have been in use by enough people for enough time to have been noticed for substantial deviations from typical outcomes of universal diseases, should such deviations exist (see “Traditional [insert ethnic group here] Medicine”).
Next, Greenland has no problem with non-empirically generated priors, because these are revealing as well:
Many authors have expressed extreme scepticism over the existence of an actual magnetic-field effect, so much so that they have misinterpreted positive findings as null because they were not ‘statistically significant’ (e.g. UKCCS, 1999). The Bayesian framework allows this sort of prejudice to be displayed explicitly in the prior, rather than forcing it into misinterpretation of the data.
By “misinterpretation,” Greenland is arguing not that the “positive findings” of epidemiologic studies have proven the existence of a magnetic field effect, but that the objections of extreme skeptics must be made explicit: it is their presumed, if unstated, prior probability estimates that justify their conclusions about whether or not there is an actual magnetic field effect associated with childhood leukemia; it is not the data collection itself. Prior probability estimates put people’s cards on the table.
I recommend the rest of Greenland’s article, which is full of interesting stuff. For example, he doesn’t agree that “objective” Bayesian methods, using non-informative priors (see my point #7 above), are more useful than frequentist methods, since they are really doing the same thing:
…frequentist results are what one gets from the Bayesian calculation when the prior information is made negligibly small relative to the data information. In this sense, frequentist results are just extreme Bayesian results, ones in which the prior information is zero, asserting that absolutely nothing is known about the [question] outside of the study. Some promote such priors as ‘letting the data speak for themselves’. In reality, the data say nothing by themselves: The frequentist results are computed using probability models that assume complete absence of bias and so filter the data through false assumptions.
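The limiting behavior described in that passage can be sketched with a simple normal-normal update on a log relative risk. This is an assumed textbook model chosen for illustration, not necessarily the calculation Greenland uses, and the estimate (relative risk of 2, standard error 0.4 on the log scale) and the prior variances are hypothetical numbers.

```python
import math


def normal_posterior(prior_mean: float, prior_var: float,
                     estimate: float, estimate_var: float):
    """Combine a normal prior with a normal likelihood by weighting each
    by its information (inverse variance). Returns (mean, variance)."""
    w_prior = 1.0 / prior_var
    w_data = 1.0 / estimate_var
    post_var = 1.0 / (w_prior + w_data)
    post_mean = post_var * (w_prior * prior_mean + w_data * estimate)
    return post_mean, post_var


if __name__ == "__main__":
    est = math.log(2.0)    # hypothetical study estimate: log relative risk
    est_var = 0.4 ** 2     # hypothetical squared standard error

    # Informative prior centered on "no effect" (log RR = 0), assumed variance.
    m, v = normal_posterior(0.0, 0.5, est, est_var)
    print(f"informative prior: RR {math.exp(m):.2f}, variance {v:.3f}")

    # Prior information made negligibly small: a huge prior variance.
    m, v = normal_posterior(0.0, 1e12, est, est_var)
    print(f"negligible prior:  RR {math.exp(m):.2f}, variance {v:.3f}")
```

As the prior variance grows, the posterior mean and variance converge on the study's own estimate and variance, which is the sense in which the frequentist result is the Bayesian result with the prior information set to zero. An informative prior centered on no effect instead pulls the estimate back toward a relative risk of 1.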
All for now. In the next post I’ll discuss another Cochrane review that has some pleasant surprises.
*The Prior Probability, Bayesian vs. Frequentist Inference, and EBM Series:
16. What is Science?