This is the second post in a series* prompted by an essay by statistician Stephen Simon, who argued that Evidence-Based Medicine (EBM) is not lacking in the ways that we at Science-Based Medicine have argued. David Gorski responded here, and Prof. Simon responded to Dr. Gorski here. Between that response and the comments following Dr. Gorski’s post it became clear to me that a new round of discussion would be worth the effort.
Part I of this series provided ample evidence for EBM’s “scientific blind spot”: the EBM Levels of Evidence scheme and EBM’s most conspicuous exponents consistently fail to consider all of the evidence relevant to efficacy claims, choosing instead to rely almost exclusively on randomized, controlled trials (RCTs). The several quoted Cochrane abstracts, regarding homeopathy and Laetrile, suggest that in the EBM lexicon, “evidence” and “RCTs” are almost synonymous. Yet basic science or preliminary clinical studies provide evidence sufficient to refute some health claims (e.g., homeopathy and Laetrile), particularly those emanating from the social movement known by the euphemism “CAM.”
It’s remarkable to consider just how unremarkable that last sentence ought to be. EBM’s founders understood the proper role of the rigorous clinical trial: to be the final arbiter of any claim that had already demonstrated promise by all other criteria—basic science, animal studies, legitimate case series, small controlled trials, “expert opinion,” whatever (but not inexpert opinion). EBM’s founders knew that such pieces of evidence, promising though they may be, are insufficient because they “routinely lead to false positive conclusions about efficacy.” They must have assumed, even if they felt no need to articulate it, that claims lacking such promise were not part of the discussion. Nevertheless, the obvious point was somehow lost in the subsequent formalization of EBM methods, and seems to have been entirely forgotten just when it ought to have resurfaced: during the conception of the Center for Evidence-Based Medicine’s Introduction to Evidence-Based Complementary Medicine.
Thus, in 2000, the American Heart Journal (AHJ) could publish an unchallenged editorial arguing that Na2EDTA chelation “therapy” could not be ruled out as efficacious for atherosclerotic cardiovascular disease because it hadn’t yet been subjected to any large RCTs—never mind that there had been several small ones, and abundant additional evidence from basic science, case studies, and legal documents, all demonstrating that the treatment is both useless and dangerous. The well-powered RCT had somehow been transformed, for practical purposes, from the final arbiter of efficacy to the only arbiter. If preliminary evidence was no longer to have practical consequences, why bother with it at all? This was surely an example of what Prof. Simon calls “Poorly Implemented Evidence Based Medicine,” but one that was also implemented by the very EBM experts who ought to have recognized the fallacy.
There will be more evidence for these assertions as we proceed, but the main thrust of Part II is to begin to respond to this statement from Prof. Simon: “There is some societal value in testing therapies that are in wide use, even though there is no scientifically valid reason to believe that those therapies work.”
Some such Testing is Useful (and Fun)…
First, let me say that I am not opposed to all trials pertaining to such methods (not “therapies,” which begs the question), assuming that the risks to subjects are minimal, the funding is not public, and the study is honest and ethical in every respect. For example, I’m happy that studies have been done looking at interexaminer reliability of practitioners who claim to detect ‘craniosacral rhythms’ (there is none), at the ability of ‘therapeutic touch’ practitioners to detect the ‘human energy field’ when denied visual cues (they can’t), or whether ‘provers’ can distinguish between an ‘ultramolecular’ homeopathic preparation and a ‘placebo’ (they can’t).
Those sorts of trials are small, cheap, paid for with private money, often test the claimants themselves (on whom the onus of proof belongs), have minimal risk of harm or discomfort, and each hypothesis tested is a sine qua non of a larger therapeutic claim. Such tests are simpler and less bias- and error-prone than are efficacy trials of the corresponding claims, and are sufficient to reject those claims. Yet EBM typically ignores such research when reviewing efficacy claims, as exemplified by the Cochrane homeopathy abstracts quoted in Part I and by a Cochrane review of “touch therapies.” In the case of homeopathy, there are several other testable hypotheses that, when tested, have also disconfirmed the larger claim. Why aren’t they cited in EBM-style reviews?
Here I must give Prof. Simon some credit. In his most recent discussion of EBM vs. SBM he wrote the following:
Now part of me says things like, no funding of research into therapeutic touch until someone can replicate the Emily Rosa experiment and show different results than Ms. Rosa did. So I’m kind of split on this issue.
It was the Emily Rosa experiment that demonstrated that ‘therapeutic touch’ practitioners could not detect the ‘human energy field’ when denied visual cues. Thus in some ways Prof. Simon and I are not that far apart, although I’m not at all split on the issue. I’ll discuss more of this in Part III.
…But Efficacy Trials are Not
Regarding publicly funded efficacy trials of implausible claims, my responses are several, including those that Dr. Gorski has already discussed: such studies don’t convince true believers, they are frequently unethical and even dangerous, and they waste research funds. Prof. Simon counters that to have societal value, studies needn’t convince true believers, only fence-sitters (true but irrelevant—see below), and that the public money spent is such a small portion of the entire health care bill that it makes little difference—but here he stumbles a bit:
Money spent on health care is a big, big pot of money and the money spent on research is peanuts by comparison. If we spend some research money to help insure that the big pot of money is spent well, we have been good stewards of the limited research moneys.
The issue, of course, is whether or not the research money is well spent. In the case of efficacy trials of methods that lack scientific bases, the money is never well spent. The same people who would be convinced by such trials ought to be convinced by NIH scientists simply explaining to them, in a definitive way, that there is no scientifically valid reason for those methods to work, or that the methods have already been disproved by other investigations, including the types of trials just mentioned. If such statements are not convincing, why not? Remember, fence-sitters are not true believers or anti-intellectual, conspiracy-theory-laden, anti-fluoride, pro-Laetrile, pro-chelation, anti-vax paranoiacs. If they were, they wouldn’t be convinced by trials either, would they?
To explain why otherwise reasonable people might not be convinced by definitive statements based on science, we need look no further than EBM’s own scientific blind spot, as perfectly exemplified by Dr. Ernst’s words in his 2003 debate with Cees Renckens, quoted in Part I:
In the context of EBM, a priori plausibility has become less and less important. The aim of EBM is to establish whether a treatment works, not how it works or how plausible it is that it may work. The main tool for finding out is the RCT…
Ironically, it may be that those at most risk for being unconvinced by science are physicians themselves—thanks to EBM. What follows is the passage that I promised at the end of Part I. It illustrates just how elusive clear thinking can be, even for very intelligent people, after they’ve been steeped in EBM. Originally posted here, it also introduces the next reason that we should, er, look askance at calls for efficacy trials of implausible claims:
Failing to consider Prior Probability leads to Unethical Human Studies
An example…was the regrettable decision of two academic pediatricians, one of whom is a nationally-recognized expert in pediatric infectious disease, to become co-investigators in an uncontrolled trial of homeopathic “remedies” for acute otitis media (AOM) in 24 children, ages 8–77 months. The treating investigators were homeopaths. The report provided evidence that 16 of the children had persistent or recurrent symptoms, lasting from several days to 4 weeks after diagnosis. Nevertheless, no child was prescribed analgesics or antipyretics, and only one child was eventually prescribed an antibiotic by an investigator (another in an emergency room). There is no evidence that the investigators evaluated any of the subjects for complications of AOM, nor did the two academic pediatricians “interact with any research subjects.” Similar examples are not hard to find.
Funny thing about EBM’s tenacious hold on medical academics: a few years ago, when I first noticed the report just described, I ran it by a friend who is the chief of pediatrics at a Boston-area hospital and a well-known academic pediatrician in his own right. After I explained the tenets of homeopathy, he agreed that it is highly implausible. At that point I expected him to agree that the trial had been unethical. Instead he queried, “but isn’t its being unproven just the reason that they should be allowed to study it?” There was no convincing him.
Such faith in clinical trials as absolute, objective arbiters of truth about claims that contradict established knowledge raises another point that will have to wait for Part III: RCTs are not objective arbiters in such cases, but rather tend to confuse more than clarify. For now, let’s continue to look at…
Human Studies Ethics
A “Clinically Competent Medical Person”
That homeopaths were accepted as the sole treating clinician-investigators in the trial just mentioned should raise any IRB member’s eyebrows. According to the Helsinki Declaration,
Medical research involving human subjects should be conducted only by scientifically qualified persons and under the supervision of a clinically competent medical person. The responsibility for the human subject must always rest with a medically qualified person and never rest on the subject of the research, even though the subject has given consent.
The physician may combine medical research with medical care, only to the extent that the research is justified by its potential prophylactic, diagnostic or therapeutic value. When medical research is combined with medical care, additional standards apply to protect the patients who are research subjects.
Dr. Gorski mentioned two other trials that I’ve written extensively about, the Gonzalez trial for cancer of the pancreas and the ongoing Trial to Assess Chelation Therapy (TACT). Each of those claims had a minuscule prior probability, but proponents justified each by “popularity” and by appeals to EBM such as the AHJ editorial quoted above. Each trial involved clinically incompetent investigators chosen by the NIH: Gonzalez himself in the former and numerous chelationists in the latter, most of whom are members of the organizations described here, and many of whom have been subjected to actions by state medical boards, federal civil settlements, or criminal convictions. Predictably, the Gonzalez trial involved unnecessary torture of human subjects, and the TACT has involved unnecessary deaths.
Below are quotations from a post that subjected the Gonzalez trial to ethical scrutiny; most of the arguments apply to implausible claims in general. For the purposes of this post I’ll provide new topic headings and a few comments.
Informed Consent and Clinical Equipoise
In 2003, using the Gonzalez regimen as an example, I argued that the information offered to prospective subjects of trials of implausible claims is likely to be misleading:
Plausibility also figures in informed consent language and subject selection. How many subjects who are not wedded to “alternative medicine” would be likely to join a study that independent reviewers rate as unlikely to yield any useful results, or in which the risks are stated to outweigh the potential benefits? Are informed consents for such studies honest? In at least one case cited in the following paragraph, the answer is “no.” Nor may subjects who prefer “alternative” methods be preferentially chosen for such research even if they seek this, because “fair subject selection requires that the scientific goals of the study, not vulnerability, privilege, or other factors unrelated to the purposes of the research, be the primary basis for determining the groups and individuals that will be recruited and enrolled” (Emanuel et al. 2000).
The Office for Human Research Protections recently cited Columbia University for failure to describe serious risks on the consent form of its “Gonzalez” protocol for cancer of the pancreas, funded by the NCCAM (OHRP 2002). The study proposes to compare the arduous “Gonzalez” method, which is devoid of biological rationale, to gemcitabine, an agent acknowledged by the investigators to effect “a slight prolongation of life and a significant improvement in . . . quality of life.” Nevertheless, a letter from Columbia to prospective subjects states, “it is not known at the present time which treatment approach is best [sic] overall” (Chabot 1999). The claim of clinical equipoise, or uncertainty in the expert medical community over which treatment is superior—necessary to render a comparison trial ethical—is not supported by the facts (Freedman 1987).
The consent forms for both the TACT and the homeopathy trial mentioned above were also uninformative or worse. For my comments on the former, look here under “Comments on the TACT Consent Form”; for the homeopathy trial’s consent form, look here and reach your own conclusions (hint: there is no mention of the risks of omitting standard treatments for acute otitis media).
Ms. Gurney’s article [about a friend who submitted himself to the Gonzalez trial] provides additional, compelling evidence that the Gonzalez protocol did not meet the standard of clinical equipoise:
…at ASCO, I learned quickly and definitively that the Gonzalez protocol was a fraud; no mainstream doctors believed it was anything else and they were surprised that anyone with education would be on it.
The “mainstream doctors” of the American Society of Clinical Oncology must be judged representatives of the pertinent “expert medical community.”
The TACT also violates the principle of clinical equipoise, even as it claims to do otherwise, as discussed here under “‘Clinical Equipoise’ and the Balance for Risks and Benefits.”
Science and Ethics
There is a consensus, among those who consider human studies ethics, that a study must be scientifically sound in order to be ethical. According to the Council for International Organizations of Medical Sciences’ International Ethical Guidelines for Biomedical Research Involving Human Subjects (CIOMS; Geneva, Switzerland: 1993; quoted here):
Scientifically unsound research on human subjects is ipso facto unethical in that it may expose subjects to risks or inconvenience to no purpose.
The Helsinki Declaration agrees:
Medical research involving human subjects must conform to generally accepted scientific principles, be based on a thorough knowledge of the scientific literature, other relevant sources of information, and on adequate laboratory and, where appropriate, animal experimentation.
There is no body of basic science or animal experimentation that supports the claims of Gonzalez.
Emanuel and colleagues, writing in JAMA in 2000, asserted:
Examples of research that would not be socially or scientifically valuable include clinical research with…a trifling hypothesis…
I assert that highly implausible claims ought to be viewed as “trifling hypotheses.”
The Fallacy of Popularity
Virtually all of the research agenda of the NCCAM has been justified by the assertion that implausible claims that are popular require research, merely because people are using them. Referring to the opinions of the late NCCAM Director Stephen Straus, Science Magazine wrote in 2000:
Scientific rigor is sorely needed in this enormously popular but largely unscrutinized field….Most of these substances and treatments have not been tested for either safety or efficacy.
As surprising as it may be to some, however, a method’s popularity may not supersede the interests of individual trial subjects. According to the Helsinki Declaration:
In medical research on human subjects, considerations related to the well-being of the human subject should take precedence over the interests of science and society.
The Belmont Report agrees:
Risks and benefits of research may affect the individual subjects, the families of the individual subjects, and society at large (or special groups of subjects in society). Previous codes and Federal regulations have required that risks to subjects be outweighed by the sum of both the anticipated benefit to the subject, if any, and the anticipated benefit to society in the form of knowledge to be gained from the research. In balancing these different elements, the risks and benefits affecting the immediate research subject will normally carry special weight.
The U.S. Code of Federal Regulations is unequivocal:
The IRB should not consider possible long-range effects of applying knowledge gained in the research (for example, the possible effects of the research on public policy) as among those research risks that fall within the purview of its responsibility. (CFR §46.111)
“Popularity” is a Ruse
In addition to the ethical fallacy just discussed, there is another fallacy having to do with popularity: the methods in question aren’t very popular. In the medical literature, the typical article about an implausible health claim begins with the irrelevant and erroneous assertion that “34%” or “40%” or even “62%” (if you count prayer!) of Americans use ‘CAM’ each year. This is irrelevant because at issue is the claim in question, not ‘CAM’ in general. It is erroneous because ‘CAM’ in general is so vaguely defined that its imputed popularity has been inflated to the point of absurdity, as exemplified by the NCCAM’s attempt, in 2002, to include prayer (which it quietly dropped from the subsequent, 2007 survey results).
It is erroneous also because it fails to distinguish between such different issues as consulting a practitioner and casually purchasing a vitamin pill at the supermarket, or between Weight Watchers and the pseudoscientific “blood type diet,” and much more. It is erroneous also because it fails to distinguish between occasional and frequent use or between rational use and flimflam (vitamins for deficiency states vs. vitamins to shrink tumors; visualization for anxiety vs. visualization to shrink tumors).
Most of the ‘CAM’ claims for which people consult practitioners are fringe methods, each involving, in the most credible survey, less than 1% of the adult population. The slightly more popular exceptions are chiropractic and massage, reported by 3.3% and 2%, respectively, but these numbers also fail to distinguish rational expectations from flimflam (a wish to alleviate back pain or muscle soreness vs. a wish to cure asthma or to remove ‘toxins’). The most recent National Health Interview Survey (NHIS), co-authored by an NCCAM functionary, reported that 8.6% of adults had used “chiropractic or osteopathic manipulation” in the previous 12 months, further confusing the question of chiropractic.
Let’s revisit an example of how the ‘popularity’ gambit has been used to entice scientific reviewers and taxpayers to pony up for regrettable ‘CAM’ research. The aforementioned NCCAM/NHLBI-sponsored Trial to Assess Chelation Therapy for coronary artery disease (TACT), which at $30 million and nearly 2400 subjects was to be the most expensive and largest NIH-sponsored ‘CAM’ trial when it began in 2003, was heralded as follows:
“The public health imperative to undertake a definitive study of chelation therapy is clear. The widespread use of chelation therapy in lieu of established therapies, the lack of adequate prior research to verify its safety and effectiveness, and the overall impact of coronary artery disease convinced NIH that the time is right to launch this rigorous study,” said Stephen E. Straus, M.D., NCCAM Director.
Over 800,000 patient visits were made for chelation therapy in the United States in 1997…
In the application that won him the TACT grant, Dr. Gervasio Lamas, who had also been the author of the American Heart Journal editorial quoted above, used similar language:
2.0 BACKGROUND AND SIGNIFICANCE
2.1 Alternative Medicine and Chelation Therapy in the United States
…A carefully performed national survey, and other more restricted local surveys all find the practice of alternative medicine to be widespread…34% reported using at least one alternative therapy in the last year…Thus alternative medical practices are common, and constitute a significant and generally hidden health care cost for patients.
NCCAM estimated that more than 800,000 visits for chelation therapy were made in the U.S. in 1997…
Sounds impressive, huh? Less so when you know the truth. The NCCAM made no such estimation. It merely accepted, without question, the number given to it by the American College for Advancement in Medicine (ACAM)—a tiny group of quacks who’d been peddling chelation for decades, especially after their original snake oil of choice, Laetrile, had been outlawed. Not mentioned by the NCCAM press release or by Dr. Lamas was that the purported 800,000 chelation visits were for all comers: the ACAM member appointed as TACT “Trial Chelation Consultant” touts chelation for about 70 indications (it’s the One True Cure), so we can only guess how many hapless chelation recipients thought they were being treated for coronary disease.
For the NIH to have chosen ‘visits’ as the units of popularity, moreover, was misleading in itself: each person submitting to chelation typically makes at least 30 biweekly visits followed by indefinite bimonthly visits, so even if the ACAM number had been accurate, about 0.01% of the U.S. adult population underwent the treatment in 1997—a far cry from “34%.”
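The arithmetic behind that “0.01%” is easy to check. The visit count and visits-per-patient figure are the ones cited above; the 200-million figure for the 1997 U.S. adult population is my round assumption, used only to show the order of magnitude:

```python
# Rough check: how many people do 800,000 chelation visits represent,
# and what fraction of U.S. adults is that?

visits_1997 = 800_000        # ACAM's unverified visit count, taken at face value
visits_per_patient = 30      # at least 30 biweekly infusions per course
adults_us = 200_000_000      # assumed round figure for U.S. adults, 1997

patients = visits_1997 / visits_per_patient   # upper bound on distinct patients
share_pct = patients / adults_us * 100        # as a percentage of adults

print(round(patients), round(share_pct, 3))
```

Even this upper bound—ignoring the indefinite follow-up visits, which would shrink it further—lands at roughly 0.01%, more than three orders of magnitude below the “34%” invoked in the grant application.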
It’s no surprise, then, that the TACT has dragged on considerably longer than the originally planned 5 years. It hasn’t been able to recruit enough subjects! If you wade through the history of the trial on ClinicalTrials.gov, you’ll find that the expected subject enrollment has dwindled from 2372 to 1700, in spite of the NIH having selected more than 100 “community chelation practices” as study sites, in spite of its having added 22 (originally unplanned) Canadian sites a few years later, and in spite of the trial’s duration having been prolonged by several years.
A brief perusal of the 2002 NHIS data reveals that the NCCAM could have predicted this problem: the survey estimated that 0.0% (sic) of the U.S. adult population had used chelation for any reason in the previous 12 months, based on 10 of 31,000 adults interviewed having answered in the affirmative—a number so small that the extrapolation to the entire population “did not meet standards of reliability or precision.” Do you suppose that Director Straus was aware of the NHIS data when he asserted a “widespread use of chelation therapy”?
Next: Efficacy trials of highly implausible claims don’t work very well.
*The Prior Probability, Bayesian vs. Frequentist Inference, and EBM Series:
16. What is Science?