“Alternative medicine,” so-called “complementary and alternative medicine” (CAM), or, as it’s become fashionable to call it, “integrative medicine” is a set of medical practices that are far more based on belief than science. As Mark Crislip so pointedly reminded us last week, CAM is far more akin to religion than science-based medicine (SBM). However, as I’ve discussed more times than I can remember over the years, both here and at my not-so-super-secret-other blog, CAM practitioners and advocates, despite practicing what is in reality mostly pseudoscience-based medicine, crave the imprimatur that science can provide, the respect that science has. That is why, no matter how scientifically implausible the treatment, CAM practitioners try to tart it up with science. I say “tart it up” because they aren’t really providing a scientific basis for their favored quackery. In reality, what they are doing is choosing science-y words and using them as explanations without actually demonstrating that these words have anything to do with how their favored CAM works.
A more important fundamental difference between CAM and real medicine is that CAM practices are not rejected based on evidence. Basically, they never go away. Take homeopathy, for example. (Please!) It’s the ultimate chameleon. Even 160 years ago, it was obvious from a scientific point of view that homeopathy was nonsense and that diluting something doesn’t make it stronger. When it became undeniable that this was the case, through the power of actually knowing Avogadro’s number, homeopaths were undeterred. They concocted amazing explanations of how homeopathy “works” by claiming that water has “memory.” It supposedly “remembers” the substances with which it’s been in contact and transmits that “information” to the patient. No one’s ever been able to explain to me why transmitting the “information” from a supposed memory of water is better than the information from the real drug or substance itself, but that’s just my old, nasty, dogmatic, reductionist, scientific nature being old, nasty, dogmatic, reductionist, and scientific. Then, of course, there’s the term “quantum,” which has been so widely abused by Deepak Chopra, his acolytes, and the CAM community in general, while the new CAM buzzword these days to explain why quackery “works” is epigenetics. Basically, whenever a proponent of alternative medicine uses the word “epigenetics” or “quantum” to explain how an alternative medicine treatment “works,” what he really means is, “It’s magic.” This is a near-universal truth, and even the most superficial probing of such justifications will virtually always reveal magical thinking combined with an utter ignorance of the science of quantum mechanics or epigenetics.
So, yes, much of CAM is very much more like religion than science in that it is immune to evidence. True, the scientific “explanations” change, and CAM practices might evolve at the edges based on evidence, but the core principles remain. You don’t see, for example, homeopaths or naturopaths deciding that homeopathy doesn’t work because science and clinical trials overwhelmingly show that it is nonsense. You don’t see chiropractors leaving chiropractic in droves because they’ve come to the realization that subluxations don’t exist and they can’t cure allergies, heart disease, gastrointestinal ailments (or anything else) but rather are in reality physical therapists with delusions of grandeur. Ditto reiki, acupuncture, therapeutic touch, and “energy healing.” These practices persist despite overwhelming evidence that they do not work and are based on magical thinking, not science. For all of the scientific studies and clinical trials funded by NCCAM and other CAM-friendly organizations, CAM practitioners never actually take the next step from all the negative studies of CAM and come to the conclusion that they should stop using such modalities.
The key difference between SBM and CAM
No one is saying that the record of SBM is perfect when it comes to changing nimbly with new evidence, and any imperfection in the record of SBM and evidence-based medicine (EBM) actually being, well, science- and evidence-based, is a favorite target of CAM apologists. Hence there are frequent claims circulating that only 15% of medicine is actually evidence-based. It’s a bogus claim, a myth, as Steve Novella has pointed out. In reality, studies appear to converge on estimates that approximately 80% of interventions are based on compelling evidence, and between 30% and 60%, depending on the specialty, are based on randomized clinical trials. That’s not good enough, but it’s far better than CAM apologists would lead you to believe, and it’s certainly far better than anything in CAM.
Nonetheless, it has been recognized for a long time that EBM/SBM is sometimes slow to change in response to new evidence. Indeed, there was an aphorism I heard while in medical school that outdated treatments and procedures don’t die off completely until the physicians who learned them during residencies or fellowships die off. I learned that that’s not entirely true. There is, after all, a gap of around 20 years between the time a generation of physicians retires and the time it dies off; so such practices actually die off much sooner. I keed, I keed, of course, but the point is valid.
There is the opposite problem in EBM/SBM as well, namely a tendency towards a “bandwagon” effect wherein a new therapy is widely adopted before there is solid evidence of its superiority (or at least of its non-inferiority with alternate benefits). I’m a surgeon, so I know that, unfortunately, the surgical world is very much prone to this sort of problem. Surgeons tend to like shiny, pretty new toys and to do spiffy new procedures that prove that they are the biggest, baddest scalpel cowboys in all the land. These tendencies have led to a number of procedures becoming widely adopted before they were definitively shown to be superior. Laparoscopic cholecystectomy is the example that I like to use the most; it swept the surgical world over 20 years ago without compelling evidence for its safety. Later, it was found that the incidence of common bile duct injury was much higher after laparoscopic cholecystectomy than conventional cholecystectomy. That incidence fell as more surgeons became more facile at the procedure, but it was years before there was compelling evidence that the laparoscopic approach was truly superior. History seems to be repeating itself today with robotic surgery. At the risk of offending some of my surgical colleagues, I’ve yet to see compelling evidence that doing, for example, a radical prostatectomy with the da Vinci robot is truly superior to doing it using what was the new way ten or fifteen years ago but is now the old way, using laparoscopy. From my perspective evaluating existing evidence, the da Vinci is as safe and effective as laparoscopy, but if it is sufficiently more so to justify its much greater cost I haven’t seen the evidence yet. I sometimes joke that if it were possible to do breast surgery (my specialty) with the da Vinci, then I’d be all for it. Maybe I’ll have to look into that. I could be bigger than Armando Giuliano, and time’s wasting. I probably only have 15 or 20 years left in my career to make an international name for myself.
But how often are medical practices found to be ineffective and abandoned? How much do we test existing practices in light of new data? There have been a number of studies looking at this issue, which is already a marked contrast to CAM, where ineffective practices are, as far as I can tell, never abandoned. The most recent of these caught my eye last week. Published in the Mayo Clinic Proceedings by a team from the National Cancer Institute, the University of Chicago (one of my alma maters!), Northwestern University, George Washington University, and Lankenau Medical Center and entitled “A Decade of Reversal: An Analysis of 146 Contradicted Medical Practices,” this study seeks to get a handle on the answer to that very question for these reasons:
We expect that new medical practices gain popularity over older standards of care on the basis of robust evidence indicating clinical superiority or noninferiority with alternative benefits (eg, easier administration and fewer adverse effects). The history of medicine, however, reveals numerous exceptions to this rule. Stenting for stable coronary artery disease was a multibillion dollar a year industry when it was found to be no better than medical management for most patients with stable coronary artery disease.1 Hormone therapy for postmenopausal women intended to improve cardiovascular outcomes was found to be worse than no intervention,2 and the routine use of the pulmonary artery catheter in patients in shock was found to be inferior to less invasive management strategies.3 Previously, we have called this phenomenon (when a medical practice is found to be inferior to some lesser or prior standard of care) a medical reversal.4, 5, 6 Medical reversals occur when new studies—better powered, controlled, or designed than their predecessors—contradict current practice.4 In a prior investigation of 1 year of publications in a high-impact journal, we found that of 35 studies testing standard of care, 16 (46%) constituted medical reversals.4 Another review of 45 highly cited studies that claimed some therapeutic benefit found that 7 (16%) were contradicted by subsequent research.7
Identifying medical practices that do not work is necessary. The continued use of such practices wastes resources, jeopardizes patient health, and undermines trust in medicine. Interest in this topic has grown in recent years. The American Board of Internal Medicine launched the Choosing Wisely campaign,8 a call on professional societies to identify the top 5 diagnostic or therapeutic practices in their field that should not be offered.9 In England, the National Institute for Health and Clinical Excellence has tried to “disinvest” from low-value practices, identifying more than 800 such practices in the past decade.10 Other researchers have found that scanning a range of existing health care databases can easily generate more than 150 low-value practices.11 Medical journals have specifically focused on instances in which more health care is not necessarily better. The Archives of Internal Medicine created a new feature series in 2010 entitled “Less is More.”12
One can’t help but note right from the introduction of this paper that SBM/EBM does continually reevaluate its practices and treatments, testing which ones work and which ones do not and comparing current practice against new treatments. Granted, the intensity of this effort seems to be a more recent development, with the implementation of the Patient Protection and Affordable Care Act, but is it really? This article suggests that the answer is: perhaps not.
The authors specifically examine the question of how much of the medical literature consists of what they refer to as “medical reversals,” as described above. Specifically, they tried to estimate what percentage of the medical literature consists of articles that question current medical practice, particularly articles that present high-quality evidence suggesting that current practice needs to be changed or that a standard-of-care intervention doesn’t work, doesn’t work as well as a non-standard-of-care intervention, or is actually harmful. How the authors did this, I find easier to let them describe:
Two reviewers (C.T., A.V., M.C., J.R., S.Q., S.J.C., D.B., V.G., or S.S.) and V.P. read articles addressing a medical practice in full. On the basis of the abstract, introduction, and discussion, articles were classified as to whether the practice in question was new or existing. Methods were classified as one of the following: randomized controlled trial, prospective controlled (but nonrandomized) intervention study, observational study (prospective or retrospective), case-control study, or other methods. End points for articles were classified into those that reached positive conclusions and those that found negative or no difference in end points. Lastly, articles were given 1 of 4 designations. Replacement was defined as a new practice surpassing an older standard of care. Back to the drawing board was defined as a new practice failing to surpass an older standard. Reversal was designated when a current medical practice was found to be inferior to a lesser or prior standard. Reaffirmation was defined as an existing medical practice being found to be superior to a lesser or prior standard. Finally, articles in which no firm conclusion could be reached were termed inconclusive. The designation of an article was also performed in duplicate. When there were differences in opinion between the 2 reviewers, adjudication first involved discussion between the 2 readers to see whether agreement could be reached. If disagreement persisted, a third reviewer (A.C.) adjudicated the discrepancy. Less than 3% of articles required discussion, and less than 1% required adjudication. A table detailing each medical reversal was constructed (Supplemental Appendix; available online at http://www.mayoclinicproceedings.com), and the third reviewer (A.C.) reviewed all reversals.
So what did the investigators (Prasad et al) find? They examined ten years’ worth of NEJM original reports, from 2001 through 2010, for a total of 2,044 original articles. Of these, 1,344 (65.8%) addressed a medical practice, of which 911 (68%) were randomized controlled trials, 220 (16%) were prospective controlled but non-randomized studies, 117 (9%) were observational studies, 43 (3%) were case-control studies, and 53 (4%) used other methods. Of these 1,344 reports, 981 (73%) studied a new medical practice, while 363 (27%) addressed an existing practice. Overall, 756 articles (56%) found that a new practice surpassed the existing standard of care at the time (replacement), while 165 (12%) failed to find that a new practice was better than existing practices. In terms of what we’re really interested in, of the 363 studies examining an existing practice, 146 studies (40%) were reversals, while 138 (38%) upheld standard practices. Here’s a breakdown from the article for your edification:
Of the reversal articles, not surprisingly most (76%) turned out to be randomized clinical trials, and interestingly, the percentage of each type of trial didn’t change much over the decade-long study period:
The one problem I had with this study was that it only looked at one journal: The New England Journal of Medicine. I can understand why the authors might have chosen that particular journal. It’s very high impact, and, with the exception of a recent distressing tendency to let some low quality CAM articles slip in, one of the more rigorous medical journals out there that isn’t a specialty journal; i.e., it accepts articles covering all areas of medicine. It’s not a basic science journal; it generally only publishes original studies that are either clinical trials, epidemiological studies, or at the very least highly translational. It also, from my reading, only rarely publishes really preliminary clinical work, such as phase I clinical trials. On the other hand, one has to wonder whether the results would be generalizable to the rest of the medical literature.
For example, according to this study, articles in the NEJM that tested new practices were far more likely to find them beneficial than articles that tested existing ones (77.1% vs 38.0%), while articles that tested existing standard-of-care practices were far more likely to find those practices ineffective than articles testing new practices (40.2% vs 17.0%). Looking at such numbers, I can’t help but wonder if there is a publication bias for finding new therapies effective and/or for finding existing therapies either ineffective or harmful, particularly in the NEJM, which is among the highest of high-impact medical journals. Think about it. Who thinks that their findings are substantial enough and interesting enough to be seriously considered for publication in the NEJM? It’s investigators who have found that some new therapy works for a common or very serious disease, but it wouldn’t surprise me if it’s also authors who have found compelling evidence that a commonly used existing standard of care is either not effective or is even dangerous.
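For readers who want to check the arithmetic, the percentages quoted above can be recomputed from the counts reported earlier in this post. What follows is only a minimal Python sketch: the variable names and the `pct` helper are illustrative assumptions, not anything from the paper, and the counts are simply the ones quoted here.

```python
# Counts as quoted in this post from Prasad et al.
# Variable names and the pct() helper are illustrative, not from the paper.
total_articles = 2044          # NEJM original articles, 2001 through 2010
addressed_practice = 1344      # articles that addressed a medical practice
new_practice = 981             # articles testing a new practice
existing_practice = 363        # articles testing an existing practice
replacement = 756              # new practice surpassed the standard of care
back_to_drawing_board = 165    # new practice failed to surpass the standard
reversal = 146                 # existing practice found inferior
reaffirmation = 138            # existing practice upheld


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


summary = {
    "addressed a practice (of all articles)": pct(addressed_practice, total_articles),
    "new practice found beneficial": pct(replacement, new_practice),
    "existing practice reaffirmed": pct(reaffirmation, existing_practice),
    "existing practice reversed": pct(reversal, existing_practice),
    "new practice found no better": pct(back_to_drawing_board, new_practice),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

Run as written, the first four figures match the 65.8%, 77.1%, 38.0%, and 40.2% quoted in this post. The last works out to about 16.8% from the counts quoted here, a shade under the 17.0% reported; the sketch makes no attempt to reconcile that difference.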
It’s also informative to look at some of the medical practices that were the subject of reversal articles. For instance, it was thought that certain vaccinations could increase the risk of relapse in multiple sclerosis, but two studies showed no increased risk. One looked at tetanus, hepatitis B, and influenza vaccination; the other at hepatitis B vaccination. Another study showed that delayed placement of drainage tubes for effusion in otitis media did not result in worse outcomes than immediate placement, resulting in a change in practice. Another key reversal came in the form of a 2003 study that showed that high-dose chemotherapy followed by bone marrow transplantation did not improve survival in advanced breast cancer. This was a huge one, and almost immediately oncologists stopped doing bone marrow transplants for breast cancer. Another showed that the use of pulmonary artery catheters in acute lung injury didn’t improve outcomes and was associated with more complications. (When I was a resident in the 1990s, all of these patients got pulmonary artery catheters.) A couple of these I’ve written about, such as vertebroplasty. More recently, there was a study that showed no benefit to routine PSA screening for prostate cancer in American men.
Indeed, I can’t help but mention here that the whole reevaluation of routine screening for cancer, such as PSA screening for prostate cancer and mammography for breast cancer, topics I’ve written about numerous times for this blog, is an example of exactly that: SBM/EBM evaluating current practices in light of new data and determining whether they should be changed or abandoned. Routine PSA screening for men at average risk of prostate cancer has more or less been abandoned, for example, while current mammography practices are being questioned as promoting too much overdiagnosis and likely will evolve in response.
Perhaps the most prominent example of the efforts EBM/SBM makes to continually reevaluate its practices is the Choosing Wisely initiative. Scott Gavura brought it up last year, and I’ve mentioned it elsewhere. It’s basically an effort by organized medicine to reduce the use of what are known as “low value” tests (i.e., tests that provide little or no benefit but are often costly and can produce complications and more invasive testing). If screening tests are a problem, there are also a lot of tests that are ordered too frequently or for dubious indications. The reasons can range from laziness to defensive medicine, but whatever the reason, such tests cost money, can lead to incidental findings that need further workup, and can even lead to overdiagnosis. In 2010, Dr. Howard Brody published a challenge to his physician colleagues in The New England Journal of Medicine. It was an amazing article, in which Dr. Brody challenged physician specialty organizations thusly:
In my view, organized medicine must reverse its current approach to the political negotiations over health care reform. I would propose that each specialty society commit itself immediately to appointing a blue-ribbon study panel to report, as soon as possible, that specialty’s “Top Five” list. The panels should include members with special expertise in clinical epidemiology, biostatistics, health policy, and evidence-based appraisal. The Top Five list would consist of five diagnostic tests or treatments that are very commonly ordered by members of that specialty, that are among the most expensive services provided, and that have been shown by the currently available evidence not to provide any meaningful benefit to at least some major categories of patients for whom they are commonly ordered. In short, the Top Five list would be a prescription for how, within that specialty, the most money could be saved most quickly without depriving any patient of meaningful medical benefit. Examples of items that could easily end up on such lists include arthroscopic surgery for knee osteoarthritis and many common uses of computed tomographic scans, which not only add to costs but also expose patients to the risks of radiation.
Some specialty organizations have done just that, and the result is called Choosing Wisely. To begin, nine specialty societies have produced lists of Five Things Physicians and Patients Should Question, which Choosing Wisely describes as “evidence-based recommendations that should be discussed to help make wise decisions about the most appropriate care based on a patients’ individual situation.” The clinical societies that have participated include:
- American Academy of Allergy, Asthma & Immunology
- American Academy of Family Physicians
- American Academy of Hospice and Palliative Medicine
- American Academy of Neurology
- American Academy of Ophthalmology
- American Academy of Otolaryngology — Head and Neck Surgery Foundation
- American Academy of Pediatrics
- American College of Cardiology
- American College of Obstetricians and Gynecologists
- American College of Physicians
- American College of Radiology
- American College of Rheumatology
- American Gastroenterological Association
- American Geriatrics Society
- American Society for Clinical Pathology
- American Society of Clinical Oncology
- American Society of Echocardiography
- American Society of Nephrology
- American Society of Nuclear Cardiology
- American Urological Association
- Society for Vascular Medicine
- Society of Cardiovascular Computed Tomography
- Society of Hospital Medicine – Adult Hospital Medicine
- Society of Hospital Medicine – Pediatric Hospital Medicine
- Society of Nuclear Medicine and Molecular Imaging
- The Society of Thoracic Surgeons
It’s an impressive list. Naturally, being a cancer surgeon, I can’t resist looking here at the recommendations made by the American Society of Clinical Oncology (ASCO). Interestingly, two out of the five recommendations were breast cancer-related:
- Don’t perform PET, CT, and radionuclide bone scans in the staging of early breast cancer at low risk for metastasis.
- Don’t perform surveillance testing (biomarkers) or imaging (PET, CT, and radionuclide bone scans) for asymptomatic individuals who have been treated for breast cancer with curative intent.
It’s true. Far too often we do a million-dollar workup for patients with early stage breast cancer, and there is pretty much zero good evidence that these workups improve survival, improve care, or otherwise do anything except cost a lot of money, delay definitive treatment, expose the patient to radiation, and provoke worry in both patient and practitioner. I realize this is anecdotal experience, but overuse of these tests in early stage breast cancer doesn’t appear to be as much of a problem in big cancer centers as it is in community cancer hospitals. But it is a problem in a lot of places.
One thing that disappointed me was that the Choosing Wisely list from the American College of Radiology didn’t appear to include any breast cancer-related recommendations. For instance, routine breast MRI before surgery for breast cancer increases the rate of mastectomy and, worse, contrary to the stated intent for preoperative MRI, does not decrease the rate of reexcision. On the other hand, it’s refreshing to see a recommendation that most surgeons instinctively know to be true:
Avoid admission or preoperative chest x-rays for ambulatory patients with unremarkable history and physical exam.
Performing routine admission or preoperative chest x-rays is not recommended for ambulatory patients without specific reasons suggested by the history and/or physical examination findings. Only 2 percent of such images lead to a change in management. Obtaining a chest radiograph is reasonable if acute cardiopulmonary disease is suspected or there is a history of chronic stable cardiopulmonary disease in a patient older than age 70 who has not had chest radiography within six months.
Of course, although doctors carry a large share of the responsibility for unnecessary tests, medications, and care, they are not alone in being responsible for the overuse of various medical tests and interventions. The classic example, of course, is the use of antibiotics for viral infections, which is something that patients often demand in the mistaken belief that it will help them and that many doctors use because patients ask for it and it’s easier to give in than to spend the time it takes to explain why it’s not medically indicated. Another contributor to the problem is that many of these tests have a significant financial incentive. It’s the same problem that contributes to the slowness of the decline in use of some treatments shown not to be effective.
Lest any reader think that I don’t put my money where my mouth is, I will mention that I was recently appointed co-director of a state-wide program designed to increase quality of breast cancer care by promoting adherence to evidence-based guidelines and one of our primary goals this year will be to promote adherence among our member hospitals to the Choosing Wisely guidelines.
We change, but it’s messy and slow
One reason why EBM/SBM is slower than we might like to eliminate outdated and ineffective practices is simple. It’s not easy. Evidence from science, epidemiology, and clinical trials takes a long time to come in. It’s often very messy. When a practice comes into question, there will often be conflicting evidence, and it often takes a number of studies before conclusions about the practice firm up to the point where they are incorporated into evidence-based guidelines and become standard of care.
Often, practices that are later reversed come into usage based on premature and inadequate evidence. Often, small trials look promising, and physicians start using a treatment based on them. Sometimes such practices become standard based on short term outcome measures, and when long term data become available previously unsuspected harms become apparent. Sometimes it’s excessive confidence in the appropriateness of the proposed mechanism used to explain why the treatment should work. What is needed, according to Prasad et al (and I agree), is more rigor:
As such, we favor policies that minimize reversal. Nearly all such measures involve raising the bar for the approval of new therapies6, 83, 84 and asking for evidence before the widespread adoption of novel techniques. In all but the rarest cases,82 large, robust, pragmatic randomized trials measuring hard end points (with sham controls for studies of subjective end points) should be required before approval or acceptance. Our position is in contrast to efforts to lower standards for device and drug approval,85 which further erodes the value of the regulatory process.
One can’t help but note that this is in marked contrast to CAM studies, in which CAM advocates ask us to accept modalities on the basis of much less rigorous types of evidence. As Steve Novella has frequently pointed out, as rigorous randomized clinical trials show that most CAM interventions are no better than placebo, the refrain we frequently hear is that we should look at “pragmatic” trials. In this context, pragmatic doesn’t mean the same thing. What Prasad et al are referring to are randomized trials that reflect real-world practices. What I mean by “pragmatic” trials in the context of acupuncture are more observational trials of how the treatment is used in the real world. As I’ve said many times, this is putting the cart before the horse. Normally pragmatic trials are done for treatments that have already been shown to be efficacious in randomized clinical trials. They can’t show efficacy by themselves. They are designed to test how treatments already shown to be efficacious in randomized trials function once let “out into the wild” (i.e., the real world). Frequently, outside the rarefied, rigorous world of randomized clinical trials, treatments are less effective.
It should also be pointed out that just because a practice was “reversed” in a clinical trial doesn’t necessarily mean that the older practice was wrong. As Prasad et al put it:
The reversals we have identified by no means represent the final word for any of these practices. Simply because newer, larger, better controlled or designed studies contradict standard of care does not necessarily mean that older practices are wrong and new ones are right. On average, however, better designed, controlled, and powered studies reach more valid conclusions.94 Nevertheless, the reversals we have identified at the very least call these practices into question. Some practices ought to be abandoned, whereas others warrant retesting in more powerful investigations. One of the greatest virtues of medical research is our continual quest to reassess it.
So, yes, “conventional” medicine doesn’t always get it right. Occasionally it gets it wrong, on rare occasions spectacularly wrong. But unlike most CAM modalities, EBM/SBM is self-correcting. It actually does abandon treatments that don’t work. The process might be messy and ugly at times, but it does happen. For example, many years ago, angina pectoris was sometimes treated with a surgical procedure known as mammary artery ligation. The idea was that tying off these arteries would divert more blood to the heart. The operation became popular on the basis of relatively small, uncontrolled case series. Then, two randomized, sham surgery-controlled clinical trials were published in 1959 and 1960. Both of these trials showed no difference between bilateral internal mammary artery ligation and sham surgery. Very rapidly, surgeons stopped doing this operation. A similar example is one I mentioned above: bone marrow transplantation for advanced breast cancer, which was similarly rapidly abandoned after randomized clinical trials showing it to be no better than the previous standard of care. I’m not saying that this happened without conflict or disagreement; proponents of these therapies can always find reasons to discount the clinical trial evidence. But in the end evidence and science do win out.
Now compare this to CAM practices. Can anyone name a CAM treatment that was abandoned by CAM practitioners as a result of research and randomized clinical trials showing that it doesn’t work? A single one? I can’t. That’s the difference between CAM and EBM/SBM. The day that I see a CAM practice go extinct, like bilateral internal mammary artery ligation for angina pectoris, is the day that I might start to take seriously CAM practitioner claims that they are science-based.