Science-based medicine is partly an exercise in detailed navel gazing – we are examining the use of science in the practice of medicine. As we use scientific evidence to determine which treatments work, we also have to examine the relationship between science and practice, and the strengths and weaknesses of the current methods for funding, conducting, reviewing, publishing, and implementing scientific research – a meta-scientific examination.
There have been several recent publications that do just that – look at the clinical literature to see how it is working and how it relates to practice.
Dr. Vinay Prasad led a team of researchers through the pages of the New England Journal of Medicine hunting for medical reversals – studies that show that current medical practice is ineffective. Their results were published recently in the Mayo Clinic Proceedings:
Dr. Prasad’s major conclusion concerns the 363 articles that test current medical practice — things doctors are doing today. His group determined that 146 (40.2%) found these practices to be ineffective, or medical reversals. Another 138 (38%) reaffirmed the value of current practice, and 79 (21.8%) were inconclusive — unable to render a firm verdict regarding the practice.
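As a quick check on the arithmetic, the percentages quoted above can be recomputed from the reported counts. This is only a sketch: the counts come from the passage, while the label names and the `percentage` helper are illustrative choices, not anything taken from the study itself.

```python
# Counts reported in the passage above for the 363 articles that
# tested current medical practice. The label names are illustrative.
counts = {"reversal": 146, "reaffirmed": 138, "inconclusive": 79}
total = sum(counts.values())


def percentage(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


shares = {label: percentage(n, total) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {counts[label]} of {total} = {pct}%")
```

The three counts sum to 363, and the rounded shares match the figures in the quotation.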
Prasad also found that 27% of published studies looked at existing treatments while 73% studied new treatments.
This does not mean that 40% of current practice or current treatments are useless. There is likely a selection bias in which treatments that are controversial are more likely to be studied than ones that are well established. Also, as David Gorski has already pointed out, this is a study of one journal (the NEJM) and may reflect a publication bias toward high impact articles – showing new treatments that work and reversals of established treatments.
Also, on a positive note, these studies reflect the practice within medicine of studying even treatments that are already in use, and then abandoning those treatments when the evidence shows they don’t work.
The question remains, however – is 40% acceptable? Is this what we would expect from a system that is working well? In an accompanying editorial, John Ioannidis comments:
“Finally, are there incentives and anything else we can do to promote testing of seemingly established practices and identification of more practices that need to be abandoned? Obviously, such an undertaking will require commitment to a rigorous clinical research agenda in a time of restricted budgets,” concludes Dr. Ioannidis. “However, it is clear that carefully designed trials on expensive practices may have a very favorable value of information, and they would be excellent investments toward curtailing the irrational cost of ineffective health care.”
I agree that this highlights the fact that some current practices are useless or less than optimal and need to be re-examined. It also seems prudent to target expensive treatments that may not work.
These results might also suggest that the medical community, in some cases, adopts new treatments prematurely, based on preliminary evidence that is not reliable. Ioannidis himself pointed out that most new treatments do not work and that most preliminary positive findings for these treatments are false positives. Simmons et al. demonstrated how easy it is to inadvertently manufacture positive results by exploiting researcher degrees of freedom. Publication bias is also a known effect pushing the literature toward positive results.
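To give a rough sense of how analytic flexibility inflates false positives, here is a minimal sketch. It assumes the treatment has no real effect and that each alternative analysis is an independent test at the same significance threshold. That is a deliberate simplification and not the method Simmons et al. used; real analyses of the same data are usually correlated, so the true inflation is typically smaller than these numbers. The function name and parameters are hypothetical.

```python
def chance_of_false_positive(num_analyses, alpha=0.05):
    """Probability that at least one of `num_analyses` independent tests
    comes out 'significant' when the treatment truly does nothing.

    Simplifying assumption: the analyses are statistically independent.
    """
    if num_analyses < 0:
        raise ValueError("num_analyses must be non-negative")
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    # Each test avoids a false positive with probability (1 - alpha),
    # so all of them avoid one with probability (1 - alpha) ** n.
    return 1 - (1 - alpha) ** num_analyses


for n in (1, 2, 4, 8, 14):
    print(f"{n:2d} analyses: {chance_of_false_positive(n):.1%}")
```

Under these assumptions, a researcher who tries enough outcome measures, subgroups, or stopping points can push the chance of a spurious positive result well past the nominal 5%.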
All of this research seems to be pointing in the same direction – we need to compensate for this positive bias in the clinical research and perhaps raise the bar of rigorous evidence before adopting new treatments. Further, we need to retroactively apply this raised bar of evidence to existing treatments which were perhaps adopted prematurely.
Another researcher, Benjamin Djulbegovic, just published a further look at the clinical literature in the journal Nature. He examined 860 phase III efficacy trials and found that in just over half of these trials the new treatments being studied were superior to existing treatments. He concludes:
“Our retrospective review of more than 50 years of randomized trials shows that they remain the ‘indispensable ordeals’ through which biomedical researchers’ responsibility to patients and the public is manifested,” the researchers conclude. “These trials may need tweak and polish, but they’re not broken.”
His primary point is that this ratio is close to ideal. If the new treatments worked the vast majority of the time, then that would indicate that existing evidence is likely sufficient and it would call into question the ethics of doing the study – putting patients on a placebo or less effective treatment. If the efficacy trials demonstrated that the treatment worked in only a small minority of cases, then the chance of benefit to participants would be too small to be ethical, and that would also call into question the methods we were using to select treatments for large clinical trials.
Roughly a 50-50 split is therefore in the Goldilocks zone for clinical trials. Djulbegovic points out that this ratio allows for the steady incremental advance of medical treatments, even though only about 2-4% of new treatments represent a true breakthrough.
Another recent review of the clinical research, this one by Clinical Evidence, a project of BMJ, concludes that for about half of studied treatments, the efficacy is simply unknown. They point out that this does not mean half of the treatments actually in use, and the data say nothing about how frequently individual treatments are used – just that systematic reviews of treatments are inconclusive about half the time.
This result is similar to a 2011 review of Cochrane systematic reviews which found that 45% were inconclusive – could not conclude that the treatment either did or did not work.
One simple interpretation of all of this research into clinical trials is that we need to do more research. In the review of Cochrane reviews above, only 2% of reviews concluded that the examined treatment works and no further research is required (although it seems to me that recommending further research is the default conclusion for Cochrane reviews).
We do not, however, have infinite resources with which to conduct clinical research. We will always be making do with insufficient data, and so we must make the most out of the information we have. Part of that is understanding the patterns in the research, and the strengths and weaknesses of various kinds of medical research at every stage.
We now have a very good working knowledge of biases in clinical studies. Generally speaking, the published literature has a large positive bias. Preliminary studies should therefore be considered unreliable in terms of making clinical decisions, and should only be used as a basis for designing further research.
We need to recognize that only mature rigorous efficacy trials give us any reliable indication of whether or not a treatment works and is superior to doing nothing or to existing treatments. Even these trials, however, are problematic and are subject to bias. We therefore also need to consider plausibility (prior probability).
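The role of prior probability can be made concrete with Bayes' theorem. The sketch below estimates how likely it is that a treatment really works given a single positive trial. The default power (0.8) and significance threshold (0.05) are conventional illustrative values, and the example priors are assumptions chosen for demonstration, not figures from any of the studies discussed here. The function name is hypothetical.

```python
def probability_treatment_works(prior, power=0.8, alpha=0.05):
    """Posterior probability that a treatment works, given one positive trial.

    prior: plausibility of the treatment before the trial (0 to 1)
    power: chance the trial is positive if the treatment works (assumed)
    alpha: chance the trial is positive if it does not work (assumed)
    """
    for name, value in (("prior", prior), ("power", power), ("alpha", alpha)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    if true_positives + false_positives == 0:
        return 0.0  # a positive trial is impossible under these inputs
    return true_positives / (true_positives + false_positives)


# Illustrative priors: highly implausible, somewhat plausible, even odds.
for prior in (0.01, 0.1, 0.5):
    posterior = probability_treatment_works(prior)
    print(f"prior {prior:.2f} -> posterior {posterior:.2f}")
```

Even under these idealized assumptions, with no bias at all, a positive trial of a highly implausible treatment leaves it more likely than not that the result is a false positive, which is why plausibility has to be weighed alongside the trial data.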
The threshold for determining that a treatment is well established as beneficial is: a scientifically plausible intervention backed by rigorous clinical trials that show a replicable, clinically significant, and statistically significant benefit.
In my opinion, the research clearly shows that this is the reasonable standard. This is the primary meaning of science-based medicine.
Meanwhile, there is another movement within mainstream medicine whose primary goal is to move in the exact opposite direction: to soften the standards of scientific evidence, to use pragmatic studies as if they were efficacy trials, to interpret placebo effects as if they were physiological effects, and to have a “more flexible” concept of medical evidence.
These two opposite trends can exist side-by-side because the latter exists in an “alternative” world, a literal double standard.