Everything you know may be wrong. Well, not really, but reading the research of John Ioannidis does make you wonder. His work, concentrated on research about research, is a popular topic here at SBM. And that’s because he’s focused on improving the way evidence is brought to bear on decision-making. His most famous papers get to the core of questioning how we know what we know (or what we assume) to be evidence.
His most recent paper takes a look at the literature on biomarkers. Written with colleague Orestis Panagiotou, Comparison of effect sizes associated with biomarkers reported in highly cited individual articles and in subsequent meta-analyses is sadly behind a paywall – so I’ll try to summarize the highlights. Biomarkers are chemical markers or indicators that can be measured to verify normal biology, detect abnormal pathology, or measure the effect of some sort of treatment. Ever had blood drawn for lab tests? Then you’ve had biomarkers tested. Had your blood pressure checked? Another biomarker. The AACR-FDA-NCI cancer biomarkers consensus report provides a nice categorization of the different biomarkers currently in use:
- Diagnostic biomarkers
  - Early detection biomarkers
  - Disease classification
- Predictive biomarkers
  - Predict the response to a specific agent
  - Predict a particular adverse reaction
- Metabolism biomarkers
  - Biomarkers that guide drug doses
- Outcome biomarkers
  - Those that predict response
  - Those that predict progression
  - Those that forecast recurrence
Biomarkers are developed and implemented in medical practice in a process that parallels drug development. It starts with a hypothesis, followed by progressive research to validate the relationship between the measurement of a feature, characteristic, or parameter and the specific outcome of interest. The assay used to measure the biomarker must also undergo its own validation, ensuring that measurements are accurate, precise, and consistent. Biomarkers are generally considered clinically valid and useful when there is an established testing system that gives meaningful, actionable results that can make a clinically meaningful difference in the way we prevent or treat disease.
Some of the most common medical tests are biomarkers: serum creatinine to estimate kidney function, levels of liver enzymes to evaluate liver function, and blood pressure to predict the risk of stroke. The search for new biomarkers has exploded in the past several years with the growing understanding of the molecular nature of many diseases. Cancer therapy is among the most promising areas for biomarkers, with tests like HER2 (to predict response to trastuzumab) or the KRAS test (to predict response to EGFR inhibitors like cetuximab and panitumumab) guiding drug selection. It’s a very attractive target: rationally devising drugs based on specific disease characteristics, and then using biomarkers to identify, a priori, the patients most likely to respond to treatment.
Despite their promise, the resources invested, and isolated winners, biomarker research has largely failed to live up to expectations for some time. Most recently, David Gorski discussed how the hype of personalized medicine hasn’t yet materialized into truly individualized treatments: not because we’re not trying, but because it’s really, really hard work. I’ve also pointed out that direct-to-consumer genetic testing, some of which relies on biomarkers, is a field still not ready for prime time, where the marketing outpaces the science. The reality is that few new biomarker tests have been implemented in clinical practice in the past decades. For many medical conditions, we continue to rely on traditional methods for diagnosis. Yet the promise of biomarkers is tantalizing. Every major conference heralds some new biomarker that sounds predictive and promising. So we have a hot scientific field, lots of preliminary research, multiple targets and approaches, and significant financial interests at play. Sound familiar? It’s exactly the setting described by Ioannidis for therapeutic studies in his well-known paper, Why Most Published Research Findings Are False. And based on this latest paper, the biomarker literature seems to share characteristics with the literature on medical interventions, which Ioannidis studied in another well-known paper, Contradicted and Initially Stronger Effects in Highly Cited Clinical Research.
This newest paper, which was published earlier this month, sought to evaluate whether highly cited studies of biomarkers were accurate when compared to subsequent meta-analyses of the same biomarker relationships. To qualify, each study had to have been cited more than 400 times and had to have a matching subsequent meta-analysis of the same biomarker relationship conducted as follow-up. To reduce the field from over 100,000 studies down to something manageable, results were restricted to the 24 high-impact journals with the most biomarker research. Thirty-five base papers, published between 1991 and 2006, were ultimately identified. These were well-known papers; some have been cited over 1000 times. For each paired comparison, the largest individual study in each meta-analysis was also identified and compared to the original highly cited study. Biomarkers identified included genetic risk factors, blood biomarkers, and infectious agents. Outcomes were mainly cancer or cardiovascular-disease related. Most of the original relationships identified were statistically significant, though four were not.
So did the original association hold up? Usually, no. Of that sample of 35, subsequent meta-analysis failed to substantiate as strong a link 83% of the time. And 30 of the 35 reported a stronger association than was observed in the largest single study of the same biomarker. When the largest studies of these biomarkers were examined, just 15 of the 35 original relationships were still statistically significant, and only half of these 15 seemed to remain clinically meaningful. For example, homocysteine used to be kind of a big deal, after a small study observed a strong correlation between levels of this biomarker and cardiovascular disease. The most well-known study has been cited in the literature 1451 times, and reported a whopping odds ratio of 23.9. Subsequent analyses of homocysteine failed to show such a strong association. Nine years after the initial trial, a meta-analysis of 33 trials with more than 16,000 patients calculated an odds ratio of 1.58. Yet this finding has been infrequently cited in the literature: only 37 citations to date.
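To make the arithmetic behind numbers like these concrete, here is a minimal Python sketch of how an odds ratio and an approximate 95% confidence interval are computed from a 2×2 table. The counts are invented purely for illustration and are not taken from the homocysteine studies or from the Ioannidis paper. The interval uses the standard log (Woolf) approximation, which is my choice for the sketch, not a method the paper specifies.

```python
import math

def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Return (odds ratio, lower, upper) using the log (Woolf) approximation."""
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR): it shrinks as the cell counts grow.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical small study: 30 cases and 30 controls (made-up counts).
small = odds_ratio_ci(20, 10, 5, 25)

# Hypothetical large study: same proportions, 50 times as many participants.
large = odds_ratio_ci(1000, 500, 250, 1250)

for label, (or_, lo, hi) in (("small", small), ("large", large)):
    print(f"{label}: OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

Both made-up tables give the same point estimate, but the small study’s interval runs from roughly 3 to 34, while the large study’s runs from roughly 8.4 to 11.9. A dramatic odds ratio from a small study is compatible with a very wide range of true effects.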
The authors identify a number of reasons why these findings may be observed. Many of the widely cited studies were preliminary and had small sample sizes. The drive to publish could have led to selective reporting of significant findings. The preliminary studies often preceded the meta-analyses by several years, giving ample time for citations to accrue (though this was not always the case, and in some cases the highly cited studies followed larger studies). Limitations identified included the biomarker selection process, which involved several arbitrary steps, including the citation threshold and the requirement for a paired meta-analysis. The authors warn readers to be cautious when authors cite single studies and not meta-analyses, and conclude with the following warning:
While we acknowledge these caveats, our study documents that results in highly cited biomarker studies often significantly overestimate the findings seen from meta-analyses. Evidence from multiple studies, in particular large investigations, is necessary to appreciate the discriminating ability of these emerging risk factors. Rapid clinical adoption in the absence of such evidence may lead to wasted resources.
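Two of the explanations the authors offer, small samples and a preference for significant findings, are enough on their own to produce this kind of overestimate. The following Python sketch simulates that mechanism under assumptions that are entirely mine: a true odds ratio of 1.5, 30% biomarker prevalence among controls, 50 cases and 50 controls per study, and a crude rule that only positive, statistically significant results get attention. None of these values come from the paper.

```python
import math
import random

TRUE_OR = 1.5            # assumed modest true association (hypothetical)
CONTROL_EXPOSURE = 0.3   # assumed share of controls with the marker (hypothetical)
N_PER_GROUP = 50         # cases and controls per simulated small study
N_STUDIES = 2000         # number of simulated studies

def exposure_prob_in_cases(true_or, p_control):
    """Exposure probability among cases implied by the true odds ratio."""
    odds = true_or * p_control / (1 - p_control)
    return odds / (1 + odds)

def simulate_study(rng, n, p_case, p_control):
    """Simulate one case-control study; return (log odds ratio, standard error)."""
    a = sum(rng.random() < p_case for _ in range(n))     # exposed cases
    c = sum(rng.random() < p_control for _ in range(n))  # exposed controls
    # Add 0.5 to every cell (Haldane-Anscombe correction) so an empty cell
    # cannot cause a division by zero.
    a_, b_, c_, d_ = a + 0.5, n - a + 0.5, c + 0.5, n - c + 0.5
    log_or = math.log((a_ * d_) / (b_ * c_))
    se = math.sqrt(1 / a_ + 1 / b_ + 1 / c_ + 1 / d_)
    return log_or, se

def geometric_mean_or(log_ors):
    """Average on the log scale, then convert back to an odds ratio."""
    return math.exp(sum(log_ors) / len(log_ors))

rng = random.Random(2011)  # fixed seed so the run is repeatable
p_case = exposure_prob_in_cases(TRUE_OR, CONTROL_EXPOSURE)
studies = [simulate_study(rng, N_PER_GROUP, p_case, CONTROL_EXPOSURE)
           for _ in range(N_STUDIES)]

# Crude stand-in for selective attention: keep only studies reporting a
# positive association with z > 1.96.
significant = [log_or for log_or, se in studies if log_or / se > 1.96]

all_or = geometric_mean_or([log_or for log_or, _ in studies])
sig_or = geometric_mean_or(significant)

print(f"true OR: {TRUE_OR}")
print(f"average OR across all {N_STUDIES} studies: {all_or:.2f}")
print(f"average OR across {len(significant)} significant studies: {sig_or:.2f}")
```

In this setup, a study this small can only reach significance if its estimate lands well above the assumed true odds ratio, so the “significant” subset is inflated by construction. Averaging over all simulated studies, which is closer to what a meta-analysis does, pulls the estimate back toward the true value.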
The editorial that accompanied the article (also paywalled) echoes the cautions and concerns in the paper:
It would be premature to doubt all scientific efforts at marker discovery and unwise to discount all future biomarker evaluation studies. However, the analysis presented by Ioannidis and Panagiotou should convince clinicians and researchers to be careful to match personal hope with professional skepticism, to apply critical appraisal of study design and close scrutiny of findings where indicated, and to be aware of the findings of well-conducted systematic reviews and meta-analyses when evaluating the evidence on biomarkers.
More of the (Fake) Decline Effect? No.
The so-called “Decline Effect” has been discussed at length here at SBM. The popular press seems to be quick to reach for unconventional explanations of the weakening of scientific findings under continued scrutiny. Steven Novella discussed a related case earlier this month, pointing out there’s no reason to appeal to quantum woo, when the decline effect is really just the scientific process at work: adding precision and reducing uncertainty through continued analysis.
Biomarker research parallels therapeutic research, with all the same potential biases. The earliest and often most highly cited results may ultimately turn out to be inaccurate and quite possibly significantly overstated. Trial registration and full disclosure of all clinical trials will help us understand the true effect more quickly. But that alone won’t solve the problem if we continue to attach significant merit to preliminary data, particularly where there is only a single study. Waiting for confirmatory research is hard to do, given our propensity to act. But a conservative approach is probably the smartest one, given the pattern we’re seeing in the literature on biomarkers.
Ioannidis JP, & Panagiotou OA (2011). Comparison of effect sizes associated with biomarkers reported in highly cited individual articles and in subsequent meta-analyses. JAMA : the journal of the American Medical Association, 305 (21), 2200-10 PMID: 21632484