The NIH funding process: “Conformity” and “mediocrity”?

When we refer to “science-based medicine” (SBM), it is a very conscious choice to emphasize that good medicine should be based on a solid foundation of science. The name was coined to contrast the difference between the current evidence-based medicine (EBM) paradigm, which fetishizes randomized clinical trial evidence above all else and frequently ignores prior plausibility based on well-established basic science, and the SBM paradigm, which takes prior plausibility into account. The purpose of this post will not be to resurrect old discussions on these differences, but before I attend to the study at hand I bring this up to emphasize that progress in science-based medicine requires progress in science. That means all levels of biological (and even non-biological) basic science, which forms the foundation upon which translational science and clinical trials can be built. Without a robust pipeline of basic science progress upon which to base translational research and clinical trials, progress in SBM will slow and even grind to a halt.

That’s why, in the U.S., the National Institutes of Health (NIH) is so critical. The NIH funds large amounts of biomedical research each year, which means that what the NIH will and will not fund can’t help but have a profound effect shaping the pipeline of the basic and preclinical research that ultimately leads to new treatments and cures. Moreover, NIH funding has a profound effect on the careers of biomedical researchers and clinician-scientists, as having the “gold standard” NIH grant known as the R01 is viewed as a prerequisite for tenure and promotion in many universities and academic medical centers. Certainly this is the case for basic scientists; for clinician-scientists, having an R01 is certainly highly prestigious, but less of a career-killer if an investigator is unable to secure one. That’s why NIH funding levels and how hard (or easy) it is to secure an NIH grant, particularly an R01, are perennial obsessions among those of us in the biomedical research field. It can’t be otherwise, given the centrality of the NIH to research in the U.S.

It’s also why the current hostile NIH funding environment, with pay lines routinely in the range of the 7th percentile, has brought this issue to the fore once again, and when NIH funding levels come to the fore, inevitably the topic of the peer review of NIH grants comes to the fore with it. The system by which NIH grants are reviewed involves what is known as a study section, which consists of scientists with (hopefully) the relevant expertise to evaluate the grants submitted, who all read and review a certain number of grants. They then meet, usually in Bethesda but increasingly by video conference, to discuss and score the proposals. Having participated in a number of NIH study section meetings as an ad hoc reviewer, I have some appreciation for the process, which sometimes involves a lot of contentious discussion and other times is amazingly cordial.

Regular readers know that many of us here at SBM are great admirers of John Ioannidis, who is best known for an analysis he published several years ago entitled Why most published research findings are false. Personally, I’ve commented on a couple of Ioannidis’ other publications, including an analysis of the life cycle of translational research (hint: it takes a loooong time for an idea to make it through basic science studies to clinical trials to become an accepted therapy). This time around, Ioannidis has published, with co-author Joshua M. Nicholson, a commentary in Nature entitled Research Grants: Conform and Be Funded, which is about the very issue of which sorts of grants the NIH funds. As is often the case, Ioannidis is provocative in making his point. As is less often the case, I’m not entirely sure he’s on-base here.

The issue of whether the NIH supports “safe” and “unimaginative” science is something that scientists have been debating since long before I ever got into the business. It’s a question that particularly comes up during harsh funding times (like now). Back when I was in graduate school 20 years ago, the NIH was in another “bust” phase of a boom-or-bust cycle, and pay lines were as tight as they are now. Well do I remember seeing two different tenured professors forced to shut their labs down because they could no longer secure funding after successfully having done so for a long time before that. In any case, it makes a lot of sense that during tight funding times the NIH would become more conservative in what it funds. After all, when money’s tight you don’t want to risk wasting it. During such times, study sections have been observed to require more and more preliminary data for grant applications and be more and more critical of ideas that don’t look on the surface like a “slam dunk.” A measure of arbitrariness to the funding also sets in. After all, the process isn’t so objective that there is a clear difference between a grant scoring in the 5th percentile (fundable) and the 10th percentile (probably not fundable).

Ioannidis, as he frequently tries to do with many things, tries to quantify the level of conservativeness and conformity of the NIH review process, introducing the concept this way:

The NIH has unquestionably propelled numerous medical advances and scientific breakthroughs, and its funding makes much of today’s scientific research possible [1].

However, concern is growing in the scientific community that funding systems based on peer review, such as those currently used by the NIH, encourage conformity if not mediocrity, and that such systems may ignore truly innovative thinkers [2, 3, 4]. One tantalizing question is whether biomedical researchers who do the most influential scientific work get funded by the NIH.

The influence of scientific work is difficult to measure, and one might have to wait a long time to understand it [5]. One proxy measurement is the number of citations that scientific publications receive [6]. Using citation metrics to appraise scientists and their work has many pitfalls [7], and ranking people on the basis of modest differences in metrics is precarious. However, one uncontestable fact is that highly cited papers (and thus their authors) have had a major influence, for whatever reason, on the evolution of scientific debate and on the practice of science.

So basically, you can see where this is going. Ioannidis is going to try to analyze whether the most “influential” scientists are NIH-funded. One can quibble about whether the most cited papers are truly the most “influential,” of course. After all, sometimes the most cited papers are influential in a bad way or are cited as an example of a paradigm that was later rejected. However, there aren’t any really good ways of measuring a scientist’s influence; citations are probably about as useful a way as one can come up with. At least it’s an objective number that can be measured. After his analysis, Ioannidis also makes some interesting, although perhaps unworkable, suggestions for improving the process.

First, let’s look at the key finding of the entire exercise. Nicholson and Ioannidis first identified 700 papers in biomedical research journals published since 2001 that have received 1,000 or more citations. They then examined the record of the primary author of those papers. It’s fairly amazing what a select group this is. There were more than 20 million papers published worldwide between 2001 and 2012, of which only 1,380 had received 1,000 or more citations. Of these, 700 were catalogued in the life or health sciences and had an author affiliation in the U.S. These 700 papers produced 1,172 discrete single, first, or last authors. The reason Ioannidis concentrated on these authors is that in biomedical research publications, the first author is usually the one who did most of the work and wrote the paper, while the last author is usually the principal investigator (PI); i.e., the researcher in whose laboratory the research was carried out. Frequently, therefore, the first author is a graduate student or postdoctoral fellow working in the laboratory of the PI.

So what were the results? Because I’m going on vacation and feeling a little lazy right now, I’ll let Ioannidis describe it:

We discovered that serving on a study section is not necessarily tied to impact in the scientific literature (see ‘Is funding tied to impact?’). When we cross-checked the NIH study-section rosters against the list of 1,172 authors of highly cited papers, we found only 72 US-based authors who between them had published 84 eligible articles with 1,000 or more citations each and who were current members of an NIH study section. These 72 authors comprised 0.8% of the 8,517 study-section members. Most of the 72 (n = 64, 88.9%) currently received NIH funding.

We then randomly selected 200 eligible life- and health-science papers with 1,000 or more citations (analysing all 700 would have required intensive effort and yielded no extra information in terms of statistical efficiency). We excluded those in which the single, first or last author was a member of an NIH study section, and those in which the single or both the first and last author were not located in the United States on the basis of their affiliations at the time of publication. This generated a group of 158 articles with 262 eligible US authors who did not participate in NIH study sections. Only the minority (n = 104, 39.7%) of these 262 authors received current NIH funding.
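As a quick sanity check on the arithmetic in the two quoted paragraphs, here is a minimal Python sketch that re-derives the three percentages from the raw counts. The counts are taken directly from the quotation; the variable and function names are mine and purely illustrative.

```python
# Counts as quoted from the Nicholson and Ioannidis commentary.
STUDY_SECTION_MEMBERS = 8517          # all current study-section members
HIGHLY_CITED_MEMBERS = 72             # members with a paper cited 1,000+ times
HIGHLY_CITED_MEMBERS_FUNDED = 64      # of those, currently NIH-funded
HIGHLY_CITED_NONMEMBERS = 262         # sampled highly cited US authors not on a study section
HIGHLY_CITED_NONMEMBERS_FUNDED = 104  # of those, currently NIH-funded


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


share_of_members = percent(HIGHLY_CITED_MEMBERS, STUDY_SECTION_MEMBERS)
members_funded = percent(HIGHLY_CITED_MEMBERS_FUNDED, HIGHLY_CITED_MEMBERS)
nonmembers_funded = percent(HIGHLY_CITED_NONMEMBERS_FUNDED, HIGHLY_CITED_NONMEMBERS)

print(f"Highly cited authors among study-section members: {share_of_members}%")
print(f"Highly cited members with current NIH funding:    {members_funded}%")
print(f"Highly cited non-members with current funding:    {nonmembers_funded}%")
```

The three results should agree with the 0.8%, 88.9%, and 39.7% figures given in the quotation.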

These data are presented in graphical form in the article.

So the basic finding is that researchers viewed as “highly influential” are often not on NIH study sections. Ioannidis notes that these rates of NIH funding are not any higher and might be worse than those of biomedical scientists in general. Ioannidis cites data suggesting that between 24% and 37% of biomedical researchers applying for grants from 2001 to 2011 were funded as principal investigators. True, Ioannidis points out that the funding rates for individual grants are considerably lower (after all, the pay lines over the last few years have been in the 7th percentile range) and that researchers submit multiple grants, resulting in a significant number of researchers ultimately receiving NIH funding.

There is one problem with using this particular range, however. The early part of the range is very different from the more recent part of the range as far as pay lines go. From 1998 to 2003, the NIH budget nearly doubled, the result of an initiative started under the Clinton Administration and completed under the Bush Administration. Truly, back then it was the land of milk and honey for researchers trying to compete for NIH grants, with pay lines well over the 20th percentile in some institutes. Then, after the doubling ended in 2003, the NIH came in for what has since become known as a “hard landing.” Budget increases did not keep up with inflation. Also, because grants funded during the time of the doubling could be as long as 5 years, the commitments from those grants remained for several years after the large budget increases ended, leaving less money for new grants. Thus, this period encompasses both a “time of plenty” that lasted until around 2003 or 2004 and the current drought, which got really severe after around 2006. Despite all sorts of moves by the NIH to increase the number of new grants, such as cutting the budgets of existing grants and newly awarded grants, the current situation shows no signs of abating any time soon.

All of this is why I would be curious whether Ioannidis’ estimate holds as well for the period from, say, 2006 to 2011 as it did for the earlier period from 2001 to 2005.

Whether it does or not, the finding remains that there doesn’t seem to be a discernible difference in NIH funding rates between these highly cited scientists and the rest of us hoi polloi, which does rather suggest that the current NIH system isn’t identifying the truly best and brightest. At least, this is what Ioannidis argues. First, however, he also points out another interesting observation. Study section members and non-members showed no significant difference in their total number of highly cited papers:

Among authors of extremely highly cited papers, study-section members and non-members showed no significant difference in their total number of highly cited papers, despite the fact that members of study sections were significantly more likely than non-members to have current NIH funding. This was true both for authors with multiple highly cited papers (13/13 versus 13/19, p = 0.024) and for those with a single eligible highly cited paper (51/59 versus 91/243, p < 0.0001) and overall in a stratified analysis (p < 0.0001).
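For readers who want to see how comparisons like the ones quoted above can be tested, here is a minimal Python sketch. It builds a 2×2 table of funded versus unfunded authors for members and non-members and applies an uncorrected Pearson chi-square test. The quoted passage does not say which test the authors used, so the choice of test is my assumption, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom.
    Assumption: the source does not name its test; this is one plausible choice.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Authors with multiple highly cited papers:
# members 13 funded / 0 unfunded, non-members 13 funded / 6 unfunded.
multi_stat, multi_p = chi_square_2x2(13, 0, 13, 6)

# Authors with a single highly cited paper:
# members 51 funded / 8 unfunded, non-members 91 funded / 152 unfunded.
single_stat, single_p = chi_square_2x2(51, 8, 91, 152)

print(f"Multiple papers: chi2 = {multi_stat:.2f}, p = {multi_p:.4f}")
print(f"Single paper:    chi2 = {single_stat:.2f}, p = {single_p:.2e}")
```

Under this assumption, the first comparison should land close to the quoted p-value and the second far below 0.0001. Note that the first table contains a zero cell and small expected counts, so an exact test might be preferred there and could give a somewhat different answer.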

It’s important to clarify here. There is a reason why study section members are significantly more likely to have NIH funding, and that’s because the NIH invites holders of NIH grants, particularly R01s, to join study sections, and many do. Indeed, among the requirements for study section members is that they must be a principal investigator “on a research project comparable to those being reviewed.” In other words, to be an official standing member of a study section, in general you have to have an NIH grant, usually an R01 or larger. The rest of the study section is then rounded out with a rotating band of ad hoc reviewers picked for specific areas of expertise. That’s the point, and that’s why Ioannidis mentions this. The point is that being on a study section or holding an NIH grant has no correlation with being one of these highly cited scientists.

Another observation made by Ioannidis using the similarity or “match” score on the NIH RePORTER website, where you can find listings of all federally funded research projects, is that the grants of study section members were more similar to other currently funded grants than were non-members’ grants. In other words, the funded grants of members of NIH study sections resemble each other and grants in general funded by the NIH. I can certainly guess why this might be true. One of the pieces of advice I received when I was starting out (and that is given to lots of young investigators) is to find a way to get on a study section. The rationale is that by learning how the NIH evaluates grants you can learn how to craft grants more likely to be funded. It therefore makes sense that a certain level of conformity creeps in, and there’s little doubt that that conformity becomes more pronounced when funding is tight. Study sections and the NIH do not want to “waste” taxpayers’ money on risky projects.

Or, as Ioannidis puts it:

If NIH study-section members are well-funded but not substantially cited, this could suggest a double problem: not only do the most highly cited authors not get funded, but worse, those who influence the funding process are not among those who drive the scientific literature. We thus examined a random sample of 100 NIH study-section members. Not surprisingly, 83% were currently funded by the NIH. The citation impact of the 100 NIH study-section members was usually good or very good, but not exceptional: the most highly cited paper they had ever published as single, first or last author had received a median of 136 (90–229) citations and most were already mid- or late-career researchers (80% were associate or full professors). Only 1 of the 100 had ever published a paper with 1,000 or more citations as single, first or last author (see Appendix 1 of Supplementary Information for additional citation metrics).

And, from a news report about Ioannidis’ study:

Top scientists are familiar with NIH’s penchant for the safe and incremental. Years ago, biologist Mario Capecchi of the University of Utah applied for NIH funding for a genetics study with three parts. The study section liked two of them but said the third would not work.

Capecchi got the grant and put all the money into the part the reviewers discouraged. “If nothing happened, I’d be sweeping floors now,” he said. Instead, he discovered how to disable specific genes in animals and shared the 2007 Nobel Prize for medicine for it.

Although I think Ioannidis definitely has a point, anecdotes like this are rife in the research world. Usually, they take the form of scientists with ideas that they couldn’t persuade the NIH to fund despite multiple grant applications but that later turned out to be revolutionary or to lead to highly useful new therapies. The most common such anecdote that I hear is that of Dennis Slamon, who proposed targeting the HER2 oncogene in breast cancer. I’ve touched on his saga before, along with the whole issue of supporting risky science versus safer, more incremental science. It’s the issue of “going for the bunt versus swinging for the fences.” Basically Slamon likes to go on and on about how he had trouble getting NIH funding to develop a humanized monoclonal antibody against HER2, but, as I pointed out, the story was actually a lot more complex than that. For instance, Slamon’s applications were submitted around a time when other scientists were having difficulty replicating their results, and very likely study section members knew that. Also, Slamon had no trouble getting NIH funding for other projects at the time.

To me, these anecdotes represent the scientific variant of the mad scientist in horror movies ranting, “They thought me mad, mad, I tell you! But I’ll show them.” OK, I’m being a bit sarcastic, but these stories are so ubiquitous whenever anyone complains about the “conformity” and conservativeness of the NIH review process that I tend to want to gag every time I read one of them. They’re pure confirmation bias. Yes, occasionally the daring or bizarre idea will pay off. However, far more often risky ideas do not pan out because, well, they are risky. Most risky ideas fail. It’s very easy to recognize innovative ideas with a high potential for impact in retrospect. With the benefit of hindsight, we now know that Slamon and Capecchi have made huge contributions to science. At the time they were doing their seminal work, it wasn’t nearly as obvious. Both Slamon and Capecchi got lucky. Such is the fate of riskier ideas that it could just as easily have gone the other way and their big ideas gone nowhere.

But back to Ioannidis. I see a lot of problems with his analysis, not the least of which is his metric. For one thing, in some specialized fields, even papers with a very high impact would have difficulty reaching 1,000 citations, because there just aren’t enough scientists working in that field to produce such blockbusters. Another issue is consistency. Truly influential and creative scientists tend to produce multiple influential papers, and even an average scientist can stumble onto something. Even Ioannidis concedes that “one cannot assume that investigators who have authored highly cited papers will continue to do equally influential work in the future.” Indeed, how many of these papers analyzed by Ioannidis were one-shot papers in which the scientists who published them never published papers anywhere near as influential again? To be fair, Ioannidis does make a good point that such investigators have reached a bar that should entitle them to a chance to prove that they can keep doing such good work.

Finally, as NIH Director Francis Collins pointed out, scientists funded by the NIH have won 135 Nobel Prizes. The situation is not as clearly a problem as Ioannidis makes it sound. Besides, it is the very nature of science that “game changing” studies tend to be relatively few and far between. Most of the hard work of advancing science does come from incremental work, in which scientists build upon what has come before. We fetishize the “brave mavericks” and “geniuses,” and, yes, they are important, but identifying these geniuses at the time they are doing their work is not a trivial thing. Often the importance of their ideas and work is only appreciated in retrospect.

In the end, as much as I admire Ioannidis, I think he’s off-base here. It’s not that I don’t agree that the NIH should try to find ways to fund more innovative research. However, Ioannidis’ approach to quantifying the problem seems to suffer from flaws in its very conception. In light of that, I can’t resist revisiting the discussion in my last post on the question of riskiness versus safety in research, and that’s a simple question: What’s the evidence that funding more risky research will result in better research and more treatments? We have lots of anecdotes of scientists whose ideas were later found to be validated and potentially game-changing who couldn’t get NIH funding, but how often does this really happen? As I’ve pointed out before, the vast majority of “wild” ideas are considered “wild” precisely because they are new and there is little good support for them. Once evidence accumulates to support them, they are no longer considered quite so “wild.” We know today that the scientists whose anecdotes of woe describe the depredations of the NIH were indeed onto something. How many more proposed ideas seemed innovative at the time but ultimately went nowhere?

Ioannidis does bring up a disturbing point, namely that scientists who have authored highly influential papers are apparently no more likely to achieve NIH funding than the rest of us and, more relevant, that scientists on NIH study sections differ from those not on study sections only in their ability to persuade the NIH to fund them. NIH study sections are a lot of work, and there is also a culture there, with a definite “in” crowd. It is quite possible that the truly innovative thinkers and scientists don’t want to be bothered with the many hours of work that each study section meeting involves, which can require members to review five to ten large grants and then travel to Bethesda, and it is equally possible that such scientists “don’t play well with others” in the study section. It is certainly worthwhile to investigate whether this is the case and then to try to find a way to bring such creative minds into the grant review process. However, the assumption underlying Ioannidis’s analysis seems to be that there must be “bolts out of the blue” discovered by brilliant brave maverick scientists. It’s all very Randian at its heart. However, science is a collaborative enterprise, in which each scientist builds incrementally on the work of his or her predecessors. Bolts out of the blue are a good thing, but we can’t count on them, nor has anyone demonstrated that they are more likely to occur if the NIH funds “riskier research.” It’s equally likely that the end result would be a lot more dud research.

No one can say, and that’s the point.

Posted in: Basic Science, Clinical Trials, Politics and Regulation

7 thoughts on “The NIH funding process: “Conformity” and “mediocrity”?”

  1. Angora Rabbit says:

    Since no one is commenting, some quick thoughts between expts. But I am becoming less enamoured of Ioannidis- he’s got an axe to grind and is hiding it badly. IMO.

    “We excluded those in which the single, first or last author was a member of an NIH study section,”

    So this already creates a skew against NIH-funded authors. I don’t accept the omission given what I understand him to be testing.

    “…and those in which the single or both the first and last author were not located in the United States on the basis of their affiliations at the time of publication. This generated a group of 158 articles with 262 eligible US authors who did not participate in NIH study sections. ”

    I read this to mean during the study period (and was this permanent or ad hoc membership?) If current, it does not surprise me given that SS is a HUGE amount of work. Typical load is 10-12 proposals three times a year; I’ve known some with 15-20. We had only about a 2mo break between work-loads (I am old enough to remember big boxes arriving, and the accompanying sinking feeling). Regular membership is 4yrs, but reality is that new members ad hoc a few times before getting a permanent appt. I stayed an extra year to chair my study section which made a total of 6yrs service just for that SS. I am postponing a return to SS for as loooong as possible.

    Knowing all this, I don’t see how someone can have a hugely productive lab and be a permanent study section member. My 5+ years was a huge workload and rotating off felt like getting my life back.

    My CSR admin is struggling to get senior investigators back onto SS, because we are all ducking and running from her. I just did an ad hoc there and was surprised at how young the reviewers are – they still have the energy (and naivety) to do this.

    I don’t know what the answer is, but I do think that Ioannidis is missing some valuable points.

    “Only the minority (n = 104, 39.7%) of these 262 authors received current NIH funding.”

    And if he put back the first or last authors with funding, wouldn’t that percentage increase? Moreover, was he looking at funding of ALL authors or just first and last? I’ve been in positions where I’ve helped an unfunded baby prof by giving them last authorship, and I’ve been recipient of same in my early career. I doubt I’m unique with this. It looks to me like Ioannidis’ method would miss that.

  2. Amalthea says:

    I’m puzzled about the bit about excluding authors who weren’t in the US at the time of publishing. Wouldn’t it be possible for someone to head overseas after the main body of the work was done, leaving the touching up to other members?

  3. evilrobotxoxo says:

    I agree that a big problem is Ioannidis’s metric, which is based on individual papers, rather than the cumulative productivity of individuals. What I mean is that someone might publish ten high-impact papers that are all cited 500 times, but in his analysis that person is less productive than someone who published one paper that was cited 1000 times.

    However, I will say that my personal opinion is that the NIH is too risk-averse and too focused on basic science. For example, if you want to do something translational, that’s inherently more risky than basic science because you’re not just trying to figure something out, you’re trying to apply it to humans. So translational work is harder to get funding for, even though it is by definition more likely to have an actual impact on human health.

    A big part of the problem is how the NIH defines “risk.” To them, I think “risk” means the risk of not generating publications to demonstrate that their money was well-spent. I would say that a more important risk that they don’t seem to consider is the risk of obtaining results that are not actually important or useful. If you submitted a grant for a set of experiments that had a 20% chance of working, but the results would be guaranteed to be game-changing, you would never get funded. If you submitted a grant where the expts were 100% likely to work, but only 20% likely to be meaningful, your odds are much better.

  4. etatro says:

    I wouldn’t necessarily say that it is NIH who is risk-averse, but reviewers who are. Reviewers don’t seem to follow NIH’s instructions for reviewing/scoring, nor do they fully understand the guidelines in PA’s or RFA’s that grants are responding to, nor do they want to follow the “new” grant formatting system; they severely punish the lack of preliminary data when the RFA states that preliminary data aren’t necessary. You essentially need to do all the work first and ask for funding after. I have had reviewers actually say “this is risky,” without explaining why or what the “risk” is. Presumably the risk is negative findings. If I could guarantee beforehand that my experiments / clinical study would definitely yield positive results, I’d be guilty of CAM-type pseudoscience.

    Even if the program officers at the NIH really really think a project should go forward, if it cannot get a good score during the review process, which rewards safety and “guaranteed” results, then it won’t be funded.

    I’ve also had reviewers give sections a score of 2 (less than perfect, which is a 1), while writing “No weaknesses noted” in the written critiques section. How is that useful?

    I’ve had reviewers mistake which institute I’m at. They have dinged me for bringing up points that the SF424 instructions explicitly say to address.

    I’ve had reviewers criticize me for not addressing a particular issue that was underlined & bolded in the application. I’ve had reviewers completely and totally confuse my grant with someone else’s (criticizing my mouse-surgical methods, when there’s no mouse-surgery at all in the research plan). I’ve had one reviewer say that a plan was too modest for the timeframe and another reviewer say that a plan was overambitious for a given timeframe, for the same application, both counting it as a weakness.

    A program officer once told me that reviewers don’t read the entire application, which was a relief. Because if they did, then they demonstrated reading comprehension skill that wouldn’t qualify them to grade Bio 101 term papers.

    Reviewers seem to want conformity. They want you to write exactly how and what they write. The NIH seems to be trying very hard to change the culture. They changed the way the grants are structured, retooled guidelines, and give bonus points to new investigators; but the reviewers resist. The NIH is issuing RFA’s, PA’s, guidelines, and instructions that point in one direction, while the reviewers and review process push us in another.

    There doesn’t seem to be any easy way around this except to wait for people to retire. The only problem is that the individuals rising through the ranks to replace the old guard will have gotten there through groveling, conformity, flattery, and emulation.

    But I’m not bitter.

  5. Angora Rabbit says:

    @etatro, you make several excellent points. I agree it is less a question of risk-averse and more a question of preliminary data. It comes down to this: if the results are not what the PI expected, then what happens? As a reviewer, I see too many PIs who refuse to acknowledge anything but the anticipated outcome. As a PI, I use the “Pitfalls, Anticipated Outcomes” section, for every Aim, to explain how I’ll make lemonade if the results are the opposite. Tough to do now that we’re restricted to 15pp, but this is crucial to a proposal.

    You can bet that my proposals always have preliminary data to support each aim. But sadly this is much easier to do in a basic science proposal – I don’t know how one manages it in a clinical study with much higher overheads, labor, and expenses, not to mention the IRBs.

    I am, however, saddened by Etatro’s experiences of reviewers who cannot read. This is equally the fault of the study section Chair, who ought to be riding herd on her reviewers, and the CSR admin. I also wonder, however, if there is substantial cut-and-paste going on between review templates; I am certainly guilty of doing this as a reviewer, but then I always reread to make sure what I’ve written is accurate. In honesty, I don’t read the entire grant either – I do read the proposal body, abstract and vertebrate animals in detail, and then glance over the remaining pages (biosketches, budget, etc) looking for glaring issues.

    I don’t believe retirement will change the problem for the simple reason that study section membership is getting *younger* rather than older. Increasingly we are being put on SS shortly after getting that first R01. Four yrs later we are burned out and avoid going back on for another 10-20 yrs. When in reality it is the older folks who should be serving and sharing their expertise and mentoring to the next generation of reviewers and researchers. I was on my old SS last June after just 6-7yr away and I was shocked that, at a mere 52yrs, I was now one of the oldest reviewers out of 27. Honestly, that’s just wrong!

    I’d love to hear from other PIs and reviewers on this topic.

  6. etatro says:

    I realize this thread is dead …. but …

    I just found this post on the statistics blog, “Simply Statistics,” in which Steven Salzberg offers another look at Ioannidis’s data & methods. He also wrote a letter to Nature.
