RCT Plausibility Scale

After a few introductory paragraphs, I want to present a probability scale for estimating the value of a "prior" to plug into the formula for obtaining a Bayes Factor. The scale can help assign a value, but that value will still be an estimate, the non-quantitative element in Bayesian calculations. However, the checklist may at least provide some objective bases on which to hang a value, and that value would then make a semi-quantitative statement of its own. Although the value would retain some subjective quality, it would at least be backed by known quantities and laws of nature.

Begging your patience again, I became aware of this problem in 1999 when asked to moderate an online debate on "CAM" among four physicians. My role soon morphed into participant-debater when I could not get all to agree on what I thought was obvious common ground for proceeding with the discussion: 1) that concepts that violate scientific laws do not have to be subjected to randomized clinical trials (RCTs), and that trial results have to be interpreted in light of previous knowledge; and 2) that clinical trials cannot constitute adequate evidence in the absence of plausibility, because their results are too varied and inconsistent. The matter was precipitated by systematic reviews (SRs) showing efficacy of acupuncture in back pain. I was truly surprised when one of the participants (Dr. Edzard Ernst) assured me that indeed, RCTs were now the gold standard for efficacy. The debate went downhill from there.

I became fascinated with the disparity raised in the debate, and continued discussing the principle in an exchange in Academic Medicine with the then Director of NCCAM, Steve Straus, who wanted to use "rigorous trials" to prove or disprove "CAM" methods. I maintained that such an approach would lead to an infinite progression of indeterminacy.


Kim Atwood’s series on homeopathy and plausibility sums up our present state of knowledge and lack thereof, or, one could substitute several different subjects, what’s wrong with Evidence Based Medicine (EBM): EBM’s inability to describe accurately the state of knowledge of implausible medical proposals, and our understanding of Bayes’s theorem as applied to RCT outcomes and reviews of implausible and ineffective methods. I can’t think of much to add to his summaries. By the way, he may not have stated it, but he has been mulling this over for almost ten years to my knowledge, and if anyone has a handle on what’s wrong with the medical literature regarding sectarian “CAM,” it’s Kim.

By this time most of us on this blog and some other colleagues recognize that EBM methods, including those used by the Cochrane Collaboration, are necessary but insufficient to reach a realistic expression of confidence in clinical trial results. Most of us are familiar with Ioannidis’s article and the way he has gone about this using a simplified Bayes method. Kim gave an example a week or two ago.

I have in my own mind used a few shortcuts to sectarian method evaluation. I am fond of shortcuts that help extract one from the brambles of disputes about efficacy waged with insufficient and conflicting information. One way is to ask simply: how much worse off would we, or a patient, be if the method in question did not exist? It’s sort of a steal from O. W. Holmes’s famous comment on the materia medica of his time; his answer was that people would have been better off without it. Regarding most methods that concern us today, the “CAM” ones, the answer is that we would be either better off or no worse off without them. In the case of methods with complications, like chiropractic and herbs, we’d be better off without them. With methods lacking bad effects, such as homeopathy, we’d be just no worse off. But that assumes the methods are ineffective, which is obvious to us, but apparently not to others.

I will not go into the reason review experts (Cochrane’s and others) do not conclude ineffectiveness, and keep recommending “more clinical trials.” There is a reason, but that’s for another paper.

Kim’s and our problem, then, given RCTs already done, is how to establish ineffectiveness in presence of conflicting information without submitting every nutty idea to infinite numbers of trials.

The answer we’re coming to is to apply a Bayes Factor to the reported P values, in a way similar to that of Steven Goodman and John Ioannidis. Goodman took a range of three or four possible values for a prior probability and calculated the posterior for each assigned prior, so one had several possibilities from which to choose. Looking at his charts, the several possibilities are more revealing than one would have thought; the revised P value becomes much less significant in each of the examples. Kim Atwood presented another example a week or two ago.
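Goodman's approach can be sketched in a few lines. The minimum Bayes factor formula, exp(-z²/2), is his published result; the particular candidate priors below are my own illustrative choices, not values taken from his tables.

```python
from math import exp
from statistics import NormalDist

def min_bayes_factor(p_value):
    """Goodman's minimum Bayes factor, exp(-z^2/2): the strongest
    evidence against the null that a two-sided p value can supply."""
    z = NormalDist().inv_cdf(1 - p_value / 2)  # convert p value to z score
    return exp(-z * z / 2)

def posterior_null(prior_null, bayes_factor):
    """Update a prior probability that the null (no effect) is true,
    working in odds: posterior odds = prior odds * Bayes factor."""
    prior_odds = prior_null / (1 - prior_null)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)

bf = min_bayes_factor(0.05)        # about 0.15
for prior in (0.5, 0.75, 0.95):    # candidate priors that the method is ineffective
    print(prior, round(posterior_null(prior, bf), 2))
```

Note what this shows: even at the most favorable reading of a reported P of 0.05, a method judged 95% likely to be ineffective beforehand remains roughly 74% likely to be ineffective afterwards.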

But if one wanted to narrow the choices to one or two prior probability estimates, here is a checklist for use in estimating prior probabilities.

First, just list the usual classification of the sciences from the most basic to those dealing with the most complex, and grade the current evidence about the method on a scale from 0 to 10 for each, with 0 as the least consistent (or most inconsistent) and 10 as most highly consistent with the principles of that science.

Physics (and mathematics)

Chemistry

Pharmacology

Molecular biology

General biology

Other complex sciences (geology, botany, astronomy, etc.)

Then, apply a negative integer, 0 to -10, according to how well or how completely the phenomenon can be explained by another known science or sciences, especially experimental psychology (suggestion, misperception) and social psychology (cognitive dissonance, mass hysteria, etc.). This maneuver takes advantage of known information that offers a hidden logical reason for any observed positive effect.

From here, there are a number of ways for adding, subtracting values, and the option for multiplying by 0 in the case of a highly conflicting basis such as homeopathy, so that no matter how many plus values there are, the answer would still be zero.

Taking the most implausible example, homeopathy, one would assign a value of 0 for physics (violation of the first and second laws of thermodynamics, and of Boyle’s and Charles’s laws of gases/fluids), a 0 for chemistry (violation of the law of mass action), and a 0 or 1 for pharmacology (invalidity of the “law of similars”). Add the resulting values.
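One way the grades above might be combined, a sketch of my own rather than a settled formula: sum the per-science grades, apply the psychology penalty, and zero everything out when a basic science is flatly violated. The -8 psychology penalty below is a hypothetical number, not one assigned in the text.

```python
def raw_plausibility(science_grades, psych_penalty=0, fatal_violation=False):
    """Combine per-science consistency grades (each 0 to 10) into one raw
    score. psych_penalty is the 0 to -10 integer for how completely
    psychology alone explains the observations. fatal_violation applies
    the multiply-by-zero option for methods that contradict basic
    physics or chemistry outright."""
    total = sum(science_grades.values()) + psych_penalty
    return 0 if fatal_violation else max(total, 0)

# Homeopathy, using the grades assigned in the text and a hypothetical
# psychology penalty; the fatal-violation flag forces the score to zero:
homeopathy = {"physics": 0, "chemistry": 0, "pharmacology": 1}
print(raw_plausibility(homeopathy, psych_penalty=-8, fatal_violation=True))  # 0
```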

A more directly mathematical way would be to use a numerical scale from 0 to 1, with degrees of consistency expressed as a fraction or decimal (0.001, 0.1, 0.5, etc.), whose final sum or product could be plugged directly into the formula to modify the calculated P of the report. The highest score would be 1.0, which would confirm the calculated P value. Diminishing consistency with scientific laws and principles would diminish the calculated P proportionately.
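On the 0-to-1 scale, multiplication is the natural combining rule, since a single outright impossibility then drives the prior to zero by itself. A minimal sketch; the particular fractions are hypothetical:

```python
from math import prod

def prior_from_fractions(fractions):
    """Multiply per-discipline consistency fractions (each 0 to 1)
    into a single prior probability estimate; any one zero forces
    the whole product to zero."""
    return prod(fractions)

# Hypothetical grades for a merely implausible method:
print(round(prior_from_fractions([0.5, 0.1, 0.01]), 6))  # 0.0005
# A homeopathy-like method with one outright impossibility:
print(prior_from_fractions([0.0, 0.9, 0.9]))             # 0.0
```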

Once the scientific scale is applied, one could have the option of applying another scale based on non-scientific credibility (consider the source):

Economic history (involvement in previous scams and schemes, marketing of useless products, books on same)

Legal history (convictions, fines, licensure disciplinary actions, etc.)

Writings on other implausible claims and sectarian schemes (Scientology, etc.), vitamin promotion, etc.

Participation in pseudoscience meetings (Whole Life Expo, etc.)

These events and characteristics reveal degrees of lack of credibility of the individual whose work is being evaluated (Wirth/Cha prayer group, advocates of a mercury-autism link, raw milk promotion, Laetrile advocacy, etc.). A value between 0 and 1 for each, or for the combination, would further diminish the value assigned to the prior.
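The credibility scale could be bolted on the same way: each indicator becomes another 0-to-1 multiplier applied to the science-based prior. A sketch with entirely hypothetical numbers:

```python
def adjusted_prior(scientific_prior, credibility_multipliers):
    """Shrink the science-based prior by each 0-to-1 credibility
    multiplier (economic history, legal history, writings,
    pseudoscience-meeting participation)."""
    for multiplier in credibility_multipliers:
        scientific_prior *= multiplier
    return scientific_prior

# A 0.01 scientific prior, further discounted by a poor track record:
print(round(adjusted_prior(0.01, [0.5, 0.2]), 6))  # 0.001
```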

This is as far as I have been able to develop this idea. I grant its sometimes subjective quality, and that some qualities (convictions, fines) depend on other qualities in the scales, but I think they are important qualities, and adding them decreases credibility. At least they should not be ignored. Implausible schemes are attractive to a certain set of mind types; psychopathic and gullible mindset patterns mean something. They can lead to a cultural and institutionalized intellectual psychopathology and a scientific-social terrorism: infiltration of academic institutions, bribery (the massive funding behind movements), and law changes (DSHEA, licensing of quackery, Access to Medical Treatment Acts (AMTAs)).

These credibility indicators are thoughts that occur to us but that are often excluded from evaluations because of principles of the law, intellectual/academic political correctness, and sometimes just a sense of “fairness” however blinding that may be. I think they are significant when it comes to a “scale of credibility,” which is not a bad name for this.

Several days ago the Q’ometer blog addressed this problem with an interesting graphic using four quadrants and a plot of credibility vs. evidence, a variation on the theme of Kim Atwood’s fugue. A graphic plot would be a welcome addition for visualizing the probability scale values.

This method does not address directly the problem of “MA and SR indeterminacy.” (Gimme credit for that one too.) It addresses the RCTs that go into those SRs and MAs. Most SRs do not have single values to which one can apply a prior probability. There are other ways of handling SRs that we can explore later.

I am no professional mathematician, so have at it.

WS.

Posted in: Clinical Trials, Science and Medicine


8 thoughts on “RCT Plausibility Scale”

  1. Infophile says:

    I think I can see what you’re getting at with your method of estimation, but perhaps an example working all the way through would be useful. There are a few parts of it where one has to make assumptions about how you intend the numbers to be used (dividing by maximum score in the 0-10 scale, for instance).

    Beyond that, I don’t think multiplying by zero, even in the case of homeopathy, would be very useful at convincing anyone. If you start off by saying that it can’t work, of course your result will be a probability of zero. It should be fair to at least give them the one-in-a-googol shot that humanity has been collectively hallucinating the scientific findings of the last few centuries.

  2. Harriet Hall says:

    You can try to quantify, or you can just remember SBM = EBM + CT. Science-based medicine equals evidence-based medicine plus critical thinking.

    I have a proposed rule of thumb for meta-analyses: if the results are negative, you can probably believe them; if the results are positive, withhold judgment until you have looked carefully at the quality of the studies and at prior plausibility. And even when they pass those tests, withhold judgment! :-)

  3. pmoran says:

    Harriet, why just meta-analyses? It is proven that there is an overall bias towards positive results in drug company-funded studies. Negative studies may be even more likely to reflect the truth in studies of “alternative” methods, because these are typically designed by enthusiasts able to test the precise scenario where they believe their method performs best. There is also the difficulty of blinding with many alternative procedures.

    The Bayesian approach begs independent validation, something that breaks the circularity of reasoning that Infophile alludes to. The quite frequent negative trials of alternative methods help supply that. In fact, the naive EBM approach sort of works if this bias towards positive results is factored in, but it asymptotically approaches the truth only after very many studies have been performed and analysed. This is a waste of resources, partly because the inevitable positive studies sustain belief in some.

  4. Harriet Hall says:

    I was picking on meta-analyses because they compound the error. GIGO (garbage-in-garbage-out). Some people are under the misconception that you can take a lot of iffy studies and make them respectable with a meta-analysis. Because of factors like the ones you mention, positive results are likely to predominate and the good negative studies just get lost in the noise. This is true whether you’re talking about homeopathy or a new antibiotic.

  5. daedalus2u says:

    If reality or physiology were simple, then you could use a simple formula to decide these things. There are fundamental flaws in how people think about physiology. The (wrong) concept of homeostasis being one of the most grievous. Nothing in physiology is static. Having a wrong concept of some sort of “stasis” as your default concept is going to lead to wrong conclusions.

    The recent stoppage of the trial of tightening blood sugar control to what is considered “normal” is a good example. The fundamental hypothesis of the trial was that elevated blood sugar is a sign of dysregulation in glucose physiology, and that doing things to keep blood sugar in a narrow “normal” range will improve health. What the trial showed is that it is not that simple: not every instance of elevated blood sugar is detrimental. We already knew that (or should have known), because the only time blood sugar is constant is at rest. Under stress, even in fasted individuals, blood sugar goes up. Is that “dysregulation”? No, it isn’t. It is the body mobilizing glucose reserves and releasing them to peripheral tissues in anticipation of increased need. If the body doesn’t get glucose to those peripheral tissues, those tissues will experience glucose depletion and ATP depletion, will be unable to perform their called-upon functions, and the consequences will be adverse.

    Is the elevated blood sugar of the metabolic syndrome pathological or adaptive? A good question since the “cause” of the metabolic syndrome remains unknown. Is it even dysregulation? There are two types of dysregulation, good regulation around a bad setpoint, and bad regulation around a good setpoint. I suspect that diabetes type 2 is good regulation around a bad setpoint and that diabetes type 1 is the consequence of bad regulation around a good setpoint. You can have both simultaneously. Bad regulation has both positive and negative deviations. A negative deviation in blood sugar, due to too much insulin, can be promptly fatal. The ablation of the pancreatic islet by the immune system may be a mechanism to prevent this. Protective in moderation, fatal when carried to an extreme.

  6. A paper in this week’s BMJ is a good example of the need for Bayesian analysis.

    It purports to show that acupuncture is an effective adjuvant for in vitro fertilization. An accompanying noncritical editorial is here:

    Although it is a SR and MA, point estimates are given and the individual trials are cited, so it may be possible to subject the claim to quantitative Bayesian analysis. I would love for someone to take a stab at it.

    And, if you want to use Dr. Sampson’s scale of credibility, consider that BMJ publishes many quackery promoting articles—more than any journal I know except for those that are dedicated to CAM, thus earning it a place on Steve Barrett’s list of nonrecommended periodicals:

    I posted concerning this earlier today:

Comments are closed.