Aug 29 2012
The Power of Replication – Bems Psi Research
18 Responses to “The Power of Replication – Bems Psi Research”

[...] Note – This article was also cross-posted at Science-Based Medicine. [...]
re: “It highlights the role of researcher degrees of freedom in generating false positive data, and replication as one solution to this problem.”
Isn’t there some tension between this point, and a desire to create a greater role for the assumptions researchers make about plausibility?
If those researching a particular area share certain prejudices, and this then leads to the generation of false positive data, then the problem could become self-reinforcing. To some extent, this will depend upon the area being studied, the ease of gathering meaningful objective data, etc… but there are areas where medicine touches upon important moral and political issues, and vested interests will have strong desires for data to be interpreted in particular ways (I mentioned in another comment the biopsychosocial reforms to disability benefit taking place in the UK).
Not fond of the researcher “degrees of freedom” language, since it’s nothing like what the term is really about, and I don’t want the pain of hearing docs who don’t know the real meaning say it, which will happen if the term becomes trendy. The meanings of synergy, gene/locus, and predictive have all already died the death at the doctors’ hands. Can you leave us something? Flexibility is a good word.
We can restart an experiment, try the same experiment multiple times (RT-PCR – it’s easy to do it until you get an error in your favor), ignore some samples that are against us (declaring them to have problems), get a few more data-points to see if it helps, and try 7 different transforms and statistical tests, all while telling ourselves we aren’t lying, somehow.
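To put a rough number on one of the habits listed above ("get a few more data-points to see if it helps"), here is a minimal Python sketch that is not part of the original comment. It simulates studies in which no real effect exists. It assumes a simple two-sided z-test with known unit variance and a hypothetical peeking rule: test at 20 data points, and if p is not below 0.05, add 10 more and test again, up to 60. All function names and parameters are illustrative assumptions.

```python
import random
from statistics import NormalDist


def p_value(sample):
    """Two-sided z-test p-value for 'mean is 0', assuming known unit variance."""
    z = sum(sample) / len(sample) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def fixed_n_study(rng, n=20):
    """One honest study: collect n points of pure noise, test once."""
    sample = [rng.gauss(0, 1) for _ in range(n)]
    return p_value(sample) < 0.05


def flexible_study(rng, start=20, step=10, max_n=60):
    """One 'flexible' study: keep adding noise and re-testing until p < 0.05."""
    sample = [rng.gauss(0, 1) for _ in range(start)]
    while True:
        if p_value(sample) < 0.05:
            return True  # stop and report as soon as it looks significant
        if len(sample) >= max_n:
            return False  # give up only when the data budget runs out
        sample.extend(rng.gauss(0, 1) for _ in range(step))


def false_positive_rate(study, trials=20000, seed=1):
    """Fraction of null-effect studies that end up 'significant'."""
    rng = random.Random(seed)
    return sum(study(rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print("fixed n:   ", false_positive_rate(fixed_n_study))
    print("flexible n:", false_positive_rate(flexible_study))
```

Under these assumptions the fixed-sample study should come out significant about 5% of the time, while the peeking version comes out significant noticeably more often, even though each individual test looks legitimate.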
Thank you. Another tool in my growing kit of poor-quality study detecting. I now have a place to go where all the debunking techniques are archived in one place.
I think this blog may be more widely read than I thought. A couple of woo-ish acquaintances have stopped writing/calling after I mentioned this blog as my source for rebutting their claims of scientific support for things such as acupuncture or even (!) homeopathy. The price of using one’s real name. Oh well–I can only hope they will keep reading.
“Replication involves gathering an entirely new data set, so any prior random patterns would not carry forward. Only if there is a real effect should the new data reflect the same pattern.”
A real effect but not necessarily the hypothesised real effect.
“All experiments in psychology are not of this type, however. For
example, there have been many experiments running rats through all
kinds of mazes, and so on–with little clear result. But in 1937
a man named Young did a very interesting one. He had a long
corridor with doors all along one side where the rats came in, and
doors along the other side where the food was. He wanted to see if
he could train the rats to go in at the third door down from
wherever he started them off. No. The rats went immediately to the
door where the food had been the time before.
The question was, how did the rats know, because the corridor was
so beautifully built and so uniform, that this was the same door
as before? Obviously there was something about the door that was
different from the other doors. So he painted the doors very
carefully, arranging the textures on the faces of the doors exactly
the same. Still the rats could tell. Then he thought maybe the rats
were smelling the food, so he used chemicals to change the smell
after each run. Still the rats could tell. Then he realized the
rats might be able to tell by seeing the lights and the arrangement
in the laboratory like any commonsense person. So he covered the
corridor, and still the rats could tell.
He finally found that they could tell by the way the floor sounded
when they ran over it. And he could only fix that by putting his
corridor in sand. So he covered one after another of all possible
clues and finally was able to fool the rats so that they had to
learn to go in the third door. If he relaxed any of his conditions,
the rats could tell.” http://www.lhup.edu/~DSIMANEK/cargocul.htm
IMHO, the failure of many medical scientists and psychologists to recognise the pathologies and the irrationality in the frequentist methods and ‘system’ of inference is an error which pales into insignificance next to the failure to understand why claims to have established retrocausality etc. would need something more than just impressively significant and replicable results before they could be taken seriously.
- This large, rigorous, and negative replication establishes that studies published in peer-reviewed journals with positive and solid-appearing results can still be entirely wrong. It therefore justifies initial skepticism toward any such data, especially when extraordinary claims are involved.
Indeed. Here is a cautionary tale. You may have heard that obesity, divorce, and even loneliness can spread through social networks. The obesity claim was published in the New England Journal of Medicine. Mathematician Russell Lyons completely debunked it, along with the rest of these claims, which were made by researchers at Harvard. But NEJM refused to publish his rebuttal. He had great difficulty getting it published at all, and finally got it into an obscure journal. Christakis and Fowler, who made these claims, have sailed on unscathed, despite the established fact that their entire corpus of work in this area is total bullshit. NEJM has not printed any correction or retraction, and evidently never will.
That (cervantes’ cautionary tale) reminds me of sTeamTraen’s recent debunking work (in collaboration with Alan Sokal!):
http://www.badscience.net/forum/viewtopic.php?f=3&t=26673
http://www.badscience.net/forum/viewtopic.php?f=3&t=26673&start=25#p843471
which I hope won’t be rejected.
In reading this, I just realized that the inability of other researchers to “replicate” Bem’s result (submitting a manuscript to a certain journal and getting it published) demonstrates that what they tried to replicate (publishing in that journal) was not repeatable and so was not the result of a “scientific process”.
We covered the Bem study at the recent Skeptic’s Toolbox workshop in Eugene, Oregon. Ray Hyman told us he had a lot of experience as a reviewer of articles for psychological journals, and he was amazed that this one got by the peer reviewers and editors. He said he had made a list of 100 flaws in Bem’s paper, any ONE of which was grounds for rejection.
I think the FDA drug approval system is a useful model of replication. As I understand it, a drug goes through 3 phases: Phase I demonstrates relative safety, establishes dosage, etc. Then in Phase II, some degree of efficacy is demonstrated. If successful, Phase III is started to see if efficacy is replicated in a large trial. As a result of having to prove efficacy twice, ineffective drugs rarely get approved (of course, some may sneak through or adverse events may be missed, etc., but these are relatively rare).
In comparison, typical alternative medicine techniques are usually only held to phase I standards (“this has no harmful side effects!”). Occasionally, they are held to phase II standards (“In this small, poorly designed study, we saw an effect, p=0.049”). I don’t know of any alternative medicine technique actually getting the phase III treatment. Of course, if they were successful in Phase III, we’d probably call them medicine.
This, in my mind, is the fundamental difference between science-based medicine and woo. True medicine has a high bar for evidence. If you want your brand of woo to get respect, you have to play by the same rules.
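The "prove efficacy twice" point can be illustrated with a little arithmetic. This is a minimal Python sketch, not from the original comment, and it rests on simplifying assumptions: each trial is independent, each uses the same significance threshold, and the direction of the effect is ignored. The function name and the 0.05 threshold are illustrative choices.

```python
def chance_null_effect_passes(alpha, independent_trials):
    """Probability that a treatment with no real effect is 'significant'
    in every one of several independent trials, each tested at `alpha`.
    Assumes independence and identical thresholds (a simplification)."""
    return alpha ** independent_trials


if __name__ == "__main__":
    for trials in (1, 2, 3):
        print(trials, chance_null_effect_passes(0.05, trials))
```

Under those assumptions, a useless treatment passes one trial about 1 time in 20, but passes two in a row only about 1 time in 400.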
[...] Novella describes the [...]
BKsea:
And perhaps most important, those 3 phases all have to test the same hypothesis. You can’t say “Our drug improved maze performance in 3 out of 5 lab rats, and then improved breathing in 7 out of 12 asthmatics, therefore we want to test it on 1500 diabetics because obviously there is something special about this drug.”
Actually, FDA’s default standard is “at least two adequate and well-controlled [clinical] studies, each convincing on its own, to establish effectiveness.” (see Section IIA, paragraph 2 here) But that’s not a hard and fast rule.
[...] In 2011, The Journal of Personality and Social Psychology published a paper by psychologist Daryl J. Bem, which claimed to show that human performance can be influenced by future events. There was criticism of the way the data was analysed in the study. Subsequent attempts to replicate Bem’s findings have failed (here and here). Steven Novella at Science-Based Medicine has the full story. [...]
[...] at Science-Based Medicine, Steve Novella writes about a newly published replication study appearing in the same journal where Bem’s original [...]
[...] to replicate, even if most people in the field believe that the study you are trying to replicate is obviously not [...]
How would your interpretation of Bayesian statistics go about assigning probabilities – by committee? I.e., “We think this idea is very unlikely, therefore we need an alpha of 0.001. But this idea we think is likely, and therefore we only need an alpha of 0.3.”? Is this what you’re suggesting? It sounds like it (more or less); you might as well create an Inquisition.
Prior Probability is assigned on the basis of the results of previous trials and on the basis of whether or not well established laws of physics, chemistry, physiology, and biology would need to be wrong in order for the therapy to work.
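As a rough illustration of how a prior interacts with a single "significant" result, here is a minimal Python sketch of Bayes' rule. It is not from the original comments. The power (0.8), the alpha (0.05), and the two example priors are hypothetical values chosen only for illustration, and the function name is made up.

```python
def posterior_real_effect(prior, power=0.8, alpha=0.05):
    """Probability the effect is real, given one significant result.

    prior: assumed probability the effect is real before the study
    power: assumed chance a real effect yields a significant result
    alpha: chance a non-existent effect yields a significant result
    """
    true_positive = power * prior
    false_positive = alpha * (1 - prior)
    return true_positive / (true_positive + false_positive)


if __name__ == "__main__":
    # Hypothetical priors: a plausible claim vs. one that would require
    # well established physics to be wrong.
    for prior in (0.5, 0.001):
        print(prior, round(posterior_real_effect(prior), 3))
```

With these assumed numbers, the same significant result leaves the plausible claim at about 0.94 and the implausible one at about 0.016. The significance threshold is the same in both cases; only the starting point differs.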