A loan officer sets up a meeting with an aspiring entrepreneur to inform him that his application has been denied. “Mr. Smith, we have reviewed your application and found a fatal flaw in your business plan. You say that you will be selling your donuts for 60 cents apiece.” “Yes,” says Mr. Smith, “that is significantly less than any other baker in town. This will give my business a significant competitive advantage!” The loan officer replies, “According to your budget, at peak efficiency the cost of supplies to make each donut is 75 cents. You will lose 15 cents on every donut you sell.” A look of relief comes over Mr. Smith’s face as he realizes the loan officer’s misunderstanding. He leans in closer and whispers to the loan officer, “But don’t you see, I’ll make it up in volume.”
If you find this narrative at all amusing, it is likely because Mr. Smith is oblivious to what seems like an obvious flaw in his logic.
A similar error in logic is made by those who rely on anecdote and other intrinsically biased information to understand the natural world. If one anecdote is biased, a collection of 12 or 1000 anecdotes multiplies the bias, and will likely reinforce an errant conclusion. When it comes to bias, you can’t make it up in volume. Volume makes it worse!
Unfortunately, human beings are intrinsically vulnerable to bias. In most day-to-day decisions, like choosing which brand of toothpaste to buy or which route to drive to work, these biases are of little importance. In making critical decisions, like assessing the effectiveness of a new treatment for cancer, these biases may make the difference between life and death. The scientific method is defined by a system of practices that aim to minimize bias in the assessment of a problem.
Bias, in general, is a tendency that prevents unprejudiced consideration of a question (paraphrased from dictionary.com). Researchers describe sources of bias as systematic errors. A few words about random and systematic errors will make this description clearer.
Random errors are unpredictable variations in a measurement; by definition, there is no telling how they will affect any single measurement. For small samples, random errors may lead to an incorrect conclusion. For instance, 4 consecutive “heads” or 4 consecutive “tails” are not rare in coin flipping (a fair coin produces one or the other in 1 of every 8 sets of 4 flips), but when they occur, they may give the false impression of an unfair coin. Larger samples decrease the likelihood that random errors will result in an errant conclusion. 1000 coin flips will rarely deviate very much from a 50/50 distribution. More data increases the confidence that a sample’s measurements approximate the true value.
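For readers who like to check such claims themselves, here is a minimal Python sketch of the coin-flip example. The function names, the number of repeated trials (200), and the fixed seed are my own choices for illustration, not part of any standard method. It computes the exact chance that 4 fair flips all match, then compares how far the observed share of heads strays from 50% in sets of 4 flips versus sets of 1000 flips.

```python
import random

def prob_all_same(n_flips):
    """Exact chance that n fair flips all land on the same side."""
    return 2 * 0.5 ** n_flips

def largest_deviation(n_flips, n_trials, rng):
    """Largest gap between the observed share of heads and 0.5 across trials."""
    worst = 0.0
    for _ in range(n_trials):
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        worst = max(worst, abs(heads / n_flips - 0.5))
    return worst

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
print("P(4 flips all match):", prob_all_same(4))
print("worst deviation, sets of 4 flips:", largest_deviation(4, 200, rng))
print("worst deviation, sets of 1000 flips:", largest_deviation(1000, 200, rng))
```

The small sets should wander much further from 50/50 than the large ones, which is the sense in which more data protects against random error.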
Systematic Errors (a.k.a. bias)
Bias is a non-random (or systematic) error which tends to distort results in a particular direction.
Larger sample sizes are not protective against bias. In fact, larger sample size increases the likelihood that bias will lead to an erroneous conclusion. Let’s say that we want to measure the birth weight of a population of inner-city newborn babies. Our scale is miscalibrated in a way that adds 8 ounces to the weight of every child. This will result in a systematic overestimation of the birth weight of the babies. If we weigh lots and lots of babies, we will develop very impressive statistics with very tight standard errors around the mean weight, but they will be wrong because our measurement is biased. If we do not recognize the bias, more data will make us more confident in our data, but will not make our data any less wrong!
Bias is the nemesis of researchers, and is difficult, if not impossible, to eliminate completely. The best researchers strive to minimize bias in their study designs, and acknowledge potential sources of bias they cannot eliminate.
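The miscalibrated scale can be sketched the same way. In the simulation below, only the 8-ounce bias comes from the example; the true mean weight (120 ounces), the spread (16 ounces), the sample sizes, and the function name are assumptions I made up for illustration. It weighs ever larger groups of simulated babies on the biased scale and reports the mean, its standard error, and how far the mean sits from the truth.

```python
import random
import statistics

SCALE_BIAS_OZ = 8.0    # from the example: the scale adds 8 ounces
TRUE_MEAN_OZ = 120.0   # assumed true mean birth weight, for illustration only
TRUE_SD_OZ = 16.0      # assumed spread of true weights, for illustration only

def weigh_babies(n, rng):
    """Return (mean, standard error) of n readings from the biased scale."""
    readings = [rng.gauss(TRUE_MEAN_OZ, TRUE_SD_OZ) + SCALE_BIAS_OZ
                for _ in range(n)]
    mean = statistics.fmean(readings)
    std_err = statistics.stdev(readings) / n ** 0.5
    return mean, std_err

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
for n in (10, 1000, 100000):
    mean, std_err = weigh_babies(n, rng)
    print(f"n={n:>6}  mean={mean:.1f} oz  std. error={std_err:.2f} oz  "
          f"error vs truth={mean - TRUE_MEAN_OZ:+.1f} oz")
```

Under these assumptions the standard error shrinks as the sample grows, while the gap between the measured mean and the true mean settles at about 8 ounces rather than shrinking toward zero.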
Some types of data are more prone to bias than others. The vulnerability to bias is one of the most important qualities in determining the reliability of data derived from different sources of evidence. At the bottom of the hierarchy of reliability are the anecdote and its first cousin, the testimonial.
Anecdotes are narratives of one-time events. Because they are singular events, anecdotes of anomalous occurrences are not balanced by the averaging effect that results from multiple observations; thus they are highly vulnerable to random errors. So, if we look at enough anecdotes, the errors will neutralize each other, and we can create a true impression of nature, right? Wrong! If a collection of anecdotes represented a random sample of independent events, such a strategy might work, but collections of anecdotes are rarely random or independent. Quite the opposite is true. Because of the biases intrinsic to anecdotes, a collection is even more likely to reinforce a false conclusion.
The only anecdotes that become part of the collective consciousness are those that are remembered by an observer, deemed worthy of repetition, and then actually communicated by the spoken or written word. There are many factors which influence the recall, interpretation, and reporting of experience. These factors are not random.
One of the principal reasons experiences are recalled and repeated is precisely that they are unexpected or out of the ordinary. Anecdotes of mundane events are unlikely to be repeated. Great significance is often attached to rare and unexpected events, and the human mind seems programmed to look for “causes” of these events. Much of the time the events are just rare but random occurrences (point examples of random error).
Confirmation bias is a tendency to notice and attach significance to information that reinforces one’s preconceptions, while dismissing or ignoring contrary information. Confirmation bias has a great influence on which events are remembered and repeated as anecdotes.
Anecdotes are greatly influenced by the image the narrator has of him or herself, and even more so by the image he or she wishes to project to the outside world. I call this Bumper Sticker Bias. I regularly see bumper stickers boasting that the driver’s progeny is an honor student at his or her school, but, strangely, I have never seen a sticker proclaiming “My kid was expelled from Central High School”. Bumper stickers are not random expressions of human experiences, and neither are anecdotes.
“Case Reports” are anecdotes enshrined in the medical literature. There is reason to believe that case reports are somewhat more reliable than anecdotes told by your taxi driver or your cousin (at least when it comes to medical information). Physicians are ostensibly trained observers, but are in no way immune to the biases ubiquitous to the human condition. For a case report, there is a written document (the medical record) to consult for re-creation of the narrative. Most journals publish very few, if any, case reports, and credible journals do so only after peer review. Even published case reports are not given much weight by the medical community, and the articles are infrequently cited in future literature. Case reports can, however, have value. Rabies, for example, is so uniformly fatal that a single example of an unvaccinated patient surviving infection with rabies is a highly significant case.
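The filtering described in the last few paragraphs can also be put into a toy simulation. Everything in it is hypothetical: I assume a “treatment” with no real effect, so outcomes scatter randomly around zero, and I assume that only outcomes above an arbitrary threshold (1.5) are striking or flattering enough to be retold. The function name and all the numbers are invented for illustration.

```python
import random
import statistics

def collect_anecdotes(n_events, rng, threshold=1.5):
    """Simulate outcomes whose true average is 0, then keep only the ones
    striking enough to be retold (above an arbitrary, assumed threshold)."""
    outcomes = [rng.gauss(0.0, 1.0) for _ in range(n_events)]
    retold = [x for x in outcomes if x > threshold]
    return outcomes, retold

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
for n in (1000, 100000):
    outcomes, retold = collect_anecdotes(n, rng)
    line = f"events={n:>6}  mean of all outcomes={statistics.fmean(outcomes):+.2f}"
    if retold:  # guard against the unlikely case that nothing was retold
        line += (f"  anecdotes retold={len(retold)}"
                 f"  mean of anecdotes={statistics.fmean(retold):+.2f}")
    print(line)
```

In this sketch the full set of outcomes averages close to zero, while the retold anecdotes always average above the threshold, no matter how many of them pile up. A bigger pile of filtered stories only makes the false impression look better supported.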
There is a familiar skeptical mantra that says “the plural of anecdote is not data.” The original attribution, and even the accuracy, of this quote is a little uncertain. I would argue that the plural of anecdote is data; they are just really unscientific, really, really unreliable data. Under the right conditions, anecdotes may generate hypotheses. For these hypotheses to be widely accepted, they must be confirmed by more robust data gathering and analytic methods. One of the recurring themes of pseudoscientific medicine is that hypotheses generated by anecdotes are confirmed by nothing other than an avalanche of more anecdotes. Like the failed entrepreneur introduced at the beginning of this post, the purveyors of these treatments think that augmenting anecdotes with more anecdotes validates their hypotheses. But they are wrong. When it comes to bias, you can’t make it up in volume.