
Screening Tests – Cumulative Incidence of False Positives

It’s easy to think of medical tests as black and white. If the test is positive, you have the disease; if it’s negative, you don’t. Even good clinicians sometimes fall into that trap. In reality, a positive result only raises the probability of disease, and by how much depends on the pre-test probability and on how accurate the test is. An example: if the probability that a patient has a pulmonary embolus (based on symptoms and physical findings) is 10% and you do a D-dimer test, a positive result raises the probability of PE to 17% and a negative result lowers it to 0.2%.
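
The arithmetic behind that kind of example is Bayes' theorem. The sketch below is illustrative only: the sensitivity (0.99) and specificity (0.46) are assumed values, chosen because they roughly reproduce the 17% and 0.2% figures quoted above. They are not taken from any particular D-dimer assay.

```python
def post_test_probability(pre_test, sensitivity, specificity, positive):
    """Probability of disease after a test result, via Bayes' theorem."""
    if positive:
        true_pos = sensitivity * pre_test
        false_pos = (1 - specificity) * (1 - pre_test)
        return true_pos / (true_pos + false_pos)
    false_neg = (1 - sensitivity) * pre_test
    true_neg = specificity * (1 - pre_test)
    return false_neg / (false_neg + true_neg)


# Hypothetical test characteristics (assumptions, not from the article)
SENS, SPEC = 0.99, 0.46
PRE_TEST = 0.10  # pre-test probability of PE from the example

after_positive = post_test_probability(PRE_TEST, SENS, SPEC, positive=True)
after_negative = post_test_probability(PRE_TEST, SENS, SPEC, positive=False)
print(f"after a positive test: {after_positive:.1%}")
print(f"after a negative test: {after_negative:.1%}")
```

Changing PRE_TEST while leaving the test characteristics alone shows how strongly the meaning of the same result depends on where you started.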

Even something as simple as a throat culture for strep throat can be misleading. It’s possible to have a positive culture because you happen to be an asymptomatic strep carrier, while your current symptoms of fever and sore throat are actually due to a virus. Not to mention all the things that might have gone wrong in the lab: a mix-up of specimens, contamination, inaccurate recording…

Mammography is widely used to screen for breast cancer. Most patients and even some doctors think that if you have a positive mammogram you almost certainly have breast cancer. Not true. A positive result actually means the patient has about a 10% chance of cancer. 9 out of 10 positives are false positives.

But women don’t just get one mammogram. They get them every year or two. After 3 mammograms, 18% of women will have had a false positive. After ten exams, the rate rises to 49.1%. In a study of 2400 women who had an average of 4 mammograms over a 10 year period, the false positive tests led to 870 outpatient appointments, 539 diagnostic mammograms, 186 ultrasound examinations, 188 biopsies, and 1 hospitalization. There are also concerns about changes in behavior and psychological wellbeing following false positives.
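
One rough way to see how the risk accumulates: if every exam independently carried the same false-positive probability p, the chance of at least one false positive in n exams would be 1 - (1 - p)^n. Both the independence and the per-exam rate of 6.5% used below are simplifying assumptions, not figures from the study, and results for the same woman are unlikely to be fully independent in practice. Treat this as a sketch only.

```python
def cumulative_false_positive_risk(per_exam_rate, n_exams):
    """Chance of at least one false positive in n independent exams."""
    return 1 - (1 - per_exam_rate) ** n_exams


# Assumed per-exam false-positive rate (hypothetical, for illustration)
ASSUMED_RATE = 0.065

risk_after_3 = cumulative_false_positive_risk(ASSUMED_RATE, 3)
risk_after_10 = cumulative_false_positive_risk(ASSUMED_RATE, 10)
print(f"after 3 exams:  {risk_after_3:.1%}")
print(f"after 10 exams: {risk_after_10:.1%}")
```

With that assumed rate the simple model lands close to the 18% and 49.1% quoted above, which suggests that most of the cumulative risk comes from repetition itself.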

Until recently, no one had looked at the cumulative incidence of false positives from other cancer screening tests. A new study in the Annals of Family Medicine has done just that.

They took advantage of the ongoing Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial to gather their data. In this large controlled trial (over 150,000 subjects), men randomized to screening were offered chest x-rays, flexible sigmoidoscopies, digital rectal examinations and PSA blood tests. Women were offered CA-125 blood tests for cancer antigen, transvaginal sonograms, chest x-rays, and flexible sigmoidoscopies. During the 3-year study period, a total of 14 screening tests were possible for each sex. The subjects didn’t all get every test.

By the 4th screening test, the risk of false positives was 37% for men and 26% for women. By the 14th screening test, 60% of men and 49% of women had had false positives. This led to invasive diagnostic procedures in 29% of men and 22% of women. 3% were minimally invasive (like endoscopy), 15.8% were moderately invasive (like biopsy) and 1.6% involved major surgical procedures (like hysterectomy). The rate of invasive procedures varied by screening test: 3% of screened women underwent a major surgical procedure for false-positive findings on a transvaginal sonogram.

These numbers do not include non-invasive diagnostic procedures, imaging studies, or office visits. They do not address the psychological impact of false positives. And they do not address the cost of further testing.

These data should not be over-interpreted. They don’t represent the average patient undergoing typical cancer screening in the typical clinic. But they do serve as a wake-up call. Screening tests should be chosen to maximize benefit and minimize harm. Organizations like the U.S. Preventive Services Task Force try to do just that; they frequently re-evaluate any new evidence and offer new recommendations. Data like these on cumulative false positive risks will help them make better decisions than they could make based on previously available single-test false positive rates.

(In a post earlier this year, I discussed the pros and cons of PSA screening. Last year, I discussed screening ultrasound exams offered direct to the public to bypass medical judgment.) If you do 20 lab tests on a normal person, on average one will come back false positive just because of the way normal lab results are determined. Figuring out which tests to do on a given patient, either for screening or for diagnosis, is far from straightforward.
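
The "1 in 20" figure follows from the common convention of defining a normal range as the central 95% of results seen in healthy people, so each test has about a 5% chance of falling outside it. A minimal sketch, assuming that 95% convention and (unrealistically) that the 20 tests are independent of one another:

```python
N_TESTS = 20
P_OUTSIDE_RANGE = 0.05  # assumed: normal range covers 95% of healthy people

# Average number of out-of-range results in a healthy person
expected_abnormal = N_TESTS * P_OUTSIDE_RANGE

# Chance that all 20 results come back inside the normal range
p_all_normal = (1 - P_OUTSIDE_RANGE) ** N_TESTS

# Chance that at least one result is flagged as abnormal
p_at_least_one_abnormal = 1 - p_all_normal

print(f"expected abnormal results: {expected_abnormal:.1f}")
print(f"chance all 20 are normal:  {p_all_normal:.0%}")
print(f"chance of at least one abnormal: {p_at_least_one_abnormal:.0%}")
```

Under these assumptions a perfectly healthy person is more likely than not to have at least one flagged result on a 20-test panel.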

This new information doesn’t mean we should abandon cancer screening tests. It does mean we should use them judiciously and be careful not to mislead our patients into thinking they offer more certainty and less risk than they really do.

Posted in: Clinical Trials


15 thoughts on “Screening Tests – Cumulative Incidence of False Positives”

  1. superdave says:

    Thank you!
    Even an extremely accurate test will have tons of false positives when you consider that there are 300,000,000 Americans. It’s another reason why screening is important.

  2. Alaskan says:

    -Nearly 90 million people – about one-third of the population below the age of 65 – spent a portion of either 2006 or 2007 without health coverage. (Families USA. Wrong Direction: One Out of Three Americans are Uninsured. September 2007. http://familiesusa.org/assets/pdfs/wrong-direction.pdf)

    -A study found that 29 percent of people who had health insurance were “underinsured” with coverage so meager they often postponed medical care because of costs. (Consumer Reports. Are You Really Covered? September 2007)

    -Nearly 50 percent of uninsured children did not receive a checkup in 2003, almost twice the rate (26 percent) for insured children. (The Urban Institute. Key Findings from the 2002 National Health Interview Survey. 9 August 2004.)

  3. mckenzievmd says:

    Tremendously important topic. Here in vet med, we have to face the decision of whether to euthanize our patients, and this increases the importance of understanding the meaning of positive test results. When a feral cat comes in and tests positive for FIV (the feline version of HIV), some protocols recommend euthanasia. While this strategy has been an important part of reducing the incidence of the disease in un-owned cats, it is dangerous when many vets do not understand positive predictive value. They will say that since the test has a specificity of 98%, it is very unlikely the result is a false positive. But with an incidence of 1-2%, the fact is close to half of positive test results are false positives!

    Thanks for the post.

    Brennen McKenzie, MA, VMD
    http://www.skeptvet.com/
    http://skeptvet.com/Blog/

  4. megancatgirl says:

    Thank you for the informative post. Another thing to be careful of is ranges for what is considered normal. I had symptoms of hypothyroidism for years and was tested at least three times before a test finally showed my TSH level to be out of the normal range (although it had been increasing over time). My blood tests were normal until I developed my most severe symptom of heart palpitations. I don’t really know why it took so long for my problem to show up on tests, but I’m glad I caught it eventually. I’m sure it’s impossible to define ranges to include every healthy person and exclude every person who is at risk. I don’t really know what the solution is, but we just have to do the best we can with the tools we have.

  5. qetzal says:

    I’m curious about the repeatability of false positives.

    To illustrate, suppose some blood test gives a false positive for a given sample from a given person. That could be due to some error with running the test itself, such that retesting the same sample would give a true negative. Or, there could be an error related to the sample, such that retesting the sample would give another false positive, but a new sample from the same person would give a true negative. Or, there could be something about the person being tested, such that most or all samples from that person will give false positives.

    Does anyone know to what extent these possibilities are typically investigated for various tests?

  6. wales says:

    Very interesting post, Dr. Hall. I have often thought screening tests (and their annual repetition) were over done and not risk-free. A relative of mine recently died from cardiac arrest caused by pharmaceutical heart stimulation for a routine cardiac stress test. Unintended consequences abound.

  7. TsuDhoNimh says:

    #qetzal –
    Here’s the lab protocol if certain test results are “anomalous” … too far out of the “normal range” to be credible, or in the normal range when they aren’t expected to be there (pre-dialysis testing that shows a normal potassium, for example)

    Most labs don’t know who the test is being run on, all they have is a number unless they start looking. Also, most test methods have a list of “known interfering substances” from the manufacturer or the reagents that is constantly being added to.

    Tests are run with a “control” or controls, which is either a sample of a known value, or a known positive and known negative. If these don’t come out right, nothing gets reported until the problem has been solved and corrected and everything gets re-run.

    IF the controls are OK, and there isn’t an obvious block of samples with oddball results, these questions get asked:

    1 – Is there something wrong with the sample? Hemolyzed? Drawn with the wrong anticoagulant? Damaged in shipping from heat or cold? Drawn too close to an IV? Drawn right after a transfusion?

    2 – Is the result consistent with other tests that normally all trend together?

    3 – Has the patient had any medications or food or something that could explain the results? (call the pharmacy/x-ray/dietary/blood bank/nursing station to get the necessary info)

    4 – Is it consistent with patient’s previous tests, if any? Most lab software has a programmable variance value that will alert the lab if the results are more than X higher or lower than previous. The tech looks at the alert, checks the past lab history, and can over-ride or redo the test.

    5 – Re-collect and repeat the test … sometimes you really do have zebras.

    *******
    Some people, with no clinically apparent problem “run high” or “run low” on certain lab tests. Often these accumulate until someone collects the information and gives it a name.

  8. pmoran says:

    “A positive result actually means the patient has about a 10% chance of cancer. 9 out of 10 positives are false positives.”

    What data are you working from? If this means that out of ten biopsies performed because of screen-detected abnormalities only one is cancer, then that suggests a rather poor standard of radiological/imaging assessment.

    I thought most expert screening centres expected a benign/malignant ratio for biopsies of 3 to one or less.

  9. Harriet Hall says:

    10 abnormal screening mammograms don’t mean 10 biopsies. Additional mammographic views and sonograms may rule out the need for a biopsy. I couldn’t relocate the original reference I found for the 1 in 10 figure, but see this:
    http://www.ajronline.org/cgi/content/abstract/165/6/1373

  10. weing says:

    wales,
    My condolences.

  11. qetzal says:

    TsuDhoNimh,

    That all makes sense. I’m still wondering, though, what’s typically known about the sources and proportions of ‘true’ false positives.

    Results that can be discarded due to failed controls, assignable errors, hemolysis, verified interfering substances, etc., aren’t really what I have in mind. I’m more interested in results that are apparently valid and positive for some disease indicator, but it’s later determined that the disease is not actually present.

    Some of those will be “one-off” results that wouldn’t be positive if the same sample was retested. Others will be reproducibly positive for a given sample, but not positive in a second sample. Still others will presumably be positive in every sample, because (as you say) some people run high or low without any disease.

    I’m just wondering how often we have any idea of the relative proportion of each of those for a given test, because that could affect how we respond to a positive result. For example, if we knew that a given test sometimes gave false positives in a single sample, but hardly ever gave false positives in two different samples from the same person, then any time that test came back positive, we could draw another sample and retest. If it’s negative, the first result was likely to be a false positive. If the second result is also positive, it’s more likely to represent a true positive.

    On the other hand, if we knew that a given test tends to give repeated false positives in a subset of patients, there’d be no point in repeating the same test after an initial positive. We’d need to move on to an alternate confirmatory test.

    What I don’t know is whether anyone ever looks at such things. In many cases, it may be easiest to follow up any initial positive test with a second test using an unrelated method, but I don’t know how often that’s feasible.

  12. Tsuken says:

    Excellent post, thanks for that.

    Also interesting to note, given the extent of breast cancer screening that goes on, is something emphasised at a symposium I attended yesterday: women die of cardiovascular disease more than they do of breast cancer – yet it’s the breast cancer that provokes the fear, and that receives the attention. Throw in the false positives, and it becomes clear that continual evaluation of costs and benefits is needed, rather than simply setting up a programme and letting it run.

  13. cloudskimmer says:

    Great post, Dr. Hall; as a person subject to random drug testing due to my job, I have a great deal of concern for false positives.
    In a recent Los Angeles Times editorial debating the merits of various health insurance reforms, it was mentioned that the British National Health Service permits mammograms only between the ages of 50 and 70, and then only once every three years. Do you think that mammograms are performed too often in the U.S.? For people in low-risk groups, should there be fewer? Would the cost savings outweigh the downside of cases which are found late and the increased cost of treating cancers which are found later? Of course, right now we have a substantial population in the U.S. which don’t have health insurance, don’t get any mammograms, and, if afflicted with cancer, find it quite late. But are the British doing something which is medically indicated, or is it a trade-off resulting in providing everyone with mammograms versus our procedure of providing some people with annual checks, and others with none?
