Future Tense

Understanding COVID-19 False Positives

Serum tests are mostly accurate, but a few percentage points of error can have a dramatic impact.

A woman in protective clothing leans over an antibody test on a table under a tent.
A health worker processes a COVID-19 antibody test on May 5 in Torrance, California. Valerie Macon/Getty Images

A version of this article appeared on the website COVID-Explained.

When people discuss testing for COVID-19 antibodies—“serology testing”—there is a lot of talk about false positives, about “sensitivity,” and about “specificity.” What does this mean and why does it matter?

We can start with definitions, focusing on COVID-19 and antibodies (although noting that these general principles apply to any test).

A “false negative” test refers to a case where someone has antibodies (so he should test positive) but the test doesn’t detect them. The “sensitivity” of a test is the flip side of this false negative rate: It is the share of people with antibodies whom the test correctly detects as positive. A test with 99 percent sensitivity would correctly identify 99 out of 100 people who have antibodies as “positive.”

A “false positive” refers to a case where someone who does not have antibodies (so she should test negative) is incorrectly identified as positive. The “specificity” of a test is the flip side of this false positive rate: It is the share of people without antibodies who are correctly identified as negative. A test with 99 percent specificity would correctly identify 99 out of 100 people without antibodies as “negative.”

Larger numbers for either of these values indicate a better test. A perfect test would have 100 percent sensitivity and specificity: It would identify all positive people as positive, and all negative people as negative.
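To make these two definitions concrete, here is a minimal Python sketch that computes both values from counts of test outcomes. The function names and the example counts are illustrative assumptions, not figures from any real test evaluation.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of people WITH antibodies whom the test flags as positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Share of people WITHOUT antibodies whom the test flags as negative."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical evaluation: 100 people with antibodies, 100 without.
sens = sensitivity(true_positives=99, false_negatives=1)
spec = specificity(true_negatives=99, false_positives=1)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

Note that each value is calculated within its own group: sensitivity looks only at people who have antibodies, and specificity only at people who do not.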

COVID-19 antibody tests are not perfect, and the quality differs across tests. A team at the University of California, San Francisco helpfully evaluated a large number of commercially available tests, looking at how they do on both false negative and false positive detection.

Most of these tests identify between 80 percent and 100 percent of positive cases. That is, among 100 people with antibodies, some of the tests find virtually all of them, and others find only about 80, incorrectly classifying the other 20 as negative.

On the flip side, the false positive rate varied between zero percent and 16 percent. That is, among 100 people without antibodies, some of the tests incorrectly identified as many as 16 of them as actually having antibodies.

We should be clear: All of these tests are being commercially sold and used. None of these tests are perfect, and the worst-performing ones are really not good at all. By comparison, HIV antibody tests have a sensitivity and specificity over 99.5 percent, and many tests are at 100 percent, which is essentially perfect. But why does it matter? To see this, we need to go a little bit more into the statistics.

Let’s imagine we have one of these tests and it has a 99 percent sensitivity (meaning it correctly finds 99 out of 100 people who do have COVID-19 antibodies) and a 95 percent specificity (meaning it correctly identifies 95 out of 100 people without antibodies). Flipping it around, 1 out of 100 people who are positive would be (incorrectly) flagged as negative, and 5 out of 100 people who do not have antibodies would be (incorrectly) flagged as positive.

There are really two things you might want to do with this test. One is to inform people about their own antibody status, and the other is to figure out the population-level antibody rate. But when a disease is rare, as COVID-19 still is in many places, even small errors in testing can make a big difference in these conclusions.

To see why, let’s do some numbers.

Imagine that we have a population which has been relatively unaffected by COVID-19 (for example, a rural population in the middle of the U.S.), in which the true exposure rate is 1 in 200 people. To make it simple, let’s say there are 100,000 people in the population so, in reality, 500 of them have antibodies to the virus. If we had a perfect serology test, then we’d detect a positive rate of 0.5 percent, or 500 people out of 100,000.

Now we have our actual test, with a sensitivity of 99 percent and a specificity of 95 percent. This is similar to, or maybe a bit better than, many of the tests we actually have.

The 99 percent sensitivity means that of the 500 people who have antibodies, we correctly identify 495 of them. Good job!

Of the 99,500 people without antibodies, we correctly identify 95 percent of them as negative, or 94,525. Also great! But this means out of the negative people, we incorrectly classify 4,975 of them as “positive.”

Now let’s go to our questions, to see how these results affect how we can use this information. First, we want to use our data to calculate the share of people in the population who are positive. Remember, the truth is 1 in 200 people, but we don’t know that! That’s why we are conducting tests. The simplest thing would be to take all the positive tests (that’s 495 + 4,975 = 5,470) and divide by the total population (100,000 people). If you do that, you will get a prevalence rate of about 5.5 percent, which is way too high. The true rate, based on the 500 people who actually have antibodies, is only 0.5 percent!
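The arithmetic in this example can be reproduced with a short Python sketch. The function and variable names are assumptions made for illustration; the inputs are the hypothetical numbers used in this article (100,000 people, 1 in 200 with antibodies, 99 percent sensitivity, 95 percent specificity).

```python
def expected_test_counts(population, prevalence, sensitivity, specificity):
    """Return (true positives, false positives), rounded to whole people."""
    with_antibodies = population * prevalence
    without_antibodies = population - with_antibodies
    true_positives = round(with_antibodies * sensitivity)
    false_positives = round(without_antibodies * (1 - specificity))
    return true_positives, false_positives


population = 100_000
true_pos, false_pos = expected_test_counts(
    population, prevalence=0.005, sensitivity=0.99, specificity=0.95
)

# Naive estimate: every positive result counted as a real case.
apparent_prevalence = (true_pos + false_pos) / population

print(true_pos, false_pos)           # 495 4975
print(f"{apparent_prevalence:.1%}")  # 5.5%
```

The naive estimate comes out roughly 11 times higher than the true 0.5 percent, almost entirely because of the 4,975 false positives.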

What happened? The problem is that because almost everyone is negative, even a small false positive rate here means that most of the positive results are mistakes.

This also makes it hard for people to rely on their antibody results. Among the 5,470 people who test positive for antibodies with this hypothetical test, only 495, or about 9 percent, actually have antibodies. The rest are false positives. This really matters if people base their activities around that test result: They may take risks they would not otherwise take, thinking that they are relatively more protected. Assuming they are immune, they could expose themselves to the virus, get sick, and spread it to others.
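The share of positive results that are real is often called the positive predictive value. The sketch below, with an assumed function name, computes it for the hypothetical test in this article. The two higher prevalence values in the loop are not from the article; they are added only to illustrate how much the answer depends on how common antibodies are.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a person who tests positive really has antibodies."""
    true_positive_share = prevalence * sensitivity
    false_positive_share = (1 - prevalence) * (1 - specificity)
    return true_positive_share / (true_positive_share + false_positive_share)


# 0.005 is the article's example; 0.05 and 0.20 are hypothetical comparisons.
for prevalence in (0.005, 0.05, 0.20):
    ppv = positive_predictive_value(prevalence, sensitivity=0.99, specificity=0.95)
    print(f"prevalence {prevalence:.1%} -> positive predictive value {ppv:.1%}")
```

With the same test, a positive result becomes far more trustworthy as the true prevalence rises, because real positives start to outnumber the false ones.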

If you are only interested in using your serology tests to figure out the prevalence in an overall population, these problems are less bad because you can (sort of) fix them. For example, if you knew the sensitivity and specificity of your test for sure, you could reverse my calculations above and fix your estimates. However: Doing so requires knowing for sure the performance of the test. And with a new virus, with new tests, we do not always know. And when the prevalence is low, small differences in assumptions about these values will really, really change the results.
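One standard way to reverse the calculation is a correction commonly known as the Rogan-Gladen estimator. The sketch below applies it to the article's naive estimate of 5.47 percent. The function name is an assumption, as are the alternative specificity values of 94 and 96 percent, which are included only to show how sensitive the correction is to that input.

```python
def corrected_prevalence(apparent_prevalence, sensitivity, specificity):
    """Estimate true prevalence from the share of tests that came back positive.

    Uses the Rogan-Gladen correction, clipped to the range [0, 1].
    """
    denominator = sensitivity + specificity - 1
    if denominator <= 0:
        raise ValueError("the test must perform better than chance")
    estimate = (apparent_prevalence + specificity - 1) / denominator
    return min(1.0, max(0.0, estimate))


apparent = 0.0547  # 5,470 positive results out of 100,000 tests

# Same data, three different beliefs about the test's specificity.
for assumed_specificity in (0.94, 0.95, 0.96):
    estimate = corrected_prevalence(apparent, 0.99, assumed_specificity)
    print(f"assumed specificity {assumed_specificity:.0%} -> prevalence {estimate:.2%}")
```

With the correct 95 percent specificity, the correction recovers the true 0.5 percent. Being off by a single percentage point in either direction moves the estimate to zero or to roughly three times the true value.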

If you want to use your tests to give individuals information about their infection history and immunity status, things get much trickier. There’s no way to know if an individual’s test result is accurate or part of the small percentage of false results. Not to mention our ongoing uncertainty about whether and for how long being sick with COVID-19 leaves you immune to future infection.

The main solution here is better testing. Some of the serology tests are better than others; focusing on using those will improve our understanding of antibodies against the virus. But this problem is always likely to be with us to some extent.

Additional research and reporting by Elizabeth DeRiso, Jeff Hsiao, Anagha Lokhande, Nivea F. Luz, and James Okun.

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.