Future Tense

We Can’t Get a Handle on the Coronavirus Pandemic Without Random Testing

Everyone says more testing is the key to understanding—and ending—the coronavirus pandemic. And it is—but it needs to be done right.

It is fair to say that every one of us would like to see progress against COVID-19 and toward a return to normalcy. And if you follow the news, you’ll hear that testing is the key. Test more! Test better! Different kinds of testing! Home testing, drive-up testing, more testing centers. Antibody tests!

And testing has indeed been ramping up. More states are testing sick people, and some have gone beyond. Tennessee has announced that now anyone can be tested for the coronavirus, independent of symptoms.

While a sick individual’s diagnosis won’t change their course of treatment, especially for a mild case, it does inform self-isolation measures and contribute to tracking of the disease’s spread. But we are still not doing enough of the key thing: truly random testing. To see why this is so crucial, it’s useful to start with one of the key open questions in the pandemic: What share of people are already exposed?

There are a lot of unanswered questions with COVID-19—how far it travels in the air, how best to treat it, why some groups and people are so much more affected than others. But among these lurks a more fundamental question in the background: How widespread is the virus, anyway? This is an extremely important question, but it’s also very hard to address. Why?

A lot of our predictions about the path of the virus over the next few months (and the world, the economy, etc.) rely on epidemic modeling. Many of these models are forms of an “SIR” model (“susceptible-infected-recovered”), which charts the dynamics as a population moves from entirely virus-susceptible to infected and finally to recovered.

The basic structures of the models are mostly similar, but depending on what numbers you insert in them, they give wildly different answers. We’ve seen that in how predictions about future hospitalizations and deaths have changed over the past month.

There are a lot of reasons for this, but they mostly come down to the fact that these models have exponential growth. This means that small differences multiply quickly over time, so small changes in assumptions about disease spread will result in huge differences in projections a few weeks out.
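To make the point about small differences concrete, here is a minimal sketch of a discrete-time SIR model in Python. Every number in it (the population, the starting infections, the recovery rate, and the two transmission rates) is an assumption chosen for illustration, not an estimate for COVID-19, and the function name `simulate_sir` is hypothetical.

```python
def simulate_sir(beta, gamma=0.1, days=30, population=1_000_000, initial_infected=100):
    """Run a simple daily-step SIR model and return cumulative infections.

    beta  = assumed new infections per infected person per day (illustrative)
    gamma = assumed share of infected people who recover each day (illustrative)
    """
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    recovered = 0.0
    for _ in range(days):
        new_infections = beta * susceptible * infected / population
        new_recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        recovered += new_recoveries
    # Everyone who is infected now or has recovered was infected at some point.
    return infected + recovered


# Two assumed transmission rates that differ by only 0.05 per day.
low = simulate_sir(beta=0.25)
high = simulate_sir(beta=0.30)
print(f"beta=0.25: about {low:,.0f} people ever infected after 30 days")
print(f"beta=0.30: about {high:,.0f} people ever infected after 30 days")
print(f"ratio: {high / low:.1f}x")
```

In this toy setup, a modest change in one assumed input produces a projection several times larger a month out, which is the sensitivity described above.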

To make the models better—both to figure out which ones are right and to improve the best ones—we need to fit them to data. That means actually knowing what share of people are susceptible, infected, or recovered at any given time. Without that information, we are basically just guessing.

You may think: Surely we know that! Don’t we see information on infections and hospitalizations and deaths over time? I feel like I’ve seen a lot of graphs about this.

Well, yes. But in the context of COVID-19, that’s not close to enough. Many infections with COVID-19 are mild and nonspecific, meaning people either don’t realize they have this disease or aren’t sick enough to seek out medical care. A large share of people who are infected (perhaps half or even 75 percent) have no symptoms. Even people who are symptomatic are often not tested. Case counts are virtually meaningless as a measure of how widespread the virus is, given the variation in testing over time and across space and the fact that testing is incomplete even in the best-surveyed places in the U.S.

This means for every case we see, there are at least some we do not see. How many is really unclear. Some people think there are 10 missing cases for every one we see; others think it’s just one or two.

The implications of these two views are hugely different. If 1 percent of the population has already been infected, then 99 percent of people are still susceptible. On the other hand, if 20 percent have already been infected, well, that’s a different story.
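The arithmetic behind these competing views is simple, and a rough sketch makes the gap visible. The region below is hypothetical (1,000,000 residents and 10,000 confirmed cases are made-up numbers), and the function name `implied_infected_share` is an assumption for illustration.

```python
def implied_infected_share(confirmed_cases, population, missing_per_confirmed):
    """Share of the population ever infected if each confirmed case hides some unseen ones."""
    total_infections = confirmed_cases * (1 + missing_per_confirmed)
    return total_infections / population


# Hypothetical region: 1,000,000 residents, 10,000 confirmed cases (1 percent).
POPULATION = 1_000_000
CONFIRMED = 10_000

for missing in (1, 2, 10):
    share = implied_infected_share(CONFIRMED, POPULATION, missing)
    print(f"{missing} missing per confirmed case: "
          f"{share:.0%} infected, {1 - share:.0%} still susceptible")
```

With the same confirmed case count, the implied share of people already infected runs from 2 percent to 11 percent depending on which guess about missing cases you believe.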

Learning this number should be among our top priorities. And this is where I keep running into the problem of selection.

The best way to learn about the share of the population that has been exposed to the virus is either to test everyone (best case, but probably infeasible in the U.S.) or to test a random sample of people. This testing could be for active current infection or for past infection using antibodies. (This antibody testing has started to come online in the past couple of weeks and promises to be even more useful than active infection testing.)

Regardless of which type of testing we use, the best information will come from testing a random sample of people. Being random, it is representative of everyone, so it allows us to learn about what we expect in the population overall.

There are a few examples of this kind of testing so far in the pandemic, but only a very few. Iceland did some random population testing recently, which showed about 1 percent of the general population had active infection (half of them asymptomatic). There is one town in Italy that tested everyone early in the epidemic (3 percent active infection, about half asymptomatic). Antibody testing (which identifies present and past infections) in a random sample in Germany showed that 15 percent had been infected, either currently or in the past.

Second best to a random sample may be universal testing among a known population. We had a recent example of this among pregnant women in New York. A publication earlier this week in the New England Journal of Medicine showed active COVID-19 infection among almost 15 percent of women admitted for delivery at one hospital in New York City.

This isn’t as good as a random sample, since pregnant women are different in many ways (gender, age, exposure to medical care) from the general population. Still, it has value—in part because we can understand the sources of bias.

I’d say a similar thing about recently announced plans by Major League Baseball to test, basically, its entire workforce. Yes, this is not a random set of people. But if they really get to something close to universal, we can at least have a really good understanding of how the sample is selected.

Most people agree that random or universal testing is the best approach. But it’s also very hard to execute. Identifying a random sample of people and testing them is much, much more challenging than testing what we’d call a “convenience sample”: people whom it is easy to find and access. Let’s say you are a governor and you want to test your population randomly, either with nasal swabs or blood tests. You would need to send someone (the National Guard? nurses?) to people’s houses to ask them to give blood or have a swab stuck deep into their nose. A lot might say no. People will complain! Beyond this, it is well known that survey response varies with age, race, and ethnicity, so the people who agree may no longer be representative of everyone, which is a real problem for a sample that is supposed to be random. It’s a logistical mess.

Given how hard this is, you might be tempted to think: Well, some data is better than no data. I’ll do something easier—maybe set up a mobile testing site and encourage people to come—and at least I’ll learn something.

This thinking is really problematic. Put simply: If we do not understand the biases in our sampling, the resulting data is garbage. One recent frustrating example of this is a large National Institutes of Health study that aims to do antibody testing among 10,000 volunteers to measure the prevalence of undetected infections. Volunteers are being solicited in various ways, like over Twitter and with other public postings. People are asked to email the NIH to enroll, at which point they may be sent a home test kit.

Anthony Fauci has suggested that this will give us a “clearer picture of the true magnitude of the COVID-19 pandemic in the United States.” But it will not! It will give a clear picture of the magnitude among people who, say, scroll Twitter for opportunities to be in studies like this. Are these people more or less likely to have had COVID-19? I have no idea. Maybe you pull more people who know they’ve been exposed (higher prevalence), or maybe you pull people who are more careful about exposure (lower prevalence). Maybe it’s a weird mix of both. We simply do not know. We’ll get some number out of this, and it will be completely uninterpretable.

This is worse than nothing, since people will think that they’ve learned something.
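A small simulation shows why a volunteer sample can mislead in either direction. This is only a sketch under made-up assumptions: the 5 percent true prevalence, the volunteering rates, and the function name `simulate_estimates` are all hypothetical, and nothing here describes the actual NIH study.

```python
import random


def simulate_estimates(volunteer_ratio, true_prevalence=0.05,
                       population_size=200_000, sample_size=10_000, seed=0):
    """Compare a random-sample estimate of prevalence with a volunteer-sample estimate.

    volunteer_ratio = how many times more likely a previously infected person
    is to volunteer than an uninfected person (an assumed, unknowable number).
    """
    rng = random.Random(seed)
    # True means the person has been infected.
    population = [rng.random() < true_prevalence for _ in range(population_size)]

    # Random sample: every person has the same chance of being tested.
    random_sample = rng.sample(population, sample_size)
    random_estimate = sum(random_sample) / sample_size

    # Volunteer sample: the chance of signing up depends on infection status.
    base_rate = 0.05  # assumed share of uninfected people who volunteer
    volunteers = [
        infected for infected in population
        if rng.random() < base_rate * (volunteer_ratio if infected else 1.0)
    ]
    volunteer_estimate = sum(volunteers) / len(volunteers)
    return random_estimate, volunteer_estimate


for ratio in (3.0, 0.5):
    random_est, volunteer_est = simulate_estimates(volunteer_ratio=ratio)
    print(f"volunteer_ratio={ratio}: random sample {random_est:.1%}, "
          f"volunteers {volunteer_est:.1%} (truth 5.0%)")
```

The random sample lands near the truth either way. The volunteer number overshoots when infected people are more eager to sign up and undershoots when they are less eager, and in real life we do not know which case we are in.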

I have similar problems with testing blood donors as a measure of prevalence. Yes, it’s convenient. But it’s not going to tell us anything broadly useful.

What to do? I’m afraid that despite how hard it is, we simply have no choice but to do better sampling when we test. As someone who is trying to get some random testing off the ground in various populations, I can attest to the many, many challenges of doing so. But it is worthwhile. We need to do this.

What can you do, other than tell all your friends that random testing is great? By far the most important: If someone shows up at your door and tells you you’ve been randomly selected for testing, please, please consent.

A version of this article first appeared in Emily Oster’s newsletter, ParentData.

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.