This article is part of Privacy in the Pandemic, a new Future Tense series.
Nearly three months after the novel coronavirus arrived in the United States, the country is scrambling to manage COVID patients. There’s still a major shortage of tests. Last week, the U.S. Department of Health and Human Services announced it would end federal funding of 41 testing locations, while there’s still a backlog of tests in major cities across the U.S. And with many hospital systems understandably focused on (and overwhelmed with) treating COVID patients with serious symptoms, it can be difficult to find the time to check in on patients with milder cases who are recovering outside the hospital. The city of San Antonio announced a program to check in with each COVID patient, but that would be a tall order for places like New York City, which has had more than 106,000 confirmed cases.
To better track COVID recovery, or even diagnose COVID, researchers are looking to places you might not expect. For instance, Matt Whitehill, a computer scientist at the University of Washington, is soliciting volunteers to record themselves coughing. (Full disclosure: I am teaching a course at UW this semester but have no relationship with Whitehill or anyone from his department.) People sick with COVID-19 experience a variety of symptoms—or sometimes, none at all—but a dry cough is among the most common symptoms, so the prevalence of that cough “can be used as a proxy for their overall health,” says Whitehill.
The data Whitehill is collecting will be used to train an algorithm that detects and counts coughs in patients recovering from COVID. He and his colleagues have been conducting cough detection research since 2011, with the idea that monitoring coughs could help health care professionals monitor people recovering from the flu or tuberculosis and track diseases’ spread. In their work, they’ve found that coughs can tell you a fair amount about a person. “People have their own cough signature, because you get some vibrations of the vocal cords as you’re coughing.” It doesn’t necessarily take an algorithm to identify coughs—Whitehill says that people listening to strangers’ coughs can identify individual coughs with about 80 percent accuracy—but subtler aspects of coughs can be detected by an algorithm, like whether the cougher is likely male or female.
This ability to tell people’s coughs apart comes into play if you’re tracking patients in real-world environments. Say you have multiple sick people recovering in the same household, or the same area of the hospital. A cough detection algorithm may be able to identify your cough and count the number of times you’re coughing in an hour or a day, which might be able to tell your doctors how well you’re recovering. In the long run, this data might also give researchers a better idea of what the trajectory of recovery looks like for COVID patients.
Their current database includes more than 10,000 coughs, says Whitehill, and to better train the algorithm, they’ve also collected recordings of people speaking, clearing their throats, or laughing. “These are the things the model most frequently has trouble with,” he says.
Now, Whitehill and his colleagues are developing an app that uses their cough detection algorithm to count coughs. They hope to begin piloting it in the next two to three months, and have relationships with clinical researchers who will seek consent from their patients. Consenting volunteers would have the app running in the background of their phone at all times, turning on only when it detects a cough. “It’s similar to an Amazon Echo,” says Whitehill. “It’s constantly listening to your mic data, but it only starts recording or analyzing once it hears a ‘wake word’—in this case, the cough is the wake word.”
Given recent concerns around data privacy, Whitehill and colleagues designed their app so that personal data remains on the user’s device by default, unless they choose to share it with the researchers. “Some apps might record your audio and send it to the cloud for processing, which allows them to use a more complex deep-learning model, but it’s a pretty severe privacy risk,” says Whitehill. “Instead, all our processing occurs on the device, so none of your audio leaves your device unless you provide consent.” He adds that walking volunteers through privacy options will be an important part of rolling out their app.
Privacy concerns have put a related project on hold. The COVID Voice Detector, created by Carnegie Mellon researchers, set out to collect voice samples from volunteers to train an algorithm that estimates your chances of being COVID-19 positive. (They also collaborated with A.I. voice company Voca to collect data.) Previous work in coughalytics (yes, this is really a thing) and voice analysis suggests it’s possible to differentiate between, say, people with chronic obstructive pulmonary disease and people with the flu or pneumonia. These Carnegie Mellon researchers thought it might be possible to train an algorithm to detect COVID-19 in patients by their voice. “Not everyone who has COVID even shows symptoms. While a cough could be indicative, it’s not the only thing you could look at,” says Rita Singh, a computer scientist on the project. “When you’re coming down with any condition of your respiratory tract, there’s likely to be some change in your voice, the same way there’s a change in your voice when you get a common cold.”
What, exactly, those differences are would be borne out by collecting data, so the project’s website launched on March 30, asking volunteers to record themselves coughing, reciting the alphabet, and saying some vowels. Then, the site spat out a score—the likelihood you were infected with COVID-19. But after a day or so, Singh and colleagues realized people might misinterpret the website’s “score” as a real diagnosis. “We immediately brought the website down, and have been rethinking how we can present the results of this system,” says Singh. “You don’t want anyone’s blood on your hands.”
But Singh and colleagues still want to find ways to make their work useful. Singh says they have reached out to medical doctors and to the Food and Drug Administration in hopes of using their algorithm as a diagnostic aid. (A full diagnosis, of course, would require approved tests.) They’ve also been thinking about how to safely collect data. While we might take our voice for granted, it’s a powerful biometric tool. Just as your fingerprint and DNA are unique to you, so is your voice. “If there’s a voice database anywhere, going forward, we have the ability to find out your identity,” says Singh. So they’re working on making their data collection site compliant with privacy regulations from the EU and California, so that users can delete their data at any time.
Meanwhile, several other projects, from institutions like the University of Cambridge and Israeli startup Vocalis Health, are collecting voice or cough data from participants. Only time will tell whether algorithms will be a useful tool for COVID tracking and diagnosis. But any studies must take privacy seriously and be clear about what this technology can and can’t tell us. “We don’t want to play games,” says Singh. “This is not the time to experiment with things we’re not certain about.”