Future Tense

Do Tech Companies Really Need to Snoop Into Private Conversations to Improve Their A.I.?

It’s not the only way, but it’s the easiest.

An Amazon Echo with an ear cupped with a hand, eavesdropping.
Photo illustration by Lisa Larson-Walker. Photos by Amazon, Getty Images.

It turns out that Siri, Alexa, and whatever it is you call Facebook Messenger have been a little loose-lipped with your conversations. In recent weeks, each of these companies (as well as Google and Microsoft) has been scrutinized for allowing workers to listen to users’ private conversations in order to test their artificial-intelligence products for quality. By allowing the world’s most powerful companies to put microphones in our homes to make ordering a pizza more convenient, we’ve also let them take some of our lives out the door.

The latest of these examples to emerge is from Facebook. Bloomberg reported on Tuesday that the company has contracted hundreds of workers to transcribe anonymized voice calls made via Facebook Messenger. The contractors reportedly didn’t know why they were transcribing people’s private conversations or how the audio was obtained. Facebook says these users had opted to have their voice calls transcribed in the app. Still, presumably users assumed this would be done with software on their device—not by people. “Much like Apple and Google, we paused human review of audio more than a week ago,” Facebook told Bloomberg, suggesting it ended the practice in response to the backlash Amazon, Apple, and Google had received after customers learned private conversations recorded on their devices had been sent to human reviewers.

Amazon employed thousands of people to listen to voice recordings from people’s homes, which sometimes picked up fights or children in distress. Amazon now allows users to opt out of sending their voice recordings to humans for review. Google and Apple, which were also found to have reviewed snippets of users’ private conversations recorded on smart home speakers, said earlier this year that they would end the human review of recordings. With Microsoft, the recording and human listening reportedly happened over Skype calls in which users opted for transcription. It’s not clear whether these companies have stopped saving recordings from people’s devices, however. The companies all claim that the recordings are anonymized, though Amazon reportedly sent the recordings to reviewers with the Amazon Echo owners’ first names. In any case, users could still easily be recorded sharing identifying information even if the recording’s labeling is anonymous.

Was all of this really necessary? And is there a better way to improve artificial intelligence software that understands us when we speak? The reason all these companies were listening to recordings of private conversations is that the A.I. used in their devices simply doesn’t understand everything we say. That’s obvious if you’ve ever used one, since it’s not uncommon to bark the same command at a smart speaker multiple times before it registers the request. A.I. might understand what you say (in English) 95 percent of the time, but that means that one-twentieth of the time, when the machine thinks it’s supposed to be listening and doesn’t understand what’s being communicated, it needs a little help.

Humans are “the cheapest and most possible way to do this,” according to Meredith Whittaker, co-founder and co-director of the nonprofit AI Now Institute. That’s because A.I. systems are trained by being told whether they got an answer right or wrong, such as whether they correctly understood what someone said. “It comes down to the issue of whether there is an automated system that can be the arbiter of that decision with the accuracy and nuance and cultural context of a human, and right now there isn’t,” said Whittaker. “And so precarious hidden labor does a lot of heavy lifting for A.I.”

But having contractors listen to raw recordings isn’t the only way to train a machine to get better at understanding us. There are less privacy-invasive ways of improving A.I.’s ability to comprehend human language. “Instead of sending out the exact piece for human transcription, you could create a way that has the same kind of noise or other acoustic features and have a human transcribe that so that you’re not divulging anything private of your users,” said Micha Breakstone, an expert in natural language processing and co-founder of Chorus, which builds A.I. for understanding conversations for sales teams.*

Breakstone means that a copy of a recording could be made to imitate the same sounds, inflections, and words so that the actual, potentially identifying recording isn’t being reviewed by strangers. It’s also possible to shift the voice or gender of the person talking in the recording to further protect their identity. Still, “at the end of the day, the answer is you don’t need to send those materials out to humans, but it’s much, much easier if you do,” Breakstone said.

It’s unlikely that anytime soon there will be an artificially intelligent voice assistant that understands everything humans say; humans don’t even understand everything we say. We constantly ask each other to explain or repeat ourselves, and that’s a good thing. If we understood everyone in the world perfectly and said everything the same way, the world would be a very boring place. But that also means the machines we’re increasingly inviting into our lives are unlikely to ever fully understand us without the help of other humans either. And if we want to invite microphones that can understand us, made by Google, Amazon, Facebook, and other companies, into our homes, we might have to live with the trade-off that our intimate moments may well be harvested, sent to low-paid people doing contract work for review, and fed back to improve the products of multibillion-dollar corporations. And we’ll have to weigh that bargain against the question of whether these voice assistants have really made our lives that much better anyway.

The privacy invasion here isn’t as simple as having another human listen to our conversations, although that certainly is jarring. “The focus on privacy needs to be about how we get in a situation where these devices are littering our homes and public spaces and we’re ultimately contributing extremely sensitive data to corporations whose incentives and motives may be significantly different than ours,” Whittaker said. After all, you don’t own the recordings of what you say to Amazon, Google, Facebook, Microsoft, or Apple, or to a company that’s more obscure. Once the backlash dies down, they’ll still be perfectly free to do what they want with those recordings.

Correction, Aug. 15, 2019: An earlier version of this article misspelled Micha Breakstone’s first name.

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.