The Great A.I. Beta Test

Social media platforms are moving to automated content moderation, and no one is likely to win.

On the evening of March 16, Facebook made a quiet announcement on its growing page of COVID-19-related responses. There, the company revealed that it would soon be “sending home” its content moderators, the often-unheralded group of human workers who evaluate social media platforms’ content for material that—per the platform’s rules—doesn’t belong. By virtue of their own lucrative business models, social media outlets like Facebook, Twitter, Instagram, YouTube, Snap, various Microsoft properties, gaming platforms, and others rely on user-generated content to get and hold the interest of their user bases. For this reason, they all have the massive task of sorting out the good from the bad, the junk from the fair game, the rule-breaking material from the stuff that passes muster. That work of sorting and adjudication—content moderation—is a tough job on a good day, exposing workers to thousands of potentially problematic posts through the course of their shifts, ranging from the relatively benign but inappropriate (too much skin, a bit too violent) to the downright disturbing cases of abuse, exploitation, hate speech, mistreatment of animals, bloody violence, and gore. The list, unfortunately, goes on.

The work is taxing psychologically, but it’s also stressful in terms of the production metrics—the sheer amount of content the moderators are expected to look at every day. Despite treating these (very underpaid) workers as something of an underclass, most people on the inside of big tech firms would classify their services as “mission-critical.” Now, these many thousands of workers would be sent home from the call-center–like installations where they typically labor. In their stead, Facebook said, artificial intelligence–based computational tools would come online. In an uncharacteristic display of candor and prescience, the company advised its users, “We don’t expect this to impact people using our platform in any noticeable way. That said, there may be some limitations to this approach and we may see some longer response times and make more mistakes as a result.”

But what constitutes a “mistake” at the scale and scope of a global informational platform? Presumably, it would include content that might be removed through an automated big net but would likely be able to stand under review and discernment from a human moderator. There may be policy exemptions that could be applied to material that would look, to a machine, like excessive blood and gore but, in fact, was the video of an unlawful attack on civilians in a conflict zone. Facebook’s warning message about errors and lag time suggests that it would be impossible for A.I. to take into account those exemptions. That should worry us all.

Unspoken, too, in the announcement was the fact that 72 hours prior, far from the cradle of American high-tech that is Silicon Valley, the entire metro Manila area of the Philippines had just gone under quarantine. The Philippines is the world capital of the offshoring and outsourcing of business process operations doing call center–based service work—human resources, accounting, programming, and, yes, content moderation—for the globe’s corporations. Facebook, YouTube, and Twitter are just a handful of the top firms known to have outsourced content moderation in the Manila area. In other words, Facebook may have had little choice in the matter.

As the call center employees in Manila and around the world logged off and turned out the lights, the top social media firms faced a worrisome new chapter. Typically they use automated computational tools in very specific and narrow instances and for cases that could produce a high probability of yielding appropriate, computationally discerned hits for takedown. Think spam, copyright violations, child sexual exploitation material, and other, already known bad content. Now these computational mechanisms would be the front line of defense for a vast amount of platforms’ content.

While companies have long been building and tweaking such tools (using methods like machine learning via the processing of presorted datasets, natural language processing, and algorithms designed to seek out and cull undesirable postings), the typical social media ecosystem has seen firms using them in concert with their significant pools of human moderators. Despite their technological sophistication, such automated tools fall far short of a human’s discernment. The A.I. tools are a blunt instrument. They can easily function at scale in a way that humans cannot, but with the terrible downside of being overly broad in yielding hits, unable to make fine or nuanced decisions beyond what they have been expressly programmed for.

These serious shortcomings were clear to the companies that had elected to employ the class of tools in the absence of human staffers. By March 31, industry leaders Twitter, Facebook, and YouTube had all made statements about the expected impact of automated moderation systems, and each characterized any unwanted outcome as some form of “mistake.” An important choice of words, perhaps, but also more demonstration that computational tools can do only what they are designed to do—a result that can indeed vary from what their designers had intended or what their users had come to expect. In this new ecosystem of reliance upon computational moderation tools, the firms were essentially demanding that all their users suffer through the pains of such systems. In effect, they have made unwitting beta testers of us all.

It would be one thing if the results of bad A.I.-tool-informed moderation decisions could always and easily be considered nothing more than a mistake, a gaffe, or an accident. But a move to fully automated moderation has long been the nightmare of many human rights and free expression organizations, which see the overly broad bluntness of these tools as less of a mistake and more of an infringement on the right to create, access, and circulate information. Under a global health crisis that has also become a political one, those rights become more important than ever.

And yet human moderation of complex content, a major mechanism to ensure that users can access good information on these notoriously opaque, undemocratic commercial platforms, has disappeared, leaving us with a flawed and even more impenetrable ecosystem in its wake. Ever tried to interview an algorithm to find out why it has made the decisions it has? Good luck on that one. The A.I.-based automated tools represent a worrisome black box, one for which any external auditing or measurement (already difficult under a hybrid regime of humans and computation) would be nearly impossible.

Stakeholders from the global community for freedom of expression and human rights have long sounded the alarm about what content will bear the brunt of this new form of moderation: political and social advocacy, material and footage from conflict zones, and so on. In a recent post, Witness’ Dia Kayyali wrote that “automated content moderation has not been working well. There’s no reason to expect it’s suddenly going to work now.” As an example, Kayyali points to the work of advocates in war-torn Syria to get out important video documentation of their plight. Social network moderation regimes have deemed such photos and video too gory (rather than a more complicated, and yet perhaps more honest, reason of “too political”) to stand. Activists have spent countless hours discussing this issue with social networks, but their work presupposes the sophisticated mind of a human moderator making the decision. When decisions are rendered in automated fashion, bluntly and for expediency at scale, it is precisely these kinds of users, advocating for their own rights from the world’s conflict zones, who stand to lose.

Despite my own decadelong work to unveil the working conditions and psychological impact of commercial content moderation work and to advocate for the people who do it, I do not think that a turn to computation is a scenario that benefits the public in the long run. As Kayyali points out, we must be mindful that this situation not be an excuse to diminish and dismiss the already undervalued work of human moderators. Rather, the fault lines of problematic moderation decisions that will now be visible to countless more users (through the issuing of blanket warning statements, dialog boxes that describe these shortcomings when users go to report abusive content or, on the flip side, when a user notices her video or post has disappeared when similar material had stood before) should be used to open up a much larger social debate about the role of social media as the de facto (but certainly not de jure) public square of our age. They should be used to further tie that discussion to one about content moderation in general, the decisions those moderators and A.I. tools make, and the policies and politics that go behind them. Only then should we even consider a move to fully automated moderation as first-order social media policing. Now that we are all beta testers, we must be mindful that this great A.I. moderation beta test not become the status quo.

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.