Future Tense

When Does Flagging False Content on Social Media Backfire?

A new study offers some answers.

In the past few months, social media companies have scrambled to address the misinformation tearing through their platforms—first about the election, then the coronavirus. Twitter has started using manipulated media labels. Facebook has been more aggressively removing harmful content and flagging false news. This week, YouTube announced that it will add information panels to searches in the U.S. that might bring up misinformation.

At the heart of these measures is fact checking. At Facebook, the “epicenter of misinformation,” an expanding network of professional fact checkers is sifting through the site’s posts to slap warning labels on false content. At first glance, these efforts appear uniformly good. But what’s the psychology behind those flags? How effective are they? And do red flags ever embolden the very users they’re meant to deter?

A study published this month by researchers at New York University’s Tandon School of Engineering and Indiana University sought to answer these questions. The researchers found that credibility indicators, or flags, can reduce users’ intentions to share fake content on social media regardless of political orientation. In short, fact-checking sources are overwhelmingly trusted. Yet the more nuanced conclusion is that the effect of these flags varies significantly based on gender and political orientation: Men are 1.5 times more likely to share news that’s been flagged as false, the study concluded, while Republicans are much more inclined to disseminate that news than Democrats or independents.

The findings were based on an online study with 1,500 participants in the U.S. These participants saw 12 true, false, and satirical headlines marked with one of four types of credibility indicators: warnings that came from fact checkers, news media, the public, or artificial intelligence. Then, they were asked if they would share the article with friends. (The indicators from fact checkers and A.I. are the most relevant to current social media policies. Facebook, for instance, has used A.I. to spot hoaxes copied and pasted by different accounts.)

An example of one of the headlines in the study. It reads "Coiled mattresses cause cancer by amplifying radio waves," with "Multiple fact-checking journalists dispute the credibility of this news" underneath.
An example of a “credibility indicator” by fact checkers in the study. Proceedings of the 2020 ACM CHI Conference on Human Factors in Computing Systems

By far the most effective indicator was the one from fact checkers. Participants intended to share 43 percent fewer headlines that were marked untrue by fact checkers: 61 percent fewer for Democrats, compared with 40 percent for independents and 19 percent for Republicans. As for A.I., Democrats intended to share 40 percent fewer untrue headlines with the A.I. indicator, versus 16 percent for independents. Notably, Republicans said they would share 8 percent more untrue news with the A.I. indicator. “We were not expecting that, although conservatives may tend to trust more traditional means of flagging the veracity of news,” said Sameer Patil, a co-author of the study, in a press release.

Admittedly, this is just one study, but it’s important for two reasons. First, it provides further evidence that the “backfire effect,” or the idea that fact-based corrections may actually reinforce false beliefs, isn’t as serious as some researchers have believed. Over the past decade, understanding of the effects of flagging misinformation has shifted away from the backfire effect, which was popularized by a 2010 study. But concerns over the effect have lingered. Facebook itself noted that “[a]cademic research on correcting misinformation has shown that putting a strong image, like a red flag, next to an article may actually entrench deeply held beliefs,” when it temporarily ditched the disputed flag in 2017. The general findings of the new study align with a 2019 study by Paul Mena, a lecturer at the University of California, Santa Barbara, who also concluded that warning labels may indeed disincentivize Facebook users from sharing fake news.

Second, the new research also provides insight into how partisanship and demographics affect misinformation campaigns, an area that has so far been understudied. However, its conclusions should be taken with a grain of salt. The paper itself acknowledged the limits of generalizing from this research: The authors “carefully worded the indicators in a politically neutral way” but “cannot rule out the influence of inherent systemic political biases regarding technology or the media.”

That said, this research aligns with a 2019 study that found that during the 2016 presidential election campaign, 18 percent of Republicans shared links to fake news sites, compared with less than 4 percent of Democrats. Yet the 2019 study similarly cautioned against associating ideology with a predisposition to share false content, since most fake news circulating during the campaign was pro-Trump or anti-Clinton. In Mena’s study, which did not focus on political affiliation and controlled for political leaning, Democrats were actually more likely to share false news with or without a warning label, though Republicans’ behavior was less affected by these labels. But Mena told me that this outcome likely resulted from the headlines he used being more ideologically attractive to Democrats, since the majority of his participants were Democrats. “More research is needed,” he said. “A possible explanation would be that the observed effect of political affiliation depends on the topic of the false news story.”

Regardless of the effects of partisanship, the new study’s findings are positive. In his 2019 study, Mena confirmed evidence of the “third-person effect,” where people believe that others are more likely than they are themselves to share false news, with or without flags; essentially, we tend to overestimate the effects of misinformation on others. Ultimately, the new study further shows that while people generally tend to have negative opinions of others’ gullibility or intentions, a much smaller percentage of people will share hoaxes when they’re flagged. Which is to say that Facebook and other social media sites should continue to moderate and mark their content, especially when large swaths of misinformation amid the current infodemic go undetected (and even look potentially more trustworthy without a label). As Patil pointed out, making fact checkers efficient enough to tackle the current scale of falsities floating around might require greater use of artificial intelligence moderators. “This could include applying fact checks to only the most-needed content, which might involve applying natural language algorithms. So, it is a question, broadly speaking, of how humans and AI could co-exist,” he said.

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.