Future Tense

Language-Generating A.I. Is a Free Speech Nightmare

“What in the name of Paypal and/or Palantir did you just say about me, you filthy degenerate? I’ll have you know I’m the Crown Prince of Silicon Valley, and I’ve been involved in numerous successful tech startups, and I have over $1B in liquid funds. I’ve used that money to promote heterodox positions on human enhancement, control political arenas, and am experimenting with mind uploading. I’m also trained in classical philosophy and was recently ranked the most influential libertarian in the world by Google. You are nothing to me but just another alternative future. I will wipe you out with a precision of simulation the likes of which has never been seen before, mark my words.”

That’s not the latest ill-advised Elon Musk tweet, nor is it one of his devoted fans roleplaying on Reddit. And it’s not quite Navy Seal copypasta—an over-the-top, comically written attack paragraph that parodies the voice of a “tough guy”—which spread, copied-and-pasted (that’s the “copypasta” part) around the internet.

Instead, it’s a parody of Navy Seal copypasta, and notably one that was written by a computer. Independent researcher and writer Gwern Branwen fed the language model GPT-3 a few example parodies of Navy Seal copypasta (such as a minimalist version, “I’m navy seal. I have 300 kills. You’re dead, kid,” and a pirate version, “What in Davy Jones’ locker did ye just bark at me, ye scurvy bilgerat … ”) and then asked it to use those examples to generate new parodies. (Branwen has documented many experiments with GPT-3 online.) For this parody, Branwen prompted GPT-3 with the input “Elon Musk and Peter Thiel.”

GPT-3 is the work of A.I. lab OpenAI, which describes its mission as “discovering and enacting the path to safe artificial general intelligence.” OpenAI has been the source of controversy, especially related to its decision to transition from a nonprofit to a for-profit corporation, which was followed by a $1 billion investment by Microsoft. (Microsoft now has the exclusive license to GPT-3.) OpenAI has been accused of fueling the A.I. hype cycle and was criticized for withholding the release of its previous language model, GPT-2, because it feared releasing the model would be too dangerous. Similarly, the recent release of GPT-3 (in a private beta) has sparked a lot of discussion. Some are heralding it as a leap forward in A.I., citing impressive examples of its abilities to generate code, answer medical queries, and solve language and syntax puzzles. Others are more wary, concerned about the potential for misuse or believing the hype is unfounded. Either way, it’s clear that sophisticated language models are making significant advances in their ability to generate convincing text. And in a world where social media platforms have disrupted the traditional gatekeepers to speech and reach (e.g., newspapers), convincing text-generating A.I. poses challenges to free speech and a free press. Namely, it could enable what sociologist Zeynep Tufekci calls “modern censorship”—information campaigns that harass, confuse, and sow mistrust with the goal of undermining individual agency and political action.

Online harassment is used to intimidate and punish people (often journalists and activists, disproportionately women and minorities) for their speech. Though much of the harassment online is the product of individuals, some is the result of organized campaigns. The Russian government pioneered the organized harassment campaign in the early 2000s, establishing a troll army that targets journalists, activists, and critics who threaten Russian interests. Sophisticated language models could enable more effective automated harassment. For example, a sophisticated language model could target harassment to specific speakers, making it more threatening and convincing. There have already been examples of GPT-3 creating mock obituaries that include accurate references to people’s past employers and current family members, which suggests it could be used to generate harassment that’s just as personal. Activists and journalists targeted by harassment often say they can tell the difference between “real” harassment and bot harassment, citing differentiators such as the frequency of posts and the coherence of the content. Models like GPT-3 could make it more difficult to tell the difference, making automated harassment more believable and thus more chilling.

In addition to targeted harassment, those looking to control public debate use a technique called “flooding” to drown out speech they object to and distort the information environment. Flooding involves producing a significant amount of content to distract, confuse, and discredit. Take the creation and dissemination of “fake news” in the United States: People both abroad and at home churn out stories that combine fact and fiction, undermining mainstream news organizations while distracting and confusing the public. By automating much of the writing process, sophisticated language models such as GPT-3 could significantly increase the effectiveness of flooding operations.

OpenAI’s paper about GPT-3 (currently a preprint) provides important evidence that supports this. The authors ran an experiment testing whether people could tell the difference between real news articles written by a human and articles written by GPT-3. They found that testers were barely able to distinguish between real and GPT-3-generated articles, averaging an accuracy of 52 percent—only slightly better than flipping a coin. This result has been borne out in the real world as well. It was recently revealed that a GPT-3-generated blog reached the No. 1 spot on the tech-focused news aggregator Hacker News. A student at the University of California, Berkeley set up the blog as an experiment to see whether people could tell that it was written by GPT-3. Tens of thousands of Hacker News readers didn’t suspect a thing, and the few who did were downvoted.

With A.I.-generated writing able to fool many readers, disinformation-as-a-service will become possible, eliminating the need for human-staffed “troll farms” and enabling organizations large and small to shape public debate with the low costs, high efficiency, and scalability of software. This has the potential to make flooding frictionless and pervasive.

Some people are skeptical of GPT-3’s eventual impact, commenting that it writes like a first-year student. This may be true, but have you read any misinformation? A first-year student could easily produce higher-quality misinformation than the status quo. GPT-3 doesn’t need to be writing a weekly column for the Atlantic to be effective. It just has to avoid raising alarms among readers of less credentialed online content such as tweets, blogs, Facebook posts, and “fake news.” This type of content makes up a significant share of what is created and shared online, and it is clear that GPT-3 and models like it could automate it convincingly.

Mitigating the harmful effects of sophisticated language models will require addressing information campaigns more generally. This means approaches that span the technical (textfake/bot detection and new social media), social (model release norms and digital literacy), and political (antitrust and regulatory changes). GPT-3 doesn’t change the problem; it just further entrenches it. As we have seen, our institutions have largely failed in the face of the challenges posed by the internet. GPT-3 and language models like it will only make safeguarding healthy public discourse online more difficult—and more important.

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.