On Tuesday, lawmakers, A.I. experts, and the guy chiefly responsible for ChatGPT gathered in the same room to swap analogies for just how dramatically A.I. is about to change our lives. The invention of the internet, the cell phone, and airplanes all made the list. For a Senate Judiciary Committee hearing ostensibly concerned with the dangers A.I. might pose to the world, everyone seemed to get along quite well.
Maybe too well. At one point, Sen. John Kennedy of Louisiana asked Sam Altman, the CEO of ChatGPT maker OpenAI, if he could recommend some people to run a new agency that would oversee A.I.—that is, to pick his own regulators. Then again, Altman was doing an exceptional job projecting a self-critical persona. The man was begging for his industry to be regulated! Who does that? No, he did not want A.I. drones to self-select targets. Good answer! Nor did he want to see the industry force people to contribute their content to the training of A.I. tools.
“I think a minimum is that users should be able to, to sort of opt out from having their data used by companies like ours or the social media companies,” Altman said.
Unlike many other powerful tech companies, OpenAI, creator of one of the fastest-growing consumer apps ever, looked open to criticism and early reform. You might have left the hearing feeling that Democrats, Republicans, tech leaders, and the American people all wanted the same thing, which is sensible oversight for this fast-growing and important field. Perhaps an A.I. licensing regime? But there was one easy-to-miss reality check highlighting just how complicated this whole A.I. thing is going to get, in ways the leaders in the space may not want to cop to yet. It came in the form of a single word: frivolous.
I will not tell you how ChatGPT defines it, because that schtick has already gotten old. Sen. Richard Blumenthal’s opening statement was, of course, written by ChatGPT. Instead I will offer you the setup.
We’re about 80 minutes into the hearing. Sen. Lindsey Graham of South Carolina has just been reflecting on Section 230 of the Communications Decency Act, the influential rule that prevents social media companies from being held responsible if users post hate speech, misinformation, terrorist propaganda, or defamation. The section, which the Supreme Court may soon revisit, is often blamed for making it difficult to hold services like YouTube and Facebook accountable for the consequences of spreading this material.
“This tool you create, if I’m harmed by it, can I sue you?” Graham asks, evidently trying to figure out if ChatGPT is protected by the same law.
“That is beyond my area of legal expertise,” Altman says. His answer makes sense. Until a generative-A.I.-specific case along these lines heads to court, it’s hard to say whether the technology’s maker can be held liable for harms, multiple lawyers and A.I. policy experts told me.
Still, Graham is not quite ready to move on. “Have you ever been sued?” he asks in a couple of different ways.
“Yeah, we’ve gotten sued before,” Altman acknowledges.
“Um, I mean, they’ve mostly been like pretty frivolous things, like I think happens to any company,” he says. Graham moves on.
So what were these frivolous things? Given that OpenAI has been dealing with just one lawsuit over the past year, according to a search of court filings and multiple lawyers who are tracking the evolution of A.I. law, it seems likely that he’s referring to this: Last November, a group of developers filed a lawsuit against OpenAI, the code repository GitHub, and GitHub’s owner Microsoft in a district court in San Francisco for using their code, without regard for the terms under which they licensed their work, to create a for-profit A.I. tool. Within two years, that tool, Copilot, was being used by around 1 million developers to write nearly half of the code on GitHub. (OpenAI did not respond to a request for comment on Wednesday.)
The lawsuit is not quite as exciting as some of the similar image-based A.I. lawsuits that have followed, in which artists have accused companies of stealing their work to train A.I. image generators. The coders have requested anonymity, and their code would resemble gibberish to most of us. But frivolous?
“Of course it’s not frivolous,” said Sasha Luccioni, an A.I. researcher at Hugging Face, an A.I. company that recently released its own ChatGPT-like tool. Jason Schultz, the director of NYU’s Technology Law and Policy Clinic, agreed. “It’s significant,” he said, particularly given that just two weeks ago, a judge rejected OpenAI, GitHub, and Microsoft’s latest attempt to have key claims dismissed. Pamela Samuelson, a director of the Berkeley Center for Law and Technology, who has been closely following the suit, said she wouldn’t call the case “strong.” But neither would she hand-wave it as Altman seemed to.
So why bring up these legal minutiae? Because the case is advancing more quickly than any of the ideas for A.I. that were discussed at the hearing, and it could affect how the stuff you have posted on the internet gets used to train A.I.
The “frivolous” case represents the first time that anyone has challenged A.I. companies’ ability to “suck in and scrape all this content and monetize that and put it back out there in direct competition with them,” said Joseph Saveri, one of the plaintiffs’ lawyers. Though the open-source licenses did not require the coders to get paid for the code, they required them to get credit for it, he said. “It’s important for their career and for recognition,” he said, suggesting that this is a widespread A.I. dynamic that will affect photographers, musicians, journalists, and more.
He’s their lawyer. Of course he would say that. But Schultz, who has nothing to do with the case, offered a similar analysis. And even if you are inclined to agree with the folks who feel that Copilot benefits the average coder more than it hurts them (supposedly developers are now coding 55 percent faster), and that training A.I. on open-source code is not all that egregious, calling these coders’ concerns “frivolous” seems to contradict the idea that Altman and his $27 billion company care passionately about respecting who does—and does not—want to help train A.I.
Part of the coders’ concerns as outlined in the complaint also focus on the fact that OpenAI, GitHub, and Microsoft were so “cagey about what data was used to train the A.I.,” offering “shifting accounts of the source and amount of the code or other data used to train and operate Copilot.”
This dynamic, too, seems likely to affect many of us, beyond this lawsuit. At the moment, we have no way of knowing what ChatGPT, OpenAI’s image generator DALL·E 2, or many other A.I. tools were trained on. Luccioni believes that companies are avoiding making training data public in part to avoid lawsuits or requests to remove it.
It’s kind of like saying: You should feel free to request that a copy of your artwork is removed from this exhibition. But please understand that we cannot offer you a way to view the exhibition to figure out if your work was included.
Or to quote the 1990 Mel Gibson classic Air America: “Users are like mushrooms; feed them shit and keep them in the dark,” said Luccioni.