This article is from Big Technology, a newsletter by Alex Kantrowitz.
Richard Mathenge felt he’d landed the perfect role when he started training OpenAI’s GPT model in 2021. After years of working in customer service in Nairobi, Kenya, he was finally involved in something that felt meaningful and held a future for him. But the position left him scarred. For nine hours per day, five days a week, Mathenge led a team that taught the A.I. model about explicit content. The goal was to train it so it could keep such things away from users. Today, the material remains stuck with him.
While at work, Mathenge and his team repeatedly viewed explicit text and labeled it for the model. They could categorize the content, the provenance of which was unclear, as child sexual abuse material, erotic sexual content, illegal, nonsexual, or one of several other options. Much of what they read horrified them. One passage, Mathenge said, described a father having sex with an animal in front of his child; others involved scenes of child rape. Some were so offensive Mathenge refused to speak of them. “Unimaginable,” he told me.
The type of work Mathenge performed has been crucial for bots like ChatGPT and Google’s Bard to function and to feel so magical. But the human cost of the effort has been widely overlooked. In a process called “Reinforcement Learning from Human Feedback,” or RLHF, bots become smarter as humans label content, teaching them how to optimize based on that feedback. A.I. leaders, including OpenAI’s Sam Altman, have praised the practice’s technical effectiveness, yet they rarely talk about the cost some humans pay to align the A.I. systems with our values. Mathenge and his colleagues were on the business end of that reality.
Mathenge earned a degree from Nairobi’s Africa Nazarene University in 2018 and quickly got to work in the city’s technology sector. In 2021, he applied for work with Sama, an A.I. annotation service that’s worked for companies like OpenAI. After Sama hired Mathenge, it put him to work labeling LiDAR images for self-driving cars. He’d review the images and pick out people, other vehicles, and objects, helping the models better understand what they encountered on the road.
When that project wrapped, Mathenge was transferred to work on OpenAI’s models. And there, he encountered the disturbing texts. OpenAI told me it believed it was paying its Sama contractors $12.50 per hour, but Mathenge says he and his colleagues earned approximately $1 per hour, and sometimes less. Spending their days steeped in depictions of incest, bestiality, and other explicit scenes, the team began growing withdrawn.
“I can tell when my team is not doing well, I can tell when they’re not interested in reporting to work,” Mathenge told me. “My team was just sending signals that they’re not ready to engage with such wordings.”
Mophat Okinyi, a quality-assurance analyst on Mathenge’s team, is still dealing with the fallout. The repeated exposure to explicit text, he said, led to insomnia, anxiety, depression, and panic attacks. Okinyi’s wife saw him change, he said; not long after, she left him. “However much I feel good seeing ChatGPT become famous and being used by many people globally,” Okinyi said, “making it safe destroyed my family. It destroyed my mental health. As we speak, I’m still struggling with trauma.”
OpenAI knew these workers were supposed to get routine counseling, but Okinyi and Mathenge found it insufficient. “At some point, the counselor reported [to duty],” Mathenge said, “but you could tell he was not professional. He was not qualified, I’m sorry to say. Asking basic questions like ‘What is your name?’ and ‘How do you find your work?’”
In a statement to me, OpenAI said it takes the mental health of its employees and contractors very seriously. “One of the reasons we first engaged Sama was because of their commitment to good practices,” a spokesperson said. “Our previous understanding was that wellness programs and 1:1 counseling were offered, workers could opt out of any work without penalization, exposure to explicit content would have a limit, and sensitive information would be handled by workers who were specifically trained to do so.”
The OpenAI spokesperson said the company sought more information from Sama about its working conditions in early 2022. Sama, the spokesperson said, then informed OpenAI it was exiting the content-moderation space. Sama did not respond to a request for comment.
For Mathenge, the notion that he’d evaluate the tradeoffs before proceeding with this work sounded like a luxury. He was just happy to be employed as Kenya’s economy teetered amid the economic turbulence of the pandemic. “It is during the COVID season,” he said. “Getting work in a developing country, it’s a blessing in itself.”
Despite all this, Mathenge and his colleagues feel pride in the work they did. And it was indeed effective. Today, ChatGPT refuses to produce the explicit scenes the team helped weed out, and it issues warnings about potentially illegal sexual acts. “For me, and for us, we are very proud,” Mathenge said. They’re proud, but still hurting.
You can listen to Alex Kantrowitz’s full conversation with Richard Mathenge on this week’s episode of Big Technology Podcast.