Welcome to Source Notes, a Future Tense column about the internet’s knowledge ecosystem.
The banner posted across Scots Wikipedia bears an important notice: “Followin recent revelations, Scots Wikipedia is presently reviewin its airticles for muckle leid inaccuracies.” In addition to that general warning, hundreds of articles also display this more specific disclaimer: “The ‘Scots’ that wis uised in this airticle wis written bi a body that’s mither tongue isna Scots. Please impruive this airticle gin ye can.” The flagged articles cover topics ranging from the COVID-19 pandemic to Kamala Harris to the Sermon on the Munt.
In August, someone going by the handle Ultach posted threads on 4chan and Reddit revealing that an American teenager who does not speak Scots was responsible for nearly half of the articles on Scots Wikipedia. That teenager is the 19-year-old North Carolinian behind the username AmaryllisGardner, whose alias is often shortened to AG. Ultach wrote in his viral posts that Scots Wikipedia was “legendarily bad” in part because AG did not understand Scots grammar or vocabulary.
Turns out, AG had misused common elements of the Scots language like syne and an aw. Many of AG’s articles did not use proper Scots grammar; instead, he had seemingly inserted Scots words at random into ordinary English sentences. Take AG’s original article for “Veelage,” which stated, “A veelage is a clustered human settlement or commonty, larger than a hamlet but smawer nor a toun.” This page has since been fixed with proper Scots and now states that a veelage is “muckler nor a clachan but no as muckle nor a toun.” Ultach concluded his Reddit post by writing that AG had “engaged in cultural vandalism on a hitherto unprecedented scale.”
The intention behind Scots Wikipedia is to produce an encyclopedia written in the Scots language, sometimes referred to as Lowland Scots and not to be confused with Scottish Gaelic, which falls within the Celtic language family. The Scots language is sometimes characterized as a sister language of Modern English, and sometimes classified as a dialect. Either way, the shared ancestry means that English speakers can often grasp the gist of what’s being communicated in Scots. A classic example is the poetry of Robert Burns, who wrote “The best laid schemes o’ mice an’ men/ Gang aft agley.” At present, there are nearly 58,000 articles on Scots Wikipedia compared with the more than 6 million articles on English Wikipedia. In relation to other language editions, Scots Wikipedia is often classified as a small wiki.
But the public response to Ultach’s revelations about AG has been anything but small or minkie. British news outlets have been highly critical, running headlines and verdicts such as “hijacking of the Scots language” (the Spectator), “Wikipedia boy butchers Scots language” (the Times), and “Shock an aw” (the Guardian). In the midst of the fallout, it’s worth considering why the news about Scots Wikipedia has struck such a cultural nerve and how open knowledge projects like Wikipedia can potentially self-correct.
Actual Scots speakers say their passion for the language stems from their lived experience. Cobra! is a 26-year-old based in North Lanarkshire, Scotland, who has actively edited Scots Wikipedia for about five years. “For a lot of my life, like many natives, I’ve had to suppress my Scots because of its stigma,” Cobra! told me on Discord. Growing up, he heard Scots described as “dirty talk.” Later in life, he made a deliberate effort to rediscover his language. Cobra! said he became interested in improving Scots Wikipedia largely because he was aware it wasn’t the best representation of the language and that it was sometimes used by Redditors as proof that Scots wasn’t a language at all.
Whether or not the Scots Wikipedia project serves to delegitimize the Scots language has been a long-term concern. My colleague Jane C. Hu wrote for Slate in 2014 that the project is “no joke,” although she admitted that at first glance it read like a transcription of a person with a Scottish accent. Historically there has been some debate about whether Scots is its own separate language, a dialect of English, or slang. Yet Scots is formally recognized as one of Scotland’s three official languages, together with English and Scottish Gaelic.
Ryan Dempsey, aka Ultach, told me in an email that he was motivated to go public with the AG story because he was frustrated with how the language was perceived. He’s from a region of Northern Ireland that speaks Ulster Scots, and “it’s always been very annoying to see [Scots] maligned as ‘just English with an accent’ by people who aren’t familiar with it,” he wrote. After discovering that AG did not actually speak Scots, Dempsey said he felt that he had “cracked the code” on why Scots Wikipedia had such a bad reputation. Dempsey told me he initially suspected that his posts on 4chan and Reddit would fizzle out instead of going viral.
But Dempsey also noted that AG was by no means the only culprit. In fact, there were dozens of people writing on the encyclopedia in mangled English that they passed off as Scots. The prospect of so many Wikipedia articles being written in phony Scots is especially scary because, as Nicolás Rivero reported for Quartz, Wikipedia pages are often used as training data to teach A.I. systems the language. Garbage Scots in, garbage Scots out.
Among the many offenders, AG stood out for being the most prolific. When I reached out to AG himself, he described his history with the Scots Wikipedia project. “In 2013, 12-year-old me was eager to contribute to any Wikimedia project in any way I could,” AG wrote in an email. After writing many articles in English, he turned his attention to the relative lack of content in foreign language wikis. Scots Wikipedia seemed like a good fit because the “bare content was mutually intelligible to English.”
Despite his well-meaning enthusiasm, AG admitted to making many mistakes. For example, AG used the Scots Online Dictionary to look up specific words in Scots. But the content that AG added to Wikipedia was not a true translation because he did not fully understand Scots grammar and syntax. “Many people wonder how it’s possible for someone to rack up so many edits on any site as I have on Wikipedia,” AG wrote in an email. “My best explanation is naïve passion and (clinically diagnosed) OCD.” In a statement to the overall Wikimedia community about his conduct, AG said that he was devastated “after years of my thinking I was doing good.”
Since the revelations about AG, the Scots Wikipedia community has considered several proposals for how to move forward. There was the “Nuke and start over” option, which stemmed from the argument that Scots Wikipedia has done more harm than good for the language. Another proposal involved paying Scots linguists as “auditors” to review the project. This was dismissed because it defied the Wikipedia ethos that the encyclopedia projects are owned and managed by volunteers. Still other proposals involved the “mass rollback” of all articles with content that had been added by AG. But this mass rollback could introduce entirely new problems, since other contributors have added content on top of AG’s edits. In some ways, the strategy of removing AG’s content is like a highly precarious move in Jenga: Extracting his work could bring the entire structure down.
For now, the community is using a wide variety of technical tools to address the issues, including posting notices on top of AG’s articles. The Wikipedia editor James Salsman also used a Scottish government word list to identify about 1,000 articles that have English words that should not appear in Scots. According to Salsman’s tool, the Scots article on Rafael Nadal, for example, has 93 non-Scots English words, including stay, which, and friend.
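A word-list check of this kind can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Salsman’s actual tool: the two tiny word sets, the tokenizer, and the function names `flag_non_scots` and `rank_articles` are all hypothetical, and a real check would load a full Scots word list and a full English dictionary. It also counts distinct flagged words, which may not be how the real tool counts.

```python
import re

# Hypothetical stand-ins for a full Scots word list and an English dictionary.
SCOTS_WORDS = {"a", "is", "veelage", "muckler", "nor", "clachan",
               "but", "no", "as", "muckle", "toun"}
ENGLISH_WORDS = {"a", "is", "but", "no", "as", "larger", "than",
                 "hamlet", "which", "stay", "friend"}


def flag_non_scots(text, scots=SCOTS_WORDS, english=ENGLISH_WORDS):
    """Return the sorted distinct English words in text that are not on the Scots list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Words shared by both languages (such as "a" or "but") are not flagged.
    return sorted({t for t in tokens if t in english and t not in scots})


def rank_articles(articles):
    """Return (title, flagged-word count) pairs, most suspicious article first."""
    counts = {title: len(flag_non_scots(body)) for title, body in articles.items()}
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
```

Run on the two versions of the “Veelage” definition quoted earlier, a check like this would flag words such as larger, than, and hamlet in the original wording and nothing in the corrected one. It can only point human editors toward suspect pages; it says nothing about grammar or syntax, which is where many of the problems lay.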
In addition to providing technical tools, volunteers have been generous with their time. Soon after news of AG spread, Cobra!, the 26-year-old Scotsman, organized “E Scots Leed Editathon.” Throughout the virtual event, volunteers made about 3,000 edits, created new pages, and fixed old ones. The Scots Language Centre based in Perth was involved in both publicizing the event and inviting group members to attend. But it’s worth noting that the Scots Wikipedia Editathon had only about 30 attendees—compared with the 18,700 upvotes on the original Reddit post about AG. With Wikipedia, as in life, there is more outrage than productive engagement.
In Dempsey’s opinion, the situation with Scots Wikipedia is highly specific to the Scots language experience. “Being a sister-language to [one of] the most widely spoken language[s] in the world puts it in an awkward position: You’re always going to get some people who think they can speak a language just because they already speak a language related to it,” Dempsey wrote. Then again, Wikipedia editors are understandably concerned with ensuring that a situation like this does not recur or, at the very least, that it’s discovered more quickly than the seven years it took to interrupt AG. One current proposal is the small wiki audit, an initiative to regularly check in on the content of smaller wiki projects. Implementing a routine audit process can perhaps provide an extra degree of governance, which matters for smaller wikis since they generally have fewer contributors to monitor one another’s work.
Perhaps the Scots language will benefit from helpful tailwinds. “I think Scots speakers have found an amazing environment to thrive in and finally find their voice on the internet,” wrote Joanna Kopaczyk, a senior lecturer in Scots and English at the University of Glasgow. Kopaczyk cited ongoing efforts to pass a Scots Language Act and normalize the use of Scots in professional, educational, and broadcasting contexts, among others.
If there is any reason to think the situation with Scots Wikipedia will improve over time, it might simply be that Wikipedia editors themselves are quite industrious—and, relatively speaking, more forgiving. When I connected with him by email, AG described how he had been the target of a torrent of online harassment ever since the Scots Wikipedia story broke, particularly on non-Wikipedia platforms. “I can say that in contrast to the negativity of Reddit and 4chan, [the] welcoming atmosphere of the Wikimedia and Scots language community has warmed my heart,” AG said in an email. “They’ve told me that I’m welcome to contribute to the wiki in the future (just not in the way I contributed before without learning properly).”
Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.