Future Tense

Online Test Proctoring Claims to Prevent Cheating. But at What Cost?

Laptop with a "Begin Test" button on the screen and one corner folded back to reveal two dark, creepy eyes watching
Photo illustration by Slate. Photos by kasinv/iStock/Getty Images Plus.

When Madi Mollico signed up to take the GRE in May, she knew she’d have to do it online. With all testing locations shut down to comply with COVID-19 policies, the Educational Testing Service, the company that administers the GRE, digitized its exams—and as part of the process, Mollico downloaded software from ProctorU, which assigned a proctor for her exam. He watched as she gave a video tour of her test space: no posters on the wall, nothing on her desk, and appropriate lighting. “They can see you, but you can’t see them, which I didn’t feel good about,” Mollico told me in June, when I reported on the GRE’s online format for Science. To make matters worse, her proctor kept calling her “sweetheart.” “I thought it was a little bit condescending.”

While some aspects of the pandemic-era classroom translate just fine to a digital format, exams have become more complicated. Typically, students take the SAT, the GRE, or any number of midterms or finals in classrooms with proctors standing at the front of the room. But with students at home, some instructors have turned to proctoring software to ensure students aren’t using unauthorized notes, textbooks, or other tools to aid their test taking.

There are several brands of proctoring software, and they take different approaches. Some record students’ screens and prevent them from opening certain applications, like a web browser, or from taking screenshots of tests. Other software actually watches students: ProctorU, the service Mollico used while taking the GRE, employs live proctors to monitor students in real time as they take a test, while Proctorio records video of students and then uses A.I. to analyze videos to determine whether students are doing anything “suspicious.” Unsurprisingly, students are not thrilled about being surveilled, and many have reported their frustrating experiences with software incompatibility issues or malfunctions. Others have uncovered the troubling biases of algorithms that scan students’ videos for potential cheating behaviors: Darker-skinned students report that some apps ask them to improve their lighting, and one student told the New York Times that the software flagged her involuntary mouth movements. Airing grievances about proctoring software is a whole genre of student tweets; a Twitter account started in August is dedicated to “retweeting evidence of emotional harm” from what it calls surveillance software.

In response to concerns about bias and surveillance, some schools, like McGill and the University of California, Berkeley, have banned the use of “technology-enabled invigilation” entirely, but hundreds of institutions still use these services. Judging by my interviews with instructors, it appears that many can ultimately decide whether to use software provided by their schools. Given the outcry from students, how are instructors thinking about their approach? And while these apps claim to prevent cheating and promote fairness, what do we know about their true effect on students’ learning?

Helaine Alessio, the chair of the kinesiology, nutrition, and health department at Miami University in Ohio, published a study about proctoring software in 2018—and it all began with the suspicion that her students were cheating. There was a “larger than expected number of As” on an exam in an online course she co-taught, she says, and she and her colleagues were alarmed by it. They then decided to implement proctoring software in four of the course’s nine sections and compared 147 students’ exam results across four exams. (Three sections used Software Secure, which records students and uses proctors to review the footage, and one used Respondus Monitor, which “locks down” test-takers’ web browsers and records students.) On average, students using proctoring software scored 17 points (out of 100) lower than students who did not use the software. That, Alessio says, was a strong indication that unproctored students could be cheating.

Surely, some students do cheat—and in the case of Alessio and her colleagues’ course, it’s certainly possible that proctoring software deterred or prevented some students from cheating. But there are other explanations for that drop in grades, too. For instance, a 2019 study by Metropolitan State University instructor Daniel Woldeab and University of Minnesota psychologist Thomas Brothen found that students who rated more highly on anxiety scored worse when using proctoring software than when taking an online test without proctoring. “We have identified an issue that seems to have escaped the attention of researchers studying online learning—test anxiety,” they write. (When I asked Alessio whether her work addressed the possibility that proctoring itself could affect scores, she said it’d make for an interesting study.)

The mere threat of being flagged can be anxiety-provoking for students. One recent viral TikTok showed a student crying after she got a zero on an exam because proctoring software flagged her reading a question to herself as “suspicious.” According to Insider, the student was using ProctorU, which analyzes video using A.I. and scans for certain suspicious behaviors: whether someone else enters the room during the exam, as well as the student’s head or eye movements. (Other popular programs, like Proctorio and ProctorTrack, also use A.I. to analyze students’ behavior.)* Looking up or moving your eyes toward one direction could be an indication that a test-taker has notes just off screen—but according to one 2018 study, it could also simply mean the test-taker is anxious. Study co-author Tammi Kolski, who is an adjunct faculty member at three U.S. universities, says that instructors should approach students to have “fair communications” about what software perceives as “suspicious.” Kolski says that in her courses, she tries to maintain open lines with students to build trust and to talk about test anxiety in advance. “I don’t think we should assume a student is going to [cheat], but I think we need to be realistic that there are students that do,” she says. “If you’re not a cheater, this shouldn’t bother you.”

But that’s the thing: While instructors may not intend to “bother” students who aren’t cheating, proctoring can still have a negative impact. Even if students may not consciously feel anxious or nervous, being proctored can take up valuable brain real estate that might otherwise be spent concentrating on the exam, says Joshua Eyler, director of faculty development at the University of Mississippi and author of How Humans Learn: The Science and Stories Behind Effective College Teaching. “In the best of times, students are wrestling with cognitive load: how much space we have in our brains available to manage the learning task we are given,” he says. “But we’re not in the best of times. We’re in pandemic times.” Learning how to navigate new software, wondering whether it’s working correctly, worrying your roommate might accidentally walk into the frame and trigger the software to flag you as cheating, managing your eye movements to avoid seeming “suspicious”—all that requires mental space. “It’s like a student sitting in a classroom and someone periodically screaming at them, ‘DON’T CHEAT,’ ” says Eyler. It would make anyone jumpy. And though flags from this software don’t automatically mean students will be penalized—instructors can review the software’s suspicions and decide for themselves how to proceed—it leaves open the possibility that instructors’ own biases will determine whether to bring academic dishonesty charges against students. Even just an accusation could negatively affect a student’s academic record, or at the very least how their instructor perceives them and their subsequent work.

A system set up to penalize students suspected of cheating also doesn’t address the root of the problem: why students cheat in the first place. “Students don’t cheat because they’re twisting their mustaches and trying to figure out how to cheat the system—they cheat mostly because they’re overwhelmed and under-resourced and undersupported,” says Shea Swauger, a senior instructor at the University of Colorado Denver’s Auraria Library. If a lot is riding on a single test, or a grade in a course determines whether you can keep your scholarship or get into a major, then perhaps you might see more students cheating. Alessio says that in their analysis of scores, they found that the students who performed the most differently while being proctored were more likely to be facing higher stakes, like those who were in majors that needed a GPA over 3.5. “We didn’t directly say they cheated, but the implication was very strong,” she says. For Alessio, this is evidence proctoring is necessary. But other educators see this differently: The fact that students feel the need to cheat is evidence of larger issues in education, says Swauger, and educators need to find better ways to support those students rather than penalizing them.

The incentives certainly are not in favor of underpaid and overcommitted instructors spending time to do this. Even so, many educators are committed to avoiding proctoring software, even if that means more work on their end. Dorothy Christopher, a postdoctoral fellow teaching an intro plant biology course at the University of Wisconsin-Milwaukee, says she’s set a time limit for exams so students can’t look up every answer easily, but refuses to use a proctoring service. “I have better things to do than police students’ eye movements,” she wrote to me in an email. “Probably some of my students do cheat, but preventing a small amount of cheating is not worth the cost of assuming my students are all criminals.”

Katherine Wolsiefer, a psychology assistant professor at Plymouth State University (and a friend of mine from high school), says she’s also been reluctant to use proctoring software. “We have many students from difficult backgrounds who may feel especially self-conscious about a professor seeing their surroundings, and the basic idea is that such extreme surveillance does not build a community of trust in which the student would feel comfortable reaching out to a professor for help,” she says. Like Christopher, she knows cheating is a definite possibility, but has found less invasive ways to discourage it: She randomizes the order of questions and responses on exams so that students trying to share answers would have a harder time coordinating answers, and her exams are all open book and open note. That all takes time and effort, but designing exams in this way seems to take about the same amount of time as it would for an instructor to review “suspicious” behavior flagged by proctoring software.

When I asked Alessio about some educators’ suggestions to use alternative methods of assessment, she said her course does use multiple assessment tools, like discussion boards, case studies, and other assignments, but proctored, multiple-choice tests are necessary to prepare students to take other multiple-choice assessments they may encounter in the course of their education, like standardized tests for graduate school or board certifications. “If we never gave our students the type of test [that] they will definitely see in their profession in order to become a certified you-name-it, I think that would be doing them a disservice,” says Alessio. And proctoring, she says, is a way to “even the playing field.”

While Alessio sees proctoring software as a way to enforce fairness by distinguishing between cheating students and honest ones, Swauger sees an age-old “us versus them” mentality that has always been pervasive in higher education. “Anytime people from a non-dominant group seek to participate in education, predictable counter arguments emerge that rest on the belief that their inclusion would harm current students, academic standards, productivity, etc.,” Swauger writes in a chapter included in Critical Digital Pedagogy. One prominent example of this is how the SAT was created by eugenicist Carl Brigham as a way to “prove” the superiority of the white race and to exclude Black students. “In general, higher ed has never really shaken this exclusionary stance or posture. It just tries to mask it in different ways,” says Swauger. Flagging “suspicious” or “unworthy” students is a new type of exclusion, which, by algorithmic design, flags students who deviate from the “average”—students with darker-than-average skin or students who move their head or eyes around more than average. “Literally anyone on the margins is suspicious,” says Swauger.

Swauger also vehemently disagreed with Alessio’s view of an educator’s role. While it’s true that students may need to take high-stakes proctored exams later on, he says, merely taking other multiple-choice exams won’t necessarily prepare students—and research has shown there are other ways to help do that, like ensuring students gain a deep overall understanding of what they’ll be tested on. “The project of education is different than testing well,” he says.

Correction, Oct. 26, 2020: This piece originally misstated that it was not clear which proctoring service was used by a crying student in a viral video. According to Insider, the student was using ProctorU.

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.