How Algorithmic Bias Hurts People With Disabilities

Though a huge portion of the population lives with a disability, it comes in many different forms, making bias hard to detect, prove, and design around.

A hiring tool analyzes facial movements and tone of voice to assess job candidates’ video interviews. A study reports that Facebook’s algorithm automatically shows users job ads based on inferences about their gender and race. Facial recognition tools work less accurately on people with darker skin tones. As more instances of algorithmic bias hit the headlines, policymakers are starting to respond. But in this important conversation, a critical area is being overlooked: the impact on people with disabilities.

A huge portion of the population lives with a disability—including one in four adults in the U.S. But there are many different forms of disability, making bias hard to detect, prove, and design around.

In hiring, for example, new algorithm-driven tools identify characteristics shared by a company’s “successful” existing employees, then look for those traits when they evaluate new candidates. But because such a model treats traits that are underrepresented among current employees as undesirable and gives them less weight, people with disabilities, like other marginalized groups, risk being excluded as a matter of course.
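To see why underrepresented traits end up penalized, consider a deliberately simplified sketch. It is not how any particular vendor's product works: the trait names, the `learn_trait_weights` and `score` functions, and the four-person workforce are all hypothetical, chosen only to show the mechanism.

```python
from collections import Counter


def learn_trait_weights(successful_profiles):
    """Weight each trait by the share of 'successful' employees who have it."""
    counts = Counter(
        trait for profile in successful_profiles for trait in set(profile)
    )
    total = len(successful_profiles)
    return {trait: n / total for trait, n in counts.items()}


def score(candidate_traits, weights):
    """Average learned weight of a candidate's traits; unseen traits count as 0."""
    if not candidate_traits:
        return 0.0
    return sum(weights.get(t, 0.0) for t in candidate_traits) / len(candidate_traits)


# Hypothetical training data: only one current employee uses a screen reader.
employees = [
    {"steady eye contact", "fast typing"},
    {"steady eye contact", "fast typing"},
    {"steady eye contact", "fast typing"},
    {"uses screen reader", "fast typing"},
]

weights = learn_trait_weights(employees)
typical = score({"steady eye contact", "fast typing"}, weights)
atypical = score({"uses screen reader", "fast typing"}, weights)

print(f"typical candidate:  {typical:.3f}")
print(f"atypical candidate: {atypical:.3f}")
```

Nothing in this toy model was told to disfavor screen reader users. The lower score follows only from the trait being rare among the people the company already employs.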

One famous example of this arose when Amazon trained a resume screening algorithm by analyzing resumes the company had received over the past 10 years. Reflecting the disproportionately higher number of men who apply to Amazon, the algorithm learned to downgrade resumes that included terms such as “women’s” (as in “women’s chess club captain”) or reflected a degree from a women’s college. Amazon tweaked the tool, but the lesson is clear: If an algorithm’s training data lacks diversity, it can entrench existing patterns of exclusion in deeply harmful ways.

This problem is particularly complex when it comes to disability. Despite the large number of people living with disabilities, the population is made up of many statistically small sets of people whose disabilities manifest in different ways. When a hiring algorithm studies candidates’ facial movements during a video interview, or their performance in an online game, a blind person may experience different barriers than a person with a mobility impairment or a cognitive disability. If Amazon’s algorithm training sample was light on women, it’s hard to imagine it effectively representing the full diversity of disability experiences. This is especially true given that disabled people have long faced exclusion from the workforce because of entrenched structural barriers, ableist stereotypes and other factors.

Despite these challenges, some vendors of AI hiring tools are marketing their products as “audited for bias,” seeking to reassure employers in the face of mounting concerns. But many of these companies rely on outdated guidelines that the Equal Employment Opportunity Commission published in the 1970s. These guidelines spell out the so-called “four-fifths rule,” which checks whether a hiring test selects a protected group at a substantially lower rate than its majority counterpart (for example, if female candidates pass at less than 80 percent of the rate for men, or black candidates at less than 80 percent of the rate for their white peers). If an audit finds that an algorithm fails the four-fifths rule, the vendor will tweak it for better parity and present the tool as “bias audited.” But this simplistic approach ignores the diversity of our society. Most critically, it looks only at simplified (and U.S.-defined) categories of gender, race and ethnicity.
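The arithmetic behind the four-fifths rule is simple, which is part of its appeal to auditors. The following is a minimal sketch of the check as described above, not any vendor's actual audit code; the function names and the applicant counts are invented for illustration.

```python
def selection_rate(selected: int, applicants: int) -> float:
    """Share of a group's applicants who pass the hiring test."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return selected / applicants


def impact_ratio(group_rate: float, reference_rate: float) -> float:
    """A group's selection rate as a fraction of the reference group's rate."""
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return group_rate / reference_rate


def passes_four_fifths(group_rate: float, reference_rate: float,
                       threshold: float = 0.8) -> bool:
    """True when the group's rate is at least four-fifths of the reference rate."""
    return impact_ratio(group_rate, reference_rate) >= threshold


# Hypothetical counts: 50 of 100 men pass, 35 of 100 women pass.
men_rate = selection_rate(50, 100)
women_rate = selection_rate(35, 100)

ratio = impact_ratio(women_rate, men_rate)
flagged = not passes_four_fifths(women_rate, men_rate)

print(f"impact ratio: {ratio:.2f}, flagged for adverse impact: {flagged}")
```

In this made-up case the ratio is about 0.70, below the 0.80 threshold, so the test would be flagged. Note what the check requires: a predefined group, and enough applicants in that group for the rate to mean anything.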

The flaw in this approach becomes clear, again, if you consider disability. The diverse forms of disability make it virtually impossible to detect adverse impact with the type of auditing that companies currently use: to show with statistical significance, for example, that people whose autism presents in a particular way are faring less well on the test than other applicants. There simply are not enough data points to see how different autistic candidates are being impacted, especially since many people choose not to disclose their disability status. While some have called to fix this data problem by collecting more detailed information about job candidates’ disabilities, further collection raises its own distinct and very real concerns about privacy and discrimination.
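A rough calculation shows how quickly small numbers defeat a statistical audit. The sketch below uses a one-sided Fisher exact test, written out by hand with Python's standard library. The choice of test, the `fisher_one_sided_p` helper, and the applicant counts are assumptions made for illustration rather than a description of how real audits are run.

```python
from math import comb


def fisher_one_sided_p(sub_pass: int, sub_total: int,
                       other_pass: int, other_total: int) -> float:
    """Probability that the subgroup passes this rarely or more rarely,
    assuming passing is unrelated to group membership (hypergeometric)."""
    total = sub_total + other_total
    total_pass = sub_pass + other_pass
    denom = comb(total, sub_total)
    p = 0.0
    for k in range(sub_pass + 1):
        # Ways to place k passers and (sub_total - k) non-passers in the subgroup.
        p += comb(total_pass, k) * comb(total - total_pass, sub_total - k) / denom
    return p


# Hypothetical audit: 100 of 200 other applicants pass the test,
# while none of the 4 applicants in a small subgroup do.
p_four = fisher_one_sided_p(sub_pass=0, sub_total=4,
                            other_pass=100, other_total=200)

# The same total exclusion, but with 5 applicants in the subgroup.
p_five = fisher_one_sided_p(sub_pass=0, sub_total=5,
                            other_pass=100, other_total=200)

print(f"0 of 4 pass: p = {p_four:.3f}")
print(f"0 of 5 pass: p = {p_five:.3f}")
```

Under these assumptions, a test that rejects every one of four applicants from a subgroup, while passing half of everyone else, still does not clear the conventional 0.05 significance threshold. The signal only becomes detectable once more such applicants have applied, identified themselves, and been turned away.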

These problems exist for others, too: people who have marginalized sexual orientations or nonbinary gender identities, those who fall outside U.S. definitions of race and ethnicity, and people who are members of multiple, intersecting marginalized communities. At Amazon, leadership readily noticed and acted upon the lack of women successfully passing their resume screening. We must ask whether a tool that was downgrading trans women of color, or not recognizing the nation’s leading college for deaf students as a prestigious institution of higher education, would have been detected nearly as quickly, if at all.

So, how to think of the path ahead? One lesson is that we cannot rely on simplistic promises of statistical auditing to solve algorithmic bias. Employers and their vendors need to dig deeper to assess the real-world effects of any algorithmic tool before it is deployed. Instead of looking only at how particular groups fare in a selection process (which wrongly assumes sufficient data are available), employers must interrogate the actual variables being considered and weighed in the algorithms themselves.

In doing this, employers must ask certain core questions. Does a hiring test evaluate factors that truly relate to the job in question? Does it assess candidates on their individual merits, rather than inferences about disability? Was the test designed and reviewed by people with diverse lived experiences, to identify potential barriers? Are candidates told enough about the methodology to know if they should seek an accommodation in testing—and is a meaningful accommodation available, and given equal weight? Many of these obligations are already established in our employment laws and the Americans With Disabilities Act. Employers and vendors must apply them—including asking themselves the central question of whether they should use the test at all.

Policymakers need to take notice, too. The EEOC should update its outdated guidance on employee selection tools (the genesis of the “four-fifths rule”), and lawmakers must engage in the meaningful work of strengthening our civil rights laws to respond to technological change. At a broader level, it’s time for the growing community working on algorithmic fairness to place greater emphasis on the perspectives of disabled people, and on those of other marginalized communities whose heterogeneity raises distinct challenges in the effort to detect and address bias.

Elizabeth Warren recently took the major step of including algorithmic fairness in her platform on disability rights, and experts are starting to engage in more rigorous ways. But as the use of algorithmic tools expands, we need to consider the impact on all individuals, and to truly grapple with the real-world effects of these technologies.

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.