In December, the University of Texas at Austin’s computer science department announced that it would stop using a machine-learning system to evaluate applicants for its Ph.D. program due to concerns that encoded bias may exacerbate existing inequities in the program and in the field in general. This move toward more inclusive admissions practices is a rare (and welcome) exception to a worrying trend in education: Colleges, standardized test providers, consulting companies, and other educational service providers are increasingly adopting predatory, discriminatory, and outright exclusionary student data practices.
Student data has long been used as a college recruiting and admissions tool. In 1972, College Board, the company that owns the PSAT, the SAT, and the AP Exams, created its Student Search Service and began licensing student names and data profiles to colleges (hence the college catalogs that fill the mailboxes of high school students who have taken the exams). Today, College Board licenses millions of student data profiles every year for 47 cents per examinee. The data is collected through a Student Data Questionnaire that is administered when a student registers to take one of the College Board exams. ACT Inc., College Board’s main competitor, also collects and sells millions of student data profiles through its Educational Opportunity Service, or EOS, Program. Students respond to an Interest Inventory and a 59-question Student Profile Questionnaire prior to taking the ACT exam, and their responses are used to form the profiles that are sold to colleges. Until 2018, student disability status was included in the data profiles for those students who chose to complete the questions in the ACT’s optional student survey.
These student profiles, which include information such as ethnicity, GPA, and intended college major, can be used on their own or combined with other student data by third-party consultants, potentially creating hidden barriers for underprivileged students. For example, data tools used by private consulting companies allow colleges to build detailed profiles on prospective students that include test scores, student profiles purchased from College Board or ACT, online browsing histories, household income, ethnic and racial information, and other data that allows colleges to determine what they believe a prospective student may financially contribute to the college over time. As many colleges have seen their target populations and sources of income shrink, they are increasingly using these detailed student profiles in their admissions decisions.
While many of these practices started before 2020, the COVID-19 pandemic has driven many educational institutions to turn to more data-driven or algorithmic solutions, and the resulting unintended consequences may be more likely to harm students than help them. Because students could not complete their May 2020 exams due to COVID-19, the International Baccalaureate Organization decided to use a predictive algorithm to determine students’ final grades and award diplomas, which are consequential in scholarship and college admissions decisions. The algorithm’s predictions substantially deviated from students’ predicted grades, leading to global protests and an online petition asking for the IBO to take a different and fairer approach to grading. The College Board and ACT were forced to cancel or postpone their respective standardized tests, which pushed some colleges to adopt test-optional admissions policies, a change that advocates have urged for decades. For some colleges, this shift meant that other admission materials like application essays and recommendation letters would hold more weight, a good thing for students who may not have been able to afford pricey test prep.
Other colleges used the shift as an opportunity to identify new data-driven alternatives like tracking prospective student engagement through social media or using econometric modeling to predict the financial needs of applicants to determine the type of incoming class they can afford to admit. As long as students know how their data can affect their chances of admission and the financial aid packages they are likely to receive, these data-driven methods can be beneficial to students. However, if the policies lack transparency or if data is used in an aggregate way (that is, to shape an incoming class or to meet overall enrollment goals rather than as a tool to evaluate individual candidates), these data-driven alternatives will leave students with little control over the process.
In addition, these alternatives may be more likely to exclude underrepresented students and students with greater needs, thus creating new and less obvious barriers to higher education. For example, the methods used to track student engagement are likely to favor students with the resources to visit the colleges they are applying to and the access to private college counselors who advise them to engage with colleges through social media and email. We’ve seen it happen in other sectors. Emerging research on the use of predictive analytics and other data-driven technologies in sensitive social sectors like policing, hiring, credit, and child welfare has demonstrated that the efficiency and cost savings these technologies are marketed on come with significant societal costs. They perpetuate racially discriminatory practices and policies, they exclude and reinforce stereotypes about underrepresented groups, and they do not produce the results they are often adopted to achieve.
Algorithms include the conscious and unconscious bias of their human creators, and they often reflect societal biases and inequities. So the potential benefits they produce in research labs are typically not borne out in reality. Algorithms cannot correct the racial, gender, and disability discrimination of the past and should not be used to reverse the progress we are making toward equality and accessibility in higher education.