The Genealogy Site That Helped Catch the Golden State Killer Is Grappling With Privacy

After a controversy, GEDmatch gave users more control over their DNA data. But it also took a turn down a slippery slope.

Since April 2018, when police announced they had apprehended Joseph DeAngelo, the man they alleged to be the long-elusive Golden State Killer, the floodgates have opened.

The key insight responsible for DeAngelo’s arrest came courtesy of a then-little-known forensic technique known as genetic genealogy: a method in which investigators try to link crime scene DNA to DNA from biological relatives in the hopes of generating leads for identifying suspects or remains. The science behind the technique has been around for a while. Yet the real potential to get hits in these searches has only been made possible by the recent advent of online, easily accessible DNA databases like GEDmatch (where police got a match for a distant relative of DeAngelo’s) and FamilyTreeDNA—sites that now boast more than 1 million user profiles each. Many of these come from individuals who uploaded their own genetic data from popular consumer DNA testing kits like 23andMe and AncestryDNA.

The Golden State Killer investigators were not the first to take advantage of this trove of data. But the high-profile arrest put the technique on the map. Since that time, law enforcement agencies across the country have used consumer genetic databases to initiate investigations in more than 50 criminal cases, nearly all of which involved homicides or rapes. And thus far, at least two major consumer genetics platforms, GEDmatch and FamilyTreeDNA, have made themselves available for such sleuthing.

But now, that may be changing in a big way.

Last week, GEDmatch—the database more widely used for these purposes—announced two significant changes to its policies governing law enforcement use of its database. First, GEDmatch changed its terms of service to require users who upload their DNA to explicitly opt in to allow their profiles to be used in law enforcement investigations. The move puts considerable new power in the hands of users (though not their genetic relatives), who now get to choose whether to assist or decline law enforcement efforts to use their personal genetic profiles to help identify, say, a third cousin they never met as a potential criminal suspect. Users previously had no choice, and often little awareness, related to this use. Many have chalked this up to a privacy win, and they should.

But the service also announced a second, more overlooked change, one that marks a serious expansion in law enforcement agencies’ ability to use the GEDmatch database to investigate a wider range of crimes. For those who have been watching the issue, the expansion was predictable. It reflects a history of how U.S. law enforcement has widened its access to and use of DNA data. It also foreshadows further expansions in the future—ones likely to include cases that the public at large may not support.

GEDmatch was the first consumer genetics database to cooperate publicly with law enforcement. Shortly after the arrest in the Golden State Killer case, GEDmatch updated its site policies to explicitly embrace law enforcement use of their data, at least for certain crimes.

Specifically, a May 2018 revision to its website’s terms of service added a new category of acceptable uploads. In addition to authorizing users to upload DNA data if it was their DNA, the DNA of someone deceased, the DNA of someone for whom a user was a legal guardian, or the DNA of someone a user had specific authorization from, the terms now stated that “DNA obtained and authorized by law enforcement to … identify a perpetrator of a violent crime against another individual” was also aboveboard. That policy defined “violent crime” to include only homicide and sexual assault. After being outed for cooperating with the FBI earlier this year, FamilyTreeDNA adopted a similar line for acceptable uses, stating that law enforcement can use its service only to “identify a perpetrator of homicide, sexual assault, or abduction.”

But critics saw that as a line that was sure to erode, sooner or later. As co-authors and I observed last year in an article in Science, it is difficult to hold the line that demarcates what police can and cannot do to investigate a crime. After all, law enforcement has powerful incentives to push the envelope when it comes to acceptable crime-solving methods, and it is often difficult for decision-makers to say “No” when faced with gruesome facts from a particular case.

Sure enough, GEDmatch could not hold the line. Earlier this month, BuzzFeed revealed that in late 2018, the operator of the platform granted an “exception” to his site’s user policies to permit law enforcement to investigate an aggravated assault using the GEDmatch database. The case represents the first time on record that forensic genetic genealogy has been used to identify the suspect in a case of violent assault. Parabon NanoLabs, the company working with law enforcement that was behind the search, told BuzzFeed, “We made an exception in this one case, but our policy is to remain within the terms of service.” But the fact remains that GEDmatch did not inform its users of this expanded use of their genetic information prior to authorizing it. Only after an arrest was announced did GEDmatch acknowledge that its operator had circumvented his site’s policies.

Less than a week later, GEDmatch, likely responding to public outcry, issued its new opt-in policy that gives users a choice when it comes to law enforcement matching. This shift to an opt-in policy for all users is highly significant, and it is a move for which GEDmatch should be applauded. It facilitates greater genetic privacy and delays the day on which the United States arrives at a de facto universal DNA database.

But this positive move should not overshadow GEDmatch’s other major change to its terms of service: expanding the definition of “violent crime” to now include “murder, nonnegligent manslaughter, aggravated rape, robbery, or aggravated assault.”

One need only consider the history of law enforcement use of DNA in this country to predict what the future will hold. Virginia established the first DNA database for law enforcement purposes in 1989. At the time, Virginia authorized DNA collection only from certain sex offenders and violent felons. But as with GEDmatch, that line gave way in short order. Just one year later, Virginia revised its own rules and expanded its database to include DNA from all felons.

That story repeated itself across jurisdictions in the United States. While many states initially restricted the collection and retention of genetic profiles in their DNA databases to just sex offenders—presumably because such crimes are uniquely reviled and were alleged to have higher rates of recidivism—today, nearly all states, as well as the federal government, mandate DNA sampling from all convicted felons, violent or not.

Many states have gone further. At least 16 require the collection and retention of DNA from some individuals convicted of mere misdemeanors. At least 32 include in the official law enforcement database the DNA of certain individuals who have merely been arrested for, and not yet convicted of, a felony. More than half of those authorize DNA collection from all felony arrestees, and some even permit collection from some people arrested for misdemeanor crimes. As all of these examples show, the history of law enforcement use of forensic DNA databases in the United States has been one of continued expansion, reaching individuals who fall well outside the category of violent felons.

Similar encroachments have also already bedeviled law enforcement searches of official forensic DNA databases to identify family members of known offenders as possible suspects—an earlier, cruder form of genetic genealogy. Some states have rules regarding when these searches can be done, including requirements that the investigation involve a certain type of serious crime and that traditional investigative techniques must first be exhausted. But these lines too have failed to meaningfully restrain law enforcement. For example, in 2009, authorities in Colorado used this familial searching technique to resolve a car break-in where the burglar “left a drop of blood on a passenger seat when he broke a car window and stole $1.40 in change.”

Given this history, it is likely that consumer DNA databases like GEDmatch will likewise continue to expand how they allow law enforcement to trawl their riches of genetic data. Whether these moves will provoke public backlash is another question, though.

Indeed, it is not clear that all of the newly authorized uses at GEDmatch could command broad public support even now. Although one recent study found that 80 percent of people surveyed said that they supported law enforcement use of consumer genetics data “to identify perpetrators of violent crimes,” far fewer supported such use when “nonviolent crimes” were at issue.

But the distinction between what constitutes a “violent crime” or “nonviolent crime” is murky at best. For example, GEDmatch now authorizes law enforcement to upload crime scene DNA from suspected robberies—a crime that may involve a truly violent and forceful altercation, but could also involve, for example, a “pickpocket who attempts to pull free after the victim catches his arm.” As the Marshall Project has documented, purse snatching, too, is “considered a ‘violent’ offense in several states,” as is “the manufacture of methamphetamines and theft of drugs.” And it is worth noting that in the public opinion survey above, crimes like “car theft” and “drug possession” were classified as nonviolent offenses.

The possibility of solving more crimes can be alluring. That makes GEDmatch’s adoption of a way for users to stay out of law enforcement searches noteworthy, as it may slow, rather than speed, law enforcement use of this personal data, and do so in the name of genetic privacy and control.

But these recent events also underscore the risks of putting so much decision-making power in the hands of a private entity, like GEDmatch. Police had to convince just a single person, Curtis Rogers, to approve their circumvention of GEDmatch’s then-existing terms of service to investigate an assault.

This is where state and local governments come in. They can, theoretically, be moved to protect and defend privacy interests and are expected to operate largely in public and publicly accountable ways. Though, as documented above, many of these governments and legislatures have repeatedly expanded the reach of forensic DNA databases in the past, that does not mean that they cannot be moved. States like California and Illinois have been at the forefront of efforts to enhance online and biometric privacy, for example. Just a few months ago, the Maryland House of Delegates introduced a bill that would ban law enforcement from using any DNA database to try to find biological relatives of a suspect who left unidentified crime scene DNA. Such measures are sorely needed, and soon. Without them, existing constraints on law enforcement use of genetic genealogy are only as durable as one person’s say-so—and users of consumer genetics platforms may only learn about important changes after the next scandal.

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.