A majority of Americans could be genetically traced by the FBI using consumer genealogy databases and pinpointing a distant family member’s DNA, researchers say, greatly expanding investigators’ ability to identify members of the public suspected of crimes.
With access to two large consumer genealogy libraries, federal investigators now have the potential to find a DNA match as close as a third cousin to more than half of Americans, and the growing popularity of at-home testing kits means it will only get easier with time.
More than 15 million people have purchased direct-to-consumer genetic testing kits that reveal the makeup of their ancestry and disclose distant relatives, but the databases built from those booming sales have also become a valuable resource for law enforcement.
The services have raised significant questions over privacy, but the debate took on new significance last week after BuzzFeed News reported that Family Tree DNA, one of the largest private genetic testing firms, has allowed the FBI to search its genealogy database, in essence doubling the number of profiles authorities can try to match with crime scene DNA.
Using public genealogy databases like GEDmatch, law enforcement has already begun to harness genetic data to search for distant relatives of suspected serial killers and rapists, aiming to crack cases that have gone cold for decades. According to a leading researcher in the field, in just a few years most people in the US might be linked in the databases to a relative as close as a second cousin.
"It's getting close to the point that almost everyone with European heritage will have a third cousin in these databases," said Yaniv Erlich, chief science officer at MyHeritage, one of the largest consumer DNA testing companies, and an assistant professor of computational biology at Columbia University.
In a study published last year, Erlich — who was dubbed "the genome hacker" by Nature magazine — and fellow researchers found that the popularity of at-home DNA testing has meant many Americans can now be identified by genetic material uploaded by relatives to genealogy databases.
Erlich and his colleagues analyzed a database containing more than 1.28 million profiles and calculated that about half of Americans could be linked to a relative at least as close as a third cousin.
Today, the FBI is able to search a combined database nearly twice that size with access to the publicly accessible GEDmatch library, containing 1.2 million profiles, and Family Tree DNA, with more than 1 million others.
People of European ancestry are more likely to be identified through the data, Erlich said, since the home testing kits have so far been more popular among whites than minorities in the US. Those with mostly Northern European ancestry, for example, are 30% more likely than those with mostly sub-Saharan ancestry to find a third cousin, according to the study.
A third cousin, a relative who shares a great-great-grandparent, might seem like a distant family member, but the identity of a person could be found relatively easily with some additional sleuthing.
With a database encompassing 7 million people, or about 2% of the US population, Erlich said authorities would eventually be able to link nearly every American of European descent to a second cousin, according to the study.
The suspected Golden State Killer, whose arrest in April 2018 sparked the interest of law enforcement agencies in investigative genealogy, was identified after authorities linked him to a third cousin. Once that familial link was found, investigators created a family tree, traced the lineage of the DNA match back to a common ancestor, then followed it down the generations to a possible suspect.
It's a long, arduous, and detailed process that, in the case of the Golden State Killer, ultimately led to Joseph James DeAngelo after police collected DNA from the driver's side handle of his car and matched it with that obtained from crime scenes.
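The tracing step described above, going up from the DNA match to a shared ancestor and then back down to people in the suspect's generation, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the pedigree is a made-up toy family, the names are hypothetical, and real investigative genealogy relies on public records, partial trees, and a great deal of manual work rather than a single lookup.

```python
# Hypothetical toy pedigree (all names invented): child -> list of known parents.
PARENTS = {
    "match": ["m_parent"],
    "m_parent": ["m_grand"],
    "m_grand": ["m_great"],
    "m_great": ["ancestor"],   # "ancestor" is the shared great-great-grandparent
    "s_great": ["ancestor"],
    "s_grand": ["s_great"],
    "s_parent": ["s_grand"],
    "suspect": ["s_parent"],
    "other": ["s_parent"],
}


def children_index(parents):
    """Invert the child -> parents map into parent -> set of children."""
    kids = {}
    for child, plist in parents.items():
        for parent in plist:
            kids.setdefault(parent, set()).add(child)
    return kids


def ancestors_at(person, generations):
    """People exactly `generations` steps above `person` in the tree."""
    level = {person}
    for _ in range(generations):
        level = {p for child in level for p in PARENTS.get(child, [])}
    return level


def descendants_at(people, generations, kids):
    """People exactly `generations` steps below anyone in `people`."""
    level = set(people)
    for _ in range(generations):
        level = {c for parent in level for c in kids.get(parent, ())}
    return level


def cousin_candidates(match, degree):
    """Candidates who are `degree`-th cousins of `match` in the toy tree.

    An n-th cousin shares an ancestor n + 1 generations up, so a third
    cousin (degree=3) shares a great-great-grandparent.
    """
    kids = children_index(PARENTS)
    shared = ancestors_at(match, degree + 1)
    same_generation = descendants_at(shared, degree + 1, kids)
    # Drop the match and anyone who shares a closer ancestor with the match.
    closer = descendants_at(ancestors_at(match, degree), degree, kids)
    return sorted(same_generation - closer)


print(cousin_candidates("match", 3))  # ['other', 'suspect']
```

Even in this toy tree, the search returns more than one candidate, which is why investigators still need additional evidence, such as a direct DNA sample, to settle on a single person.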
News last week of Family Tree's coordination with the FBI immediately raised concerns among several genealogists and privacy advocates, particularly because the company had started allowing FBI access to its database in the fall of 2018 without first notifying its customers.
That pushback prompted the company's president, Bennett Greenspan, to issue an apology "for not having handled our communication with you as we should have," while defending the company's work with the FBI.
Still, concerns remain about law enforcement's use of DNA data from private testing firms.
"I have substantial concerns when any company that collects sensitive health information from individuals affirmatively makes that information available to law enforcement without appropriate legal process," John Verdi, vice president of policy at the Future of Privacy Forum, told BuzzFeed News.
The organization, which had previously cited Family Tree DNA as a supporter of its best practices for genetic privacy, removed the company from the list last week and on Wednesday called for it to end its agreement with the FBI.
"It comes into deep conflict with what consumer expectations are in this space," Verdi said.
Family Tree's Y-DNA database, which the company states is the largest publicly available, could also be of significant help to law enforcement. Y-DNA reveals the paternal ancestry of the subject, and a match in that database would not only reveal a possible distant relative, but eventually lead authorities to a surname, narrowing their search.
For law enforcement officials, investigative genealogy has opened up an entirely new field of possibilities, one they say could prove as valuable in the effort to crack cold cases as the advent of DNA matching did about 20 years ago. To make use of the new tool, the FBI recently created an investigative unit focused on the field, enlisting agents to travel across the country to instruct police departments on its use.
Despite advances by authorities, some of the biggest DNA testing companies continue to prohibit law enforcement officials from using their databases for investigations.
Spokespersons for Ancestry.com and 23andMe told BuzzFeed News they have included language in their terms of service to ensure law enforcement does not access their databases. MyHeritage includes similar language in its terms of service.
The companies' policies state they would only provide law enforcement with information on their customers if served a warrant or subpoena.
That might not be commonly known by their customers, but Verdi told BuzzFeed News those who pay to have their DNA sequenced expect that level of privacy.
"Consumers typically sign up to learn more about their heritage or learn more about their health, maybe to learn whether they're more likely to appreciate a particular kind of wine or cheese," he said. "They don't sign up for genetic testing to become subject to wide-ranging FBI criminal searches."