DNA identification of the suspected Golden State Killer was inevitable, say investigators and genetics experts, the long-expected payoff of a six-year genetic hunt.
Arrested on Wednesday, Joseph James DeAngelo, 72, faces capital murder charges as the suspect in 51 rapes and 12 murders in California between 1974 and 1986. His capture is the first known report of law enforcement identifying a suspect through the DNA of relatives on genealogy websites.
“It’s a dramatic outcome that everyone predicted,” computer science professor Yaniv Erlich of Columbia University told BuzzFeed News. In a 2014 study, Erlich and a colleague predicted that the police would turn to “genealogical triangulation” websites such as GEDmatch, frequently used by adoptees to find relatives, to hunt for a suspect or their family members.
That seems to be exactly what happened, the Bay Area News Group reported, with cold-case investigator Paul Holes confirming that GEDmatch, a website that pools publicly shared genetic profiles and family trees, provided leads on relatives of the suspect based on crime scene DNA.
"It was just a matter of time; we knew it would happen," Billy Jensen, a journalist and an investigator on the syndicated television show Crime Watch Daily, told BuzzFeed News.
In 2013, authorities released the killer’s genetic markers to independent investigators including Jensen, Paul Haynes, and the late Michelle McNamara (who was married to the comedian Patton Oswalt). McNamara had begun running family-name searches on genealogy databases a year earlier. Their investigation is chronicled in the book I’ll Be Gone in the Dark, which came out after McNamara’s death.
Like Holes, this team also scoured GEDmatch and other public databases. “We were entering the killer’s profile into these databases every three or four months or so, hoping that a new entry from a family member would have been added since the last time we had checked,” Jensen said. But Holes, a retired cold-case investigator with the Contra Costa County District Attorney’s Office, ultimately beat them to the finish.
“It’s crazy — I just think how excited Michelle would have been,” Jensen said.
The math of these searches is fairly simple, Erlich said. Everyone has a mother and a father, and most people have far more cousins than that. The number of possible relatives to match grows exponentially with each degree of relation, creating a net of relatives around your genome.
“Once you move out into the distant cousins, everyone in the US has a relative somewhere in the databases,” Erlich said. That means a reasonable search out to a few degrees of separation might turn up around 140 relations, half of them excludable by gender and many of the rest by age, leaving a pool of perhaps a few dozen suspects who can readily be investigated through traditional police work.
While GEDmatch had perhaps 100,000 genomes in 2014, he added, it now has a million and counting, making it more powerful every day.
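The arithmetic behind that estimate can be sketched in a few lines of Python. This is an illustration only, not Erlich's model: the 140 relations and the one-half exclusion by sex come from the estimate above, while the share of people who survive age screening (one third here) is an assumption chosen to land on "a few dozen," and the function names are hypothetical.

```python
def ancestors_at_generation(generations_back: int) -> int:
    """Ancestor slots at a given generation: 2 parents, 4 grandparents, 8 great-grandparents.

    Ignores overlap in real family trees, so it is an upper bound.
    """
    return 2 ** generations_back


def narrow_candidates(total: int, keep_sex: float = 0.5, keep_age: float = 1 / 3):
    """Apply the two screening steps in order and return the pool size after each.

    keep_sex: share remaining after excluding the wrong sex (one half, per the article).
    keep_age: share remaining after excluding by age (ASSUMED for illustration).
    """
    after_sex = total * keep_sex
    after_age = after_sex * keep_age
    return round(after_sex), round(after_age)


if __name__ == "__main__":
    # The doubling per generation is what makes the net of relatives grow so fast.
    for g in range(1, 5):
        print(f"{g} generation(s) back: {ancestors_at_generation(g)} ancestors")

    after_sex, after_age = narrow_candidates(140)
    print(f"140 relations, {after_sex} after sex screening, {after_age} after age screening")
```

With these assumed fractions, 140 relations shrink to 70 and then to roughly two dozen. A different age cutoff would move the final number, but not the basic point that the pool becomes small enough to check by hand.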
The I’ll Be Gone in the Dark team often lamented they could not use the even larger databases of genomes at private DNA sites such as Ancestry or 23andMe, said Jensen. They speculated they could order a synthetic version of the killer’s DNA from a lab, mix it with water to make an ersatz spit sample, and then submit it to one of the private services to see if it generated any matches. But they never did. “We thought there would be a match in the next five years just from the public databases.”
That sort of fake account would violate the terms of service for private sites such as Ancestry, which require the user to own the DNA sample and offer strict step-by-step privacy controls.
Still, there is not a lot a company could do to prevent such an attempt, if an investigator didn’t mind being kicked off the site once they created a fake account.
“In general, I think we can agree it is a very good thing to catch this guy,” Erlich said. “It would also be a good thing if we started a wider conversation on how we should use this kind of information.”
This post has been updated with information from Paul Haynes on the timing of the team's work: its genealogical website searches began in 2012, and its searches using the culprit's DNA markers began in 2013.