A landmark study that endorsed a simple way to curb cheating is going to be retracted nearly a decade later after a group of scientists found that it relied on faked data.
According to the 2012 paper, when people signed an honesty declaration at the beginning of a form, rather than the end, they were less likely to lie. A seemingly cheap and effective method to fight fraud, it was adopted by at least one insurance company, tested by government agencies around the world, and taught to corporate executives. It made a splash among academics, who cited it in their own research more than 400 times.
The paper also bolstered the reputations of two of its authors — Max Bazerman, a professor of business administration at Harvard Business School, and Dan Ariely, a psychologist and behavioral economist at Duke University — as leaders in the study of decision-making, irrationality, and unethical behavior. Ariely, a frequent TED Talk speaker and a Wall Street Journal advice columnist, cited the study in lectures and in his New York Times bestseller The (Honest) Truth About Dishonesty: How We Lie to Everyone — Especially Ourselves.
Years later, he and his coauthors found that follow-up experiments did not show the same reduction in dishonest behavior. But more recently, a group of outside sleuths scrutinized the original paper’s underlying data and stumbled upon a bigger problem: One of its main experiments was faked “beyond any shadow of a doubt,” three academics wrote in a post on their blog, Data Colada, on Tuesday.
The researchers who published the study all agree that its data appear to be fraudulent and have requested that the journal, the Proceedings of the National Academy of Sciences, retract it. But it’s still unclear who made up the data or why — and four of the five authors said they played no part in collecting the data for the test in question.
That leaves Ariely, who confirmed that he alone was in touch with the insurance company that ran the test with its customers and provided him with the data. But he insisted that he was innocent, implying it was the company that was responsible. “I can see why it is tempting to think that I had something to do with creating the data in a fraudulent way,” he told BuzzFeed News. “I can see why it would be tempting to jump to that conclusion, but I didn’t.”
He added, “If I knew that the data was fraudulent, I would have never posted it.”
But Ariely gave conflicting answers about the origins of the data file that was the basis for the analysis. Citing confidentiality agreements, he also declined to name the insurer that he partnered with. And he said that all his contacts at the insurer had left and that none of them remembered what happened, either.
According to correspondence reviewed by BuzzFeed News, Ariely has said that the company he partnered with was the Hartford, a car insurance company based in Hartford, Connecticut. Two people familiar with the study, who requested anonymity due to fear of retribution, confirmed that Ariely has referred to the Hartford as the research partner.
Days after this story was published, Suzanne Barlyn, a spokesperson for the Hartford, confirmed to BuzzFeed News that “we scanned our archives and found that there was a small project with Dr. Ariely.” However, the company said in a statement, “we have been unable to locate any data, deliverables or results that may have been produced.” Asked to describe what it had found in its archives, the company declined to comment.
Ariely did not return a request for comment about the insurer.
The imploded finding is the latest blow to the buzzy field of behavioral economics. Several high-profile, supposedly science-backed strategies to subtly influence people’s psychology and decision-making have failed to hold up under scrutiny, spurring what’s been dubbed a “replication crisis.” But it’s rarer that data is faked altogether.
And this is not the first time questions have been raised about Ariely’s research in particular. In a famous 2008 study, he claimed that prompting people to recall the Ten Commandments before a test cuts down on cheating, but an outside team later failed to replicate the effect. Last month, an editor’s note was added to a 2004 study of his after other researchers raised concerns about statistical discrepancies and Ariely did not have the original data to cross-check against. And in 2010, Ariely told NPR that dentists often disagree on whether X-rays show a cavity, citing Delta Dental insurance as his source. He later walked back that claim when the company said it could not have shared that information with him because it did not collect it.
It is a failure of science that it took so long for the 2012 data to be revealed as fake, the scientists behind the Data Colada blog argued.
“Addressing the problem of scientific fraud should not be left to a few anonymous (and fed up and frightened) whistleblowers and some (fed up and frightened) bloggers to root out,” wrote Joe Simmons of the University of Pennsylvania, Leif Nelson of UC Berkeley, and Uri Simonsohn of the Esade Business School in Spain.
Some are now questioning whether the scientist who literally wrote the book on dishonesty is himself being dishonest. But when asked whether he worries about his reputation, Ariely said he is not concerned.
“I think that science works over time and things fix themselves,” he said.
The 2012 paper reported on a series of experiments done both in a lab and in the field. In the main real-world experiment, the researchers teamed up with an unnamed car insurer in the US. Nearly 13,500 drivers were randomly sent one of two policy review forms to sign: one where the statement “I promise that the information I am providing is true” appeared as normal at the bottom of the document and the other where the statement was moved to the top. People are naturally incentivized to lie on these forms, the scientists explained at the time, because a lower mileage means a lower risk of accidents and lower premiums.
Whether they signed next to the honesty pledge at the bottom or the top, customers were asked to report the current odometer mileage of their cars covered by the insurer. This number was compared to the mileage they’d reported over an unspecified time period in the past. The researchers reported that the people who’d signed the top of the document reported driving more miles — about 2,400, or 10%, more — than those who signed at the bottom.
While the actual mileage couldn’t be verified, the researchers hypothesized that this sizable difference was due to the switch in signature locations. This easy tweak, they wrote, “likely reduced the extent to which customers falsified mileage information in their own financial self-interest at cost to the insurance company.”
The discovery fit neatly into a growing body of research about “nudging”: the idea that subtle cues can encourage people to unconsciously make the right decisions. The top-of-form signatures concept was tested on taxpayers in the United Kingdom and Guatemala, though to mixed results. A government in Canada also reportedly spent thousands trying to change its tax forms. In the US, the Obama administration took notice, and the IRS credited the method with helping it collect an additional $1.6 million from government vendors in a quarter.
“Such tricks aren’t going to save us from the next big Ponzi scheme or doping athlete or thieving politician,” Ariely wrote in a 2012 Wall Street Journal excerpt from The (Honest) Truth About Dishonesty. “But they could rein in the vast majority of people who cheat ‘just by a little.’”
Ariely went on to join the millennial-focused insurance startup Lemonade as chief behavioral officer. His field “touches every aspect of our lives,” the company noted in 2016, including “cheating on our tax forms and insurance claims.” Lemonade took a page straight out of the behavioral economist’s playbook: to submit fraud claims, customers must “sign their name to a digital pledge of honesty at the start of the claims process, rather than at the end,” Fast Company reported in 2017. (Ariely left at the end of 2020, according to his LinkedIn profile.)
Simmons, one of the Data Colada sleuths, taught the study for years in his decision-making class at the University of Pennsylvania’s Wharton School. “It’s a simple intervention that any policymaker who’s interested in quelling dishonesty could easily put into practice,” he told BuzzFeed News.
But eight years after the study was published, the original authors announced that they no longer had confidence in their conclusions.
A few other researchers had been unsuccessfully trying to replicate and build upon some of the experiments, so they joined forces with the five original scientists, including Ariely, to run one of those tests again. This one also involved signatures on the top versus bottom, but on tax return forms and with a bigger group of participants in a lab environment. They failed to see any difference in honesty between the two groups.
Looking back on the original mileage-reporting experiment, the researchers realized that the two groups — the top and bottom signers — had significantly different mileage records to begin with. Honesty likely had nothing to do with the differences between their responses, the authors explained last year in an op-ed titled “When We’re Wrong, It’s Our Responsibility as Scientists to Say So.” They also detailed the failure of their replication attempt in a new paper.
At the time, the journal asked whether the scientists wanted to retract the original finding. Bazerman, the Harvard scientist, told the Data Colada bloggers that he and another author, Lisa Shu, then at Northwestern University, were the only two authors who were strongly in favor. In the end, the group didn’t move forward with a retraction. (Bazerman declined to be interviewed by BuzzFeed News.)
But that error turned out to be just the tip of the iceberg.
When the researchers published their 2020 update, they posted the data from their 2012 paper for the first time. Publicly sharing data was once a rarity in science but is slowly becoming more commonplace amid calls for greater transparency.
Poring through the Microsoft Excel spreadsheet, Simmons and the other data detectives unearthed a series of implausible anomalies that pointed to at least two kinds of fabrication: Many of the baseline, preexperiment mileages appeared to be duplicated and slightly altered, and all the mileages supposedly collected during the forms test looked like they were made up. Much of this data seemed to be produced by a random number generator, they wrote in their Data Colada post.
In the first sign of something amiss, the 13,488 drivers in the study reported equally distributed levels of driving over the period of time covered in the study. In other words, just as many people racked up 500 miles as those who drove 10,000 miles as 40,000-milers. Also, not a single one went over 50,000. This pattern held for all the drivers’ cars (each could report mileage for up to four vehicles).
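Why a flat spread of mileages stands out can be shown with a small Python sketch. It is an illustration under stated assumptions, not the Data Colada team's actual analysis: the mileages are synthetic, the helper names `bin_counts` and `chi_square_vs_uniform` are hypothetical, and the right-skewed gamma distribution is only a rough stand-in for the clustered shape one would expect from real drivers.

```python
import random

def bin_counts(values, lo, hi, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so a value sitting exactly on the upper edge lands in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts

def chi_square_vs_uniform(counts):
    """Chi-square statistic comparing bin counts with a perfectly flat histogram."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
N_DRIVERS = 13488        # number of drivers cited in the article
MAX_MILES = 50000        # no reported mileage exceeded this figure

# Synthetic "suspicious" sample: every mileage from 0 to 50,000 is equally likely.
flat_sample = [rng.uniform(0, MAX_MILES) for _ in range(N_DRIVERS)]

# Synthetic "plausible" sample (an assumption): most drivers cluster at lower
# mileages, with a long tail of heavy drivers capped at the same maximum.
skewed_sample = [min(rng.gammavariate(2.0, 6000), MAX_MILES - 1)
                 for _ in range(N_DRIVERS)]

flat_stat = chi_square_vs_uniform(bin_counts(flat_sample, 0, MAX_MILES, 10))
skewed_stat = chi_square_vs_uniform(bin_counts(skewed_sample, 0, MAX_MILES, 10))

print(f"flat sample:   chi-square = {flat_stat:.1f}")
print(f"skewed sample: chi-square = {skewed_stat:.1f}")
```

A small statistic means the histogram is close to perfectly flat, which is what a random number generator drawing evenly from a range would produce. The skewed sample yields a far larger value because its bins are very unequal.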
“This is not what real data look like, and we can’t think of a plausible benign explanation for it,” Simmons, Nelson, and Simonsohn wrote. (Most of the issues, they added, were initially raised by a group of researchers who asked to remain anonymous.)
Those highly unlikely mileages were created in part by data that appeared to have been subtly duplicated. Almost exactly half of the mileages in the baseline data were entered in the font Calibri and the other half in Cambria. The two sets were “impossibly similar,” the researchers wrote. All the Calibri entries had Cambria “twins,” participants with mileages that were always higher, but consistently within the same small range — an indicator of a random number generator at work.
Another oddity: almost none of the mileages in Ariely’s forms experiment were rounded. That was strange because these were supposedly self-reported, and it would be expected that many drivers would give ballpark estimates rather than looking up the exact figures on their odometers. “When real people report large numbers by hand, they tend to round them,” Simmons, Nelson, and Simonsohn wrote. It was yet another sign of a random number generator at play.
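The rounding argument can be made concrete with a short Python sketch. Again, this is an illustration rather than the bloggers' own method: the data are synthetic, `share_rounded` is a hypothetical helper, and the 40% rounding rate for "hand-reported" values is an assumption chosen only to make the contrast visible.

```python
import random

def share_rounded(values, base=1000):
    """Fraction of values that are exact multiples of `base`."""
    return sum(1 for v in values if v % base == 0) / len(values)

rng = random.Random(1)   # fixed seed so the sketch is repeatable
N_DRIVERS = 13488        # number of drivers cited in the article

# Synthetic machine-generated mileages: whole numbers drawn evenly from a range.
generated = [rng.randrange(1, 50000) for _ in range(N_DRIVERS)]

# Synthetic hand-reported mileages (assumption): 40% of drivers give a
# ballpark figure rounded down to the nearest thousand, the rest report exactly.
hand_reported = [(v // 1000) * 1000 if rng.random() < 0.4 else v
                 for v in generated]

print(f"generated:     {share_rounded(generated):.3%} are multiples of 1,000")
print(f"hand-reported: {share_rounded(hand_reported):.3%} are multiples of 1,000")
```

In evenly generated numbers, only about one value in a thousand is a multiple of 1,000 by chance, so a near-total absence of round figures in supposedly self-reported data is consistent with machine generation.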
Their conclusion was clear: “There is very strong evidence that the data were fabricated.” Journals clearly need to require authors to share their data, they argued in their post.
“A field that ignores the problem of fraud, or pretends that it does not exist, risks losing its credibility,” they added. “And deservedly so.”
All five of the original authors, including Ariely, have stated that they agree with the Data Colada bloggers’ conclusion that the data were faked and expressed regret over failing to take notice earlier. Last month, a group of them requested that the Proceedings of the National Academy of Sciences retract the study.
“We are aware of the situation and are in communication with the authors,” PNAS editorial ethics manager Yael FitzPatrick told BuzzFeed News.
In a statement to the blog, Francesca Gino, a Harvard Business School professor and one of the authors, wrote, “I was not involved in conversations with the insurance company that conducted the field experiment, nor in any of the steps of running it or analyzing the data.” Another author, Nina Mazar, then at the University of Toronto and now a marketing professor at Boston University, told the blog, “I want to make clear that I was not involved in conducting the field study, had no interactions with the insurance company, and don’t know when, how, or by whom exactly the data was collected and entered. I have no knowledge of who fabricated the data.”
Gino declined to be interviewed for this story, and Mazar did not return a request for comment.
Meanwhile, Ariely has given vague and conflicting answers about how he obtained the data.
Asked by BuzzFeed News when the experiment was conducted by the insurance company, he first replied, “I don’t remember if it was 2010 or ’11. One of those things.”
The Excel file that was publicly posted — the file with the original data, according to him and his team — was created by Ariely in February 2011, its metadata shows. But Ariely discussed the study’s results in a July 2008 lecture at Google and wrote an essay about it, though with slightly different results, in 2009 for the Harvard Business Review. That would suggest the data file was created up to three years after the experiment was conducted.
The spokesperson for the Hartford said that the project with Ariely had been done “from 2007-2008,” and that most of the employees associated with the work are no longer at the company.
Ariely told the Data Colada bloggers that the spreadsheet was exactly what he had received from the company. “The data were collected, entered, merged and anonymized by the company and then sent to me,” he wrote. “This was the data file that was used for the analysis and then shared publicly.”
But in an interview with BuzzFeed News, his recollection of the file’s evolution was murkier. “Did it come in Excel, or did it come in text and I put it into Excel? I don’t know,” he said, adding, “The thing I wanted to make clear is that the data I got, that was the data.”
And there is evidence that Ariely did edit the file to some extent before handing it to the rest of the team.
On Feb. 16, 2011, Ariely emailed Mazar with a version of the same Excel file, according to emails she shared with Data Colada. There was what looked like a major error: bottom signers seemed more likely to be honest, the opposite of the scientists’ hypothesis. When Mazar pointed this out to Ariely, he told her that when preparing the dataset for her, he had accidentally switched the labels of the top signers and the bottom signers. Mazar swapped them upon his request. The file eventually posted in 2020 did not contain this error.
Ariely told BuzzFeed News that while he was aware of Mazar’s account of this exchange, he did not personally remember having it and did not have any emails from that time to review.
Instead, he appeared to blame the insurance company for the fraudulent data, saying his mistake was staying hands-off during the data collection process. Looking at the data only when they were aggregated, anonymized, and sent to him, he said, freed him from the work of securing ethics approval from the university to perform research on human subjects.
“Out of concern for privacy, I stayed away from the details,” Ariely said. “In retrospect, that wasn’t a good choice.”
At least one member of the team claims to have had questions from the start.
Bazerman of Harvard told Data Colada that when he first read a draft of the paper, in February 2011, he had questions about the insurance experiment’s seemingly “implausible data.” A coauthor assured him the data were accurate and another showed him the file, though he admitted that he did not personally examine it.
When the 2012 paper made waves, he “then believed the core result” and taught it to students and corporate executives alike. In retrospect, he wrote, “I wish I had worked harder to identify the data were fraudulent, to ensure rigorous research in a collaborative context, and to promptly retract the 2012 paper.”
Shu, another coauthor who now works in venture capital, voiced similar regrets on Twitter this week. “We began our collaboration from a place of assumed trust — rather than earned trust,” she wrote. “Lesson learned.” She declined to comment for this story.
Ariely claims he has learned his lesson, too. He said that since the 2012 study, his lab has been more involved in data collection efforts when it partners with companies to do research. His lab members now store data in the cloud and regularly share it publicly, he added.
After the data sleuths approached him with their findings this summer, Duke’s research integrity office interviewed him and was given access to his email, he said. Asked how the review was conducted and what it found, a Duke spokesperson declined to comment.
With the revelation that fake data propped up a famous study about honesty, the social sciences have their latest cautionary tale.
“People are going to still keep studying dishonesty,” Nelson, one of the researchers who unearthed problems in the study, told BuzzFeed News. “But my guess is they’ll be a little more circumspect about how much they’ve celebrated it over the previous decade.”
Meanwhile, Ariely said he holds out hope that the basic idea of the study still holds true, pointing to other research about “priming” people to be honest. He said that more signature experiments being run in the UK, which have not yet been published, will vindicate him.
The 2012 data “is not to be trusted,” he said. But “there is still the question of whether the principle behind it is correct. Those are separate questions.”
This story has been updated to include a statement from the Hartford.