Can Fitbits Be Trusted In Science?

Fitbit says its devices shouldn't be used for medical or scientific purposes. But because they are cheap and easy to use, researchers are thinking about using them anyway. There's just one problem: The results probably can't be trusted.

For Dr. Amy Gelfand, who's studying sleep patterns in kids with episodic migraines, Fitbits are an ideal research tool in many ways. The devices are cheap, discreet, and easy to use, and Gelfand can remotely track users' sleep without intruding in their lives. That helps explain why many scientists are thinking about using them for research, even though Fitbit says its devices aren't designed for medical or scientific research, and even as it offers discounts to those who use them for just that. The only problem? The results may not be very accurate.

That is the rub for Gelfand as she plans a large study to determine whether melatonin, a natural hormone, can ease migraine pain in pediatric patients by improving their sleep. In a small test run, she first wanted to check whether study participants would wear a sleep-monitoring wristband 24/7, so she outfitted 10 people with Fitbit Flexes. But when Gelfand embarks on that bigger study, she likely won't use Fitbits, because she worries about the reliability of the data they generate. Indeed, some third-party studies have shown them to be inaccurate to varying degrees.

Consumer activity trackers like Fitbits are growing in popularity among universities, hospitals, and pharmaceutical companies that want to study certain health conditions in daily-life scenarios. "With remote trials, we have the potential to allow populations of patients who haven't been able to participate in research before to participate," Gelfand, a pediatric neurologist at the University of California, San Francisco, told BuzzFeed News.

But it may be some time before the devices are fully embraced by scientists, a few of whom question the reliability and replicability of the data Fitbits generate. "I have a heebie-jeebie factor about somebody using them in their science," said Hawley Montgomery-Downs, an associate professor of psychology at West Virginia University, whose studies have found inaccuracies in Fitbit's sleep-tracking abilities. "They don't vary by a couple minutes or a couple percentage points. They vary — and as a scientist, I don't use this word lightly — very dramatically."

Montgomery-Downs recently started a sleep-tracking device testing company. None of her clients currently have commercially available products.

Potential benefits

Fitness trackers are just one example of a new consumer technology that could dramatically expand the scope of medical research beyond people who live near a clinical trial center. More than 75,000 people have enrolled in studies that double as iPhone apps made with Apple's ResearchKit, an open-source software framework unveiled in March. And this month, the National Institutes of Health said it may use smartphones and wearables in a larger study of how drug therapies and preventive measures can be tailored to individuals' unique health risks. "These devices could provide the ability to track health behaviors and environmental exposures much more frequently with minimal burden on participants," the agency noted.

Fitbits do make it significantly easier for people like Julia Roy, 17, to participate in studies like Gelfand's. Roy, who just graduated from a San Francisco high school, experiences migraines as many as 11 days a month. Without dropping by Gelfand's clinic, she can participate in the study simply by taking the pills given to her (either melatonin or a placebo), reporting headaches every day on an iPhone app, and wearing the Fitbit Flex around the clock. "I had to wear it on prom night," she said. "So it got my three and a half hours of sleep."

For researchers, wearable, connected devices like the Fitbit have clear advantages over more traditional electronic trackers. "Five to 10 years ago, the only thing around were pedometers. In order to get the information off pedometers, you had to bring in your pedometer, and I'd write it down or log it on a computer, or they had to write it down," said Dr. Mitesh Patel, an assistant professor of medicine at the University of Pennsylvania, who has studied how wearables and smartphones can monitor fitness. "One of the biggest benefits of these new, connected devices is we can passively pull data via cellular or Wi-Fi connections. We can do that passively to observe how people are behaving in their everyday lives."

Another benefit is cost. Actigraphs, medical-grade sleep-tracking devices, can strain budgets at prices that range between $300 and $1,000, and don't always include associated software. In contrast, a sleep-tracking Fitbit starts at $100 — software included.
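To make the budget gap concrete, here is a rough back-of-the-envelope sketch in Python. It uses the prices quoted above and, as an assumption, a study of 300 participants (the size Gelfand mentions for her planned larger study). It ignores software, shipping, and replacement costs, and because $100 is a starting price, the Fitbit total is a floor rather than an exact figure.

```python
# Back-of-the-envelope device budget. Prices come from the figures quoted
# in the article; the 300-participant study size is an assumption used
# purely for illustration.

FITBIT_PRICE = 100            # dollars, starting price of a sleep-tracking Fitbit
ACTIGRAPH_PRICE_LOW = 300     # dollars, low end of the quoted actigraph range
ACTIGRAPH_PRICE_HIGH = 1000   # dollars, high end of the quoted actigraph range


def device_budget(participants: int, unit_price: int) -> int:
    """Return the total hardware cost for one device per participant."""
    return participants * unit_price


participants = 300  # assumed study size
fitbit_total = device_budget(participants, FITBIT_PRICE)
actigraph_low = device_budget(participants, ACTIGRAPH_PRICE_LOW)
actigraph_high = device_budget(participants, ACTIGRAPH_PRICE_HIGH)

print(f"Fitbits:    at least ${fitbit_total:,}")
print(f"Actigraphs: ${actigraph_low:,} to ${actigraph_high:,}")
```

Under those assumptions, hardware alone would run at least $30,000 with Fitbits, versus $90,000 to $300,000 with actigraphs.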

Cost and convenience are major reasons why consumer trackers are catching on among researchers. The pharmaceutical giant Biogen is using Fitbits to track mobility in multiple sclerosis patients. Medidata, a cloud-computing platform for clinical research, is using Fitbit Flex and Garmin's Vivofit for similar reasons. And Misfit Wearables told BuzzFeed News that there are now about a dozen health research studies in which its trackers are used to monitor participants' activity levels.

Because these wearable-enabled studies are conducted remotely, researchers sometimes meet patients infrequently, or never at all. That, too, is a concern when it comes to vetting data gathered via tools explicitly built for nonscientific use.

Fitbits were the consumer activity tracker most used in the studies reviewed for this article. Their step-counting seems to be highly accurate, independent reviews have found, supporting Fitbit's claim that the devices are 95% to 97% accurate when worn correctly.

But Fitbit may not be quite as accurate at tracking other things. In a December study by West Virginia University and a handful of other institutions, 63 children dozed overnight in a sleep laboratory while wearing Fitbit Ultras. In default mode, the Fitbit overestimated sleep time by 41 minutes and sleep efficiency by 8%, compared to the official sleep test. And in the more detailed "sensitive" mode, the Fitbit underestimated sleep time by 105 minutes and sleep efficiency by 21%.
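Sleep efficiency is conventionally the share of time in bed that is actually spent asleep, so an error in measured sleep time carries straight over into it. The Python sketch below illustrates that relationship. The 500 minutes in bed and 420 minutes of true sleep are hypothetical values chosen for illustration, not figures from the study, and the sketch assumes the study's percentages are percentage points. Only the 41-minute and 105-minute errors come from the results above.

```python
def sleep_efficiency(sleep_min: float, time_in_bed_min: float) -> float:
    """Return the percentage of time in bed that was spent asleep."""
    return 100.0 * sleep_min / time_in_bed_min


TIME_IN_BED = 500.0  # minutes in bed; hypothetical, not from the study
TRUE_SLEEP = 420.0   # minutes asleep per the lab test; hypothetical

# Errors reported in the study for the two Fitbit modes, in minutes.
DEFAULT_MODE_ERROR = 41      # default mode overestimated sleep time
SENSITIVE_MODE_ERROR = -105  # "sensitive" mode underestimated sleep time

true_eff = sleep_efficiency(TRUE_SLEEP, TIME_IN_BED)
default_eff = sleep_efficiency(TRUE_SLEEP + DEFAULT_MODE_ERROR, TIME_IN_BED)
sensitive_eff = sleep_efficiency(TRUE_SLEEP + SENSITIVE_MODE_ERROR, TIME_IN_BED)

print(f"Lab test:       {true_eff:.1f}%")
print(f"Default mode:   {default_eff:.1f}% ({default_eff - true_eff:+.1f} points)")
print(f"Sensitive mode: {sensitive_eff:.1f}% ({sensitive_eff - true_eff:+.1f} points)")
```

With these assumed inputs, the default-mode reading lands roughly 8 points too high and the sensitive-mode reading 21 points too low, which is in line with the reported figures if a typical night in the lab lasted a little over eight hours.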

Those findings alarmed Montgomery-Downs, an author of the study, because the Fitbit wasn't simply off by a consistent margin across subjects: the more poorly a person slept, the larger its error.
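The distinction matters because a constant offset can be subtracted out, while an error that depends on how well someone slept cannot. The following Python sketch illustrates the idea with made-up numbers: the `true_sleep` values and both sets of device readings are hypothetical and are not data from the study.

```python
from statistics import mean, pstdev

# Hypothetical lab-measured sleep (minutes) for five sleepers, best to worst.
true_sleep = [480, 450, 420, 360, 300]

# Hypothetical device A: always reads 40 minutes too high.
device_constant = [t + 40 for t in true_sleep]

# Hypothetical device B: its error grows as true sleep gets shorter.
growing_errors = [40, 50, 60, 90, 120]
device_varying = [t + e for t, e in zip(true_sleep, growing_errors)]


def errors(device, truth):
    """Per-person error: device reading minus lab measurement."""
    return [d - t for d, t in zip(device, truth)]


def leftover_spread(device, truth):
    """Standard deviation of the errors, i.e. the typical error that
    remains after the average error has been subtracted from everyone."""
    return pstdev(errors(device, truth))


for name, device in [("constant", device_constant), ("varying", device_varying)]:
    errs = errors(device, true_sleep)
    print(f"{name}: mean error {mean(errs):.0f} min, "
          f"leftover spread {leftover_spread(device, true_sleep):.1f} min")
```

In this toy example, subtracting the average error fully corrects the first device but leaves the second one off by a sizable margin, with the largest misses for the poorest sleepers.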

"Consumers should stop paying for these things until they can be shown compelling evidence that they are rigorously evaluated and dependable," she told BuzzFeed News.

Fitbit disputed those claims, saying that its sleep-tracking is validated by internal studies. A spokesperson added in an email, "It's important to note that although the sleep-tracking study referenced was conducted among children and adolescents ages 3-17, Fitbit is not directed at persons under the age of 13 and we do not knowingly collect any personally identifiable information from children under 13."

Patty Freedson, a kinesiology professor at the University of Massachusetts, Amherst, who's studied Fitbits' calorie-tracking capabilities, was similarly dubious about Fitbit's viability in medical research. "If you're doing a research study where the goal of the study is to get people to increase energy expenditure, it's a great tool to help motivate people," she said. "But I'm not certain about precision and accuracy enough to use it as the outcome measure, to say, 'Yes, this person's activity level increased by 42%.'"

A Fitbit spokesperson told BuzzFeed News, "Our everyday trackers do a great job calculating calorie burn for step-based activities." Newer Fitbits with advanced heart-rate tracking better measure calorie-burn rates "for non-step based workouts like cycling, elliptical or group exercise classes, as well as more everyday activities."

A fitness device, but not quite a medical device

Fitbits primarily collect data via an accelerometer, then algorithmically translate that raw data into consumer-friendly metrics on the wearer's physical activity. "Our team has spent years and performed multiple internal studies to rigorously test the accuracy of our wrist-based products," a Fitbit spokesperson told BuzzFeed News. The company, which recently went public and is worth $9 billion, does not disclose information about its algorithms and data processing methodologies, citing the "extremely competitive environment" in which it operates.

That leaves scientists unsure of how — exactly — Fitbit's wearables work. "We have no idea what their algorithm was at the time when we published [the study], and we have no idea whether they changed it between then and now," said Montgomery-Downs, the sleep researcher who compared Fitbit sleep data against data collected via validated sleep-laboratory tests.

As a consumer electronics company that's not regulated by the Food and Drug Administration, Fitbit can't tout its gadgets for medical or scientific uses, nor does it. "This product is not a medical device, and is not intended to diagnose, treat, cure, or prevent any disease," the company's product manual warns users.

At the same time, Fitbit doesn't want to actively discourage researchers from buying dozens or hundreds of Fitbits for studies. Institutions doing "research and educational projects" are given product discounts, and can apply for access to Fitbit's open API, a Fitbit spokesperson told BuzzFeed News.

So as tools for conducting medical research, wearables like Fitbit are certainly unproven. But there are medical scenarios in which they have proven useful. In multiple studies, including some of postmenopausal women and hip replacement patients, Fitbits have been shown to be effective in encouraging patients to adopt healthier behaviors like working out more.

"What we look for, for any given patient, are changes over time, and it doesn't matter necessarily whether they start at 300 steps or whether they start at 1,000 steps," said Dr. David Cook, a physician at the Mayo Clinic who helped lead a study in which hip replacement patients were outfitted with Fitbits. "What we want to see is a recovery and normalization by the end of 30 days."

Gelfand, the researcher who's studying migraines in children, said that for the next, bigger study, her concerns about accuracy have led her to think about investing in medical-grade devices for as many as 300 people. But, in what's becoming a common dilemma for researchers, that makes it all the harder to get the study off the ground.

"We would need a bigger grant to be able to afford more expensive devices," she said.