Expert Commentary

Journalists should report on lax oversight of research data, says data sleuth

Uri Simonsohn, a behavioral scientist who coauthors the Data Colada blog, urges reporters to ask researchers about preregistration and to expose opportunities for fraud.


Uri Simonsohn is an outspoken advocate for open science — adding transparency to the research process and helping researchers share what they’ve learned in greater detail with a broad audience.

Many people know Simonsohn for his data analyses on Data Colada, a blog about social science research he writes with two other behavioral scientists, Leif Nelson and Joseph Simmons. The three scholars, who co-direct the Wharton Credibility Lab at the University of Pennsylvania, occasionally use the blog to spotlight evidence of suspected fraud they’ve found in academic papers.

In his role at the Credibility Lab and as a professor at Esade Business School in Barcelona, Simonsohn travels to speak on issues around scientific integrity and data science. During his recent visit to Harvard University, The Journalist’s Resource asked for his thoughts on how journalists can improve their coverage of academic fraud and misconduct.

Here are three big takeaways from our conversation.

1. Before covering academic studies, ask researchers about preregistration.

Preregistration is “the practice of documenting your research plan at the beginning of your study and storing that plan in a read-only public repository such as OSF Registries or the National Library of Medicine’s Clinical Trials Registry,” according to the nonprofit Center for Open Science. Simonsohn says preregistration helps prevent research fraud. When researchers create a permanent record outlining how they intend to conduct a study before they start, they are discouraged from changing parts of their study — for instance, their hypothesis or study sample — to get a certain result.

Simonsohn adds that preregistration also reduces what’s known as “p-hacking,” or manipulating an analysis of data to make it seem as though patterns in the data are statistically significant when they are not. Examples of p-hacking: adding more data or control variables to change the result, or deciding after the analysis is complete to exclude some data. (For more on statistical significance, read our tip sheet on the topic.)

Preregistration is particularly important when researchers will be collecting their own data, Simonsohn points out. It’s easier to alter or fabricate data when you collect it yourself, especially if there’s no expectation to share the raw data.

While preregistration is the norm in clinical trials, it’s less common in other research fields. About half of psychology research is preregistered, as is about a quarter of marketing research, Simonsohn says. A substantial proportion of economics research, however, is not, because it often relies on data collected by other researchers or by nonprofit organizations and government agencies such as the U.S. Census Bureau.

Simonsohn urges journalists to ask researchers whether they preregistered their studies before reporting on them. He likens reporting on research that isn’t preregistered to driving a car that hasn’t been inspected. The car might be perfectly safe, but you can’t be sure because no one has had a chance to look under the hood.

“If the person says ‘no,’ [the journalist] could ask, ‘Oh, how come?’” he says. “And if they don’t provide a compelling reason, the journalist could say ‘You know, I’m not going to cover work that hasn’t been preregistered, without a good rationale.’”

Research registries themselves can be a helpful resource for journalists. The Center for Open Science lets the public search for and read the thousands of preregistered research plans on its Open Science Framework platform. Researchers who preregister their work at AsPredicted, a platform Simonsohn helped create for the Wharton Credibility Lab, can choose whether and when to make their preregistered research plan public.

2. Report on the lack of oversight of research data collection.

Journalists and the public probably don’t realize how little oversight there is when it comes to collecting and analyzing data for research, Simonsohn says. That includes research funded by the federal government, which gives colleges, universities and other organizations billions of dollars a year to study public health, climate change, new technology and other topics.

Simonsohn says there’s no system in place to ensure the integrity of research data or its analysis. Although federal law requires research involving human subjects to be reviewed by an Institutional Review Board, the primary goal of these independent committees is protecting the welfare and rights of study participants.

Academic papers are reviewed by a small group of experts before a scholarly journal will publish them. But the peer-review process isn’t designed to catch research fraud. Reviewers typically do not check the authors’ work to see if they followed the procedures they say they followed to reach their conclusions.

Simonsohn says journalists should investigate the issue and report on it.

“The lack of protection against fraud is a story that deserves to be written,” he says. “When I teach students, they’re shocked. They’re shocked that when you submit a paper to a journal, [the journal is] basically trusting you without any safeguards. You’re not even asked to assert in the affirmative that you haven’t done anything wrong.”

Journalists should also examine ways to prevent fraud, he adds. He thinks researchers should be required to submit “data receipts” to organizations that provide grant funding to show who has had access to, changed or analyzed a study’s raw data and when. This record keeping would be similar to the chain of custody process that law enforcement agencies follow to maintain the legal integrity of the physical evidence they collect.

“That is, by far, the easiest way to stop most of it,” Simonsohn says.

3. Learn about open science practices and the scientists who expose problematic research.

Nearly 200 countries have agreed to follow the common standards for open science that UNESCO, the United Nations’ scientific, educational and cultural organization, created in 2021. In December, UNESCO released a status report of initiatives launched in different parts of the globe to help researchers work together in the open and share what they’ve learned in detail with other researchers and the public. The report notes, for example, that a rising number of countries and research organizations have developed open data policies.

As of January 2024, more than 1,100 open science policies were adopted by research organizations and research funders worldwide, according to the Registry of Open Access Repositories Mandatory Archiving Policies, which tracks policies requiring researchers to make their “research output” public.

In the U.S., the universities and university departments that have adopted these policies include Johns Hopkins University, University of Central Florida, Stanford University’s School of Education and Columbia University’s School of Social Work. Such policies also have been adopted at Harvard Kennedy School and one of its research centers, the Shorenstein Center on Media, Politics and Public Policy, which is where The Journalist’s Resource is housed.

Simonsohn recommends journalists learn about open science practices and familiarize themselves with research watchdogs such as Nick Brown, known for helping expose problems in published studies by prominent nutrition scientist Brian Wansink.

Retraction Watch, a website that tracks research retractions, maintains a list of more than two dozen scientific sleuths. Elisabeth Bik, a microbiologist and science integrity consultant who has been called “the public face of image sleuthing,” was a guest speaker in The Journalist’s Resource’s recent webinar on covering research fraud and errors.

Here are some of the open science organizations that journalists covering these issues will want to know about:
