Nationally representative sample: 5 things you need to know to cover research accurately

Journalists can’t report accurately on research involving human subjects without knowing certain details about the sample of people researchers studied. It’s important to know, for example, whether researchers used a nationally representative sample.

That’s important whether a journalist is covering an opinion poll that asks American voters which presidential candidate they prefer, an academic article that examines absenteeism among U.S. public school students or a clinical trial of a new drug designed to treat Alzheimer’s disease.

When researchers design a study, they start by defining their target population, or the group of people they want to know more about. They then create a sample meant to represent this larger group. If researchers want to study a group of people across an entire country, they aim for a nationally representative sample — one that resembles the target population in key characteristics such as gender, age, political party affiliation and household income.

Earlier this year, when the Pew Research Center wanted to know how Americans feel about a new class of weight-loss drugs, it asked a sample of 10,133 U.S. adults questions about obesity and the effects of Ozempic, Wegovy and similar drugs. Pew designed the survey so that the answers those 10,133 people gave likely reflected the attitudes of all U.S. adults across various demographics.

If Pew researchers had simply interviewed 10,133 people they encountered at shopping malls in the southeastern U.S., their responses would not have been nationally representative. Not only would their answers reflect attitudes in just one region of the country, but the individuals interviewed also would not represent adults nationwide.

A nationally representative sample is one of several types of samples used in research. It’s commonly used in research that examines numerical data in public policy fields such as public health, criminal justice, education, immigration, politics and economics.

To accurately report on research, journalists must pay close attention to who is and isn’t included in research samples. Here’s why that information is critical:

1. If researchers did not use a sample designed to represent people from across the nation, it would be inaccurate to report or imply that their results apply nationwide.

A mistake journalists make when covering research is overgeneralizing the results, or reporting that the results apply to a larger group of people than they actually do. Depending on who is included in the sample, a study’s findings might only apply to the people in the sample. Many times, findings apply only to a narrow group of people at the national level who share the same characteristics as the people in the sample — for example, individuals who retired from the U.S. military after 2015 or Hispanic teenagers with food allergies.

To determine who a study is designed to represent, look at how the researchers have defined their target population, including location, demographics and other characteristics.

“Consider who that research is meant to be applicable to,” says Ameeta Retzer, a research fellow at the University of Birmingham’s Department of Applied Health Sciences.

2. When researchers use a nationally representative sample, their analyses often focus on what’s happening at a national level, on average. Because of this, it’s never safe to assume that national-level findings also apply to people at the local level.

“As a word of caution, if you’re using a nationally representative sample, you can’t say, ‘Well, that means in California …,’” warns Michael Gottfried, an applied economist and professor at the University of Pennsylvania’s Graduate School of Education.

When researchers create a nationally representative sample of U.S. grade school students, their aim is to gain a better understanding of some aspect of the nation’s student population, Gottfried says. What they learn will represent an average across all students nationwide.

“On average, this is what kids are doing, this is how kids are doing, this is the average experience of kids in the United States,” he explains. “The conclusion has to stay at the national level. It means you cannot go back and say kids in Philadelphia are doing that. You can’t take this information and say, ‘In my city, this is happening.’ It’s probably happening in your city, but cities are all different.”

3. There’s no universally accepted standard for representativeness.

If you read a lot of research, you’ve likely noticed that what constitutes a nationally representative sample varies. Researchers investigating the spending habits of Americans aged 20 to 30 years might create a sample that represents this age group in terms of gender and race. Meanwhile, a similar study might use a sample that represents this age group across multiple dimensions — gender, race and ethnicity along with education level, household size, household income and the language spoken at home.

“In research, there’s no consensus on which characteristics we include when we think about representativeness,” Retzer notes.

Researchers determine whether their sample adequately represents the population they want to study, she says. Sometimes, researchers call a sample “nationally representative” even though it’s not all that representative.

Courtney Kennedy, vice president of methods and innovation at Pew Research Center, has questioned the accuracy of election research conducted with samples that only represent U.S. voters by age, race and sex. It’s increasingly important for opinion poll samples to also align with voters’ education levels, Kennedy writes in an August 2020 report.

“The need for battleground state polls to adjust for education was among the most important takeaways from the polling misses in 2016,” Kennedy writes, referring to the U.S. presidential election that year.
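To see why adjusting for education can change a poll's result, here is a minimal sketch of one simple weighting approach (post-stratification on a single characteristic). All numbers are invented for illustration, the function name `poststratified_share` is hypothetical, and this is not a description of how Pew or any other pollster actually weights its surveys.

```python
def poststratified_share(responses, population_shares):
    """Estimate the share of supporters after weighting by one characteristic.

    responses: list of (group, supports) pairs, one per respondent.
    population_shares: dict mapping each group to its share of the
        target population (assumed known, e.g. from census figures).
    """
    n = len(responses)

    # Count how many respondents fall in each group.
    counts = {}
    for group, _ in responses:
        counts[group] = counts.get(group, 0) + 1

    # Weight = population share / sample share, so overrepresented
    # groups count for less and underrepresented groups count for more.
    weights = {g: population_shares[g] / (counts[g] / n) for g in counts}

    total_weight = sum(weights[g] for g, _ in responses)
    supporting_weight = sum(weights[g] for g, supports in responses if supports)
    return supporting_weight / total_weight


# Hypothetical poll of 1,000 voters in which college graduates are
# overrepresented: 60% of the sample versus an assumed 35% of voters.
responses = (
    [("college", True)] * 330 + [("college", False)] * 270
    + [("no_college", True)] * 160 + [("no_college", False)] * 240
)
population_shares = {"college": 0.35, "no_college": 0.65}

unweighted = sum(supports for _, supports in responses) / len(responses)
weighted = poststratified_share(responses, population_shares)

print(f"Unweighted support: {unweighted:.3f}")
print(f"Education-weighted support: {weighted:.3f}")
```

In this made-up example, support drops from 49% in the raw sample to about 45% once the sample is weighted to match the assumed education mix of voters, a gap large enough to change a headline.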

4. When studying a nationwide group of people, the representativeness of a sample is more important than its size.

Journalists often assume larger samples provide more accurate results than smaller ones, but that’s not necessarily true. What matters more when studying a population is having a sample that closely resembles it, Michaela Mora explains on the website of her research firm, Relevant Insights.

“The sheer size of a sample is not a guarantee of its ability to accurately represent a target population,” writes Mora, a market researcher and former columnist for the Dallas Business Journal. “Large unrepresentative samples can perform as badly as small unrepresentative samples.”

When samples are representative, larger ones are more helpful than smaller ones. Larger samples allow researchers to investigate differences among sub-groups of the target population. Having a larger sample also improves the reliability of the results.
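A short simulation can illustrate the point about size versus representativeness. The sketch below builds an invented population of 100,000 adults in which one region holds noticeably different views from the rest of the country, then compares a large sample drawn only from that region with a much smaller random sample drawn from everyone. The population, the support rates and the function name `compare_samples` are all assumptions made for illustration.

```python
import random


def compare_samples(seed=0):
    """Compare a large one-region sample with a small nationwide random sample."""
    rng = random.Random(seed)

    # Hypothetical population: 30,000 adults live in one region where
    # about 70% support a policy; the other 70,000 support it at about 40%.
    population = []
    for i in range(100_000):
        in_region = i < 30_000
        support_rate = 0.70 if in_region else 0.40
        population.append((in_region, rng.random() < support_rate))

    true_share = sum(supports for _, supports in population) / len(population)

    # Large but unrepresentative: 10,000 people, all from the one region.
    region_only = [supports for in_region, supports in population if in_region]
    big_biased = rng.sample(region_only, 10_000)

    # Small but representative: 1,000 people chosen at random nationwide.
    everyone = [supports for _, supports in population]
    small_random = rng.sample(everyone, 1_000)

    big_estimate = sum(big_biased) / len(big_biased)
    small_estimate = sum(small_random) / len(small_random)
    return true_share, big_estimate, small_estimate


true_share, big_est, small_est = compare_samples()
print(f"True share in population:      {true_share:.3f}")
print(f"10,000-person one-region poll: {big_est:.3f}")
print(f"1,000-person random poll:      {small_est:.3f}")
```

Under these assumptions, the sample that is 10 times larger lands much farther from the true figure, because no amount of extra interviews in one region can capture what people elsewhere think.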

5. When creating samples for health and medical research, prioritizing certain demographic groups or failing to represent others can have long-term impacts on public health and safety.

Retzer says that too often, the people most likely to benefit from a new drug, vaccine or health intervention are not well represented in research. She notes, for example, that even though people of South Asian descent are more likely to have diabetes than people from other ethnic backgrounds, they are vastly underrepresented in research about diabetes.

“You can have the most beautiful, really lovely diabetes drug,” she says. “But if it doesn’t work for the majority of the population that needs it, how useful is it?”

Women remain underrepresented in some areas of health and medical research. It wasn’t until 1993 that the National Institutes of Health began requiring that women and racial and ethnic minorities be included in research funded by the federal agency. Before that, “it was both normal and acceptable for drugs and vaccines to be tested only on men — or to exclude women who could become pregnant,” Nature magazine points out in a May 2023 editorial.

In 2022, the U.S. Food and Drug Administration issued guidance on developing plans to enroll more racial and ethnic minorities in clinical trials for all medical products.

When journalists cover research, Retzer says it’s crucial they ask researchers to explain the choices they made while creating their samples. Journalists should also ask researchers how well their nationally representative samples represent historically marginalized groups, including racial minorities, sexual minorities, people from low-income households and people who don’t speak English.

“Journalists could say, ‘This seems like a really good finding, but who is it applicable to?’” she says.

The Journalist’s Resource thanks Chase Harrison, associate director of the Harvard University Program on Survey Research, for his help with this tip sheet.
