Epidemiological models: 10 things to know about coronavirus research

When researchers study the transmission of an infectious disease such as COVID-19 or want to make predictions about how it might impact people in the future, they create epidemiological models.

These models can be computer simulations or other mathematical representations of the virus and its impacts. Government officials and public health leaders rely on them to make decisions affecting public health. That’s why journalists covering the new coronavirus need a basic understanding of what epidemiological models can and cannot do and how to explain the knowledge they reveal.

“Mathematics may sound like an unlikely hero to help us overcome a global epidemic; however, the insights we gain from studying the dynamics of infectious diseases by using equations describing fundamental variables are not to be underestimated,” the editors of the PLOS ONE academic journal write in a recent article.

“Mathematical modelers make use of available data from current and previous outbreaks to predict who may get infected, where vaccination efforts will be most effective, and how to limit the spread of the disease.”

Another reason reporters need to know about epidemiological modeling: As COVID-19 infection and death rates have risen across our planet, thousands of new academic papers have flooded the internet, many of which are based on models. Most papers posted online and uploaded to preprint research servers have not been vetted by scholars. In fact, only a small fraction have undergone formal peer review, a process in which experts in that particular field of study analyze and critique the paper and help guide revisions.

Journalists unfamiliar with models such as the commonly used SIR model will have trouble spotting problems in these new studies. They’re more likely to make mistakes and might unknowingly exclude crucial context.

So what key things should reporters know? How can they do a better job explaining studies and their findings? We reached out to researchers and science writers to ask. Here’s what they told us:

Make it clear in your coverage that models are only as good as the data used to build them, and that researchers currently lack high-quality data about this pandemic.

The late statistician George E. P. Box is famous for saying, “All models are wrong, but some are useful.”

To create models to better understand an infectious disease, researchers must use the data they have at the moment, even if it’s low quality, Helen Jenkins, an epidemiologist and assistant professor of biostatistics at the Boston University School of Public Health, explained by e-mail.

There’s still a lot that researchers don’t know about the new coronavirus and COVID-19, the disease it causes. Among the missing data: Reliable estimates for the number of people who have contracted COVID-19, recovered from it and died because of it.

“In this pandemic, there’s a lot of poor-quality data around — for example, case reporting is so dependent on testing and hugely underestimates true cases,” Jenkins explained. “Models are only as good as the data that has gone into them and not usually acknowledged in reporting of models.”

Explain to your audience that researchers also make assumptions when creating models.

Researchers hoping to learn something new about COVID-19 also must make assumptions in order to build their models, according to Jenkins. For example, they might assume that people in a certain location interact with one another in a certain way, or that people in the U.S. will be hospitalized with COVID-19 at rates similar to people in Wuhan, China, where the new coronavirus likely originated.

It’s important for journalists to explain data quality problems and the assumptions that go into building a model. That necessary context helps the public understand the shortcomings of mathematical models.

“Often, I think the general public thinks that a model is a perfect crystal ball into the future and does not recognize those caveats that apply,” she explained.

Keep in mind that researchers use a variety of models to study infectious diseases. They are designed to answer different questions.

Models can vary significantly in terms of the questions they tackle, the data researchers use to build them and how they analyze information. Researchers use some models to study the behavior of an entire population of people, while others allow them to examine the behavior of individual people.

A model that journalists are sure to come across because of its frequent use is the SIR model. This model, which can help researchers better understand how a disease spreads and how to prevent it, divides people into three categories based on their relationship to a disease:

1) “Susceptibles” refers to people who have not yet contracted a virus.
2) “Infectives” are those who are infected and can pass the virus to others.
3) “Removed” is the label given to people who have either recovered and become immune or who have died of the disease.
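For reporters who want to see how the three categories fit together, here is a minimal sketch of an SIR model in Python. It is an illustration under stated assumptions, not any research group's actual model: the function name `simulate_sir`, the one-day time step, and the values chosen for `beta` (how quickly infections spread) and `gamma` (how quickly infected people move to the "removed" group) are all hypothetical and are not estimates for COVID-19.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Run a simple SIR model in one-day steps.

    beta and gamma are illustrative inputs, not measured values.
    Returns a list of (susceptible, infected, removed) tuples, one per day.
    """
    s = float(population - initial_infected)  # Susceptibles
    i = float(initial_infected)               # Infectives
    r = 0.0                                   # Removed
    history = [(s, i, r)]
    for _ in range(days):
        # New infections depend on contact between susceptible and infected people.
        new_infections = beta * s * i / population
        # A fixed share of infected people recover or die each day.
        new_removals = gamma * i
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


# Hypothetical town of 1,000 people with one initial infection.
history = simulate_sir(population=1000, initial_infected=1,
                       beta=0.3, gamma=0.1, days=160)
peak_day, peak = max(enumerate(history), key=lambda item: item[1][1])
print(f"Infections peak on day {peak_day} at about {peak[1]:.0f} people")
```

Changing `beta` or `gamma` even slightly shifts the timing and size of the peak, which is one reason the assumptions behind a model matter so much.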

Mandy Izzo, senior science writer at the Institute for Disease Modeling, a private research organization in Washington, recommended journalists check out a comic on pandemic modeling that FiveThirtyEight published last month.

“I would say the cleanest breakdown for why modeling is so difficult is the FiveThirtyEight cartoon — it’s quite nuanced and does a great job explaining all of the pieces that go into the models,” she told Journalist’s Resource via email.

When reporting on a model that makes a numerical prediction — for example, the number of Americans who will die from COVID-19 during a period of time in the future — emphasize that the prediction is a ballpark estimate represented by a range of possible numbers.

This type of model generally doesn’t generate one number. Its predictions are estimates presented as a range of values. Although journalists tend to focus on one number — the average, or the highest or lowest number — they actually should report the full range of possibilities, suggested Brooke Nichols, a health economist and infectious disease mathematical modeler.

“I see in mainstream media many articles citing: ‘85,000 deaths predicted by XYZ modeling group in XZY city!’ whereas models are more likely to say, ‘between 20,000-130,000 deaths predicted,’” Nichols pointed out during an email interview. “Understanding and expressing uncertainty in mathematical modeling results is key.”
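To see why a model produces a range rather than a single number, consider this rough Python sketch. It runs the same simple SIR-style calculation several times, each with a different assumed transmission rate, and reports the lowest, middle and highest outcomes. Everything here is hypothetical: the function names, the population size, the list of `betas` and the `gamma` value are illustrative assumptions, and the output counts infections in an imaginary population, not deaths from COVID-19.

```python
from statistics import median


def final_ever_infected(population, initial_infected, beta, gamma, days):
    """Return how many people were ever infected after `days` daily steps."""
    s = float(population - initial_infected)  # still susceptible
    i = float(initial_infected)               # currently infected
    for _ in range(days):
        new_infections = beta * s * i / population
        s -= new_infections
        i += new_infections - gamma * i
    # Everyone who is no longer susceptible has been infected at some point.
    return population - s


def outcome_range(betas, population=100_000, initial_infected=10,
                  gamma=0.1, days=365):
    """Run one scenario per assumed transmission rate; return (low, median, high)."""
    outcomes = sorted(
        final_ever_infected(population, initial_infected, b, gamma, days)
        for b in betas
    )
    return outcomes[0], median(outcomes), outcomes[-1]


# Five equally plausible (and entirely made-up) transmission rates.
low, mid, high = outcome_range([0.15, 0.20, 0.25, 0.30, 0.35])
print(f"Between {low:,.0f} and {high:,.0f} people infected "
      f"(middle scenario: {mid:,.0f})")
```

Reporting only the middle scenario would hide how far apart the low and high results are, which is the uncertainty Nichols urges journalists to convey.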

Tell your audience what the study adds to what we know about that particular topic and which big questions remain.

“That’s almost a critical framing for any new COVID-19 studies: Here is what the study does add to our understanding and here are things it can’t tell us,” says Dylan H. Morris, a doctoral student studying mathematical biology at Princeton University who has co-authored academic journal articles on predictive modeling and COVID-19. “If journalists go in with that framing in their head … that will help readers understand this is one piece of the puzzle. It’s not the solution.”

Ask these seven questions when interviewing researchers about epidemiological models.

Izzo and Morris say these questions are key:

What type of model was used and what are its strengths and weaknesses?
What assumptions went into creating the model?
What was this model designed to do?
Where did the data used for the model come from and how did using this specific data affect results?
What factors or data were intentionally left out of this study and why?
Does this study focus on a best-case or worst-case scenario?
What caveats must be included in an explanation of this study’s findings?

Give additional scrutiny to models created by researchers who have not demonstrated expertise in model building.

Scholars who don’t have experience making epidemiological models but want to help answer the myriad questions we have about the coronavirus have been creating models and posting their findings online. Some of these researchers are making basic mistakes in the use and interpretation of their models, Morris says.

He urges journalists to ask researchers with model-building experience — who did not play a role in conducting these studies — for help spotting such errors.

“People are hungry for the one answer and precisely because of this, it’s important that journalists and scientists work together in a constantly evolving process of discovery,” Morris says.

Nichols recommended reporters look into modelers’ backgrounds, including their publication history. “Google the researcher or research group to see whether they have many peer-reviewed publications in the type of modeling they are reporting on,” she wrote.

One red flag to watch for: a study based on one type of model, such as a dynamic transmission model, whose author has experience only with a different type, such as a statistical model.

Be leery of epidemiology models from scientists who aren’t experts in epidemiology.

Just because a researcher has created successful models to investigate other health science topics in the past doesn’t guarantee that person’s current epidemiological model is sound, or that it’s the best type of model for studying that particular problem. “I’ve noticed models coming out of groups that actually do, say, diabetes [research], and are now attempting infectious disease modeling for the first time,” Nichols explained.

Use Twitter to find out what academics and others are saying about new research.

In recent months, academics have been weighing in on new coronavirus research via social media, especially Twitter. Following these conversations is another way to gauge the strengths and weaknesses of a new paper and the model it’s based on.

Jenkins advised reaching out to experts and following the Twitter conversations of those you know and trust.

“The reason it’s good to do both is that many researchers are overwhelmed with journalist requests and make the effort to write Twitter threads to communicate to a wide audience,” she noted. “Using that resource will save time — a researcher can then not have to repeat all the same points in a phone interview.”

Learn more about epidemiological models. It will help you ask stronger questions and better explain coronavirus research in plain language.

Izzo suggested journalists check out these resources:

In the May 4, 2020 episode of This Podcast Will Kill You, Mike Famulare, a senior research scientist at the Institute for Disease Modeling, explains the basics of modeling and how to evaluate models.
A recent blog item from PATH, a nonprofit formerly known as the Program for Appropriate Technology in Health, discusses COVID-19 modeling in lay terms.
In this video from Oxford University’s Mathematical Institute, research fellow Robin Thompson gives a public lecture on the modelling of infectious diseases such as COVID-19.
Pennsylvania State University offers a free online course, “Epidemics: the Dynamics of Infectious Diseases,” through the Coursera education platform.

Take a look at our other coronavirus-related resources, including tips on covering biomedical research preprints and why journalists shouldn’t use the terms “patient zero” and “party zero” in their news coverage.

We’ve also pulled together research on topics such as consumer spending in the wake of the pandemic and how public health messaging might help explain why the coronavirus appears to have a disparate impact on racial and ethnic minorities.

About The Author

Denise-Marie Ordway