When the Institute for Health Metrics and Evaluation at the University of Washington released its estimate that COVID-19 had killed 912,345 people in the U.S. by May 6, 2021, many were shocked. That’s about 58% higher than the 578,555 coronavirus-related deaths officially reported to the U.S. Centers for Disease Control and Prevention over this same period.

How can two estimates differ so widely? It’s not like the Institute for Health Metrics and Evaluation researchers stumbled upon a morgue of more than 300,000 dead people who hadn’t been tracked elsewhere.

Here’s what goes into some of the various counts of COVID-19 pandemic deaths and how I as a statistician think about their differences.

## Tracking deaths

When someone dies, a medical professional records the immediate cause and up to three underlying conditions that “initiated the events resulting in death” on the death certificate. Death certificate information is transmitted to the National Vital Statistics System for a variety of public health uses, including tabulating the leading causes of death in the U.S.

But death certificate information may not reflect the actual number of COVID-19 deaths. A COVID-19 diagnosis could have been missed by health care workers, or the disease could have gone unrecorded on a death certificate. There’s always going to be some error in the data.

OBSERVED COUNT = TRUE COUNT + ERROR

That is, we want to know the real number of COVID-19 deaths in the U.S., the “true count.” But because the real world is messy, we’ll never know that true count and can only approximate it. The unknown true count combines with unknown errors to give us the observed count – for instance, the tally from all the nation’s death certificates.

If the predominant error is that some COVID-19-related deaths were missed – perhaps due to a lack of testing earlier in the pandemic – then the observed count would be an underestimate of the true count. However, there could be additional types of errors as well, and those may cause the observed count to deviate further or in other ways from the true count.
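
As a minimal sketch of that relationship, the snippet below treats the error as the net of wrongly attributed deaths minus missed deaths. The function name, the way the error is split into two parts, and all the numbers are hypothetical, chosen only to show the direction of the bias.

```python
def observed_count(true_count: int, missed: int, misattributed: int = 0) -> int:
    """Return OBSERVED COUNT = TRUE COUNT + ERROR.

    ERROR is modeled (as an assumption) as deaths wrongly attributed to
    COVID-19 minus COVID-19 deaths that were missed.
    """
    error = misattributed - missed
    return true_count + error


# Hypothetical numbers, not real data.
true_count = 1_000
only_missed = observed_count(true_count, missed=150)
mixed_errors = observed_count(true_count, missed=150, misattributed=40)

print(only_missed)   # 850: an undercount when missed deaths dominate
print(mixed_errors)  # 890: a second error type shifts the tally again
```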

## Calculating ‘all cause’ excess mortality

One way around this dilemma is to focus on how many deaths were recorded over and above the number that epidemiologists and statisticians would have expected had the pandemic not happened. This count is called “all cause” excess mortality. The expected number is based on pre-pandemic historical trends.

Estimates from this type of analysis suggest that the reported number of COVID-19 deaths may be an underestimate. Many more people died during the pandemic than normally would have during that period, and that excess is larger than the number of COVID-19 deaths recorded on death certificates.

For example, the estimated number of deaths above what was expected in 2020 was almost 412,000 people, while the number of deaths the CDC attributed to COVID-19 as of Jan. 6, 2021, was 356,000.
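
The arithmetic behind that comparison can be sketched as follows. The weekly counts are made-up placeholders, and the baseline rule (a plain average of the same week in prior years) is an assumption for illustration; real baseline models are more elaborate. Only the two totals at the end come from the figures above.

```python
from statistics import mean


def expected_deaths(prior_years: list[list[int]]) -> list[float]:
    """Baseline per week: the average of the same week across prior years."""
    return [mean(week) for week in zip(*prior_years)]


def excess_deaths(observed: list[int], expected: list[float]) -> float:
    """All-cause excess mortality: observed minus expected, summed over weeks."""
    return sum(o - e for o, e in zip(observed, expected))


# Hypothetical weekly all-cause deaths for three pre-pandemic years.
prior_years = [
    [100, 110, 105],
    [102, 108, 107],
    [98, 112, 103],
]
observed_weeks = [120, 150, 140]  # hypothetical pandemic-year weeks

baseline = expected_deaths(prior_years)
toy_excess = excess_deaths(observed_weeks, baseline)

# Figures quoted in the text.
EXCESS_2020 = 412_000
CDC_ATTRIBUTED = 356_000
not_on_certificates = EXCESS_2020 - CDC_ATTRIBUTED

print(toy_excess)            # 95
print(not_on_certificates)   # 56000
```

The second number is simply the part of the estimated excess that death certificates did not attribute to COVID-19; on its own it says nothing about what caused those deaths.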

This type of analysis cannot conclude that the excess deaths are due to COVID-19 itself, only that the aggregate impact of the pandemic resulted in more deaths than would have been expected in its absence.

## Reconsidering the number of expected deaths

So if by May 2021 there were 578,555 reported COVID-19-related deaths and perhaps as many as 663,000 excess deaths according to CDC data, how did the Institute for Health Metrics and Evaluation come up with the figure 912,345?

Their analysis seeks to determine the true number of COVID-19 deaths by estimating other effects due to the pandemic. IHME then uses its estimates of those effects to adjust the observed COVID-19 death count.

Some factors they considered would likely contribute to more deaths: delayed or deferred health care, untreated mental health disorders, and increased alcohol and opioid use during the pandemic. They also considered factors that would likely cut down on deaths: fewer injuries and reduced transmission of diseases other than COVID-19.

They then used these estimates to adjust the expected number of deaths in an effort to better quantify the number of deaths attributable to COVID-19. In effect, they were applying these pandemic-specific “errors” to the excess death estimates that were based on pre-pandemic historical trends.
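
One way to picture that adjustment is the sketch below. It is not IHME's actual model: the function, the factor names and every number are hypothetical, and the factors are simply added or subtracted, which is a strong simplifying assumption.

```python
def covid_attributable(
    observed_all_cause: int,
    baseline_expected: int,
    added_by_pandemic: dict[str, int],
    averted_by_pandemic: dict[str, int],
) -> int:
    """Observed deaths minus an expected count adjusted for pandemic side effects."""
    adjusted_expected = (
        baseline_expected
        + sum(added_by_pandemic.values())    # non-COVID deaths the pandemic added
        - sum(averted_by_pandemic.values())  # deaths the pandemic prevented
    )
    return observed_all_cause - adjusted_expected


# Hypothetical inputs, not real data.
observed = 1_500
baseline = 1_000
added = {"delayed_care": 40, "untreated_mental_health": 20, "alcohol_and_opioids": 30}
averted = {"fewer_injuries": 50, "less_other_disease": 120}

raw_excess = observed - baseline
adjusted = covid_attributable(observed, baseline, added, averted)

print(raw_excess)  # 500
print(adjusted)    # 580
```

With these made-up inputs the averted deaths outweigh the added ones, so the adjusted figure comes out above the raw excess. That is the same direction as the gap between the IHME estimate and the excess-death figure, though the real sizes of these effects are exactly what such an analysis has to estimate.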

Ideally, this type of analysis should result in excess mortality being a better measure of the number of deaths that can be attributed to COVID-19. It depends, though, on having sufficient detailed data available and requires certain assumptions about that data.

## So which number is right?

Such a simple question is actually quite hard to answer for many reasons.

One is that each number is the answer to a different question. The number of “all cause” excess deaths quantifies how many people died from any cause above what we would have expected if the death rate during the pandemic had followed pre-pandemic patterns. The Institute for Health Metrics and Evaluation number is an estimate of the total number of deaths that can be attributed to COVID-19. Both are useful for understanding the impact of the pandemic.

Yet, even two estimates of the total number of COVID-19 deaths are going to differ because the estimates could be based on different methodologies, different sources of data and different assumptions. That’s not necessarily a problem. It may be that the results turn out to be relatively consistent, suggesting the conclusions don’t depend on the assumptions. Alternatively, if the results are very different, that can help researchers understand the problem better.

Even small differences between studies can, unfortunately, sow distrust in science for some people. But such differences are part of the scientific method, in which studies get reviewed by researchers’ peers, questioned and dissected, and then revised as a result. Science is an iterative process in which gut instinct and guesses get refined into theories and then may be subsequently refined into facts and knowledge.

In this case, the Institute for Health Metrics and Evaluation study provides some evidence of what researchers like me suspected: The number of excess deaths in the U.S., while larger than the number of deaths attributed to COVID-19, may also be an undercount of the true number of COVID-19 deaths. It is also consistent with a World Health Organization analysis that concludes the number of COVID-19 deaths in some countries could be two to three times greater than the number recorded. But no single study offers definitive proof, just one more piece of evidence on the path to better understanding the deadly impact of this pandemic.

Ronald D. Fricker Jr., Professor of Statistics and Senior Associate Dean, Virginia Tech