Back in August I did my first detailed Excess Deaths assessment (See Link), comparing deaths from 2017 and 2018 in the CDC’s “Wonder” database with the CDC’s provisional COVID-19 death counts (Link). Using this data I was able to break deaths down by state and by 10-year age group. What I found was interesting. To summarize, there were significant excess deaths in the groups you would expect (65+ years old, Northeastern states and DC). But the really interesting (and concerning!) thing I discovered was that there were significant excess deaths in younger demographics who had been lightly impacted by COVID-19.
Quick Explanation of Methodology
The CDC Wonder Database allows one to search for total deaths by all types. The data is very detailed but it isn’t recent. In general the newest data in Wonder is 2 years old. Knowing that 2017 was a “high death” year due to large numbers of flu deaths and that 2018 was a bit below average, I decided to take these two years and average the deaths as my baseline to compare to 2020 data. The data from Wonder can be aggregated across regions (I chose States) as well as by demographics (I chose age in 10 year groupings).
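The baseline step can be sketched in a few lines. This is only an illustration, assuming the Wonder export has been reduced to (state, age group, year, deaths) rows; the row layout, the function name, and the numbers are placeholders of mine, not real CDC figures.

```python
from collections import defaultdict

# Hypothetical rows: (state, age_group, year, total_deaths).
# The values are placeholders for illustration, not real CDC Wonder data.
wonder_rows = [
    ("XX", "25-34", 2017, 1000),
    ("XX", "25-34", 2018, 900),
    ("XX", "65-74", 2017, 5200),
    ("XX", "65-74", 2018, 4800),
]

def baseline_deaths(rows, years=(2017, 2018)):
    """Average total deaths over the baseline years for each (state, age group)."""
    totals = defaultdict(list)
    for state, age_group, year, deaths in rows:
        if year in years:
            totals[(state, age_group)].append(deaths)
    return {key: sum(vals) / len(vals) for key, vals in totals.items()}

baseline = baseline_deaths(wonder_rows)
print(baseline[("XX", "25-34")])  # 950.0
```

Averaging a high year and a low year is a cheap way to keep one unusual flu season from skewing the comparison.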
The 2020 provisional death data put out by the CDC can also be grouped in similar ways (states and 10 year groupings). Plus, in addition to providing COVID-19, Influenza, and Pneumonia deaths, it also provides total death numbers for these groupings. This allows for an easy comparison. It is unclear how CDC arrives at these numbers, but they don’t seem to be extremely laggy and they line up more or less with the numbers from Johns Hopkins. Here’s a picture of the website where you can pull the data. As you can see, the claim from the CDC is that the data is as of 10/14.
Since the year is still not over, I’m doing a very simple scaling assuming that the death rate will continue at the current rate for the rest of the year. This isn’t a solid assumption, but I don’t think it matters much. Since we’re in October, 10 months along, I used a scaling factor of 1.2. Back in August (when the data was lagging a bit) I used a scaling factor of just under 2, accounting for 7 months of data.
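For anyone who wants to check the scaling, here is a minimal sketch of the constant-rate assumption. The function name and the example count are hypothetical; the only thing taken from the post is that 10 months of data gives a factor of 12/10 = 1.2.

```python
def annualize(deaths_to_date, months_of_data):
    """Scale a partial-year count to 12 months, assuming a constant monthly rate."""
    if not 0 < months_of_data <= 12:
        raise ValueError("months_of_data must be in (0, 12]")
    factor = 12 / months_of_data
    return deaths_to_date * factor, factor

# 1000 is a placeholder count, not a real number from the CDC data.
projected, factor = annualize(1000, 10)
print(factor)  # 1.2
```

The same factor is applied to every state/age group, so it changes the level of the projected totals but not how the groups compare to each other.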
Changes from July/August Data
Just to cut quickly to the chase, I noticed a number of changes from my last post on excess deaths from August.
- Excess death percentages in the over 65 age population have decreased quite a bit. In August there were numerous states where this population had over 150% excess deaths, but in the current results I only see two older age demographics in the top ten. Note that since we’re comparing 2020 COVID/Flu/Pneumonia deaths with overall 2017-18 averages for each age group, this accounts for cases where total death numbers for older demographics are much larger than death numbers for younger demographics.
- Excess deaths for younger demographics, particularly 25-34 and 35-44, have remained the same. This implies to me that the rate of overall excess deaths for these groups has stayed consistent while the rate of excess deaths for the older generation has fallen significantly. This is not surprising to anyone who has watched the data because it’s clear that even while COVID cases rise and fall, COVID deaths have been falling everywhere (for lots of good reasons). BUT, whatever is killing the younger demographic at higher rates than normal years has yet to slow down.
- Overall COVID/Flu/Pneumonia Deaths as a percent of 2017-18 averages has fallen since August. This also aligns with the sharp decrease in COVID deaths since July/August.
- Washington DC seems to have excess deaths across all age demographics. Note that the 5-14 year group’s 250% excess death number is only like 5 excess deaths… I’m not sure I could make a good guess as to why DC’s numbers are so high. Maybe someone can weigh in on this?
You can see the data yourself in the table below (sorted by 2020 excess death percentage). Yellow indicates a state/demographic pair that has low COVID/flu/pneumonia impact (around 15% or less) but still has high excess deaths.
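The yellow highlighting amounts to a simple rule on two percentages. Here is a rough sketch, assuming both numbers are expressed as a percent of the 2017-18 baseline. The 15% cutoff comes from the text above; the 110% cutoff for "high" excess deaths is my own assumed placeholder, as are the function names and example counts.

```python
def pct_of_baseline(value, baseline):
    """Express a count as a percentage of the 2017-18 baseline average."""
    return 100.0 * value / baseline

def flag_yellow(total_2020, cfp_2020, baseline, cfp_max=15.0, excess_min=110.0):
    """True when COVID/flu/pneumonia (cfp) deaths are a small share of baseline
    but projected total deaths are well above it.
    cfp_max follows the ~15% in the post; excess_min is an assumed cutoff."""
    total_pct = pct_of_baseline(total_2020, baseline)
    cfp_pct = pct_of_baseline(cfp_2020, baseline)
    return cfp_pct <= cfp_max and total_pct >= excess_min

# Placeholder counts, not real data.
print(flag_yellow(total_2020=1300, cfp_2020=100, baseline=1000))  # True
print(flag_yellow(total_2020=1300, cfp_2020=400, baseline=1000))  # False
```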
I also showed an overall histogram of excess deaths in my last post. A histogram is a chart that counts how many samples fall into each bin. In this case, each sample is a state/demographic pair and the histogram is plotted over 80 bins that range from around 10% of 2017-18 deaths up to around 150% of 2017-18 deaths, so each bin represents roughly 2%. The peak of the histogram is a bin at around 90%, where about 60 state/demo pairs fell. If you treat this as the mean and the histogram as a rough bell curve (normal distribution), then you can see that, using this method and based upon the CDC’s 2020 death projection numbers, the overall excess death distribution for 2020 has shifted to the left since August, when the peak value was in the bin that represented 110% (go back and look… don’t take my word for it!). This also makes sense knowing that the high death rates from April through June have slowed.
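If the binning is hard to picture, here is a minimal sketch of it in plain Python. It assumes 80 equal-width bins from 10% to 150%, matching the description above; the sample values are placeholders, and this is not the plotting code behind the chart.

```python
def histogram(values, n_bins=80, lo=10.0, hi=150.0):
    """Count values into n_bins equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v <= hi:
            # A value equal to hi is placed in the last bin.
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
    return counts, width

# Placeholder percent-of-baseline values, one per state/demographic pair.
samples = [88.0, 90.6, 91.0, 91.2, 110.0, 150.0]
counts, width = histogram(samples)
peak = max(range(len(counts)), key=counts.__getitem__)
print(width)                # 1.75
print(peak, counts[peak])   # 46 3
```

With these settings each bin is 1.75 percentage points wide, which is where the "roughly 2%" comes from.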
Since I was curious, I wrote code to plot the histograms for each age demographic to see how they related to each other. It’s a bit messy, but you can see in the legend which colors correspond to which demographic. Key takeaways from this visualization are that 1) 35-44 has been hardest hit, followed by 25-34, at least on an excess death percentage basis, 2) 65-74 seems to be slightly below the 100% that would represent the 2017-18 average, and 3) 5-14 and 15-24 have fewer deaths than the 2017-18 average.
Highly Reported-on CDC Excess Death Pre-print (from 10/20) – take it with a grain of salt.
On October 20th a CDC scientist released a pre-print that the CDC published here. The assessment of the authors, based upon their simulation, is that there were 299K excess deaths in the US during 2020. Of course, this was immediately picked up by our fearless media. In many cases, they reported on the pre-print incorrectly because the statistics in the pre-print go a bit beyond that of a newspaper data scientist. Actually, the statistics in the pre-print are a bit muddy and don’t seem to line up in places, so I can’t blame the journalists much. I might write a longer report on this paper if I get time, but I’m not confident in their simulation’s assumptions on a typical year-to-year death growth rate, and they don’t account for deaths that didn’t occur because a sick person died of COVID first. Their overall numbers don’t match the ones that CDC publishes in the provisional 2020 death numbers either, which is problematic. I took a stab at replicating their model using a much simpler and more reasonable regression model than the one they selected, and their 299,000 number (compared to the 2015-2019 average) appears to represent expected growth in deaths, not excess deaths (see chart after conclusions). We’ll have to wait for the actual paper to come out with all the details, I guess. Of course, the Washington Post didn’t wait.
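To show why a trend matters here, below is a hedged sketch of the general idea using an ordinary least-squares line through yearly totals. This is my own illustration of the concept, not the model in the chart and not the pre-print's method, and the yearly values are made-up placeholders in arbitrary units.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Placeholder yearly totals (arbitrary units), NOT real death counts.
years = [2015, 2016, 2017, 2018, 2019]
deaths = [100.0, 102.0, 104.0, 106.0, 108.0]

slope, intercept = fit_line(years, deaths)
expected_2020 = slope * 2020 + intercept   # trend projection for 2020
average = sum(deaths) / len(deaths)        # flat five-year average

# Growth that a comparison against the flat average would count as "excess"
# even if 2020 landed exactly on the existing trend.
growth_gap = expected_2020 - average
print(slope, expected_2020, growth_gap)
```

When deaths are already rising year over year, comparing 2020 against a flat multi-year average will show a gap even in a year that simply follows the trend, which is the distinction between expected growth and excess deaths.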
Conclusions
It is tough to make any solid projections based on ANY COVID-19 data. It is always possible that the CDC’s data is inaccurate (it usually is… these kinds of things are infamously hard to measure). And clearly 2020 is a unique year for deaths. It isn’t clear from the CDC’s data that COVID-19 has created significant excess deaths, however.
The really serious question is about the real excess deaths that haven’t slowed down in the younger demographics. This problem is not coming from deaths due to COVID but is likely related to the anxiety and stress created by COVID, by government actions that are aimed at reducing or eliminating COVID cases, by isolation, etc. Unfortunately there is a lot of evidence coming out that these governmental actions haven’t been exceptionally effective (a quick look at COVID case rates across various new government actions shows that they haven’t had very measurable impacts). The other takeaway is that excess deaths for ages younger than 15 have been much less than 2017-18 averages. The combination of being isolated from society (driving in cars less, less exposure to disease, etc.) and the lack of an effect on this group from COVID are likely the cause.
Backup: Tod’s “Simpler” 2020 excess death model
Thanks Tod.
For some normal/new folks seeing the data, a small definition of excess deaths in the very beginning would help.
I also wonder how the chart tracks relative to population growth and population density. Hmmm.
Good point, Ray! My thinking (check me on this) is that population density is already factored in since we’re comparing a region to itself 2-3 years ago. The population density in the region has gone up (and I’m normalizing deaths by the population of the related year) but I’m assuming any effect due to increased density is negligible.
T.