The Rarity of High Deaths per 1000 People

I was looking at COVID-19 data sorted by case count and noticed that the Dakotas and Wisconsin were at the top of the list. Then I looked a column over and realized that all of those regions still had low deaths per 1000 people. It made me curious about how common it is for a region to have a high death rate.

So I built a histogram of all the counties in the U.S. and binned them by their deaths per 1000 persons. Just as a reminder, the height of the bar represents the number of counties in the bin. For instance, the tall bar on the far left represents about 500 counties all of which have less than about 0.1 deaths per 1000 persons. The really short bars on the far right represent the one or two counties with over 3.0 deaths per 1000 persons (0.3%).

I put labels on the histogram to identify which bars well-known counties fall in (yes, it’s biased towards Arizona).

Yes, the NYC boroughs (Queens, Bronx, Manhattan) all are still at the top of the list, but their death rates have slowed significantly from the peak rates back in March/April.

Also, the red line is an exponential function fit to the decay of the histogram. In other words, the number of counties at a given death rate falls off roughly exponentially as the death rate rises, so a county with a large death rate is rare. The fit is approximately COUNTIES = 432*e^(-2.5*x) + 4.3, where x is the deaths per 1000 persons.
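
For anyone who wants to try this kind of fit, here is a minimal Python sketch. It assumes the curve has the form a*e^(-b*x) + c, where x is the deaths per 1000 persons at the center of each bin and the result is the number of counties in that bin. The bin centers and counts below are synthetic placeholders generated from the formula itself, not the real county data, and the hypothetical `fit_exp_decay` helper (scan b, then solve a and c by least squares) is just one simple way to do it, not necessarily how the red line was produced.

```python
import math


def fit_exp_decay(xs, ys, b_grid):
    """Fit y ~ a*exp(-b*x) + c by scanning b and solving a, c in closed form."""
    n = len(xs)
    best = None
    for b in b_grid:
        # For a fixed b the model is linear in a and c.
        e = [math.exp(-b * x) for x in xs]
        se = sum(e)
        see = sum(v * v for v in e)
        sy = sum(ys)
        sey = sum(v * y for v, y in zip(e, ys))
        det = n * see - se * se
        if abs(det) < 1e-12:
            continue
        a = (n * sey - se * sy) / det
        c = (sy - a * se) / n
        # Sum of squared errors for this candidate b.
        sse = sum((a * v + c - y) ** 2 for v, y in zip(e, ys))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    _, a, b, c = best
    return a, b, c


# Placeholder data: 30 bins of width 0.1, counts generated from the formula.
xs = [0.05 + 0.1 * i for i in range(30)]
ys = [432 * math.exp(-2.5 * x) + 4.3 for x in xs]

a, b, c = fit_exp_decay(xs, ys, [i / 100 for i in range(50, 501)])
print(round(a, 1), round(b, 2), round(c, 1))
```

With real bin counts the recovered values would only be approximate, and a finer grid for b (or a proper nonlinear optimizer) would be the natural next step.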

2020 Excess Deaths Update – October 20

Back in August I did my first detailed Excess Deaths assessment (See Link), comparing deaths from 2017 and 2018 in the CDC’s “Wonder” database with the CDC’s provisional COVID-19 death counts (Link). Using this data I was able to break down deaths by state and by 10-year age group. What I found was interesting. To summarize, there were significant excess deaths in the groups you would expect (65+ years old, Northeastern states and DC). But the really interesting (and concerning!) thing I discovered was that there were significant excess deaths in younger demographics that had been only lightly impacted by COVID-19.

Quick Explanation of Methodology

The CDC Wonder Database allows one to search for total deaths by all types. The data is very detailed but it isn’t recent. In general the newest data in Wonder is 2 years old. Knowing that 2017 was a “high death” year due to large numbers of flu deaths and that 2018 was a bit below average, I decided to take these two years and average the deaths as my baseline to compare to 2020 data. The data from Wonder can be aggregated across regions (I chose States) as well as by demographics (I chose age in 10 year groupings).

The 2020 provisional death data put out by the CDC can also be grouped in similar ways (states and 10 year groupings). Plus, in addition to providing COVID-19, Influenza, and Pneumonia deaths, it also provides total death numbers for these groupings. This allows for an easy comparison. It is unclear how CDC arrives at these numbers, but they don’t seem to be extremely laggy and they line up more or less with the numbers from Johns Hopkins. Here’s a picture of the website where you can pull the data. As you can see, the claim from the CDC is that the data is as of 10/14.

Since the year is still not over, I’m doing a very simple scaling assuming that the death rate will continue at the current rate for the rest of the year. This isn’t a solid assumption, but I don’t think it matters much. Since we’re in October, 10 months along, I used a scaling factor of 1.2. Back in August (when the data was lagging a bit) I used a scaling factor of just under 2, accounting for 7 months of data.
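
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the method as described: the function names are made up for this example and the death counts are placeholders, not CDC figures.

```python
def baseline_deaths(deaths_2017, deaths_2018):
    """Baseline is the simple average of the 2017 and 2018 totals."""
    return (deaths_2017 + deaths_2018) / 2


def scale_factor(months_of_data):
    """Project a partial year to 12 months at the current rate."""
    return 12 / months_of_data


def percent_of_baseline(deaths_2020_to_date, months_of_data,
                        deaths_2017, deaths_2018):
    """Projected 2020 deaths as a percent of the 2017-18 average."""
    projected = deaths_2020_to_date * scale_factor(months_of_data)
    return 100 * projected / baseline_deaths(deaths_2017, deaths_2018)


# Placeholder numbers for one state/age-group pair.
pct = percent_of_baseline(1000, 10, 1150, 1050)
print(round(scale_factor(10), 2), round(pct, 1))
```

Note that a pure 12/7 ratio is about 1.71; the just-under-2 factor used in August reflects the extra allowance for lagging data mentioned above.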

Changes from July/August Data

Just to cut quickly to the chase, I noticed a number of changes from my last post on excess deaths from August.

  1. Excess death percentages in the over 65 age population had decreased quite a bit. There were numerous states where this population had over 150% excess deaths but in the current results, I only see two older age demographics in the top ten. Note that since we’re comparing 2020 COVID/Flu/Pneumonia deaths with overall 2017-18 averages for each age group, this accounts for cases where total death numbers for older demographics are much larger than death numbers for younger demographics.
  2. Excess deaths for younger demographics, particularly 25-34 and 35-44, have remained the same. This implies to me that the rate of overall excess deaths for these groups has stayed consistent while the rate of excess deaths for the older generation has fallen significantly. This is not surprising to anyone who has watched the data because it’s clear that even while COVID cases rise and fall, COVID deaths have been falling everywhere (for lots of good reasons). BUT, whatever is killing the younger demographic at higher rates than normal years has yet to slow down.
  3. Overall COVID/Flu/Pneumonia Deaths as a percent of 2017-18 averages has fallen since August. This also aligns with the sharp decrease in COVID deaths since July/August.
  4. Washington DC seems to have excess deaths across all age demographics. Note that the 5-14 year group’s 250% excess death number is only like 5 excess deaths… I’m not sure I could make a good guess as to why DC’s numbers are so high. Maybe someone can weigh in on this?

You can see the data yourself in the table below (sorted by 2020 excess death percentage). Yellow indicates a state/demographic pair that has low COVID/flu/pneumonia impact (around 15% or less) but still has high excess deaths.

Merged Table of CDC 2017-18 average numbers compared to CDC Provisional 2020 death numbers. 10/20/2020
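
Here is a rough Python sketch of how the sorting and yellow highlighting could be expressed. The rows, the field names, and the 110% cutoff for “high” excess deaths are all assumptions made for illustration; only the roughly 15% COVID/flu/pneumonia threshold comes from the description above.

```python
# Placeholder rows: percent of 2017-18 baseline for all deaths, and
# COVID/flu/pneumonia deaths as a percent of the same baseline.
rows = [
    {"state": "State A", "age": "25-34", "pct_of_baseline": 125.0, "cfp_pct": 4.0},
    {"state": "State B", "age": "75-84", "pct_of_baseline": 130.0, "cfp_pct": 28.0},
    {"state": "State C", "age": "5-14", "pct_of_baseline": 85.0, "cfp_pct": 1.0},
]


def is_yellow(row, max_cfp_pct=15.0, min_pct_of_baseline=110.0):
    """Low COVID/flu/pneumonia impact but still high overall deaths."""
    return (row["cfp_pct"] <= max_cfp_pct
            and row["pct_of_baseline"] >= min_pct_of_baseline)


# Sort the table by excess death percentage, highest first.
table = sorted(rows, key=lambda r: r["pct_of_baseline"], reverse=True)
yellow = [r["state"] for r in table if is_yellow(r)]
print(yellow)
```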

I also showed an overall histogram of excess deaths in my last post. This histogram is a type of chart that measures “counts” of samples that fit into a specific bin. For instance, in this case, each sample is a state/demographic pair and the histogram is plotted over 80 bins that range from around 10% of 2017-18 deaths up to around 150% of 2017-18 deaths. So each bin represents roughly 2%. We can see in this histogram that the peak is where about 60 state/demo pairs fell into a bin that looks like around 90%. If you see this as the mean and the histogram as a rough bell curve (normal distribution), then you can see that, using this method and based upon the CDC’s 2020 death projection numbers, the overall excess death distribution for 2020 has shifted to the left since August (when the peak value was in the bin that represented 110%; go back and look… don’t take my word for it!). This also makes sense knowing that the high death rates from April through June have slowed.

Combined age groups’ histogram of 2020 excess deaths – October 20, 2020
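
To make the binning concrete, here is a small Python sketch of an 80-bin histogram over the 10% to 150% range, which works out to 1.75 percentage points per bin (the “roughly 2%” above). The sample values are placeholders, not the state/demographic results, and the `histogram` helper is written just for this example.

```python
def histogram(values, low, high, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if v < low or v > high:
            continue  # ignore values outside the plotted range
        # A value exactly at the upper edge goes into the last bin.
        idx = min(int((v - low) / width), n_bins - 1)
        counts[idx] += 1
    return counts, width


# Placeholder percent-of-baseline values for a few state/demo pairs.
sample = [88, 89, 89.5, 91, 110, 150, 10]
counts, width = histogram(sample, 10, 150, 80)

# The peak bin is the one with the largest count.
peak = max(range(len(counts)), key=counts.__getitem__)
peak_left_edge = 10 + peak * width
print(width, peak, peak_left_edge)
```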

Since I was curious, I wrote code to plot the histograms for each age demographic to see how they related to each other. It’s a bit messy, but you can see in the legend which colors correspond to which demographic. Key takeaways from this visualization are that 1) 35-44 has been hardest hit, followed by 25-34, at least on an excess death percentage basis, 2) 65-74 seems to be slightly below the 100% mark, which represents the 2017-18 average, and 3) 5-14 and 15-24 have had fewer deaths than the 2017-18 average.

Overlapping Histograms for Each Age Demographic.

Highly Reported-on CDC Excess Death Pre-print (from 10/20) – take it with a grain of salt.

On October 20th a CDC scientist released a pre-print that the CDC published here. The assessment of the authors, based upon their simulation, is that there were 299K excess deaths in the US during 2020. Of course, this was immediately picked up by our fearless media. In many cases, they reported on the pre-print incorrectly because the statistics in the pre-print go a bit beyond that of a newspaper data scientist. Actually, the statistics in the pre-print are a bit muddy and don’t seem to line up in places, so I can’t blame the news journalist folks much. I might write a longer report on this paper if I get time, but I’m not confident in their simulation’s assumptions on a typical year-to-year death growth rate, and they don’t account for deaths that didn’t occur because a sick person died of COVID first. Their overall numbers don’t match the ones that CDC publishes in the provisional 2020 death numbers either, which is problematic. I took a stab at replicating their model with a much simpler and more reasonable regression model than the one they selected, and their 299,000 number (compared to the 2015-2019 average) appears to represent expected growth in deaths, not excess deaths (see chart after conclusions). We’ll have to wait for the actual paper to come out with all the details I guess. Of course, the Washington Post didn’t wait.

Conclusions


It is tough to make any solid projections based on ANY COVID-19 data. It is always possible that the CDC’s data is inaccurate (it usually is… these kinds of things are infamously hard to measure). And clearly 2020 is a unique year for deaths. It isn’t clear from the CDC’s data that COVID-19 has created significant excess deaths, however.

The really serious question is about the real excess deaths that haven’t slowed down in the younger demographics. This problem is not coming from deaths due to COVID but is likely related to the anxiety and stress created by COVID, by government actions that are aimed at reducing or eliminating COVID cases, by isolation, etc. Unfortunately there is a lot of evidence coming out that these governmental actions haven’t been exceptionally effective (a quick look at COVID case rates across various new government actions shows that they haven’t had very measurable impacts). The other takeaway is that excess deaths for ages younger than 15 have been much less than 2017-18 averages. The combination of being isolated from society (driving in cars less, less exposure to disease, etc.) and the lack of an effect on this group from COVID are likely the cause.

Backup: Tod’s “Simpler” 2020 excess death model

Deaths per 100K persons since 2012. Note this is normalized by population, but despite this, deaths have been increasing consistently for the last 10 years or so. The Red Dot is the regression-based projection of 2020 deaths. Note that the delta of about 60 deaths per 100K between 2020 and the 2015-2019 average will amount to around 295K “excess deaths”.
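
For readers who want to reproduce this kind of projection, here is a minimal Python sketch of a straight-line regression on yearly death rates, followed by a comparison of the 2020 projection with the 2015-2019 average. The rates below are made-up placeholders chosen to lie on a perfect line so the arithmetic is easy to follow; they are not the CDC values behind the chart, and the simple linear model is an assumption about what “regression-based” means here.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line: returns (slope, intercept)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Placeholder deaths per 100K for 2012 through 2019.
years = list(range(2012, 2020))
rates = [800 + 10 * (y - 2012) for y in years]

slope, intercept = linear_fit(years, rates)
projected_2020 = slope * 2020 + intercept

# Average of the five most recent years (2015-2019).
recent = [r for y, r in zip(years, rates) if y >= 2015]
recent_avg = sum(recent) / len(recent)

# Gap between the trend projection and the 2015-2019 average.
delta = projected_2020 - recent_avg
print(round(projected_2020, 1), round(recent_avg, 1), round(delta, 1))
```

The point of the comparison is that a steadily rising trend produces a gap against a multi-year average on its own, before any unusual event is considered.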

COVID-19 Case Acceleration across US States – October 7th

As temperatures fall in different parts of the US, we’re starting to see case growth acceleration resume in some of the hardest-hit regions from the spring.

US State COVID-19 Data Table sorted by Case Acceleration (dIROC_confirmed) – 10/7/2020

Below we can see the table sorted by the acceleration of the death rate. These are pretty much the only states that are seeing increases of the rate.

US State COVID-19 Data Table sorted by Death Rate Acceleration (dIROC_confirmed) – 10/7/2020

Since New York seems to be re-emerging here with above-average increases in the Case Rate and Death Rate, here are its time series plots, first Case Rate and then Death Rate. The Instantaneous Rate of Change for cases (IROC-Confirmed) is around 1000 new cases per day. For deaths the IROC is about 20 new deaths per day. Both of these values are growing. You can see the Case Rate increasing (the cumulative case line is curving upward), but the Death Rate increase is still a bit too small to visualize well (although you can see the polynomial fit starting to show the upward curve).

New York state Cumulative Case curve plus 3rd order polynomial fit. 10/7/2020
New York state Cumulative Death curve plus curve fit. 10/7/2020
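
One way to get numbers like these is to fit a 3rd order polynomial to the cumulative curve and differentiate it. The Python sketch below assumes that IROC is the first derivative of the fit (new cases per day) and that the acceleration (dIROC) is the second derivative; those definitions are an interpretation, and the helper functions and the 21 days of placeholder data are invented for this example rather than taken from the New York series.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns [c0, c1, ...] for c0 + c1*x + ..."""
    m = degree + 1
    # Normal equations A c = b.
    A = [[float(sum(x ** (i + j) for x in xs)) for j in range(m)] for i in range(m)]
    b = [float(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for k in range(col, m):
                A[r][k] -= f * A[col][k]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        s = sum(A[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - s) / A[i][i]
    return coeffs


def derivative(coeffs):
    """Coefficients of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))


# Placeholder cumulative cases, counted from the start of a 21-day window.
days = list(range(21))
cases = [900 * d + 2.5 * d * d for d in days]

coeffs = polyfit(days, cases, 3)
iroc = evaluate(derivative(coeffs), days[-1])               # cases per day
diroc = evaluate(derivative(derivative(coeffs)), days[-1])  # change in cases per day, per day
print(round(iroc, 1), round(diroc, 2))
```

A positive second derivative is what shows up visually as the cumulative line curving upward. For long time series it is safer to rescale the day values before fitting, since the normal equations become poorly conditioned as the x values grow.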