Arizona Case Growth Visual – 1/13/21

Here’s a map of the top 200 AZ zip codes by COVID case growth between 12/21 and 1/13.

Note that the dark blue dots are around 20% growth over these 3 weeks and the orange and red dots are around 80% and 90% growth. That’s a pretty big range.

The zip codes with higher percent growth look to be on the fringes of the denser, inner-city zip codes. See how the dark blue seems to be ringed by lighter greens and yellows in the Phoenix area. This may be an artifact of measuring growth as a percentage of the case count from 3 weeks ago rather than as a raw case increase: a zip code that started with few cases needs far fewer new cases to post a large percentage. Might be interesting to think about.

Lots of zip codes in small towns in rural counties with large percent increases in cases.

Map of top 200 AZ zip codes by case growth from 12/21 to 1/13

2020 Final Excess Deaths Evaluation – 1/5/21


This is a follow-on to two previous analyses (here and here) of excess deaths during 2020. My approach is different than most (or all) I have seen because I am looking at the impact of COVID through evaluating excess deaths in all 10-year age demographics across all US states and DC. What this does is set all demographics equal regardless of their population. I think this is a very reasonable approach because I’m primarily focused on learning what the impact was to each demographic through measuring the percentage of excess deaths (over an average from the two previous years with good data, 2017-2018) while also maintaining awareness of that demographic’s COVID impact. Note that this approach provides some interesting insight.


As I don’t want people to think that this work is conspiracy-based or whatnot, I want to make my assumptions clear here. First of all, I assume the Provisional COVID and total death data for 2020 from the CDC is correct. Or at least that it is not wildly incorrect. The CDC has made some mistakes and their analysis of their own data is often suspect, but it does seem like they’re being very careful not to mess up COVID data. Second, I want to reiterate that my approach is different from most you may have seen. This excess deaths analytic gives equal weight to every state and every 10-year demographic. Therefore, 1-4 year olds in Rhode Island have as much impact in the histogram below as 85+ year olds in New York, who had very high numbers of deaths due to COVID. I think this is interesting, because COVID has impacted all of society differently. For some it has been devastating in loss of life but for others it may have been devastating for other reasons. Most analyses of excess deaths by the CDC or popular media focus only on the raw deaths, which overwhelmingly have come from the oldest demographics. Third, I am only using a two-year set (2017 and 2018) as a baseline for the excess deaths. I could have used a 5-year average, but I chose to use the two most recent years with good death numbers to minimize the impact of growing US populations (because larger populations experience more deaths). It turns out that one of these years was a bad flu year and the other was an easy flu year, so that is serendipitous too. Data from the CDC on provisional deaths for 2020 can be found at this URL and the 2017-18 data can be easily captured on the CDC’s “Wonder” system.
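
To make the metric concrete, here is a minimal Python sketch of how a per-demographic percentage could be computed. It assumes the percentage is 2020 deaths divided by the 2017-18 average, so that 100% means no change from the baseline. The function name and the sample rows are hypothetical, not CDC figures.

```python
def excess_death_percent(deaths_2020, deaths_2017, deaths_2018):
    """2020 deaths as a percentage of the 2017-18 average (100 = no change)."""
    baseline = (deaths_2017 + deaths_2018) / 2
    return 100.0 * deaths_2020 / baseline


# Hypothetical rows: (state, age group, 2017 deaths, 2018 deaths, 2020 deaths).
rows = [
    ("State A", "1-4", 22, 18, 15),
    ("State A", "85+", 9000, 11000, 12500),
    ("State B", "25-34", 380, 420, 520),
]

# Every state/demographic pair is one entry, regardless of population size.
ranked = sorted(
    ((state, age, excess_death_percent(d20, d17, d18))
     for state, age, d17, d18, d20 in rows),
    key=lambda item: item[2],
    reverse=True,
)

for state, age, percent in ranked:
    print(f"{state} {age}: {percent:.1f}% of the 2017-18 average")
```

Because each pair contributes exactly one value, a small group with a big percentage swing ranks above a large group with many more raw deaths, which is the equal-weighting effect described above.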

Histogram of Excess Deaths by Demographic and State

Histogram of excess deaths for all demographic/state combinations

Takeaway from the Histogram

This doesn’t tell us that we didn’t have excess deaths overall in 2020, so don’t be tricked. What this does tell us, however, is that the majority of state/demographic pairs didn’t have any excess deaths during 2020 over the 2017-18 average. You can also see that the number of demographics that experienced more than 100% excess deaths (that is, more deaths than the 2017-18 average) drops off very rapidly (probably at an exponential – e^-n – rate) while the under 100% excess death demographics ramp up to the peak at more of a polynomial (maybe n^2 or n^3) rate. This tells us that the demographics across all states were more likely to experience less than 100% excess deaths during 2020. We’ll look at the data tables below to try to figure out both sides of the peak on our histogram.

Data Table – Highest Excess Deaths

Comparison of CDC 2020 Provisional Deaths with Average from 2017 and 2018 – sorted descending by Excess Deaths

Takeaway on Demographics with High Excess Deaths in 2020

The first thing that stands out in the above table is the large number of demographics in DC that are at the top of the excess death list. I have no idea what happened in DC this year to contribute to all these excess deaths, but only a couple of the demographics have large COVID impacts. (Edit: It struck me that the reason is most likely that including DC in this analysis is like comparing deaths in Chicago with those across the whole state of Illinois. DC is more like a large city and therefore has unique death statistics.)

Secondly, I also notice that there is a mix of “high-COVID” demographics and “low-COVID” demographics at the top of the table. I think most people would have expected the demographics with the most excess deaths to be over 65, but that isn’t the case. Of course, this is on a percentage basis. The demographics with the highest raw number of deaths are mostly over 65, but these are all demographics that experience higher numbers of deaths every year anyway. This is why I look at the percentages. Below are the demographics/states with the highest number of raw excess deaths (column on the right). Nothing much is surprising here. Elderly demographics in large states would be expected to have the most raw excess deaths because they have the highest number of deaths every year. This is a good way to parse the data if one is in search of nerve-wracking numbers, but it doesn’t give us any information we couldn’t infer on our own. It is clear, however, that in these groupings COVID deaths were significant, running from 15 to 31%.

Data sorted by “raw” excess deaths in 2020

In the next table, I’ll rank the demographics by excess deaths after COVID is subtracted out and we’ll see some interesting results.
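
As a rough illustration of what “subtracting COVID out” could look like, here is a small Python sketch. It assumes the COVID/flu/pneumonia deaths are removed from the 2020 total before dividing by the 2017-18 average. The function names and all numbers are hypothetical.

```python
def percent_of_baseline(deaths, baseline):
    """Deaths as a percentage of the 2017-18 average (100 = no change)."""
    return 100.0 * deaths / baseline


def excess_percent_without_covid(deaths_2020, covid_flu_pneumonia_2020, baseline):
    """Percentage of baseline once COVID/flu/pneumonia deaths are removed."""
    non_covid_deaths = deaths_2020 - covid_flu_pneumonia_2020
    return percent_of_baseline(non_covid_deaths, baseline)


# Hypothetical younger group: few COVID deaths, still well above baseline.
younger = excess_percent_without_covid(520, 20, 400)

# Hypothetical older group: the excess disappears once COVID is removed.
older = excess_percent_without_covid(12500, 2500, 10000)

print(f"younger: {younger:.1f}%  older: {older:.1f}%")
```

A group that stays above 100% after the subtraction had excess deaths from something other than COVID, flu, or pneumonia.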

Data Table – Sorted by Highest Excess Deaths if COVID/Flu/Pneumonia numbers removed

Data Table sorted by excess deaths after COVID/Flu/Pneumonia removed

Takeaway on Groups with Excess Deaths beyond COVID

The thing that will probably stand out in the above table is that only two over-55 demographics appear near the top of the excess deaths list once COVID and related illnesses are removed. Both are in Washington DC (again, I need to figure out why DC has so many excess deaths across the board. Are they counting differently?). All of the groups above still have excess deaths beyond the 2017-18 average even after the COVID numbers are subtracted. What might this be counting? Arizona, Tennessee, and Colorado demographics under age 44 are all over the top of the list. We know that these states tend to have above-average suicides each year. I have also seen reports that deaths due to drug overdoses were exceptionally high in 2020 in younger demographics. This is kind of hard to think about, but it would seem that excess deaths this large in 2020 are correlated with COVID and state reactions to COVID.

Data Table – Ratio of Highest COVID+Flu+Pneumonia 2020 Deaths to 2017-18 Average Deaths

2020 to 2017-18 Average Death comparison – Sorted by Ratio of COVID+Flu+Pneumonia to 2017-18 Average

Takeaway on Demographics with High COVID/Flu/Pneumonia Impact in 2020

Note that the above table is as expected. The highest percent impacts of COVID/Flu/Pneumonia are in the over-65 communities. The numbers are pretty staggering though. In the typical year, 30% of a community’s deaths are due to heart disease and about the same percentage to cancer. The other 40% of deaths are a mix of everything from respiratory diseases to accidents to suicide and homicide. So for a demographic to be seeing 20 to 30% of its deaths in one year from COVID is catastrophic. I also notice that the Dakotas are very high on the list for their over-65 demographics and wonder if they saw much higher excess deaths due to a lack of government COVID controls (they both seem to have been known for a more Sweden-like approach). I also see a couple of under-65 demographics on this list, both from New Jersey. That would be an interesting thing to analyze.

Data Table – Sorted by Lowest Excess Deaths over 2017-18 average

Data Table sorted ascending by Excess Deaths

Takeaway on Demographics with Low 2020 Excess Deaths

The above is a very interesting way of looking at the data. What does it tell us? 2020 was a very safe year to be under 14. Why was this? I’m not sure, but I’d guess that many causes of death for these groups were avoided this year due to locking down at home. Car accident deaths, other accidents, possibly flu and other viral diseases, etc., might have been in very short supply for these younger groups. Interestingly, New York State has two older demographics in this list. If one looks deeper at this, one finds that New York has really low numbers of excess deaths in general. The chart below shows the raw numbers of excess deaths for the over 85 age demographic across all 50 states. You’ll notice that NY actually experienced FEWER deaths in this demographic than expected. Perhaps this has to do with the lower mobility of this group during COVID (a good number are likely in nursing facilities where they can’t come and go or receive visitors).

Excess deaths by raw count in over 85 demographic

One big takeaway, however, is that if one were to evaluate the “silver lining” of 2020 by measuring Years of Life Lost, the low incidence of deaths in the younger groups would certainly carry a lot of weight.


  1. COVID does not seem to be the overwhelming contributor to excess deaths across all demographics in 2020. This does not seem intuitive, but the data (assuming it is accurate) does make a strong case that the greatest impact of excess deaths in 2020 fell on demographics that had a lower incidence of COVID-19. This is calculated on a percentage of excess death basis, not raw numbers. Perhaps this is the right way to look at the excess deaths though, as compared to raw counts. Looking at the percentage of change does capture the surprise and impact to the affected group. Plus, if one simply subtracts the COVID percentage from the Excess Death percentage, many of the younger demographics high on the list would still have well over 110% excess deaths. The older demographics do not see this same effect. What does this mean? Some demographics (younger adults and older teens) experienced significant excess deaths due to something other than COVID in 2020.
  2. COVID did have a terrible impact on a large number of the older demographics across many states. Some of these demographics saw numbers of COVID deaths (and pneumonia and flu, because they’re hard to separate out) in 2020 that ran between 20 and 30 percent of the 2017-18 average. This is in the range of heart disease and cancer, each of which contributes 20-30% of all deaths in a normal year. These groups made up the overwhelming majority of deaths in all regions during 2020.
  3. For Americans under 14, however, 2020 was a very safe year. Much safer than the 2017-18 average. It appears that though the COVID responses may have had an adverse impact on some demographics, they had a very good impact on the folks under 14.

Cumulative Case Charts from Around the Country – 1/4/21

I thought folks might find it interesting to see the cumulative case rate charts from a selection of states and counties. There are interesting things to note in all of these. Remember that these are all just measuring cases and that cases are a strange measure due to the wildly different response different people have to the COVID-19 virus. Still, case growth is a leading indicator for hospital overload and deaths…

Note: in all of these charts, I am choosing to show “non-normalized” data. This means that when you see an “IROC-Confirmed-Case” number, it refers to the instantaneous slope of the curve. This slope represents the number of new cases per day for the curve, and is often a better indicator of case growth than moving averages or other such approaches. Therefore, when comparing charts, keep in mind that the visual slope of the line is a better comparison than the numbers (because larger places like LA county will most likely have more cases due to their population). Also, I start each chart at the point the region hit 5000 cases.

Also, the light blue line might be confusing to people. I am modeling the entire outbreak in a region with a fourth-order (quartic) polynomial equation and this equation is plotted in blue. You can see how the red “actual” datapoints often align strongly with the quartic equation for the region. I’m not sure if the fact that I can fit the whole outbreak for a region with a quartic is interesting or not, but I do know that the quartic emerges often in fields like optics and the propagation of waves through a real-life transmission line (like a copper wire). I wonder if a virus propagating through a real society is a similar application?
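
For readers who want to try something similar, here is a minimal pure-Python sketch of a quartic least-squares fit and the slope of the fitted curve. It is not the code behind these charts: the function names are hypothetical, the data is synthetic, and the days are rescaled to the 0-1 range purely for numerical stability (a library routine such as NumPy’s polyfit would do the same job).

```python
def fit_polynomial(xs, ys, degree=4):
    """Least-squares fit; returns [c0, c1, ...] for c0 + c1*x + c2*x^2 + ..."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y, where A[i][j] = xs[i] ** j.
    ata = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    aty = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(col + 1, n):
            factor = ata[r][col] / ata[col][col]
            for c in range(col, n):
                ata[r][c] -= factor * ata[col][c]
            aty[r] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(ata[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (aty[r] - known) / ata[r][r]
    return coeffs


def evaluate(coeffs, x):
    """Value of the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def slope(coeffs, x):
    """Derivative of the fitted polynomial at x (the instantaneous slope)."""
    return sum(k * c * x ** (k - 1) for k, c in enumerate(coeffs) if k > 0)


# Synthetic cumulative case counts (not real data), starting at 5000 cases.
days = list(range(31))
cases = [5000 + 100 * d + 0.001 * d ** 4 for d in days]

span = days[-1]                      # rescale days to 0..1 before fitting
scaled_days = [d / span for d in days]
coeffs = fit_polynomial(scaled_days, cases)

last_day = days[-1]
modeled_total = evaluate(coeffs, last_day / span)
# Dividing by span converts the slope back to cases per day.
new_cases_per_day = slope(coeffs, last_day / span) / span
print(round(modeled_total), round(new_cases_per_day))
```

The slope of the fitted curve on the last day plays the role of the instantaneous rate of change described above, under the assumption that it is read off the model rather than off the raw daily counts.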

Pima County, Arizona, 1/4/21

I’ll lead off with the chart of my home county. Pima is the location of Tucson and has over 1M in population. Mask orders have been in place since early June, but my observation was that a lot of residents were already wearing masks well before that. The whole county has a curfew currently in place (I assume it is to reduce the numbers of people at bars?). You might note that the curve was solidly accelerating from about Halloween until Thanksgiving and then it started into a linear phase made up of a bunch of positive and negative oscillations. That is to say, the case rates have slowed for a few days, then sped up, and so on. Note that the last data point is a big jump over previous days. My guess is that this is an anomaly due to state DHS people taking holidays and accumulating numbers differently than in the past. Today’s data point (not shown here yet) is much lower, so I’m curious about whether the deceleration trend continues or not.

Maricopa County, Arizona, 1/4/21

Maricopa county is the largest in the state with over 4M residents. This is the location of Phoenix, Mesa, Tempe, etc. Note that the latter part of the chart looks similar to that of Pima County. You can see more of the Maricopa curve since they hit 5000 cases earlier than Pima. Right now Maricopa communities have individual mask ordinances in place (some cities cancelled theirs, then brought them back as cases surged). I’m unaware of any curfews in Maricopa, but that may be from lack of looking. Maricopa County was also the location of a number of large kids’ soccer tournaments back in late November and early December that were notable due to media attention (lots of teams from California were participating since they can’t practice or play in California). I don’t see any evidence of case surges due to these tournaments… rather, it appears that cases decelerated all the way from Thanksgiving until about Christmas Day.

Los Angeles County, CA, 1/4/21

Currently, California has the highest case acceleration of any state and it does seem like LA county is a big driver. Note that the curve formed by the red datapoints is steeper than the blue line would model. This is very surprising to me in light of California’s significant COVID-19 restrictions. One might speculate that high density plus low evening temperatures (in my previous entry or two I point out that most of these surges started when night temperatures fell to 50 degrees) could be leading to the really steep slope in California. However, density might not explain it, as I’ve noted in Arizona that the most dense zip codes tend to have lower case growth than the less-dense zip codes. Regardless, the situation in CA is a puzzle.

Orange County, CA, 1/4/21

Orange County’s case curve looks similar to LA’s except it seems to fit the quartic model better regarding acceleration.

Harris County, TX, 1/4/21

Now we’ll switch to another large county, Harris County in Texas. This is the location of Houston. Note that its case slope is much flatter than any of the other regions. It appears to be accelerating slightly. One thing I’d note about Harris County that is absent from the previous charts is humidity. Whereas Arizona and California are very low in humidity this time of the year, Houston is somewhere around 70-80% humidity. It has been observed that the virus transmits most effectively in lower temperatures and low humidity.

New York State, 1/4/21

Above is the chart for the state of New York. It looks similar to California, just with a bit lower slope. The state also seems to have the same oscillation pattern over the last two weeks that AZ and CA regions have.

Pinal County, AZ, 1/4/21

Now to change the pace again, here’s the third largest county in the state of AZ, Pinal County. This is a mix of rural and suburban communities, probably leaning more towards the rural. To my knowledge they don’t have any county mandates in place for COVID-19 and their characterization through 2020 regarding COVID leaned more to the individualistic rather than the collectivist. Pinal saw some notable tapering off of the visual slope of cases around 12/15, but some of this is probably anomalous due to the large jump that happened around 12/13. This has the appearance of being a data collection glitch. Note though, that this county’s trend is to fall below the blue quartic model line.

Finally, here’s a picture of a much smaller county in AZ, Cochise County. This is a primarily rural county with a medium-sized retiree population. Cochise had very few cases for the longest time but they’re getting close to doubling their case number since mid December. This is another area that has hit lows of 50 degrees and has low humidity. It is a bit higher in altitude than Pima County.

Normalized Data for Selected States

Cases per 1000 persons, selected states, 1/4/21

Here’s a chart where I have normalized the data by population to give better comparisons. I’ve selected a handful of states, including a number from the section above. What do we see across the board? General slowing of cases across the southeast, a bit of acceleration remaining in California and New York, and a bit of “uncertainty” in places like Georgia and Arizona.
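
The normalization itself is simple arithmetic. Here is a short Python sketch, with hypothetical state names, case counts, and populations:

```python
def cases_per_1000(cumulative_cases, population):
    """Convert cumulative case counts to cases per 1000 residents."""
    return [1000.0 * count / population for count in cumulative_cases]


# Hypothetical data: (cumulative cases over three dates, population).
series = {
    "Large state": ([100_000, 150_000, 210_000], 10_000_000),
    "Small state": ([5_000, 9_000, 14_000], 500_000),
}

normalized = {
    name: cases_per_1000(cases, population)
    for name, (cases, population) in series.items()
}

for name, values in normalized.items():
    print(name, values)
```

In this made-up example the small state has far fewer raw cases but ends up with the higher rate per 1000 residents, which is the kind of comparison the raw charts can’t show.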

Arizona Data Update. Is the Current Surge Slowing?

The diagram below shows COVID rates per 1000 persons of the affected age group. So for example, currently the 55-64 year grouping has seen close to 100 out of every 1000 members of the Arizona 55-64 population come down with COVID-19. This helps us understand how each age group is affected by COVID. We know that there are far more 20-44 year olds than any of the other groups, so obviously they have more COVID cases. But when you divide by the total number of 20-44 year-olds in AZ do they see a greater rate of infection? As you can see below, no, they’re near the top, but they’re actually below the 55-64 group and about equal with the 45-54 group. This is pretty interesting and gives us some food for thought:

  1. The 20-44 and 45-54 groups make up the majority of the active workforce and are more likely to be taking public transportation to work.
  2. These two groups are also probably more likely to be working out in gyms and going to bars.
  3. From my observation (no data to prove this) all the groups other than the less than 20 appear equally likely to be going to restaurants.
  4. Why has the 55-64 group accelerated ahead of the other groups though? You can see they were right there with the 20-44 and 45-54 groups right around mid-November. One guess that I might hazard is that this is a reflection of growing numbers of 55-64 aged persons in the state from winter tourism. This would cause the number of persons in this group in the state to increase from the standard number of permanent residents that I use for the rest of the year. Note that the slope of the 65+ group also seems similar since Mid-November to the 55-64 group, but the slope of the 20-54 grouping (most likely there are few winter visitors in this group) is lower. So I suspect we’re seeing an increased COVID rate in AZ since mid-November due to an influx of winter visitors.
  5. Note that the 20-44 and 45-54 groups have the appearance of a decreasing slope and the under 20 group definitely has the appearance of a decreasing slope. This makes me think that the disease is slowing in its transmissibility and we will continue to see case rates flatten out in the next week or two. Other reasons I think this are that hospitalization rates have been slowing for a week and a half and death rates in the under 65 groups have also slowed significantly.

The below is the hospital bed comparison from the AZ DHS COVID dashboard. The red bars represent the % of ICU beds (in this case) that are in use by COVID patients. The dark grey is all other patients and the light grey is available beds. Note how the red bars are starting to trend over. This is a sign that the hospital COVID recoveries (and deaths, I suspect) are starting to exceed new admissions. This could be a false alarm and the hospitalizations will spike, but it doesn’t seem to be the way this disease works. Note that there are no real secondary spikes from the previous summer spike.

The second chart below is the one I have been maintaining that is more of a curiosity to me than anything. However, it has been interesting to note that at the same time that the AZ DHS bed usage chart started to slow, the percent of 65+ year olds who were hospitalized one week after being diagnosed with COVID started to trend down too. If you scroll a bit on this site you’ll see that for a long time, the 65+ year old trend line (maroon) sloped upward while all other groups sloped downwards. Now they all slope downward, indicating that fewer people are being hospitalized after being diagnosed with COVID.

Below is another sign that the potency of the disease in society (note that the words “in society” are important here) is slowing. The below chart represents a comparison of deaths in people over 65 due to COVID with those in everyone else in the state. The green line is the 5 day moving average of the ratio of over65 to under65. Note that during the summer outbreak the ratio was pretty constant at around 2.5 deaths over 65 to 1 death under 65. During the virus’ “off-season” between August and November the ratio was all over the place because there were very few deaths. However, once the winter surge started, the ratio has been steadily increasing and hasn’t really gone down to the ratio of the summer months. What might this mean?

  1. Perhaps the most susceptible people under 65 already died during the summer outbreak (or maybe they’re lying really low right now)? It does appear that people under 65 are far, far less susceptible to COVID absent comorbidities.
  2. The overall death numbers in the over65 population do appear to the eye to be around the same as during the summer (but from my zip code analysis, they seem to be distributed more widely across the state). I wonder if this implies that there is a fixed number of people whose immune systems are “rigged” to fail under attack by COVID? Rigged, of course, through the mysterious operation of some unknown genetic markers or existing conditions of the immune system?
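
For anyone curious how a ratio line like the green one could be built, here is a small Python sketch. It assumes the line is a 5-day moving average of the daily over-65 to under-65 death ratio, and it skips days with no under-65 deaths since the ratio is undefined there. The function name and the counts are hypothetical.

```python
def over_under_ratio_average(over65_deaths, under65_deaths, window=5):
    """Moving average of the daily over-65 / under-65 death ratio."""
    ratios = [
        over / under if under > 0 else None  # undefined when under-65 is zero
        for over, under in zip(over65_deaths, under65_deaths)
    ]
    averaged = []
    for i in range(len(ratios)):
        if i + 1 < window:
            averaged.append(None)  # not enough history for a full window yet
            continue
        recent = [r for r in ratios[i + 1 - window:i + 1] if r is not None]
        averaged.append(sum(recent) / len(recent) if recent else None)
    return averaged


# Hypothetical daily deaths: a steady 2.5 ratio, then one heavier day.
over65 = [5, 5, 5, 5, 5, 10]
under65 = [2, 2, 2, 2, 2, 2]
print(over_under_ratio_average(over65, under65))
```

When daily deaths are very low, as in the “off-season”, a single death moves the ratio a lot, which matches the noisy stretch in the chart.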

Finally, here’s a view of the case rates in both Pima (Red) and Maricopa (Blue) counties compared to the overall testing per day in the state (yellow). Note that testing peaked around Thanksgiving (probably people hoping to get a negative test prior to a Thanksgiving gathering). Testing seems to have fallen from Thanksgiving until New Year’s. Note that as the first wave flattened off around the start of August, testing decreased steadily. Since testing is an indicator of people who think they’re sick or think they may have been exposed, it may well be that this is a sign of the surge slowing.

Arizona Data Update – 12/24/20

Here’s a quick Christmas Eve update of the Arizona Data by Zip code. This is a pretty interesting dataset because it allows me to look into much smaller areas than counties and see more granular trends. I can sort this data a number of ways and generally compare one or two weeks of data by zip code to see where the trends are. The below table is sorted by the percent growth in the zip code, but I also like to look at the data sorted by Normalized Growth (growth per 1000 persons in the zip code area).

As you can see, Pima county is topping the list for percent growth. This represents a surge in cases that is independent of the population in the region. It really surprises me that zip codes in Pima county are at the top of this list because Pima probably has the most restrictive anti-COVID rules in the state (including an evening curfew). The 85710 zip code is interesting because it is an older suburban area East of Tucson and the Median Age is over 40. This is kind of unique for a zip code in the top 10 by percent growth. Note that their normalized growth as a percentage of the population is very low, so what this tells me is that this zip code has had very little impact from COVID until the last week and then they saw a 23% surge. Note that the zip code in Douglas, AZ, (on the border with Agua Prieta, MX) has very different statistics. They saw almost 19% growth last week but have had a much bigger impact from COVID (their normalized growth is 4x larger than 85710). That means that their population was 4x more saturated by COVID last week than 85710. The next four zip codes by % growth are Pima county zip codes. The one with the highest normalized growth is 85706, which has had higher overall case numbers and also experienced higher normalized growth. Incidentally, it is also the zip code with the lowest median age and median years of education. Not sure what that means, but if I were to guess, I’d say this zip code has more people that can’t work at home.
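
Here is a rough Python sketch of the two measures used in the table, percent growth and normalized growth per 1000 persons, and of how the two sort orders can disagree. The zip labels, case counts, and populations are hypothetical.

```python
def growth_metrics(cases_start, cases_end, population):
    """Return (percent growth, new cases per 1000 residents) for one zip code."""
    if cases_start == 0:
        raise ValueError("percent growth is undefined with zero starting cases")
    new_cases = cases_end - cases_start
    percent_growth = 100.0 * new_cases / cases_start
    normalized_growth = 1000.0 * new_cases / population
    return percent_growth, normalized_growth


# Hypothetical zip codes: (cases a week ago, cases now, population).
zip_data = {
    "zip A": (100, 123, 50_000),
    "zip B": (800, 950, 20_000),
    "zip C": (400, 440, 30_000),
}

metrics = {name: growth_metrics(*values) for name, values in zip_data.items()}

by_percent = sorted(metrics, key=lambda name: metrics[name][0], reverse=True)
by_normalized = sorted(metrics, key=lambda name: metrics[name][1], reverse=True)

print("by percent growth:   ", by_percent)
print("by normalized growth:", by_normalized)
```

In this made-up data “zip A” tops the percent list because it started from a small case count, while “zip B” tops the normalized list because a larger share of its population was newly infected.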

Comparison of Normalized Case Growth with Population Density

Earlier I posted this interesting visualization showing how the zip codes with the lowest population density seemed to be experiencing higher than normal cases per 1000 persons. Below, I re-ran the data for last week and we see that at least for last week this interesting trend continues. There may be a multitude of reasons for this, but one interesting possibility is that the areas with higher population densities have more effective COVID restrictions in place and the areas with lower population densities may have fewer or no restrictions in place. This could be an over-simplification, but it does seem to be the case that the counties with the most restrictive COVID measures are the largest by population.

Comparison of normalized Case growth from 12-13 to 12-21 with Population Density (AZ Zip Codes)

Is the Arizona Surge Slowing?

Quickly, below is the cumulative case curve for Arizona as of 12/24. The blue dashed line is my polynomial curve fit, which as you can see has fit the red daily cumulative case counts very nicely until about a week ago when we saw the case acceleration rates start to slow. As I’ve noted in the past, the slowing acceleration rates are generally an indicator of the slowing of a COVID surge. So keep your fingers crossed. I also have another post that I recently put up that is looking at decreasing hospitalization as another leading indicator of the slowing of a surge.

Arizona: Is the Hospitalization Surge Easing? – 12/22/20

Here are a couple of visualizations that give me hope that the hospitalization surge is easing… The below is from the AZ DHS Dashboard. Red bars represent the percent of ICU beds being used by a COVID Patient, Dark Grey are ICU beds in use by non-COVID patients, and Light Grey is available beds. Note how the last few days have seen a flattening of the COVID percentage. This is the behavior we noted as the first Arizona ICU bed surge happened in July.

AZ ICU Beds Available and In use (from AZ DHS Dashboard) 12/22/20

See below for my own metric, percent hospitalized today compared to cases from one week ago. If you scroll down to a previous post, you’ll see that the over65 group was still trending up on this metric. About a week ago (when I first noticed the AZ DHS metric flattening) the over65 trend on my metric flattened and then started decreasing.

Percent of current day’s hospitalization to new cases from one week previous. 12/22/20
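
A minimal Python sketch of a metric like this one follows. It assumes the metric is each day’s hospitalization count divided by the new-case count from seven days earlier, expressed as a percentage. The function name and the sample series are hypothetical.

```python
def hospitalized_vs_lagged_cases(hospitalizations, new_cases, lag_days=7):
    """Percent of each day's hospitalizations relative to new cases lag_days earlier.

    Returns one value per day starting at index lag_days; None where the
    lagged case count is zero and the percentage is undefined.
    """
    percents = []
    for day in range(lag_days, len(hospitalizations)):
        lagged_cases = new_cases[day - lag_days]
        if lagged_cases:
            percents.append(100.0 * hospitalizations[day] / lagged_cases)
        else:
            percents.append(None)
    return percents


# Hypothetical series: cases on days 0-1, hospitalizations a week later.
new_cases = [100, 200, 0, 0, 0, 0, 0, 0, 0]
hospitalizations = [0, 0, 0, 0, 0, 0, 0, 10, 20]
print(hospitalized_vs_lagged_cases(hospitalizations, new_cases))
```

A falling value means fewer hospitalizations per case diagnosed a week earlier, which is the downward trend described for the over65 group.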

None of this is certain, of course, but maybe these are leading indicators that the virus is starting to run its course. Looking at the AZ Cumulative Case curve below, you can see that the current outbreak (which started in late October) is getting close to two months in duration. Since I’ve noticed in other states that non-linear case surges last about 2 months in states that enforce COVID protocols, perhaps we’re nearing the end of our winter surge?

No One Can Prove that Thanksgiving didn’t Accelerate COVID, but the Data Indicates that it is Highly Unlikely that it Did.

Since the Thanksgiving Holiday season I’ve seen a number of major media outlets, including WebMD, NPR, and others, leading with stories about how Thanksgiving led to an increase in cases. So is this real, or is it an example of Confirmation Bias? It’s hard to know and even harder when one looks at just short-term trends.

Part of these publications’ justification for asserting that Thanksgiving led to an increase in cases is that contact tracing has discovered a number of cases that can be traced back to Thanksgiving gatherings. The NPR article reported that:

“We are seeing a tremendous surge in cases in many locations around the United States that are associated with the Thanksgiving dinners, family get-togethers and social events,” says Michael Osterholm, an epidemiologist and director of the Center for Infectious Disease Research and Policy at the University of Minnesota. Much of the evidence comes from health departments that are tracing clusters of cases, but Osterholm suspects that hospitalizations and deaths — “lagging indicators” — will reveal the full impact in a few more weeks.

So, can we determine that the Thanksgiving Holiday gatherings were causal for increased case counts?

Let’s start by looking at the data. Below, I picked a few different states to compare their case rates in one chart. Since I’m normalizing the raw case counts by the population of the state (actually, per 1000 persons), I’m able to compare cases in a relatively “apples to apples” way. Therefore, we see a number of things in the chart below…

Select states cumulative case growth per 1000 residents since mid April 2020.
  1. The Dakotas ran up to the highest numbers of cases per 1000 residents in the country. Their surge started around mid- to late-August where it appears that they transitioned from linear case growth to non-linear (3rd-degree polynomial) levels of case growth. During this latter stage, the growth rate increased every day (it accelerated, actually) until somewhere around mid-November when the cases began decelerating. You can see when this happens by looking at where the upward curve switches to a downward curve. Nebraska seems to have started decelerating around the same time. The Dakotas’ period of case acceleration appears to run from mid-August to mid-November (3 months) whereas in Illinois it ran from late-September to late-November (2 months). We see a similar outbreak range of 01 October to the end of November for New Mexico, which has had some of the strictest COVID policies in the nation. I’m curious if this is a sign of an effect of the stronger government COVID policies in Illinois and New Mexico, but this would take much more analysis to prove.
  2. While the Dakotas were surging, California (the light blue line) was maintaining linear case growth. However, sometime around mid-November, California’s linear growth began accelerating and you can see that their rate of acceleration (the highest in the country right now) is starting to approach that of the Dakotas from mid-November.
  3. We can also see a handful of other places where states transitioned from linear case growth to non-linear case growth. I’ve tried to eyeball these and place a blue diamond where I think the transition occurred. After the transition, as a reminder, the case growth rate increases every day. I took a quick peek at what the high and low temperatures were in the largest city in each “blue diamond” state during the timeframe when the transition from linear to non-linear growth occurred. In most, if not all, of these cases, the non-linear transition occurred during a notable weather shift where the night-time temperatures went from somewhere in the 60’s or above to 50 or below (degrees F). In some cases, the low temperatures dropped more than 5 degrees in a day or two.
  4. I’ve marked the Thanksgiving holidays with a blue rectangle. At least among the states represented here, none had a linear to non-linear transition after Thanksgiving.
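
The two ideas in the list above, normalizing by population and spotting where linear growth turns non-linear, can be sketched in a few lines of Python. This is only an illustration: the blue diamonds in the chart were placed by eye, not by code, and the function names, the `run` parameter, and the numbers below are all made up for the example.

```python
def per_thousand(cumulative_cases, population):
    """Convert raw cumulative case counts to cases per 1000 residents."""
    return [1000.0 * c / population for c in cumulative_cases]


def differences(series):
    """Day-over-day change of a series (one element shorter than the input)."""
    return [b - a for a, b in zip(series, series[1:])]


def first_acceleration_day(cumulative_per_1000, threshold=0.0, run=3):
    """Index of the first day that begins `run` consecutive days of
    acceleration (daily growth itself growing by more than `threshold`).
    Returns None if the curve never accelerates for that long."""
    accel = differences(differences(cumulative_per_1000))
    for i in range(len(accel) - run + 1):
        if all(a > threshold for a in accel[i:i + run]):
            # accel[i] is the second difference ending at series index i + 2
            return i + 2
    return None


# Made-up data: 10 new cases a day (linear), then growth that keeps increasing.
example_raw = [0, 10, 20, 30, 40, 52, 66, 82, 100]
example_per_1000 = per_thousand(example_raw, population=1000)
transition_index = first_acceleration_day(example_per_1000)
```

Real case data is noisy, so in practice the series would probably need smoothing (a 7-day average, say) before the second difference means much.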

Since the NPR article mentioned a surge in the Southeast, I re-ran the code that generates the above chart using different states, mixing southeastern states with other warm states as well as NY and Mass. See below. Tennessee has a very high rate of acceleration right now (almost as high as California), so you can see that it is curving strongly upward. It seems like its inflection point between linear and non-linear happened sometime in early November. Looking at the other SE states, I see inflections in similar timeframes. I don’t really see any states here that were linear until Thanksgiving and then went non-linear (the signal of a major outbreak). Since I live in Arizona, I paid special attention to the Arizona curve. You can eyeball on the green line below that all was fairly linear until mid- to late-October. Guess what, accuweather (see image below the curves) tells us that Phoenix had its daily low temperatures crash from 69 degrees to 54 degrees on October 26th.

Select states cumulative case growth per 1000 residents since mid April 2020.
Phoenix October 2020 daily highs and lows for Mid- to Late- October

The Rate of Deaths per 1000 – the Lagging Metric

Below is a different metric that might give us an insight. These are the top 8 states by Cumulative Deaths per 1000 persons. The initial states that were hit hard by COVID back in May experienced much more than their shares of deaths for reasons that are probably fairly obvious… the virus was new and these states were first up to bat. They made mistakes as well as breakthroughs in how a community would respond to this virus and that resulted in higher death rates. But note that after June their death rates flattened off or at least became linear. The Dakotas are a very interesting comparison, however. They experienced very few deaths during the first six or so months of the COVID pandemic but then saw pretty high death rates (which are still increasing at a fairly high rate) ever since. In just the last month or so, though, the northeast states have seen a transition from flat or linear death rates to non-linear. But the slope of the current increase is pretty low. So what might all this tell us?

  1. I suspect many of the people who died in the Northeast during May and June contracted the disease before anti-COVID policies (Masks, Lockdowns, Improved Retail cleanliness policy, etc.) went into place.
  2. I also imagine that ND and SD didn’t have a whole lot of COVID floating around early on. The weather was nice and people likely were outdoors, where evidence is showing that transmission is less likely.
  3. I hear anecdotally that ND and SD had no official policy about Government COVID intervention. I haven’t checked this, but it is what I heard and that seems to make sense as those states have a more independent streak to them. So what we see in their death rates is what happens absent a defined policy. My suspicion is that, like most other states, their first death wave is among the people with susceptible immune systems.
  4. Right now the death increases in the Northeastern states appear like they will be much less severe than their earlier deaths.
  5. As the Dakotas’ case rates have already slowed down and are decelerating further, I presume that their surge is over for a while. At some point I’d imagine that their deaths would flatten off too.
Top 8 states by deaths per 1000 residents since mid April 2020


  1. Though the articles state that contact tracing data indicates that a high percentage of current cases stems from Thanksgiving gatherings, I can’t see any evidence of a surge of cases in any state that started after these holidays. What might this mean? First, as with any subjective human measurement and data collection system, I don’t think contact tracing is anywhere near 100% accurate. COVID is everywhere these days and there may be real difficulty determining if new cases were acquired during a holiday meal (or if they were acquired at the grocery store, or the office, or the Starbucks that one stopped in at on their way to the gathering). Second, if Thanksgiving led to a surge and the existing transmission rates just before Thanksgiving held constant, then we would see it as an increase in the existing case acceleration. I think that would be a hard case to make looking at these curves.
  2. COVID is very complex because it is interacting with a highly complex society. As such, attempting to find one causal reason for anything to do with COVID is probably going to be frustrating. That said, there does seem to be a strong correlation with temperature and COVID transitions to non-linear growth. I haven’t checked each one of these states (feel free to go off and check the others and report back!), but in the cases where I did, it seemed to be where there was a sharp fall in the nighttime temperatures.
  3. The concept of seasonal outbreaks of influenza has been investigated for years, but recently there is consensus around the causality of temperature and humidity for influenza outbreaks (see paper from the Journal of Virology). The temperature number that the linked paper references as being ineffective for influenza transfer is 30 degrees Celsius (86 degrees F). The paper also states that influenza transmission is highly efficient at 5 degrees C (41 degrees F). I’m not aware of any top-notch papers on the effect of temperature or humidity on COVID, but the NIH has a nice summary of around 20 primarily non-peer-reviewed papers on the subject, most of which found that COVID has higher transmissibility in colder weather and less humid conditions. One of the papers they summarize indicates that COVID survives and transmits most effectively between 13-19 degrees C (55 to 66 degrees F) and 50 to 80 percent humidity. This seems to line up nicely with the weather during the times where states transitioned from linear case rates to non-linear case rates. It would make sense that a healthier, happier virus would be more effective at infecting its targets (us!).
  4. The Oxford Dictionary defines Confirmation Bias as “the tendency to interpret new evidence as confirmation of one’s existing beliefs or theories.” As such, the observations that most states were already in non-linear growth regions well before Thanksgiving and lack of any real evidence that any change in these acceleration rates occurred after Thanksgiving makes me qualify most of these articles about the Thanksgiving outbreaks as likely colored by confirmation bias (I’m sure we all saw lots of articles before Thanksgiving on how it would result in significant case surges). No one can prove that Thanksgiving DIDN’T create any increase in case growth, but there’s really no good evidence to indicate that it did.

Bonus – Top States by Case Acceleration and by Case Deceleration

Note that California has the highest case acceleration rate in the country. This means their IROC_Confirmed Case Slope (New Cases per 1000 residents per Day) will increase by .1211 or higher tomorrow. Note that North Dakota’s case acceleration is still decreasing and appears to be near to the point where they have just a handful of new cases per day.

Top US States by Case Acceleration (dIROC_Confirmed) on 12/22/20
Top US States by Case Deceleration on 12/22/20
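
For readers who want to play with these two columns, here is a rough sketch of how a case slope and a case acceleration could be computed and ranked. The exact calculation behind IROC_Confirmed and dIROC_Confirmed isn't spelled out in this post, so treat everything here as an assumption: the trailing `window` average, the function names, and the sample numbers are invented for illustration.

```python
def iroc(cumulative_per_1000, window=7):
    """Average new cases per 1000 residents per day over the trailing window.
    (Assumed definition; needs at least window + 1 data points.)"""
    return (cumulative_per_1000[-1] - cumulative_per_1000[-1 - window]) / window


def diroc(cumulative_per_1000, window=7):
    """Change in the slope between yesterday and today (case acceleration).
    Needs at least window + 2 data points."""
    today = iroc(cumulative_per_1000, window)
    yesterday = iroc(cumulative_per_1000[:-1], window)
    return today - yesterday


def rank_by_acceleration(series_by_state, window=7):
    """State labels sorted from highest acceleration to highest deceleration."""
    return sorted(series_by_state,
                  key=lambda s: diroc(series_by_state[s], window),
                  reverse=True)


# Made-up cumulative cases per 1000 for three hypothetical states.
sample = {
    "state_A": [0, 1, 2, 4, 7],   # speeding up
    "state_B": [0, 1, 2, 3, 4],   # steady (linear)
    "state_C": [0, 3, 5, 6, 6],   # slowing down
}
ranking = rank_by_acceleration(sample, window=2)
```

A positive value means the daily case rate is still climbing, a value near zero means linear growth, and a negative value means the curve is bending over.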

Arizona COVID Update – 12/14/20

Here’s a bunch of Arizona (and some US State data) that touches on case growth as well as death and hospitalization trends.

Case Growth for top 10 Zip codes by raw cumulative case count. 12/14/20

Above is an interesting chart showing the case growth trends for the 10 zip codes in the state with the highest case counts. You can see a few things here:

  1. Yuma again has the fastest growing zip codes (the dark blue and orange lines at the top). The next 8 highest are a mix of Phoenix and Tucson (and a couple of suburbs of Phoenix).
  2. The aqua line that looks weird is likely a data error that happened a while back. Note how this line doesn’t follow any of the trends that the others do. It appears that whoever was collecting data messed that zip code up on 9/12 and kept messing it up a little bit until 12/12, when they “fixed” the data suddenly. I point this out because this is pretty common. I presume that the state DHS collects this data and manages it (it does come from their site) and it does seem to be their habit to suddenly “fix” data. That’s probably better for them than backdating it due to their unwillingness to share historical data (their dashboard only shows the current day, so to build these plots I have to scrape the data manually every single day). One outcome of this habit is that much of their data is not very trustworthy (Hospitalization is a good example. They have messed that up multiple times).
Case Growth for top 10 Zip Codes when Normalized by Population. 12/14/20

Showing the same data as the previous chart above, except this time it’s normalized by population. This shows that the Somerton zip code of Yuma County is far outpacing the others per 1000 residents.

AZ Zip Codes with highest percent increase over last two weeks. 12/14/20

Above are the Zip Codes in Arizona with the largest “surges” in COVID cases over the last two weeks. This is a percentage of their previous case count, which isn’t the metric to end all other metrics, but it is interesting to note that for whatever reason, that region experienced an unusually large surge. In this case, we see Yuma County with the highest surge in 85349, followed by two zip codes in Coconino County. After that it is a mix of the two largest counties, Maricopa and Pima. Interesting things to notice:

  1. The median age of all of these is pretty low, as is the median income. One exception to this stands out, 85383 in Peoria, AZ, which has the highest median age and median income (the two do tend to go together, of course). So it would be interesting to study this zip code to see what happened.
  2. The two Flagstaff zip codes have very low density unlike the other zip codes which (other than Peoria above) are very high density. These two zip codes are north and east of the city of Flagstaff, so they are more rural. Both have a relatively sizable population of Native Americans. These represent areas that someone ought to investigate.
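
The “surge” metric in the chart above is just percent growth over the earlier count. A minimal sketch, using placeholder zip labels and invented counts rather than the actual DHS numbers:

```python
def percent_growth(cases_then, cases_now):
    """Growth as a percentage of the earlier cumulative case count."""
    if cases_then <= 0:
        return None  # growth from zero can't be expressed as a percentage
    return 100.0 * (cases_now - cases_then) / cases_then


def top_surges(then_by_zip, now_by_zip, n=10):
    """The n zip codes with the largest percent growth, biggest first."""
    growth = {}
    for zip_code, then in then_by_zip.items():
        g = percent_growth(then, now_by_zip[zip_code])
        if g is not None:
            growth[zip_code] = g
    return sorted(growth.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Invented cumulative counts two weeks apart for three placeholder zip codes.
two_weeks_ago = {"zip_A": 200, "zip_B": 1000, "zip_C": 50}
today = {"zip_A": 250, "zip_B": 1100, "zip_C": 80}
surges = top_surges(two_weeks_ago, today, n=2)
```

Notice that zip_C tops the list with only 30 new cases while zip_B added 100. That is the small-denominator effect: percent growth flags unusual local surges, not the places with the most new cases.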
U.S. States sorted by Case Growth Rate (IROC_Confirmed). 12/14/20

Above we can see US States sorted by Case Growth Rates (IROC_confirmed). Note that some of these states’ case rate numbers are rising (Red Up Arrow) and some are falling (Green Down Arrow). What I have noticed is that once the dIROC_confirmed column goes down to near zero (or negative), the case growth flattens out shortly after. I saw this recently with North and South Dakota, and that did prove to be the leading indicator that their case rates were flattening out. I suspect that Indiana, New Mexico, and Utah are now through the worst of their winter outbreak (they all started before Arizona and California, which are now both on the rise). This is a good field to watch to understand when a state will stop accelerating in growth.

U.S. Counties sorted by change in Case Growth Rate (dIROC_confirmed). 12/14/20

Above we can see that two Arizona border regions have risen to the top of the list of Counties with the highest case growth and acceleration rates again. Santa Cruz and Yuma counties both had large outbreaks during the Summer and I had hoped to see them be relatively unaffected during the winter, but that seems to not be the case at all. Val Verde County in Texas is another border county, which makes me wonder if there’s another big outbreak in Mexico (I haven’t been looking).

Comparison of Arizona Over-65 and Under-65 Deaths per day and the Cumulative Case Curve for all demographics. 12/14/20.

I’ve shown this chart once before, so here it is updated. You can see a few things here:

  1. The case growth for Arizona (the orange curve) continues upwards unabated. You can see this in the tables above too, of course.
  2. Over-65 deaths continue to be the large majority of deaths. Even when the data isn’t normalized by population, the over-65 group (only 13% of the state’s population) dominates the death numbers. You can’t see this easily in this chart, but the ratio of over-65 deaths to under-65 deaths has risen from 2.8 during the first outbreak during June-August up to 3.7 since late October. This seems to indicate that the disease is either more dangerous for the over-65 group this time around or that it is less dangerous for the under-65 group. I’d lean towards the latter since the overall death numbers are still lower during this winter outbreak than they were during the summer outbreak by quite a bit.
Hospitalization Trends by Age Demographic. 12/14/20

The above chart is more experimental than anything. I was curious about what the ratio was of hospitalizations per day divided by the number of Cases from one week earlier. Then I calculated this ratio as a percentage for each age demographic. In theory, this represents the percentage of people that have a COVID case confirmed and then enter the hospital one week later. This isn’t a perfect metric (what if they enter 2 weeks later?), but it seems interesting and the trend has been pretty consistent for a while. Note that what I have done to see the trends is to fit a trendline to the data for each age group. The over-65 trend line slopes upward (maroon-ish color), which may indicate that hospitalization is increasing for over-65 people as a percentage of over-65 people getting confirmed cases one week previous. For some reason, though, this ratio is decreasing for all other demographics. This may be meaningless (there’s not a whole lot of data yet), or it may indicate that the likelihood of going to the hospital due to COVID is decreasing for everyone but people over 65. I’ll keep building and tracking this.
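
Here is a small sketch of the metric described above: hospitalizations on a given day divided by cases from a week earlier, with a straight-line fit to get the trend. The function names and the sample numbers are invented for illustration, and the ordinary least-squares fit is an assumption on my part about what kind of trendline to use.

```python
def lagged_ratio_percent(hospitalizations, cases, lag=7):
    """hospitalizations[t] / cases[t - lag] as a percentage for each day t >= lag.
    Days where the lagged case count is zero are skipped."""
    out = []
    for t in range(lag, len(hospitalizations)):
        if cases[t - lag] > 0:
            out.append(100.0 * hospitalizations[t] / cases[t - lag])
    return out


def trend_slope(values):
    """Least-squares slope of values against their index 0, 1, 2, ..."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


# Invented daily series for one age group, using a 2-day lag to keep it short.
daily_cases = [100, 100, 100, 100, 100, 100]
daily_hosp = [0, 0, 5, 6, 7, 8]
ratios = lagged_ratio_percent(daily_hosp, daily_cases, lag=2)
slope = trend_slope(ratios)
```

A positive slope corresponds to the upward-sloping over-65 line in the chart; a negative slope is what the other age groups are showing. Trying a few different values of `lag` would be one way to check how much the one-week assumption matters.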

Interesting Visualization Comparing Arizona Summer and Winter COVID Outbreaks.

Comparison of Over 65 and Under 65 Deaths in Arizona due to COVID along with the cumulative case counts. 12/1/20

Above is an interesting way to look at the two outbreaks we’ve had in Arizona and the cumulative number of cases (useful because it shows us the case trends).

  1. Note that the deaths seem to be higher during the summer outbreak than during the current one considering the rate of case growth. During this current outbreak the deaths are so far staying under 50 per day, but back even in the earlier phases of the summer outbreak they were inching up to 100 per day.
  2. Also, the deaths are just the raw number of deaths and aren’t normalized by the respective populations. What this means is that the red lines represent the total number of deaths over 65 years old (about 13% of the AZ population) and the blue represent everyone else.
  3. Deaths during the current outbreak have a ratio of 2.95 deaths over 65 to 1 death under 65. During the summer outbreak the death ratio of over 65 to under 65 was 2.31. This is a pretty big difference and indicates to me that the virus might be getting less deadly for society as a whole. If I knew exactly how old the people dying were it would help (if they average 85 that’s much more informative than just knowing they’re over 65). This may indicate that the “Years of Life Lost” due to COVID is decreasing.
  4. In the chart above, the state had lockdown restrictions in place until May 15, then most counties put mask requirements in place on June 9th. Early October is when most of the second set of restrictions on bars, gyms, and movie theaters were lifted. It doesn’t seem like any of these dates are correlated with anything the virus did. Seems like it has its own mind…
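
Since the list above points out that the death counts are raw rather than normalized, here is a quick sketch of the arithmetic for turning the raw over-65 to under-65 ratio into a per-capita one. It uses only the roughly 13% over-65 population share mentioned above; the function names and the toy daily counts are made up.

```python
def death_ratio(over_65_daily, under_65_daily):
    """Raw ratio of over-65 deaths to under-65 deaths over a period."""
    over, under = sum(over_65_daily), sum(under_65_daily)
    return over / under if under else None


def per_capita_ratio(raw_ratio, share_over_65=0.13):
    """Scale the raw ratio by the relative sizes of the two groups:
    (deaths_over / pop_over) / (deaths_under / pop_under)."""
    return raw_ratio * (1.0 - share_over_65) / share_over_65


# Toy daily counts that happen to give a raw ratio of 2.95.
toy_ratio = death_ratio([30, 29], [10, 10])

winter_per_capita = per_capita_ratio(2.95)  # current outbreak
summer_per_capita = per_capita_ratio(2.31)  # summer outbreak
```

With a 13% share, the multiplier is 0.87 / 0.13, or about 6.7, so a raw ratio near 3 works out to a per-capita death rate for the over-65 group that is roughly twenty times that of everyone else.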

COVID-19 Arizona: Where are the New Cases?

I decided to evaluate the new cases from the current outbreak differently. Previously I was interested in case growth as a percentage of previous cases. This may be a useful metric, because it signifies anomalous case growth in a specific location. Presumably that info could be used by a public health organization to target localized outbreaks.

However, perhaps much more interesting would be Zip Codes where high case growth per 1000 residents is happening. The chart below shows this metric. Both the cumulative number of cases (blue) and the last month’s case growth (orange) are normalized by the population of the Zip Code.

What does this chart tell us?

  1. On the left of the chart, we see the zip codes (see table below for better visibility into this portion of the chart) that have had the largest number of cases per 1000 residents cumulatively (since the beginning of COVID). A couple of these regions have seen very high growth in the last month. But as your eyes move rightward, you can see some regions that had experienced high COVID cases in the past that had lower numbers of outbreaks in the last month. And of course, other areas (the peaky orange lines) have experienced very high numbers of cases in the last month. It would be good to understand why some regions have had worse outcomes over the last month than other regions. We’ll evaluate some of this below looking at the table.
  2. The general trend does seem consistent, though. Regions that experienced higher numbers of cases during the summer outbreak are in general experiencing higher numbers of cases during the current outbreak. I was hoping to see a different trend (that might have indicated immunity in some regions) but will keep watching for that trend to emerge.
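
The two normalized series in that chart (cumulative cases per 1000 and last month’s growth per 1000) come down to a couple of divisions per zip code. A minimal sketch with placeholder zip labels and invented counts:

```python
def normalized_growth(cum_month_ago, cum_now, population):
    """Return (cumulative cases per 1000, last month's new cases per 1000)."""
    per_1000_now = 1000.0 * cum_now / population
    growth_per_1000 = 1000.0 * (cum_now - cum_month_ago) / population
    return per_1000_now, growth_per_1000


def build_table(rows):
    """rows: (zip_label, cumulative a month ago, cumulative now, population).
    Returns (zip_label, cumulative per 1000, growth per 1000) sorted by
    cumulative cases per 1000, highest first."""
    table = []
    for label, ago, now, population in rows:
        cumulative, growth = normalized_growth(ago, now, population)
        table.append((label, cumulative, growth))
    return sorted(table, key=lambda r: r[1], reverse=True)


# Invented numbers for two placeholder zip codes.
rows = [
    ("zip_B", 1500, 2500, 50000),
    ("zip_A", 900, 1000, 10000),
]
table = build_table(rows)
```

In this toy table zip_A leads on cumulative cases per 1000 but zip_B had twice the growth per 1000 over the last month, which is the kind of mismatch the peaky orange lines show.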

Details of the Normalized Case Growth

The table below is sorted by the Cumulative Cases per 1000 in a Zip Code. The Growth-Norm column represents growth in Cases per 1000 over the last month. Note that some regions that have experienced high case growth up to this point didn’t have nearly as much Case Growth as other regions that had experienced similarly high cases in the past. These are circled in green. You can also see regions with larger than expected case growth circled in red. Are there any factors that might be correlated with these lower and higher amounts of growth?

The first thing that I rule out is Education and Median Age. These on the surface don’t seem to be related. Some regions with lower median age are right next to regions with 15 or so years higher in median age. The same applies for education. What I do see, however, is a trend with population density, where regions with higher population density seem to be seeing lower COVID case growth per 1000 people. This might make some sense if you think about how regions with high density generally have high populations, and therefore a larger denominator in the growth per 1000 person equation… However, this also means that the numerator (the change in case count over the last month) is disproportionately low, which I think is interesting. Why would there be fewer cases than expected in regions with higher density? Thoughts:

  1. I wonder if this might be an indicator of the effectiveness of government interventions (mandatory masks, school restrictions, etc.)? Since all the data I’ve seen indicate that school restrictions aren’t resulting in large numbers of case reductions (regardless of whether they’re in school or not, every study seems to be showing that people under 15 don’t really transmit the virus), and most regions don’t have restaurant/gym/bar closures now, I’m assuming if it is anything it is the mandating of (and compliance with!) the mask restrictions. Unless someone can chime in with a different idea…. Compliance is an interesting thought, because it seems like in a more populous area, there is more social pressure to comply with COVID restrictions. Whereas, my observation is that in less dense areas, the social pressure is much less.
  2. Also interesting to me is the resurgence of cases on the border. These regions were very quiet ever since the summer wave slowed down and generally went from having the highest case rates in the country down to the very lowest. But now we see Yuma and Santa Cruz counties experiencing case growth again. The South Mountain region of Phoenix (85042) is also experiencing another surge in cases. But looking at the highest number of new cases per 1000 over the last month, we see some interesting places. Page (up near Lake Powell), Cottonwood (near Prescott), and Douglas (on the border, but only lightly affected during the summer border rash of cases) all are near the top of the list with 85350 in Yuma.
Zip Codes Sorted by Density (orange) with last Month’s Case Growth in Blue.
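
The density trend discussed above was spotted by eye. One way to put a rough number on it would be a correlation coefficient between density and last month’s growth per 1000. The sketch below uses a plain Pearson correlation, which is my choice for illustration and not something done in this analysis; the values are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented values: residents per square mile and last month's cases per 1000.
density = [500, 2000, 4000, 6000, 8000]
growth_per_1000 = [14.0, 12.0, 11.0, 9.0, 7.0]
r = pearson(density, growth_per_1000)
```

A value near -1 would back up the pattern (denser zip codes, lower growth per 1000), while a value near 0 would suggest the eye was fooled. Density is very skewed across zip codes, so a rank-based correlation might be a fairer test, and either way a correlation says nothing about why.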