todnewman.com

September 12, 2020

COVID-19 Arizona Case Growth – 9/11/20

Arizona has seen its case growth numbers head towards zero for the last few weeks, but there may still be some value in exploring how the infection is affecting the state. Remember, tracking COVID cases is not useful in itself. Cases are a strong leading indicator, of course, of things we structurally care about as a state, such as hospital overtaxing and ultimately deaths. I believe this is the most productive mindset to have when approaching cases. Here is what the state’s cumulative Case curve looks like now.

We note that the Instantaneous Rate of Change (IROC) of the curve has now dropped to somewhere around 790. The trend is still decreasing: for about 4 days in a row the rate has moved closer to zero. We have three to four days of anomalous data from about 9/2 to 9/4, where the state appears to have been capturing University Antigen tests as confirmed cases. As the U of Arizona learned, at least, many of these Antigen positive results have turned out to be false positives when checked with a subsequent, more accurate PCR test. It appears from the data that 60-70 percent of the Antigen positive results are false positives. Since this realization, the state appears to only be counting the university cases if they’re confirmed with a PCR test. But not doing this for 3 days or so appears to have inflated our case numbers. Enough on that.

Zip Code Case Growth Update

Top Thirty Zip Codes by Increase in COVID-19 Cases from 9/5 to 9/11

This map doesn’t look much different from the previous week’s case increase map, except that the numbers appear to be a bit higher in Flagstaff (home of Northern Arizona University) and Prescott (home of Embry-Riddle University). But by far, the top two zip codes in case growth over the last week continue to be the homes of the University of Arizona and Arizona State. This is true even though the numbers of cases reported have dropped a bit due to only recording the cases confirmed with PCR.

Table of Zip Codes

Top 12 Zip Codes by Case Growth, 9/5 to 9/11

The main thing to note here is that the top two are Tempe’s and Tucson’s University zip codes. Snowflake’s showing up as number three is a bit deceptive. They had 11 new cases this week, but they had only 128 cases in total before this week. The 11 might be from one significant spreading event, or it could just be random noise. The 85009 zip code in Southwest Phoenix has had a handful of case spikes since Memorial Day. The 200-ish new cases in that zip code could be significant, especially since the Mexico-related infections from a month or two ago seem to have slowed significantly.

Conclusion

Data indicates that COVID-19 might be in the process of burning itself out in Arizona. For now at least… It will be interesting to see if the University cases lead to increased hospitalization numbers in their demographic about a week from now (so far, there hasn’t been any change). With this Zip Code approach above, we can also track if the University cases are spreading to adjacent or other Zip Codes.

September 6, 2020 (updated September 10, 2020)

COVID-19 Arizona University Area Outbreaks

Below you’ll see the Arizona Zip Code map of Case growth in the last week. Color of the bubbles represents the % growth in cases over one week. Size of the bubble represents population size of the zip code. What do we see?

1. We see two zip codes with growth far greater than any others. 85719 (U of Arizona) and 85281 (ASU) come in at 38% and 23% growth in cases over the last week. The next highest zip code is in Buckeye and comes in at 7.3% growth.

2. Flagstaff comes in around 4.3% growth. Perhaps they party less at NAU, or maybe there are fewer cases at altitude?

3. The below map only shows the top 30 zip codes. Most of these are under 5% growth.

4. Right now I’m doing this to see if the university cases spread outside the university areas. My hypothesis is that they will remain contained and the infection will burn itself out in those zip codes. I’ll be watching this and publishing results about every week. I’m also watching the hospital stats closely to see if the university case growth will result in increases in hospitalization.

Arizona map of top 30 zip codes by case growth between 8/30 and 9/5/2020

Table showing top 18 zip codes by case growth between 8/30 and 9/5 and some info about each zip code

UPDATE

It turns out that some of the positive results from the Antigen tests have been false positives. The U of A admitted this, and in doing so it became clear that people with positive Antigen tests are going to the university health center to take PCR tests to confirm. Initially, the state was counting all of the Antigen positive tests as positives overall, but that seems to have stopped. Recall my earlier discussions about specificity and false positives. Any time a test has a specificity of around 97 or 98% and the disease is infecting only about 2-3% of the population, about 1/2 of your positive results are going to be false positives. See the university’s chart below. If my detective work is correct, all 109 Campus Health tests below were on people who had come up positive in previous days/weeks on the Antigen test. If true, then there’s about a 60% false positive rate (which makes sense based upon the possible specificity of the Antigen test and the rate of infection on campus). Will keep watching this, but it seems less concerning than before.
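The false positive arithmetic can be checked with a few lines of Python. This is only a sketch: the function name is mine, and it assumes the test catches every true infection (sensitivity of 100%), which the Antigen test almost certainly does not quite do.

```python
def false_positive_share(prevalence, specificity, sensitivity=1.0):
    """Fraction of all positive results that are false positives."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return false_pos / (true_pos + false_pos)

# Middle of the ranges quoted above: 2.5% infected, 97.5% specificity.
share_mid = false_positive_share(prevalence=0.025, specificity=0.975)

# Lower end of both ranges: 2% infected, 97% specificity.
share_low = false_positive_share(prevalence=0.02, specificity=0.97)

print(f"{share_mid:.1%} and {share_low:.1%} of positives are false")
```

With these assumed inputs the share comes out just under one half in the first case and close to 60% in the second, which is in line with the rough numbers above.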

September 4, 2020

College is Back in Session. Did COVID Come Back With it?

Interesting data from the first week or so back at schools. University of Arizona had about 126 cases reported today while the entire rest of the county had around 30. Test positivity (a bad metric the way most government groups are trying to use it) has been about 2.5% at U of A since 7/31 until today, when it jumped to 8.2%. It’s hard to make much of a judgment from this as I don’t have any of the data between 7/31 and today, but that might be surprising. It does appear to be a large jump in tests from the average since 7/31, which might indicate there are more people feeling sick enough to get tested. I can’t tell much more because U of A’s data is kind of sparse and I can’t find numbers to indicate how many students are on campus right now. ASU, however, gives us a better wealth of data about their cases…

Here’s from the Arizona State University COVID page.

https://biodesign.asu.edu/research/clinical-testing/asu-covid-19-management-framework

Takeaways from this info are as follows:

First off, we don’t have good numbers from ASU on how many tests were given to arrive at the numbers listed above.
Cases for ASU faculty and staff appear low compared to their likely demographic in the rest of the state. The number listed is 0.2%, but it isn’t clear if that’s a cumulative count or an instantaneous count of active cases. Even if it is the count of Active Cases and these staff are in quarantine, that is still far lower than the instantaneous count of the students’ Active Cases (see below, it’s 3.4%). ASU employees most likely range in age from 20 to 64, which spans 3 of the demographic categories the state has been collecting case data on. All three of these are experiencing something on the order of 3.6% cumulative infection rates (or 36 out of 1000 as my chart below shows), so we would expect their current outbreak rates to be similar. There may be a data collection divergence between ASU collection and the State of Arizona collection (perhaps some faculty and staff tested positive over the summer at a CVS but didn’t tell their employer?). However, this is still a fairly big gap. Does it mean that university employees are less likely to get COVID than their same-age counterparts outside the university? What about their potential exposure to sick students? More to follow, but this does present some interesting questions.
The 1.3% positivity across all 74,500 non-online students is probably a case where the denominator is artificially large. How many of these students left campus and have been sheltering at home? I don’t think this number is relevant.
The more interesting indicator is the 336 positive students out of 9662 living on campus at Tempe. It’s unclear what the time period is that ASU has been collecting this, but the fact that they are apparently currently in isolation sounds like they are Active Cases, not cumulative cases. If true, this is a very high rate of Active infection for the Tempe campus (3.4%) and is about equal to the infection rate their age demographic has experienced cumulatively in the state since the start of the outbreak. This would be a big jump (it would account for 1/2 of the cases in the County on 9/2) and it appears that we can see it on the Maricopa County case chart.
There are 32 cases out of 1195 in the ASU Downtown and ASU West campuses. Again, we’ll have to assume that these are active cases since they’re referenced as being “in isolation”. Therefore, 2.6% of the students at these two campuses are currently sick with COVID-19. Compare to 3.4% from Tempe and 0.2% in the faculty and staff.
Finally, and my favorite stat here is that there are 0 cases out of 771 students at the ASU Polytech campus. That of course is 0%. What do we take from this? Nerds are more careful? They wash their hands more? Or maybe there’s just not much partying going on at this campus (it has various government agencies sharing the campus along with other technical education programs).
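For anyone who wants to redo the campus arithmetic, here is a small sketch that recomputes the active-case rates from the counts quoted above. The dictionary labels are mine. Carried to two decimals the Tempe and Downtown/West rates land a touch above the 3.4% and 2.6% figures used in the text, which look like they were cut off rather than rounded.

```python
# (cases in isolation, students) as quoted in the takeaways above.
campus_counts = {
    "Tempe (living on campus)": (336, 9662),
    "Downtown + West": (32, 1195),
    "Polytechnic": (0, 771),
}

# Active-case rate per campus, in percent.
rates = {
    name: 100.0 * cases / students
    for name, (cases, students) in campus_counts.items()
}

for name, pct in rates.items():
    print(f"{name}: {pct:.2f}%")
```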

As of 9/2/20, the cumulative count of cases by age group, measured as a count out of 1000 people in that age group.

Cumulative total case count for Maricopa County. Note the measurable increase in slope on 9/2. Perhaps this is due to the cases on the ASU campus.

August 19, 2020

COVID-19: Arizona Age Normalized Cases

Arizona Cases Normalized for Age Groups – 8/19/20

This above chart is a bit different because it shows the cumulative number of COVID-19 cases in the state for each age demographic divided by the total population in the state of that age group. This allows us to see how COVID-19 is really affecting the different age groups. A few things that are interesting…

1) The true rate of infection for 3 groups is pretty much the same. The 20-44 age group always has the most cases by raw number, but when you consider there are more of them than any other group, you can then see that they’re not excessively affected compared to other groups.

2) The 65+ group has close to 1/2 the cases of the top groups. This makes sense because I’d imagine that many of them are being more careful due to the severity of the disease for that group.

3) The under 20 group is much less likely to get infected. This may partially be because a good number of people in this group aren’t economically active and schools have been closed. Or maybe their immune systems are better tuned to the disease and they never show symptoms. Remember these cases are confirmed by tests, so there may be many people who never show symptoms and never get tested who have been infected.

4) I’m very surprised at the lack of effectiveness of the state measures taken in late June. Pretty much every county in the state issued facemasks-in-public proclamations and the economy was essentially closed again. Still, we see no impact on cases for basically 6 weeks, then all of a sudden all the age groups show a marked decrease (the red vertical line). I truly expected the state measures to show a dramatic effect in 2-3 weeks (since the cycle time of the disease is about 18-21 days). Very strange, but similar to what has been seen in other regions. Sweden (see below) had a sharp downturn in cases just like this and they had very few state measures taken. Makes me curious about what is really causing the rates to make such sudden changes.

5) Testing: The chart below tells the testing story. People have noted to me that media outlets are suggesting that falling case numbers have to do with the decreasing numbers of tests. The way I look at this data is this: First off, testing in Arizona is neither strategic nor random. People get tested because they feel sick or they work in jobs where there’s a high probability that they could be sick. This means the number of tests conducted has a high severity bias. So what this data might be telling us is that every day there are fewer people who feel sick who decide to go get tested and that an even lower percentage of these people are actually confirmed positive with COVID-19. This seems to indicate to me that the decreasing case numbers are probably legitimate.

Number of tests and % positivity. Note that I have no way of aligning the dates to determine if a positive test today = a confirmed case today. We’re counting on the effect of big data to give us information despite this.

August 13, 2020

COVID-19 Topic: Excess Deaths

I’ve been seeing a lot of confusing excess deaths charts floating around on Facebook and in the news media. The consistent story is that 2020 is seeing excess deaths due to COVID-19 over previous years. So I decided to see if I could replicate this using CDC data. Fortunately CDC seems to be actively (?) counting COVID and COVID-like deaths for 2020 at this URL. Also, CDC’s “Wonder” system allows one to pull data from previous years. So my strategy was to take deaths from the two most recent years in Wonder (2017-2018) and average these deaths to get a baseline that we can compare 2020 deaths to. Of course we are just over halfway through 2020, so I have to account for that as well. It’s interesting because we only have about 5-6 months of COVID-19 deaths, but we have an additional month or two of other deaths. I just assume that we’re halfway through our deaths to simplify.
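In code, the calculation described above might look something like the sketch below. The function name and the death counts are made up purely for illustration (they are not CDC numbers), and the half-year assumption is the simplification mentioned above.

```python
def projected_excess_ratio(deaths_2017, deaths_2018, deaths_2020_so_far,
                           fraction_of_year=0.5):
    """Projected full-year 2020 deaths as a ratio of the 2017-18 average."""
    baseline = (deaths_2017 + deaths_2018) / 2.0
    # Scale the partial-year count up to a full year (assumes a steady rate).
    projected_2020 = deaths_2020_so_far / fraction_of_year
    return projected_2020 / baseline

# Hypothetical demographic group: 1000 and 1040 deaths in the baseline years,
# 561 deaths recorded so far in 2020.
ratio = projected_excess_ratio(1000, 1040, 561)
print(f"Projected 2020 deaths are {ratio:.0%} of the 2017-18 baseline")
```

A ratio of 1.10 would read as 110% in the histogram further down, i.e. 10% more deaths than the baseline.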

Results

First, doing the work to connect this data resulted in some interesting insights. Below I show the state demographics sorted by the Excess Deaths in 2020 and we see some surprising things.

Table showing Excess Deaths in 2020 compared to an average of deaths from 2017-2018. Also shows the percentage of 2020 deaths coming from COVID-19, Pneumonia, and Flu compared to the 2017-18 average. Data from CDC, therefore it’s probably about a month old

What does the table reveal? First off we see that the demographics that have the highest number of excess deaths in 2020 compared to the 2017-18 average are the older demographics from DC and New Jersey. This makes sense due to the large numbers of deaths per capita in these states. We also note from this data that there are clear gaps in the CDC data because we’re not seeing New York at the top of the excess deaths list. Right now the CDC data for 2020 seems to only have about 1/3 of New York’s deaths captured. This is a big liability with using CDC data…

Also worth noting are the rows with yellow highlighting. These are all demographics in states that have had very little COVID-19 death impact compared to the 2017-18 baseline. However, they still have a high Excess Death number. There are many reasons why this might be the case, but I’m suspicious because many to most of these state/age demographic groups are also at high risk from suicide. I wanted to check this by looking at 2020 suicide statistics, but apparently no one has this data. The most recent suicide statistics you can find are in 2018 CDC data.

Histogram of Excess Deaths

Now I want to evaluate what the distribution of excess deaths looks like across all demographic groups in all states. This will give us an overall sense of the probability of having excess deaths in 2020. I do this with a histogram. See diagram below.

Histogram of Excess 2020 Deaths compared to 2017-2018 baseline. CDC Data 8/12/20

This histogram shapes up to look a lot like a Gaussian Distribution with a mean around 110% and a standard deviation of roughly 15%. This means roughly 70% of our demographic groups in the country are projected to have excess deaths ranging from 95% of the 2017-18 baseline all the way up to 125% of the baseline. This indicates to me that yes, 2020 is a worse year for deaths. Based off the data in the table above, we can safely assume that in many regions this is due to COVID-19. The data shows that for some states and their older demographics, COVID-19 is projected to exceed the 30% of total deaths that heart disease consistently accounts for.
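The "roughly 70%" figure is just the usual one-standard-deviation rule for a Gaussian. A quick sketch to check it, taking the eyeballed mean of 110% and standard deviation of 15% at face value (the real histogram is only approximately Gaussian):

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

mean, sd = 110.0, 15.0  # eyeballed from the histogram, in % of baseline

# Share of demographic groups between 95% and 125% of baseline.
share_within = normal_cdf(125.0, mean, sd) - normal_cdf(95.0, mean, sd)

# Share of groups projected above 100% of baseline (any excess deaths at all).
share_above_baseline = 1.0 - normal_cdf(100.0, mean, sd)

print(f"{share_within:.1%} within one SD, {share_above_baseline:.1%} above baseline")
```

Under these assumptions about 68% of groups fall in the 95% to 125% band, and roughly three quarters of groups come out above the baseline.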

Notes:

I’ll mention again that I have accounted for the roughly 1/2 of a year of death data that we’ve collected in 2020.
I averaged 2017 and 2018 deaths to make sure that I didn’t pick a year with unusually high deaths (2017 had a lot of flu deaths) as my baseline. It is not possible to get this data from 2019 off the CDC site yet.
Yes, the CDC data is spotty. Normally the older data is pretty solid, but newer data always has data staleness issues with the CDC. They call this provisional death data to make the point that they’re slow and we shouldn’t assume it’s as good as the older data is.
Remember, since I’m assuming the death rates will continue at a similar rate throughout the rest of the year, this is a projection.
It is very possible that COVID-19 deaths will accelerate or decelerate and the excess deaths will look different at the end of the year than I project right now.

Conclusion

Data truly gives us reason to believe that 2020 has been an unusually high year for deaths. This is unsurprising due to the focus our news media gives to COVID-19 cases. The mean value for excess 2020 deaths over the 2017-18 baseline is about 110%. This means that if there were 100 deaths in a region for the first 6-7 months of our baseline, on average, demographics have seen 110 deaths in 2020. This may seem like a small number, but an additional 10% is pretty significant and adds up.
Some demographics in some regions will see COVID-19 be one of their top overall sources of death in 2020. About 15% of the rows in my table (of which I show just a small portion above) will have COVID-19 account for more than 15% of their total deaths. To give an idea of the significance of that, normally heart disease accounts for 30% of the total deaths in the country and cancer accounts for 25%. The next highest source of death across the board is accidents at 8%. Flu and Pneumonia normally account for around 2.5% of total deaths. Recall too that the CDC numbers seem low, so this percentage is likely to increase.

August 7, 2020 (updated August 8, 2020)

COVID-19 Topic. Has Sweden’s Response Really Been a Disaster?

This article started with a simple graphic that I posted on Facebook for people to comment on.

Deaths across Age Demographics comparison for Arizona, Sweden, and NYC. Data from AZDHS, NYC Public Health, and Statista. 8/7/20

I got the idea from a post on LinkedIn that compared Sweden’s deaths with those in the US and it was really surprising, based on the constant media denigration of Sweden and their modified lockdown strategy. As the data shows above, despite not locking down their under-65 population, Sweden has to date had very few deaths under age 65. Fewer, in fact, than regions where lockdowns of the under-65 populations were intense (and in Arizona’s case, happened twice). This comparison also made sense due to population size similarity between the regions (Arizona is about 7.3M, Sweden about 10.2M, this part of NYC is about 8.4M). Another interesting datapoint on Sweden’s unique management of the COVID outbreak is pasted below. Around early July the case rate adjusted sharply and now new case growth is a very small number per day. It is interesting that the case growth slowed so quickly, especially in light of their strategy to not close schools, restaurants, etc.

Sweden Cumulative Confirmed Cases since early March 2020. Data from JHU. 8/1/20

Population Density

One of the persistent questions about this comparison was whether it had merit since NYC is much more dense than Sweden and Arizona (I assume that’s true, but haven’t looked at the numbers). So since NYC is more dense, it makes some intuitive sense to us that the density factor may account for a greater number of deaths. Does it?

Correlations of different societal and geographical factors with COVID-19 Cases and Deaths have been one large area of interest of mine through this outbreak. I have reported on this in this blog multiple times as the outbreak has spread. In the past, I observed that population density is slightly correlated with case count across the globe but is basically uncorrelated with deaths. Does this still hold today now that the virus has spread to new places?

Correlation of Various Factors with Normalized COVID-19 Death Count

COVID-19 death and case data from JHU, Other data from the World Bank.

Note that the factor most positively correlated with Deaths in a region is the number of Cases in the region normalized by the population. This is followed closely by the Instantaneous Rate of Change of Cases (the slope of Case Growth). You would expect this to be the case, but it’s a bit surprising to see that the number of cases in a region is only just over twice as correlated with deaths as the Body Mass Index mean for males! This would also indicate that there are regions where the BMI of the population has had more of an impact on deaths than the case count in the region. As evidence that high case count does not always lead to high deaths (and conversely that lower case counts can lead to high deaths), see the chart of Arizona counties, where we have results all over the board. The counties with the highest death rates are generally the ones with lowest population density and highest pre-existing morbidities. Some counties (cities) have very high case counts and low to moderate deaths. Other counties have low case counts and high deaths. It’s all over the map.

Arizona COVID-19 Stats by County

Arizona stats by county, 8/7. Data from JHU.

Conclusion

I did the original assessment to compare what has happened in Sweden vs. other regions largely because of the negative media attention that Sweden has received for their COVID-19 lockdown strategy. As it turns out, for populations under 65 (the ones who were actually not on lockdown) there have been very few deaths (but lots of cases). This is surprising considering that in Arizona and NYC, government interventions such as lockdowns, closing businesses, and mandatory face masks have been credited with slowing the growth of the outbreak. There are many surprising things I’ve noticed through this time of COVID. I point out a few others in this post regarding the unintuitive role population density plays in COVID-19 deaths as well as the observation that the correlation of COVID deaths with high COVID case counts is much smaller than we would have guessed (I would have suspected 90% or higher correlation).

Overall what does this show us? Our intuition is not necessarily to be trusted and should be assessed more critically using data rather than prior beliefs. The same applies to media reports, which tend to only show data in support of a pre-existing narrative.

August 4, 2020 (updated September 6, 2020)

COVID-19 Topic: The Scarcity of Counties with High Cases per 1000 people.

I have been watching COVID-19 Cases per 1000 numbers flatten off around 15 or 20 in counties regardless of whether they were actively managing the outbreak or not. This has made me wonder if there were not a biological reason why the outbreaks tend to hit limits. Collecting and visualizing existing data would give some insight as to whether this hypothesis had enough merit to evaluate more closely. Below is a quick analysis of what the data actually tells us about the commonality or scarcity of counties with high normalized case counts.

Methodology

First, I’ll explain what a histogram is. Whenever you have data that falls into a certain range, say 0 to 10, you can take a count of the number of examples of that data that fall into bins within that range. The simplest way to bin this 0-10 range would be 0-1, 1-2, 2-3, and so on. This would give you 10 new ranges as your bins. Counting the number of examples in your data that fall between 0 and 1 gives you the number in the y-axis of the histogram (the bins become the x-axis). For many processes, we may see the histogram form that looks like a Gaussian (or bell-shaped) distribution with low numbers in the bins towards the edges and high numbers of counts around the mean (say 4-5 or 5-6). The histogram then gives us a sort of probability distribution if done correctly that can tell us a lot about the process we’re measuring.
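Here is a minimal sketch of that binning in plain Python. The function and the ten sample values are made up for illustration; the real charts bin county-level Cases per 1000 rather than numbers between 0 and 10.

```python
def histogram(values, bin_width, upper):
    """Count how many values fall in each bin [k*w, (k+1)*w) from 0 to upper."""
    n_bins = round(upper / bin_width)
    counts = [0] * n_bins
    for v in values:
        if 0 <= v < upper:
            # Guard against float rounding pushing a value past the last bin.
            idx = min(int(v // bin_width), n_bins - 1)
            counts[idx] += 1
    return counts

# Ten made-up observations in the 0-10 range, binned as 0-1, 1-2, ..., 9-10.
sample = [0.4, 1.2, 1.9, 4.5, 4.6, 5.1, 5.5, 5.9, 6.3, 9.7]
counts = histogram(sample, bin_width=1, upper=10)

for k, n in enumerate(counts):
    print(f"{k}-{k + 1}: {'#' * n}")
```

Each row of the printout is one bin (the x-axis) and the number of marks is the count (the y-axis), so the printout is a sideways histogram.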

So below you’ll see a histogram where I have bins that each represent 2 Cases per 1000. This covers a range up to our highest COVID Cases per 1000 number (around 140). As you can see, the highest counts cluster in the bins toward the left side of the chart. This resulting histogram (the gray bars) looks like the discrete Poisson distribution and the shape of the distribution can be modeled as an exponential decay (the red line). This is pretty interesting because I’ve found that the slope of cumulative case growth is best modeled with a third order polynomial, but the exponential decay is a much steeper slope than a polynomial. I’m curious about what this might be indicating, but this is the same type of process as radioactive decay.

The formula for this exponential decay is y = a*exp(-b*x) + c, where a represents the original amount, b represents the rate of decay (b itself is positive; the minus sign in the exponent is what makes this decay), x in this case represents the cases per thousand (the histogram bin), and c is a constant. The b parameter controls the steepness of the curve, so it is interesting to see how b changes over time.
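To make the role of b concrete, here is a rough sketch of fitting a decay curve to histogram counts. It assumes the curve has the form y = a*exp(-b*x) + c with a positive decay rate b, and it swaps in a simple log-linear least-squares fit with c fixed, which is not necessarily the fitting method behind the charts in this post. The data points are generated from a known curve rather than taken from real counties.

```python
import math

def exp_decay(x, a, b, c=0.0):
    """Exponential decay: a at x=0, falling toward the constant c."""
    return a * math.exp(-b * x) + c

def fit_log_linear(xs, ys, c=0.0):
    """Least-squares line through ln(y - c) = ln(a) - b*x; returns (a, b)."""
    pts = [(x, math.log(y - c)) for x, y in zip(xs, ys) if y > c]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_l = sum(l for _, l in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in pts)
    slope = sxl / sxx
    intercept = mean_l - slope * mean_x
    return math.exp(intercept), -slope

# Synthetic "county counts" at bin centres 1, 3, 5, ... (bins 2 wide).
xs = [1, 3, 5, 7, 9, 11, 13, 15]
ys = [exp_decay(x, a=400.0, b=0.15) for x in xs]

a_fit, b_fit = fit_log_linear(xs, ys)
# Distance along the x-axis over which the county count halves.
half_width = math.log(2) / b_fit
print(f"a={a_fit:.1f}, b={b_fit:.3f}, count halves every {half_width:.1f} cases per 1000")
```

A larger b means a shorter halving distance, i.e. a steeper drop-off and rarer high-count counties.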

You can see the values of a, b, and c in the upper right of the graph below. This is the most recent histogram. We can see that there is a steep decay down to the asymptote where we see counties with more than 60 cases per 1000 to be somewhat of a black swan event.

Histogram of Number of Counties across Cases per 1000 – 8/4/20

Now we’ll look at the histogram from 2 weeks earlier on 7/18. As you can see the b value is a bit higher, which makes the slope a bit steeper.

Here’s the histogram from 7/4, one month earlier than the top chart.

And the histogram from 6/4.

And finally 5/4

Conclusion

Overall, what I note in this data is that the probability of counties with large numbers of cases per 1000 is increasing over time. The trend on the steepness of the exponential decay curve that fits these Poisson distributions is that it seems to halve every month. This is also an exponential decay signal in itself. Interesting…

However, there does appear to be some fundamental limiting factor based on the total number of cases in the country. The exponential distribution has a finite variance, which limits surprising “black swan” events in the tails of the distribution. The fact that the counties with large numbers of normalized COVID-19 cases are rare and that this trend follows this distribution and is best fit with an exponential decay curve indicates that the system that generates COVID-19 cases in counties (a system which includes natural and geographical features, societal control features, and cultural elements) naturally limits the cases. At least this is what the data has shown so far.

Update – 9/6

The peak of the histogram has shifted to the right as more and more counties have experienced COVID case growth. However, the exponential fit of the slope (-b) from the peak downward is still in the same ballpark as it was a month ago. What does this indicate? I’m not completely sure, but it seems like the fundamental nature of the ecosystem (the world, the US, political systems, etc.) that generates “cases” remains consistent. Outlier counties in normalized case count are still very rare.

August 1, 2020 (updated August 2, 2020)

COVID-19 Topic: Hospitalization Flow

Cumulative flow diagram for Pima County hospitalization charts. Data source: Pima County

Here’s a topic I have written about in the distant past (March?) that is of high interest to me. I agree with the “flatten the curve” strategy if an area is in imminent danger of overrunning their capacity to hospitalize people. One challenge with the strategy, though, is that to do this effectively one needs to understand the “flow” of patients through a hospital system. The chart above (that I hand built using reports from Pima County located here) is a rough start at this. What is it?

Cumulative Flow Diagrams

I use cumulative flow diagrams often at work to understand how “value” flows across a process consisting of a number of steps into the “hands of the customer”. The best way to visualize this is to think of a factory with a number of assembly and test operations that a product flows across on its way to the customer. At each one of these operations, some set of unique actions takes place. These actions all take some amount of time and then the product moves to the next operation (the movement takes time too!). This is how we assemble something called a value stream map. This map is a supremely valuable thing because it allows us to understand what’s happening in the factory. Are the operations taking the correct amount of time? How many products (Work in Progress) are flowing in the factory? Are products stalling at one of the operations and creating a problem by backing up the factory? The cumulative flow diagram can give us a nice visualization of all this.

What can we see regarding Pima County Hospitalization from this Diagram

First, the only data we can measure about the “hospitalization flow process” comes in the reports of hospital admissions, deaths, and recoveries. In a sense, these are the “operations” in the value stream map. I’ll grant that these are pretty crude measures to use to try to understand something as complex as the hospital network in a county. But apparently it’s all the county asks for. What would I want to measure in order to do a better job of understanding the flow through the hospitals?

How about these potential measures:

Time/Day a person arrives at the hospital and checks in.
Time/Day the person’s symptoms are reviewed and a disposition is made (send them home, refer them to a doctor, assign them a bed).
Whether a person tests positive for COVID (we might filter this data on this field)
Time/Day a person is assigned to a more specialized form of hospitalization (ICU? Other?)
Time/Day a person is put on a ventilator/intubated?
Time/Day the person is discharged from the hospital (Recovery)
Time/Day of death if appropriate.

With the above, we could build Cumulative Flow Diagrams that could tell us an awful lot about why the COVID recovery rate is over two weeks. We would learn where most of the time is spent (the bottleneck). If one knows this information, then they can take measures to relieve the bottleneck (add new nurses, add beds, improve the check-in process, etc.). I have to believe that the hospital already has very detailed measures like the above for their internal purposes, but from the standpoint of a County or a State evaluating the state of their hospital networks, this approach could be a game changer.

What do We See in Pima County?

Even from these crude measures which I assembled by hand into this CFD chart, we can see a few things.

The “Work in Progress”, i.e., number of COVID-19 hospitalized patients in the system right now appears to have grown during our current summer outbreak from about 700 in mid-June to about 930 right now. It isn’t clear where these people are in this system, because we have no information on ICU discharges.
The “Cycle Time” of the COVID-19 treatment process in the hospital system appears to be about 21 days. I’ll show a CFD or two from some European countries next and we’ll see if that’s good or not. It is measured as the length of the horizontal line between the admissions curve and the recoveries+deaths curve. You can think of it this way: go to any point on the y-axis (I’m showing this from 1500 counts) and calculate how many days it took the 1500th admitted individual to leave the hospital system. Obviously this presents an average since we don’t know the disposition of individual cases, but essentially the time between the 1500th admission and the 1500th “departure” is around 21 days. Note that this is an improvement from around the 500th count where we can see the cycle time was 28 days. I presume this positive trend has a lot to do with the improvements in the efficiency of care at the hospitals, along with new, better treatments, etc.
The slope of the Recovery line is roughly the same as the Admission line. This is not optimal because we want the cycle time to close. Once we see the slope of the Recovery line increase and become larger than Admissions, we have a good idea that either the cases are slowing or the hospitals are improving, or both.
Remember that much of what we believe that we can learn from this chart could be bogus if the data collection is haphazard or if the data is wrong. All the more reason for taking this seriously.
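
For readers who want to reproduce the WIP and cycle time readings, here is a minimal sketch of how they can be pulled from two cumulative series. The numbers are synthetic (a made-up steady 50 admissions per day with departures lagging by 21 days), not the Pima County data, and the function names are my own invention for illustration.

```python
from bisect import bisect_left


def day_reaching(cumulative, count):
    """Day index (fractional) at which a non-decreasing cumulative series first reaches `count`."""
    i = bisect_left(cumulative, count)
    if i == len(cumulative):
        raise ValueError("the series never reaches this count")
    if i == 0 or cumulative[i] == count:
        return float(i)
    # Linear interpolation between the two surrounding days.
    lo, hi = cumulative[i - 1], cumulative[i]
    return (i - 1) + (count - lo) / (hi - lo)


def cycle_time(admissions, departures, count):
    """Horizontal distance on the CFD: days between the Nth admission and the Nth departure."""
    return day_reaching(departures, count) - day_reaching(admissions, count)


def work_in_progress(admissions, departures, day):
    """Vertical distance on the CFD: patients admitted but not yet departed on a given day."""
    return admissions[day] - departures[day]


# Synthetic example data (an assumption, for illustration only).
admissions = [50 * d for d in range(60)]
departures = [max(0, 50 * (d - 21)) for d in range(60)]

print(cycle_time(admissions, departures, 1500))    # horizontal gap at the 1500th count
print(work_in_progress(admissions, departures, 40))  # vertical gap on day 40
```

The horizontal measurement is the same "draw a line at the 1500th count" reading described above, and the vertical measurement is the WIP. With real data both curves are noisy, so the result at any single count is only a rough average.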

Cumulative Flow Diagram for Germany

Here is what a very good CFD looks like for Germany, where data collection was prioritized. This is not exactly the same CFD as what I show for Pima County, because we don’t have good access through the Johns Hopkins data to hospitalization numbers. So instead of hospital admissions, this CFD shows confirmed cases. If we had the hospitalization data, it would be a line somewhere in between the orange and the green lines.

If you draw the horizontal line connecting the orange and green curves pretty much anywhere, you can see that the cycle time for “Case to Recovery” ranges from about 14 days (each vertical line is 2 days) to maybe 18 days. Compare this to the hospitalization cycle time for Pima County of about 21 days! Note how the number of active cases (the WIP) in Germany was well over 50K cases back in April but closed to maybe 1000 cases or so in recent months. One thing, though, that I’ll also point out is that the WIP has opened a bit in the last week. Note how the orange line is curving upwards and the green line isn’t? That’s a reminder that even when this pandemic seems under control, one needs to keep measuring and watching the trends to be able to take quick action.

July 30, 2020

AZ COVID-19 Update: Case Growth by Zip Code – from 6/14 to 7/30

I haven’t looked at the zip code data in a little while so I was curious to see if the case growth was coming from different places. Here’s the top 20% of zip codes by case growth in the last month and a half (Note that I’m not showing the growth percentages… they range from 945% down to about 350% in these locations). Interestingly, some of the areas with the lowest case growth are the border counties that had the highest case growth back on 6/14. This indicates, obviously, that case growth is slowing in these counties.

Plot of the top 20% of Zip Codes in AZ by COVID-19 Case Growth. Color represents total number of cases. Size of bubble represents population in the zip code. Note that all zip codes with fewer than 100 cases were removed from consideration.

This is a more interesting way to look at the data than I had suspected. I add another dimension to the visualization by only including the top 20% of zip codes by case growth. Now we can see total number of cases (the color), the size of the zip code (bubble diameter) and case growth (the fact it’s on this chart).
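
The selection step can be sketched roughly as follows. This is not the code behind the map; the zip code labels and case counts are placeholders, and I am assuming the 100-case cutoff is applied to the ending count.

```python
import math


def top_growth_zip_codes(start_cases, end_cases, min_cases=100, top_fraction=0.20):
    """Rank zip codes by percent case growth and keep the top fraction.

    Zip codes whose ending count is below `min_cases` are dropped first
    (assumption: the cutoff applies to the ending count).
    """
    growth = {}
    for zip_code, end in end_cases.items():
        start = start_cases.get(zip_code, 0)
        if end < min_cases or start <= 0:
            continue  # too few cases, or no baseline to compute a percentage from
        growth[zip_code] = 100.0 * (end - start) / start
    ranked = sorted(growth.items(), key=lambda item: item[1], reverse=True)
    keep = math.ceil(len(ranked) * top_fraction)
    return ranked[:keep]


# Placeholder counts for six hypothetical zip codes.
start = {"A": 20, "B": 400, "C": 50, "D": 10, "E": 300, "F": 100}
end = {"A": 200, "B": 600, "C": 90, "D": 120, "E": 450, "F": 500}

print(top_growth_zip_codes(start, end))
```

Note how a zip code like "D" tops the list with a small absolute increase, which is the same effect described for Oro Valley below: small numbers, large percentage.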

What do we notice here?

1. We can see that a handful of zip codes with medium numbers of cases are apparently growing fast (the light green and orange).

2. We can also see that there’s large case growth on the fringes of the Phoenix metro area in zip codes that hadn’t been affected much (dark blue bubbles). This appears to include a handful of wealthier zip codes.

3. There is also a pretty large cluster of cases now in the East Valley of Maricopa County that didn’t exist before. This cluster stretches down to Florence, the location of the State Prison, which had a large number of cases per capita a month ago.

4. Both Pinal and Pima Counties are mostly absent from the top 20%, with only a couple of zip codes in Pinal (San Tan Valley and Apache Junction) and one zip code in Pima (Oro Valley) included. In the case of Oro Valley, the zip code (85737) just barely broke my 100-case threshold for consideration, so its growth has been small in numbers, but larger in percentage.

5. Some of the large SW Phoenix zip codes (Maryvale, Laveen, Tolleson, South Mountain) are now missing from the top growers. The same goes for the border zip codes in Yuma and Santa Cruz Counties.

Conclusion

I have been wondering if there’s a mechanism by which the virus “burns itself out” in a region/population. Perhaps some people are many times more susceptible to infection, they get infected first, and then eventually things tail off. There could be a number of reasons for this: pre-existing conditions, personal and work situations, and perhaps even mild immunity coming from memory in T-cells or other mechanisms in the immune system (this seems pretty likely to me for a number of reasons… see this link from Nature.com). For whatever reason, the chart above does present the possibility that the current outbreak has burned out (the number of new cases per day has slowed) in the previous AZ hotspots.

July 29, 2020

Arizona COVID-19 Updates: 7/28/20

Update on Maricopa and Pima COVID Cases. Above is the chart I showed a few weeks ago. Not much has changed but here’s what I can see:

1. Cases are now growing at a noticeably lower rate than they were a week or 2 ago in Maricopa. Pima County rate is also lower. Note, though, that both are still increasing. Just at a lower rate. This COULD be a sign of a flattening or maybe it’s just a pause.

2. Tests have decreased (yellow dashed line) since the peak. This is kind of a chicken-and-egg thing. It could be that fewer tests are being done and therefore fewer cases are being confirmed. OR, it could be that fewer people are feeling sick and therefore fewer people are getting tests. We can’t really know since the state doesn’t have a randomized testing strategy.

3. From the state’s data dashboard, the percent of tests that are positive is still around 12% (and the trend is decreasing). This may be the source of our County Supervisor’s mysterious 11% “transmission rate” that he announced in the letter that recommended schools not reopen… If so, it’s not a very good metric for much of anything, much less school reopenings. All it means is that 12% of the people in the state who feel sick and get a test are coming up positive for COVID. The problem, again, is that the testing is biased towards people who feel sick or who work in professions where there’s a high likelihood that they will get exposed. I won’t go into it, but you can probably imagine other reasons why this isn’t a very solid metric.

4. If you look closely at either the Maricopa or the Pima County datapoints, you’ll note that the acceleration trend turned into a deceleration trend around July 2nd. This was hard to see until recently, but it seems pretty clear now. The interesting thing is that July 2 is about 2 weeks after both counties issued proclamations making facemasks mandatory in public. This provides some measure of evidence of the impact of facemasks on a region’s case numbers. Of course it’s just an unplanned natural experiment and I can’t find a good control county (i.e., one with no facemask proclamation) to compare to. But still, interesting.

In the chart above, I’m showing the death trends since 6/13 for the youngest 3 demographics. Blue is under 20, red is 20-44, and yellow is 45-54. I’m leaving out the folks over 55 because their death rates are higher than these three and make these 3 look very minimal. Note that each of these groups has ~200 deaths or fewer since the start of the outbreak. The interesting thing to me is that the older two groupings both show slightly accelerating growth (curving upward). You can see this in the 2nd order polynomial fit, which is about as close to perfect as a curve fit can be. The x-squared term shows the small magnitude of the acceleration of these rates. HOWEVER, the under 20 group is best fit with a straight line. It is not accelerating and indeed is barely growing at all. There could be a bunch of reasons for this, but I find it very interesting. First, many individuals in this group are likely better protected because they’re not going to school, to the grocery store, or to camps (because most of those have been cancelled). Maybe this would mean that when they get infected it is with a lighter dose? Second, however, it is possible that this is an indicator of what the papers are suggesting: that people under 15 are relatively unaffected by this virus.
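
To show what the x-squared term is doing, here is a small sketch that fits both a straight line and a 2nd order polynomial to a series and compares them. The two series are synthetic (made-up coefficients, not the state's death counts), and the fit is a plain least-squares solve written out by hand so that it needs nothing beyond the standard library.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit. Returns coefficients [c0, c1, ..., c_degree]."""
    n = degree + 1
    # Normal equations (X^T X) c = X^T y, where X[i][j] = xs[i] ** j.
    a = [[sum(x ** (r + c) for x in xs) for c in range(n)] for r in range(n)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return coeffs


def sum_squared_error(xs, ys, coeffs):
    """Total squared residual of a fitted polynomial."""
    return sum(
        (y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )


days = list(range(45))
# Synthetic series (assumed shapes): one steady, one gently accelerating.
steady = [2 + 0.1 * d for d in days]
accelerating = [20 + 1.5 * d + 0.04 * d * d for d in days]

for name, series in (("steady", steady), ("accelerating", accelerating)):
    line = polyfit(days, series, 1)
    quad = polyfit(days, series, 2)
    print(name, "x^2 term:", round(quad[2], 4),
          "line SSE:", round(sum_squared_error(days, series, line), 2))
```

A quadratic term near zero means the straight line already explains the series, which is the situation for the under 20 group. A positive quadratic term is the acceleration, and its size says how quickly the daily rate is increasing.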

Above is the similar hospitalization chart for these 3 groups. When I fit the data points on the hospitalization chart, all three age groups are best fit by a straight line. This means that though cases are accelerating and deaths are accelerating (slightly) for the older two groups, the hospitalization rates are steady. I presume that has something to do with the load management ability of hospitals.

AZ Cases, Deaths, and Hospitalization normalized by age group population

The above shows these numbers normalized by the population of each age grouping in Arizona. How to read this: Looking at the Under 20 column, this says that 9 out of 1000 people under 20 have had confirmed COVID-19 cases. 1.02% of these people with confirmed cases (0.0927 out of 1000 people in the group) were hospitalized. 0.06% of the people in this group with confirmed cases died (0.0051 out of 1000 people in the group). You can see why I don’t show this chart much… in many cases the numbers are really too small to wrap our brains around. To help with understanding this, raw numbers for the under 20 demographic since the start of the outbreak are 11 deaths and 199 hospitalizations out of a population of 2.15 million people under age 20. * Note that the numbers the state provides for hospitalization by demographic seem to be in bad shape… not sure how trustworthy they are right now.
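
The normalization itself is simple arithmetic, sketched below using only the raw under 20 numbers quoted above. The confirmed case count is not given in the text, so the sketch backs out a rough estimate from the stated 1.02% hospitalization share; treat that figure as an approximation, not a reported number. Small differences from the chart (0.0926 here versus 0.0927) presumably come from the population being rounded to 2.15 million.

```python
def per_thousand(count, population):
    """Express a raw count as a rate per 1000 people in the group."""
    return 1000.0 * count / population


# Raw under-20 figures quoted in the text.
population_under_20 = 2_150_000
hospitalizations = 199
deaths = 11

hosp_per_1000 = per_thousand(hospitalizations, population_under_20)
deaths_per_1000 = per_thousand(deaths, population_under_20)

# Assumption: back out the approximate case count from the stated 1.02% share.
implied_cases = hospitalizations / 0.0102
cases_per_1000 = per_thousand(implied_cases, population_under_20)
deaths_pct_of_cases = 100.0 * deaths / implied_cases

print(f"hospitalized per 1000: {hosp_per_1000:.4f}")
print(f"deaths per 1000:       {deaths_per_1000:.4f}")
print(f"implied cases per 1000: {cases_per_1000:.1f}")
print(f"deaths as % of cases:   {deaths_pct_of_cases:.2f}%")
```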