COVID-19 Special Update: Protect Yourself From Sensationalism through Poorly Presented Data

Below is a really poor graphic put together by NBC (the source link is in the caption under the chart). All I can figure is that a young data scientist somewhere within the bowels of NBC wanted to present a story about the dangers of not locking down a state and therefore, consciously or unconsciously, laid out a graph that would confirm their intuition. My hope is that comparing how NBC approached this data with my more careful representation of the same data will help people recognize particularly manipulative data presentation.

Below is what NBC presented:

Poorly presented data from NBC (https://www.nbcnews.com/health/health-news/here-are-stay-home-orders-across-country-n1168736)

Maybe the problems with this are obvious. Perhaps it’s true that states that never issued a lockdown order are especially hard hit right now, but this chart does not make that case, as 1) it only compares these states to each other, 2) it uses narrow, one-month time scales, an approach that makes the curves look very alarming, and 3) it doesn’t normalize the cases by population. Lots of cases in a very large region is not equivalent to slightly fewer cases in a very small region; the smaller region is obviously having a more difficult time. See below, where I have taken this same data, normalized it by population, and compared it to a range of other states of different sizes with governors of either party. I leave NY off intentionally so as not to dwarf the other states.
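To make the normalization point concrete, here is a minimal sketch in Python. The region names, case counts, and populations are made-up placeholders, not real COVID-19 figures, and `cases_per_100k` is a hypothetical helper name chosen for illustration. It is not the code behind either chart.

```python
# Illustrative only: the numbers below are invented placeholders,
# not real case counts or census populations.

def cases_per_100k(cases, population):
    """Convert a raw case count into cases per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population

# Hypothetical regions: (cumulative cases, population)
regions = {
    "Large region": (10_000, 10_000_000),
    "Small region": (2_000, 1_000_000),
}

# Normalize every region by its own population.
normalized = {
    name: cases_per_100k(cases, population)
    for name, (cases, population) in regions.items()
}

# Rank regions by per-capita burden rather than by raw count.
for name, rate in sorted(normalized.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {rate:.0f} cases per 100k")
```

In this toy example the large region has five times the raw cases, yet the small region carries twice the per-capita burden, which is exactly the distinction a raw-count chart hides.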

Here is How I Present the Same Data

Tod’s presentation of the “non-lockdown” state cases normalized by population and compared to other lockdown states for reference.

Conclusion

What do we see in the second chart? When we compare the “non-lockdown” states to other states, we see that a couple of them (Nebraska, SD) are comparable to the relatively hard-hit Illinois, but the others are more comparable to less-hard-hit states like Arizona, Kentucky, and Texas. A quick Google search shows me that the cases in Nebraska and SD are largely driven by super-spreader activities at single meat-packing plants in each state. Perhaps a lockdown would have prevented this, but that bears more research. Looking at Nebraska vs. Illinois, we see two very different approaches to COVID-19. Nebraska is a small state that did no lockdown. Illinois is a larger state with big cities that is still on lockdown as of today’s date. As you can see, both of these states have a very high case growth slope (while New Jersey and South Dakota seem to be decelerating). Hopefully I’m making the point that being honest with the data and taking a scientific approach to the presentation of data is better for all of us. It is less sensational and more representative of what is actually happening. Plus, it doesn’t drive any false narratives.
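For readers who want to put a number on "growth slope" and "decelerating" instead of eyeballing the curves, one possible approach is sketched below in Python. It fits a least-squares line to the most recent week of a per-capita series and to the week before it, then compares the two slopes. The series are invented placeholders, the 7-day window is an arbitrary assumption, and `daily_slope` and `trend` are hypothetical helper names. This is not how the chart above was produced.

```python
# Illustrative only: both series are invented placeholders,
# not real per-capita case data for any state.

def daily_slope(series):
    """Least-squares slope of a series against its day index (units per day)."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den

def trend(series, window=7):
    """Return (previous_slope, recent_slope) for the last two windows."""
    if len(series) < 2 * window:
        raise ValueError("series is shorter than two windows")
    previous = daily_slope(series[-2 * window:-window])
    recent = daily_slope(series[-window:])
    return previous, recent

# Hypothetical cumulative cases per 100k over 14 days.
steady = [10 * day for day in range(14)]                        # constant growth
slowing = [10 * day for day in range(7)] + [64 + 4 * day for day in range(7)]

for name, series in {"steady": steady, "slowing": slowing}.items():
    previous, recent = trend(series)
    label = "decelerating" if recent < previous else "not decelerating"
    print(f"{name}: previous={previous:.1f}, recent={recent:.1f} ({label})")
```

A recent slope that is clearly lower than the previous one suggests deceleration, but with noisy daily reporting any such comparison should be read as a rough indicator rather than a verdict.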

8 Replies to “COVID-19 Special Update: Protect Yourself From Sensationalism through Poorly Presented Data”

  1. A quick suggestion for you that might help differentiate between lockdown and no-lockdown states: make one category a solid line and the other a dashed line. Thanks Tod!

  2. Thanks for your work, Tod. It is one of the few go-to sources for me, as you are both skilled and careful with your analysis. I have spent hours breaking down the data sources to get a truer picture, and you have been a guide to that.

      1. I would appreciate your comments on this:

        I have been doing my own analysis but I am an amateur.

        1. When the data is presented in wide sweeping statements it deceives. I simply note the nursing home statistic. If that tragedy is factored in, the remaining data looks very different.

        2. It seems that there is a direct correlation between age and risk, and underlying conditions and risk. The IFR is not static but grows as we get to the 60 and over population and the previously sick population.

        3. It also seems there is a correlation between cultural family life (multiple generations together vs nuclear family) and risk. I get that from looking at Zipcodes in Arizona.

        What do you think?

        1. Yes, Mark, it isn’t clear what will happen long term, but one thing that does appear certain is that we weren’t able to protect our most vulnerable populations early on. I suspect this is one reason why the death rates have dropped a lot lately. I wonder about the cultural piece too… wish we had better demographic data in Arizona. Our state stinks at strategic data collection. 🙂

          T.

  3. Lies… Damn Lies… and then there are Statistics! Appreciate someone attempting to use logic (data) to represent a problem. The context however is crucial as is the normalization of the data for comparison which you’ve done a great job of explaining.

    Understanding the relationships and being able to normalize across are critical when using statistics to analyze any data set. The difficulty in the whole reporting is choosing the right numerator and denominator to normalize against. Once you’ve chosen it, use it all the time every time for consistency.

    Keep up the great work!

    1. Thanks Chris! Agree, there’s no silver bullet for sampling data and using it to understand a particular question. But there are sneaky ways to do this that are self-reinforcing and more honest ways that may increase knowledge… I am tired of seeing the former. 🙂
