Here’s another response I wrote to an anti-masker’s comment on Colorado Governor Jared Polis’ Facebook page. Note that I’d asked the anti-masker for links to his data, but he refused to provide them while simultaneously spreading conspiracy theories about the CDC’s data gathering. With that context, here’s my response:
See, that’s the problem right there. There are so many possible datasets you could be referring to that it’s easy to guess wrong without specific links. But whatever. Here’s how I did my assessment.
First, I grabbed the latest cases data from this page here: https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/cases-in-us.html
Second, I grabbed the latest provisional death count from this page here: https://www.cdc.gov/nchs/nvss/vsrr/covid19/index.htm
Then I noticed something that you apparently didn’t, namely several notes to the data. For example, this one: “*Data during this period are incomplete because of the lag in time between when the death occurred and when the death certificate is completed, submitted to NCHS and processed for reporting purposes. This delay can range from 1 week to 8 weeks or more, depending on the jurisdiction and cause of death.”
And this one: “Provisional death counts may not match counts from other sources, such as media reports or numbers from county health departments. Counts by NCHS often track 1–2 weeks behind other data.”
And this one: “Currently, 63% of all U.S. deaths are reported within 10 days of the date of death, but there is significant variation between states.”
For the record, it takes Colorado 9-11 days for the count of new cases by date of illness to stabilize to within 80% of the final value due to reporting lag. It takes at least that long for the count of new deaths by date of death to stabilize due to reporting lag.
And then there’s the fact that the mean time between showing symptoms and dying is 19 days. Which means that, best case, we’re looking at 28-30 days before COVID death data settles well enough to draw any conclusions. And it means that, if we’re looking to compare deaths to cases properly, we need to compare deaths two weeks ago to cases a month ago.
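To show where the 28-30 day figure comes from, here is a minimal sketch of the arithmetic in Python. The function and constant names are my own for illustration; the numbers are the ones quoted above (9-11 days of reporting lag in Colorado, 19 days mean from symptom onset to death), and simply adding them is a rough approximation, not an epidemiological model.

```python
# Figures quoted in the text above; the names are illustrative only.
REPORTING_LAG_DAYS = (9, 11)   # Colorado: days for counts to stabilize
ONSET_TO_DEATH_DAYS = 19       # mean days from symptom onset to death


def settle_window(reporting_lag, onset_to_death):
    """Return (min, max) days after onset before death data has settled.

    Assumes the two delays simply add, which is a simplification.
    """
    low, high = reporting_lag
    return (low + onset_to_death, high + onset_to_death)


print(settle_window(REPORTING_LAG_DAYS, ONSET_TO_DEATH_DAYS))  # (28, 30)
```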
You didn’t do any of this. You divided deaths by cases, plotted the result against date by week, and that’s it. You failed to consider the reporting lag described in the notes or the delay between onset and death. You also left out the first seven weeks of data, where your basic calculation produces pretty absurd numbers, and you didn’t bother to mention that. Which is too bad, because those absurd numbers might have given you a clue that something was wrong with your calculations.
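To illustrate why the same-week division misleads, here is a rough Python sketch comparing it with a lag-shifted ratio. Everything in it is an assumption for illustration: the weekly counts are made up (they are NOT CDC data), the function names are hypothetical, and the 3-week lag is just the 19-day onset-to-death figure rounded to whole weeks. It does not attempt to correct for reporting lag at all.

```python
def naive_ratio(cases, deaths):
    """Deaths divided by same-week cases (the approach criticized above)."""
    return [d / c if c else None for c, d in zip(cases, deaths)]


def lagged_ratio(cases, deaths, lag_weeks):
    """Deaths in week t divided by cases in week t - lag_weeks."""
    ratios = []
    for t, d in enumerate(deaths):
        src = t - lag_weeks
        if src < 0 or cases[src] == 0:
            ratios.append(None)  # no matching case week to compare against
        else:
            ratios.append(d / cases[src])
    return ratios


# Made-up weekly counts, NOT real data: cases double every week and
# exactly 5% of each week's cases die three weeks later.
cases = [100, 200, 400, 800, 1600, 3200]
deaths = [0, 0, 0, 5, 10, 20]

print(naive_ratio(cases, deaths))
print(lagged_ratio(cases, deaths, lag_weeks=3))
```

In this toy series the same-week ratio tops out at 0.625% while the lag-shifted ratio recovers the 5% that was built into the data. Real data is messier, since the lags vary by state and over time, but the direction of the error is the point.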
Put it all together and it means your graph is grossly wrong at best and intentional misinformation at worst.
I intended to “do it right” and provide an updated graph, but I’m not sure I can. In part that’s because the national provisional death data is chunked up into weeks, while the lags are better measured in days. In part that’s because only 63% of U.S. deaths are reported within 10 days, with significant variation between states. In part that’s because the lags vary. But the hardest thing to do is to keep in mind that, at the peak of deaths in NYC, the lag from illness onset to death was shorter than 19 days because the city’s healthcare system broke down. And so the lag is also changing over the course of the graph. And frankly, pulling all that together properly is a job for an epidemiologist who will take weeks to figure out the best way to do it.
The fact remains – your graph is grossly wrong. Please stop sharing it.