Tuesday, March 31, 2020

COVID-19: How Well Are the Stay-at-Home Measures Working?

Like everyone else in the world, I've been wondering how well the various stay-at-home orders and shutdowns are working to slow the spread of the epidemic. My background as a former particle physicist qualifies me as a "data nerd", so I set about looking for charts on the spread of the virus. When I started, a few places collected the right data, but none offered log-scale plots comparing the various countries and states, from which one could try to draw inferences. Eventually, the John Burn-Murdoch chart from the FT came out, but even that wasn't quite what I wanted (e.g. cumulative cases vs new cases/day). So, I dusted off my Python and pandas skills and tried to answer various questions that came to mind.

Do the shutdowns work? 

Looking at the reported cases in Italy, it's clear that the growth of new cases changed dramatically after the shutdown was put into place. One might expect the effect to take some 6 days to appear after the start of the shutdown measures (since that is the reported average time from infection to contagiousness), but it might actually be quicker. I suspect this is because people start changing behavior even before the official stay-at-home orders are enacted. It doesn't help the analysis that the reported numbers went AWOL for a few days right around the knee of the curve (when things started getting really desperate in Italian hospitals).
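To make "growth of new cases" concrete, here is a minimal sketch of turning a cumulative series into new cases per day and a day-over-day growth ratio. The function names are hypothetical and the numbers are made up for illustration; they are not Italy's reported figures.

```python
def daily_new_cases(cumulative):
    """Day-over-day differences of a cumulative case series."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def growth_ratios(new_cases):
    """Ratio of each day's new cases to the previous day's.

    Returns None for a day whose previous day had zero new cases.
    """
    return [
        (today / yesterday) if yesterday > 0 else None
        for yesterday, today in zip(new_cases, new_cases[1:])
    ]


# Made-up cumulative counts, for illustration only.
example_cumulative = [100, 150, 225, 330, 430, 520, 600]
example_new = daily_new_cases(example_cumulative)
example_ratios = growth_ratios(example_new)
```

A ratio that stays above 1 means new cases are still growing day over day; a ratio that drops to around 1 or below is the kind of change in slope a shutdown would be expected to produce.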

[Data from JHU CSSE] 
There is, of course, a valid concern that the number of reported cases depends strongly on how much testing is being done, and that changes in the case rate could be due to changes in testing parameters. For this reason, I think that the rate of hospitalized cases and the number of deaths are potentially more reliable indicators. However, the hospitalization rate is harder to find, and the death counts are far smaller (and therefore have lower statistical precision). In reviewing all the data so far, it seems that the reported case rate is actually a surprisingly good proxy. I suppose it's just an example of how exponentials wash out all secondary terms and corrections.
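One way to see why the testing level might matter less than expected is a small worked equation. It rests on a simplifying assumption that the data do not establish: testing catches a roughly constant fraction $f$ of true infections, which grow exponentially from $N_0$ at rate $r$. The symbols $C$, $f$, $N_0$ and $r$ are introduced here purely for illustration.

```latex
\begin{align}
C(t) &= f \, N_0 \, e^{r t} \\
\ln C(t) &= \ln(f N_0) + r t
\end{align}
```

On a log scale, $f$ only shifts the curve up or down. The slope $r$, and hence the doubling time $\ln 2 / r$, is unchanged. The argument breaks down if $f$ itself changes quickly, for example during a rapid ramp-up in testing.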

A further example can be seen in the New York state data. The good news here is that the shutdown measures did have a similar, rapid effect on the growth rate. The case rate has not leveled off yet and is still growing, but seems to be close to an inflection. Of course, that is the reported case rate, and the issues with hospital capacity and the peak in the death rate are still in the future.

[Data from COVID Tracking Project]


How are things in California? 

My original interest was to see how things are going in my own local area. Here, the data do not show any real signs of a change in slope (yet). They do, however, show a slope that is slightly lower than the other hot spots (i.e. a doubling time of 3.6 days vs 2-2.5 days).
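For comparing doubling times, it can help to convert them into a daily growth factor. This is a small sketch of the standard conversion for pure exponential growth; the function names are hypothetical, chosen only for this example.

```python
import math


def growth_factor_from_doubling(t_double_days):
    """Daily multiplicative growth factor implied by a doubling time in days."""
    return 2.0 ** (1.0 / t_double_days)


def doubling_from_growth_factor(daily_factor):
    """Doubling time in days implied by a daily growth factor greater than 1."""
    return math.log(2.0) / math.log(daily_factor)


# Doubling times quoted in the text above: 3.6 days vs 2 to 2.5 days.
for t_double in (3.6, 2.5, 2.0):
    factor = growth_factor_from_doubling(t_double)
    print(f"doubling time {t_double} days -> {100 * (factor - 1):.0f}% growth per day")
```

Under this conversion, a 3.6-day doubling time corresponds to roughly 21% growth per day, compared with about 32% at 2.5 days and about 41% at 2 days.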

[Data from COVID Tracking Project]

Even more locally, I've been recording the data from Santa Clara County (which was the original hot spot in California). At this smaller volume of cases, the statistical variation makes conclusions harder to draw, but my least-squares fit shows a doubling time of 5.2 days. The only explanation I can think of for this slower growth is that the San Francisco Bay Area advised people to telecommute and avoid crowds long before the shelter-in-place order on Mar 17, so perhaps our curve began flattening early on. It does not, however, appear to be reaching a peak and inflecting, so I expect we still have a long period of shutdown ahead of us.
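For readers who want to reproduce this kind of number, here is a minimal sketch of a least-squares doubling-time fit: a straight line fitted to log2(cases) against day number, with the doubling time taken as the inverse of the slope. It is not necessarily the exact procedure in the notebook. The function name is hypothetical, the series is synthetic rather than real county data, and the code uses only the standard library so it runs anywhere.

```python
import math


def fit_doubling_time(days, cases):
    """Least-squares line through log2(cases) vs. days; returns doubling time in days.

    Every case count must be positive, and the fitted slope must be positive.
    """
    if len(days) != len(cases) or len(days) < 2:
        raise ValueError("need at least two (day, cases) points")
    log_cases = [math.log2(c) for c in cases]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(log_cases) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, log_cases))
    slope = sxy / sxx  # doublings per day
    if slope <= 0:
        raise ValueError("cases are not growing; doubling time is undefined")
    return 1.0 / slope


# Synthetic series built with a known 5.2-day doubling time (not real data).
synthetic_days = list(range(10))
synthetic_cases = [20 * 2 ** (d / 5.2) for d in synthetic_days]
fitted = fit_doubling_time(synthetic_days, synthetic_cases)
```

On noise-free synthetic data the fit recovers the doubling time that was put in. With small real-world counts the scatter around the line is much larger, which is why conclusions at the county level are harder to draw.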

[Data from SCC DPH]


Replicating this for other localities 

I have looked at a few other states and countries, but I don't have a good way to visualize such a broad collection of data. If anyone is interested, I'm attaching my Jupyter notebook file, which should be easy to modify if you are familiar with the tools. Feel free to send me any suggestions, or especially to take what I've done and run with it in other directions.